Data Types
Like all other computer languages, SQL deals with data. So let's first look at how SQL defines data.
Data Type: A group of data that shares some common characteristics and operations.
SQL defines the following data types:
- Character String - A sequence of characters from a predefined character set.
- Bit String - A sequence of bit values: 0 or 1.
- Exact Number - A numeric value whose precision and scale need to be preserved. Precision and scale can be counted at the decimal level or the binary level. The decimal precision of a numerical value is the total number of digits needed to write it in decimal form. The decimal scale of a numerical value is the number of fractional digits in decimal form. For example, the number 123.45 has a precision of 5 and a scale of 2. The number 0.012345 has a precision of 6 and a scale of 6, because the zero right after the decimal point also takes up a digit position.
- Approximate Number - A numeric value whose precision needs to be preserved, and whose scale floats with its exponent. An approximate number is always expressed in scientific notation in the form of "mantissa"E"exponent". Note that an approximate number has two precisions: mantissa precision and exponent precision. For example, the number 0.12345e1 has a mantissa precision of 5 and an exponent precision of 1.
- Date and Time - A value that represents an instant of time. A date and time value can be divided into portions that relate it to a predefined calendar system: year, month, day, hour, minute, second, second fraction, and time zone. A date and time value also has a precision, which controls the number of digits in the second fraction portion. For example, 1999-1-1 1:1:1.001 has a precision of 3 on the second fraction portion.
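To make the precision and scale rules for exact numbers concrete, here is a small Python sketch. It is not SQL and not how any particular database does it; the function name precision_and_scale is made up for illustration, and it assumes "precision" means the number of digit positions needed to hold the value, which is what makes 0.012345 come out as precision 6.

```python
from decimal import Decimal

def precision_and_scale(text):
    """Return (precision, scale) for a finite decimal number written as text."""
    # as_tuple() gives the stored digits and a power-of-ten exponent,
    # e.g. "123.45" -> digits (1, 2, 3, 4, 5), exponent -2.
    _sign, digits, exponent = Decimal(text).as_tuple()
    if exponent >= 0:
        # No fractional digits; zeros implied by the exponent still take positions.
        return len(digits) + exponent, 0
    scale = -exponent
    # Zeros right after the decimal point (as in 0.012345) take positions too,
    # so the precision can never be smaller than the scale.
    return max(len(digits), scale), scale

print(precision_and_scale("123.45"))    # (5, 2)
print(precision_and_scale("0.012345"))  # (6, 6)
```

Both results match the examples given for exact numbers above.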
Data Binary Representations
Now we know what types of data SQL must work with. The next step is to understand how different types of data are represented in binary forms. Since computers can only work with binary digits, we have to represent all data in computer memory in binary forms.
1. Character String - A character string is usually represented in memory as an array of characters. Each character is represented in 8 bits (one byte) or 16 bits (two bytes) based on the character set and the character encoding scheme. For example, with the ASCII character set and its encoding scheme, character "A" will be represented as "01000001". Character "1" will be represented as "00110001". Character string "ABC" will be represented as "010000010100001001000011".
2. Bit String - The binary representation of a bit string should be easy. A bit string should be represented in memory as it is. Bit string "01000001" should be represented as "01000001". There might be an issue with memory allocation, because computers allocate memory in units of bytes (8 bits per byte). If the length of a bit string is not a multiple of 8 bits, the last allocated byte is not full. How should the empty space in the last byte be handled? I guess different SQL implementations will have different rules.
3. Exact Number - Exact numbers can be divided into two groups: integers and non-integers. An integer is an exact number with a scale of 0. An integer is usually represented in either 4 bytes or 8 bytes as a signed binary value in two's complement form. For example, with 4 bytes, integer "1" will be represented as "00000000000000000000000000000001". Integer "-1" will be represented as "11111111111111111111111111111111".
As for exact non-integer numbers, I don't know exactly how they will be represented in binary forms.
4. Approximate Number - An approximate number is normally represented in binary form according to the IEEE 754 single-precision or double-precision standards in either 4 bytes or 8 bytes. The binary representation is divided into 3 components with a different number of bits assigned to each component:
Code:
                  Sign  Exponent  Fraction  Total
Single-Precision     1         8        23     32
Double-Precision     1        11        52     64
With the double-precision standard, the fraction component stores 52 binary digits. Together with the implied leading 1 bit, this gives a mantissa precision of 53 binary digits, about 15 to 16 decimal digits.
5. Date and Time - A date and time value is usually stored in memory as an exact integer number with 8 bytes. The integer represents an instant by measuring the time period between this instant and a reference time point in millisecond precision, which is a second fraction precision of 3. How does MySQL store date and time values? We will try to find out later.
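The representations described above can be checked with a short Python sketch. This only shows the bit patterns; it says nothing about how MySQL or any other SQL implementation lays out its storage. All function names are made up for illustration, and the Unix epoch (1970-01-01 UTC) is used as the reference time point purely as an assumption, since the text does not name one.

```python
import struct
from datetime import datetime, timedelta, timezone

def char_string_bits(text):
    """Bits of an ASCII character string, 8 bits per character."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))

def int32_bits(value):
    """32-bit two's complement pattern of a signed integer in 4-byte range."""
    # Masking maps a negative value onto its two's complement bit pattern.
    return format(value & 0xFFFFFFFF, "032b")

def float64_fields(value):
    """Split an IEEE 754 double into (sign, exponent, fraction) bit strings."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", value))
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]  # 1 + 11 + 52 bits

def millis_since_epoch(moment):
    """Whole milliseconds between the assumed reference point and moment."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed reference point
    return (moment - epoch) // timedelta(milliseconds=1)

print(char_string_bits("ABC"))  # 010000010100001001000011
print(int32_bits(1))            # 00000000000000000000000000000001
print(int32_bits(-1))           # 11111111111111111111111111111111

sign, exponent, fraction = float64_fields(1.0)
print(sign, exponent, len(fraction))  # 0 01111111111 52

moment = datetime(1999, 1, 1, 1, 1, 1, 1000, tzinfo=timezone.utc)
print(millis_since_epoch(moment))     # 915152461001
```

The last value fits easily into an 8-byte integer, which is why a millisecond count is a convenient way to store a date and time value.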
Data Literals
Now we know the types of data, and how they are stored in memory. Next we need to know how data can get into the computer. One way is to enter it through the program source code as a data literal.
Data Literal: A program source element that represents a data value. Data literals can be divided into multiple groups depending on the type of data they represent and how they represent it.
1. Character String Literals are used to construct character strings, exact numbers, approximate numbers, and date and time values. The syntax rules of character string literals are pretty simple:
- A character string literal is a sequence of characters enclosed by quote characters.
- The quote character is the single quote character "'".
- If "'" is part of the sequence, it needs to be doubled as "''".
Examples of character string literals:
Quote:
'Hello world!'
'Loews L''Enfant Plaza'
'123'
'0.123e-1'
'1999-01-01'
2. Hex String Literals are used to construct character strings and exact numbers. The syntax rules for hex string literals are also very simple:
- A hex string literal is a sequence of hex digits enclosed by quote characters and prefixed with "x".
- The quote character is the single quote character "'".
Examples of hex string literals:
Code:
x'41424344'
x'31323334'
x'01'
x'0001'
x'ff'
x'ffffffff'
x'ffffffffffffffff'
3. Numeric Literals are used to construct exact numbers and approximate numbers. Syntax rules of numeric literals are:
- A numeric literal can be written in signed integer form, signed real numbers without exponents, or real numbers with exponents.
Examples of numeric literals:
Quote:
1
-22
33.3
-44.44
55.555e5
-666.666e-6
4. Date and Time Literals are used to construct date and time values. The syntax rules of date and time literals are:
- A date literal is written in the form of "DATE 'yyyy-mm-dd'".
- A timestamp literal, which carries both date and time, is written in the form of "TIMESTAMP 'yyyy-mm-dd hh:mm:ss'".
Examples of date and time literals:
Quote:
DATE '1999-01-01'
TIMESTAMP '1999-01-01 01:02:03'
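Two of the literal rules above are easy to try outside of SQL. The Python sketch below writes a character string literal with its quotes doubled, and decodes a hex string literal into the bytes it stands for. The function names are hypothetical. Reading the bytes as an unsigned big-endian number is also an assumption, since how a hex string becomes an exact number depends on the SQL implementation.

```python
def quote_sql_string(text):
    """Write text as a character string literal, doubling any embedded quote."""
    return "'" + text.replace("'", "''") + "'"

def parse_hex_literal(literal):
    """Return the bytes denoted by a hex string literal such as x'41424344'."""
    well_formed = (
        len(literal) >= 3
        and literal[:2].lower() == "x'"
        and literal.endswith("'")
    )
    if not well_formed:
        raise ValueError("not a hex string literal: " + literal)
    # bytes.fromhex() rejects an odd number of hex digits or non-hex characters.
    return bytes.fromhex(literal[2:-1])

print(quote_sql_string("Loews L'Enfant Plaza"))  # 'Loews L''Enfant Plaza'

# The same kind of literal can be read as characters or as a number.
print(parse_hex_literal("x'41424344'").decode("ascii"))      # ABCD
print(int.from_bytes(parse_hex_literal("x'0001'"), "big"))   # 1
print(int.from_bytes(parse_hex_literal("x'ff'"), "big"))     # 255
```

This also shows why x'41424344' and x'31323334' from the examples read as "ABCD" and "1234" when treated as ASCII character strings.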