Supported Data Types
Supported Data Types#
Numeric Data Types#
Type | Notes |
---|---|
NUMBER | Default precision and scale are (38,0) |
DECIMAL, NUMERIC | Synonymous with NUMBER |
INT, INTEGER | Synonymous with NUMBER except precision and scale cannot be specified |
FLOAT | Displayed as FLOAT but stored as DOUBLE |
DOUBLE | Synonymous with FLOAT |
NUMBER#
Numbers up to 38 digits, with an optional precision and scale:
- Precision: Total number of digits allowed
- Scale: Number of digits allowed to the right of the decimal point
By default, precision is 38
and scale is 0
, i.e. NUMBER(38,0)
The maximum scale (number of digits to the right of the decimal point) is 37. Numbers that have fewer than 38 significant digits, but whose least significant digit is past the 37th decimal place, for example 0.0000000000000000000000000000000000000012 (1.2e-39), cannot be represented without losing some digits of precision.
INFO: If data is converted to another data type with lower precision, then back to the higher-precision form, the data can lose precision. For example, you lose precision if you convert a NUMBER(38,37) value to DOUBLE (which has a precision of approximately 17 decimal digits), and then back to NUMBER.
DECIMAL, NUMERIC#
Synonymous with NUMBER
INTEGER, INT, BIGINT, SMALLINT, TINYINT, BYTEINT#
Synonymous with NUMBER, except that precision and scale cannot be specified, i.e. always defaults to NUMBER(38,0)
. Therefore, for all INTEGER data types, the range of values is all integer values from -99999999999999999999999999999999999999 to +99999999999999999999999999999999999999 (inclusive).
The various names (TINYINT, etc.) are to simplify porting from other systems and to suggest the expected range of values for a column of the specified type.
FLOAT#
Datameer uses double-precision (64 bit) IEEE 754 floating-point numbers.
Precision is approximately 15 digits. For example, for integers, the range is from -9007199254740991 to +9007199254740991 (-253 to +253). Floating-point values can range from approximately 10-308 to 10+308. More extreme values between approximately 10-324 and 10-308 can be represented with less precision. For more details, see the Wikipedia article on double-precision numbers.
Floating point operations can have small rounding errors, especially when the operands have different precision or scale. Errors can accumulate, especially when aggregate functions (e.g. SUM() or AVG()) process large numbers of rows. Errors can vary each time the query is executed if the rows are processed in a different order (e.g. if partitioned differently across a distributed system). Casting to a fixed-point data type before aggregating can reduce or eliminate these errors.
DOUBLE, DOUBLE PRECISION, REAL#
Synonymous with FLOAT.
String Data Types#
Type | Notes |
---|---|
VARCHAR | Default and maximum is 16,777,216 bytes |
CHAR | Synonymous with VARCHAR except default length is VARCHAR(1) |
STRING | Synonymous with VARCHAR |
TEXT | Synonymous with VARCHAR |
VARCHAR#
VARCHAR holds Unicode characters. When you declare a column of type VARCHAR, you can specify an optional parameter (N), which is the maximum number of characters to store.
If no length is specified, the default is the maximum allowed length (16,777,216).
Although a VARCHAR’s maximum length is specified in characters, a VARCHAR is also limited to a maximum number of bytes (16,777,216 (16 MB)). The maximum number of Unicode characters that can be stored in a VARCHAR column is shown below:
Single-byte: 16,777,216.
Multi-byte: Between 8,388,608 (2 bytes per character) and 4,194,304 (4 bytes per character).
For example, if you declare a column as VARCHAR(16777216), the column can hold a maximum of 8,388,608 2-byte Unicode characters, even though you specified a maximum length of 16777216.
A column consumes storage for only the amount of actual data stored. For example, a 1-character string in a VARCHAR(16777216) column only consumes a single character.
CHAR#
Synonymous with VARCHAR, except that if the length is not specified, CHAR(1) is the default.
STRING#
Synonymous with VARCHAR.
Logical Data Types#
BOOLEAN#
BOOLEAN can have TRUE or FALSE values. BOOLEAN can also have an “unknown” value, which is represented by NULL. Boolean columns can be used in expressions (e.g. SELECT list), as well as predicates (e.g. WHERE clause).
Date and Time Data Types#
Type | Notes |
---|---|
DATE | - |
DATETIME | Alias for TIMESTAMP_NTZ |
TIME | - |
TIMESTAMP | Alias for one of the TIMESTAMP variations (TIMESTAMP_NTZ by default). |
TIMESTAMP_LTZ | TIMESTAMP with local time zone; time zone, if provided, is not stored. |
TIMESTAMP_NTZ | TIMESTAMP with no time zone; time zone, if provided, is not stored. |
TIMESTAMP_TZ | TIMESTAMP with time zone. |
DATE#
Datameer supports a single DATE data type for storing dates (with no time elements). DATE accepts dates in the most common forms (YYYY-MM-DD
, DD-MON-YYYY
, etc.). In addition, all accepted timestamps are valid inputs for dates; however, the TIME information is truncated.
DATETIME#
DATETIME is an alias for TIMESTAMP_NTZ.
TIME#
Datameer supports a single TIME data type for storing times in the form of HH:MI:SS
. TIME supports an optional precision parameter for fractional seconds, e.g. TIME(3). Time precision can range from 0 (seconds) to 9 (nanoseconds). The default precision is 9.
All TIME values must be between 00:00:00
and 23:59:59.999999999
. TIME internally stores “wallclock” time, and all operations on TIME values are performed without taking any time zone into consideration.
TIMESTAMP#
TIMESTAMP in Datameer is a user-specified alias associated with one of the TIMESTAMP variations. In all operations where TIMESTAMP is used, the associated TIMESTAMP variation is automatically used.
TIMESTAMP_LTZ#
TIMESTAMP_LTZ internally stores UTC time with a specified precision. However, all operations are performed in the current session’s time zone, controlled by the TIMEZONE session parameter.
TIMESTAMP_NTZ#
TIMESTAMP_NTZ internally stores “wallclock” time with a specified precision. All operations are performed without taking any time zone into account.
If the output format contains a time zone, the UTC indicator (Z)
is displayed.
TIMESTAMP_NTZ is the default for TIMESTAMP.
TIMESTAMP_TZ#
TIMESTAMP_TZ internally stores UTC time together with an associated time zone offset. When a time zone is not provided, the session time zone offset is used. All operations are performed with the time zone offset specific to each record.
Semi-structured Data Types#
VARIANT#
A VARIANT can store a value of any other type, including OBJECT and ARRAY, up to a maximum size of 16 MB. VARIANT stores both the value and the data type of the value. You can retrieve the value’s native data type by using the TYPEOF function.
A value of any data type can be implicitly cast to a VARIANT value, subject to size restrictions. value (without an explicit cast) and a NUMERIC value are coerced into the same type, the NUMERIC value is cast to VARIANT.
To convert a value to or from the VARIANT data type, you can explicitly cast using the CAST function, the TO_VARIANT function or the '::' operator
OBJECT#
An OBJECT contains key-value pairs.
ARRAY#
An ARRAY contains 0 or more pieces of data. Each element is accessed by specifying its position in the array.
Geospatial Data Types#
GEOGRAPHY#
Datameer uses the GEOGRAPHY data type, which models Earth as though it were a perfect sphere. The GEOGRAPHY data type follows the WGS 84 standard (spatial reference ID 4326; for details, see https://spatialreference.org/ref/epsg/wgs-84/).
Points on the earth are represented as degrees of longitude (from -180 degrees to +180 degrees) and latitude (-90 to +90). Altitude is currently not supported. Line segments are interpreted as geodesic arcs on the Earth’s surface.
Unsupported Data Types#
Currently the following data types are not supported in Datameer. Therefore all columns with those unsupported data types are invisible in Datameer.
The following data types are not supported:
- BINARY/ VARBINARY