Search This Blog

Wednesday, April 14, 2010

Data File Formats in bteq and load utilities

Bteq :

While creating the data file (exporting from table) the following modes are accepted.
DATA : Record Mode (Raw Binary Data).
INDICDATA : contains indicator bytes.number of indicator bytes = 1 bit for each column.that bit will be
set to 1 if the value for a perticular column is NULL else it will set to 0.

REPORT : Field Mode (column Headings).data is exported 1 column at a time. default format.
DIF : Data Interchange Format . to transport data to various pc programs.


while importing target table from data file the following modes are accepted.
DATA :
INDICDATA :
VARTEXT : each column is separated by a delimiter.


FastLoad,MultiLoad,FastExport,TPUMP,TPT
there are 5 data formats available for load or unload utilities.
Formatted :
Each record is in a format traditionally known as FastLoad or Teradata format. The data is prefixed with the data length and followed by and end-of-record (EOR) marker. This is the most stringent and robust data format and will allow any TPT supported data type to be included.
Let’s consider a record containing 6 bytes of inferred text data "abcdef";

x'00 06 61 62 63 64 65 66 0a' and in character: "..abcdef."

Text :
There is no length indicator as with formatted data (above). Records are separated only by an end-of-record (EOR) marker. To insure data integrity, each record must be entirely character data.
The EOR marker can be either a single-byte linefeed (X'0A') or a double-byte carriage-return/line-feed pair (X'0D0A'). Whether a single or double byte EOR is used is determined by the EOR marker encountered for the first record. If the first record has a single byte EOR and contains the text "abcd", it would look like this:
x'61 63 64 63 0a' and in character: "abcd."

Every subsequent record is expected to use the same single byte EOR.
If the first record has a double byte EOR and contains the text "acdc", it would look like this:
x'61 63 64 63 0d 0a' and in character: "abcd.."

Again, every subsequent record is expected to use the same double-byte EOR.
Single and double byte EORs cannot be mixed in the same data.

Binary :
This format is the same as formatted (see above) except that there is no trailing EOR byte. Each record contains a 2-byte integer data length, n, followed by n bytes of data. As with formatted data, all TPT data types are supported.
E.g., a record containing 4 bytes of text "abcd" would be;
x'00 06 61 62 63 64' and in character: "..abcd"

Delimited / variable length Text :
This format is an extension of the text format. It is similar because it must contain all character data and be terminated by an EOR marker. But it has a different restriction: all columns in the TPT schema must be VARCHAR** (as opposed to CHAR/ANSIDATE for simple text). The advantage of this format is that it supports an additional feature that allows the definition of individual fields (which map to DBS table row columns) separated by the special delimiter character. To successfully parse delimited data, the delimiter character can not appear within the data. Delimiter data embedded within a column will result in a data column count error.
UnFormatted :
The data does not conform to any external format. There is neither a prepended length as with binary and formatted nor is there an EOR marker as with text and delimited.



No comments:

Post a Comment