2844:Using The ASCII Driver With Comma-delimited Files
KEYWORDS: ASCII DRIVER ASCIIDRV.TXT COMMA DELIMITED TABLE DATABASE AREA: Datab
Delphi (and the BDE) can use ASCII files as tables to a limited degree.
The ASCII driver can translate the data values in an ASCII fixed-length
field file or a comma-delimited file into fields and values that can be
displayed through a TTable component. How this translation takes place
depends on an accompanying schema file. The schema file for an ASCII data
file defines the attributes necessary for parsing the ASCII data file into
individual field values. The field definitions for an ASCII fixed-length
field file are relatively straightforward, because the offsets of the
fields are consistent across all rows in the file. For comma-delimited
files, however, the process is slightly more complicated because the data
values in such a file need not be the same length in every row. This
article therefore concentrates on the more difficult task of reading data
from comma-delimited, or varying-length field, files.
The Schema File
===============
The schema file for an ASCII data file contains information that defines
both the file type (comma-delimited versus fixed-length field) and the
fields that are represented by the data values in each row of the ASCII
data file. (All of the settings used in a schema file are case
insensitive, so "ascii" is just as valid as "ASCII".) For a schema file
to be recognized as such, it must have the same filename as the ASCII data
file for which it provides definitions, but with the filename extension
.SCH (for SCHema). The attributes that describe the file are:
  File name:  Enclosed in square brackets, this setting specifies the
              name of the ASCII data file (sans the filename extension,
              which must be .TXT).
  Filetype:   Specifies whether the ASCII data file is structured as a
              fixed-length field file (use a setting of FIXED) or as a
              comma-delimited file with data values of potentially
              varying length (use a setting of VARYING).
  Delimiter:  Specifies the character that surrounds String type data
              values (typically the double quotation mark, ASCII
              decimal 34).
  Separator:  Specifies the character that is used to separate individual
              data values (typically a comma). This character must be a
              visible character, i.e., it cannot be a space (ASCII
              decimal 32).
  CharSet:    Specifies the language driver (use a setting of ASCII).
Following the file definition settings are field definitions, one for each
data value on each row of the ASCII data file. Each field definition
supplies the information Delphi and the BDE need to create a virtual field
in memory to hold the data value: the virtual field's data type (which
affects how the value is translated after being read from the ASCII
file) and its size and positioning attributes. The settings that appear
in each field definition are:
  Field:               Virtual field name; always "Field" followed by an
                       integer representing that field's ordinal position
                       relative to the other fields in the ASCII data
                       file. E.g., the first field is Field1, the second
                       Field2, and so on.
  Field name:          Specifies the display name for the field, which
                       appears as the column header in a TDBGrid. The
                       naming convention for ASCII table fields follows
                       that for Paradox tables.
  Field type:          Specifies the data type the BDE is to use in
                       translating the data value for each field and
                       tells Delphi what type of virtual field to create.
                       Use the setting   For values of type
                       ---------------   ---------------------
                       CHAR              Character
                       FLOAT             64-bit floating point
                       NUMBER            16-bit integer
                       BOOL              Boolean (T or F)
                       LONGINT           32-bit long integer
                       DATE              Date field
                       TIME              Time field
                       TIMESTAMP         Date + Time field
                       (The actual format for date and time data values
                       is determined by the current setting in the BDE
                       configuration, Date tab page.)
  Data value length:   Maximum length of a field's corresponding data
                       value. This setting determines the length of the
                       virtual field that Delphi creates to receive
                       values read from the ASCII file.
  Number of decimals:  Applicable to FLOAT type fields; specifies the
                       number of digit positions to the right of the
                       decimal place to include in the virtual field
                       definition.
  Offset:              Offset from the left that represents the starting
                       position for the field in relation to all of the
                       fields that precede it.
For example, the field definition below is for the first field in the
ASCII table. It defines a String type data value with a name of "Text",
a maximum data value length of three characters (so the field will
appear as only three characters long in Delphi data-aware components such
as the TDBGrid), no decimal places (a String data value never has any
decimal places), and an offset of zero (because it is the first field
and has no preceding fields).
Field1=Text,Char,3,00,00
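To make the five comma-separated parts of a field definition concrete,
here is a minimal sketch in Python (used purely for illustration; it is
not part of Delphi or the BDE) that splits one definition line into its
parts. The names FieldDef and parse_field_line are hypothetical helpers
introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class FieldDef:
    number: int    # ordinal position taken from the "FieldN" key
    name: str      # display name
    type: str      # CHAR, FLOAT, NUMBER, BOOL, LONGINT, DATE, TIME, TIMESTAMP
    length: int    # maximum data value length
    decimals: int  # number of decimals (meaningful for FLOAT only)
    offset: int    # cumulative length of all preceding fields


def parse_field_line(line: str) -> FieldDef:
    """Split a line such as 'Field1=Text,Char,3,00,00' into its parts."""
    key, _, value = line.strip().partition("=")
    if not key.lower().startswith("field") or not key[5:].isdigit():
        raise ValueError(f"not a field definition: {line!r}")
    # Split from the right so that only the last four commas are used.
    name, ftype, length, decimals, offset = value.rsplit(",", 4)
    return FieldDef(int(key[5:]), name.strip(), ftype.strip().upper(),
                    int(length), int(decimals), int(offset))


first = parse_field_line("Field1=Text,Char,3,00,00")
print(first)
```

The type is upper-cased on the way in because schema settings are case
insensitive.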
Here is an example of a schema file with three fields, the first of String
type and the second and third of type date. This schema file would be
contained in a file named DATES.SCH to provide file and field definitions
for an ASCII data file named DATES.TXT.
[DATES]
Filetype=VARYING
Delimiter="
Separator=,
CharSet=ascii
Field1=Text,Char,3,00,00
Field2=First Contact,Date,10,00,03
Field3=Second,Date,10,00,13
This schema defines a comma-delimited file in which all String type data
values are surrounded by the double quotation mark and distinct data
values are separated by commas (excepting any commas that appear between
the delimiters, inside individual String data values). The character
field has a length of three characters, no decimal places, and an offset
of zero. The first date field has a length of 10, no decimals, and an
offset of three. The second date field has a length of 10, no decimals,
and an offset of 13.
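Because a schema file is laid out like an INI file, its contents can be
inspected outside Delphi with ordinary tools. The following is a rough
sketch, in Python and assuming the standard configparser module is an
acceptable stand-in for the BDE's own reader (which it does not claim to
reproduce). The function name read_schema and the shape of the returned
dictionary are illustrative choices, not anything defined by the BDE.

```python
import configparser

SCHEMA_TEXT = '''[DATES]
Filetype=VARYING
Delimiter="
Separator=,
CharSet=ascii
Field1=Text,Char,3,00,00
Field2=First Contact,Date,10,00,03
Field3=Second,Date,10,00,13
'''


def read_schema(text):
    """Return the file-level settings and field definitions of a schema."""
    # Only "=" separates a setting from its value; keys are lower-cased by
    # configparser, which matches the case-insensitive schema settings.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    table = parser.sections()[0]
    settings = parser[table]
    fields = []
    for key, value in settings.items():
        if key.startswith("field") and key[5:].isdigit():
            name, ftype, length, decimals, offset = value.rsplit(",", 4)
            fields.append((int(key[5:]), name, ftype.upper(),
                           int(length), int(decimals), int(offset)))
    fields.sort()  # order by field number
    return {
        "table": table,
        "filetype": settings.get("filetype", "").upper(),
        "delimiter": settings.get("delimiter", ""),
        "separator": settings.get("separator", ""),
        "charset": settings.get("charset", ""),
        "fields": fields,
    }


schema = read_schema(SCHEMA_TEXT)
print(schema["table"], schema["filetype"], schema["fields"])
```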
For ASCII comma-delimited files, the length and offset parameters in the
field definitions do not apply to the data values in the ASCII file (as
they do for fixed-length field files), but to the virtual fields, defined
in the application, into which the values read will be placed. The length
parameter needs to reflect the maximum length of the data value for each
field, not counting the delimiting quotation marks or the comma
separators. This is most difficult to estimate for String type data
values, as the actual length of such a data value may vary greatly from
row to row in the ASCII data file. The offset parameter for each field is
not the position of the data value in the ASCII file (as it is for
fixed-length field files), but the cumulative length of all preceding
fields (again, the defined fields in memory, not the data values in the
ASCII file).
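The offset rule amounts to a running total of the field lengths. As a
small sketch of that arithmetic (written in Python for illustration
only; field_offsets and field_lines are hypothetical helper names), the
offsets can be computed from the lengths and written out as field
definition lines. The two-digit zero padding simply mirrors the examples
in this article; whether the BDE requires it is not something this
sketch assumes.

```python
def field_offsets(lengths):
    """Offset of each field = sum of the lengths of all preceding fields."""
    offsets = []
    total = 0
    for length in lengths:
        offsets.append(total)
        total += length
    return offsets


def field_lines(fields):
    """fields: list of (name, type, length, decimals) in file order."""
    lengths = [length for _, _, length, _ in fields]
    lines = []
    pairs = zip(fields, field_offsets(lengths))
    for number, ((name, ftype, length, decimals), offset) in enumerate(pairs, 1):
        lines.append(
            f"Field{number}={name},{ftype},{length},{decimals:02d},{offset:02d}"
        )
    return lines


for line in field_lines([("Text", "Char", 3, 0),
                         ("First Contact", "Date", 10, 0),
                         ("Second", "Date", 10, 0)]):
    print(line)
```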
Here is a data file that would correspond to the schema file described
above, in a file named DATES.TXT:
"A",08/01/1995,08/11/1995
"BB",08/02/1995,08/12/1995
"CCC",08/03/1995,08/13/1995
The maximum length of an actual data value in the first field is three
("CCC"). Because this is the first field and there are no preceding
fields, the offset for this field is zero. The length of this first field
(3) is used as the offset for the second field. The length of the second
field, a date value, is 10, reflecting the maximum length of a data value
for that field. The accumulated length of the first and second fields is
then used as the offset for the third field (3 + 10 = 13).
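For larger files, the maximum length of each column can be measured
rather than estimated by eye. The sketch below does this in Python with
the standard csv module, assuming that its handling of the separator and
the quotation-mark delimiter is close enough to the ASCII driver's for
the purpose of counting characters (an assumption, not a documented
equivalence). The name max_value_lengths is a hypothetical helper.

```python
import csv
import io

DATA_TEXT = (
    '"A",08/01/1995,08/11/1995\n'
    '"BB",08/02/1995,08/12/1995\n'
    '"CCC",08/03/1995,08/13/1995\n'
)


def max_value_lengths(text, separator=",", delimiter='"'):
    """Longest value per column, not counting delimiters or separators."""
    reader = csv.reader(io.StringIO(text),
                        delimiter=separator, quotechar=delimiter)
    widths = []
    for row in reader:
        for index, value in enumerate(row):
            if index == len(widths):
                widths.append(0)  # first time this column is seen
            widths[index] = max(widths[index], len(value))
    return widths


print(max_value_lengths(DATA_TEXT))
```

For the DATES.TXT rows this yields lengths of 3, 10, and 10, the same
values used in the schema file.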
This process reads the data correctly only when the proper lengths for
the data values in the ASCII file are used and each field's length is
added to the lengths of any preceding fields to produce the offsets of
succeeding fields. If data is misread because of improper length settings
in the schema file, most values will suffer adverse translation effects,
such as truncation of character data or numeric values being interpreted
as zeros. In these cases data will usually still be displayed and no
error should occur. However, values that must be in a specific format in
order to be translated into the appropriate data type will cause errors
if the value read includes characters that are not valid for that type,
for example characters not valid in a date value. This would include a
date data value which, when incorrectly read, may contain extraneous
characters from surrounding fields. Such a condition results in a data
translation exception, requiring an adjustment of the field length and
offset settings in the schema file.
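One way to catch such problems before opening the table is to check the
data file against the intended lengths ahead of time. The following
Python sketch is an illustration only: it does not reproduce the BDE's
translation logic, check_rows is a hypothetical helper, and the date
format MM/DD/YYYY is an assumption taken from the sample data (the real
format depends on the BDE configuration).

```python
import csv
import io
from datetime import datetime


def check_rows(text, fields, date_format="%m/%d/%Y",
               separator=",", delimiter='"'):
    """fields: list of (name, type, length). Returns a list of problems."""
    problems = []
    reader = csv.reader(io.StringIO(text),
                        delimiter=separator, quotechar=delimiter)
    for row_no, row in enumerate(reader, start=1):
        for (name, ftype, length), value in zip(fields, row):
            if len(value) > length:
                problems.append(
                    f"row {row_no}, {name}: {value!r} is longer than {length}")
            if ftype.upper() == "DATE":
                try:
                    datetime.strptime(value, date_format)
                except ValueError:
                    problems.append(
                        f"row {row_no}, {name}: {value!r} is not a valid date")
    return problems


FIELDS = [("Text", "Char", 3), ("First Contact", "Date", 10),
          ("Second", "Date", 10)]
for problem in check_rows('"DDDD",08/04/1995,08/1x/1995\n', FIELDS):
    print(problem)
```

A value that is too long points to a length setting that needs to be
raised, which in turn shifts the offsets of every field after it.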