File access via property handle

Keywords:  file schema

An external file can be accessed by property handle. This allows reading or writing data from a program or from within an OSI function. Property handles for external files do not, however, import or export data automatically.

Opening a file via property handle activates property handle functionality for the external file. Although many features cannot be supported for an external file, many helpful property handle functions still work for this data source type.

Accessing external data via property handle does not require an exchange schema. A file schema that does not define any data mapping is sufficient. Since file schemata for CSV files can often be derived very easily, the external file does not require additional information in order to be accessed.

The property handle access functionality is the basis for the OSI functions fromFile and toFile, which are used for explicit data exchange.

Property handles for external files are supported for the following file types:

  • Flat (or binary) files with fixed structure
  • CSV files (with header line support)
  • ESDF (extended self delimiter files, a CSV extension) for hierarchical data structures
  • OIF (Object Interchange Format) files
  • XML files
  • File system directory

Property handle file extent

The file schema for external files can be defined in advance within the ODABA dictionary in terms of a structure and an extent definition. In this case, the external file can simply be accessed via the extent name, like any other extent in the database.

The path to the file to be accessed is set in an option with the extent name.

The file type to be accessed has to be defined in the extent definition (access type) as AT_BIN for flat files or as AT_EXTERN for extended self delimiter (ESDF), comma separated (CSV), object interchange format (OIF) and XML files. One more supported type is directory access (AT_DIR).

Accessing external files via extents is limited in the sense that specific settings, such as the header line option indicating self-describing files or special delimiters, are not supported. More flexible file access is provided via the openExtern() function, which allows opening a collection based on an external file without referring to definitions in the dictionary.

Open extern

Defining structures and property handles for external files in the dictionary is often inconvenient. CSV files in particular often carry metadata in the header line, which contains sufficient information for extracting a file schema. Thus, property handle supports an additional function, Property::openExtern(), for opening external data sources that are not explicitly defined in the dictionary.
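The idea of extracting a file schema from a CSV header line can be illustrated outside of ODABA. The following is a minimal Python sketch, not the openExtern() implementation: the function name derive_field_names, the semicolon field separator and the sample data are assumptions made for illustration only.

```python
import csv
import io

def derive_field_names(text, delimiter=";"):
    """Return the column names found in the first line of CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])          # empty list when the file has no lines
    return [name.strip() for name in header]

# Hypothetical sample file content with a header line.
sample = "name;city;age\nMiller;Berlin;42\n"

fields = derive_field_names(sample)

# With the field names known, each data line can be read as a record.
rows = list(csv.DictReader(io.StringIO(sample), delimiter=";"))
```

The header line yields field names only; data types are not part of it, so every value in this sketch stays a string.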

In order to access external files that do not have a file schema definition at all, ad-hoc schemata can be created for semi-structured files such as XML or OIF. In this case, the file is analyzed and a file schema is derived from the property names passed with the data.

Data conversion

Several external data formats do not support special characters in data (e.g. line breaks, field or string separators etc.). Hence, data is converted into a proper external format, which may differ depending on the target platform. For example, some systems require doubled string separators within string data, while others require escaped string separators. In order to be flexible when exporting data, data conversion is performed for all data that contains control characters (\t, \n, \r) or string or field separators (such as " and ;).
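The difference between doubled and escaped string separators can be seen with Python's standard csv module. This is only an illustration of the two conventions, not ODABA code; the helper name export_row and the sample row are assumptions.

```python
import csv
import io

def export_row(row, doubled):
    """Write one row with ';' as field separator and '"' as string separator."""
    buffer = io.StringIO()
    if doubled:
        # String separators inside the data are doubled: " becomes "".
        writer = csv.writer(buffer, delimiter=";", quotechar='"',
                            doublequote=True)
    else:
        # String separators inside the data are escaped: " becomes \".
        writer = csv.writer(buffer, delimiter=";", quotechar='"',
                            doublequote=False, escapechar="\\")
    writer.writerow(row)
    return buffer.getvalue()

# Hypothetical row: one value contains string separators, one a field separator.
row = ['say "hi"', "a;b"]
doubled_line = export_row(row, doubled=True)
escaped_line = export_row(row, doubled=False)
```

A reader has to be configured with the same convention as the writer, which is why the target system determines which variant is needed.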

By default, string delimiters within text are doubled and control characters remain unchanged. To use a different conversion, one may define data exchange replacement rules for import and export in an option:

    DataExchange.Replacements=rule1;rule2;...

Conversion works for export only. Conversion rules are separated by semicolons (;). Source and target format within a rule are separated by a colon (:), e.g. ":"" for changing a single string delimiter into a doubled one. Control characters have to be passed as escaped characters (e.g. \n). Backslashes not intended as escape characters have to be doubled (e.g. ":\\" for converting a string delimiter into an escaped string delimiter).
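The rule syntax described above can be sketched as a small parser. This is a hedged Python illustration and not the ODABA parser: the names split_unescaped, unescape and parse_rules are hypothetical, and it is an assumption that a backslash always forms an escape pair with the following character and that a backslash before any character other than n, t, r or a backslash yields that character unchanged.

```python
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}

def split_unescaped(text, separator):
    """Split at separator characters that are not part of a backslash pair."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])   # keep the escape pair intact
            i += 2
            continue
        if ch == separator:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts

def unescape(text):
    """Resolve backslash pairs: \\n, \\t, \\r become control characters."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def parse_rules(option_value):
    """Turn 'rule1;rule2;...' into a list of (source, target) pairs."""
    rules = []
    for rule in split_unescaped(option_value, ";"):
        if not rule:
            continue                        # tolerate a trailing semicolon
        parts = split_unescaped(rule, ":")
        if len(parts) < 2:
            raise ValueError(f"rule without ':' separator: {rule!r}")
        source, target = parts[0], ":".join(parts[1:])
        rules.append((unescape(source), unescape(target)))
    return rules

# Raw strings keep the backslashes exactly as they would appear in the option.
doubling = parse_rules(r'":"";\n:\\n')
escaping = parse_rules(r'":\\"')
```

Under these assumptions the first call yields one rule that doubles the string delimiter and one that turns a line break into the two characters backslash and n; the second yields a rule that turns the string delimiter into an escaped one.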

The default export replacement (used when no export replacement option is set) is defined as C-string export:

    DataExchange.Export.Replacements=":\";\:\\;\n:\\n;\t:\\t;\r:\\r

i.e. duplicating string delimiters.
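What a C-string style export does to a text value can be sketched as follows. This is a Python illustration under stated assumptions, not the ODABA conversion routine: the mapping C_STRING_EXPORT is one reading of C-string export (string delimiter, backslash and control characters are written as backslash escapes), and the name apply_replacements and the sample text are hypothetical.

```python
# Assumed mapping for C-string export: each special character becomes
# a backslash escape sequence.
C_STRING_EXPORT = {
    '"': '\\"',
    "\\": "\\\\",
    "\n": "\\n",
    "\t": "\\t",
    "\r": "\\r",
}

def apply_replacements(text, rules):
    """Replace each single-character source by its target in one pass."""
    return "".join(rules.get(ch, ch) for ch in text)

# Hypothetical value containing a line break, string delimiters, a tab
# and a backslash.
exported = apply_replacements('line 1\n"quoted"\tC:\\temp', C_STRING_EXPORT)
```

The replacement runs in a single left-to-right pass on purpose. Applying the rules one after another to the whole text would let the backslash rule escape the backslashes that earlier rules had just inserted.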

Data exchange import typically expects C-strings, i.e. control characters (\t, \r, \n), backslashes (\) and string delimiters that are part of the data have to be escaped (e.g. "text enclosed in \" should not contain line breaks (\n) and tabs (\t)."). Data exchange import also accepts control characters within text. Importing text that uses doubled string delimiters within text requires explicit import conversion before importing.

    DataExchange.Replacements=\n:\\n;\t:\\t;":\\"
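The import side can be sketched in the same way. The following Python illustration is not ODABA code: decode_c_string and doubled_to_escaped are hypothetical helpers, and it is an assumption that only the escapes named above (\n, \t, \r, \" and \\) need to be resolved. doubled_to_escaped is meant for the content of a single field, without the enclosing string delimiters.

```python
import re

_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}

def decode_c_string(text):
    """Resolve the named escapes; other backslash sequences stay as they are."""
    return re.sub(r'\\([ntr"\\])', lambda m: _ESCAPES[m.group(1)], text)

def doubled_to_escaped(text):
    """Turn doubled string delimiters inside field content into escaped ones."""
    return text.replace('""', '\\"')

# C-string input: escaped tab, line break, string delimiters and backslash.
decoded = decode_c_string('a\\tb\\n\\"x\\" \\\\')

# Field content that uses doubled delimiters is converted first, then decoded.
converted = doubled_to_escaped('say ""hi""')
imported = decode_c_string(converted)
```

Because the regular expression consumes each backslash pair as a unit, an escaped backslash followed by the letter n is read as a backslash and an n, not as a line break.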
