Class DataImport

java.lang.Object
com.isomorphic.base.Base
com.isomorphic.tools.DataImport
All Implemented Interfaces:
com.isomorphic.base.IAutoConfigurable

public class DataImport extends com.isomorphic.base.Base
DataImport contains methods that can be used to import data from a test data file to the List of Maps format commonly used for DataSource records (importToRows()), or directly into a DataSource (importToDataSource()).

By default the input file is expected to contain comma-delimited data like a .csv file, but JSON and XML are also supported.

By default, the input file is expected to be encoded as UTF-8. You can override this if necessary by using one of the versions of importAndValidateDSRows() that accepts a DSRequest, and passing "encoding" in the values on that request.

If you are using UTF-8 or UTF-16 encoding, by default the import process will automatically consume a Byte Order Mark (BOM) if one is present in the stream. If for some reason you do not want this behavior, set the flag "import.consume.bom" to false in your server.properties file.

Imported data may be transformed during import; for details, search the SmartClient Reference for "dataSourceField.importStrategy".

See Also:
  • ServletTools.loadWebRootFile(path)

  • Constructor Details

    • DataImport

      public DataImport()
      Create a DataImport configured for CSV import with default quote string.
    • DataImport

      public DataImport(DataImport.ImportFormat theInputType, String theDelimiter)
      Create a DataImport configured for the specified import format.
      Parameters:
      theInputType - the form of the data input.
      theDelimiter - a java.lang.String used as the delimiter ("," by default)
    • DataImport

      public DataImport(DataImport.ImportFormat theInputType, String theDelimiter, String theQuoteString)
      Create a DataImport configured for the specified import format and quote string.
      Parameters:
      theInputType - the form of the data input.
      theDelimiter - a java.lang.String used as the delimiter ("," by default)
      theQuoteString - for delimited input only, a java.lang.String ("\"" by default) used to signify a literal double quote, since a bare double quote would otherwise be misinterpreted by the parser as the end of a string
    • DataImport

      public DataImport(String theInputType, String theDelimiter, String theQuoteString)
      Create a DataImport configured for the specified import format and quote string.
      Parameters:
      theInputType - the form of the data input as a String
      theDelimiter - a java.lang.String used as the delimiter ("," by default)
      theQuoteString - for delimited input only, a java.lang.String ("\"" by default) used to signify a literal double quote, since a bare double quote would otherwise be misinterpreted by the parser as the end of a string
  • Method Details

    • setPopulateDisplayFields

      public void setPopulateDisplayFields(boolean populate)
      If true, the import process populates displayField values with the import data in cases where it transforms the underlying value using a related display record. For example, suppose a field "countryId" has a displayField of "countryName". With this flag set, as part of importing a "countryId" value and transforming the import value "United States" into the corresponding id value, the importer also sets "countryName" on that record to the display value it just transformed (ie, "United States"). By default displayFields are not populated - you must pass true to this method if you require that behavior.

      See DataSource.getRelatedDisplayRecord(String, Object, DSRequest) for more details about related display records and import transformation.

      Parameters:
      populate - If true, populate displayFields with the import values used to derive a key via a related display record. If false, just leave displayFields unpopulated
    • setAutoInterpretBooleans

      public void setAutoInterpretBooleans(boolean auto)
      If true, the import process automatically interprets boolean field values, converting them to Booleans; otherwise it leaves them as Strings.

      Conversion rules:

      • Accepts any letter-case variant of "true"/"false", "yes"/"no", and "null"
      • "T" == true, "F" == false
      • "Y" == true, "N" == false
      • "0" is false
      • empty string is null
      • everything else is true
      Parameters:
      auto - If true, the import process automatically interprets boolean field values, converting them to Booleans; otherwise it leaves them as Strings.
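The conversion rules above can be sketched as a small helper. This is an illustrative re-implementation of the documented conversion table, not DataImport's actual internal code:

```java
import java.util.Locale;

public class BooleanInterpretSketch {
    // Mirrors the documented rules for setAutoInterpretBooleans(true).
    public static Object interpret(String raw) {
        if (raw == null || raw.isEmpty()) return null;      // empty string is null
        String v = raw.toLowerCase(Locale.ROOT);            // accept any letter case
        if (v.equals("null")) return null;
        if (v.equals("true") || v.equals("yes")
                || v.equals("t") || v.equals("y")) return Boolean.TRUE;
        if (v.equals("false") || v.equals("no")
                || v.equals("f") || v.equals("n")
                || v.equals("0")) return Boolean.FALSE;     // "0" is false
        return Boolean.TRUE;                                // everything else is true
    }
}
```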
    • importToDataSource

      public long importToDataSource(Reader in, String dataSourceName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Reader in, List columns, String tableName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Reader in, Map columnRemap, String tableName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Reader in, List columns, Map translators, String tableName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Reader in, Map columnRemap, Map translators, String dataSourceName) throws Exception
      Import from the supplied Reader in either CSV/TSV, JSON or XML format and save the imported records to the target SQLDataSource.

      Search for "testData" in SmartClient Reference for the description of supported formats and examples. Same rules apply here.

      An optional columnRemap can be provided to translate column names in CSV/TSV, or property names in JSON, to the field names of the DataSource. Use null as a Map value to cause data for a column to be discarded.
      For signatures that take "List columns", this is the same as providing a columnRemap that does not rename any input fields.

      If no columnRemap is provided, or the columnRemap is incomplete, column names that are not re-mapped will be matched to DataSource fields by comparing to both the field name and field title, ignoring letter case. Any column name that isn't matched to a DataSource field is discarded.

      For delimited input, the header line may be omitted. DataImport will attempt to automatically detect whether the first line is a header by attempting to match the data to expected column names. If the header line is omitted, the columns will be assumed to be in the order of the columnRemap if provided (use a class such as LinkedHashMap to preserve order), or in the order of the DataSource fields if no columnRemap is provided. Any extra columns will be discarded.

      Any SimpleType that inherits from one of the base types below uses the same translator as its base type:

      • DataImport.ParseDate
      • DataImport.ParseDateTime
      • DataImport.ParseTime
      • DataImport.ParseText
      • DataImport.ParseNumber
      • DataImport.ParseFloat
      • DataImport.ParseBoolean

      For CSV/TSV and JSON data, a set of translators can be provided in the format expected by importToRows(). By default, the following translations are applied based on DataSource field type:

      • int, integer, sequence or number: parsed as Java Integer
      • float or decimal: parsed as a Java Double
      • date: parsed as a date value by the standard Java DateFormat, using lenient parsing in the current locale
      • datetime: parsed as a datetime value by the standard Java DateFormat, using lenient parsing in the current locale
      Parameters:
      in - input file as a Reader
      columnRemap - an optional mapping between the input data and the field names of the target DataSource.
      translators - optional translators to use to transform the data from the input to the type the DataSourceField expects
      dataSourceName - the ID of the target DataSource
      Returns:
      1 if the import was successful, and 0 otherwise
      Throws:
      Exception
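As a usage sketch (the DataSource ID "supplyItem", the file name, and the column names here are hypothetical), a columnRemap can both rename and drop input columns; per the notes above, a LinkedHashMap preserves column order in case the header line is omitted:

```java
import com.isomorphic.tools.DataImport;

import java.io.FileReader;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

public class ImportToDataSourceSketch {
    public static void main(String[] args) throws Exception {
        // Keys are input column names; values are DataSource field names.
        // A LinkedHashMap preserves order, which matters if the CSV file
        // omits its header line.
        Map<String, String> columnRemap = new LinkedHashMap<>();
        columnRemap.put("Item Name", "itemName"); // rename this column
        columnRemap.put("Notes", null);           // null value: discard this column

        try (Reader in = new FileReader("supplyItems.csv")) {
            new DataImport().importToDataSource(in, columnRemap, "supplyItem");
        }
    }
}
```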
    • importToDataSource

      public long importToDataSource(Map record, String dataSourceName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Map record, List columns, String dataSourceName) throws Exception
      Throws:
      Exception
    • importToDataSource

      public long importToDataSource(Map record, Map columnRemap, String dataSourceName) throws Exception
      Imports the provided record (as a Map) into the DataSource. This method uses the same column re-mapping logic as importToDataSource(...) and the general features of DataImport, such as transforming imported data according to import strategies.
      Parameters:
      record - input record as Map
      columnRemap - an optional mapping between the input data and the field names of the target DataSource.
      dataSourceName - the ID of the target DataSource
      Returns:
      1 if the import was successful, and 0 otherwise
      Throws:
      Exception
    • importDataSourceRecords

      public List importDataSourceRecords(Reader in, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecords

      public List importDataSourceRecords(Reader in, List columns, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecords

      public List importDataSourceRecords(Reader in, Map columnRemap, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecords

      public List importDataSourceRecords(Reader in, List columns, Map translators, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecords

      public List importDataSourceRecords(Reader in, Map columnRemap, Map translators, String dataSourceName) throws Exception
      Import from the supplied Reader in either CSV/TSV, JSON or XML format and return the imported records in a List. Note that imported records are not saved; to save them, consider using the importToDataSource(Reader, Map, Map, String) API.

      Search for "testData" in SmartClient Reference for the description of supported formats and examples. Same rules apply here.

      An optional columnRemap can be provided to translate column names in CSV/TSV, or property names in JSON, to the field names of the DataSource. Use null as a Map value to cause data for a column to be discarded.

      If no columnRemap is provided, or the columnRemap is incomplete, column names that are not re-mapped will be matched to DataSource fields by comparing to both the field name and field title, ignoring letter case. Any column name that isn't matched to a DataSource field is discarded.

      For delimited input, the header line may be omitted. DataImport will attempt to automatically detect whether the first line is a header by attempting to match the data to expected column names. If the header line is omitted, the columns will be assumed to be in the order of the columnRemap if provided (use a class such as LinkedHashMap to preserve order), or in the order of the DataSource fields if no columnRemap is provided. Any extra columns will be discarded.

      Any SimpleType that inherits from one of the base types below uses the same translator as its base type:

      • DataImport.ParseDate
      • DataImport.ParseDateTime
      • DataImport.ParseTime
      • DataImport.ParseText
      • DataImport.ParseNumber
      • DataImport.ParseFloat
      • DataImport.ParseBoolean

      For CSV/TSV and JSON data, a set of translators can be provided in the format expected by importToRows(). By default, the following translations are applied based on DataSource field type:

      • int, integer, sequence or number: parsed as Java Integer
      • float or decimal: parsed as a Java Double
      • date: parsed as a date value by the standard Java DateFormat, using lenient parsing in the current locale
      • datetime: parsed as a datetime value by the standard Java DateFormat, using lenient parsing in the current locale
      Parameters:
      in - input file as a Reader
      columnRemap - an optional mapping between the input data and the field names of the target DataSource.
      translators - optional translators to use to transform the data from the input to the type the DataSourceField expects
      dataSourceName - the ID of the target DataSource
      Returns:
      the imported records in a List
      Throws:
      Exception
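To preview transformed records without saving them, importDataSourceRecords() can be called the same way as importToDataSource(). This sketch assumes a hypothetical "country" DataSource whose field names match the CSV header:

```java
import com.isomorphic.tools.DataImport;

import java.io.Reader;
import java.io.StringReader;
import java.util.List;

public class PreviewImportSketch {
    public static void main(String[] args) throws Exception {
        // Inline CSV with a header line; column names are matched to
        // DataSource field names (or titles), ignoring letter case.
        Reader in = new StringReader("countryName,population\nUnited States,331000000\n");

        // Records are transformed (import strategies, type translation)
        // but not saved to the DataSource.
        List records = new DataImport().importDataSourceRecords(in, "country");
    }
}
```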
    • importDataSourceRecord

      public Map importDataSourceRecord(Map record, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecord

      public Map importDataSourceRecord(Map record, DataSource dataSource) throws Exception
      Throws:
      Exception
    • importDataSourceRecord

      public Map importDataSourceRecord(Map record, List columns, String dataSourceName) throws Exception
      Throws:
      Exception
    • importDataSourceRecord

      public Map importDataSourceRecord(Map record, Map columnRemap, String dataSourceName) throws Exception
      Imports the provided record (as a Map) into a DataSource record. This method uses the same column re-mapping logic as importDataSourceRecords(...) and the general features of DataImport, such as transforming imported data according to import strategies.
      Parameters:
      record - input record as a Map
      columnRemap - an optional mapping between the input data and the field names of the target DataSource.
      dataSourceName - the ID of the target DataSource
      Returns:
      the imported record as Map
      Throws:
      Exception
    • importDataSourceRecord

      public Map importDataSourceRecord(Map record, Map columnRemap, DataSource ds) throws Exception
      Imports the provided record (as a Map) into a DataSource record. This method uses the same column re-mapping logic as importDataSourceRecords(...) and the general features of DataImport, such as transforming imported data according to import strategies.
      Parameters:
      record - input record as a Map
      columnRemap - an optional mapping between the input data and the field names of the target DataSource.
      ds - instance of the target DataSource
      Returns:
      the imported record as Map
      Throws:
      Exception
    • importToRows

      public List importToRows(Reader in) throws Exception
      Throws:
      Exception
    • importToRows

      public List importToRows(Reader in, List columns) throws Exception
      Throws:
      Exception
    • importToRows

      public List importToRows(Reader in, Map columnRemap) throws Exception
      Throws:
      Exception
    • importToRows

      public List importToRows(Reader in, List columns, Map translators) throws Exception
      Throws:
      Exception
    • importToRows

      public List importToRows(Reader in, Map columnRemap, Map translators) throws Exception
      Import from the supplied Reader in either CSV/TSV, JSON or XML format to a List of Maps.

      Search for "testData" in SmartClient Reference for the description of supported formats and examples. Same rules apply here.

      This API essentially performs part of the steps of importToDataSource() but does not actually insert records into a DataSource, instead just returning a List of Maps which could then be inserted into a DataSource, validated further, serialized to XML or otherwise processed.

      Input format and delimiter are specified in the constructor or via setInputType(). CSV/TSV delimited data is expected to have a leading line providing column names, which will be used as keys for the Maps returned.

      An optional columnRemap can be provided to translate column names in CSV/TSV, or property names in JSON, to a different set of column names to be used in the output. Use null as the value of a key in this Map to indicate that data for the column should be discarded. The matching process against the keys of the columnRemap is case-insensitive.
      For signatures that take "List columns", this is the same as providing a columnRemap that does not rename any input fields.

      An optional set of translators can be provided to translate to desired Java types. The translators Map should map from column name (as it appears in the CSV/TSV file, not the remapped column name) to the fully qualified name of a specific inner class that performs the translation (for example, "com.isomorphic.tools.DataImport.ParseText"). The named class provides a translate method that takes any Java Object and produces a Java Object as output.

      If the input type is set to autoDetect, importToRows() will attempt to autodetect the input format. In addition, for delimited input, if the delimiter is unspecified, importToRows() will attempt to detect whether the delimiter is a comma (",") or tab ("\t"), and will throw an exception if auto-detection fails.

      Example usage for CSV parsing (which is the default input type and delimiter):

       InputStream is = ServletTools.loadWebRootFile("files/portfolios.csv");
       Reader reader = new InputStreamReader(is);
       List portfolios = new DataImport().importToRows(reader);
       
      Parameters:
      in - a character-stream reader set to read the data input stream
      columnRemap - optional mapping from input column/property names to output Map keys.
      translators - optional list of translators
      Returns:
      imported rows as a List of Maps
      Throws:
      Exception
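For example (the column names and values here are hypothetical), a translators Map keyed by the input column name can force a "price" column through DataImport.ParseFloat; the cast to Map disambiguates the null columnRemap from the List-based overload:

```java
import com.isomorphic.tools.DataImport;

import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TranslatorSketch {
    public static void main(String[] args) throws Exception {
        Reader in = new StringReader("name,price\nWidget,19.99\n");

        // Keys are column names as they appear in the input file (not the
        // remapped names); values are fully qualified translator class names.
        Map<String, String> translators = new HashMap<>();
        translators.put("price", "com.isomorphic.tools.DataImport.ParseFloat");

        List rows = new DataImport().importToRows(in, (Map) null, translators);
    }
}
```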