Data Conversion and Transformation Considerations

Data Transformation Services

Data Transformation Services

Data Conversion and Transformation Considerations

Before using Data Transformation Services (DTS) to convert or transform data between heterogeneous data and destinations, consider these variations in the way different programs, providers, and drivers support data types and SQL statements.

When using Microsoft® SQL Server™ as a data source, consider the following:

  • Transforming the real data type into the int data type may not return the exact value because SQL Server 2000 supports only six digits of precision for the real data type. For example, the real number 2147480000 may result in an int value of 2147480065.

  • When transforming a string (DBTYPE_WSTR) into a date (DBTYPE_DATE) or time (DBTYPE_TIME) column from a text file, only one date or time format (yyyy-mm-dd hh:mm:ss.fffffffff) is accepted by the OLE DB data conversion service component. Use the Date Time String transformation, or code a Microsoft ActiveX® script transform using the CDate function to transform dates correctly.

  • SQL Server 2000 does not support OLE DB types DBTYPE_DATE or DBTYPE_TIME. SQL Server 2000 only supports DBTYPE_DATETIME.

  • To access data across multiple steps, avoid using a temp table during transformations. Instead, use a global temp table or create a permanent table in tempdb.

  • Stored procedures that return rows from temp tables cannot be used as the source of a transformation. You can use stored procedures that return rows from a global temp table or table.

  • When using temporary tables in the Transform Data task, the Data Driven Query task, or the Execute SQL task in DTS Designer, be aware that you cannot use a Transact-SQL statement or stored procedure that calls a temp table as your source.

This limitation does not apply outside of DTS Designer. You can use source statements or stored procedures that access SQL Server temp tables programmatically.

DTS Import/Export Wizard and DTS Designer

When using the DTS Import/Export Wizard and DTS Designer to create packages, consider the following:

  • The DTS user interface allows sharing existing connections among tasks but the same connection cannot be used for both the source and destination of a transformation.

  • Using DTS Designer or the DTS Import/Export Wizard, it is possible to specify read-only or in-use status for some providers (for example, Microsoft Access and ODBC DSNs) that are to serve as data sources only. Click the Advanced tab in the Connection Properties dialog box, and in the Advanced Connection Properties dialog box, set the value of the mode property to 1.

  • When creating a table using the DTS Import/Export Wizard or DTS Designer, the owner of a table created at the destination is the current user (generally the dbo), regardless of who the owner is at the source. This can result in a situation where the dbo attempts to create a table at the destination and the table name already exists, thus causing the attempt to fail.

  • When defining a data-driven query using DTS Designer, a data destination must be able to support the OLE DB ICommand interface. Due to this restriction, destinations such as text files are not supported.

  • The Copy SQL Server Objects task of DTS truncates fields of type text, ntext, and image if they exceed 8388602 bytes in length. No error messages are displayed by either DTS Designer or the DTS Import/Export Wizard. These both indicate the task completed successfully.

    The only indication of failure is a log message written to a log file named <server>.<database>.log, in the Script File Directory specified on the Copy tab of the Copy SQL Server Objects Task Properties dialog box. The log message specifies the table and column, but not the row, where the truncation occurred. No error records are written to the DTS error file or to the SQL Server log.

Microsoft SNA Server

When using Microsoft SNA Server as a data source, consider the following:

  • The Microsoft OLE DB provider for AS/400 and VSAM does not support SQL statements that the DTS Import/Export Wizard uses to create or truncate a table.
Microsoft Access

When working with Access, consider the following:

  • When exporting data from SQL Server 2000 to Microsoft Access 97 or earlier, the Microsoft OLE DB Provider for Access buffers all inserts in memory and only commits them when the DTS Import/Export Wizard completes operation. As a result, you can face a low memory situation when you export large tables. However, you can resolve this issue by constructing SELECT statements that send smaller numbers of rows in multiple passes.
Microsoft Visual FoxPro

Microsoft Visual FoxPro® supports only a precision of (15,9) for numeric data types. Data exported to Visual FoxPro that exceeds this precision is truncated and rounded.

  • Visual FoxPro does not support the SELECT INTO statement.

  • The DTS Query Designer supports the Visual FoxPro INSERT VALUE statement, but not the INSERT statement using a SELECT statement.

  • The Microsoft OLE DB driver for ODBC is unable to write BLOBs to Visual FoxPro using the FoxPro ODBC driver because Visual FoxPro does not support dynamic cursors.
ODBC

When connecting to an ODBC data source, consider the following:

  • The Microsoft OLE DB Provider for ODBC requires a unique key on all destination tables with a BLOB data column when performing export operations.

  • When using the Microsoft OLE DB provider for ODBC with the SQL Server ODBC driver, all BLOB columns should be arranged after columns with other data types in a source rowset. You can use a SELECT statement to rearrange the BLOB columns to the end of the source rowset. The DTS Import/Export Wizard performs this operation automatically.

    Important  When using the Microsoft OLE DB Provider for ODBC with the SQL Server ODBC driver, attempts to preview stored procedures fail with a connection busy error. This problem does not occur if you use the Microsoft OLE DB Provider for SQL Server.

  • If a Microsoft ODBC Driver for SQL Server connection is being shared by multiple threads, the connection may fail, returning the error message "Connection is busy with results for another hstmt". In some cases, this affects packages built with the DTS Import/Export wizard. Use one of the following approaches to address this problem:
    • Set the MaxConcurrentSteps property to 1 to eliminate contending threads.

    • Create additional ODBC connections to eliminate connection sharing.

    • Use the Microsoft OLE DB Provider for SQL Server (SQLOLEDB) to connect to the database. If you need to connect to a SQL Server 6.5 database, run Instcatl.sql to enable access with the Microsoft OLE DB Provider for SQL Server.
Oracle

When using Oracle as a data source, consider the following:

  • The Microsoft ODBC and OLEDB drivers for Oracle support the Oracle 7.3 BLOB data types, not Oracle 8.0 data types. For example, BLOB, CLOB, NCLOB, and BFILE are not supported.

  • The Microsoft ODBC driver for Oracle does not support sending Unicode strings into an Oracle server. Oracle requires prefixing Unicode strings with the letter N.

  • The Microsoft ODBC driver for Oracle does not support negative scaling for the Oracle number data type.

  • The Microsoft ODBC driver for Oracle reports that an Oracle number data type without a specified precision has a size of 20 digits. When importing from Oracle (regardless of the destination), if there are more than 20 digits, you may have to manually increase the precision if the destination table does not already exist.

  • Oracle supports only one LONG (BLOB) data column in a table.

  • You cannot import or export Oracle columns that have mixed or lower case names. You also cannot transform or copy data using Oracle column names that contain spaces using the DTS Import/Export Wizard. Oracle requires case-sensitive column names to be precisely specified and quoted.

  • To perform distributed transactions between SQL Server 2000 and Oracle, you must use Oracle version 8.0.4.1 or later. For more information, see Distributed Transactions.

  • Because the Microsoft OLE DB Provider for Oracle does not support ICommandWithParameters, it cannot be used as the destination of a Data Driven Query task. When using this provider in DTS Designer, the Parameters buttons on a Transform Data task, Data Driven Query task, and Execute SQL task will be disabled.
DB2 on the IBM AS/400

When connecting to a DB2 data source, consider the following:

  • There is no Unicode or BLOB support on the AS/400 system.

  • You cannot transform any table with a NULL column value to an AS/400 server because the AS/400 does not support NULL syntax in its CREATE TABLE statement. However, you can send NULL values if you edit the CREATE TABLE syntax to remove the references to NULL. The AS/400 does not support NOT NULL; NULL is assumed if not specified.
Using the Sybase ODBC Driver

When connecting to a Sybase ODBC data source, consider the following:

  • When transforming data from SQL Server into Sybase version 11 using the DTS Import/Export Wizard:
    • The SQL Server numeric (3,0) data type maps to the Sybase smallmoney data type by default. Change this setting to avoid data loss.

    • The SQL Server numeric (18,x or 19,x) data type maps to the Sybase money data type by default. Change this setting to avoid data loss.

    • When moving data into a new Sybase table, if you click OK in the Column Mappings and Transformations dialog box, the wizard returns a "Table already exists" error message. You should ignore this message.

    • You cannot drop and re-create a Sybase table using the DTS Import/Export Wizard. You must perform this action without using a wizard.
  • The DTS Query Designer does not support the Sybase SQLAnywhere CREATE TABLE statement.

  • The DTS Import/Export Wizard can only move one table at a time to a SQLAnywhere database due to a limitation in the SQLAnywhere driver. You can overcome this limitation using DTS Designer. However, you must set the ExecuteInMainThread property of the Step object to True for each table, as the SQLAnywhere driver is not thread safe.

  • You cannot copy a table to a Sybase destination if it contains a BLOB column.

  • If you programmatically copy a table containing an image data type from Sybase, changing the default BLOB settings can result in failure.
dBase and Paradox

When connecting to dBase and Paradox data sources, consider the following:

  • Table names in dBase and Paradox are limited to eight characters. Column names in dBase are limited to 10 characters.
File Import or Export

When importing or exporting data from text files, consider the following:

  • If you import into or export from char or varchar columns, some extended characters may not be copied correctly if your client OEM code page is different from the code page on the server. When you import into or export from nchar or nvarchar columns, all characters copy correctly.

  • If you export BLOB (including SQL Server data types text and ntext) columns to a fixed length text field, the default length is set equal to the maximum BLOB field length (approximately two gigabytes). Prevent disk overflow by choosing a smaller but still adequate field length, or use a delimited format if possible.

  • The OLE DB provider for text files used in DTS cannot process columns with BLOB data greater than two megabytes (MB).
Code Pages, Collation, and Non-Unicode Data Issues

When using DTS to copy data between SQL Server databases with different code pages and collations, data may be lost or incorrectly translated.

To avoid translation issues, store international data in Unicode. Once converted to Unicode, you can easily transfer data in any collation or code page without loss or incorrect translation to any Microsoft SQL Server 2000 or Microsoft SQL Server 7.0 database.

In Microsoft SQL Server 2000, collations are associated with particular code pages and are assigned to individual columns. (Microsoft SQL Server 7.0 uses a single default code page, and does not support column-level collations). If the code page used for a source and destination column match, no data loss will occur in non-Unicode columns. When data is copied between non-Unicode columns, and the source and destination code pages do not match, loss of data can result. In some cases, DTS will perform a best fit mapping, with data loss if the source contains characters that do not occur in the destination code page. In other cases, DTS will perform a copy without any intervening translation, resulting in the loss of any data not represented by the same binary value in both code pages. Following are problems and guidelines for using the Copy SQL Server Objects task and when copying data with the Copy Column transformation using different collations or code pages.

Copy SQL Server Objects Task

The following refers to how the Copy SQL Server Objects task handles non-Unicode data:

  • When copying data from one instance of SQL Server 2000 to another instance of SQL Server 2000 there is no loss of data provided you set the UseCollation property of the Copy SQL Server Objects task.

  • When copying data from an instance of SQL Server 2000 to SQL Server 7.0, a best fit mapping is used for columns that have collations that match the database default collation code page. Data stored in a column with a different code page is interpreted as being encoded in the default code page, with attendant losses on translation.

  • When copying data from SQL Server 7.0 to an instance of SQL Server 2000, the UseCollation property is not available because SQL Server 7.0 is unable to determine which of several collations its default code page maps to. No collations are supported during Copy SQL Server Objects task execution, thus, non-Unicode destination columns will be assigned the default collation for the destination database. If the code page associated with the collation does not match that of the source database, DTS will perform a best fit mapping

  • When copying data from SQL Server 7.0 to SQL Server 7.0, if the source and destination databases use different default code pages, DTS will perform a best fit mapping.

If you want to ensure that there is no data loss when copying non-Unicode data, you can use the SQL Server bulk copy feature to export data in Unicode format, then use bulk copy or DTS to import it.

To disable the default scripting of collations, add code or use Disconnected Edit or the Dynamic Properties Task to add the value of SQLDMOScript2_70Only to the ScriptOptionEx property of the Copy SQL Server Objects Task.

Copy Column Transformation

The following refers to how the Copy Column Transformation handles non-Unicode data between different code pages:

  • If the source column is Unicode and the destination column in non-Unicode, a best fit mapping is done, and an attempt is made to translate the data between source and destination.

  • If the source column is non-Unicode and the destination column is Unicode, DTS interprets the source column as belonging to code page 1252 regardless of the actual code page used.

  • If both the source and destination columns are non-Unicode, raw data will be copied without translation, and some loss of data will occur.