Recording Data Lineage in DTS

Data Transformation Services

Data Transformation Services

Recording Data Lineage in DTS

Design a plan before implementing and using data lineage information. Base your plan on factors such as auditing needs, the amount of lineage data that can be managed, performance considerations, and whether you need to track row- or column-level lineage, or both.

If you save a Data Transformation Services (DTS) package to Microsoft® SQL Server™ 2000 Meta Data Services, you can record and track two types of data lineage:

  • Row-level data lineage, which reveals the source of any piece of data and the transformations applied to that data. This data lineage can be tracked at the package and row levels of a table and provides a complete audit trail of data transformation and DTS package execution information. In Meta Data Services, an object associated with this type of lineage contains lineage values (package execution globally unique identifiers), the user name and server name for package execution, and the time of execution.

  • Column-level data lineage, which provides information about a package version and the database tables and columns the package uses as a source or destination (if any). You can browse packages and package versions that use a specific column as a source or destination, and check to see whether a specific column is used as a source or destination in any package saved to Meta Data Services.

Use the DTS Browser to display data lineage information. For more information, see Viewing Meta Data in DTS.

Implementing Row-Level Data Lineage

These are the steps for implementing row-level data lineage:

  1. Connect to and import meta data from the database whose tables will be used by the package as a source or destination.

    To import the meta data, use either the DTS Browser or right-click Import Meta Data on either the Meta Data Services Packages or Meta Data nodes in SQL Server Enterprise Manager.

  2. Create the table columns in your data warehouse for receiving row-level data lineage information. DTS maintains the following lineage identifiers for each package execution:
    • A short (integer) value, which is the package version checksum.

    • A long (uniqueidentifier) value, which is the globally unique identifier (GUID) of the package version.

    The short lineage value occupies less space in the destination table but is potentially less unique. Either one or both of these values need to be written to columns in your data mart or data warehouse.

  3. Select the data lineage options in DTS Designer.

    These options can be found on the Advanced tab of the DTS Package Properties dialog box. After you have turned on the data lineage options in a package, the two lineage values (GUIDs) are available when you configure a Transform Data task and Data Driven Query task.

  4. Map the data lineage source columns to destination columns in DTS Designer.

    You configure the lineage columns on the Transformations tab for these tasks and on the tabs of the Transformation Options dialog box.

  5. Save the package containing the data lineage information to Meta Data Services.

  6. Execute the package.
Implementing Column-Level Data Lineage

These are the steps for implementing column-level data lineage:

  1. Create a DTS package containing at least one transformation task.

  2. Save the package to Meta Data Services.

  3. Scan the meta data. During the save operation, click Scanning, and in the Scanning Options dialog box, click Resolve package references to scanned catalog metadata to save the meta data, and link the package meta data to a repository database.

For more information, see Importing and Saving Meta Data in DTS.