DTS Custom Transformation Example: Format Names

DTS Programming

DTS Programming

DTS Custom Transformation Example: Format Names

The following code example in Microsoft® Visual C++® implements a custom transformation that merges two source columns that are presumed to be a first and last name. It formats them LastName, FirstName and copies the combined name to a destination column.

The source and destination columns must be string types, but they can be ANSI or wide character strings. If some columns are ANSI and others are wide character, the source columns are converted to the character width of the destination column as they are copied. The destination column is set to NULL if both source columns are NULL.

The transformation verifies there are exactly two source and one destination column, and that they are string types. It validates that if both source columns can contain NULLs, either the destination can contain NULL or the DTSTransformFlag_AllowNullChange transform flag has been set. It also verifies either that the destination column is large enough to hold any name that will fit in the source columns or that the DTSTransformFlag_AllowStringTruncation transform flag has been set.

Implementing the Format Names Example

To implement this example, use the Active Template Library (ATL) custom transformation template to create the transformation framework. Name the component DTSStrings and the transformation class FormatName. For more information, see Building a Custom Transformation from the ATL Custom Transformation Template.

Add Custom Properties

After creating the transformation framework, you need to add two properties to the transformation.

Property name Property type Description
FirstNameColumn BSTR The name of the source column that contains the first name.
RemoveTrailingSpaces VARIANT_BOOL A boolean that indicates whether trailing spaces are to be trimmed from the first and last names.

To add properties in Visual C++

  1. On the ClassView tab of the Workspace window, right-click the IFormatName interface, and then click Add Property.

  2. In the Property Name box, enter a name, and then in the Property Type list, select or enter the type of property you want to add. No parameters are needed.

  3. Select the Get Function and Put Function check boxes, and then click PropPut.
Add Custom Code

Add the following code segments to the framework:

  • Initializations in the CFormatName class constructor

  • Declarations of module level variables in the CFormatName class

  • Overloaded function RemoveTrailingSpace

  • CFormatName::PreValidateSchema method

  • CFormatName::ValidateSchema method

  • CFormatName::AddVariable method

  • CFormatName::GetTransformServerInfo method

  • CFormatName::ProcessPhase method

  • Property get_ and put_ functions

  • Error code definitions
Initializations in CFormatName Constructor

This code provides initial values for the transformation properties.

Immediately after the following lines in FormatName.h:

public:
    CFormatName()
    {

Add these code lines:

        // Initialize the properties
        m_bstrFirstNameColumn.m_str = NULL;    
        m_vbRemoveTrailingSpaces = FALSE;
Declaration of CFormatName Variables

These are the declarations of the internal storage for the properties and other module level variables.

Adding CFormatName Declarations

Immediately after these lines in FormatName.h:

    STDMETHOD(SetExecuteThreadComplete)(THIS)
        {
            return NOERROR;
        }

add these lines of code for the private section :

private:
    // Local variables.
    LONG                m_lFirstNameOrd;        // Ordinal of first-name column, 0 or 1.
    BOOL                m_bNullIntoNonNull;     // True if can get Null into Non-null error.
    ULONG               m_ulSrcLength;          // Combined maximum source length.
    BOOL                m_bFirstWide;           // First name is wide chars.
    BOOL                m_bLastWide;            // Last name is wide chars.
    BOOL                m_bDestWide;            // Destination column is wide chars.
    LPBYTE              m_pstrBuffer;           // Intermediate buffer.
    IDTSErrorRecords*   m_pErrorRecords;

    // Properties
    CComBSTR            m_bstrFirstNameColumn;
    VARIANT_BOOL        m_vbRemoveTrailingSpaces;
Function RemoveTrailingSpace

This function removes trailing spaces from ANSI and wide character strings.

Adding Code for RemoveTrailingSpace

Immediately ahead of the following line near the end of FormatName.h:

#endif //__FORMATNAME_H_

add these code lines:

// Overloaded function to remove trailing spaces.
inline void RemoveTrailingSpace( LPSTR lpstr, int iLength )
{
    for( ULONG ii = iLength - 1; ii >= 0; ii-- ) 
        if( !isspace( lpstr[ ii ] ) && lpstr[ ii ] != '\0') 
            break;
    lpstr[ ii + 1 ] = '\0';
}
inline void RemoveTrailingSpace( LPWSTR lpwstr, int iLength )
{
    for( ULONG ii = iLength - 1; ii >= 0; ii-- ) 
        if( !isspace( lpwstr[ ii ] ) && lpwstr[ ii ] != L'\0') 
            break;
    lpwstr[ ii + 1 ] = L'\0';
}
PreValidateSchema

The code for PreValidateSchema checks that there are two source columns and one destination column, and checks that each column is an ANSI or wide character string type. Validation of the FirstNameColumn property and the transform flags is deferred to the ValidateSchema method.

Adding PreValidateSchema Code

Immediately ahead of the

    return NOERROR;

 statement in CFormatName::PreValidateSchema in file FormatName.cpp, add these code lines:

    // Validate the count of source and destination columns.
    if (pDestMetadata->cColumns != 1)
        return DTSStrings_E_NumDestCols;

    if (pSrcMetadata->cColumns != 2)
        return DTSStrings_E_NumSourceCols;

    // Validate that the destination column is ANSI or Unicode. Remove BYREF flag.
    const DBCOLUMNINFO* pDestDBColumnInfo = &(pDestMetadata->rgDBColumnInfo[0]);
    WORD                wDestType =         (pDestDBColumnInfo->wType & (~DBTYPE_BYREF));

    if( wDestType != DBTYPE_STR && wDestType != DBTYPE_WSTR )
        return DTSStrings_E_OnlyStringCols;

    // Validate that the source columns are ANSI or Unicode. They can't have BYREF flag.
    for (UINT i = 0; i < pSrcMetadata->cColumns; i++)
    {
        const DBCOLUMNINFO* pSrcDBColumnInfo =  &(pSrcMetadata->rgDBColumnInfo[i]);
        WORD                wSourceType =       (pSrcDBColumnInfo->wType);

        if( wSourceType != DBTYPE_STR && wSourceType != DBTYPE_WSTR )
            return DTSStrings_E_OnlyStringCols;
    }
ValidateSchema

The code for ValidateSchema duplicates the logic of PreValidateSchema. In addition, it does the following:

  • Checks to see whether a NULL value might be moved to the destination column where NULLs are not allowed and DTSTransformFlag_AllowNullChange is not set.

  • Checks to see whether the combined source columns plus the separator exceed the destination column width, and DTSTransformFlag_AllowStringTruncation is not set.

  • Verifies that the FirstNameColumn property contains the name of one of the source columns.
Adding ValidateSchema Code

Immediately ahead of the

    return NOERROR; 

statement in CFormatName::ValidateSchema in file FormatName.cpp, add these code lines:

    // Mark that first name column unknown.
    m_lFirstNameOrd = -1;
    
    // Validate the count of source and destination columns.
    if (pDestColumnInfo->cColumns != 1)
        return DTSStrings_E_NumDestCols;

    if (pSrcColumnInfo->cColumns != 2)
        return DTSStrings_E_NumSourceCols;

    // The pointers to the binding structures and the data area are Null in ValidateSchema.
    const DBCOLUMNINFO* pDestDBColumnInfo   = pDestColumnInfo->rgColumnData[0].pDBColumnInfo;
    WORD                wDestType           = (pDestDBColumnInfo->wType & (~DBTYPE_BYREF));
    ULONG               ulDestLength        = pDestDBColumnInfo->ulColumnSize;
    BOOL                bDestNullable       = ( ( pDestDBColumnInfo->dwFlags &
                                                  ( DBCOLUMNFLAGS_ISNULLABLE | DBCOLUMNFLAGS_MAYBENULL ) ) != 0 );
    BOOL                bSrcNullable        = TRUE;

    // This is length of ", ".
    m_ulSrcLength = 2;

    // Validate the destination column is ANSI or Unicode.
    if( wDestType != DBTYPE_STR && wDestType != DBTYPE_WSTR )
        return DTSStrings_E_OnlyStringCols;

    // Make sure column not a BLOB type.
    if( ulDestLength > DTS_DEFAULT_INMEMORY_BLOB_SIZE )
        return DTSStrings_E_NoBLOBCols;

    // Validate that the source columns are ANSI or Unicode. They can't have BYREF flag.
    for (UINT i = 0; i < pSrcColumnInfo->cColumns; i++)
    {
        const DBCOLUMNINFO* pSrcDBColumnInfo = pSrcColumnInfo->rgColumnData[i].pDBColumnInfo;
        WORD                wSrcType = (pSrcDBColumnInfo->wType);
        LPCOLESTR           pwzColName = (pSrcDBColumnInfo->pwszName);

        // Accumulate total of source column widths.
        m_ulSrcLength += pSrcDBColumnInfo->ulColumnSize;

        // Accumulate nullability of combined source columns. All columns must be nullable.
        bSrcNullable &= ( ( pSrcDBColumnInfo->dwFlags &
                            ( DBCOLUMNFLAGS_ISNULLABLE | DBCOLUMNFLAGS_MAYBENULL ) ) != 0 );

        // Save index if this is specified first name column.
#if defined(_WIN32) && !defined(OLE2ANSI)
        if( !wcscmp( pwzColName, (m_bstrFirstNameColumn.m_str) ) )
#else
        if( !strcmp( pwzColName, (m_bstrFirstNameColumn.m_str) ) )
#endif
            m_lFirstNameOrd = i;

        if( wSrcType != DBTYPE_STR && wSrcType != DBTYPE_WSTR )
            return DTSStrings_E_OnlyStringCols;

        // Make sure column not a BLOB type.
        if( m_ulSrcLength > DTS_DEFAULT_INMEMORY_BLOB_SIZE )
            return DTSStrings_E_NoBLOBCols;
    }

    // Error if first name column is not found.
    if( m_lFirstNameOrd < 0 )
        return DTSStrings_E_NoFirstNameCol;

    // Error if combined source cols might overflow destination, unless string truncation is allowed.
    if( m_ulSrcLength > ulDestLength )            
        if( !( eTransformFlags & DTSTransformFlag_AllowStringTruncation ) )
            return DTSStrings_E_StrTruncPossible;

    // Error if combined source cols are nullable while destination is not nullable, unless null change is allowed.
    if( m_bNullIntoNonNull = ( bSrcNullable && !bDestNullable ) )            
        if( !( eTransformFlags & DTSTransformFlag_AllowNullChange ) )
            return DTSStrings_E_NullNotNullPossible;
AddVariable

The code for AddVariable looks for the DTSErrorRecords variable and uses it to obtain a pointer, via QueryInterface, to the IDTSErrorRecords interface. Other variables are ignored.

Adding AddVariable Code

Immediately ahead of the

    return NOERROR;

statement in CFormatName::AddVariable in file FormatName.cpp,

add these code lines:

    // Dig out the error variable if this is it. Others, ignore.
    HRESULT hr = NOERROR;
#if defined(_WIN32) && !defined(OLE2ANSI)
    if (!wcscmp(wzDTSErrorRecords, pwzName)) {
#else
    if (!strcmp(wzDTSErrorRecords, pwzName)) {
#endif
        //m_pErrorRecords->Clear();
        if (!V_DISPATCH(&Variable))
            return E_POINTER;
        hr = V_DISPATCH(&Variable)->QueryInterface(IID_IDTSErrorRecords, (LPVOID *)&m_pErrorRecords);
        if( hr == NOERROR )
            m_pErrorRecords->Clear();
    }
GetTransformServerInfo

The code for GetTransformServerInfo returns a help string and specifies the phases the custom transformation supports. In addition to DTSTransformPhase_Transform, this transformation uses DTSTransformPhase_PreSourceData and DTSTransformPhase_PostSourceData.

Adding GetTransformServerInfo Code

Replace the body of CFormatName::GetTransformServerInfo in file FormatName.cpp with the following code:

    BSTR bstrHelp = _bstr_t("Format source column names as Last, First.");

    // If help string pointer is valid, define help string.
    if (pbstrHelpString) 
        *pbstrHelpString = bstrHelp;

    // If supported phases pointer is valid, define supported phases.
    if (peSupportedPhases) 
        *peSupportedPhases = DTSTransformPhase_Transform +
                            DTSTransformPhase_PreSourceData +
                            DTSTransformPhase_PostSourceData;
    return NOERROR;
ProcessPhase

The code for ProcessPhase supports the phases DTSTransformPhase_PreSourceData, DTSTransformPhase_PostSourceData and DTSTransformPhase_Transform.

In the DTSTransformPhase_PreSourceData phase, ProcessPhase allocates an intermediate buffer and determines which columns are wide character strings. It uses DTSTransformPhase_PostSourceData to release the buffer.

In the DTSTransformPhase_Transform phase, ProcessPhase first copies the last name to the intermediate buffer, converting character width and truncating spaces, if necessary. It appends a comma and space, although only if neither last name nor first name is NULL. It appends the first name to the intermediate buffer, converting character width and truncating spaces, if necessary. It then moves the intermediate buffer to the destination, only copying the part of the string that will fit in the destination buffer. It sets the destination to NULL only if both first and last name are NULL.

Adding ProcessPhase Code

Immediately ahead of the

    return NOERROR;

 statement in CFormatName::ProcessPhase in file FormatName.cpp,

add these code lines:

    DTSColumnData*      pDTSDestColumnData;
    const DBBINDING*    pDBDestBinding;

    DTSColumnData*      pDTSSourceColumnData;
    const DBBINDING*    pDBSourceBinding;

    // Process the current transform phase.
    switch( pPhaseInfo ? pPhaseInfo->eCurrentPhase : DTSTransformPhase_Transform )
    {
    // Delete intermediate buffer.
    case DTSTransformPhase_PostSourceData:

        delete( m_pstrBuffer );

        break;

    // Allocate intermediate buffer and gather column widths.
    case DTSTransformPhase_PreSourceData:

        // Get destination char width.
        pDTSDestColumnData      = &( pDestColumnInfo->rgColumnData[0] );
        pDBDestBinding          = pDTSDestColumnData->pDBBinding;
        m_bDestWide             = ( pDBDestBinding->wType == DBTYPE_WSTR );

        // Get last name char width.
        pDTSSourceColumnData    = &( pSrcColumnInfo->rgColumnData[ 1 - m_lFirstNameOrd ] );
        pDBSourceBinding        = pDTSSourceColumnData->pDBBinding;
        m_bLastWide             = ( ( pDBSourceBinding->wType & ~DBTYPE_BYREF ) == DBTYPE_WSTR );

        // Get first name char width.
        pDTSSourceColumnData    = &( pSrcColumnInfo->rgColumnData[ m_lFirstNameOrd ] );
        pDBSourceBinding        = pDTSSourceColumnData->pDBBinding;
        m_bFirstWide            = ( ( pDBSourceBinding->wType & ~DBTYPE_BYREF ) == DBTYPE_WSTR );

        // Allocate intermediate buffer w/ space for null term.
        m_pstrBuffer            = (LPBYTE)new char[ ( m_ulSrcLength + 1 ) * ( m_bDestWide ? 2 : 1 ) ];

        break;

    case DTSTransformPhase_Transform:

        USES_CONVERSION;
        
        // Get last name info structures.
        pDTSSourceColumnData    = &( pSrcColumnInfo->rgColumnData[ 1 - m_lFirstNameOrd ] );
        pDBSourceBinding        = pDTSSourceColumnData->pDBBinding;

        // Get last name status, data length, and data ptr.
        ULONG   ulLastNStatus   = *(ULONG *)( pDTSSourceColumnData->pvData + pDBSourceBinding->obStatus );
        ULONG   ulSourceLength  = *(ULONG *)( pDTSSourceColumnData->pvData + pDBSourceBinding->obLength );
        LPBYTE  pSourceString   = ( pDBSourceBinding->wType & DBTYPE_BYREF ?
                                    *(LPBYTE *)(pDTSSourceColumnData->pvData + pDBSourceBinding->obValue) :
                                    (LPBYTE)(pDTSSourceColumnData->pvData + pDBSourceBinding->obValue) );

        // If last name not null, move to interm buffer.
        if( ulLastNStatus != DBSTATUS_S_ISNULL )
            if( m_bDestWide )
            {
                // Move to buffer, and convert to wide if necessary.
                if( m_bLastWide )
                    wcscpy( (LPWSTR)m_pstrBuffer, (LPCWSTR)pSourceString );
                else
                    wcscpy( (LPWSTR)m_pstrBuffer, (LPCWSTR)A2W( (LPCSTR)pSourceString ) );

                // Remove trailing spaces, if specified.
                if( m_vbRemoveTrailingSpaces )
                    RemoveTrailingSpace( (LPWSTR)m_pstrBuffer, 
                                         ( m_bLastWide ? ulSourceLength / 2 : ulSourceLength ) );
            }
            else
            {
                // Move to buffer, and convert to ANSI if necessary.
                if( m_bLastWide )
                    strcpy( (LPSTR)m_pstrBuffer, (LPCSTR)W2A( (LPCWSTR)pSourceString ) );
                else
                    strcpy( (LPSTR)m_pstrBuffer, (LPCSTR)pSourceString );

                // Remove trailing spaces, if specified.
                if( m_vbRemoveTrailingSpaces )
                    RemoveTrailingSpace( (LPSTR)m_pstrBuffer, 
                                         ( m_bLastWide ? ulSourceLength / 2 : ulSourceLength ) );
            }

        // Otherwise put a null terminator. Works for both ANSI and wide.
        else
            *(LPWSTR)m_pstrBuffer = L'\0';
        
        // Get first name length, status and data ptr.
        pDTSSourceColumnData    = &( pSrcColumnInfo->rgColumnData[ m_lFirstNameOrd ] );
        pDBSourceBinding        = pDTSSourceColumnData->pDBBinding;
        ulSourceLength          = *(ULONG *)( pDTSSourceColumnData->pvData + pDBSourceBinding->obLength );
        pSourceString           = ( pDBSourceBinding->wType & DBTYPE_BYREF ?
                                    *(LPBYTE *)(pDTSSourceColumnData->pvData + pDBSourceBinding->obValue) :
                                    (LPBYTE)(pDTSSourceColumnData->pvData + pDBSourceBinding->obValue) );

        ULONG   ulFirstNStatus  = *(ULONG *)( pDTSSourceColumnData->pvData + pDBSourceBinding->obStatus );

        // If first name not null, append it to interm buffer.
        if( ulFirstNStatus != DBSTATUS_S_ISNULL )
        {
            // If neither name null, append ", ".
            if( ulLastNStatus != DBSTATUS_S_ISNULL )
                if( m_bDestWide )
                    wcscat( (LPWSTR)m_pstrBuffer, L", " );
                else
                    strcat( (LPSTR)m_pstrBuffer, ", " );

            // Now append the first name.        
            if( m_bDestWide )
            {
                // Move to buffer, and convert to wide if necessary.
                if( m_bFirstWide )
                    wcscat( (LPWSTR)m_pstrBuffer, (LPCWSTR)pSourceString );
                else
                    wcscat( (LPWSTR)m_pstrBuffer, (LPCWSTR)A2W( (LPCSTR)pSourceString ) );

                // Remove trailing spaces, if specified.
                if( m_vbRemoveTrailingSpaces )
                    RemoveTrailingSpace( (LPWSTR)m_pstrBuffer, wcslen( (LPWSTR)m_pstrBuffer ) );
            }
            else
            {
                // Move to buffer, and convert to ANSI if necessary.
                if( m_bFirstWide )
                    strcat( (LPSTR)m_pstrBuffer, (LPCSTR)W2A( (LPCWSTR)pSourceString ) );
                else
                    strcat( (LPSTR)m_pstrBuffer, (LPCSTR)pSourceString );

                // Remove trailing spaces, if specified.
                if( m_vbRemoveTrailingSpaces )
                    RemoveTrailingSpace( (LPSTR)m_pstrBuffer, strlen( (LPSTR)m_pstrBuffer ) );
            }
        }

        // Destination information structures.        
        pDTSDestColumnData      = &( pDestColumnInfo->rgColumnData[0] );
        pDBDestBinding          = pDTSDestColumnData->pDBBinding;
        
        // Destination buffer size, ptrs to data, length, and status.
        ULONG   ulDestLength    = pDBDestBinding->cbMaxLen;
        LPBYTE  pDestString     = (LPBYTE)(pDTSDestColumnData->pvData + pDBDestBinding->obValue);
        ULONG*  ulLength        = (ULONG *)( pDTSDestColumnData->pvData + pDBDestBinding->obLength );
        ULONG*  ulStatus        = (ULONG *)( pDTSDestColumnData->pvData + pDBDestBinding->obStatus );
        
        // If both first and last name Null, set destination to Null.
        if( ulLastNStatus == DBSTATUS_S_ISNULL && ulFirstNStatus == DBSTATUS_S_ISNULL )
            *ulStatus = DBSTATUS_S_ISNULL;

        // Otherwise move data to destination buffer, write length and status.
        else
        {
            // Calculate length of data (in bytes) to be moved.
            *ulLength = ( m_bDestWide ?
                            min( ulDestLength - 2, wcslen( (LPCWSTR)m_pstrBuffer ) * 2 ) :
                            min( ulDestLength - 1, strlen( (LPCSTR)m_pstrBuffer ) ) );

            // Set good status and move data.
            *ulStatus = DBSTATUS_S_OK;
            memcpy( pDestString, m_pstrBuffer, *ulLength );

            // Null terminate the string.
            if( m_bDestWide )
                *(LPWSTR)( pDestString + *ulLength ) = L'\0';
            else
                *(LPSTR)( pDestString + *ulLength ) = '\0';
        }
        break;
    }                
Property get_ and put_ Functions

The property get_ functions return the property value through their pointer parameter after verifying the pointer is non-null. The property put_ functions save the property value, or a pointer to the value, in local storage.

You need to add the code that does this to the get_ and put_ functions of the FirstNameColumn and RemoveTrailingSpaces properties. In each case, replace the comment

    // TODO: Add your implementation code here

with the appropriate code. The get_ and put_ functions were added by Visual C++ when the properties were added to the project. They are located near the end of FormatName.cpp.

CFormatName::get_FirstNameColumn

Replace the comment in CFormatName::get_FirstNameColumn with the following code:

    if( !pVal )
        return E_POINTER;
    *pVal = m_bstrFirstNameColumn.Copy( );
CFormatName::put_FirstNameColumn

Replace the comment in CFormatName::put_FirstNameColumn with the following code:

    SysFreeString( m_bstrFirstNameColumn.m_str );       
    m_bstrFirstNameColumn.m_str = SysAllocString( newVal );
CFormatName::get_RemoveTrailingSpaces

Replace the comment in CFormatName::get_RemoveTrailingSpaces with the following code:

    if( !pVal )
        return E_POINTER;
    *pVal = m_vbRemoveTrailingSpaces;
CFormatName::put_RemoveTrailingSpaces

Replace the comment in CFormatName::put_RemoveTrailingSpaces with the following code:

    m_vbRemoveTrailingSpaces = newVal;
Error Code Definitions

These error codes are returned by methods in the transformation.

Adding Transformation Error Codes

Immediately following the

    import "ocidl.idl";

 statement in file FormatName.idl,

add these code lines:

    //Error codes for this custom transformation
    typedef [helpstring("Error codes generated by the DTSStrings transformations")] enum DTSStringsError {
        DTSStrings_E_NumDestCols            =   0x80041001,
        DTSStrings_E_NumSourceCols          =   0x80041002,
        DTSStrings_E_OnlyStringCols         =   0x80041003,
        DTSStrings_E_NoFirstNameCol         =   0x80041004,
        DTSStrings_E_StrTruncPossible       =   0x80041005,
        DTSStrings_E_NullNotNullPossible    =   0x80041006,
        DTSStrings_E_NoBLOBCols             =   0x80041007,
    } DTSStringsError, *LPDTSStringsError;
Building and Testing FormatName

For more information about building and testing this project, see Implementing and Testing a DTS Custom Transformation.