MSBI # 53 – SSIS # 19 – Data Flow Task # 5 – All Transformation in Data Flow task (Introduction and Usage) {Part – IV}

Hi Folks,

Its has been very long time I have not posted on SSIS,Some of my reader really wan me to post only on SSIS really thanks to all of your comment.

Now lets continuing form my last post on basics of Data flow task in which we have covered following this parts

    We can say we will just introductory information about few transformation task in this post.
    We will surely have detailed explanation each if this task in my future post !!

Export Column Transformation

The name its self explains Export Column Transformation task is useful for exporting .

The Export Column transformation reads data in a data flow and inserts the data into a file. For example, if the data flow contains product information, such as a picture of each product, you could use the Export Column transformation to save the images to files.

We can configure the Export Column transformation in the following different ways:

  • Specify the data columns and the columns that contain the path of files to which to write the data.
  • Specify whether the data-insertion operation appends or truncates existing files.
  • Specify whether a byte-order mark (BOM) is written to the file. Export Column
  • Transformation task exports Binary Data – which means things like Images, Documents and other media – which have been stored in a relational database. Also Export Column Transformation task exports them out to the file system.

The main use for this would be for extracting items stored in the database, or for placing them as files as you move them from point to point in or between data flows.

image_thumb9

Fuzzy Grouping Transformation

The Fuzzy Grouping transformation performs data cleaning tasks by identifying rows of data that are likely to be duplicates and selecting a canonical row of data to use in standardizing the data.

The Fuzzy Grouping transformation  Finds close or exact matches between multiple rows in the data source and also adds columns to the output including the values and similarity scores.

image_thumb10

Fuzzy Lookup Transformation

The Fuzzy Lookup transformation differs from the Lookup transformation in its use of fuzzy matching.

A Fuzzy Lookup transformation frequently follows a Lookup transformation in a package data flow. First, the Lookup transformation tries to find an exact match. If it fails, the Fuzzy Lookup transformation provides close matches from the reference table.

The Lookup transformation uses an equi-join to locate matching records in the reference table. It returns either an exact match or nothing from the reference table. In contrast, the Fuzzy Lookup transformation uses fuzzy matching to return one or more close matches from the reference table.

How it works : A Fuzzy Lookup transformation needs access to a reference data source that contains the values that are used to clean and extend the input data. The reference data source must be a table in a SQL Server 2000 or later database. The match between the value in an input column and the value in the reference table can be an exact match or a fuzzy match. However, the transformation requires at least one column match to be configured for fuzzy matching. If you want to use only exact matching, use the Lookup transformation instead.

This transformation has one input and one output.

image_thumb1111

Import Column Transformation

This task is an extension of Export Column transformation

The Import Column transformation reads data from files and adds the data to columns in a data flow. Using this transformation, a package can add text and images stored in separate files to a data flow.

Lets also try this explanation : Import column reads data from the file name given in the column of the input dataset. It reads the file content from the given file and add the data along with the data flow as a column data for each row.

The Import and Export Wizard protects you from the complexity of SSIS while allowing you to move data between any of these data sources:

  • SQL Server databases
  • Flat files
  • Microsoft Access databases
  • Microsoft Excel worksheets
  • Other OLE DB providers image_thumb12

lookup Transformation

This is most important task while performing any lookup or checking Data Consistency

The Lookup transformation performs lookups by joining data in input columns with columns in a reference dataset. You use the lookup to access additional information in a related table that is based on values in common columns.

The Lookup transformation supports the following database providers for the OLE DB connection manager:

  • SQL Server
  • Oracle
  • image_thumb13

Merge Transformation

    The Merge transformation combines two sorted datasets into a single dataset. The rows from each dataset are inserted into the output based on values in their key columns.

    By including the Merge transformation in a data flow, we can perform the following tasks:

  • Merge data from two data sources, such as tables and files.
  • Create complex datasets by nesting Merge transformations
  • Remerge rows after correcting errors in the data.

The Merge transformation is similar to the Union All transformations. Use the Union All transformation instead of the Merge transformation in the following situations:

  • The transformation inputs are not sorted
  • The combined output does not need to be sorted.
  • The transformation has more than two inputs.
  • image_thumb14

    Merge Join Transformation

    The Merge Join transformation provides an output that is generated by joining two sorted datasets using a FULL, LEFT, or INNER join. For example, you can use a LEFT join to join a table that includes product information with a table that lists the country/region in which a product was manufactured. The result is a table that lists all products and their country/region of origin. For more information, see Using Joins.

    You can configure the Merge Join transformation in the following ways:

  • Specify the join is a FULL, LEFT, or INNER join.
  • Specify the columns the join uses.
  • Specify whether the transformation handles null values as equal to other nulls.

image_thumb15Multicast Transformation

Multicast Transformation helps us for finding data duplication process by getting a single input dataset and sends each row to all the output pipeline. Yes, Multicast task has single input and multiple output pipeline.

    Using the Multicast transformation, a package can create logical copies of data. This capability is useful when the package needs to apply multiple sets of transformations to the same data.

We can configure the Multicast transformation by adding outputs.

The difference between the two is that the Multicast transformation directs every row to every output, and the Conditional Split directs a row to a single output. For more information, see Conditional Split Transformation.image_thumb16

We will surely have post on each task sooner !!

I am preparing for that one for you all !!

Hope this helps !!

  • Hope you have understood basic aspect of Few transformation task and ready to use every aspects for same

    If you really like reading my blog and understood at lest few thing then please don’t forget to subscribe my blog

    If you wan daily link and analysis or interesting link go to following website which will give @ your inbox please subscribe our following link resource blog

    Where todays links are

    Link Resource Website>> Dactylonomy of Web Resource

6 thoughts on “MSBI # 53 – SSIS # 19 – Data Flow Task # 5 – All Transformation in Data Flow task (Introduction and Usage) {Part – IV}

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s