donderdag 25 april 2013

SSIS : Using Foreach Loop container building an initial load package


Suppose you have a challenge at a customer.You have to load a staging area with data from multiple sources like a couple of files and some tables from another database. All these data is snapshot data of different source systems. The source systems delivers every month a file with all of the customers that are currently active. Another challenge is that not all source systems delivers files on the same frequency. Some of the information is delivered once a couple of months. You could create a SSIS package for every month but You could  build a smarter solution.

The Problem

Suppose I've this situation at hand: There are multiple sources where the information is available. These sourcesystems delivers files on a (in)frequent base. For this purpose I've created a meta data table that controls the loading of the information.

Suppose, I've to process 4 months of data and I've 3 sourcesystems that delivers files:

  • Source A.
  • Source B.
  • Source C

Initial setup

For this Lab I've created a table example. In this table are five columns available:

  • Year.
  • Month.
  • Source_A.
  • Source_B.
  • Source_C.
Below the script with creating the table and loading the table with data:

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[InitialLoad]')
AND type in (N'U'))
DROP TABLE [dbo].[InitialLoad]

CREATE TABLE [dbo].[InitialLoad](
 [Year] [varchar](50) NOT NULL,
 [Month] [varchar](50) NOT NULL,
 [Source_A] [varchar](50) NULL,
 [Source_B] [varchar](50) NULL,
 [Source_C] [varchar](50) NULL

INSERT INTO [InitialLoad] ([Year], [Month], [Source_A], [Source_B], [Source_C] )
VALUES ('2013', '1', 'File_SourceA_20130101', 'File_SourceB_20130101' , NULL),
('2013', '1', 'File_SourceA_20130201', 'File_SourceB_20130201' , NULL),
('2013', '1', 'File_SourceA_20130301', 'File_SourceB_20130301' , 'File_SourceC_20130301'),
('2013', '1', 'File_SourceA_20130401', 'File_SourceB_20130401' , NULL)

SELECT * FROM [InitialLoad]

The SELECT statement results in the following table:

Building the package

First, let me show you the package we are going to build in this blogpost. The package has the following interesting parts:
  • Variables. There are four variables present, one Object variable (ObjectsList) and three String variables (varSourceA,varSourceB, varSourceC).
  • Execute SQL Task. In this step we query the InitialLoad table and fill the ObjectsList with the resultset of the Query.
  • ForEach container. In the Foreach loop container we loop through the ObjectsList Object and fill the appropriate variable with the value of the ObjectsList variable.
  • Scripts tasks. These tasks are primarly created for debugging purposes.

Now let's walktrough the steps creating this SSIS package.

1. Create a SSIS package
2. Create the four variables ObjectsList, varSourceA, varSourceB and varSourceC. The SSIS Datatypes are important and choose System.Object for the ObjectsList variable and String for the others.
3. Create a connection to the database (in my case TEST).
4. Drag an Execute SQL Task to the package and rename this to "List of Source Objects".
5. Drag a ForEach Loop container to the package.
6. Drag the script components to the ForEach Loop container and name them: Source A, Source B and Source C.
7. Configure the Execute SQL task by setting the right values on the General Tab

And on the Result Set tab:

8.Configure the ForEach Loop container

And enter the variable mappings.

9.Configure the Script tasks

10. Run the package

In Run 1:

Run 2

Run 3:

and in Run 4:

And Successsss......


In this blogpost I've presented a simple method for building a meta driven loading SSIS package with aid of a metadata table and SSIS components like ForEach loop.


Geen opmerkingen:

Een reactie posten