Friday, October 22, 2010

SSIS DB2 and Stored Procedures

The Scenario

Extracting data from a DB2 file using SSIS when the data is accessed through a stored procedure.

With DTS packages this could be done by using a ODBC driver to connect to the DB2.

The Problems

With DTS packages there was an option to choose Other Driver. From this I could set up an ODBC connection to a DB2 database and execute a stored procedure – this acted as my data source. This ODBC connection used a system DSN which pointed at eh db2 database on an AS400 server.


SSIS does not give you the option of using other driver. What they have instead are OLE DB drivers or .Net drivers.


The OLE DB drivers will not allow the execution of a stored proc on the DB2 to retrieve data.

The IBM OLE DB DB2 provider and Microsoft OLE DB DB2 provider do not support using an SQL command that calls a stored procedure. When this kind of command is used, the OLE DB source cannot create the column metadata and, as a result, the data flow components that follow the OLE DB source in the data flow have no column data available and the execution of the data flow fails..

See: http://msdn.microsoft.com/en-us/library/ms141696.aspx

The Dot.Net connection, in SSIS 2008, will allow you to execute the procedure in an Execute Script Task. It will not allow execution in a data flow task – although it will allow you to preview the data in data flow.


After setting up a Linked Server to the DB2 I encountered the same problem – although the error message was slightly misleading and I spent a long time investigating permissions.


So the only options now are to carry out a row by row import by using an Execute TSQL Task to populate a record set object and then rolling through this to import the data or asking the DB2 DBA to use the stored procedures to populate file which I can then import from.


My forum question on MSDN relating to the issue:
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/7ea40a98-c930-48fc-a961-2fdb977de0d3/#e2edc306-f0f4-45bb-9844-53cf83fb845a


Connect issue raised with Microsoft:
https://connect.microsoft.com/SQLServer/feedback/details/615901/ssis-db2-and-stored-procedures
Please follow the link and vote if the issue effects you and maybe Microsoft will bring out a new driver.

Thursday, October 21, 2010

Importing Large Amounts of Data

The Problem


I have a system which imports large amounts of data over night. This will normally be in the low millions. If it fails for a few nights it can then be tens of millions. The problems is space – I don't have enough drive space to allow the transaction log to grow large enough to accommodate more than two million rows of data.


The Solution


Create a variable called @loopRowCounter.

Create Another called @loopCount

Use an execute TSQL task to populate that variable with the number of rows in the data source.

You can do this by:

Under General

SQL command should start like this - SELECT count(*) as loopRowCount FROM …

Set the ResultSet to single row

Under Result Set

Set result name to loopRowCount and set it to match your variable - User::loopRowCounter.

Now you have a variable with the number of rows in.

Move on to a For Loop.

In InitExpressions enter this @loopCount = 2000002 – this is to set a default value

In EvalExpression enter this @loopCount > 0

In asignExpression enter @loopCount = @loopRowCounter – inside the container @loopRowCounter will decremented.

Now use a data flow and in the Data source use Select Top (500000) …

You will need to do a check that the rows in the source are not in the destination – I do a join on the table and compare ids

This will limit the flow of data to 500000 rows.

Next got to an Execute SQL Task

Under Parameter Mapping

Select the variable Name User::loopRowCounter give the parameter a name and set it to type Long with a size of 4

Under General

SQL command should start like this:

checkpoint;

SELECT ? - 500000 as loopRowCount

Set the ResultSet to single row

Under Result Set

Set result name to loopRowCount and set it to match your variable - User::loopRowCounter.

We know how may rows were in the table and we assigned that value to User::loopRowCounter. We know we have just move 500000 out so we subtract 500000 from the parameter which represents User::loopRowCounter and the result is then assigned back to User::loopRowCounter.

Now, no matter how many millions of records go through my log file only has to cope with 500000 at a time.

Sunday, October 3, 2010

Rename a SQL Server

Working with VMs a server can be cloned with SQL Server installed. The server is renamed but SQL server maintains its original name.


Run this script in SSMS:

sp_dropserver 'old server name'

sp_addserver 'new server name', 'local'


Restart the SQL Server service then test the rename worked:

Select @@servername;


Make sure the SQL Agent is running – it will stop when the sql service stops. If you restart sql server service through the SQL configuration control panel it should restart the sql agent for you.