by Bruce de Freitas – July 21, 2009
As I was maintaining some old DTS packages (that run on SQL Server 2005), I noticed a pattern in them. Most of them have numerous Data Pump tasks between the exact same SOURCE and TARGET databases, but the Connection objects were repeated as in Example #1 below, with no workflow dependencies... I'm used to seeing this done more like Example #2, where you define unique connection objects just ONE time, and link multiple Data Pump tasks between them, when order of task execution does not matter. But I wasn't sure about the difference in Example #1 and Example #2...
I set up this small test moving 3 large tables via 3 Data Pump tasks in 5 DTS packages to see the performance differences. All 5 packages had the "max number of parallel tasks" set to 5, so that setting had no bearing on the results, as there were only 3 tasks per package anyway....
Bottom line is that only Example #3 actually multi-tasked (in 6m:53s), all other cases the 3 Data Pump tasks single-threaded thru (in 9m:35s). Example 1 and 2 execute virtually the same, performance-wise. If the real DTS packages, not these examples, were written with the intent that tasks would run concurrently, they aren't, they're single threading. I didn't do tests under SSIS, but I'd think it works the same with connection objects and pooling.... Moral of the story, if you want to run concurrent Data Pump tasks with the same SOURCE\TARGET connections, think about creating separate connection objects like Example #3 below (for DTS).... Some other time I'll try this under SSIS, unless someone else has tried it with SSIS and wants to share what they have found.....
Example #1 (575 secs) - This is the style I am seeing in existing DTS packages... Single SOURCE connection object and single TARGET connection object, re-used multiple times in the package. This method single threads thru each transform (one at a time).
Example #2 (575 secs) - This method uses one unique SOURCE and one unique TARGET connection object, has 3 data transforms between them. This appears to execute exactly as Example #1 does, single threading thru each transform.
Example #3 (413 secs) - This example runs the fastest because I created separate connection objects on both the SOURCE and TARGET databases. I could visually see them all clicking up concurrently only when run this way in Example #3, all other of these DTS examples, would pause and do one at a time, which makes sense as there is only one connection to pipe thru in those cases..... See the SQL Profiler trace and the DTS Log summary at the bottom of this article... Note in the trace, for Example#3, that the "SELECT *" generated by the Data Pump, is not waiting for completion of each task, they are all concurrently running!
These last 2 examples were to see how it worked with EITHER of the SOURCE or TARGET connection objects unique, but not both, like Example #3.
Example #4 (568 secs) - Used unique SOURCE connection objects but shared the same TARGET connection object.
Example #5 (571 secs) - Used unique TARGET connection objects but shared the same SOURCE connection object.
DTS Log Output from all 5 Example Packages
SQL Profiler Output from all 5 Example Packages






