We have a new community! Please visit support.timextender.com
Follow

SSIS vs ADO.Net

We provide two ways to extract data from a source table and add them to another table. We either use SSIS packages or ADO.Net.

Requirements to run ADO.Net

Most of the requirements is added here. If you have a local SQL server installation, you need SQL Express as a minimum, though you will need a Standard edition if you want to use Analysis services. Is it externally located, you will need CLR, SMO and possibly these.

Requirements to run SSIS

As before most requirements are added here. No matter if you have the SQL server locally, or on another server, it requires at least SQL Standard edition, though for all features SQL Enterprise edition. Here is a list of features and for what edition.

Is it located on another server you will need the remote SSIS execution feature as well. Discovery Hub Remote SSIS Execution package

Features of ADO.Net

  • There is no deployment needed on the transfer task
  • You can deploy without access to the data source (But not execute)
  • You can deploy and execute to Azure databases.

Features of SSIS

  • Microsoft best practice
  • It has better performance for large tables with many columns and rows (faster and with better memory management)
  • Checks the source connection / table / columns on deployment (Can not deploy without access to the source)

Speed tests

Since ADO doesn't need to deploy any transfer package, i will only focus on execution.

I have four different tests.

  1. I have a large project based on the Discovery Hub model, you can download from the cube store. It contains 492 tables with 271 tables in the Business Unit, 146 tables in the DSA and 75 tables in the MDW. Many tables run incremental load, therefore i made two execution packages. One is a full load package that fully executes Business Unit, DSA and MDW. The other does not fully execute the tables, but are otherwise similar.
  2. I have a smaller project based on a two tier architecture with a DSA/Staging table and a DWH. It contains 73 tables with 47 in the DSA and 26 in the DWH. Similar to the previous test i have two execution packages set up as DSA, DWH. One runs it full load and one doesn't.
  3. I execute a single table. It contains about 500000 rows and i only focus on the data transfer time.
  4. I execute a single table. It contains about 500000 rows and every time i execute it, i increase the amount of rows with 500000. Here i do not measure the difference in time but the average increase in time in each iteration.

I run all tests eight times in a row. the first two split between Full and Std and the rest just fully loaded.

# Project Full SSIS Minutes ADO Minutes Diff
1 Test Yes 52 0:00:52 58 0:00:58 -6
2 Test Yes 50 0:00:50 54 0:00:54 -4
3 Test Yes 50 0:00:50 56 0:00:56 -6
4 Test Yes 48 0:00:48 54 0:00:54 -6
5 Test No 44 0:00:44 46 0:00:46 -2
6 Test No 46 0:00:46 46 0:00:46 0
7 Test No 44 0:00:44 46 0:00:46 -2
8 Test No 58 0:00:58 48 0:00:48 10
Test average 49,0 0:00:49 51,0 0:00:51 -2,0
9 Discovery Hub Yes 996,0 0:16:36 650,0 0:10:50 346
10 Discovery Hub Yes 701,0 0:11:41 608,0 0:10:08 93
11 Discovery Hub Yes 785,0 0:13:05 583,0 0:09:43 202
12 Discovery Hub Yes 733,0 0:12:13 551,0 0:09:11 182
13 Discovery Hub No 372,0 0:06:12 334,0 0:05:34 38
14 Discovery Hub No 334,0 0:05:34 322,0 0:05:22 12
15 Discovery Hub No 336,0 0:05:36 296,0 0:04:56 40
16 Discovery Hub No 334,0 0:05:34 304,0 0:05:04 30
Discovery Hub average 573,9 0:09:34 456,0 0:07:36 117,9
17 Value Entry Yes 14,0 0:00:14 18,6 0:00:19 -4,6
18 Value Entry Yes 13,7 0:00:14 16,5 0:00:16 -2,8
19 Value Entry Yes 12,0 0:00:12 17,4 0:00:17 -5,4
20 Value Entry Yes 11,8 0:00:12 16,6 0:00:17 -4,8
21 Value Entry Yes 12,4 0:00:12 15,6 0:00:16 -3,2
22 Value Entry Yes 11,7 0:00:12 23,7 0:00:24 -12,0
23 Value Entry Yes 11,7 0:00:12 15,5 0:00:15 -3,8
24 Value Entry Yes 11,8 0:00:12 23,2 0:00:23 -11,4
Single table average 12,4 0:00:12 18,4 0:00:18 -6,0
25 Value Entry Yes 14,5 0:00:14 32,6 0:00:33 -18,1
26 Value Entry Yes 27,8 0:00:28 38,2 0:00:38 -10,5
27 Value Entry Yes 45,3 0:00:45 62,9 0:01:03 -17,6
28 Value Entry Yes 60,4 0:01:00 80,4 0:01:20 -20,0
29 Value Entry Yes 87,2 0:01:27 144,3 0:02:24 -57,1
30 Value Entry Yes 112,8 0:01:53 167,4 0:02:47 -54,6
31 Value Entry Yes 124,7 0:02:05 172,8 0:02:53 -48,1
32 Value Entry Yes 156,8 0:02:37 198,5 0:03:19 -41,7
Single table double lines 19,6 0:00:20 24,8 0:00:25 -5,2
Total average 311,4 0:05:11 253,5 0:04:14 57,9

So as you can see SSIS is faster executing a single table and a small project, but only by less than 5 seconds on average. ADO.net is faster at running a large project.

Both projects could be executed way faster if you used Direct Read Same server and database on DSA and MDW where Business Unit, DSA and MDW using the same database.

Was this article helpful?
1 out of 1 found this helpful

1 Comments

  • 0
    Avatar
    Luke Pargiter

    Good article.  This was helpful.   We switched to ADO.Net because we were seeing better Data Warehouse to Data Warehouse performance for large tables. It seems to be working well in testing.

    Just in case any one else runs into this, we did see one difference, that was not so obvious, on reconciling our Data Warehouse after an upgrade and making this change.   

    We noticed that one of our facts was very slightly off and after researching determined that when moving a DECIMAL(38,6) into a DECIMAL(38,4), it, appears anyway, that SSIS floors (or truncates) and ADO.NET rounds.  Granted the data types should have been the same in both databases, and would have acted the same using both methods, but this highlighted that unexpected conversion differences could happen if data type differences exist in the data flow.

Please sign in to leave a comment.