We use a different method for incremental load when it comes to the DWH. To read about that, see this guide: Incremental load method for Data Warehouses.
Incremental loading in an ODX
When you have created a data source in the ODX, you have the option of setting up Incremental Load.
There, you can add a rule with the following options.
You can set the rule up for specific schemas, tables, or fields. Most important is the column the rule looks for; in the example above, it looks for the ModifiedDateTime field across all tables. The additional actions control whether the load only handles inserts, or also handles updates and deletes.
Without updates and deletes, the load only inserts rows whose rule field is larger than the highest previously loaded value; it never checks the source to see whether already-loaded rows still exist or have changed.
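To illustrate, here is a minimal Python sketch of this insert-only behavior, assuming a rule on a ModifiedDateTime field; the table structures and data are hypothetical stand-ins, not TimeXtender's internals:

```python
from datetime import datetime

# Hypothetical source rows and previously transferred storage rows; the rule
# field is ModifiedDateTime, matching the example rule above.
source_rows = [
    {"ID": 1, "Name": "Alice", "ModifiedDateTime": datetime(2023, 1, 1)},
    {"ID": 2, "Name": "Bob", "ModifiedDateTime": datetime(2023, 3, 5)},
]
storage_rows = [
    {"ID": 1, "Name": "Alice", "ModifiedDateTime": datetime(2023, 1, 1)},
]

# Insert-only: transfer rows whose rule field exceeds the highest value from
# the previous load. Nothing is compared back against the source afterwards,
# so updates and deletes there go unnoticed.
last_loaded = max(row["ModifiedDateTime"] for row in storage_rows)
new_rows = [row for row in source_rows if row["ModifiedDateTime"] > last_loaded]
storage_rows.extend(new_rows)
print(new_rows)  # only Bob's row is transferred
```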
With updates or deletes enabled, a PK table or folder is generated, and the primary keys stored there are compared to the source to detect updated or deleted rows. (This comparison is sketched in code in the BU section below.)
If you do not enable updates in your setup, you will not be able to use updates in the DWH, and the same goes for deletes. If you do not use these options, you will need to run frequent full loads instead.
When the rule is added and you have clicked Refresh, you will see all the tables the rule applies to and the field it is applied on.
If your rule uses deletes or updates and you click OK, you may see the following menu, which shows the tables with missing primary keys and lets you choose which fields to use as keys.
When this is set, incremental load is enabled on the source, and transfer tasks with this setting will attempt to use it.
If you uncheck this, transfers will always run as full loads. How this looks in the storage is explained in the DWH incremental guide.
Incremental loading in a BU
Enabling this option requires that the source table have a reliable natural or surrogate key, and a reliable incremental field. This field must be set to a larger value when a record has changed. Usually, this will be a single field called something like "ModifiedDateTime" or "TimeStamp."
When you enable this incremental option and deploy and execute the table, TimeXtender will fully load the table and create an _I table, which records the largest incremental field value from a given load. This allows each subsequent load to only transfer records with an incremental value larger than the one recorded in the _I table, which is the set of new or updated records.
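As a rough Python sketch of this bookkeeping, assuming a single ModifiedDateTime incremental field (the structures shown are illustrative, not TimeXtender's actual schema):

```python
from datetime import datetime

# Hypothetical stand-in for the _I table: the largest incremental value
# recorded by the previous load.
i_table = {"MaxModifiedDateTime": datetime(2023, 1, 2)}

source_rows = [
    {"ID": 1, "ModifiedDateTime": datetime(2023, 1, 1)},  # unchanged, skipped
    {"ID": 2, "ModifiedDateTime": datetime(2023, 3, 5)},  # new or updated
]

# Transfer only records with an incremental value above the recorded maximum.
transferred = [r for r in source_rows
               if r["ModifiedDateTime"] > i_table["MaxModifiedDateTime"]]

# Record the new maximum for the next incremental load to start from.
if transferred:
    i_table["MaxModifiedDateTime"] = max(r["ModifiedDateTime"] for r in transferred)
print(transferred, i_table)
```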
During data cleansing, TimeXtender will compare the transferred records to the records in the valid table using the table's primary key. Existing records will be updated and new records will be inserted, in that order.
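A minimal sketch of this update-then-insert step, using an in-memory dictionary keyed by the primary key as a stand-in for the valid table (names and data are hypothetical):

```python
# Hypothetical valid table keyed by primary key, and the batch of new or
# changed records transferred by this incremental load.
valid = {1: {"ID": 1, "Name": "Alice"}, 2: {"ID": 2, "Name": "Bob"}}
incoming = [{"ID": 2, "Name": "Robert"}, {"ID": 3, "Name": "Carol"}]

# Matches on the primary key update existing records; the rest are inserted.
for row in incoming:
    if row["ID"] in valid:
        valid[row["ID"]].update(row)  # existing record: update
    else:
        valid[row["ID"]] = dict(row)  # new record: insert
print(valid)
```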
If you have enabled deletes, TimeXtender will create a _PK table when you deploy this option. This table maintains a live record of all primary key combinations present in the source and is populated after the _R table. If a record is present in the valid table but in neither the _PK table nor the _R table, that record will be flagged or removed. This step happens after the update and insert steps mentioned above.
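The delete step can be sketched the same way; the flag name below is hypothetical, and whether a record is flagged or physically removed depends on your setup:

```python
# Hypothetical valid table, the set of primary keys still present in the
# source (the _PK table), and the keys in the current raw (_R) batch.
valid = {1: {"ID": 1}, 2: {"ID": 2}, 3: {"ID": 3}}
pk_table = {1, 2}     # ID 3 no longer exists in the source
r_table_keys = set()  # no keys in the current batch

# Records present in the valid table but in neither the _PK table nor the
# _R table are flagged as deleted (they could also be removed outright).
for key, record in valid.items():
    if key not in pk_table and key not in r_table_keys:
        record["Deleted"] = True
print(valid)
```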
If you are using certain adapters, TimeXtender may create an _INCR table instead of an _I table. The _INCR table works on the same principle as an _I table, but records incremental values for each source, company, or account in the source system.
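Sketching the same idea per company, with hypothetical company names (again, not TimeXtender's actual schema):

```python
from datetime import datetime

# Hypothetical stand-in for an _INCR table: one recorded maximum per company
# instead of a single global maximum.
incr_table = {
    "CompanyA": datetime(2023, 1, 2),
    "CompanyB": datetime(2023, 2, 1),
}
source_rows = [
    {"Company": "CompanyA", "ModifiedDateTime": datetime(2023, 3, 5)},   # new for A
    {"Company": "CompanyB", "ModifiedDateTime": datetime(2023, 1, 15)},  # already loaded
]

# Each row is filtered against its own company's recorded maximum.
transferred = [r for r in source_rows
               if r["ModifiedDateTime"] > incr_table[r["Company"]]]
for r in transferred:
    incr_table[r["Company"]] = max(incr_table[r["Company"]], r["ModifiedDateTime"])
print(transferred, incr_table)
```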
After the initial load, the valid table will not be truncated unless there is a change in a data selection rule, incremental selection rule, or primary key. A full reload can also be triggered by an execution package set to do so.
You can also set incremental load up using the automated Set Up Incremental Load wizard. Run Add Suggested Constraints first to automatically set the primary keys based on the source.
Note that unchanged records will not be refreshed in an incremental load. That means transformations, lookups, and new fields will, by default, only be updated for new or changed records. We have listed some ways of dealing with this limitation here. Alternatively, you can apply the Keep field values up-to-date setting.