Note: There is an updated version of this article, which applies to the next generation of TimeXtender.
TimeXtender began supporting Azure SQL Database with the release of version 17.5. Azure SQL Single Database is a fully managed, database-as-a-service in the Microsoft Azure Cloud. This service is always running the latest SQL Engine, never require patches or updating and can be deployed in minutes. When configured using the "Serverless" Compute Tier, the database can even scale and pause automatically based on workload. It is an ideal solution for most cloud-based data warehouses utilizing fewer than billion row tables and requiring less than several terabytes of storage.
If you prefer to watch a video tutorial of the process to use Azure SQL DB for Project Repository, please check out our TimeXtender Tuesday episode, Configure TimeXtender Environment in Azure App Server.
If you prefer to watch a video tutorial of the process to use Azure SQL DB for DW Storage, please check out our TimeXtender Tuesday episode, Use SQL DB Serverless for DW Storage.
Important: Azure SQL Database does not support cross-database queries, so the typical architecture of having separate Staging and Data Warehouse databases is not ideal. In this case, we need to configure TimeXtender to setup Staging & the Data Warehouse to use a shared Azure SQL Database, and prevent table naming collisions using separate schemas.
In this article you will learn how to:
- Configure an Azure SQL Database
- Configure SQL Users & Permissions
- Setup Staging & Data Warehouses in a shared Azure SQL Database
- Configure Schemas to prevent table naming collisions
- Allow read access to the Database
1. Configure an Azure SQL Database
Create an Azure SQL Database. You will need two, one for the project repository and one for your Business Unit / DSA / MDW. Using the "Serverless" Compute Tier, the database can even scale and pause automatically based on workload.
Note: There are two common issues when using "Serverless" tier with TimeXtender:
1. Initial Execution may fail when serverless is paused - to fix this issue, enable retires in your execution package with a 2-3 minute delay.
2. Repeat executions may fail when serverless is scaling. This is because the processing speed at lower vCores is still too slow. To fix it, ensure the Min vCores is raised to a reasonable limit, or very close to the Max vCores.
2. Configure SQL users & Permissions
You will need to create users in Azure SQL Database with the necessary permissions. Alternatively, you can use Azure Active Directory.
3. Setup Staging & Data Warehouses in a shared Azure SQL Database
Create a new project, make sure SSIS is not selected, add a Business Unit and a STAGE/DSA database. Alternatively, you can use the ODX Server instead.
Add the server name, use SQL Server authentication and add the same user configured above. Type in the name of your Azure SQL Database create in previous steps.
Add a Data warehouse. It is set up so it points to the same database that created earlier. Use the same server and user account. In all Data Warehouse databases, be sure to set Direct Read to 'Matching Server and Database'. This will enable TimeXtender to perform a "Select Into" operation to move the data rather than attempting to use ADO.Net.
4. Configure Schemas to prevent table naming collisions
1. In the Staging Database, create a DSA schema. This is to split up the two parts of the database in DSA and DWH. It needs to be set up as the "Main default schema".
2. In the Data Warehouse Database, create an etl Schema and set it as "Main default schema".
3. In the Data Warehouse Database, create a dbo Schema and set it as "Main valid schema". This will make sure all valid tables are present in their own schema.
5. Allow read access to the Database
The first step in the process is to give admin privileges to the user making the initial changes.
- Identify the SQL Server instance in your resource group where the database is.
- Inside the instance click on the Azure Active Directory in the settings tab.
- To set your user as the admin click on the Set admin option-> Search and select the name in the dialogue box.
- In your SSMS Login to the Azure SQL as the Azure Active Directory admin specified above
- Run the below script in SSMS in the same instance-
- CREATE USER [Azure AD Group Name] FROM EXTERNAL PROVIDER;
- In TimeXtender desktop application where the Azure SQL DB, expand Security right click on the Database Role to Add: Add Database Role-> Add Manually-> Please enter the same AD Group Name used in the SSMS
- To Provide the users in the Group access to only the valid schemas, right click Database Roles-> Object Security Setup-> Click on Schemas and check user Access Role to the valid schema.
- Deploy the security tab
Optionally, to give the users read access to all the tables in the database, run the below script in SSMS
EXEC sp_addrolemember 'db_datareader', ‘AD Azure Group’;
Alternative ways of using Azure Databases
Perhaps you want only the Data Warehouse database to be located on Azure. This is possible, as long as you do not use Direct Read and SSIS.
You can also set up multiple environments where a development environment is local and a production environment is fully on Azure SQL DBs.
When not using one database and schemas to split up the layers we have to rely on ADO.NET for data transfer between database. How much performance degradation can we expect compared with using SSIS in a typical server setup?
I would be very interested to see comments from anyone who is using an Azure SQL DB for TX, with an idea of which pricing model (DTU/vCore) and service tier is working for them.
This is clearly going to be data-dependent, but some starting point in terms of general DWH scale (total gigabytes or rows) and appropriate service tier for that scale, would be very helpful.
Thanks in advance if anyone sees this and replies!
You only need CLR and SMO. I added a link to where you can download them, in the description.
Is it best practice to only use SQL login when accessing Azure SQL database?
If i set windows authentication to on, it get the following error message:
'Windows login are not supported for this version of SQL server'
From security perspective i would preffer to use windows authentication instead of static sql login.
Are you able to connect to a Azure SQL database through Microsoft Management Studio with a windows authenticated user?
If so then yes, it probably should be an added possibility.
Hi Thomas, so what is the best way to handle the ETL process since SSIS is not available. given all db are in azure (no VM).
If you're just running TX DWA on an Azure VM (no SQL Server installed) what are the recommended specs for that VM?
Yep, I saw that. I'm looking for details on the infrastructure requirements side, such a suggested number of CPU cores & RAM for the Azure VM. I'm guessing the number of cores will correspond with how many parallel processes you need to run?
Regarding the infrastructure requirements.
You can split out the threads in as many as you have cores, but it is not a limitation in itself. If you want, you can split it out into 8, even though you have 4 cores. That said it is still a good rule to follow.
In this instance where the SQL server is located in another location, i would probably use a "Compute Optimized" server. How fast is up to you, but it seems to be the best fit.
Regarding the ETL process and no SSIS.
In my article i use a middle step with the direct read that get used instead of the normal transfer. Other than that there still is the normal data cleansing procedure afterwards.
There will be a new article that will be about the differences between SSIS and ADO.Net.