zaterdag 10 november 2018

Azure SQL Database exam glossary


I'm currently studying for Microsoft exam 70-473, Designing and Implementing Cloud Data Platform Solutions, and I have gathered some of the buzzwords that appeared while studying for this exam. I've written them down for myself, and for you, for reference and study. Good luck!

In the coming months I'll update this list to reflect my findings as I investigate new study material for this exam. Some terminology is not yet explained, but I'll do that later.


  • Active Directory Federation Services (ADFS) : 
  • Always On Availability Groups : An availability group supports a replicated environment for a discrete set of user databases, known as availability databases. You can create an availability group for high availability (HA) or for read-scale. An HA availability group is a group of databases that fail over together. 
  • Always On Failover Cluster Instances (FCI) :
  • App Service Authentication / Authorization : App Service Authentication / Authorization is a feature that provides a way for your application to sign in users so that you don't have to change code on the app backend. It provides an easy way to protect your application and work with per-user data.
  • Application Insights : Application Insights is an extensible Application Performance Management (APM) service for web developers. With Application Insights, you can monitor your live web applications and automatically detect performance anomalies.
  • Asynchronous communication : Communication between loosely coupled systems by using storage queues or Service Bus queues for later processing. 
  • Availability database : A database that belongs to an availability group. For each availability database, the availability group maintains a single read-write copy (the primary database) and one to eight read-only copies (secondary databases).
  • Availability group : A container for a set of databases, availability databases, that fail over together.
  • Availability group listener : A server name to which clients can connect in order to access a database in a primary or secondary replica of an Always On availability group. Availability group listeners direct incoming connections to the primary replica or to a read-only secondary replica.
  • Availability Modes : The availability mode is a property of each availability replica. The availability mode determines whether the primary replica waits to commit transactions on a database until a given secondary replica has written the transaction log records to disk (hardened the log). Always On availability groups supports two availability modes—asynchronous-commit mode and synchronous-commit mode.
  • Availability sets : Azure positions the virtual machines in a way that prevents localized hardware faults and maintenance activities from bringing down all of the machines in that group. Availability sets are required to achieve the Azure SLA for the availability of Virtual Machines.
  • Availability Replica : Each set of availability databases is hosted by an availability replica. Two types of availability replicas exist: a single primary replica, which hosts the primary databases, and one to eight secondary replicas, each of which hosts a set of secondary databases and serves as a potential failover target for the availability group.
  • Azure Advisor : Azure Advisor is a personalized cloud consultant that helps you to optimize your Azure deployments. It analyzes your resource configuration and usage telemetry.
  • Azure Compute Units (ACU) : The concept of the Azure Compute Unit (ACU) provides a way of comparing compute (CPU) performance across Azure SKUs. This will help you easily identify which SKU is most likely to satisfy your performance needs. 
  • Azure Command-Line Interface (CLI) : The Azure CLI is Microsoft's cross-platform command-line experience for managing Azure resources. You can use it in your browser with Azure Cloud Shell, or install it on macOS, Linux, or Windows and run it from the command line.
  • Azure Disk Encryption : Azure Disk Encryption is a new capability that helps you encrypt your Windows and Linux IaaS virtual machine disks. It applies the industry standard BitLocker feature of Windows and the DM-Crypt feature of Linux to provide volume encryption for the OS and the data disks. 
  • Azure DNS : The Domain Name System, or DNS, is responsible for translating (or resolving) a website or service name to its IP address. Azure DNS is a hosting service for DNS domains, providing name resolution using Microsoft Azure infrastructure.
  • Azure Key Vault :
  • Azure Monitor : Azure Monitor offers visualization, query, routing, alerting, auto scale, and automation on data both from the Azure infrastructure (Activity Log) and each individual Azure resource (Diagnostic Logs).
  • Azure Load Balancer : Azure Load Balancer delivers high availability and network performance to your applications. It is a Layer 4 (TCP, UDP) load balancer that distributes incoming traffic among healthy instances of services defined in a load-balanced set. 
  • Azure Region :
  • Azure Resource Manager (ARM) : Azure Resource Manager enables you to work with the resources in your solution as a group.
  • Azure Resource Manager template : You use an Azure Resource Manager template for deployment and that template can work for different environments such as testing, staging, and production. 
  • Azure Scheduler : Azure Logic Apps is replacing Azure Scheduler, which is being retired. To schedule jobs, use Azure Logic Apps instead.
  • Azure Security Center (ASC) : Azure Security Center helps you prevent, detect, and respond to threats with increased visibility into and control over the security of your Azure resources. It provides integrated security monitoring and policy management across your Azure subscriptions, helps detect threats that might otherwise go unnoticed, and works with a broad ecosystem of security solutions.
  • Azure Service Fabric (ASF)
  • Azure Site Recovery (ASR) : Azure Site Recovery helps orchestrate replication, failover, and recovery of workloads and apps so that they are available from a secondary location if your primary location goes down.
  • Azure SQL Database Query Performance Insight : Managing and tuning the performance of relational databases is a challenging task that requires significant expertise and time investment. Query Performance Insight allows you to spend less time troubleshooting database performance by providing the following:
    • Deeper insight into your database's resource (DTU) consumption.
    • The top queries by CPU/Duration/Execution count, which can potentially be tuned for improved performance.
    • The ability to drill down into the details of a query, and view its text and history of resource utilization.
    • Performance tuning annotations that show actions performed by SQL Azure Database Advisor.
  • Azure SQL logical server : A logical server acts as a central administrative point for multiple single or pooled databases, logins, firewall rules, auditing rules, threat detection policies, and failover groups. A logical server can be in a different region than its resource group. The logical server must exist before you can create the Azure SQL database. 
  • Azure Storage : blobs, tables, queues
  • Azure Storage Analytics : Azure Storage Analytics performs logging and provides metrics data for a storage account. You can use this data to trace requests, analyze usage trends, and diagnose issues with your storage account. 
  • Azure subnets : 
  • Azure Traffic Manager (WATM) : Once a datacenter-specific failure occurs, you must redirect traffic to services or deployments in another datacenter. This routing can be done manually, but it is more efficient to use an automated process. 
  • Azure Virtual Network (VNET) : An Azure virtual network (VNet) is a representation of your own network in the cloud. It is a logical isolation of the Azure network fabric dedicated to your subscription. You can fully control the IP address blocks, DNS settings, security policies, and route tables within this network
  • Always Encrypted : 
  • Blob :
  • Bring Your Own Key (BYOK) :
  • Cold data : Cold data is inactive data that is rarely used, but must be kept for compliance reasons.
  • Compute scale-units :
  • Compute optimized VM : High CPU-to-memory ratio. Good for medium traffic web servers, network appliances, batch processes, and application servers.
  • Content Delivery Network (CDN) :
  • Cross-Origin Resource Sharing (CORS) : Cross-Origin Resource Sharing (CORS) is a mechanism that allows domains to give each other permission for accessing each other’s resources. The User Agent sends extra headers to ensure that the JavaScript code loaded from a certain domain is allowed to access resources located at another domain. 
  • Data in motion :
  • Database Transaction Unit (DTU) : The Database Transaction Unit (DTU) represents a blended measure of CPU, memory, reads, and writes. 
  • DBaaS : Database as a Service
  • Direct Server Return : 
  • Disaster Recovery (DR) :
  • Disk Striping :
  • DTU-based purchasing model : This model is based on a bundled measure of compute, storage, and IO resources. Compute sizes are expressed in terms of Database Transaction Units (DTUs) for single databases and elastic Database Transaction Units (eDTUs) for elastic pools.
  • Dynamic Scalability : Dynamic scalability is different from autoscale. Autoscale is when a service scales automatically based on criteria, whereas dynamic scalability allows for manual scaling without downtime. Dynamic scalability enables your database to transparently respond to rapidly changing resource requirements and enables you to only pay for the resources that you need when you need them.
  • eDTU : eDTU measures the shared resources in an elastic pool. A pool is given a set number of eDTUs for a set price. Within the elastic pool, individual databases are given the flexibility to auto-scale within the configured boundaries. A database under heavier load will consume more eDTUs to meet demand. Databases under lighter loads consume fewer eDTUs.
  • Effective availability : Effective availability considers the Service Level Agreements (SLA) of each dependent service and their cumulative effect on the total system availability.
  • Elastic : 
  • Elastic Pool : 
  • Elastic Query Database (head node)  :
  • Encryption at rest : There are three Azure storage security features that provide encryption of data that is “at rest”: Storage Service Encryption allows you to request that the storage service automatically encrypt data when writing it to Azure Storage. Client-side Encryption also provides the feature of encryption at rest. Azure Disk Encryption allows you to encrypt the OS disks and data disks used by an IaaS virtual machine.
  • Encryption in transit : Encryption in transit is a mechanism of protecting data when it is transmitted across networks. Transport-level encryption, such as HTTPS when you transfer data into or out of Azure Storage. Wire encryption, such as SMB 3.0 encryption for Azure File shares. Client-side encryption, to encrypt the data before it is transferred into storage and to decrypt the data after it is transferred out of storage.
  • Estimated Recovery Time (ERT) : 
  • ExpressRoute : Microsoft Azure ExpressRoute is a dedicated WAN link that lets you extend your on-premises networks into the Microsoft cloud over a dedicated private connection facilitated by a connectivity provider.
  • Fail Over :
  • Fail Over groups :
  • Fault Detection : 
  • Fault Domain (FD):
  • Fault Tolerance : A fault tolerant solution detects and maneuvers around failed elements to continue and return the correct results within a specific timeframe.
  • Feature parity :
  • Forced tunneling: Forced tunneling is a mechanism you can use to ensure that your services are not allowed to initiate a connection to devices on the Internet.
  • Geo-redundant storage (GRS)
  • Hardware Security Modules (HSMs) : Key Vault provides the option to store your keys in hardware security modules (HSMs) certified to FIPS 140-2 Level 2 standards. 
  • High Availability/Disaster Recovery (HADR) :  
  • High Availability (HA) : A highly available cloud solution should implement strategies to absorb the outage of dependent services that are provided by the cloud platform.
  • High performance compute VM : Our fastest and most powerful CPU virtual machines with optional high-throughput network interfaces (RDMA).
  • Hyperscale Databases :
  • Hyperscale service tier
  • General purpose VM : Balanced CPU-to-memory ratio. Ideal for testing and development, small to medium databases, and low to medium traffic web servers.
  • Geo-Replication : 
  • Geo-Replication Support (GRS) :
  • GPU optimized VM : Specialized virtual machines targeted for heavy graphic rendering and video editing, as well as model training and inferencing (ND) with deep learning. Available with single or multiple GPUs.
  • PaaS :
  • Premium Storage :
  • IaaS :
  • Internal Load Balancer (ILB) : Load balance traffic between virtual machines in a virtual network, between virtual machines in cloud services, or between on-premises computers and virtual machines in a cross-premises virtual network. This configuration is known as internal load balancing.
  • Log Analytics:  Provides an IT management solution for both on-premises and third-party cloud-based infrastructure (such as AWS) in addition to Azure resources. Data from Azure Monitor can be routed directly to Log Analytics so you can see metrics and logs for your entire environment in one place.
  • Load balancing : 
  • Locally redundant storage (LRS) :
  • Log shipping :
  • Managed Disks :  Managed Disks handles storage behind the scenes. In addition, when virtual machines with Managed Disks are in the same availability set, Azure distributes the storage resources to provide appropriate redundancy. Microsoft recommends Managed Disks for SQL Server.
  • Memory optimized VM : High memory-to-CPU ratio. Great for relational database servers, medium to large caches, and in-memory analytics.
  • Multi-master replication : Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group, and resolving any conflicts that might arise between concurrent changes made by different members.
  • Multi-Factor Authentication (MFA) :
  • Multi-Tenant (Sharding): List mapping or Range mapping
  • Network access control : Network access control is the act of limiting connectivity to and from specific devices or subnets and represents the core of network security. 
  • Network Security groups (NSGs) :  Network Security groups (NSGs) can be used on Azure Virtual Network subnets containing App Service Environments to restrict public access to API applications.
  • Next Generation Firewall (NGFW) : Azure Security Center recommends that you add a Next Generation Firewall (NGFW) from a Microsoft partner to increase your security protections.
  • Pricing Tier :
  • Priority routing method : 
  • Public Virtual IP (VIP) :
  • Resilience :
  • Read-access geo-redundant storage (RA-GRS) :
  • Read-scale balancing : A read scale availability group provides replicas for read-only workloads but not high availability. In a read-scale availability group there is no cluster manager.
  • Read scale-out : Read scale-out is a feature that provides a read-only replica of your data on which you can execute demanding read-only queries, such as reports. The read-only replica handles your read-only workload without affecting resource usage on your primary database.
  • Read Only (RO) :
  • Replication option :  Locally redundant storage (LRS), Zone-redundant storage (ZRS), Geo-redundant storage (GRS), Read-access geo-redundant storage (RA-GRS)
  • Role-Based Access Control (RBAC) : You can secure your storage account with Role-Based Access Control (RBAC). Restricting access based on the need-to-know and least-privilege security principles is imperative for organizations that want to enforce security policies for data access.
  • RPO : Recovery Point Objective is the maximum acceptable amount of data loss in case of a disaster.
  • RTO : Recovery Time Objective is the maximum acceptable amount of time required to recover from a disaster.
  • Secure Sockets Layer (SSL) : Enforcing SSL connections between your database server and your client applications helps protect against "man in the middle" attacks by encrypting the data stream between the server and your application.
  • Service Tier :
  • Shared Access Signature (SAS) : A shared access signature (SAS) provides delegated access to resources in your storage account. The SAS means that you can grant a client limited permissions to objects in your storage account for a specified period and with a specified set of permissions. 
  • Standard Storage : Standard Storage has varying latencies and bandwidth and is only recommended for dev/test workloads. This includes the new Standard SSD storage.
  • Stretch Database : Stretch Database in SQL Server 2017 migrates your historical data transparently to Azure. The Stretch Database approach only applies to SQL Server 2017 and does not apply to SQL Database.
  • Stretch Database Advisor
  • Scalable solutions : Scalable solutions are able to meet increased demand with consistent results in acceptable time windows.
  • Scale units
  • Sharding : Data is partitioned horizontally to distribute rows across a scaled out data tier. With this approach, the schema is identical on all participating databases. Sharding can be performed and managed using (1) the elastic database tools libraries or (2) self-sharding. An elastic query is used to query or compile reports across many shards.
  • SKU :
  • Stateless :
  • Stateless compute nodes :
  • Storage scale-units :  Examples are Azure Table partition, Blob container, and SQL Database.
  • Storage optimized VM : High disk throughput and IO. Ideal for Big Data, SQL, and NoSQL databases.
  • System-versioned temporal tables :
  • Read and Write access (RW) :
  • Time-To-Live (TTL) : Time to live (TTL) or hop limit is a mechanism that limits the lifespan or lifetime of data in a computer or network. TTL may be implemented as a counter or timestamp attached to or embedded in the data. 
  • Traffic isolation: A virtual network is the traffic isolation boundary on the Azure platform. Virtual machines (VMs) in one virtual network cannot communicate directly to VMs in a different virtual network, even if both virtual networks are created by the same customer. Isolation is a critical property that ensures customer VMs and communication remains private within a virtual network.
  • Transient : The word “transient” means a temporary condition lasting only for a relatively short time. 
  • Transparent Data Encryption (TDE) :
  • Tenant :
  • Update Domain (UD) :
  • vCore model : 
  • vCore-based purchasing model : This model allows you to independently choose compute and storage resources. It also allows you to use Azure Hybrid Benefit for SQL Server to gain cost savings. Managed Instances in Azure SQL Database only offer the vCore-based purchasing model.
  • VPN gateway : To send network traffic between your Azure Virtual Network and your on-premises site, you must create a VPN gateway for your Azure Virtual Network. 
  • web application firewall (WAF) : The web application firewall (WAF) in Azure Application Gateway helps protect web applications from common web-based attacks like SQL injection, cross-site scripting attacks, and session hijacking. 
  • Windows Server Failover Cluster (WSFC) : Deploying Always On availability groups for HA on Windows requires a Windows Server Failover Cluster (WSFC). Each availability replica of a given availability group must reside on a different node of the same WSFC. 
  • Zone-redundant storage (ZRS)
  • Zone-redundant Databases
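
To make a few of these terms concrete, here is a short Azure CLI sketch (see the Azure CLI entry above) that creates a resource group, an Azure SQL logical server, a server-level firewall rule, and a single database at the S0 service objective in the DTU-based purchasing model. All resource names are hypothetical, and you need an authenticated session (`az login`) and an active subscription for this to work.

```shell
# Hypothetical names; requires an authenticated Azure CLI session (az login).
az group create --name study-rg --location westeurope

# The logical server: the central administrative point for databases,
# logins, firewall rules, and failover groups.
az sql server create --name my-logical-server --resource-group study-rg \
  --location westeurope --admin-user sqladmin --admin-password '<strong-password>'

# A server-level firewall rule allowing a single client IP address.
az sql server firewall-rule create --resource-group study-rg --server my-logical-server \
  --name allow-my-ip --start-ip-address 203.0.113.10 --end-ip-address 203.0.113.10

# A single database in the DTU-based purchasing model (service objective S0).
az sql db create --name study-db --resource-group study-rg \
  --server my-logical-server --service-objective S0
```

Scaling the database later (dynamic scalability) is a matter of re-running `az sql db update` with a different `--service-objective`.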


  • MySQL Community Edition : MySQL Community Edition is the freely downloadable version of the world's most popular open source database. It is available under the GPL license and is supported by a huge and active community of open source developers.
  • MySQL Document Store including X Protocol, XDev API and MySQL Shell
  • Transactional Data Dictionary :
  • Pluggable Storage Engine Architecture : 
  • InnoDB :
  • NDB :
  • MyISAM :
  • MySQL Replication :
  • MySQL Group Replication for replicating data while providing fault tolerance, automated failover, and elasticity
  • MySQL InnoDB Cluster : Deliver an integrated, native, high availability solution for MySQL
  • MySQL Router : For transparent routing between your application and any backend MySQL Servers
  • MySQL Partitioning : to improve performance and management of large database applications
  • Performance Schema for user/application level monitoring of resource consumption
  • MySQL Workbench : for visual modeling, SQL development and administration
  • Navicat : 
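
For the exam it also helps to know how a plain dump-and-restore migration into Azure Database for MySQL looks. A minimal sketch with the standard `mysqldump` and `mysql` client tools (host and user names are hypothetical):

```shell
# Dump a database from an on-premises MySQL server (hypothetical host/user names).
mysqldump -h onprem-host -u dbuser -p --databases salesdb > salesdb.sql

# Load the dump into an Azure Database for MySQL server.
# Note the user@servername login format that Azure requires.
mysql -h myserver.mysql.database.azure.com -u dbuser@myserver -p salesdb < salesdb.sql
```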


  • Azure_maintenance - This database is used to separate the processes that provide the managed service from user actions. You do not have access to this database.
  • Azure_sys - A database for the Query Store. This database does not accumulate data when Query Store is off; this is the default setting. For more information, see the Query Store overview.
  • Continuous sync capability : DMS performs an initial load of your on-premises to Azure Database for PostgreSQL, and then continuously syncs any new transactions to Azure while the application remains running. After the data catches up on the target Azure side, you stop the application for a brief moment (minimum downtime), wait for the last batch of data (from the time you stop the application until the application is effectively unavailable to take any new traffic) to catch up in the target, and then update your connection string to point to Azure. 
  • GIN operator
  • Supported PostgreSQL Database Versions : at the moment of writing 10.4, 9.6.9, and 9.5.13.
  • pg_dump :
  • pg_restore :
  • Postgres - A default database you can connect to once your server is created.
  • PostgreSQL Extensions : PostgreSQL provides the ability to extend the functionality of your database using extensions. Extensions allow for bundling multiple related SQL objects together in a single package that can be loaded into or removed from your database with a single command. After being loaded in the database, extensions can function like built-in features.
  • Pricing Tiers : Basic, General purpose and memory optimized.
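
The `pg_dump` and `pg_restore` entries above fit the same migration pattern. A minimal sketch for moving a database into Azure Database for PostgreSQL (host and user names are hypothetical):

```shell
# Dump in the custom format (-Fc) so pg_restore can be used on the target side.
pg_dump -Fc -v -h onprem-host -U dbuser -d salesdb -f salesdb.dump

# Restore into an Azure Database for PostgreSQL server.
# Again, note the user@servername login format that Azure requires.
pg_restore -v -h mypgserver.postgres.database.azure.com -p 5432 \
  -U dbuser@mypgserver -d salesdb salesdb.dump
```

For minimum downtime you would use the continuous sync capability of the Database Migration Service (DMS) instead, as described above.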


vrijdag 9 november 2018

70-473 : Designing and Implementing Cloud Data Platform Solutions


One of my targets in the coming year is to learn more about Azure. Recent news shows that Microsoft's earnings on Azure are increasing quite a lot, and cloud seems to be the future. I've already invested in courses like the Microsoft Professional Program in Data Science and the Big Data program. These courses are a great resource if you want to know more about big data solutions with Azure and data science with AzureML and R. As said before, I feel more and more confident and familiar with the Azure environment, and it's time to pass some Microsoft exams. The first one is exam 70-473 : Designing and Implementing Cloud Data Platform Solutions. This exam is about relational databases like SQL Server, Azure SQL Database, MySQL, and PostgreSQL. The latter two were added in June 2017 and replaced the SQL Data Warehouse section. Why? I don't know. I'm not into MySQL and PostgreSQL, but I want to pass the exam! Therefore I need to study some things I don't really need in my day-to-day work.

Now, I got the idea from MSSQLTips to copy the content of the exam requirements, search for relevant content, and add it to those requirements. I've used most of the information that was already present, but reorganized it and added my own links for studying the items.

General sources

Now, for this exam I studied quite a few resources. These are sometimes generic pieces of information about SQL Server, SQL Database, Azure, or other topics. They help you with some background information:

General Microsoft Azure information

1. Design and implement database solutions for Microsoft SQL Server and Microsoft Azure SQL Database

1.1 Design a hybrid SQL Server solution

1.2 Implement SQL Server on Azure Virtual Machines (VMs)

1.3 Design a database solution on Azure SQL database and SQL Server in Azure

1.4 Implement Azure SQL Database

1.5 Design and implement MySQL and PostgreSQL database solutions in Azure

2 Design and Implement Security

2.1 Design and implement SQL Server Database security

2.2 Design and implement Azure SQL Database security

3 Design for high availability, disaster recovery and scalability

3.1 Design and implement high availability solutions

3.2 Design and implement scalable solutions

3.3 Design and implement Azure SQL Database data recovery

4 Monitor and manage database implementations in Azure

4.1 Monitor and troubleshoot SQL Server VMs on Azure

4.3 Monitor and troubleshoot SQL Database

4.4 Automate and manage database implementations on Azure

zondag 4 november 2018

DevOps : Deploying a Tabular cube with PowerShell


One step in a SQL data warehouse DevOps project is deploying an SSAS tabular project to an instance. In this blogpost I'm going to show you a script that I'm using for deploying SSAS tabular cubes. As inspiration for the deployment script, I used information from blogger Harbinger Singh. I had to make some changes to the script to make it work in my situation.


In the script, I've created a couple of blocks of code :
  1. An array of cubes I want to deploy to the server. This helps me control which cubes to deploy. Another option is to loop over the contents of a folder and deploy the cubes found there.
  2. Create a loop and loop through the array.
  3. Check whether the cube is present and print a warning if it can't be found.
  4. Adjust the .asdatabase file database connectionstrings. I have multiple connections to databases and they must be changed.
  5. Adjust the .deploymenttargets file database connectionstring.
  6. Generate a .configsettings file. This file is not generated with the build of an SSAS tabular model.
  7. Adjust the .configsettings file database connectionstrings with the desired connectionstrings.
  8. Not every cube uses connectionstrings to two databases. There is a check to determine whether there is a DWH_Control connectionstring in the .configsettings file. 
  9. Adjust the .deploymentoptions file database connectionstrings.
  10. Create the xmla script with the Analysis Services deployment utility.
  11. The last step is to deploy the xmla script to the server with Invoke-ASCmd.

The code

The complete script is written below.

# AllCubes.SSASTAB.Dev.Script
# General variables
$path           = "C:\<directory>"
$SSASServer     = "localhost"
$DwDBnameDM     = "DWH_Datamart"
$DwDBnameCTRL   = "DWH_Control"
$DwServerName   = "localhost"

# Structure bimname, CubeDB, modelname
$CubeArray = @(
             ("<filename1>" , "<cubeDB1>"           , "<modelname1>"),
             ("<filename2>" , "<cubeDB2>"           , "<modelname2>")
)

Write-Host "------------------------------------"
foreach ($element in $CubeArray) {

    $bim            = $element[0]
    $CubeDB         = $element[1]
    $CubeModelName  = $element[2]

    $AsDBpath             = "$path\$bim.asdatabase"
    $DepTargetpath        = "$path\$bim.deploymenttargets"
    $ConfigPath           = "$path\$bim.configsettings"
    $DeployOption         = "$path\$bim.deploymentoptions"
    $SsasDBconnection     = "DataSource=$SsasServer;Timeout=0"
    $DwDbDMConnString     = "Provider=SQLNCLI11.1;Data Source=$DwServerName;Integrated Security=SSPI;Initial Catalog=$DwDBnameDM"
    $DwDbCTRLConnString   = "Provider=SQLNCLI11.1;Data Source=$DwServerName;Integrated Security=SSPI;Initial Catalog=$DwDBnameCTRL"
    $IsDMConnStrPresent   = [bool]0
    $IsCTRLConnStrPresent = [bool]0

    # Skip this cube when the .asdatabase file cannot be found
    if (!(Test-Path $AsDBpath))  {
      Write-Warning "$AsDBpath absent from location"
      Write-Host "------------------------------------"
      continue
    }

    #Adjust .asdatabase file database connectionstrings
    $json = (Get-Content $AsDBpath -Raw) | ConvertFrom-Json
    $json.model.dataSources | % {if($_.name -eq 'DWH_DataMart'){$_.connectionString=$DwDbDMConnString ; $IsDMConnStrPresent=[bool]1 }}
    $json.model.dataSources | % {if($_.name -eq 'DWH_Control'){$_.connectionString=$DwDbCTRLConnString ; $IsCTRLConnStrPresent=[bool]1 }}
    $json | ConvertTo-Json -Depth 10 | Set-Content $AsDBpath

    #Adjust .deploymenttargets file database connectionstring
    $xml  = [xml](Get-Content $DepTargetpath)
    $node = $xml.DeploymentTarget
    $node.Database = $CubeDB
    $node.Server = $SsasServer
    $node.ConnectionString = $SsasDBconnection
    $xml.Save($DepTargetpath)

    # Generate .configsettings, as this file is not generated with the build.
    if (($IsDMConnStrPresent) -and ($IsCTRLConnStrPresent))  {
        '<ConfigurationSettings xmlns:xsd="" xmlns:xsi="" xmlns:ddl2="" xmlns:ddl2_2="" xmlns:ddl100_100="" xmlns:ddl200="" xmlns:ddl200_200="" xmlns:ddl300="" xmlns:ddl300_300="" xmlns:ddl400="" xmlns:ddl400_400="" xmlns:ddl500="" xmlns:ddl500_500="" xmlns:dwd="">
          <Database>
            <DataSources>
              <DataSource>
                <ID>DWH_Datamart</ID>
                <ConnectionString>Provider=SQLNCLI11.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=DWH_Datamart</ConnectionString>
                <ImpersonationMode xmlns="">ImpersonateServiceAccount</ImpersonationMode>
                <Account xmlns=""></Account>
                <Password xmlns=""></Password>
                <ImpersonationInfoSecurity xmlns="">Unchanged</ImpersonationInfoSecurity>
              </DataSource>
              <DataSource>
                <ID>DWH_Control</ID>
                <ConnectionString>Provider=SQLNCLI11.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=DWH_Control</ConnectionString>
                <ImpersonationMode xmlns="">ImpersonateServiceAccount</ImpersonationMode>
                <Account xmlns=""></Account>
                <Password xmlns=""></Password>
                <ImpersonationInfoSecurity xmlns="">Unchanged</ImpersonationInfoSecurity>
              </DataSource>
            </DataSources>
          </Database>
        </ConfigurationSettings>' | Out-File -FilePath $path\$bim.configsettings
    }
    else {
        '<ConfigurationSettings xmlns:xsd="" xmlns:xsi="" xmlns:ddl2="" xmlns:ddl2_2="" xmlns:ddl100_100="" xmlns:ddl200="" xmlns:ddl200_200="" xmlns:ddl300="" xmlns:ddl300_300="" xmlns:ddl400="" xmlns:ddl400_400="" xmlns:ddl500="" xmlns:ddl500_500="" xmlns:dwd="">
          <Database>
            <DataSources>
              <DataSource>
                <ID>DWH_Datamart</ID>
                <ConnectionString>Provider=SQLNCLI11.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=DWH_Datamart</ConnectionString>
                <ImpersonationMode xmlns="">ImpersonateServiceAccount</ImpersonationMode>
                <Account xmlns=""></Account>
                <Password xmlns=""></Password>
                <ImpersonationInfoSecurity xmlns="">Unchanged</ImpersonationInfoSecurity>
              </DataSource>
            </DataSources>
          </Database>
        </ConfigurationSettings>' | Out-File -FilePath $path\$bim.configsettings
    }

    #Adjust .configsettings file database connectionstrings
    $xml = [xml](Get-Content $ConfigPath)
    $nodeDM = $xml.ConfigurationSettings.Database.DataSources.DataSource | ? { $_.ID -eq $DwDBnameDM }
    $nodeDM.ConnectionString = $DwDbDMConnString
    $nodeCTRL = $xml.ConfigurationSettings.Database.DataSources.DataSource | ? { $_.ID -eq $DwDBnameCTRL }

    # In case there is no DWH_Control connectionstring in the .configsettings file
    if (![string]::IsNullOrEmpty($nodeCTRL)) {
        $nodeCTRL.ConnectionString = $DwDbCTRLConnString
    }
    $xml.Save($ConfigPath)

    #Adjust .deploymentoptions file processing option
    $xml = [xml](Get-Content $DeployOption)
    $node = $xml.DeploymentOptions
    $node.ProcessingOption = "DoNotProcess"
    $xml.Save($DeployOption)

    # Create the xmla script with the AnalysisServices.Deployment wizard
    Write-Host "Deploying Cube : $CubeDB"
    cd $path
    $exe = "C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\Microsoft.AnalysisServices.Deployment.exe"
    $param1 = $bim + ".asdatabase"     # model to deploy
    $param2 = "/s:" + $bim + ".txt"    # log file
    $param3 = "/o:" + $bim + ".xmla"   # output xmla script
    $param4 = "/d"                     # use only the input files, don't connect to a server
    & $exe $param1 $param2 $param3 $param4 | Out-Null

    Write-Host "Importing SQL modules..."
    # Import modules: prefer the new SqlServer module, fall back to SQLPS
    if ((Get-Module -ListAvailable | Where-Object { ($_.Name -eq 'SqlServer') -and ($_.Version.Major -gt 20) } | Measure).Count -eq 1) {
        # implementation of new sql modules migrated into new location
        Import-Module SqlServer -DisableNameChecking
    }
    else {
        # fallback for SQLPS
        Import-Module SQLPS -DisableNameChecking -Verbose
    }

    Write-Host "Invoking deployment script... This may take several minutes."
    Invoke-ASCmd -Server:$SsasServer -InputFile $path\$bim.xmla | Out-File $path\$bim.xml
    Write-Host "Please check $path\$bim.xml as this is output of this deployment"
    Write-Host "Done."
    Write-Host "------------------------------------"

Final thoughts

Although it is quite a long script, it is fairly easy to set up and deploy a cube with a PowerShell script. In order to use it in Azure DevOps, you have to replace some of the variables with Azure DevOps variables to make it work as you desire.


woensdag 3 oktober 2018

DevOps : Testing of a Database project in Azure Devops (Part VI)


In my previous blogpost, I've mostly talked about the Microsoft hosted agent, and that works great when you want to build and release your data warehouse and BI project. The code is checked in, an automatic build is started and the code is released to a release environment. But what about testing? Testing is one of the key elements of Azure DevOps (formerly known as VSTS): build the code, test it and release it to a new environment. I use the terms VSTS and Azure DevOps interchangeably throughout this blogpost, as the name change happened while I was writing it.

If you want to test your database code with unit testing scripts, there has to be an environment to run the test scripts against. As far as I can see, there are a couple of options here:
  • Deploy the database code to the localDB instance on the Azure DevOps server and run the database tests on the localDB instance.
  • Deploy the code on an on-premises box and run the tests locally. For this setup a local agent installation is required.
  • Deploy the code to an Azure Database / VM box and run the tests on that database/machine.
I've noticed that as a starter it is a good idea to begin with the first option: deploy the code to the localDB on the Azure DevOps server. It is possible to deploy and run the tests against the localDB on the Azure DevOps server, although it took me a while to figure it out. So, this blogpost is a report of my experiments with unit testing, and automating it on a localDB instance, in a DevOps environment.

I've used an article by Rahul Mittal on CodeProject for setting up unit testing, and I've used this example to automate my builds and testing in Azure DevOps.

DBUnitTesting project

The database test project described in the article on CodeProject covers 7 types of tests on a database. For this blog post, I followed the article in detail and executed the steps. I can recommend doing this too if you want to know more about this subject.

There are a couple of important steps that are necessary to implement a database test project:
  • Create a database project in VS2015 or VS2017 (I used VS2017).
  • Create a testproject in VS2015 or 2017 together with the database project in the solution.
  • Define the tests in the project.
  • Execute and test the testproject.
As said before, there are 7 types of tests possible:
  • Counting the number of rows in a table.
  • A data checksum, in order to check whether data in the table has changed.
  • A check whether the schema has changed.
  • A performance check: whether a stored procedure performs within certain time boundaries.
  • A check whether a query or a stored procedure returns an empty resultset.
  • A check whether a query or a stored procedure returns a non-empty resultset.
  • The standard 'not tested' option: "inconclusive".
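The mechanics of these conditions are easy to illustrate outside SSDT. Below is a generic sketch in Python against an in-memory SQLite database (not the actual SSDT test framework; the Company table and its values are made up for the example):

```python
# Generic illustration of the test conditions above, using an in-memory
# SQLite database; the real tests are SSDT SQL Server unit tests.
import hashlib
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Company (Id INTEGER, Name TEXT)")
conn.executemany("INSERT INTO Company VALUES (?, ?)",
                 [(1, "Contoso"), (2, "Fabrikam")])

# 1. Row count: the table must contain the expected number of rows
assert conn.execute("SELECT COUNT(*) FROM Company").fetchone()[0] == 2

# 2. Data checksum: a hash over the ordered contents detects data changes
rows = conn.execute("SELECT * FROM Company ORDER BY Id").fetchall()
checksum = hashlib.md5(repr(rows).encode()).hexdigest()

# 3. Schema check: compare the column list against an expected snapshot
columns = [c[1] for c in conn.execute("PRAGMA table_info(Company)")]
assert columns == ["Id", "Name"]

# 4. Performance check: the query must finish within a time boundary
start = time.perf_counter()
conn.execute("SELECT * FROM Company").fetchall()
assert time.perf_counter() - start < 1.0

# 5. Empty resultset expected
assert conn.execute("SELECT * FROM Company WHERE Id = 99").fetchall() == []

# 6. Non-empty resultset expected
assert conn.execute("SELECT * FROM Company WHERE Id = 1").fetchall() != []
```

In the real test project each of these conditions is a test condition attached to a T-SQL test script, rather than an assertion in code.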

The tests are located in the test explorer where it is possible to run the tests.

I made a couple of mistakes while setting up the environment in Azure DevOps. One of them was that I didn't use the right dll. A helpful tool was vstest.console.exe, which lets you test the different dlls locally. That way I quickly found out that I had to use a different dll.

 "C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\
 Common7\IDE\CommonExtensions\Microsoft\TestWindow\vstest.console" testcases.dll
Microsoft (R) Test Execution Command Line Tool Version 15.7.2
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
Passed   SqlTest1
Passed   SqlTest1
Passed   SqlTest1
Passed   SqlTest1
Passed   SqlTest1
Passed   RowCountCompany
Passed   SqlTest1

Total tests: 7. Passed: 7. Failed: 0. Skipped: 0.
Test Run Successful.
Test execution time: 0.8713 Seconds

If you use the wrong dll, you will get the following message:

"C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\
Microsoft (R) Test Execution Command Line Tool Version 15.7.2
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
No test is available in 
Make sure that test discoverer & executors are registered and 
platform & framework version settings are appropriate and try again.

Additionally, path to test adapters can be specified using 
/TestAdapterPath command. Example  /TestAdapterPath:<pathToCustomAdapters>.

Build the database project in VSTS (Azure Devops)

1. Let's start with setting up the build process in a CI process in VSTS. The first step is to create a build definition. Click on Build & Release in the toolbar and click on Builds. When no build definitions have been created so far, you should see an empty Build Definitions screen.

2. Click New Pipeline. There are a number of options to choose from: select a source control system, the team project, the repository and the branch.

3. Click on Continue. Now you should see a template screen where you can choose a template pipeline. Choose the Empty Pipeline template.

4. Click on Apply. In the now empty pipeline, choose a name, get the sources and choose which agent runs the build.

5. Choose the Default Pipeline in the Agent pool selection.

6. Choose the Team project, repository, branch and set some options if you want.

7. Choose Visual Studio build task on the agent and click on Add.

8. Click on the Visual Studio Build task and set the following properties:
  • Name : Build SQLUnitTestingDB solution
  • Solution : SQLUnitTestingDB
  • Visual Studio Version : Visual Studio 2017
  • MSBuild Arguments : /t:Build;Publish /p:SqlPublishProfilePath=SQLUnitTestingDB.publish.xml

For the MSBuild arguments, I've included a publish profile in my project. This will publish the database to the localdb on the VSTS server.
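A publish profile is itself a small MSBuild file. As a sketch, a minimal SQLUnitTestingDB.publish.xml targeting the agent's localDB could look roughly like this (the element values and connection string are assumptions for this example, not taken from the actual project):

```xml
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <TargetDatabaseName>SQLUnitTestingDB</TargetDatabaseName>
    <TargetConnectionString>Data Source=(localdb)\MSSQLLocalDB;Integrated Security=True</TargetConnectionString>
    <DeployScriptFileName>SQLUnitTestingDB.sql</DeployScriptFileName>
  </PropertyGroup>
</Project>
```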

9. Add a Visual Studio Test task to the pipeline and use the following settings:
  • Display Name
  • Test files : **\*Testcases.dll !**\*TestAdapter.dll !**\obj\**
  • Search folder : $(System.DefaultWorkingDirectory)
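The test files field takes match patterns, where a leading ! excludes files. Roughly, the selection logic works like this (a sketch only; the file names are invented, and the real task uses minimatch, which differs in some edge cases):

```python
# Rough approximation of the include/exclude pattern matching used by the
# Visual Studio Test task; the file names below are made up for the example.
from fnmatch import fnmatch

patterns = [r"**\*Testcases.dll",      # include test assemblies
            r"!**\*TestAdapter.dll",   # exclude test adapter dlls
            r"!**\obj\**"]             # exclude intermediate build output

files = [r"bin\Release\UnitTestcases.dll",
         r"bin\Release\MSTest.TestAdapter.dll",
         r"TestCases\obj\Debug\UnitTestcases.dll"]

selected = []
for f in files:
    keep = False
    for p in patterns:                 # patterns are evaluated in order
        if p.startswith("!"):
            if fnmatch(f, p[1:]):
                keep = False
        elif fnmatch(f, p):
            keep = True
    if keep:
        selected.append(f)

print(selected)                        # only bin\Release\UnitTestcases.dll survives
```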

Run the Database Unit Testing Pipeline

When you are done setting up the pipeline in VSTS (Azure DevOps), it is time to test it. If nothing is changed, the pipeline will run into an error: we get some results from the Visual Studio Test task, but not what we want.

Failed   RowCountCompany
Error Message:
 Initialization method TestCases.CompanyScalar.TestInitialize threw exception. 
 System.Data.SqlClient.SqlException: System.Data.SqlClient.SqlException: A network-related or 
 instance-specific error occurred while establishing a connection to SQL Server.
 The server was not found or was not accessible. Verify that the instance name 
 is correct and that SQL Server is configured to allow remote connections.
 (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) 
 ---> System.ComponentModel.Win32Exception: The system cannot find the file specified.
Stack Trace:
at Microsoft.Data.Tools.Schema.Sql.UnitTesting.SqlDatabaseTestService.OpenExecutionContext()
at Microsoft.Data.Tools.Schema.Sql.UnitTesting.SqlDatabaseTestClass.InitializeTest()
at TestCases.CompanyScalar.TestInitialize() in D:\a\1\s\TestCases\CompanyScalar.cs:line 24

The problem is that the configuration settings in the test project are set to localhost. Therefore the settings in the app.config should be changed to localdb.
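As a sketch, the relevant SqlUnitTesting section of the test project's app.config would then look roughly like this (the project file name, catalog and timeout are assumptions for this example):

```xml
<SqlUnitTesting>
  <DatabaseDeployment DatabaseProjectFileName="..\SQLUnitTestingDB\SQLUnitTestingDB.sqlproj"
                      Configuration="Release" />
  <ExecutionContext Provider="System.Data.SqlClient"
                    ConnectionString="Data Source=(localdb)\MSSQLLocalDB;Initial Catalog=SQLUnitTestingDB;Integrated Security=True;Pooling=False"
                    CommandTimeout="30" />
</SqlUnitTesting>
```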

Commit the code to Azure DevOps and let's see whether the Visual Studio Test task executes the test project. As shown in the screen below, the test run failed, but 2 of the 7 tests succeeded.

The ones that went wrong are the ones that assume there is data in the table.

It is important to start your scripts from scratch, or from an initial situation with some data. In the latter case you have to execute some scripts or restore a backup of the database on the localDB. I'm not sure whether it is possible to restore a database to the localDB instance; that is one for a future blogpost.

Final thoughts

Setting up testing in your database project has some explorative aspects to it. There aren't many resources that clearly explain the possibilities, options and how-tos. In this blogpost I've explored the possibility of testing a database project on localDB. This blogpost is a report of that experiment.


zaterdag 25 augustus 2018

DevOps : Investigating different options 'building' Data warehouse / BI project in VSTS (part V)


I'm gaining more and more knowledge of, and experience with, the Microsoft VSTS environment and 'building' a typical data warehouse and BI project. Now, there is 'building' and there is 'building'. By building I mean the 'Build' option as we know it in Visual Studio, resulting in deployable packages. In Visual Studio you can build a solution or a project, and this results in a deployable package, for instance a dacpac or an asdatabase file.

A typical Data warehouse/BI project may consist of the following types of projects :
  • a Database project.
  • a SSIS project.
  • a SSRS project.
  • a SSAS Tabular project.
  • a SSAS Multidimensional project.

In order to set up a build and release process, the first step is to build these kinds of projects, and there seems to be some confusion on how to do that with VSTS. These are the options:
  • Use the MSBuild task.
  • Use the Visual Studio Build task.
  • Use the command line task.
  • Use a third-party solution task.
  • Use a custom PowerShell script in a PowerShell task.
For each of these types of projects, I'll explore and describe the different build options.

Database project build

First, let's start with a database project. This is the main project of every Microsoft data warehouse/BI project. A database project stores all kinds of database objects, and those can be built into a so-called dacpac file. Such a dacpac file is deployable to different environments. There are a couple of build options, and I'll explore and describe them here. There is a lot of information available on building such projects in VSTS.

MSBuild build
This is the most popular option for building a database project. MSBuild is the Microsoft Build Engine and is used for building software: not only database projects, but also software like C#. Below is a screenshot of a typical build with an MSBuild task in VSTS.

So there are five steps to building a database project:
  1. Get the code from the repository (I use Git).
  2. Choose the right agent (I use VS2017).
  3. Build the project with MSBuild.
  4. Copy the files from the build folder to the artifacts staging folder.
  5. Publish the files to the artifacts folder, where you can download the file.

I've included the specific build step code here. Just specifying the sqlproj file is enough to build the SQL projects. The **/* is a wildcard for every subfolder where a .sqlproj exists.

devenv build
Yet another option for building a database project is using devenv.exe. devenv is the Visual Studio IDE executable and can also be used as a command line build tool for applications in different languages. In the following build process in VSTS, I've replaced the MSBuild task with a command line task and used devenv.exe to build the database project.

I've used the following settings for the commandline:

I specified the solution, whether or not to rebuild and the project location.

Visual Studio Build
Yet another task that is available for building a database project is the Visual Studio Build task. Its options cover almost the same ground as the MSBuild task and the devenv command line utility.

I've used the following settings :

Here I build every .sqlproj file in the folder.

SSIS project

Also an important part of a data warehouse project is the ETL with SSIS, and therefore it too needs a deployment package for releasing in a DevOps environment. This will be an ispac file.

MSBuild build
I tried to build a SSIS project with MSBuild, but MSBuild doesn't recognize the dtproj file, unfortunately. That is a pity. The following error occurred:

The element <DeploymentModel> beneath element <Project> is unrecognized.

devenv build
It is possible to build a SSIS project with devenv.exe. This is done in exactly the same way as building a database project with devenv.

Below I've included the specific settings for building a SSIS project with devenv.exe.

Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\

/rebuild $(BuildConfigurationSSIS) 

Visual Studio Build
Building the project with the Visual Studio Build task is also a problem. An error is returned when executing the build:

The target "Build" does not exist in the project.

SSIS Build by ToxicGlobe
SSIS Build & Deploy is a 3rd party task developed by ToxicGlobe and available for free on the Marketplace. It has quite a few downloads and a good rating. I've tested the build task and it works great.

There are a couple more 3rd party custom tasks that build and/or deploy SSIS projects. I haven't tried them myself.

Custom Powershell script
Now, in my builds for my customer, I'm using devenv.exe, and for some reason strange and unpredictable errors occur: builds that ran fine for weeks suddenly run into errors. Stranger still, when I build the SSIS project I run into errors, but when someone else checks in the code and triggers a build, it runs just fine (!?). So, I started looking into something else and encountered an option I'd like to explore some more: a PowerShell script proposed by Koos van Strien. He rebuilds the ispac content files from scratch and zips them into an ispac file. A nice solution, but it has some drawbacks: will this script still work for future versions of the ispac file format? Nevertheless, I'm going to give it a try in my project and see if it complies with my needs.

$dtProjFolderToLoad = '$(Build.SourcesDirectory)\SSISWideWorldImporters-SSDT'
$dtProjFileToLoad = Join-Path $dtProjFolderToLoad 'SSISWideWorldImporters-SSDT.dtproj'
[xml]$dtProjXmlDoc = New-Object System.Xml.XmlDocument
$dtProjXmlDoc.PreserveWhitespace = $true
# Load the .dtproj file into the XML document
$dtProjXmlDoc.Load($dtProjFileToLoad)
# Create folder with the project name. This will essentially be zipped into an ispac
$ispacFolder = '$(Build.SourcesDirectory)\SSISWideWorldImporters-SSDT\bin\$(BuildConfigurationSSIS)\SSISWideWorldImporters-SSDT'
New-Item -ItemType Directory -Force -Path $ispacFolder
# Create the project manifest in the ispac folder
# Exists in node /Project/DeploymentModelSpecificContent/Manifest/SSIS:Project
$projectManifestXml = $dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.OuterXml
$projectManifestFullPath = Join-Path $ispacFolder "@Project.manifest"
$projectManifestXml | Out-File $projectManifestFullPath -NoNewline
# Add [Content types].xml, which has a static content
$contentTypesXml = "<?xml version=`"1.0`" encoding=`"utf-8`"?><Types xmlns=`"http://schemas.openxmlformats.org/package/2006/content-types`"><Default Extension=`"dtsx`" ContentType=`"text/xml`" /><Default Extension=`"conmgr`" ContentType=`"text/xml`" /><Default Extension=`"params`" ContentType=`"text/xml`" /><Default Extension=`"manifest`" ContentType=`"text/xml`" /></Types>"
$contentTypesFullPath = Join-Path $ispacFolder '[Content_Types].xml'
$contentTypesXml | Out-File -LiteralPath $contentTypesFullPath -NoNewline
# Iterate over all SSIS packages (*.dtsx) inside the .dtproj file and add them to the ispac folder
$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.Packages.Package | ForEach-Object {
    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder

    # Full path of the file to be renamed
    $ispacNamePath = (Join-Path $ispacFolder ([string]$_.Name))

    # Replace ' ' with %20 and rename the file
    $newname = ([string]$_.Name).Replace(" ", "%20")
    Rename-Item -Path $ispacNamePath -NewName $newname
}
# Iterate over all project-level connection managers (*.conmgr) and add them to the ispac folder
$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.ConnectionManagers.ConnectionManager | ForEach-Object {
    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder
}
# Copy the parameters file to the ispac folder
$paramsFullPathSource = Join-Path $dtProjFolderToLoad "Project.params"
Copy-Item $paramsFullPathSource $ispacFolder
# Archive the ispac folder as a ".ispac" file
Compress-Archive ($ispacFolder + "\*") ($ispacFolder + ".zip") -Force
Rename-Item ($ispacFolder + ".zip") ($ispacFolder + ".ispac") -Force
Remove-Item $ispacFolder -Recurse -Force    
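To see what the script is actually producing: an ispac is an ordinary zip archive with a manifest, content types, parameters and the packages. A small sketch of that layout (the entry contents and the package name Package%201.dtsx are made up; the file names follow the script above):

```python
# Sketch of the ispac layout the PowerShell script assembles: a plain zip
# archive containing the manifest, content types, parameters and packages.
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as ispac:
    ispac.writestr("@Project.manifest", "<Project />")     # project manifest
    ispac.writestr("[Content_Types].xml", "<Types />")     # static content types
    ispac.writestr("Project.params", "<Parameters />")     # project parameters
    ispac.writestr("Package%201.dtsx", "<Executable />")   # package, space escaped as %20

names = zipfile.ZipFile(buf).namelist()
print(names)
```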

Again, all the credits go to Koos van Strien. It just works perfectly. Almost ;-) I had to make a small adjustment in the PowerShell script because of an error:

Name cannot start with character ., hexadecimal  0x00. row 1, position 40. (System.Xml)

I had to replace a line in the script with an explicit encoding specification.

$contentTypesXml | Out-File -Encoding "UTF8" -LiteralPath $contentTypesFullPath -NoNewline

Below is a screenshot of the PowerShell script.

Another problem I had was with SSIS packages that have a space in the name. That resulted in an error during deployment.

This is the specific error message:

The specified part is not present in the package. (WindowsBase)

I changed the piece of code that copies the dtsx files: a space is replaced with %20.

$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.Packages.Package | ForEach-Object {

    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder

    # Full path of the file to be renamed
    $ispacNamePath = (Join-Path $ispacFolder ([string]$_.Name))

    # Replace ' ' with %20 and rename the file
    $newname = ([string]$_.Name).Replace(" ", "%20")
    Rename-Item -Path $ispacNamePath -NewName $newname
}
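The reason a space becomes %20 is that an ispac is an Open Packaging Conventions archive (hence the [Content_Types].xml entry), and OPC part names are URI-encoded. A quick check that the simple Replace matches URL quoting for this case (the package name is made up):

```python
# Spaces in package names must be URI-encoded inside the ispac; the simple
# Replace(" ", "%20") from the script matches URL quoting for this case.
from urllib.parse import quote

name = "Load Companies.dtsx"   # hypothetical package name with a space
assert name.replace(" ", "%20") == quote(name)
print(quote(name))             # Load%20Companies.dtsx
```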


A SSRS project

Another important project involved in a Microsoft DWH and BI project is a SSRS project, a reporting project. Building a report project results in a copy of the rdl files to the build folder.

MSBuild build
I tried to build a SSRS project with MSBuild, but MSBuild doesn't recognize the rptproj file, unfortunately. The following error occurred:

The target "Build" does not exist in the project.

Rptproj project files are simply not supported by MSBuild.

devenv build
Fortunately, with the devenv.exe command line utility it is possible to build your SSRS project. Once set up properly, it will run smoothly.

These are the settings I used for building this project file.

Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\

/rebuild $(BuildConfigurationSSRS) 

Visual Studio Build
Building a SSRS project with the Visual Studio Build task gives the following error:

The element <DeploymentModel> beneath element <Project> is unrecognized.

3rd party task
There is a third party task, but it has been downloaded only 80 times (at the moment of writing), so it is not heavily used. I tried it briefly but didn't manage to make it work in VSTS.

SSAS tabular project

It is possible to build a SSAS tabular project with MSBuild, just like the database project build. 

Building a Tabular project is executed with the MSBuild task; now the extension .smproj is used for building the projects.
I don't have a working example available for building a Tabular project with devenv.exe.

Visual studio build
This is an example of building a Tabular project with the Visual Studio Build task.

3rd party
I didn't try the 3rd party examples for building a tabular project.

SSAS Multidimensional project

Again, building a SSAS Multidimensional project is not possible with MSBuild; therefore I use devenv again to build the SSAS Multidimensional project.

Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\

/rebuild $(BuildConfigurationSSIS) 
/project "$(Build.SourcesDirectory)\SSASMDWideWorldImporters-SSDT\SSASMDWideWorldImporters-SSDT.DWproj"

A typical build

Now that we have gathered the findings of the experiments with the different build options for a BI project, we can set up a best practice for a build in VSTS. I have created the following steps:
  • Build the projects.
  • Copy the builds to the artefacts folder.
  • Publish the artefacts in the artefacts folder.

Now this results in the following artefacts:

Some files are not needed and can be excluded in the copy task or by setting the build options. It is also possible to group the different types of project files into separate artefacts. In that case, the setup of the build pipeline is a bit different.

Final thoughts

There are several options for building data warehouse projects: MSBuild, Visual Studio Build, devenv, 3rd party build tasks and PowerShell scripts. At first it can be very confusing which task to use for which project. Here is an overview of my experiments with the different tasks in VSTS.