zaterdag 25 augustus 2018

DevOps : Investigating different options 'building' Data warehouse / BI project in VSTS (part V)

Introduction

I'm gaining more and more knowledge and experience with the Microsoft VSTS environment and 'building' a typical data warehouse and BI project. Now, there is 'building' and there is 'building'. I mean with building, the 'Build' option as we know it in Visual Studio, resulting in deployable packages or others. In Visual Studio, you can build a solution or a project and this will result in a deploy able package, for instance a dacpac or an asdatabase file. 

A typical Data warehouse/BI project may consist of the following types of projects :
  • a Database project.
  • a SSIS project.
  • a SSRS project.
  • a SSAS Tabular project.
  • a SSAS Multidimensional project.

In order to setup a build and release process, the first step is to build these kind of projects and there seems some confusion on how to build these kind of projects with VSTS. These are the options:
  • Use MSBuild Task.
  • Use Visual Studio Build Task.
  • Use Devenv.com commandline Task.
  • Use a third part solution Task.
  • Use Custom Powershell script in a Powershell task.
For each of these types of projects, I'll explore and describe the different options of building the different type of projects.

Database project build

First, let's start with a database project. This is the main project of every Microsoft Data warehouse/BI project. In a database project there are all kinds of database objects stored and those can be build in a so called dacpac file. Such a dacpac file is deployable on different environments. There are a couple of options and I'll explore them and describe them here. There is a lot of information available on building such projects in VSTS. 

MSBuild build
This is the most popular option regarding building a database project. MSBuild is the Microsoft Build engine and is used for building software, not only database projects but also software like C#. Now below is a screenshot shown of a typical build with a MSBuild task in VSTS. 



So there are five steps building a database project:
  1. Get the code from the repository (I use Git).
  2. Choose the right agent (I use VS2017).
  3. Build the project with MSBuild.
  4. Copy the files from the build folder to the artifacts staging folder.
  5. Publish the files to the artifacts folder, where you can download the file.

I've included the specific build step code here. Just specifying the sqlproj file is enough to build the sql projects. The **/* is a wildcard for every subfolder were .sqlproj exists.



DevEnv.com build
Yet another option for building a database project is using devenv.com. Devenv.com is the IDE environment and is used as a container for creating applications with different languages. In the folowing build process in VSTS I've replaced the MSbuild task with a command line task and used devenv.com as the command line utility to build the database project.



I've used the following settings for the commandline:


I specified the solution, whether or not to rebuild and the project location.

Visual Studio Build
Yet another task, that is available for building a database project is the Visual Studio Build task. The options almost covers the same options of the MSbuild and the command line utility DevEnv.com.



I've used the following settings :


And here I say Build every sqlproj file in a folder.

SSIS project

Also an important part of a data warehouse project is ETLing with SSIS and therefore it needs a deployment package for releasing it in a DevOps environment. This will be a ispac file.

MSBuild build
I tried to build a SSIS project with MSBuild but the MSBuild doesn't recognize the dtproj file, unfortunately. That is a pity. The following error occurred.


The element <DeploymentModel> beneath element <Project> is unrecognized.
 

Devenv.com
It is possible building a SSIS Project with the Devenv.com tool. This is done exactly in the same way as building a database project with devenv.com.



Below I've included the specific steps for building a SSIS Project with the program Devenv.com.

Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\devenv.com

Arguments:
"$(Build.SourcesDirectory)\WideWorldImporters-SSDT.sln"  
/rebuild $(BuildConfigurationSSIS) 
/project 
"$(Build.SourcesDirectory)\SSISWideWorldImporters-SSDT\SSISWideWorldImporters-SSDT.dtproj" 

Visual Studio Build
Now building the project with Visual Studio is also a problem. There is also an error returned when executing the Visual Studio Build.


The target "Build" does not exist in the project.
 


SSIS Build by Toxic globe
SSIS Build & Deploy is a 3rd party tool that is developed by ToxicGlobe and is available on the Marketplace. It is free. It has quite some downloads and a good rating. I've tested the build task and it works great.




There are a couple of more 3rd party custom tasks that build and/or deploy SSIS task. I haven't tried them myself.


Custom Powershell script
Now, In my builds for my customer, I'm using Devenv.com and for some reason, strange and unpredictable errors occurs with devenv.com. Builds that ran fine for weeks now suddenly run into errors. Yet even strange is that when I build the SSIS project I run into errors, but when someone else checks in the code and triggers a build it just run fine (!?). So, I started looking into something else. Now, I encountered an option I would like explore some more and that is a Powershell script that is proposed by Koos van Strien. Here, he rebuilds the ispac content files from scratch and zips it into an ispac file. A nice solution, but it has some drawbacks like, will this script run in future versions of ispac file? Nevertheless, I going to give it a try in my project and see if it comply with my needs.

$dtProjFolderToLoad = '$(Build.SourcesDirectory)\SSISWideWorldImporters-SSDT'
$dtProjFileToLoad = Join-Path $dtProjFolderToLoad 'SSISWideWorldImporters-SSDT.dtproj'
[xml]$dtProjXmlDoc = New-Object System.Xml.XmlDocument
$dtProjXmlDoc.PreserveWhitespace = $true
$dtProjXmlDoc.Load($dtProjFileToLoad)
 
# Create folder with the project name. This will essentially be zipped into an ispac
$ispacFolder = '$(Build.SourcesDirectory)\SSISWideWorldImporters-SSDT\bin\$(BuildConfigurationSSIS)\SSISWideWorldImporters-SSDT'
New-Item -ItemType Directory -Force -Path $ispacFolder
 
# Create the project manifest in the ispac folder
# Exists in node /Project/DeploymentModelSpecificContent/Manifest/SSIS:Project
$projectManifestXml = $dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.OuterXml
$projectManifestFullPath = Join-Path $ispacFolder "@Project.manifest"
$projectManifestXml | Out-File $projectManifestFullPath -NoNewline
 
# Add [Content types].xml, which has a static content
$contentTypesXml = "<?xml version=`"1.0`" encoding=`"utf-8`"?><Types xmlns=`"http://schemas.openxmlformats.org/package/2006/content-types`"><Default Extension=`"dtsx`" ContentType=`"text/xml`" /><Default Extension=`"conmgr`" ContentType=`"text/xml`" /><Default Extension=`"params`" ContentType=`"text/xml`" /><Default Extension=`"manifest`" ContentType=`"text/xml`" /></Types>"
$contentTypesFullPath = Join-Path $ispacFolder '[Content_Types].xml'
$contentTypesXml | Out-File -LiteralPath $contentTypesFullPath -NoNewline
 
# Iterate over all SSIS packages (*.dtsx) inside the .dtproj file add them to the ispac folder
$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.Packages.Package | ForEach-Object {
    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder

    # To be rename file with Path
    $ispacNamePath = (Join-Path $ispacFolder  ([string]$_.Name))

    # Replace ' ' with %20
    $newname = ([string]$_.Name).Replace(" ", "%20")

    # Rename the file
    $newnamePath = (Join-Path $ispacFolder $newname)

    Rename-item -Path  $ispacNamePath $newnamePath
}
 
 
# Iterate over all project-level connection managers (*.connmgr), add them to the ispac folder
$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.ConnectionManagers.ConnectionManager | ForEach-Object { 
    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder 
}
 
# Copy the parameters file to the ispac folder
$paramsFullPathSource = Join-Path $dtProjFolderToLoad "Project.params"
Copy-Item $paramsFullPathSource $ispacFolder
 
# Archive the ispac folder as a ".ispac" file
Compress-Archive ($ispacFolder + "\*") ($ispacFolder + ".zip") -Force
Rename-Item ($ispacFolder + ".zip") ($ispacFolder + ".ispac") -Force
Remove-Item $ispacFolder -Recurse -Force    
 

Again all the credits to Koos van Strien. It just works perfectly. Almost;-) I had to make a small adjustment in the powershell script because of an error :

Name cannot start with character ., hexadecimal  0x00. row 1, position 40. (System.Xml)

I had to replace a line in the script with an unicode specification.

       
$contentTypesXml | Out-File  -Encoding "UTF8" -LiteralPath $contentTypesFullPath –NoNewline
       

Below a screenshot of the powershell script


Another problem I had was SSIS packages with a space in the name. That resulted in an error during deployment.


This is the specific error message:

The specified part is not present in the package. (WindowsBase)

I changed the piece of code that copied the dtsx files. A space is replaced with %20.

$dtProjXmlDoc.Project.DeploymentModelSpecificContent.Manifest.Project.Packages.Package | ForEach-Object {

    $fileToCopy = (Join-Path $dtProjFolderToLoad ([string]$_.Name))
    Copy-Item $fileToCopy $ispacFolder

    # To be rename file with Path
    $ispacNamePath = (Join-Path $ispacFolder  ([string]$_.Name))

    # Replace ' ' with %20
    $newname = ([string]$_.Name).Replace(" ", "%20")

    # Rename the file
    $newnamePath = (Join-Path $ispacFolder $newname)

    Rename-item -Path  $ispacNamePath $newnamePath

}


A SSRS project

Another important project involved in a Microsoft DWH and BI project is a SSRS project, a reporting project. Building a report project results in a copy of the rdl files to the build folder.

MSBuild build
I tried to build a SSRS project with MSBuild but the MSBuild doesn't recognize the rptproj file, unfortunately. The following error occurred.


The target "Build" does not exist in the project.
 

Rptproj project files are also not supported by MSbuild.

Devenv.com
Fortunately, with the Devenv.com command line utility it is possible to build your SSRS project. Once setup properly, it will run smoothly.



These are the settings, I used for building this project file

Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\devenv.com

Arguments:
"$(Build.SourcesDirectory)\WideWorldImporters-SSDT.sln"  
/rebuild $(BuildConfigurationSSRS) 
/project 
"$(Build.SourcesDirectory)\SSRSWideWorldImporters-SSDT\SSRSWideWorldImporters-SSDT.rptproj"

Visual Studio Build
Building a SSRS project with the Visual Studio Task will give the following error:


The element <DeploymentModel> beneath element <Project> is unrecognized.
 


3rd party task
There is a third party task and it is downloaded 80 times (at the moment of writing) and so not heavenly used. I tried it shortly but didn't manage to make it work in VSTS. 

SSAS tabular project

It is possible to build a SSAS tabular project with MSBuild, just like the database project build. 

MSBuild
Building a Tabular project is executed with the MSBuild task and now the extension .smproj is used for building the projects.


Devenv.com
I don't have a working example available for building a tabular project with Devenv.com.

Visual studio build
This is an example of building a Tabular project with Visual studio build task


3rd party
I didn't try the 3rd party examples for building a tabular project.

SSAS Multidimensional project

Again, the build of SSAS Multidimensional project is not possible with MSBuild and therefore use devenv again to build the SSAS multidimensional project.

Devenv.com



Tool :
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\devenv.com

Arguments:
"$(Build.SourcesDirectory)\WideWorldImporters-SSDT.sln"  
/rebuild $(BuildConfigurationSSIS) 
/project "$(Build.SourcesDirectory)\SSASMDWideWorldImporters-SSDT\SSASMDWideWorldImporters-SSDT.DWproj"

A typical build

Now, we have gathered the findings of the experiments with the different options of building a BI project, we can now set up a best practice for setting up a Build in VSTS. I have created the following steps :
  • Build Projects.
  • Copy the builds to the artefactsfolder.
  • Publish the artefacts in the artefactsfolder.



Now this results in the following artefacts:

Some files are not needed and can be excluded in the copy task, for instance, or setting the build options. It is also possible to group the different type of project files in separate project artefacts. In that case, the setup of the build pipeline is bit different.

Final thoughts

There are several options for building data warehouse projects : MSbuild, Visual Studio Build, 3rd party builds and Powershell scripts. At first it can be very confusing what task to use for a what project. Here is an oversight of my experiments with the different tasks in VSTS.



Hennie