Wednesday 16 July 2014

SSAS : Introduction to Tabular Model in SQL Server 2014 (RTM) (Part I)


In this introductory blog post I'll explain the Tabular Model by walking through the tutorial that is available on MSDN. In this tutorial a Tabular Model is created on the AdventureWorks database. The tutorial is based on Adventure Works Cycles, a fictitious company that produces and distributes bicycles in North America, Europe and Asia.

1. Setting up the project

The first thing to do is to create the project in Visual Studio 2012. In SQL Server Data Tools, on the File menu, click New, and then click Project. In the New Project dialog box, under Installed Templates, click Business Intelligence, then click Analysis Services, and then click Analysis Services Tabular Project. In Name, type AW Internet Sales Tabular Model, then specify a location for the project files. Click OK.

2. Select an Analysis Services Instance

The next dialog is about selecting the right workspace server and the right compatibility level. First set the workspace server to the Tabular instance of SQL Server Analysis Services. Next, set the compatibility level of the Tabular model. According to MSDN, a SQL Server 2014 Analysis Services instance supports the following compatibility levels (database versions):
  • SQL Server 2012 (1100)
  • SQL Server 2012 SP1 (1103)
  • SQL Server 2014 (1103)
Strangely enough, I only see two compatibility levels: SQL Server 2012 (1100) and SQL Server 2012 SP1 (1103).

I assume that SQL Server 2012 SP1 (1103) and SQL Server 2014 (1103) are the same. Checking the Analysis Services properties in SQL Server Management Studio shows me that the compatibility level there is also 1103.

Kasper de Jonge blogged about the differences between compatibility levels 1100 and 1103. Please take a look at the differences between the two.

3. The Tabular project

Once you have created the Tabular project, the following window is presented on the screen. There are a couple of interesting menus, options and properties available. In the Model menu, you can launch the Table Import Wizard, view and edit existing connections, refresh workspace data, browse your model in Microsoft Excel with the Analyze in Excel feature, create perspectives and roles, select the model view, and set calculation options. In the Table menu, you can create and manage relationships between tables, specify date table settings, create partitions, and edit table properties. And in the Column menu you can add and delete columns in a table, freeze columns, and specify sort order.

In the properties window of model.bim you can see the compatibility level again (1103), as well as the DirectQuery property. This property determines whether the Tabular model is deployed in memory or not.


In this blog post I gave an introduction to the Tabular model (Part I). I wrote about the compatibility levels of the Tabular Model in SQL Server 2014 and Visual Studio 2012.


Thursday 10 July 2014

R: getwd() and setwd() (Part II)


R is a very popular programming language for statistical analysis of data. All kinds of tools support R already, or will do so. In this blog post and a follow-up post I've gathered some constructs that are used quite often when you program in R. It is not a complete list, but it is a handy one.

This is the second post about R in a series:
  • R : An introduction (Part I).
  • R : getwd() and setwd()  (Part II).

getwd() and setwd()

The first step when using R is checking the working directory. The working directory is the place where all of your project files are stored. You need to keep track of it in each of your R sessions. If you read or write files, this takes place in the working directory.

> getwd()
[1] "C:/Users/Hennie/Documents"

The working directory is set with setwd():

> setwd("~/Coursera-RProgramming")
> getwd()
[1] "C:/Users/Hennie/Documents/Coursera-RProgramming" 


With setwd() and getwd() the working directory is set and checked.


Wednesday 9 July 2014

SSAS : Partitions in a multidimensional cube


When your fact tables grow, your multidimensional cube will grow too (if you have a multidimensional cube, of course). You can increase performance by splitting the measure groups into partitions. Partitions allow you to divide a cube over one or more folders. These folders are usually placed on one or more hard drives for improved performance. Multiple partitions can have the following advantages (mssqltips):
  • Increased performance by placing the partitions on different disks.
  • Each partition can have its own storage mode.
  • Parallel processing of partitions.
  • etc. 
In this blog post I'll show you how to create partitions in SSAS by using the SSAS tutorial on MSDN. In this tutorial a cube is created on the AdventureWorksDW2012 database. The Internet Sales measure group is divided into partitions by year.

Creating the partitions

Open the cube, navigate to the Partitions tab and click on New Partition. A screenshot is shown in the picture below. Here you can see the standard partitions.

A wizard is opened and the following window appears. Click on Next.

Change from Table binding to Query binding. Check the "internet tables" in the Available tables box and press OK.

If you change to Query binding the following window is shown. The WHERE clause is not finished yet; this is where you have to enter your partition range.

The data of AdventureWorks runs from 2005 until 2008. Below, the first range is entered: from 20050101 until 20051231. Now press Check and a message box reports that the syntax check was successful. Press OK.

Now you have to enter the location of your partition. You can also process the partitions on different servers. Press OK.

Do you want to design aggregations for the partition? In this example I will not do that. Press Finish.

Now the first partition is ready and you can create the next partition.
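The yearly ranges entered in the wizard follow a simple pattern, so the partition queries can be generated instead of typed by hand. Below is a minimal Python sketch of that idea. The table name dbo.FactInternetSales and the integer date key column OrderDateKey are assumptions based on the AdventureWorks data warehouse sample, and partition_queries is a hypothetical helper, not part of SSAS; check the query text against what the wizard shows for your own cube.

```python
def partition_queries(first_year, last_year,
                      table="dbo.FactInternetSales",  # assumed fact table
                      date_key="OrderDateKey"):       # assumed yyyymmdd integer key
    """Return one source query per year, keyed by year."""
    queries = {}
    for year in range(first_year, last_year + 1):
        start = year * 10000 + 101    # 1 January, e.g. 20050101
        end = year * 10000 + 1231     # 31 December, e.g. 20051231
        queries[year] = (
            f"SELECT * FROM {table} "
            f"WHERE {date_key} BETWEEN {start} AND {end}"
        )
    return queries


if __name__ == "__main__":
    for year, sql in partition_queries(2005, 2008).items():
        print(year, sql)
```

Because BETWEEN is inclusive on both ends and the ranges do not overlap, every fact row ends up in exactly one partition, which is what you want to avoid double counting in the cube.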

Finally, when you have done this for 2005, 2006, 2007 and 2008, the Partitions tab looks like the picture below.


It is very easy to add partitions to a cube, and a simple partition design is very straightforward.



Friday 4 July 2014

Microsoft SQL Server 2014 Sample databases


Today I decided to study SQL Server 2014 and I wanted to install the sample databases for SQL Server 2014. However, I couldn't find AdventureWorks sample databases made specifically for SQL Server 2014. It seems that the sample databases of SQL Server 2012 can also be used with SQL Server 2014.

By default, sample databases and sample code are not installed during the installation of SQL Server and you have to search, find and download the sample databases.

Sample databases

On Codeplex you can download the AdventureWorks databases for SQL Server. It's quite a list and I grouped the files together.

Relational Databases (OLTP):
  • AdventureWorks2012 Data File
  • AdventureWorks 2012 OLTP Script
  • AdventureWorks2012-Full Database Backup.zip
  • AdventureWorks2012_Data.zip
  • AdventureWorks2012 CS Data File
  • AdventureWorks 2012 CS OLTP Script
  • AdventureWorksLT2012_Data
  • AdventureWorks 2012 LT Script

Databases (DW)
  • AdventureWorksDW2012 Data File
  • AdventureWorksDW2012Images

SSAS Multidimensional
  • AdventureWorks Multidimensional Models SQL Server 2012
  • Analysis Services Tutorial SQL Server 2012

SSAS Tabular
  • AdventureWorks Tabular Model SQL Server 2012
  • AdventureWorks Internet Sales Tabular Model SQL Server 201

Here are some other samples for SQL Server I've found so far:


It seems that the sample databases of SQL Server 2012 are reusable for SQL Server 2014. There are some more samples created specifically for SQL Server 2014, but the AdventureWorks databases are the same.



Sunday 25 May 2014

Datavault : Kanban in a Datavault project (part III)


I finished my Lean Six Sigma Orange Belt certification a couple of months ago. For this certification I needed a case study that would justify a certification, and I decided to use my own project as the basis. In my current project we are developing an analytic platform for analysts with Datavault. We already used a Kanban board, but I thought that some improvements were possible.

We encountered some throughput problems with the implementations, so I decided to investigate our development process and think about what could help speed it up. The team members discussed our method and we tried to adopt the various techniques of Lean Six Sigma.

One of the subject areas of Lean Six Sigma is Kanban. Kanban can be very useful for making your processes visible and identifying possible bottlenecks in the process. Read more about Kanban on Wikipedia and in this free book about Kanban and Scrum. Both are very instructive.

This is the approach I have taken (with the development team) in order to develop a (new) Kanban board:
  • Which steps do we take in the development process?
  • What do we do when we build Datavault implementations (the objects)?
  • Who are the customers and who are the suppliers in the process?
    • Supplier, Input, Process, output and Customer  (SIPOC).
  • What are the criteria for moving a card from one phase to another?

This blogpost is one in a series of blogposts about datavault:

Task analysis

First we decided to take a look at our activities in the project. What are the activities in the project? We came up with this list:
  • Analysis (gaining knowledge)
    • Deskresearch
    • Data-analysis
    • Interviews
  • Design
    • Datamodeling (Conceptual, logical, SourceVault and  BusinessVault)
    • ETL
  • Development (DDL + ETL Packages)
    • StagingIn
      • Tables
    • SourceVault
      • Hubs
      • Links
      • Sats
      • EndSats (Effectivity)
      • Refs
      • Other DV Tables
  • Build (DDL + ETL packages)
    • BusinessVault
      • Hubs
      • Links
      • Sats
      • EndSats
      • Refs
      • Other DV Tables
  • BusinessAccessLayer
    • Development (ETL packages)
    • Entities
  • ProcessFlows
    • MasterProcess flows (PCF and MPCF)
    • Input file processing (IFP)
    • StagingIn
    • SourceVault
    • BusinessVault
  • Initial loads (this is for going back in time and resampling (historizing) the data).
  • ProcessManagement
    • Configuration
  • Test
    • Systemtest
    • Acceptancetest
  • Documentation
    • (Conceptual Datamodel)
    • SourceVault Datamodel
    • BusinessVault Datamodel
    • Functional and Technical descriptions
    • SSIS packages
  • Additional
    • Supply contracts with the datasuppliers


The next step was investigating our working process. Which phases do we actually go through during the development of a deliverable? And what are the criteria for moving from one phase to another?

The next step was studying the SIPOC (Supplier, Input, Process, Output and Customer). SIPOC is a tool that helps you improve a process. It is mostly used during the Define phase in the DMAIC process.

I noticed that there are two kinds of output in a process: work and criteria. The criteria say something about go/no-go decisions, and work is about delivering value to the value stream, for instance a Datavault model.

The Kanban board

The next step we took was designing a new Kanban board, based on the task analysis and the SIPOC schema. I made some designs of the board and we had quite some discussions about it. The discussion was about finding a balance between too simple and too complicated. If the board was too simple it would not help us understand the process, and if it was too complicated people wouldn't accept it. We finally made this board:

We defined the columns as identified (more or less) in the SIPOC schema during the Define phase: Analysis/design, Develop, Build, Test, QA, Ready for production and Production.

We introduced two kinds of lanes: project lanes and fast lanes. Fast lanes were intended for issues that had a higher priority than the projects we were running.

Definition of Done (DoD)
The last part that we added to the board was a Definition of Done row. The DoD consists of the criteria from the SIPOC schema; these criteria helped us describe the DoD. This helped us agree, as a group, on when a card should be moved from one column to the next.
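To make the role of the DoD row concrete, here is a small Python sketch of a board where a card can only move to the next column once the criteria of its current column are met. The column names come from the board described above; the criteria texts, the Card class and the pick_next helper are hypothetical and only serve as an illustration, not as a description of the tooling we used.

```python
from dataclasses import dataclass, field

# Columns as defined on the board.
COLUMNS = ["Analysis/design", "Develop", "Build", "Test", "QA",
           "Ready for production", "Production"]

# Hypothetical Definition of Done criteria per column (illustration only).
DOD = {
    "Analysis/design": {"data model reviewed"},
    "Develop": {"packages run without errors"},
}


@dataclass
class Card:
    title: str
    lane: str = "project"            # "project" or "fast"
    column: str = COLUMNS[0]
    done: set = field(default_factory=set)

    def move_next(self):
        """Move to the next column, but only if the DoD of this column is met."""
        missing = DOD.get(self.column, set()) - self.done
        if missing:
            raise ValueError(f"DoD not met for {self.column}: {sorted(missing)}")
        idx = COLUMNS.index(self.column)
        if idx == len(COLUMNS) - 1:
            raise ValueError("card is already in the last column")
        self.column = COLUMNS[idx + 1]


def pick_next(cards):
    """Order cards so that fast-lane cards come before project-lane cards."""
    return sorted(cards, key=lambda c: c.lane != "fast")
```

The point of the sketch is that the criteria live on the board, not in someone's head: moving a card is a shared decision that can be checked against the DoD row.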

The result

Finally, we created a new kanban board. This is shown below.


Kanban can be very useful in a project with a number of team members. Developers have to say something about what they are doing, the impediments they have and what they are planning to do. A Kanban board can be very effective when all members of the team are working towards one goal: delivering value within a certain period of time. I think it's less effective when all members are working on their own deliverable with their own deadline. When team members are working on a shared goal, more cooperation and more synergy will happen.