maandag 8 december 2014

Let's Disco!


Disco is a process mining tool that let's you discover processes by analyzing event logs. It's suitable for analyzing transactions that happen in a process. For instance, a sales process or a order process are examples that can be analyzed. You can examine bottlenecks in your process, loops, time, durations, averages, fast/slow lanes, conformance issues, resource performance, etc.

So, processmining with a tool like ProM, Disco (and I've even seen a demo of SAS Visual Analytics with a Sankey Diagram) are very well suited for process model analysis. Different tooling are enhancing process analysis. Process analysis can be a great addition to Business Intelligence. Where Business intelligence is more like phishing with a fishing rod in an organisation for KPI's and Business Metrics, Process Mining is much more 'on' the process. Business Intelligence and Process Mining can work together to optimize processes.

A simple tour through Disco

In this blogpost, I've included some screenshots about Disco. I've used a small file that I've borrowed from the Coursera Course "Process mining:Data Science in Action" lectured by TUE, Wil van der Aalst. Below the opening screen of Disco.

The first thing that you have to is load the data. In this case a csv file.

Then you have to set the CaseID, Events, Timestamps and the resources. This is needed for analysis of process.

The next step is importing the data and generating the processmap. Below an example of the duration of the processsteps.

Here an example of the mean of durations

And below, an example when you play the data on the model. The yellow/red dots are markers that flow through the model

 And below some statistical analysis with Disco:

Some more information about the cases.


Disco is a great tool for analysis of processes. Process mining can be a great addition to Business Intelligence and very helpful for analysis of processes. Both analyses processes but on different levels.


donderdag 4 december 2014

Process mining


Currently following the course processmining on Coursera and this course is combination of data mining and processmodels. Like the same as with Business intelligence (sort of), processmining analyzes data about processes in an organisation. I'm quite enthusiastic about this approach because it analyzes processes on a scientific manner with data mining. Datamining analyzes data but not with the (direct) aim of looking at processes in an organisation. Processmining does. There is also a relation between BPM (Business Process modelling). In the course sometimes BPM models are used aside petrinets.

Although I've just started with the course, I want to share some interesting things, I've come along during the course. In this blog post I'll describe this.

Defining process mining

The definition of Processmining according to Wikipedia :

"Process mining is a process management technique that allows for the analysis of business processes based on event logs.".

But in my opinion event logs is a bit of a narrow keyword. A more broader definition could be applicable. Think about facts in a star schema or satellite information in a Datavault model that are very often used in business intelligence and data warehousing. These are transactions that happens in the operations of an organisation These are also events. Events that happened. Think about an order entry system with statuses. Sometimes, customers told me how the order entry process worked and when I studied the different statuses an order should have, I sometimes found out that different sequences of the order process were possible. With process mining you can identify undiscovered routes of your business process in automated way. This is truly an addition in the  field of Business intelligence, Lean Six Sigma,  datamining and BPM.

Just a simple example, suppose from the customer you hear that the model is this (orders with order statusses):

But when we study the transactions of the order entry system the following is noticed (records are identified by a case ID (the grouping of the records) and the activity at a certain moment):

Here we see that order 4568 is reopened and this should not have supposed to happen according to the designed model. After analyzing the events in a log or perhaps a transactional modelled star schema the model appears like this (corrected):

It could mean that in the operational process order entry personnel has reopened the order for some reason. If you want optimize the process in order to reduce the wastes (Lean Six sigma) than this is very interesting information. Process mining can do this for you.


Although I've just started with studying process mining, this seems a very interesting approach for analyzing processes with datamining. And, this is also applicable on huge log files and analyzing log files is one of the applications of Big data analytics. 

Hope you have read this blogpost with pleasure..



maandag 1 december 2014

SSAS SSRS : Bad Performance SSRS parameters and SSAS


SSRS Reporting parameters and SSAS cubes......There is a lot to read about on the internet about the bad performance of  parameters that are generated by Reporting Services based upon a SSAS Cube. Today, I've ran into this problem too. I've a parameter with a dataset that queries a customer table of about 40000 records. Peanuts for a simple SQL query, but not for a SSAS cube (pfffff). And, I'm not the only one who has encountered this problem. You can read hundreds of blogposts, forumposts and more. This is one blogpost more about this bad cooperation between SSRS and SSAS.

The lab situation is this: we have a main report with a customer parameter and a sub report. I want to pass a Customer list via the parameter to the subreport in order to show detailrecords. In my real word example a map is shown and clicking on a area on the map shows the detail records that belong to a region.

Analyzing the parameter performance problem

The initial problem of a parameter with 40000 records is simple. It doesn't perform! I will show you what kind of issues rise when you create a parameter with 40000 records.  First create a dataset with a parameter "Customer" by checking the parameter option in the query window.

A parameter dataset is created (you can't see that because it's hidden). Choose for "Show hidden datasets". If you run the report, you'll notice that the performance is bad..very bad. When you open profiler to take a look at what is happening during cube querying a lot of serialize events happen and these happen often (per 1000) and these are sloooowwww. The serialize results current is a very slow process. So processing the query is very fast but serializing the result for the Report is slow.

If you take a look at this diagram, you can see that the serialize is done from the Compute Cell data part and it's transferred to the client application.

Now, I haven't found a silver bullet unfortunately. But Chris Webb helped me a bit further. In his blogpost, Chris Web explains this problem a bit further and he gives a solution. Well, sort of but I'll explain this later. Suppose you add a parameter to a dataset according to the following screenshot. 

Then a parameter is automatically created with a dataset:

And, the query in this dataset looks like this:

  MEMBER [Measures].[ParameterCaption] AS [Customer].[Customer].CURRENTMEMBER.MEMBER_CAPTION 
  MEMBER [Measures].[ParameterValue] AS [Customer].[Customer].CURRENTMEMBER.UNIQUENAME 
  MEMBER [Measures].[ParameterLevel] AS [Customer].[Customer].CURRENTMEMBER.LEVEL.ORDINAL
  [Measures].[ParameterLevel]} ON COLUMNS , 
 [Customer].[Customer].ALLMEMBERS ON ROWS 
 FROM [Adventure Works]

According to Chris Webb  you can replace this query by an OLEDB query. This query performs much faster than the generated query by SSRS.

, [Customer].[Customer].ALLMEMBERS
FROM [Adventure Works]

This speeds up the loading the data into the parameter.

Invalid URI : The URI String is too long 

The next problem I had was that I got an error when I call a subreport with the same parameter. It seems that there is limitation of 65520 characters that you can pass with a parameter. This is the error:

A limitation of the number of characters that can be passed through a subreport. Theo Lachevs explains why. There are a couple of options. The best option is to limit the number of characters of the parameter that are passed to the subreport. This is not perfect because of growing the amount of the data somewhere in the future you'll run into problems again and you can't solve it anymore.

Okay, how do we limit the number of characters that are passed to the subreport? Well, Erika Bakse showed in het blogpost a way to limit the number of members. In this example I've created a main report and a subreport. The main report has a Customer parameter that lists customers

For this blogpost I've created a simple report with a Tablix. On the Textbox Properties I've created an Action with an expression.

This is the expression that is passed to the subreport:

=Join(Parameters!CustomerCustomerName.Label, "],")

This is the output of the expression:

And this expression is without the full membernames and therefore compacted. In the parameter of the subreport this is one large string:

With the following expression you can change these CSV separated string into full membernames:

="[Customer].[Customer Name].&[" + REPLACE(Parameters!Customer.Label(0) + "]", "],", "],[Customer].[Customer Name].&[")

Shown below:

The output of the expression is this:

And now we can pass this to a dataset and filter the records with this list.


This URI problem happens quite often. I encountered this problem on numerous environments with SQL Server 2008, SQL Server 2008 R2 and SQL Server 2012 in different situations. I've also encountered different behaviour between SSRS, Report manager and Reportbuilder.



zondag 16 november 2014

BI and the Anscombe quartet

The anscombe quartet is showing us something that is also related to Business intelligence. In Business Intelligence it's familiar to show KPI's or business metrics. A KPI as one number hides some times important information like the anscombe quartet is showing us.

And, they all have the same mean, variance, deviation, correlation and lineair regression!

What is this saying us? The statistician Francis Anscombe suggested this quartet to demonstrate the importance of graphical data analysis and the effect of outliers before they decide on a statistical analysis on the basis of their characteristics. The data sets show that the simple statistical characteristics are not always sufficient in order to describe the data.


Be very cautious showing numbers in General Business Metrics and KPI's in your report and your dashboards. Tell the whole story behind a KPI or Business metric and do not only show a number.

zaterdag 15 november 2014

SSRS : Multi Language Reporting


Years ago, I've worked in a team building a multiluanguage portal for an international company. We've built all kinds of functions for translating reports. Now, I've to built another multilingual portal for an international company again and google led me to the following blog on Codeproject. This is a neat solution for building multilingual reports. It stores labels, languages and translations. In this example a language parameter is used for parameterized translations. For my current project it seems that this is a cool solution.


In the example below I've created three tables: Label, Language and Translations. The Label contains the ID of the field that needs to be translated. I've modded the language table a bit so that the language field is not a descriptive field but contains the format string equal that of the regional settings on your windows system (eg. en-US). Now you can translate your report automatically with the regional settings of your windows system. I renamed the LabelText table to Transalation because that's what it actually does, in my opinion. This is the DDL:

IF  EXISTS (SELECT * FROM sys.objects 
   WHERE object_id = OBJECT_ID(N'[dbo].[Label]') AND type in (N'U'))
DROP TABLE [dbo].[Label]

IF  EXISTS (SELECT * FROM sys.objects 
   WHERE object_id = OBJECT_ID(N'[dbo].[LabelText]') AND type in (N'U'))
DROP TABLE [dbo].[LabelText]

IF  EXISTS (SELECT * FROM sys.objects 
   WHERE object_id = OBJECT_ID(N'[dbo].[Language]') AND type in (N'U'))
DROP TABLE [dbo].[Language]

CREATE TABLE [dbo].[Label](
 [LabelID] [int] IDENTITY(1,1) NOT NULL,
 [Description] [varchar](900) NOT NULL

CREATE TABLE [dbo].[Translation](
 [LanguageID] [int] NOT NULL,
 [LabelID] [int] NOT NULL,
 [LabelTranslation] [nvarchar](max) NOT NULL

CREATE TABLE [dbo].[Language](
 [LanguageID] [int] IDENTITY(1,1) NOT NULL,
 [LanguageCode] [varchar](100) NOT NULL,
 [Active] [bit] NOT NULL

Here is some content of the table:

INSERT INTO [dbo].[Label]  

INSERT INTO [dbo].[Language]  
VALUES ('en-US', 'English (United States)', 1), 
  ('nl-NL', 'Dutch (Netherlands)', 1),
  ('fr-FR','French (France)',0),
  ('de-DE','German (Germany)',0),
  ('it-IT','Italian (Italy)',0),
  ('ja-JP','Japanese (Japan)',0),
  ('es-ES','Spanish (Spain)',0),
  ('ar-SA','Arabic (Saudi Arabia)',0),
  ('zh-CN','Chinese (PRC)',0),
  ('zh-HK','Chinese (Hong Kong S.A.R.)',0),
  ('zh-TW','Chinese (Taiwan)',0),
  ('cs-CZ','Czech (Czech Republic)',0),
  ('da-DK','Danish (Denmark)',0),
  ('fi-FI','Finnish (Finland)',0),
  ('el-GR','Greek (Greece)',0),
  ('he-IL','Hebrew (Israel)',0),
  ('hu-HU','Hungarian (Hungary)',0),
  ('ko-KR','Korean (Korea)',0),
  ('nb-NO','Norwegian, Bokmål (Norway)',0),
  ('pl-PL','Polish (Poland)',0),
  ('pt-BR','Portuguese (Brazil)',0),
  ('pt-PT','Portuguese (Portugal)',0),
  ('ru-RU','Russian (Russia)',0),
  ('sv-SE','Swedish (Sweden)',0),
  ('tr-TR','Turkish (Turkey)',0),
  ('bg-BG','Bulgarian (Bulgaria)',0),
  ('hr-HR','Croatian (Croatia)',0),
  ('et-EE','Estonian (Estonia)',0),
  ('lv-LV','Latvian (Latvia)',0),
  ('lt-LT','Lithuanian (Lithuania)',0),
  ('ro-RO','Romanian (Romania)',0),
  ('sr-Latn-CS',' Serbian (Latin, Serbia)',0),
  ('sk-SK','Slovak (Slovakia)',0),
  ('sl-SI','Slovenian (Slovenia)',0),
  ('th-TH','Thai (Thailand)',0),
  ('uk-UA','Ukrainian (Ukraine)',0),
  ('af-ZA','Afrikaans (South Africa)',0),
  ('sq-AL','Albanian (Albania)',0),
  ('am-ET','Amharic (Ethiopia)',0),
  ('hy-AM','Armenian (Armenia)',0),
  ('as-IN','Assamese (India)',0),
  ('az-Latn-Z','Azeri (Latin, Azerbaijan)',0),
  ('eu-ES','Basque (Basque)',0),
  ('be-BY','Belarusian (Belarus)',0),
  ('bn-BD','Bengali (Bangladesh)',0),
  ('bn-IN','Bengali (India)',0),
  ('bs-Cyrl-BA','Bosnian (Cyrillic, Bosnia and Herzegovina)',0),
  ('bs-Latn-BA','Bosnian (Latin, Bosnia and Herzegovina)',0),
  ('ca-ES','Catalan (Catalan)',0),
  ('fil-PH','Filipino (Philippines)',0),
  ('gl-ES','Galician (Galician)',0),
  ('ka-GE','Georgian (Georgia)',0),
  ('gu-IN','Gujarati (India)',0),
  ('ha-Latn-NG','Hausa (Latin, Nigeria)',0),
  ('hi-IN','Hindi (India)',0),
  ('is-IS','Icelandic (Iceland)',0),
  ('ig-NG','Igbo (Nigeria)',0),
  ('id-ID','Indonesian (Indonesia)',0),
  ('iu-Latn-CA','Inuktitut (Latin, Canada)',0),
  ('ga-IE','Irish (Ireland)',0),
  ('xh-ZA','isiXhosa (South Africa)',0),
  ('zu-ZA','isiZulu (South Africa)',0),
  ('kn-IN','Kannada (India)',0),
  ('kk-KZ','Kazakh (Kazakhstan)',0),
  ('km-KH','Khmer (Cambodia)',0),
  ('rw-RW','Kinyarwanda (Rwanda)',0),
  ('sw-KE','Kiswahili (Kenya)',0),
  ('kok-IN','Konkani (India)',0),
  ('ky-KG','Kyrgyz (Kyrgyzstan)',0),
  ('lo-LA','Lao (Lao P.D.R.)',0),
  ('lb-LU','Luxembourgish (Luxembourg)',0),
  ('mk-MK','Macedonian (Former Yugoslav Republic of Macedonia)',0),
  ('ms-BN','Malay (Brunei Darussalam)',0),
  ('ms-MY','Malay (Malaysia)',0),
  ('ml-IN','Malayalam (India)',0),
  ('mt-MT','Maltese (Malta)',0),
  ('mi-NZ','Maori (New Zealand)',0),
  ('mr-IN','Marathi (India)',0),
  ('ne-NP','Nepali (Nepal)',0),
  ('nn-NO','Norwegian, Nynorsk (Norway)',0),
  ('or-IN','Oriya (India)',0),
  ('ps-AF','Pashto (Afghanistan)',0),
  ('pa-IN','Punjabi (India)',0),
  ('quz-PE','Quechua (Peru)',0),
  ('sr-Cyrl-CS','Serbian (Cyrillic, Serbia)',0),
  ('nso-ZA','Sesotho sa Leboa (South Africa)',0),
  ('tn-ZA','Setswana (South Africa)',0),
  ('si-LK','Sinhala (Sri Lanka)',0),
  ('ta-IN','Tamil (India)',0),
  ('tt-RU','Tatar (Russia)',0),
  ('te-IN','Telugu (India)',0),
  ('ur-PK','Urdu (Islamic Republic of Pakistan)',0),
  ('uz-Latn-UZ','Uzbek (Latin, Uzbekistan)',0),
  ('vi-VN','Vietnamese (Vietnam)',0),
  ('cy-GB','Welsh (United Kingdom)',0),
  ('wo-SN','Wolof (Senegal)',0),
  ('yo-NG','Yoruba (Nigeria)',0)

DECLARE @LanguageID   AS int = (SELECT LanguageID FROM [Language] WHERE   LanguageCode = 'en-US')     
DECLARE @LabelID   AS int = (SELECT LabelID FROM Label WHERE   Name = 'ReportName1')  

INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'ReportName!')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'PeriodFrom')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'PeriodFrom')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'PeriodUntil')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'PeriodUntil')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Currencies')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Currencies')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Total')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Total')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Time')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Time')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Source')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Source')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Location')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Location')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Page')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Page')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'TotalPages')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'TotalPages')


SET @LanguageID = (SELECT LanguageID FROM [Language] WHERE   LanguageCode = 'nl-NL')  
SET @LabelID  = (SELECT LabelID FROM Label WHERE   Name = 'ReportName1')  

INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'RaportNaam!')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'PeriodFrom')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Periode Van')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'PeriodUntil')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Periode Tot')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Currencies')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Valuta')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Total')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Totaal')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Time')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Tijd')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Source')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Bron')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Location')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Locatie')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'Page')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Pagina')

SET @LabelID = (SELECT LabelID FROM Label WHERE   Name = 'TotalPages')
INSERT INTO [dbo].[Translation] ([LanguageID],[LabelID],[LabelTranslation]) VALUES (@LanguageID, @LabelID, 'Totaal aantal paginas')

The above scripting can be improved be implementing stored procedures. I'll work on that in the future. The next thing to consider is to build a webapplication or .NET application to maintain the translations.


Next, in the report I created a dataset for the supported languages and made it visible, but when I'm almost finished I'll change this in a hidden parameter. Below I've printed some important screenshots of the solution.

This is the language dataset for filling the parameter:

The available values for the language parameter:

Then I've created the translation parameter:

Modded the properties accordingly to the following screenshot:

Then I added the code from the and tweaked it a bit.

Public Function GetTranslation(Parm as Parameter, LabelID as Integer, _
                Optional AddColon as Boolean=False) as String

Dim t as Integer

For t = 0 to Ubound(Parm.Value)
   If (Parm.Value(t) = LabelID) Then
      if AddColon = true then
         Return Parm.Label(t) & ":"
         Return Parm.Label(t)
      End if
   End if
Next t

Return ""

End Function

And I tested it with an expression in a textbox : Code.GetTranslation(Parameters!Translation, 2)

Depending on the language parameter the right translation is chosen form the translation table.


When you want to make your report truly multilanguage, then you have to go a step further. Then you have to set the language property in the report, for instance. This will format your fixed elements like a date (eg. 11/5/2014 or 5/11/2014).

Another important thing to consider is the data. I've blogged already about multi currency issues earlier (Part I, Part II and Part III). This solution will transform your amount fields in every currency you want. In a new blogpost I'm going to write about multicurrency (Part IV) I'll show how to use the formatstring and the Fields!Amount("FORMAT_STRING") option to format the currencies ccording to the settings of the SSAS cube. This way the currency symbols are inherited from the SSAS cube. Below an example of the report in English

Here an example in Dutch:

One thing to consider and to improve is to define multilingual values for contents of a table. For instance Bicycle, car and Bus for English and Fiets, Auto and Bus in dutch. In AdventureWorks this is solved by adding extra columns to a dimension: Englishname, FrenchName, etc. Perhaps I'll take a look into that....


It's possible to make your report multilingual. Standard, there is no support for a multilingual report in Reporting Services, unfortunately. In C# and .NET you have resource files that stores the translations. With aid of some logic in SSRS we can come quite along way..