Wednesday, June 22, 2016

First impression of SQL Server R Services (part I)

Introduction

I wanted to know more about R Services, so I decided to walk through the tutorial provided by Microsoft. It consists of a couple of lessons that help you understand R Services. This will be the first blog post about R Services, and many will follow during my discovery of it. R Services is a powerful addition to SQL Server, and one of its advantages is that you don't need to load your data to the client anymore: you can push your code to the R Services instance of SQL Server and take advantage of a powerful server.
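
To give an idea of what that looks like: with the RevoScaleR package that ships with R Services you can switch the compute context from the client to SQL Server, after which the rx* functions execute on the server. A minimal sketch (the server name and share directory are assumptions for my test environment):

# Point the RevoScaleR compute context at SQL Server (sketch, assumed connection details).
library(RevoScaleR)

connStr <- "Driver=SQL Server;Server=WIN-11283AFOULP;Database=nyctaxi;Trusted_Connection=Yes"

# Define a compute context that lives inside SQL Server...
sqlCompute <- RxInSqlServer(connectionString = connStr,
                            shareDir = "c:\\AllShare",   # assumed local exchange directory
                            wait = TRUE, consoleOutput = FALSE)

# ...and make it the active one: from here on, rx* functions run on the server.
rxSetComputeContext(sqlCompute)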

So this blog post is about the first step (Lesson 1: Prepare the Data, from the Data Science End-to-End Walkthrough), which is getting everything ready for analysis: installing packages, getting the data, loading the data into SQL Server, and installing some of the code in SQL Server.

The data is a representative sampling of the New York City taxi data set, which contains records of over 173 million individual trips in 2013, including the fares and tip amounts paid for each trip.
For more information about how this data was originally collected, and how you can get the full data set, see http://chriswhong.com/open-data/foil_nyc_taxi/.

To make the data easier to work with, the Microsoft data science team performed downsampling to get just 1% of the data.

Walking through the steps

First I opened RStudio, created a new script, and copied the code for installing the packages.



if (!('ggmap' %in% rownames(installed.packages()))){  
  install.packages('ggmap')  
}  
if (!('mapproj' %in% rownames(installed.packages()))){  
  install.packages('mapproj')  
}  
if (!('ROCR' %in% rownames(installed.packages()))){  
  install.packages('ROCR')  
}  
if (!('RODBC' %in% rownames(installed.packages()))){  
  install.packages('RODBC')  
}  
 
 

And I executed this code to install the packages on the server.

       
# Install required packages on the server
install.packages("ggmap", lib=grep("Program Files", .libPaths(), value=TRUE)[1])  
install.packages("mapproj", lib=grep("Program Files", .libPaths(), value=TRUE)[1])       
 

The next step was downloading the data from GitHub. The code I copied from the Microsoft site gave some errors because some slashes were missing.

       
$source = 'https://raw.githubusercontent.com/Azure/Azure-MachineLearning-DataScience/master/Misc/RSQL/Download_Scripts_R_Walkthrough.ps1'
$ps1_dest = "$pwd\Download_Scripts_R_Walkthrough.ps1"
$wc = New-Object System.Net.WebClient
$wc.DownloadFile($source, $ps1_dest)
.\Download_Scripts_R_Walkthrough.ps1 -DestDir 'C:\tempR'
 

And this is a screenshot of the entry and the execution.


Once the script has finished, the files are in the tempR folder:


The next step was to install the scripts in SQL Server. This is done with the RunSQL_R_Walkthrough.ps1 PowerShell script.

      
.\RunSQL_R_Walkthrough.ps1

Input the database server name (the full address): WIN-11283AFOULP
Input the name of the database you want to create: nyctaxi
Input the user name which has the previlige to create the database:
Input the password of user name which has the previlige to create the databas
Input the path to the csv file you want to upload to the database: c:\tempR       
 

Here is a screenshot of this process.



The script is executed and the results are shown in the PowerShell window. I've grabbed the dump and placed it here.

Start creating database and table on your SQL Server, and uploading data to the table. It may take a while...
---------------------------------------------
Executed SQL packet:
DECLARE @db_name varchar(255), @tb_name varchar(255)
DECLARE @create_db_template varchar(max), @create_tb_template varchar(max), @create_tb_template2 varchar(max)
DECLARE @sql_script varchar(max)
set @db_name = 'nyctaxi'
SET @tb_name = 'nyctaxi_sample'
SET @create_db_template = 'create database {db_name}'
SET @create_tb_template = '
use {db_name}
CREATE TABLE {tb_name}
(
       medallion varchar(50) not null,
       hack_license varchar(50)  not null,
       vendor_id char(3),
       rate_code char(3),
       store_and_fwd_flag char(3),
       pickup_datetime datetime  not null,
       dropoff_datetime datetime,
       passenger_count int,
       trip_time_in_secs bigint,
       trip_distance float,
       pickup_longitude varchar(30),
       pickup_latitude varchar(30),
       dropoff_longitude varchar(30),
       dropoff_latitude varchar(30),
       payment_type char(3),
       fare_amount float,
       surcharge float,
       mta_tax float,
       tolls_amount float,
       total_amount float,
       tip_amount float,
       tipped int,
       tip_class int
)
CREATE CLUSTERED COLUMNSTORE INDEX [nyc_cci] ON {tb_name} WITH (DROP_EXISTING = OFF)
'

SET @create_tb_template2 = '
use {db_name}
CREATE TABLE nyc_taxi_models
(
        model varbinary(max) not null
)
'

-- Create database
SET @sql_script = REPLACE(@create_db_template, '{db_name}', @db_name)
EXECUTE(@sql_script)

-- Create table
SET @sql_script = REPLACE(@create_tb_template, '{db_name}', @db_name)
SET @sql_script = REPLACE(@sql_script, '{tb_name}', @tb_name)
EXECUTE(@sql_script)

-- Create the table to persist the trained model
SET @sql_script = REPLACE(@create_tb_template2, '{db_name}', @db_name)
EXECUTE(@sql_script)

Execution succesful
-----------------------------------------
C:\tempR\create-db-tb-upload-data.sql execution done
start loading the data to SQL Server table...
Unable to load BCP resource DLL.  BCP cannot continue.
This step (creating database and tables, and uploading data to table) takes 27.19 seconds.
Start running the .sql files to register all stored procedures used in this walkthrough...
---------------------------------------------
Executed SQL packet:
USE [nyctaxi]

Execution succesful
---------------------------------------------
Executed SQL packet:
/****** Object:  StoredProcedure [dbo].[PersistModel]    Script Date: 11/05/2015 23:04:39 ******/
SET ANSI_NULLS ON

Execution succesful
---------------------------------------------
Executed SQL packet:
SET QUOTED_IDENTIFIER ON

Execution succesful
---------------------------------------------
Executed SQL packet:

IF EXISTS (SELECT * FROM sys.objects WHERE type = 'P' AND name = 'PersistModel')
  DROP PROCEDURE PersistModel

Execution succesful
---------------------------------------------
Executed SQL packet:
-- =============================================
-- Author:              <Author,,Name>
-- Create date: <Create Date,,>
-- Description: Store model in database. Input is a binary representation of R Model object passed as Hex string
-- =============================================
CREATE PROCEDURE [dbo].[PersistModel]
@m nvarchar(max)
AS
BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;
        insert into nyc_taxi_models (model) values (convert(varbinary(max),@m,2))
END

Execution succesful
-----------------------------------------
C:\tempR\PersistModel.sql execution done
---------------------------------------------
Executed SQL packet:
USE [nyctaxi]

Execution succesful
---------------------------------------------
Executed SQL packet:

/****** Object:  StoredProcedure [dbo].[PredictTip]    Script Date: 10/29/2015 4:36:55 PM ******/
SET ANSI_NULLS ON

Execution succesful
---------------------------------------------
Executed SQL packet:

SET QUOTED_IDENTIFIER ON

Execution succesful
---------------------------------------------
Executed SQL packet:

IF EXISTS (SELECT * FROM sys.objects WHERE type = 'P' AND name = 'PredictTipBatchMode')
  DROP PROCEDURE PredictTipBatchMode

Execution succesful
---------------------------------------------
Executed SQL packet:

CREATE PROCEDURE [dbo].[PredictTipBatchMode] @inquery nvarchar(max)
AS
BEGIN

  DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1
    model
  FROM nyc_taxi_models);
  EXEC sp_execute_external_script @language = N'R',
                                  @script = N'
mod <- unserialize(as.raw(model));
print(summary(mod))
OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL,
          predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
str(OutputDataSet)
print(OutputDataSet)
',
                                  @input_data_1 = @inquery,
                                  @params = N'@model varbinary(max)',
                                  @model = @lmodel2
  WITH RESULT SETS ((Score float));

END


Execution succesful
-----------------------------------------
C:\tempR\PredictTipBatchMode.sql execution done
---------------------------------------------
Executed SQL packet:
USE [nyctaxi]

Execution succesful
---------------------------------------------
Executed SQL packet:

/****** Object:  StoredProcedure [dbo].[PredictTipSingleMode]    Script Date: 10/31/2015 5:44:43 PM ******/
SET ANSI_NULLS ON

Execution succesful
---------------------------------------------
Executed SQL packet:

SET QUOTED_IDENTIFIER ON

Execution succesful
---------------------------------------------
Executed SQL packet:

IF EXISTS (SELECT * FROM sys.objects WHERE type = 'P' AND name = 'PredictTipSingleMode')
  DROP PROCEDURE PredictTipSingleMode

Execution succesful
---------------------------------------------
Executed SQL packet:


CREATE PROCEDURE [dbo].[PredictTipSingleMode] @passenger_count int = 0,
@trip_distance float = 0,
@trip_time_in_secs int = 0,
@pickup_latitude float = 0,
@pickup_longitude float = 0,
@dropoff_latitude float = 0,
@dropoff_longitude float = 0
AS
BEGIN

  DECLARE @inquery nvarchar(max) = N'
  SELECT * FROM [dbo].[fnEngineerFeatures](
  @passenger_count,
@trip_distance,
@trip_time_in_secs,
@pickup_latitude,
@pickup_longitude,
@dropoff_latitude,
@dropoff_longitude)
        '
  DECLARE @lmodel2 varbinary(max) = (SELECT TOP 1
    model
  FROM nyc_taxi_models);
  EXEC sp_execute_external_script @language = N'R',
                                  @script = N'

mod <- unserialize(as.raw(model));
print(summary(mod))
OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet, outData = NULL,
          predVarNames = "Score", type = "response", writeModelVars = FALSE, overwrite = TRUE);
str(OutputDataSet)
print(OutputDataSet)
',
                                  @input_data_1 = @inquery,
                                  @params = N'@model varbinary(max),@passenger_count int,@trip_distance float,
@trip_time_in_secs int ,
@pickup_latitude float ,
@pickup_longitude float ,
@dropoff_latitude float ,
@dropoff_longitude float',
                                  @model = @lmodel2,
                                                                    @passenger_count =@passenger_count ,
                                                                        @trip_distance=@trip_distance,
                                                                        @trip_time_in_secs=@trip_time_in_secs,
                                                                        @pickup_latitude=@pickup_latitude,
                                                                        @pickup_longitude=@pickup_longitude,
                                                                        @dropoff_latitude=@dropoff_latitude,
                                                                        @dropoff_longitude=@dropoff_longitude
  WITH RESULT SETS ((Score float));

END



Execution succesful
-----------------------------------------
C:\tempR\PredictTipSingleMode.sql execution done
---------------------------------------------
Executed SQL packet:
USE [nyctaxi]

Execution succesful
---------------------------------------------
Executed SQL packet:

/****** Object:  UserDefinedFunction [dbo].[fnCalculateDistance]    Script Date: 10/29/2015 4:33:52 PM ******/
SET ANSI_NULLS ON

Execution succesful
---------------------------------------------
Executed SQL packet:

SET QUOTED_IDENTIFIER ON

Execution succesful
---------------------------------------------
Executed SQL packet:

IF EXISTS (SELECT * FROM sys.objects WHERE type IN ('FN', 'IF') AND name = 'fnCalculateDistance')
  DROP FUNCTION fnCalculateDistance

Execution succesful
---------------------------------------------
Executed SQL packet:

CREATE FUNCTION [dbo].[fnCalculateDistance] (@Lat1 float, @Long1 float, @Lat2 float, @Long2 float)
-- User-defined function calculate the direct distance between two geographical coordinates.
RETURNS float
AS
BEGIN
  DECLARE @distance decimal(28, 10)
  -- Convert to radians
  SET @Lat1 = @Lat1 / 57.2958
  SET @Long1 = @Long1 / 57.2958
  SET @Lat2 = @Lat2 / 57.2958
  SET @Long2 = @Long2 / 57.2958
  -- Calculate distance
  SET @distance = (SIN(@Lat1) * SIN(@Lat2)) + (COS(@Lat1) * COS(@Lat2) * COS(@Long2 - @Long1))
  --Convert to miles
  IF @distance <> 0
  BEGIN
    SET @distance = 3958.75 * ATAN(SQRT(1 - POWER(@distance, 2)) / @distance);
  END
  RETURN @distance
END

Execution succesful
-----------------------------------------
C:\tempR\fnCalculateDistance.sql execution done
---------------------------------------------
Executed SQL packet:
USE [nyctaxi]

Execution succesful
---------------------------------------------
Executed SQL packet:

/****** Object:  UserDefinedFunction [dbo].[fnEngineerFeatures]    Script Date: 10/29/2015 4:36:07 PM ******/
SET ANSI_NULLS ON

Execution succesful
---------------------------------------------
Executed SQL packet:

SET QUOTED_IDENTIFIER ON

Execution succesful
---------------------------------------------
Executed SQL packet:

IF EXISTS (SELECT * FROM sys.objects WHERE type IN ('FN', 'IF') AND name = 'fnEngineerFeatures')
  DROP FUNCTION fnEngineerFeatures

Execution succesful
---------------------------------------------
Executed SQL packet:

CREATE FUNCTION [dbo].[fnEngineerFeatures] (
@passenger_count int = 0,
@trip_distance float = 0,
@trip_time_in_secs int = 0,
@pickup_latitude float = 0,
@pickup_longitude float = 0,
@dropoff_latitude float = 0,
@dropoff_longitude float = 0)
RETURNS TABLE
AS
  RETURN
  (
  -- Add the SELECT statement with parameter references here
  SELECT
    @passenger_count AS passenger_count,
    @trip_distance AS trip_distance,
    @trip_time_in_secs AS trip_time_in_secs,
    [dbo].[fnCalculateDistance](@pickup_latitude, @pickup_longitude, @dropoff_latitude, @dropoff_longitude) AS direct_di
stance

  )

Execution succesful
-----------------------------------------
C:\tempR\fnEngineerFeatures.sql execution done
Completed registering all stored procedures used in this walkthrough.
This step (registering all stored procedures) takes 22.5 seconds.
Plug in the database server name, database name, user name and password into the R script file
This step (plugging in database information) takes 0.2 seconds.    
 

Two tables are created: nyc_taxi_models and nyctaxi_sample.




Now the table needs to be filled with the data. There seems to be only one data file available while two tables were created; the reason is visible in the script above: the second table, nyc_taxi_models, is used later on to persist the trained model.


The data is loaded into the table nyctaxi_sample, which contains 1,703,957 rows.


The right number of rows is in the table.
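
Since the RODBC package was installed at the start of this walkthrough, the row count can also be checked from R. A small sketch (the connection string is an assumption for my test server):

library(RODBC)

# Open a connection to the nyctaxi database (assumed connection details).
conn <- odbcDriverConnect("Driver=SQL Server;Server=WIN-11283AFOULP;Database=nyctaxi;Trusted_Connection=Yes")

# Count the rows in the sample table; this should return 1,703,957.
sqlQuery(conn, "SELECT COUNT(*) AS number_of_rows FROM dbo.nyctaxi_sample")

odbcClose(conn)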


Conclusion

This is the first blog post about R Services, and it is too early to say much about it yet. In the next blog post I'll elaborate more on R Services and the functions needed.

Greetz,

Hennie



PowerPivot: Discovering prediction of sales with simulation

Introduction

I was wondering whether I could add predictive analytics to my PowerPivot workbooks. If I could predict the sales based on the sales in the previous month, or on the same period last year, or perhaps both, it would be a neat addition to my workbooks.

Now, the problem lies in the fact that statistical support in PowerPivot 2013 is very limited. In Excel you have statistical functions like NORM.DIST() and NORM.INV() that can help you build predictions (with simulation) in workbooks, but in PowerPivot 2010 and 2013 practically no support for statistics is available. We can indeed calculate the average and the standard deviation, but if you want to go a step further and use, for instance, a normal distribution function to calculate probabilities, you are out of luck.

So this blog post is a discovery-style post in which I try to find solutions that help predict sales with the aid of Excel 2013 and PowerPivot 2013. I found out that there are also (custom) functions in SQL Server that can help you build simulations, but that is for a later blog post.

Simulation explained

Simulation is the mimicking of the operation of a real system, such as the day-to-day operations in an organization, with the goal of understanding the system and the implications of its changes over time, including forecasting. By using simulation you can quickly find out how a department or an organization might behave in the future. Simulation in a probabilistic model means that the outcome of the model changes every time you run it, because of a certain randomness in the formulas. This is in contrast with deterministic models, where the output is the same every time you run the model.

Simulation doesn't show you what will occur; instead, it will show you many scenarios that might occur!
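
To make this concrete: a probabilistic model can be as small as a few lines of R. In the sketch below rnorm() plays the same role as NORM.INV(RAND();...) does in Excel later in this post; the average and standard deviation are assumed values:

avg_sales <- 1000   # assumed average daily sales
sd_sales  <- 150    # assumed standard deviation

# Draw one scenario of 30 simulated days; every run gives a different outcome.
simulated_sales <- rnorm(30, mean = avg_sales, sd = sd_sales)
summary(simulated_sales)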

The data sets

I’ve created a couple of data sets in this model. The first one is the sales fact for a certain organization. For simplicity, I’ve only included a date and a sales amount for the current month. We could extend this with another period if we like, for instance the same period last year, but I won't explain that in more detail; this blog post is just about getting the idea.

So this is the calendar dimension, with a date as key and a day number as an attribute.



This is the sales data, which could live in a fact table:


Now, to use the data in the simulation we need to calculate the average and the standard deviation. Luckily, PowerPivot 2013 can calculate these!

       
AvgSales:=CALCULATE(AVERAGE(Sales[Sales]); ALL(Sales))

StanDevSales:=CALCULATE(STDEV.P(Sales[Sales]); ALL(Sales))       
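
Note that the ALL(Sales) in both measures removes any filter context, so the average and the standard deviation are always calculated over the complete sales table, regardless of the selection in the pivot.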
 

Now, because PowerPivot does not have statistical functions, I use Excel to calculate the predictions. So I pull the average and the standard deviation into Excel and calculate the simulation per day there.


With this code for the average sales and for the standard deviation:

       
=CUBEVALUE("ThisWorkbookDataModel"; "[Measures].[AvgSales]")
       
=CUBEVALUE("ThisWorkbookDataModel";"[Measures].[StanDevSales]")


The following formula is used in the table to calculate the predicted sales based on the average and the standard deviation:

       
=NORM.INV(RAND();$E$4;$E$5)       
 

Now I pull this information back into PowerPivot in order to integrate it later in a PivotChart.


And now I can build this data model:


And now I can draw this graph with the information:


And if we refresh the data, you can see that it is a probabilistic model: the prediction comes out differently every time the simulation is executed. This is illustrated with the grey line.




Review of the solution

Now, there are some serious drawbacks to this solution. If I want to reload the data with new calculations, PowerPivot reports an error: the column with the predicted sales keeps showing "GETTING DATA".




If the "Refresh Selected" button is pressed, the error disappears and the data is refreshed again. Now, this is not very workable in an automated environment, is it?



Another disadvantage is passing the average and the standard deviation to Excel with CUBEVALUE. I haven't found another solution yet, but I need to reference these cells with the average and standard deviation in the NORM.INV() function in Excel. This is fine for high-level sales analysis, but what if we want to include products in the simulation? Referencing every product with CUBEVALUE is time consuming and prone to errors when products are introduced and others are phased out.

Yet another problem arises when the data is not normally distributed. In the diagram below the predicted sales even go negative!


Conclusion

This was an interesting exercise to understand simulation a bit better. It seems that support for this in PowerPivot 2013 is minimal. Perhaps PowerPivot 2016 or Power BI v2 (Desktop) offer better tooling, although I've read that there is still no support for Excel-like NORM.INV() and NORM.DIST() functions in Power BI v2. SQL Server 2016 has an R engine included, but I think it is possible to automate this with SQL Server T-SQL functions as well. That is for a future blog post.

Regarding the simulation, this example assumes that the data is normally distributed, but there are, for instance, products that were just introduced and have another pattern (exponential?). If you want to predict trustworthy figures, a good model is needed. The problem is: what is a good model? As someone said, "All models are wrong but some are less wrong", so you have to put in a maximum of effort to bring the model towards perfection. Perhaps you have to apply the 80/20 rule: 20% of the effort will give you 80% of the correctness.

In my opinion you first need to test what kind of distribution the data has. Is it normally distributed, or is it skewed? Are there seasonal aspects in the data? And can I attach a reliability number to a prediction? So, a nice discovery of the subject, but not ready for a real situation.
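
As an illustration, such a distribution check is simple to do in R (a sketch; the sales vector here is made-up placeholder data):

sales <- rnorm(100, mean = 1000, sd = 150)   # placeholder data, replace with real sales

hist(sales, breaks = 20)                     # visual check of the shape
qqnorm(sales); qqline(sales)                 # points close to the line suggest normality
shapiro.test(sales)                          # formal test: a small p-value suggests non-normality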

Greetz,
Hennie

Tuesday, June 21, 2016

Error: This PivotTable report is invalid

Introduction

I was struggling for hours with a PivotTable problem in Excel. Although I can't exactly pinpoint the source of the problem, I would like to share it, and a possible solution, with you.

The Problem

The situation is that I have a workbook with multiple sheets, and every sheet shows a different view of the data, for instance from a product hierarchy or a customer hierarchy. Now, the client wanted to use slicers to filter all the pivot tables and pivot charts on multiple sheets in the workbook. The situation is shown below.


Now the real pain: the following error message kept coming up with everything I tried: "This PivotTable report is invalid. Try refreshing the Data...".


It seems to be a problem in Excel, and not in PowerPivot, which is running in the background.

The solution

After trying to turn off the report connections (and not succeeding), I decided to remove the second tab sheet and start from scratch, and right away the problems vanished.


As a side effect the other slicers performed a lot better now.

Update 04-06-2016 (NL): Today I ran into this error again. This time I deleted a hierarchy that was in the pivot table and put it back again, and it worked again. Maybe the cause was that I was editing the underlying query in PowerPivot.

Conclusion

Cross-selecting (Rob Collie already used the term cross-filtering in another context) multiple pivot tables with slicers can be a real pain. So don't make it too complex, or you'll end up in a swamp of errors in Excel.

Greetz,
Hennie


Wednesday, June 15, 2016

What is Github?

Introduction

GitHub is a code hosting platform for version control and collaboration. You can work together on projects from any place you want. Once in a while I use GitHub while working on my R skills: for exercises in the course I need to create and maintain a GitHub repository. A repository is usually a project, and it can contain files and folders.

Now, as I said, I'm not a user that uses GitHub very often, and that is why I wrote this blog post. There are a couple of steps that you have to take in order to check in your code; in short:
  1. Create the GitHub account.
  2. Install GitHub Desktop.
  3. Create a repository.
  4. Copy the files into the GitHub folder.
  5. Commit the code.
  6. Sync the code with GitHub.com.
Committing and syncing I do find a bit confusing: committing saves the code in your local project, and syncing transfers the committed code to GitHub.com. Let's keep that in mind.
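
For readers who know the git command line: roughly speaking, GitHub Desktop's Commit and Sync correspond to the commands below (a sketch; the Desktop client performs these steps for you):

git add .                        # stage the new and changed files
git commit -m "Add exercise"     # commit: save the change in the local repository
git pull                         # sync, part 1: fetch changes from GitHub.com
git push                         # sync, part 2: upload the committed code to GitHub.com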


Working with Github

Now, if you create a repository, a folder is created. This one is for the Reproducible Research course (in R) on Coursera.


Opening GitHub Desktop will show the files and code in the GitHub folder on your local system.


Nothing is in the GitHub.com repository yet.


The next step is syncing the code with GitHub.com.


And the files are synced with GitHub.com.



Conclusion

A small blog post about using GitHub with GitHub Desktop.

Greetz,
Hennie

Saturday, June 11, 2016

Installation of Team Foundation Server 2015

Introduction

As your projects get bigger and bigger and more people get involved, a normal Visual Studio project is not enough anymore; it is not built for multi-user support. In order to maintain integrity and control in your project, it is a good idea to install Team Foundation Server 2015 and integrate it with your Visual Studio project. So let's take a look at the installation process of Team Foundation Server. In this blog post I'm showing the installation process of Team Foundation Server Express 2015.

Installation of Team Foundation Server

First download Team Foundation Server Express 2015 and execute the installer; it starts with the following window.


The installation continues:


And it reports some warnings at the end of the install process. As it is only a warning, I decide to ignore it for now.


When you start Team Foundation Server, you start by configuring SQL Server. Press Start Wizard.


The configuration begins.


Again a warning, and I decide to ignore it.


Press Next.


This was the installation process of Team Foundation Server 2015 Express.

Conclusion

This is a small blogpost about the installation of Team Foundation Server 2015 Express.


Greetz,

Hennie

Tuesday, June 7, 2016

DAX: Some DAX QTD, YTD and LY calculations with a Period Dimension

Introduction

Sometimes there are workbooks where the granularity is not available on a daily basis, but on a higher level, for instance on a weekly or a monthly level. The problem is that the standard time intelligence functions only work on a date dimension with day granularity, and not on a higher level of granularity. In this blog post I'll write down some of the time intelligence calculations based on a monthly period dimension. The key of this dimension has the format YYYYMM (for instance, 201606 for June 2016).

Net revenue

The first thing to do is always to create an explicit measure for the base aggregation, in this case the net revenue. This is a best practice and has many advantages, like separating the business logic from the presentation layer. For more best practices I advise you to take a look at this blogpost.

       
SumNR:=SUM(FactSalesPerMonth[NetRevenue])


QuarterToDate (QTD)

One requirement that is often heard is that customers want QuarterToDate functionality. So for the months January, February, and March the values add up, and the total is reset when April starts.

       
SumNetRevenueQTD:=CALCULATE ([SumNR];
                     FILTER(ALL(DimPeriodMonth);
                       DimPeriodMonth[CalendarYear] = VALUES(DimPeriodMonth[CalendarYear]) &&
                       DimPeriodMonth[Quarter] = VALUES(DimPeriodMonth[Quarter]) &&
                       DimPeriodMonth[MonthNumber] <= MAX(DimPeriodMonth[MonthNumber])
                     )
                )
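
The pattern here, which returns in the measures below, is that FILTER over ALL(DimPeriodMonth) rebuilds the set of periods — in this case the months of the year and quarter in context, up to and including the month in context — and [SumNR] is then evaluated over that set.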
 

YearToDate (YTD)

Yet another one is the YearToDate calculation. This calculation is used to summarize the numbers in a year until a certain point in time.

       
SumNRYTD:=CALCULATE ([SumNR];
                    FILTER(ALL(DimPeriodMonth);
                      DimPeriodMonth[CalendarYear] = VALUES(DimPeriodMonth[CalendarYear]) &&
                      DimPeriodMonth[MonthNumber] <= MAX(DimPeriodMonth[MonthNumber])
                    )
                  )
    

Last Year (LY)

Sometimes customers want to compare the numbers with last year. This can be done with the following expression:

       
SumNRLY:=CALCULATE([SumNR];
                    FILTER(ALL(DimPeriodMonth);
                      DimPeriodMonth[MonthNumber] = MAX(DimPeriodMonth[MonthNumber]) &&
                      DimPeriodMonth[CalendarYear] = MAX(DimPeriodMonth[CalendarYear]) - 1
                    )
                 )       
 

QuarterToDate LastYear 

The following DAX expression can be used to calculate the QuarterToDate, not for this year, but for the year before. And 'this year' depends on the context of the calculation, of course.

       
SumNRQTDLY:=CALCULATE ([SumNR];
               FILTER(ALL(DimPeriodMonth);
                   DimPeriodMonth[CalendarYear] = VALUES(DimPeriodMonth[CalendarYear]) - 1 &&
                   DimPeriodMonth[Quarter] = VALUES(DimPeriodMonth[Quarter]) &&
                   DimPeriodMonth[MonthNumber] <= MAX(DimPeriodMonth[MonthNumber])
               )
            )
       

YearToDate LastYear 

And the last one I would like to mention is the YearToDate, but then for last year. This DAX expression calculates the sum for last year up to a certain point in time (depending on the context of the calculation):

       
SumNRYTDLY:=CALCULATE ([SumNR];
               FILTER(ALL(DimPeriodMonth);
                  DimPeriodMonth[CalendarYear] = VALUES(DimPeriodMonth[CalendarYear])-1 &&
                  DimPeriodMonth[MonthNumber] <= MAX(DimPeriodMonth[MonthNumber])
               )
            )
       


Conclusion

These are some handy DAX expressions that you can use in your calculations.

Greetz,
Hennie

Monday, June 6, 2016

Installation of SQL Server Data Tools

Introduction

Now the last part of the installation of SQL Server 2016: SQL Server Data Tools. In the past you had to install SSDT BI and SSDT DB as two separate installations. That was confusing and led to many misunderstandings. In this installation the two are combined.

Installation process

Start the setup.exe.


Again, a download from a website is started.


I choose to install the English version.


And the installation process is executed.


Now check whether we have both the DB and BI options in SSDT, and we do...


Conclusion

A blog post about the installation of SSDT. Both DB and BI are installed in a single installation.

Greetz,

Hennie

Installation of SQL Server Management Studio 2016

Introduction

It seems that Microsoft decided to make SQL Server Management Studio a separate installation. And you need to have an internet connection to download the software; it is about 825 MB.

Installation

Start the installation by opening the setup.exe of SQL Server Developer Edition and click on Install SQL Server Management Tools.


And Internet Explorer is opened and you can download SSMS 2016:


Click on the executable and start the installation with Install.


A restart is needed and you're done.

Conclusion

A small blog post about the installation of SSMS 2016. It is now a separate installation.

Greetz,

Hennie

Sunday, June 5, 2016

Installation of SQL Server Developer Edition 2016

Introduction

I've downloaded the SQL Server Developer Edition 2016 ISO from the Visual Studio Dev Essentials site and installed it in a VirtualBox environment. This is my first experience with SQL Server 2016; I didn't get involved in the CTPs and the RCs. This blog post is just a plain and simple installation with most of the defaults. Disclaimer: this is not a recommended approach; it's just to learn and see how the new installation process goes.

You can download the new SQL Server at Visual Studio Dev Essentials.

Installation

Once you have the new SQL Server Developer Edition, you can start the installation process. This is the familiar installation screen. Click on Installation.


Some screenshots along the way...


Click a couple of times on Next.


And here we are at the features window.


Some of the interesting (new) features:


Okay, living on the edge! Let's install all the features and see what happens during the next steps in the installation process.


Okay that went well...;-)


Clicking on the failed Oracle... link will show this:


Okay, now let's download this driver.


KB2919355 

And now solve this one:


Searching for this KB leads to an update for Windows 8.1. I'm running Windows Server 2012 R2 and I haven't updated this server yet.


Okay... a 700 MB download starts up. The next step was that I tried to install the software, but it seems that the software is incompatible: "The update is not applicable to your computer". Really?


Maybe there is a Windows Server 2012 R2 version of KB2919355? Strangely enough, if I google for KB2919355 and Windows Server 2012 R2, I see the same page where I downloaded the Windows 8.1 update. This is really great.

I decided to run Windows Update to fix the problem. 1 GB of updates. Hopefully that will do the trick. But it keeps complaining about KB2919355.


It seems I'm not the only one: http://answers.microsoft.com/en-us/windows/forum/windows8_1-update/unable-to-install-the-kb2919355-update/a3d7d0e8-df36-4a19-9a9a-dab9423bce04?page=2

And finally I managed to fix this according to this site. Just rerun the KB2919355 update.


The question is: why is this so difficult, and why is KB2919355 not in Windows Update? Well, let's leave that to the other bloggers ;-) Let's run it by pressing "Yes".


Installation proceeds

And now the installation proceeds with the other screens. Click Next, Next, etc.

Click Next.


Okay, this is interesting too. I can set up the tempdb during the installation process.


Next...

And these are the installed programs:



Conclusion

Now, this was the core of the (not exactly best-practice) installation of SQL Server. The next steps will be the installation of SQL Server Management Studio and SSDT (BI and DB).

Greetz,

Hennie