dinsdag 8 november 2016

SQL : ColumnStore (part III) : enforce uniqueness (SQL Server 2014)

Introduction

Columnstores in SQL Server 2014 can not enforce primary keys and you can't create an unique non clustered index in contrast with SQL server 2016, where you are able to create an unique non clustered index on the ColumnStore table. That's a pity because I have to work with SQL Server 2014. Now, how can you enforce a semi primary key on a ColumnStore table? Well, there seems to be a way to do that with indexed views together with an unique clustered index. In this blogpost I'll show you the way. For this blogpost I've used the blogpost of MSDN where you can find more information about this.

Enforce uniqueness with Indexed view

For this blogpost I've created a simple table.
       
/*
DROP TABLE testIndexedVIEW
GO
*/
CREATE TABLE testIndexedVIEW
(TestID int)
GO     
 

The next step is creating a ColumStore table with the following statement:
       
CREATE CLUSTERED COLUMNSTORE INDEX CCI_testIndexedVIEW ON testIndexedVIEW;  
GO        
 

Next step is building the indexed view on this table (WITH SCHEMABINDING):
       
/* 
DROP VIEW dbo.testIndexedViewUniqueness
GO
*/

CREATE VIEW dbo.testIndexedViewUniqueness
WITH SCHEMABINDING
AS 
SELECT TestID
FROM dbo.testIndexedVIEW
GO      
 

And now create a Unique Clustered Index on this Indexed view:
       

CREATE UNIQUE CLUSTERED INDEX UNCI_testIndexedViewUniqueness ON testIndexedViewUniqueness(TestID)
GO
       
 

That's it. The columnstore, indexed view and Unique Clustered Index are in place and now it's time to test this by inserting twice the same value in the column with the Unique Clustered Index.
       
INSERT INTO [dbo].testIndexedVIEW
           (TestID)
     VALUES
           (1)

INSERT INTO [dbo].testIndexedVIEW
           (TestID)
     VALUES
           (1)     
 

Resulting in an error message.

Level 14, State 1, Line 21
Cannot insert duplicate key row in object 'dbo.testIndexedVIEW' with unique index 'UNCI_testIndexedViewUniqueness'. The duplicate key value is (1).
The statement has been terminated.

Conclusion

It's not possible to create an Unique Clustered Index on a ColumnStore table in SQL Server 2014. With an Indexed View, together with a Unique Nonclustered index you can enforce the uniqueness of the key fields.

Greetz,

Hennie

maandag 7 november 2016

SQL : ColumnStore Index (part II) : Indexing

Introduction

In my first blogpost about the ColumnStore Index, I've written about creating and dropping a columnstore. Please took a look at my first blogpost: SQL : ColumnStore Index (part I). In this blogpost I'll continue and observe some details about the ColumnStore Index, particularly differences between SQL Server 2014 and SQL Server 2016 building non clustered indexes on ColumnStores.

Creating indexes on the ColumnStore 

So, the table sale (imported form WorldWideImporters demo database) is not a Rowstore anymore but a ColumnStore. Let's experiment with some indexing. Can we create a clustered index on the ColumnStore Index? It seems to not very logical (and it is) but let's try and see what happens.

       
CREATE CLUSTERED INDEX [CI_CustomerKey] ON Sale([Customer Key])
       

Resulting in :

Msg 1902, Level 16, State 3, Line 15
Cannot create more than one clustered index on table 'Sale'. Drop the existing clustered index 'CCI_Sale' before creating another.

So, this will generate error, as expected. Let's try to create a non clustered index on the ColumnStore (in SQL Server 2016)

       
CREATE NONCLUSTERED INDEX [CI_CustomerKey] ON Sale([Customer Key])


And experiment with a query and see what happens.

       
SELECT [Customer Key] FROM Sale WHERE [Customer Key] = 6
       

And now it uses the non clustered index :



But when I try to do this on SQL Server 2014, it will give an error:

Msg 35303, Level 16, State 1, Line 1
CREATE INDEX statement failed because a nonclustered index cannot be created on a table that has a clustered columnstore index. Consider replacing the clustered columnstore index with a nonclustered columnstore index.

So, creating a (unique) non clustered index on SQL Server 2014 is not possible but on SQL Server 2016 it is easy done.

Conclusion

It is not possible to create (unique) non clustered indexes on a Clustered ColumnStore in SQL Server 2014. In SQL Server 2016 it is a piece of cake.

Greetz,

Hennie


zondag 6 november 2016

SQL : ColumnStore Index (part I)

Introduction

ColumnStore indexes are a powerful addition to SQL Server and it started with SQL Server 2012 and it continued to evolve to a better ColumnStore Index today. I want to learn more about it and it did some investigation about the columnstore Index. So, this blogpost is not a in depth blogpost but a reflection of some basics of the columnstore index.

Now, I'm working with SQL Server 2016 and it is possible to create a column store table with the CREATE TABLE statement. Unfortunately my client is working on SQL Server 2014 and therefore I won't use that.

Playground

First, let's start with getting some data from the WorldWideImporters database.

SELECT * 
INTO Sale
FROM WideWorldImportersDW.[Fact].[Sale]


Create the ColumnStore

Now, we have a table Sale in a database (I've called it ColumnStoreTest). Because we are pretending that we are working on SQL Server 2014 we create the ColumnStore Index with the CREATE CLUSTERED COLUMNSTORE INDEX command

       
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Sale ON [Sale];  
GO  


How do we know whether we created a Column Store Table? Well, go the properties of the table and click on the Storage tab and here you can see the CompressionType as ColumnStore (actually, I'm not sure whether this is also in SQL Server 2014).


Drop the ColumnStore

Next stop is dropping the ColumnStore with the DROP INDEX statement:

       
DROP INDEX CCI_Sale ON [Sale]
GO  
 

This will drop the index and it will return to the row store. The CompressionType is set to none.



Conclusion 

First discovery of the ColumnStore Index: how to build and how to drop a columnstore index.

Greetz,

Hennie



woensdag 2 november 2016

WorldWideImporters Data warehouse Datamodels

Introduction

Just a small blogpost about the data models of World wide Importers data warehouse (WorldWideImportersDW). I just wanted to get an overview of the data warehouse models of World Wide Importers demo database of SQL Server 2016. There seems to be 6 facts:
  • Order
  • Sale 
  • Purchase
  • Stock Holding
  • Movement
  • Transaction
WideWorldImportersDW is the main database for data warehousing and analytics (OLAP – OnLine Analytics Processing). The data in this database is derived from the transactional database WideWorldImporters, but it uses a schema that is specifically optimized for analytics.

Order

This is the order fact of the WorldWide Importers database and there a couple of (role playing) dimensions here:
  • Order date and Picked Date (Date)
  • SalesPerson and Picker (Employee)
  • City
  • Date
  • Customer
  • StockItem





Sale

The sales fact contains almost the same dimensions as the order fact. There are a couple of (role playing) dimensions here:
  • InvoiceDate and DeliveryDate (Date).
  • SalesPerson (Employee).
  • City.
  • Date.
  • Customer and BillToCustomer (Customer).
  • StockItem


Purchase

The purchase Fact has the following dimensions :
  • Date
  • Supplier
  • StockItem



Stock holding 

The StockHolding is a bit weird fact, in my opinion. I don't why and how but there seems to be one dimension: 
  • Stock holding

Movement

The movement fact has the following dimensions:
  • Date
  • StockItem
  • Customer
  • Supplier
  • TransactionType




Transaction

And the last one, Transaction has the following dimensions:

  • Date
  • Customer and BillToCustomer (customer)
  • Supplier
  • TransactionType
  • PaymentMethod



Conclusion

A simple and overview of the WorldWideImporters demo database of SQL Server 2016.

Greetz,

Hennie

zondag 30 oktober 2016

New, stopped, active and returned products

Introduction

I want to develop a report for the evolution of products. The question is whether a product that is sold is a new product in a given period (innovative product), a product that is not sold anymore in the given period (stopped product), is actively bought in a given period and a product that is sold in the previous period and is also sold in the current period.

Now, yet another interesting thing with this approach is that you can also use this in customer behavior analysis for customer retention, -attrition, -churning and -loyality.

Now there is something to say about this approach because it's not based on a certain definition and I accept this for this particular situation. The customer is happy with a year to year comparison between products. So, in this lab situation the logic is dependent on the filter context (in this case the calendar year slicer).

Lab 

I have made a small workbook with some sample data to experiment with the DAX expression. This is the data that is used for the building the expressions


And this is the model.



New products

Fist build your implicit measure SumNetrevenue and the Previous year calulation. A new product is a product that is sold in the current period (year) and is not sold in the previous period. You could argue that, I know, but for me it is ok enough, for now.

SumNetRevenue:=SUM(Fact[Revenue])
       
NR-1Year:=CALCULATE([SumNetRevenue];
                     FILTER(ALL(Period);
                            Period[Year] = MAX(Period[Year]) - 1
                            )
                    )

#NewProducts:=COUNTROWS(FILTER(VALUES(Fact[ProductID]);[NR-1Year] = 0))


Stopped products

A stopped product is a product that is sold in the previous period but is not sold in the current period. This is the DAX expression:

       
#LostProducts:=COUNTROWS(FILTER(Product; ([#ActiveProducts] = 0 && [NR-1Year] <> 0)))       
 


Active products

An active product is a product that is solfd in in the current year (and could have been sold in previous years)

       
#ActiveProducts:=DISTINCTCOUNT(Fact[ProductID])


Returned products

A returned product is a product that is sold in the previous period and is also sold in the current period. This DAX expression is borrowed from the DAXPatterns of Alberto Ferrari and Marco Russo.

       
#ReturningProduct:=COUNTROWS (
    CALCULATETABLE (
        VALUES ( Fact[ProductID] );
        VALUES ( Fact[ProductID] );
        FILTER (
            ALL ( Period );
           Period[PeriodID] < MIN (Period[PeriodID] )
        )
    )
)
 

Conclusion

Interesting DAX expressions to use in customer- or product analysis.

Greetz,

Hennie