Sunday, 31 August 2014

70-467 Study Guide reformatted

Introduction

In this blog post I've gathered and reformatted the study guide for Microsoft exam 70-467 for SQL Server 2014. This exam is the last one of the MCSE BI certification (depending on the order in which you take the exams, of course). I reformatted the guide to give a better overview of the topics covered by the exam. I hope it helps you study the exam topics.

1. Plan business intelligence (BI) infrastructure (15-20%)

  • Plan for performance
    • Optimize batch procedures: extract, transform, load (ETL) in SQL Server Integration Services (SSIS)/SQL and processing phase in Analysis Services.
    • Configure Proactive Caching within SQL Server Analysis Services (SSAS) for different scenarios
    • Analyze and optimize the performance of Multidimensional Expressions (MDX) and Data Analysis Expressions (DAX) queries.
    • Understand the difference between partitioning for load performance versus query performance in SSAS.
    • Appropriately index a fact table
    • Optimize Analysis Services cubes in SQL Server Data Tools.
    • Create aggregations.
    • Understand performance consequences of named queries in a data source view.
  • Plan for scalability
    • Multidimensional OLAP (MOLAP).
    • Relational OLAP (ROLAP).
    • Hybrid OLAP (HOLAP).
    • Change binding options for partitions.
  • Plan and manage upgrades
    • Plan change management for a BI solution.
  • Maintain server health
    • Design an automation strategy. 


2. Design BI infrastructure (15-20%)

  • Design a security strategy
    • Configure security and impersonation between the database, Analysis Services and the front end.
    • Implement Dynamic Dimension Security within a cube.
    • Configure security for an extranet environment.
    • Configure Kerberos security.
    • Skills in authentication mechanisms.
    • Ability to build secure solutions end to end.
    • Design security roles for calculated measures.
    • Understand the tradeoffs between regular SSAS security and dynamic security
  • Design a SQL partitioning strategy
    • Choose the proper partitioning strategy for the data warehouse and cube.
    • Implement a parallel load to fact tables by using partition switching.
    • Use data compression in fact table.
    • Design optimal data compression.
  • Design a High Availability and Disaster Recovery strategy
    • Design a recovery strategy; back up and restore SSAS databases.
    • Back up and restore SSRS databases; move and restore the SSIS Catalog.
    • Design an AlwaysOn solution.
  • Design a logging and auditing strategy
    • Design a new SSIS logging infrastructure (for example, information available through the catalog views).
    • Validate data is balancing and reconciling correctly.


3. Design a reporting solution (20-25%)

  • Design a Reporting Services dataset
    • Data query parameters
    • Managing data rights and security.
    • Extracting data from analysis services.
    • Balancing query-based processing versus filter-based processing.
    • Managing data sets through the use of stored procedures.
    • Create appropriate DAX queries for an application.
    • Extract data from analysis services by using MDX queries
  • Manage Excel Services/reporting for SharePoint
    • Configure data refresh schedules for PowerPivot published to SharePoint.
    • Publish BI info to SharePoint.
    • Use SharePoint to accomplish BI administrative tasks.
    • Install and configure Power View.
    • Publish PowerPivot and Power View to SharePoint.
  • Design a data acquisition strategy
    • Identify the data sources that need to be used to pull in the data.
    • Determine the changes (incremental data) in the data source (time window).
    • Identify the relationship and dependencies between the data sources.
    • Determine who can access which data.
    • What data can be retained for how long (regulatory compliance, data archiving, aging).
    • Design a data movement strategy.
    • Profile source data.
    • Customize data acquisition using DAX with reporting services data sources.
  • Plan and manage reporting services configuration
    • Native mode.
    • Choose the appropriate reporting services requirements (including native mode and SharePoint mode).
  • Design BI reporting solution architecture
    • Linked reports, drill-down reports, drill-through reports, migration strategies, access report services API, sub reports, code-behind strategies.
    • Identify when to use Reporting Services, Report Builder, or Crescent.
    • Design/implement context transfer when interlinking all types of reports (RS, RB, Crescent, Excel, PowerPivot).
    • Implement BI tools for reporting in SharePoint (Excel Services versus PerformancePoint versus Reporting Services).
    • Select a subscription strategy.
    • Identify when to use Reporting Services, Report Builder, or Power View.
    • Design/implement context transfer when interlinking all types of reports (RS, RB, Power View, Excel).
    • Implement BI tools for reporting in SharePoint (Excel Services versus Power View versus Reporting Services).
    • Enable Data Alerts.
    • Design map visualization.


4. Design BI data models (30-35%)

  • Design the data warehouse
    • Design a data model that is optimized for reporting; design and build a cube on top.
    • Design enterprise data warehouse (EDW) and OLAP cubes; choose between natural keys and surrogate keys when designing the data warehouse; use the facilities available in SQL Server to design, implement and maintain a data warehouse (partitioning, slowly changing dimensions (SCD), change data capture (CDC), Clustered Index Views, etc.).
    • Identify design best practices.
    • Implement a many-to-many relationship in an OLAP cube.
    • Design a data mart/warehouse in reverse from an Analysis Services cube (or empty Analysis Services cube that was created referring requirements).
    • Choose between performing aggregation operations in the SSIS pipeline or the relational engine.
    • Use SQL Server to design, implement, and maintain a data warehouse (including partitioning, slowly changing dimensions [SCD], change data capture [CDC], Index Views, and columnstore indexes).
    • Design a data mart/warehouse in reverse from an Analysis Services cube.
  • Design a schema
    • Multidimensional modeling starting from a star or snowflake schema.
    • Relational modeling for a Data Mart.
  • Design cube architecture
    • Partition cubes and build aggregation strategies for the separate partitions.
    • Design a data model.
    • Choose the proper partitioning strategy for the data warehouse and cube.
    • Design the data file layout.
    • Given a requirement, identify the aggregation method that should be selected for a measure in a MOLAP cube.
    • Performance tune a MOLAP cube using aggregations.
    • Design a data source view.
    • Cube drill-through and write back actions.
    • Choose the correct grain of data to store in a measure group.
    • Design analysis services processing by using indexes, indexed views, and order by statements.
  • Design fact tables
    • Design a data warehouse that supports many-to-many dimensions with factless fact tables.
  • Design BI semantic models
    • Plan for a multidimensional cube.
    • Write a UDM model with many-to-many relationships (this is related to MDX/BISM code, but it is a good example for exercises).
    • Choose between UDM and BISM depending on the type of data and workload.
    • Plan for a multidimensional cube.
    • Support a many-to-many relationship between tables.
    • Choose between multidimensional and tabular depending on the type of data and workload.
  • Design and create MDX calculations
    • MDX authoring.
    • Identify the structures of MDX and the common functions (tuples, sets, TopCount, SCOPE, etc.).
    • Identify which MDX statement would return the required result (single result and multiple MDX options provided to test taker).
    • Implement a custom MDX or logical solution for a pre-prepared case task.
    • Create calculated members in an MDX statement.


5. Design an ETL solution (10-15%)

  • Design SSIS package execution
    • Use the new project deployment model.
    • Pass values at execution time.
    • Share parameters between packages.
    • Plan for incremental loads vs. full loads.
    • Optimize execution by using Balanced Data Distributor (BDD).
    • Choose optimal processing strategy (including Script transform, flat file incremental loads, and Derived Column transform).
  • Plan to deploy SSIS solutions
    • Deploy the package to another server with different security requirements.
    • Secure Integration Services packages that are deployed to the file system.
    • Demonstrate awareness of SSIS packages/projects and how they interact with environments (including recoverability).
    • Decide between performing aggregation operations in the SSIS pipeline or the relational engine
    • Plan to automate SSIS deployment.
    • Plan the administration of the SSIS Catalog database.
  • Design package configurations for SSIS packages
    • Avoid repeating configuration information entered in SSIS packages and use configuration files.

Conclusion

This blog post presents a reformatted version of the study guide for the 70-467 exam.


Greetz,

Hennie

Wednesday, 16 July 2014

SSAS : Introduction to Tabular Model in SQL Server 2014 (RTM) (Part I)

Introduction

In this introductory blog post I'll explain the Tabular Model by walking through the tutorial that is available on MSDN. In this tutorial a Tabular Model is created on the AdventureWorks database. The tutorial is based on Adventure Works Cycles, a fictitious company that produces and distributes bicycles in North America, Europe and Asia.

1. Setting up the project

The first thing to do is to create the project in Visual Studio 2012. In SQL Server Data Tools, on the File menu, click New, and then click Project. In the New Project dialog box, under Installed Templates, click Business Intelligence, then click Analysis Services, and then click Analysis Services Tabular Project. In Name, type AW Internet Sales Tabular Model, then specify a location for the project files. Click OK.


2. Select an Analysis Services Instance

The next dialog is about selecting the right workspace server and the right compatibility level. First, set the workspace server to the Tabular instance of SQL Server Analysis Services. Next, set the compatibility level of the Tabular model. A SQL Server 2014 Analysis Services instance supports the following compatibility levels (database versions) (MSDN):
  • SQL Server 2012 (1100)
  • SQL Server 2012 SP1 (1103)
  • SQL Server 2014 (1103)
Strangely enough, I see two compatibility levels: SQL Server 2012 (1100) and SQL Server 2012 SP1 (1103).

I assume that SQL Server 2012 SP1 (1103) and SQL Server 2014 (1103) are the same. Checking the compatibility level in the Analysis Services properties in SQL Server Management Studio shows that the compatibility level there is also 1103.


Kasper de Jonge blogged about the differences between compatibility levels 1100 and 1103. Please take a look at his post for the differences between the two.

3. The Tabular project

Once you have created the Tabular project, the following window is presented on the screen. There are a couple of interesting menus, options and properties available. In the Model menu, you can launch the Table Import Wizard, view and edit existing connections, refresh workspace data, browse your model in Microsoft Excel with the Analyze in Excel feature, create perspectives and roles, select the model view, and set calculation options. In the Table menu, you can create and manage relationships between tables, specify date table settings, create partitions, and edit table properties. And in the Column menu you can add and delete columns in a table, freeze columns, and specify the sort order.



In the properties window of model.bim you can see the compatibility level again (1103), as well as the DirectQuery property. This property controls whether the Tabular model is deployed as an in-memory model or not.

Conclusion

In this blog post I gave an introduction to the Tabular model (Part I). I wrote about the compatibility level of the Tabular Model in SQL Server 2014 and Visual Studio 2012.

Greetz,
Hennie

Thursday, 10 July 2014

R: getwd() and setwd() (Part II)

Introduction

R is a very popular programming language for statistical analysis of data. All kinds of tools support R already, or will do so soon. In this blog post and the follow-up posts I've gathered some constructs that are used quite often when you program in R. It is not a complete list, but it is a handy one.

This is the second post about R in a series:
  • R : An introduction (Part I).
  • R : getwd() and setwd()  (Part II).

getwd() and setwd()

The first step when using R is checking the working directory. The working directory is the place where all of your project files are stored. You need to keep track of it in each of your R sessions. If you read or write files, this takes place in the working directory.


> getwd()
[1] "C:/Users/Hennie/Documents"

The working directory is set with setwd():

> setwd("~/Coursera-RProgramming")
> getwd()
[1] "C:/Users/Hennie/Documents/Coursera-RProgramming" 

Conclusion

With setwd() and getwd() the working directory is set and checked.

Greetz,
Hennie




Wednesday, 9 July 2014

SSAS : Partitions in a multidimensional cube

Introduction

When your fact tables grow, your multidimensional cube grows too (if you have a multidimensional cube, of course). You can improve performance by splitting the measure groups into partitions. Partitions allow you to divide a cube into one or more folders. These folders are usually placed on one or more hard drives for improved performance. Multiple partitions can have the following advantages (mssqltips):
  • Increased performance by placing the partitions on different disks.
  • Each partition can have its own storage mode.
  • Parallel processing of partitions.
  • etc.
In this blog post I'll show you how to create partitions in SSAS by using the SSAS tutorial on MSDN. In this tutorial a cube is created on the AdventureWorksDW2012 database. The Internet Sales measure group is divided into partitions by year.

Creating the partitions

Open the cube, navigate to the Partitions tab and click New Partition. A screenshot is shown in the picture below. Here you can see the standard partitions.


A wizard opens and the following window appears. Click Next.




Change from Table binding to Query binding. Check the "internet tables" in the Available tables box and press OK.


If you change to Query binding, the following window is shown. The WHERE clause is not finished yet; this is where you have to enter your partition range.


The AdventureWorks data runs from 2005 until 2008. Below, the first range is entered, from 20050101 until 20051231. Now click Check; a message box confirms that the syntax check was successful. Press OK.
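The yearly ranges described above follow a fixed pattern, so the query text for each partition can be generated rather than typed by hand. The Python sketch below is only an illustration under stated assumptions: the post does not name the fact table or the date key column, so `dbo.FactInternetSales` and `OrderDateKey` are hypothetical placeholders, and the output is plain text to paste into the Query binding window, not a call into any SSAS API.

```python
# Minimal sketch: build one query-binding SELECT per yearly partition.
# FACT_TABLE and DATE_KEY are assumed names, not taken from the post.
FACT_TABLE = "dbo.FactInternetSales"  # hypothetical fact table
DATE_KEY = "OrderDateKey"             # hypothetical integer key in yyyymmdd form


def year_bounds(year):
    """Return the first and last yyyymmdd key of a calendar year."""
    return year * 10000 + 101, year * 10000 + 1231


def partition_query(year):
    """Return the SELECT statement that restricts a partition to one year."""
    first, last = year_bounds(year)
    return (
        f"SELECT * FROM {FACT_TABLE} "
        f"WHERE {DATE_KEY} >= {first} AND {DATE_KEY} <= {last}"
    )


def partition_queries(first_year, last_year):
    """Return a {year: query} mapping for every year in the inclusive range."""
    return {year: partition_query(year) for year in range(first_year, last_year + 1)}


if __name__ == "__main__":
    for year, sql in partition_queries(2005, 2008).items():
        print(year, sql)
```

Using inclusive lower and upper bounds per year keeps the ranges from overlapping. That matters because a row that matches two partition queries would be loaded into both partitions and counted twice.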



Now you have to enter the location of your partition. You can also process the partitions on different servers. Press OK.



Do you want to design aggregations for the partition? In this example I will not do that. Press Finish.




The first partition is now ready and you can create the next one.


Finally, when you have done this for 2005, 2006, 2007 and 2008, the Partitions tab will look like the picture below.


Conclusions

It is very easy to add partitions to a cube, and a simple partition design is straightforward.


Greetz,

Hennie

Friday, 4 July 2014

Microsoft SQL Server 2014 Sample databases

Introduction

Today I decided to study SQL Server 2014 and wanted to install the sample databases for SQL Server 2014. However, I couldn't find the AdventureWorks sample databases for SQL Server 2014. It seems that the sample databases of SQL Server 2012 can also be used with SQL Server 2014.

By default, sample databases and sample code are not installed during the installation of SQL Server, so you have to find and download the sample databases yourself.

Sample databases

On Codeplex you can download the AdventureWorks databases for SQL Server. It's quite a list and I grouped the files together.

Relational Databases (OLTP):
  • AdventureWorks2012 Data File
  • AdventureWorks 2012 OLTP Script
  • AdventureWorks2012-Full Database Backup.zip
  • AdventureWorks2012_Data.zip
  • AdventureWorks2012 CS Data File
  • AdventureWorks 2012 CS OLTP Script
  • AdventureWorksLT2012_Data
  • AdventureWorks 2012 LT Script

Databases (DW)
  • AdventureWorksDW2012 Data File
  • AdventureWorksDW2012Images

SSAS MD
  • AdventureWorks Multidimensional Models SQL Server 2012
  • Analysis Services Tutorial SQL Server 2012

SSAS Tabular
  • AdventureWorks Tabular Model SQL Server 2012
  • AdventureWorks Internet Sales Tabular Model SQL Server 201

Here are some other samples for SQL Server I've found so far:

Conclusion

It seems that the sample databases of SQL Server 2012 are reusable for SQL Server 2014. There are some more samples created specifically for SQL Server 2014, but the AdventureWorks databases are the same.

Greetz

Hennie