Incremental Processing

I would like to dedicate this post to incremental processing of SSAS cubes. It is an interesting approach that lets you process new data without reprocessing all the data in an SSAS cube, or even all the data in an SSAS database.

On systems with large data volumes, it is very useful not to reprocess all cube data from scratch, because processing takes longer and longer as more data arrives. There may also be a requirement to refresh the data in SSAS cubes more frequently, and running a full process during production hours can be a problem.

Here I would like to demonstrate the incremental approach to processing data into SSAS cubes, along with some difficulties you may run into and should be aware of.

  • The main assumption is that the cube is in a processed state; only then can data be added incrementally. A cube can end up unprocessed for several reasons, for example after deploying a new solution, or after fully processing dimensions when their data changed (this is not an issue in scenarios where the dimensions do not change).
  • Another important point is that you have to control, at the relational level, which data you are going to process, so that you do not get duplicates. Data that was already processed in the past must not be processed again by incremental processing.

In my simple example I created:

  1. Relational database and SSAS database (see post here)
  2. SSIS package for processing data (see post here)

All scripts can be downloaded here.

In the processing options of the fact table, select Process Add for incremental processing (set it in the SSIS package or, if you are using Management Studio, set it there). The other options will be explained and demonstrated in one of the next posts to complete the overview of SSAS processing options.

Picture 01 – SSIS – Analysis Services processing task – Process Options

Process the database with an empty data source. As you can see below, the SSAS dimension is empty.

Picture 02 – Browsing dimension data in Visual studio

Now put data into the dimension table and process it.

INSERT INTO dbo.DIMTest(name)
SELECT 'Test'

Change Process Options to Process Full. If you leave Process Default, the data will not be added to the SSAS dimension, because Process Default only does work when an object is unprocessed or partially processed, for example after a change in the dimension structure.

Picture 03 – Management Studio – Process Dimension – Process Options

You should now see data in your dimension when you browse it (Browse Dimension in Management Studio).

Picture 04 – Visual Studio – Browsing dimension data

Change it in the SSIS package too.

Picture 05 – Visual Studio – Analysis Services processing task

Now let’s try our incremental processing. Put some data into our fact table.

TRUNCATE TABLE dbo.FactTest__Delta
INSERT INTO dbo.FactTest__Delta (dimtest_id)
SELECT 1
INSERT INTO dbo.FactTest(dimtest_id,InsertDateTime)
SELECT dimtest_id,GETDATE()
FROM dbo.FactTest__Delta

Process the partition using the Process Add option. Now you will see an error.

Error: 0xC1140017 at Processing Fact, Analysis Services Execute DDL Task: Errors in the metadata manager. The process type specified for the Fact Test partition is not valid since it is not processed.

From the message you can see that our cube partition is not in a processed state, so we are not able to add data to it. Go to Management Studio, right-click the partition or cube, and click Properties.

Picture 06 – SQL Management studio – object property – Status – State

You can see that the cube's State property is Unprocessed. This can happen, for example, when you deploy a new cube solution to the server. In our case we did not deploy anything. Another cause is full processing of dimensions, which is what happened here: Process Full on a dimension leaves the partitions that depend on it unprocessed. When the data in your dimensions changes, as in our example, you should process them with the Process Update option instead. It does not bring the cube to the Unprocessed state. Change this option in Management Studio and reprocess the cube/partitions.

Picture 07 – Visual Studio – Analysis Services Processing Task – Dimension – Process Update

Clean the data and repeat the previous steps with Process Options changed to Process Update.

After reprocessing the partition with the Process Add option, you should see data in the cube (Management Studio -> right-click the cube object -> Browse cube).

Picture 08 – Visual Studio – Browse cube

You can see that the cube State now stays Processed.

Picture 09 – SQL Management Studio – Cube properties – Status State

Put more data into the dimension and fact tables and process it with the package we created. Everything works fine; the data was processed into the SSAS database.

INSERT INTO dbo.DIMTest (name)
SELECT 'Test 2'
TRUNCATE TABLE dbo.FactTest__Delta
INSERT INTO dbo.FactTest__Delta (dimtest_id)
SELECT 2
INSERT INTO dbo.FactTest(dimtest_id,InsertDateTime)
SELECT dimtest_id,GETDATE()
FROM dbo.FactTest__Delta
Picture 10 – Visual Studio – browse cube data
Picture 11 – Visual Studio – browse cube data
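
To cross-check the numbers shown in the cube browser, the same totals can be computed from the relational tables. The query below is only a sketch: it assumes the cube exposes a row-count measure on FactTest, which the post does not state explicitly. The alias FactRowCount is made up for illustration.

```sql
-- Count fact rows per dimension member in the relational source.
-- Assumption: the cube has a row-count measure on FactTest, so these
-- numbers are comparable with what Browse Cube shows.
SELECT d.id,
       d.name,
       COUNT(f.id) AS FactRowCount   -- COUNT(f.id) ignores members without facts
FROM dbo.DIMTest AS d
LEFT JOIN dbo.FactTest AS f
       ON f.dimtest_id = d.id
GROUP BY d.id, d.name
ORDER BY d.id;
```

If the cube reports a higher count for a member than this query does, some rows were most likely added to the partition more than once.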

You can try commenting out the cleaning of the delta table to see what happens. You will see that the data is doubled. It is very important to make sure the delta table contains only data that has not yet been processed into the cube.

Picture 12 – Visual Studio – browse cube data – duplicity
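
One possible way to guard against this kind of duplication is to rebuild the delta table from a watermark instead of relying on a manual TRUNCATE. The following is a minimal sketch, not the method used in this post: the dbo.ProcessWatermark table and its LastProcessed column are hypothetical, and it assumes the partition processed with Process Add reads from dbo.FactTest__Delta. Note that it reverses the direction used above (here the delta is derived from FactTest rather than copied into it).

```sql
-- Hypothetical control table: time up to which facts were already added to the cube.
CREATE TABLE dbo.ProcessWatermark
(
    LastProcessed datetime NOT NULL
);
INSERT INTO dbo.ProcessWatermark (LastProcessed) VALUES ('19000101');
GO

-- Fix the upper bound once so that rows arriving during the load
-- are picked up by the next run instead of being lost or doubled.
DECLARE @now datetime = GETDATE();

-- Rebuild the delta so it holds only rows newer than the watermark.
-- Rows with a NULL InsertDateTime are skipped by this filter.
TRUNCATE TABLE dbo.FactTest__Delta;

INSERT INTO dbo.FactTest__Delta (dimtest_id)
SELECT f.dimtest_id
FROM dbo.FactTest AS f
CROSS JOIN dbo.ProcessWatermark AS w
WHERE f.InsertDateTime >  w.LastProcessed
  AND f.InsertDateTime <= @now;

-- Process Add on the partition runs here, outside T-SQL (for example from the SSIS package).

-- Only after Process Add has succeeded: advance the watermark.
-- In a real package this would be a separate step receiving the same @now value.
UPDATE dbo.ProcessWatermark
SET LastProcessed = @now;
```

Because the watermark only moves after a successful run, repeating a failed run rebuilds the same delta rather than adding the rows twice.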

Incremental processing is a nice and effective way to get data into your Data Warehouse and SSAS database. Of course, it depends on the characteristics of your data whether it can be implemented or whether there is a better way to get data into your cubes. Here I wanted to show a simple example of how you can manage it through an SSIS package. And remember: you have to take care of the data that was already processed and of the state of your SSAS cube.

Create SSAS database

Here I would like to quickly describe how to create a simple SSAS database. If you don’t have Data Tools in your Visual Studio, you can download them here: https://bit.ly/32etQGG.

Here are the steps in short:

  1. Create relational database
  2. Create Analysis Services project
  3. Create connection to Data Source
  4. Create Data Source view and put Data Source objects in there
  5. Create dimensions
  6. Create cube, measures and dimension usage
  7. Process and deploy created SSAS database

Let’s create the database structure first. For simplicity, I created one test dimension table and one test fact table.

CREATE DATABASE [Test]
CONTAINMENT = NONE ON PRIMARY
( NAME = N'Test', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\Test.mdf' , SIZE = 512000KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON ( NAME = N'Test_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\Test_log.ldf' , SIZE = 512000KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
USE [Test]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

-- Create the dimension table first: the fact table's foreign key references it.
CREATE TABLE [dbo].[DIMTest](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [name] [sysname] NOT NULL,
CONSTRAINT [PK__DIMTest_id] PRIMARY KEY CLUSTERED
(   [id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]
)
ON [PRIMARY]
GO

CREATE TABLE [dbo].[FactTest](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [dimtest_id] [int] NULL,
    [InsertDateTime] [datetime] NULL
)
ON [PRIMARY]
GO

ALTER TABLE [dbo].[FactTest] WITH CHECK ADD CONSTRAINT [FK__DIMTest_id] FOREIGN KEY([dimtest_id]) REFERENCES [dbo].[DIMTest] ([id])
GO

ALTER TABLE [dbo].[FactTest] CHECK CONSTRAINT [FK__DIMTest_id]
GO
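
The incremental processing example also inserts into dbo.FactTest__Delta, a table the script above does not create (it may be part of the downloadable scripts). A minimal sketch of that table could look like this; the shape is an assumption based only on the dimtest_id column used by the INSERT statements in that example.

```sql
USE [Test]
GO

-- Assumed shape of the delta table: only the column that the
-- INSERT statements in the processing example actually use.
CREATE TABLE [dbo].[FactTest__Delta](
    [dimtest_id] [int] NULL
)
ON [PRIMARY]
GO
```

The table is left without a foreign key here so that it can be truncated and reloaded freely between processing runs.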

Now open Microsoft Visual Studio and create a new Analysis Services project. In the picture below you can see Solution Explorer with the final solution. I tried to list the individual steps you need to take to create your SSAS database, in the required order.

Picture 01 – Visual Studio – Solution Explorer

First create a connection to your data source, which is a SQL Server database in our case. Right-click the Data Sources section and choose New Data Source from the context menu.

Picture 02 – Visual Studio – Data source

As the next step we have to create a Data Source View containing the objects we would like to work with. Right-click the Data Source Views section and choose New Data Source View from the context menu. Then right-click the Data Source View surface and select Add/Remove Tables from the context menu. Here we select the tables we created at the beginning. As you can see from the menu, you can create queries too. Visual Studio tries to create relationships between the objects you put into the Data Source View, based on the referential integrity defined on those objects. If constraints are missing on your tables, or Visual Studio did not create a relationship for some reason, you can create or modify it manually (by dragging the mouse between objects, or from the context menu of the Data Source View surface).

Picture 03 – Visual Studio – Data source view

In the next step create the DIMTest dimension; afterwards we will define dimension usage, that is, the relationship between the cube and the dimension. Right-click Dimensions and choose New Dimension from the context menu. The Dimension Wizard appears and guides you through creating the dimension. You can choose whether to create the dimension from an existing table in the Data Source View or to generate it. In the following steps you are asked to select the attributes that will be present in the dimension. Click Next through the following dialog windows.

Picture 04 – Visual Studio -Dimension Wizard
Picture 05 – Visual Studio -Dimension Wizard – Attributes

When you are finished, you should see something similar to the picture below. I will not show other sections such as Attribute Relationships. Let’s stay with a simple dimension structure and its attributes.

Picture 06 – Visual Studio -Dimension structure

If we had other dimensions, we would add them in a similar way. Because I have only one dimension in my example, I can move on to the next step: creating the cube and defining measures and dimension usage. Right-click the Cubes section and select New Cube.

As with dimensions, the Cube Wizard helps you create the cube and its attributes. Select the Use existing tables option in the next window. Then select the table that will be used for the measure group, which is the FactTest table in our scenario.

Picture 07 – Visual Studio – Cube Wizard – Measure Group

Set the measures we want to get from our SSAS cube.

Picture 08 – Visual Studio – Cube Wizard – Measures

Select the existing dimensions we created in one of the previous steps.

Picture 09 – Visual Studio – Cube Wizard – Selecting dimension

Finally, enter a name for our new cube.

Picture 10 – Visual Studio – Cube Wizard – Cube name

Now you should see the cube designer window. On the left side, in the Cube Structure section, are the measures we set.

Picture 11- Visual Studio – Cube structure – Measures

Right-click a measure to open its properties and check which aggregation function is used for the measured data.

Picture 12- Visual Studio – New Measure

In the next section, Dimension Usage, we set the dimensions used by our cube and their relationships. Our scenario is very simple: we add our DIM Test dimension and set a Regular relationship with the fact table, which is typical for a star schema in the SSAS multidimensional model.

Picture 13 – Dimension usage- Relationship

Finally, we should see the settings shown in the picture below.

Picture 14 – Dimension usage

Processing and deployment are available, as shown in the picture below, by right-clicking the database (project) node in Solution Explorer. The deployment settings are located in Visual Studio under the top menu Project -> Properties -> Deployment section. Here you set the destination server and the name of the SSAS database you deploy.

Picture 15 – Deploy SSAS project

I should note that this was the easiest and fastest way to create a simple SSAS database from scratch. I skipped a lot of settings and possibilities that Analysis Services offers. As you can see in the picture below, there are options to set Calculations, Aggregations, Partitions, etc.

Picture 16 – Cube menu

You can download solution here: ProcessingIncrement.

In the next posts I would like to describe the settings and features of Analysis Services step by step, from more perspectives and in more scenarios. So, stay tuned!

 

 

SSIS package – Analysis services processing tasks

To process SSAS database objects you can use a variety of tools and approaches; one of them comes with Integration Services. If processing data into an Analysis Services database is part of your ETL process, an SSIS package is a good solution for handling it.

The Analysis Services Processing Task lets you simply select the objects you would like to process. To create such a solution for processing your SSAS database, you actually have to put two components into your SSIS package:

  1. Analysis Services Connection manager
  2. Analysis Services processing task

Download Data Tools for Visual Studio if you don’t have them. With them you can process the SSAS database and deploy the solution to the SSAS server.

Let’s create the package: New -> Integration Services Project. Add an Analysis Services Connection Manager to connect to our SSAS database.

Picture 01 – Analysis Services Connection Manager

Add two Analysis Services Processing Tasks from the SSIS Toolbox to process dimension and fact data.

Picture 02 – Analysis Services Processing Task

Double-click an Analysis Services Processing Task (or right-click it and choose Edit) to open the window where you set the objects you would like to work with. Go to the Processing Settings section, select the connection manager, click the Add button and select objects from the dialog shown in the picture below. You can choose to process a whole cube, just a partition of a cube, or a dimension.

Picture 03 – Add Analysis Services Object

When the objects are selected, you can choose the Process Options to use for each of them. For example, in the processing options of the fact table select Process Add for incremental processing. The other options will be explained and demonstrated in one of the next posts to complete the overview of SSAS processing options. You can find the other processing types for multidimensional SSAS here: https://bit.ly/2SPsBtg.

Picture 03 – Analysis Services Process Options

Finally, you get an SSIS package with a flow like the one in the picture below. You should process dimensions first to avoid unknown-member processing errors when processing OLAP facts. You can build on this simple package and extend it with processing of the DWH relational layer.

Picture 04 – SSIS package – processing Dimensions, Facts

You can download package here: IncrementalProcessing

Stay tuned.