Monday, October 17, 2011

SSIS:Automated Unit Testing

There are quite a few ways to perform unit testing for a SSIS package and also automate it.One such method is using the BizUnit Framework which is predominantly used for the Biz Unit testing.We can customize BizUnit to test SSIS Package as well.
We should write extension methods in BizUnit Framework to execute the SSIS package, query the database and compare the result with the expected result or even compare the output file with the expected output file.It might be a little extra effort for a SQL BI developer to do some c# coding but its an one time activity and definitely worth it.


Before going ahead let me create a simple package which would raise warnings in to system log table.

Friday, July 1, 2011

SQL:SQL Server Agent - Alerts

SQL Server Agent - Agent Alerts

What is an Alert?

A SQL server Agent Alert is an action in response to an event or a happening in the SQL Server System.
An Alert could be triggered for one of the following conditions

• When an error occurs in the SQL Server System.
• When there is a performance issue or threshold reached in the SQL Server System
• When there is a WMI event

In response to an alert you can either execute a SQL Server Job or email the DBA (Operator).
For automated fix to a problem we could run a Pre defined fix (SQL Server Job) and for human intervention to an issue we could email the information to the Operator.

Thursday, June 23, 2011

SSIS:Data Validation

VALIDATION IN SSIS

Why we need Validation ?
I have not thought of validating the input in SSIS package until recently I found myself in a position to do it. In my previous packages I found a way to redirect the error outputs while doing a data conversion. But what about situations where you need to validate the input file row by row before updating in to the destination. The advantage of validating input file is mainly to have custom error messages logged in to my logging table.

For example when the format of my expected input column “A” is not valid then I could log the error message “The Column A is not valid in the Row 120” .This could save a lot of my time to correct the input file or for further analysis.

Pros and Cons

Sunday, May 22, 2011

SQL:Data warehousing :An Overview

DATA-WAREHOUSING
What is Data-warehousing?
A Data warehouse is the collection of the historic data in a multidimensional space for a business or an application or an organization which aids the Business Analyst of an organization in Decision making. With the increase in competition among the companies in any form of business, a platform for data analysis has become very essential in key decision making. A Data warehouse holds the his­­­toric information separately from the system; it provides the key business information which are extracted and cleansed from the system database using several methods of ETL(Extract, Transact and Load).In the last couple of decades the data-warehousing has evolved significantly due to millions of dollars invested by several companies around the globe.
As defined by William H. Inmon “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process”
The data available in the data warehouse is classified according to the subjects compared to the transaction data available in the system. The data-warehouse normally tend to have n number of sources like relational database, excel or text files or sometimes even another data warehouse. The data from these multiple sources are integrated in the data-warehouse after performing several data cleansing, filtering, processing techniques in the ETL stage. The data is normally presented according to the time scale which is the most important dimension. The data warehouse is separated from the actual database where the daily transactions take place. The main motivation for doing so is to have better performance in both which serves different purpose. The data warehouse is initially loaded with the data on its creation and from then onwards it is periodically updated with the latest data from its source database. The data warehouse has no relation with the transactions and its purely for the data analysis. More often the Data warehouse may not contain the most up to date information.
The organization’s normal day to day information like user registration, bank transactions, and products sold, products bought etc are stored in the database which is referred as on-line transaction processing (OLTP) systems. This is the primary source for the data-warehouse which is also called as the on-line analytical processing (OLAP) systems. There are several distinguishing features between them. The OLTP is used by several thousand compared to OLAP which is used only the executives for decision making. The OLTP stores the current information where transaction takes place where as the OLAP stores the historic information. The OLTP is designed using ER modeling where as the OLAP follows either the snowflake or star schema. The size of the OLAP database is much bigger compared to the OLAP.
OLTP VS OLAP
OLTP
OLAP
Purpose
Data Transaction
Data Analysis
Type of Access
Read/Write
Read(mostly)
Design Type
Entity Relationship model
Star/Snowflake schema
No of Users
1000s
100s
User Type
Software Professional, Admin
Analysts and Executives
Data Type
Flat data
Multidimensional data
Functionality
Day to day transaction data
Historic data
Database size
Less than 1 GB
More than 100 GB
Importance to
Quick transaction of data
Response time for data retrieval
Data is
Non summarized
summarized
CUBE: A MULTIDIMENSIONAL DATA MODEL

Tuesday, May 10, 2011

SSRS:Enabling Tree View

Enabling Tree View and customizing it to pass parameters to data region on user selection
Requirement: To enable a tree view in the report to browse the different hierarchies. On selection of the value at any hierarchy the corresponding data shall be shown in the tablix.

The following sample was tried using the Adventure Works data in SQL Server 2008 Reporting Services.
My first challenge was to establish the tree structure .
I did that in a matter of seconds ,but I was told to have the data in one column only . LI did some browsing and I came up with the following solution.
Right click on the last hierarchy level
Then select Insert Rows -> Outside GroupAbove
You will get a new texbox above .Copy the value from the previous column (towards the left ) in to this texbox.

Creating a Meta Data Driven SSIS Solution with Biml

Biml Biml ( Business Intelligence Markup Language ) is a markup language that enables you to quickly represent a variety of database ...