-
Make a separate database for your Data
Warehouse
You should always base your cubes off a separate data warehouse database.
For larger organizations this will also reside on separate machines. This
will be used as the store for all the data collected as time rolls on. Why
would you do this?
The good things:
- Only keep data you want
- Allows you to purge old data from OLTP
- Allows you to record Time Dependant info (Slowly Changing Dimensions)
- Performance not hitting the OLTP
The bad things:
- Redundancy
Disk space
- Mirror Schema changes
People often ask: Why make a new database and not just make Views/StoredProcs?
The answer to this is that when using multiple databases, cleaning data
gets harder and messier without the data warehouse in the middle (where a
DTS package may exist for each datasource into the warehouse).
 |
| Figure: A good naming convention for your Data Warehouse is to tack on DW to the end of the name |
An Exception:
If you are not worried about performance issues and your data is clean, then you dont need to make another database
Just add the business logic using views (eg. If you have a bit field, but you want to add meaning, then you add a view with a CASE statement).
Note: In SQL Server 2005 you would add this business logic in a new layer Data Source Views within the UDM (Unified Data Model)
|
-
Follow a naming convention for your Data Warehouse
Use Dim<Name> for the dimension tables. (e.g. DimAccount)
(Tip: think of Dimensions as your filters/WHERE/ GROUP
BY)
Use Fact<Name> for the fact tables. (e.g. FactFinance)
(Tip: Fact tables contain the
numbers/measures (or the transactions tables/COUNT()/SUM())
 |
|
Figure:
Give tables based on dimensions names starting with Dim,
tables based on facts start with Fact. |
-
Do you have a DimTimeDay table in your Data Warehouse?
Your data warehouse must have a table DimTimeDay with a
record for every day. This way when you want to make reports based on time
(which you inevitably will), you can view data for periods other then standard
calendar periods (e.g. the fiscal quarter).
How do you create this?
Well, there are two ways that we recommend:
- Using the Dimension Wizard in SQL Server 2005
(see Creating a Time Dimension in 10 Easy Steps
for more information - Preferred Method).
- Using
BI Accelerator
.
While we prefer using the Dimension Wizard in SQL Server 2005, you may find using BI Accelerator to be more
customizable; however, you should be aware that BI Accelerator is an
Excel sheet that creates an entire data warehouse (including DB, Cubes and
DTS to populate it) and not just the Time Dimension table.
 |
|
Figure: Time Dimension table after completing the Dimension
Wizard in SQL Server 2005 |
 |
|
Figure: This is what your Time Dimension will look
like after using BI Accelerator. |
Do you know which Performance you care about?
There is Cube Building Performance and
Querying Performance.
Sometimes it
is a trade-off like when you set lots of aggregates which making
the users querying faster, but makes the Cube slower to build,
and larger.
-
Always make the Lowest Level of a Dimension the Key
You should always have a unique key for every element of data.
This means that the lowest level of a given dimension should be the
key as this will be unique and based on some unique key already.
For example:
 |
|
Figure: What is wrong with this? ... |
 |
|
Figure: ... Well the bottom Level (called the Leaf Level) should always be the key. |
Note: Remember to make the Member Keys Unique = True and
Member Names Unique = True
The first reason is for performance as in the Cube Editor you can successfully run Tools Optimize Schema.
Note: This improves Cube Building Performance (always) and Querying Performance (for some cases).
The second reason is a smaller cube size, but you need to apply the next rule for that..
-
Always use Virtual Dimensions instead of Real Dimensions where possible
 |
|
Figure: What is wrong with this? ... |
Well we are using Real Dimensions for "Product Color"
and "Product Size", when they should be Virtual Dimensions of the
Product Dimension
There are 3 steps to doing that:
1) Add the Member Properties eg. Color and
Size
2) Delete the original Shared Dimension
3) Add the new Virtual Dimension using the wizard
Note: This improves Cube Building Performance (always) and Querying Performance (for most cases as the cube is physically smaller).
-
When you are using Real Dimensions, when applicable specify the
"Depends On Dimension" property
A common property which is often overlooked is "Depends on
Dimension". This will indicate to the aggregate engine that
these dimensions are related, therefore more efficient aggregates
will be built.
 |
|
Figure: In this example the Depends on Dimension property should be set to
"Product" |
Note: This improves Cube Building Performance (sometimes) and Querying Performance (for most cases because you have more efficient aggregates in the Cube).
-
Always make sure the dimensions All Captions
= All
When you are carrying out reporting (using Reporting
Services) based on some OLAP cube you should change the "All
Caption" property of each dimension to be have the value "All". By default the caption for the
dimension will be "All <dimension_name>" (e.g. All Products).
This default is distracting when used in reporting solutions and
doesn't particularly clarify anything anyway.
 |
|
Figure 1: Set the "All Caption" to
read "All" instead of the distracting default |
When you use this dimension in the parameter toolbar of
Reporting Services this default value will look horrible and it
should be changed to some consistent value in order to
facilitate fast scanning of the parameter values.
 |
|
Figure 2: Having set the All
caption nicely, you can more easily see which products
aren't using the defaults. |
-
Always use Shared Dimensions
When you're building an Analysis Server Database you should make always use
Shared Dimensions. This is because:
- all of your cubes can re-use them, making future cubes
faster to develop
- they are only processed once
 |
|
Figure: This is the only place you should be
creating dimensions. Stay aware from Private
dimensions hidden in the cube.
|
-
Always use spaces in names
Always use spaces for Dimension Names, Level Names and Measures this is a business tool and you want users to see friendly names.
-
Always make time scales clear, if using "financial year" then make it
obvious
|
2004
...
- Month 06
2005
- Month 07
- Month 08
- Month 09
- Month 10
|
| Figure: What is wrong with this? |
3 things:
- If you are at the year level, the user doesnt know if it is financial or calendar
- When you are at the month level, the user doesnt know what year they are in
- The full month names should be used because there are no issues with sorting.
When you create the dimension you specify the Order By to be Key. Therefore it would look like this:
|
FY 2004
- Jun 2004
FY 2005
- Jul 2004
- Aug 2004
- Sep 2004
- Oct 2004
|
| Figure: This is the type of data we want. Make names
clear and clarify that you are referring to financial year. |
-
Move your OLAP MetaData from Access to SQL Server
Analysis Server will by default use an Access database to store the meta data
for each of the databases. Presumably this is because you don't necessarily need
to have SQL Server on the same machine as Analysis Server. You should change
this repository to point to your SQL Server because it's a more secure, robust,
and allows you to backup your data with your normal SQL Server backups.
To migrate the repository of meta data:
- In Enterprise Manager, create a new database to use for the meta data.
Call it "OLAPMetaData".
- In Analysis Manager, right click on the server, select "Migrate
Repository" and point it to your table in the database.
|
 |
| Figure: Move the MetaData repository away from the default Access
Database into SQL Server |
At the end you can also double check that it worked successfully
by checking the Repository Connection String in the context menu shown above.
-
When presenting make sure
you use AdventureWorks database and OLAP cube
For years we have been using the samples of
Northwind (for SQL Server) and FoodMart (for OLAP). Well no
more... SQL Reporting Services includes a new sample database of
AdventureWorks2000.
Download
AdventureWorks2000DW zip file
which contains the AdventureWorks2000DW database (a backup of a
flattened/de-normalised version of AdventureWorks2000) and the
AdventureWorks2000 analysis server database that includes the
Reseller Sales cube. (7.41 MB - .zip)
View
Instructions
on setting up the above resources.
For your interest, you
may also want to download
AdventureWorks2000
(comes with SQL Reporting Services)
-
When designing don't compute cube aggregations
When prompted about whether to process the aggregations for the cube
when you are designing, you should click No in order to save time.
However, when deploying or testing query performance you should click Yes.
Figure: Click No while designing, click Yes when deploying or testing performance.
-
Do you hide zero values when you are doing financial report?
Financial report could be very complicated and there are large number of figures showing,
we only want to focus on the significant figures. Zeros could be ignored, so we will hide zeros in order to make other figures standard out.

- Figure: Bad Example - a lot of zeros are showing in the gird, we cannot see the significant figures.

- Figure: Good Example - zeros are hidden, easy to focus on the values
To do this in Excel sheet, you can simply choose the cell and format it as "Accounting"

- Figure: Excel - choose "Accounting" as the cell format
-
Do you remove decimal places from chart?
Graphs or charts are used for showing big picture or trend of data. Decimal places in such
reports are not useful at all, only a waste of space. So you should always remove decimal places
from charts.

- Figure: Bad Example - decimal places on a chart is a waste of space

- Figure: Good Example - decimal places are removed from the chart
-
Do you use conditional formatting to visualize your data?
If you are using Excel 2007, there is a very exciting new feature that allows you to visualize your data in the place.
It allows you to view your data in a visual way, make it very easy to tell the difference.

- Figure: Bad Example - no visualization, cannot tell the difference of data easily

- Figure: Good Example - visualized data, very easy to read and understand the difference
You can simply choose "Conditional Formatting" from the tool bar in Excel 2007 to do this.

- Figure: Choose "Conditional formatting -> Data Bars" to enable data visualization
-
Can you use the super cool Microsoft Data Mining Addin?
Microsoft SQL Server 2005 Data Mining Add-ins for Microsoft Office 2007 (Data Mining Add-ins) allow you take advantage of SQL Server 2005 predictive analytics in Office Excel 2007 and Office Visio 2007.

- Figure: Simply click the highlight exceptions to see data that looks abnormal

- Figure: Click Forecast on this one