Skip Navigation LinksHome > SSW Standards > SSW Rules > Rules To Better Business Intelligence

What others have to say about us
See what people think about this product I've been putting together Development Guidelines for my employer and in the process have reviewed many published standards (in the .Net arena) from around the world. In each category, the suggestions at SSW are always among the best. See what people think about this product
- Leon Bambrick,
 

Do you agree with them all? Are we missing some? Let us know what you think.

  1. Make a separate database for your Data Warehouse

    You should always base your cubes off a separate data warehouse database. For larger organizations this will also reside on separate machines. This will be used as the store for all the data collected as time rolls on. Why would you do this?

    The good things:

    • Only keep data you want
    • Allows you to purge old data from OLTP
    • Allows you to record Time Dependant info (Slowly Changing Dimensions)
    • Performance not hitting the OLTP

    The bad things:

    • Redundancy Disk space
    • Mirror Schema changes

    People often ask: Why make a new database and not just make Views/StoredProcs? The answer to this is that when using multiple databases, cleaning data gets harder and messier without the data warehouse in the middle (where a DTS package may exist for each datasource into the warehouse).

    OLAP separate
    Figure: A good naming convention for your Data Warehouse is to tack on DW to the end of the name

    An Exception:

    If you are not worried about performance issues and your data is clean, then you dont need to make another database
    Just add the business logic using views (eg. If you have a bit field, but you want to add meaning, then you add a view with a CASE statement).

    Note: In SQL Server 2005 you would add this business logic in a new layer Data Source Views within the UDM (Unified Data Model)

     

  2. Follow a naming convention for your Data Warehouse

    Use Dim<Name> for the dimension tables. (e.g. DimAccount)

        (Tip: think of Dimensions as your filters/WHERE/ GROUP BY)

    Use Fact<Name> for the fact tables. (e.g. FactFinance)

        (Tip: Fact tables contain the numbers/measures (or the transactions tables/COUNT()/SUM())

    OLAP Naming Conventions
    Figure: Give tables based on dimensions names starting with Dim, tables based on facts start with Fact.

     

  3. Do you have a DimTimeDay table in your Data Warehouse?

    Your data warehouse must have a table DimTimeDay with a record for every day. This way when you want to make reports based on time (which you inevitably will), you can view data for periods other then standard calendar periods (e.g. the fiscal quarter).

    How do you create this?

    Well, there are two ways that we recommend:

    1. Using the Dimension Wizard in SQL Server 2005 (see Creating a Time Dimension in 10 Easy Steps for more information - Preferred Method).
    2. Using BI Accelerator you are about to leave the SSW site.

    While we prefer using the Dimension Wizard in SQL Server 2005, you may find using BI Accelerator to be more customizable; however, you should be aware that BI Accelerator is an Excel sheet that creates an entire data warehouse (including DB, Cubes and DTS to populate it) and not just the Time Dimension table.

     
    Figure: Time Dimension table after completing the Dimension Wizard in SQL Server 2005

    OLAP View TimeData
    Figure: This is what your Time Dimension will look like after using BI Accelerator.

     

  4. Do you know which Performance you care about?

    There is Cube Building Performance and Querying Performance.

    Sometimes it is a trade-off like when you set lots of aggregates which making the users querying faster, but makes the Cube slower to build, and larger.

     

  5. Always make the Lowest Level of a Dimension the Key

    You should always have a unique key for every element of data. This means that the lowest level of a given dimension should be the key as this will be unique and based on some unique key already.

    For example:

    OLAP Lowest Dimension Key
    Figure: What is wrong with this? ...
     
    OLAP Unique Member Keys
    Figure: ... Well the bottom Level (called the Leaf Level) should always be the key.

    Note: Remember to make the Member Keys Unique = True and Member Names Unique = True

    The first reason is for performance as in the Cube Editor you can successfully run Tools Optimize Schema. Note: This improves Cube Building Performance (always) and Querying Performance (for some cases).

    The second reason is a smaller cube size, but you need to apply the next rule for that..

     
  6. Always use Virtual Dimensions instead of Real Dimensions where possible

    OLAP Virtual Dimensions
    Figure: What is wrong with this? ...

    Well we are using Real Dimensions for "Product Color" and "Product Size", when they should be Virtual Dimensions of the Product Dimension

    There are 3 steps to doing that:

    1) Add the Member Properties eg. Color and Size

    OLAP Virtual Dimensions For Product

    2) Delete the original Shared Dimension

    3) Add the new Virtual Dimension using the wizard

    OLAP Virtual Dimensions Create

    Note: This improves Cube Building Performance (always) and Querying Performance (for most cases as the cube is physically smaller).

     

  7. When you are using Real Dimensions, when applicable specify the "Depends On Dimension" property

    A common property which is often overlooked is "Depends on Dimension". This will indicate to the aggregate engine that these dimensions are related, therefore more efficient aggregates will be built.

    OLAP Depends On

    Figure: In this example the Depends on Dimension property should be set to "Product"

    Note: This improves Cube Building Performance (sometimes) and Querying Performance (for most cases because you have more efficient aggregates in the Cube).

     

  8. Always make sure the dimensions All Captions = All

    When you are carrying out reporting (using Reporting Services) based on some OLAP cube you should change the "All Caption" property of each dimension to be have the value "All". By default the caption for the dimension will be "All <dimension_name>" (e.g. All Products). This default is distracting when used in reporting solutions and doesn't particularly clarify anything anyway.

    OLAP Only Use All By Itself

    Figure 1: Set the "All Caption" to read "All" instead of the distracting default

    When you use this dimension in the parameter toolbar of Reporting Services this default value will look horrible and it should be changed to some consistent value in order to facilitate fast scanning of the parameter values.

    OLAP Makes Anything But All Stand Out

    Figure 2: Having set the All caption nicely, you can more easily see which products aren't using the defaults.

     

  9. Always use Shared Dimensions

    When you're building an Analysis Server Database you should make always use Shared Dimensions. This is because:

    1. all of your cubes can re-use them, making future cubes faster to develop
    2. they are only processed once

    OLAP Use Shared
    Figure: This is the only place you should be creating dimensions. Stay aware from Private dimensions hidden in the cube.

     

  10. Always use spaces in names

    Always use spaces for Dimension Names, Level Names and Measures this is a business tool and you want users to see friendly names.

     

  11. Always make time scales clear, if using "financial year" then make it obvious

    2004
       ...
       - Month 06
    2005
       - Month 07
       - Month 08
       - Month 09
       - Month 10
       
    Figure: What is wrong with this?

    3 things:

    1. If you are at the year level, the user doesnt know if it is financial or calendar
    2. When you are at the month level, the user doesnt know what year they are in
    3. The full month names should be used because there are no issues with sorting.

    When you create the dimension you specify the Order By to be Key. Therefore it would look like this:

    FY 2004
       
        - Jun 2004
    FY 2005
        - Jul 2004
        - Aug 2004
        - Sep 2004
        - Oct 2004
       
    Figure: This is the type of data we want. Make names clear and clarify that you are referring to financial year.

     

  12. Move your OLAP MetaData from Access to SQL Server

    Analysis Server will by default use an Access database to store the meta data for each of the databases. Presumably this is because you don't necessarily need to have SQL Server on the same machine as Analysis Server. You should change this repository to point to your SQL Server because it's a more secure, robust, and allows you to backup your data with your normal SQL Server backups.

    To migrate the repository of meta data:

    1. In Enterprise Manager, create a new database to use for the meta data. Call it "OLAPMetaData".
    2. In Analysis Manager, right click on the server, select "Migrate Repository" and point it to your table in the database.

    OLAP Migrate Repository

    Figure: Move the MetaData repository away from the default Access Database into SQL Server

    At the end you can also double check that it worked successfully by checking the Repository Connection String in the context menu shown above.

     

  13. When presenting make sure you use AdventureWorks database and OLAP cube

    For years we have been using the samples of Northwind (for SQL Server) and FoodMart (for OLAP). Well no more... SQL Reporting Services includes a new sample database of AdventureWorks2000.

    Download AdventureWorks2000DW zip file which contains the AdventureWorks2000DW database (a backup of a flattened/de-normalised version of AdventureWorks2000) and the AdventureWorks2000 analysis server database that includes the Reseller Sales cube. (7.41 MB - .zip)

    View Instructions on setting up the above resources.

    For your interest, you may also want to download AdventureWorks2000you are about to leave the SSW site (comes with SQL Reporting Services)
     

  14. When designing don't compute cube aggregations

    When prompted about whether to process the aggregations for the cube when you are designing, you should click No in order to save time. However, when deploying or testing query performance you should click Yes.

    Rules To Better Business Intelligence Aggregations

    Figure: Click No while designing, click Yes when deploying or testing performance.

  15. Do you hide zero values when you are doing financial report?

    Financial report could be very complicated and there are large number of figures showing, we only want to focus on the significant figures. Zeros could be ignored, so we will hide zeros in order to make other figures standard out.

    Figure: Bad Example - a lot of zeros are showing in the gird, we cannot see the significant figures.
    Figure: Good Example - zeros are hidden, easy to focus on the values

    To do this in Excel sheet, you can simply choose the cell and format it as "Accounting"

    Figure: Excel - choose "Accounting" as the cell format
  16. Do you remove decimal places from chart?

    Graphs or charts are used for showing big picture or trend of data. Decimal places in such reports are not useful at all, only a waste of space. So you should always remove decimal places from charts.

    Figure: Bad Example - decimal places on a chart is a waste of space
    Figure: Good Example - decimal places are removed from the chart
  17. Do you use conditional formatting to visualize your data?

    If you are using Excel 2007, there is a very exciting new feature that allows you to visualize your data in the place. It allows you to view your data in a visual way, make it very easy to tell the difference.

    Figure: Bad Example - no visualization, cannot tell the difference of data easily
    Figure: Good Example - visualized data, very easy to read and understand the difference

    You can simply choose "Conditional Formatting" from the tool bar in Excel 2007 to do this.

    Figure: Choose "Conditional formatting -> Data Bars" to enable data visualization
  18. Can you use the super cool Microsoft Data Mining Addin?

    Microsoft SQL Server 2005 Data Mining Add-ins for Microsoft Office 2007 (Data Mining Add-ins) allow you take advantage of SQL Server 2005 predictive analytics in Office Excel 2007 and Office Visio 2007.

    Figure: Simply click the highlight exceptions to see data that looks abnormal
    Figure: Click Forecast on this one

Acknowledgements

Adam Cogan
Grant Paisley
Bing Bao