

Frequently when I'm reviewing a client's SQL Server database design, the in-house developers tell me that "it's a bit of a dog's breakfast." Ten different developers have been working on it for a few years, so a lot of the basic defaults have never been changed. Here are some of the typical things that all SQL Server DBAs and database developers should know. These rules go above and beyond the most basic textbook recommendations.

Now I don't see a purpose in rewriting applications that work. But - I do see the value in making an application standardised. We all KNOW that standards are a good thing, it's time we were all brave enough to be strict. Here are the standards that we follow...

Do you agree with them all? Are we missing some? Email us your tips, thoughts or arguments. Let us know what you think.

*** Update *** Instead of manually implementing these rules, use SSW SQL Auditor to automatically implement many of them for you. Download it and give it a go.


SSW's Rules to Better SQL Server Databases

1 - 99 General and DBA Rules
  1. Do you measure up-time?
  2. Do you check your SQL Server is up-to-date?
  3. Do you design for change?
  4. Do you script out all changes?
  5. Do you configure all your SQL Server Services to use a Domain Account rather than a local service account?
  6. Do you name your SQL Server Domain Account as 'SQLServerMachineName' so it is easily identified when performing network operations?
  7. Do your SQL Server Services run under non-Administrator accounts?
  8. Do you use Database Mail (not SQL Mail) in SQL 2005?
  9. Do you turn on all the default alerts?
  10. Have you created your own alerts?
  11. Backup - Do you setup a complete Maintenance Plan?
  12. Backup - Do you back up scripts?
  13. Backup - Do you take Restoration seriously?
  14. Backup - Do you have a Restoration Standard?
  15. Are you aware of all the log files?
  16. Do you secure your server by changing the 'defaults'?
  17. Do you turn on security auditing?
  18. Do you increase the Log Size of your Event Viewer?
  19. Do you remove unnecessary permissions on databases?
  20. Do you use Performance Alerts?
  21. Do you make sure you use a consistent Collation server-wide?
  22. Do you create new databases in the default data directory?
  23. Are you aware of compatibility issues between SQL Server 2000 and 2005?
  24. Do you have a general Contact Detail table?
  25. Do you know how to provide the best database schema document?
  26. Do you avoid collation errors?

100 - 199 Data Rules

  1. Don't allow NULLs in text fields
  2. Don't allow NULLs in number fields if it has the same meaning as zero
  3. Don't start data in character columns with empty line
  4. Don't start data in character columns with spaces
  5. Use Identities in SQL Server (but don't use Autonumbers in Access)
  6. Don't delete records - just flag them as deleted
  7. Make sure you have valid date data in your database
  8. DateTime fields must be converted to universal time
  9. Do you use 3rd party audit tools to audit data changes?
  10. Do you avoid invalid characters in object identifiers?
  11. Do you check invalid characters in character data?
  12. Do you use a URL instead of an image in your database?

200 - 299 Table and Column Rules

  1. Only use Unicode datatypes (nchar, nvarchar and ntext) in special circumstances
  2. Do you always use Varchar?
  3. Do you have standard Tables and Columns?
  4. Do you use Bit/Numeric data type correctly?
  5. What type of primary keys do you use - Natural or Surrogate?
  6. Maximum row size for a table
  7. Do you create primary key on your tables?
  8. Do you create clustered index on your tables?
  9. Use smalldatetime datatype, where possible, instead of datetime datatype
  10. Do not use indexes on RowGuid column
  11. Do you have a timestamp column?
  12. Use FillFactor of 90% for indexes and constraints
  13. Do not use table names longer than 24 characters
  14. Do you always have version tracking tables?
  15. Do you validate your denormalized fields correctly?
  16. Do you avoid using user-schema separation?
  17. Do you use triggers for denormalized fields?
  18. Do you create a consistent primary key column on your tables?
  19. Do you use separate lookup tables rather than one large lookup table for your lookup data?
  20. Do you avoid de-normalized fields with computed columns?
  21. Do you add zs prefix to table name?

300 - 399 Views Rules

  1. Don't have views as redundant objects

400 - 499 Stored Procedure Rules

  1. Do your stored procedures return a value indicating the status?
  2. Do you standardize on the return values of stored procedures for success and failures?
  3. If you need to return the value of variables, do you use OUTPUT parameters?
  4. Do you check the global variable @@ERROR after executing a data manipulation statement?
  5. Do you use SCOPE_IDENTITY() to get the most recent row identity?
  6. Do you SET NOCOUNT ON for production and SET NOCOUNT OFF for development/debugging purposes?
  7. Do you keep your Stored Procs simple?
  8. Do not start user stored procedures with system prefix "sp_" or "dt_"
  9. Use company standard description in your stored procedures
  10. Do you avoid using SELECT * when inserting data?
  11. Do you use transactions for complicated stored procedures?
  12. Do you know SQL stored procedure names should be prefixed with the owner?

500 - 599 Function Rules

600 - 699 Relationship Rules

  1. Turn on referential integrity in relationships
  2. Use ON UPDATE CASCADE clause when creating relationship
  3. Do not use ON DELETE CASCADE clause when creating relationship
  4. Use NOT FOR REPLICATION clause when creating relationship
  5. Do columns ending with 'ID' have FOREIGN KEY constraints?

700 - 799 General Rules

  1. Object name should not be a reserved word
  2. Object name should not contain spaces
  3. Do not use sp_rename to rename objects like stored procedures, views and triggers.
  4. Object name should follow your company Naming Conventions
  5. Object should be owned by dbo
  6. Naming convention for use on database server test and production

800 - 899 Middle Tier Rules

  1. Do you submit all dates to SQL Server in ISO format
  2. Do you implement business rules in the middle tier?

900 - 999 Performance Tuning Rules

  1. Be Sure To Clear SQL Server Cache When Performing Benchmark Tests

General DBA Rules

  1. Do you Measure Up-Time?

    If you measure up-time you can pro-actively inform your manager how successful you have been as a DBA. You can do this in 2 ways:

    Option 1: High Tech Solution - using Microsoft Operations Manager (MOM)

    MOM allows you to monitor and generate reports on the total uptime of your SQL Server and other service level exceptions. You need the following for these reports:
    1. Microsoft Operations Manager 2005, with your SQL Servers discovered on the network via a network scan
    2. The Microsoft SQL Server Management Pack for Microsoft Operations Manager 2005 (a free download)

    MOMs Operations Manager
    Figure: Show managers what a good job you're doing with Microsoft Operations Manager and associated reporting service reports

    Option 2: Low Tech Solution - using a recurring select as a heartbeat

    1. Run a query as a "ping" once every 5 minutes - something that takes about 2 seconds, e.g. run SELECT * FROM Orders five times
    2. Log each run with the time it took
    3. Graph the results to see uptime
    4. Graph the results to see performance
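
    Scheduled as a SQL Server Agent job, something like this gives you the raw data for both graphs (a minimal sketch - the Heartbeat log table and the Orders "ping" query are illustrative, not a prescribed schema):

        -- Run the "ping" query and log how long it took
        DECLARE @Start datetime, @Rows int
        SET @Start = GETDATE()
        SELECT @Rows = COUNT(*) FROM Orders
        INSERT INTO Heartbeat (PingTime, DurationMs)
        VALUES (@Start, DATEDIFF(ms, @Start, GETDATE()))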


  2. Do you check your SQL Server is up-to-date?

    Most patches are for security. SQL Slammer showed that it's no good waiting a month before you decide to install a service pack. I would say wait one week and then install the service pack (the maximum delay should be 2 weeks).

    Option 1: Manually check using @@version
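
    For example:

        SELECT @@VERSION
        -- or, for just the build and patch level (SQL 2000 and later):
        SELECT SERVERPROPERTY('ProductVersion'), SERVERPROPERTY('ProductLevel')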

    Option 2: Run SSW Diagnostics and get all green ticks (Recommended)

    SSW Diagnostics
    Figure: Use diagnostics to ensure your SQL is up-to-date


    Note: To check all servers on my network I use Net Ping

  3. Do you design for change?

    Many developers are frightened of making a change to an existing database because they just don't know what applications are using it. This is especially a problem when you are dealing with a database that you did not create. Here are some approaches to this issue:

    1. You could run around the office, find someone and hope they know (unbelievably, this seems to be the most common method!)
    2. Trawl through source control, all network locations and all the source code to check what connection strings are being used
    3. Add a zsApplication table and manually populate it with the applications that use the database (Recommended) - see the sketch after this list. It can be populated from a SQL Profiler trace run over a period of a week so all usage is captured.

       Figure: Add a zsApplication table to make the applications that use the database visible to all developers

    4. Keep a constantly running login audit with a SQL Server Profiler trace that saves to a table - and make sure all applications have an application name in their connection string. This method is the most comprehensive option, but it is not recommended because you get a constant performance hit from having SQL Profiler running.

       Security Log for Profiler
       Figure: SQL Profiler can help you design for change by auditing Login events, giving you a guide to which applications are connecting to your database.
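
    A minimal sketch of such a table (the zs prefix keeps utility tables together, as per the table naming rules below; all columns are illustrative):

        CREATE TABLE zsApplication (
            ApplicationId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            ApplicationName varchar(100) NOT NULL,  -- e.g. 'SSW TimePRO.NET'
            ConnectionSource varchar(200) NULL,     -- where its connection string lives
            Notes varchar(400) NULL
        )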



  4. Do you script out all changes?

    Every time a change is made to your product's SQL Server Database, script out the change. You can use Enterprise Manager, VS.NET or Query Analyzer but every time you make changes you must save the change as a .sql script file so any alterations are scripted. Everything at SSW is usually done three times, once on Development, once on Staging and once on Production. Change control is one of the most important processes to ensuring a stable database system.

    Keep the scripts in a separate directory to any other scripts or files. This way you can always go back to them and find out what alterations you have made to the database in version xxx to find errors. If you have all the scripts you are able to rebuild the database from scratch. At SSW we name this folder SQLChangeScripts so as to not confuse it with other script folders.


    Figure 1: A list of change SQL scripts, each file name is in the correct format.

    The script file format should be: <version>_<description>.sql

    The <version> should be a number padded with leading zeros (0) to form 3 or 4 digits (however many you need), e.g. 047_AddClientInvoiceTable.sql.

    What if you are using a code generator?

    Every time we use Next Generation, it creates its own generated stored procs in the Database Project of our solution. The folder they are kept in is called "Auto-Generated Stored Procedures".

    The scripts found within this folder are as follows:

    • 010_ViewsForStoredProcedures.sql
    • 020_StoredProcedures_Select.sql
    • 030_StoredProcedures_Insert.sql
    • 040_StoredProcedures_Update.sql
    • 050_StoredProcedures_Delete.sql

    After re-generation of code in the solution, these scripts are updated with the required stored procs for any new database objects found in the application. The problem, however, is that after every re-generation these files must be added to the large list of scripts in the "SQLChangeScripts" folder as shown above.

    To stop these scripts continually piling up every time you use Next Generation, we recommend copying the scripts over to the "SQLChangeScripts" folder without changing their names.
    By modifying only the first three numbers to give the correct script sequence, you will be able to find all the other Next Generation stored procs, as shown in Figure 2.


    Figure 2: Previous NextGen scripts can be removed, except the last NextGen script file - e.g. 008_StoredProcedures_Delete.sql should not be deleted, as it may be the last script in a previous version which SQL Deploy may need for reference.

    Since the previous NextGen scripts are superseded by the newly generated ones, deleting them will not affect the database objects in the application.
    Deleting these scripts will significantly shorten the list of scripts and save a very large amount of time when upgrading the database using SQL Deploy, especially when the generated scripts contain a lot of SQL commands.

    After the Upgrade, you should do a check on the database with the scripts just to make sure they Reconcile.

    We have a program called SSW SQL Deploy which allows you to run scripts automatically. SSW SQL Deploy NuGet package is also available for ASP.NET MVC applications.

     


  5. Do you configure all your SQL Server Services to use a Domain Account rather than a local service account?

    SQL Server 2000 and 2005 have several different services that support them.

    • SQL Server
    • SQL Server Agent
    • SQL Server Reporting Services
    • SQL Server Integration Services
    • SQL Server Fulltext search
    • SQL Server Analysis Services
    In the service properties window for these services, ensure that the Service Startup Account is set to "This Account" and not to "Built-in Account" (SQL 2005) or "Service Account" (SQL 2000). Otherwise you won't get functionality such as the ability to use replication, linked servers, connections to other machines, or SQL Server mail.

    For security, you should not have this domain account in the Administrators group.

    Run as Account in SQL 2005
    Figure : Run as Account should use a domain account rather than a built-in account (Good)

    Run as Account in SQL 2005 (Bad)
    Figure : This service is using a built-in local service account (Bad)



  6. Do you name your SQL Server Domain Account as 'SQLServerMachineName' so it is easily identified when performing network operations?

    When you create the domain accounts under which SQL Server services run, you should name each domain account "SQLServerMachineName", e.g. SSW2000\SQLServerDragon. If one of the SQL Server services updates a file on the network, you can then determine which server wrote to the file.



  7. Do your SQL Server Services run under non-Administrator accounts?

    You should always run all SQL Server services with the lowest possible privileges in case the account is compromised. In SQL 2000 this was a bit of a pain, as you had to manually create an account with the bare minimum required privileges. SQL Server 2005 setup makes the whole process of granting privileges a whole lot easier than in SQL 2000 - because it automatically creates groups with all the necessary permissions for you!

    SQL 2005 now creates groups with the bare minimum permissions for you

    Figure : SQL 2005 now creates groups for all the SQL Server services with the bare minimum permissions for you

    For good old SQL 2000, you must manually give the SQL Server Service accounts the following permissions:

    • Ability to log on as a service
    • Ability to access and change the MSSQL directory
    • Ability to access and change applicable .mdf, .ndf, and .ldf files
    • Ability to read and write to certain registry keys under:
      • HKEY_LOCAL_MACHINE\Software\Microsoft\MSSQLServer (or, for a named instance, HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server)
      • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSSQLServer (or, for a named instance, HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSSQL$InstanceName)
      • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib

    If you are running any SQL Server service under a user account that has administrator privileges, anyone who compromises that account could do anything an administrator could do - including playing around with the registry via procedures like xp_regdeletevalue. So if you use an Administrator account, you're in effect giving away the keys to the house. Is this something you want to do?
     

  8. Do you use Database Mail (not SQL Mail) in SQL 2005?

    SQL Server 2005 includes Database Mail, a replacement for SQL Mail. Database Mail solves many of the problems inherent in SQL Mail, including:

    • HTML messages are natively supported - so there's no need for 3rd party dlls anymore
    • There's no need for Outlook or MAPI profiles on the server - communication is directly with the SMTP server
    • Multiple profiles and accounts are supported to specify multiple SMTP servers or different email infrastructure situations
    • SQL Server queues messages even when the external mailing process fails
    • High security - users and roles have to be granted permission to send mail
    • Logging and auditing
    • Attachment size regulations and file extension requirements can now be implemented

     

    Use SQL Mail (bad)

    Figure : Using SQL Mail (Bad)

    EXEC master.dbo.xp_smtp_sendmail
        @FROM = N'your@email.com',
        @FROM_NAME = N'Sophie Belle',
        @TO = 'recipient@email.com',
        @subject = 'Vendor List',
        @message = 'The list of vendors is attached.',
        @type = N'text/html',
        @server = N'mail.company.com.au'

    Figure: Avoid using SQL Mail -  you need to have Outlook on the server and there is no built-in logging (Bad)

     

    Use Database Mail (good)

    Figure : Use Database Mail (Good)

    USE msdb
    EXEC dbo.sp_send_dbmail
        @profile_name = 'UTS',
        @recipients = 'your@email.com',
        @subject = 'Vendor List',
        @body = 'The list of vendors is attached.',
        @query = 'USE AdventureWorks; SELECT VendorID, Name FROM Purchasing.Vendor',
        @attach_query_result_as_file = 1

    Figure: Use database mail for scalability, built-in logging and HTML capability (Good)

     

    For a more in-depth comparison of SQL Mail vs Database Mail, see this intro to Database Mail in SQL 2005
     



  9. Do you turn on all the default alerts?

    Even if you don't have Microsoft Operations Manager, SQL Server alerts are valuable because they can warn administrators of imminent SQL Server failures, e.g. when the msdb log file is full. Enable them under SQL Server Agent.

    Alerts in SQL 2000

    Figure : Default Alerts in SQL 2000 were disabled after install. Enable them.

    SQL 2005, on the other hand, has no default alerts. You will have to create them, and we recommend adding alerts for all the fatal severity levels (19-25).

    Alerts in SQL 2005

    Figure : SQL 2005 alerts - We recommend that you add the fatal exceptions as alerts
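
    A sketch for one fatal severity level - repeat for severities 19 through 25 (the operator name 'DBA' is illustrative and must already exist):

        USE msdb
        EXEC dbo.sp_add_alert
            @name = N'Fatal Error - Severity 19',
            @severity = 19,
            @enabled = 1
        EXEC dbo.sp_add_notification
            @alert_name = N'Fatal Error - Severity 19',
            @operator_name = N'DBA',
            @notification_method = 1   -- 1 = email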



  10. Have you created your own alerts?

    In addition, don't forget to add your own alerts, such as one for the business exception "outstanding amount > $15,000". Many people are not aware of this functionality and instead write triggers or buy 3rd party products to do the same job.
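
    One way to wire this up (a sketch - the message number, message text and the place you evaluate the rule are all illustrative) is to raise a logged user-defined error and attach an alert to it:

        -- Message numbers from 50001 up are reserved for user-defined messages
        EXEC sp_addmessage @msgnum = 50001, @severity = 10,
            @msgtext = N'Outstanding amount is over $15,000', @with_log = 'TRUE'

        USE msdb
        EXEC dbo.sp_add_alert @name = N'Large outstanding amount',
            @message_id = 50001, @enabled = 1

        -- Then, wherever the business rule is evaluated:
        RAISERROR (50001, 10, 1) WITH LOG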
     



  11. Backup - Do you setup a complete Maintenance Plan?

    It is surprising how many IT staff create backup plans and then leave it at that. To have a complete maintenance plan, you should also consider the following:

    1. Checking database integrity
    2. Shrinking Databases
    3. Reorganizing Indexes
    4. Rebuilding Indexes
    5. Updating Statistics
    6. Cleaning up old maintenance histories
    7. Performing automatic backups
    8. Backing up System databases
    9. Last but not least - you should regularly check that the maintenance plans have been running successfully. Otherwise all your backup and maintenance efforts are pointless.
    Complete Maintenance Plans

      Figure : SQL 2005 - A Complete Weekly Maintenance Plan



  12. Backup - Do you back up scripts?

    Scripts are an important component in the operation of any database. This is why you should back up all your scripts and historical schema snapshots - so you can track the scripts that have been run and those that need to be deployed to test and production databases. We typically store these in source control such as VSS or Team Foundation Server as a Visual Studio Database project. You should regularly generate full scripts of all objects changed, keeping the following points in mind:

    • Don't encrypt your database objects if you can avoid it - otherwise they can't be scripted.
    • Use one of the following to generate your schema snapshot scripts:
      • the Enterprise Manager Generate Scripts wizard, or
      • the SQL-DMO object model to script out the objects, or
      • a third party utility called SQL Scribe (Recommended).


  13. Backup - Do you take Restoration seriously?

    Restoration is vital, we all agree. But rapid restoration with minimal downtime is just as important. Run practice restorations on a regular basis - you don't want to find out that your backup doesn't restore when a problem has already occurred.

    This restoration process is so important that you should have a termination clause in an employee's contract if the restore doesn't work when something goes wrong.  



  14. Backup - Do you have a Restoration Standard?

    If the restoration process is unclear or too complicated, it will cost you both time and money when you are already in a pressure situation. Make sure you have a step-by-step, comprehensive restore standard with screenshots for every step of the way, so there are no nasty little surprises when your system goes down.



  15. Are you aware of all the log files?

    SQL Server stores vital error and performance information in several different logs. You should be aware of all of them:
    1. SQL Server Error Logs
      • Configure how many you want to keep
      • You should Back up your SQL Server error logs with your other scripts
      • sp_cycle_errorlog - cycles to a new error log without restarting the service
    2. SQL Server Agent Error Log
      • Recycles after every service restart
    3. Job History Logs
      • Agent properties, Job System tab
      • Probably too low by default
    4. DBMaint history logs
    5. Windows Event Viewer - for issues logged outside SQL Server
       


  16. Do you secure your server by changing the 'defaults'?

    1. Disable defaults
      • Rename the Administrator account and disable it, then create a new "honeypot" Administrator account with no permissions.
      • Disable the Guest user on the SQL Server
      • Change the default port 1433
      • Delete the sample databases (AdventureWorks, Northwind and Pubs). These have a public role, which is a security risk, and allow massive SQL statements to be run
    2. Other security issues
      • Use a service account with a strong password
      • Don't run the SQL Server service as an administrator
      • Run in integrated security mode
      • Run on an NTFS file system - encrypt the data files


  17. Do you turn on security auditing?

    • Configure login security auditing
      • Not on by default
      • Configure on the security tab of Server Properties in Enterprise Manager
      • Enable for Failure
      • View using the Windows Event Viewer

    Enable Auditing

    Figure : Enable Auditing for SQL Server logins

    Note: You can also turn on a trace for SQL DDL statements
     

  18. Do you increase the Log Size of your Event Viewer?

    Change the defaults from 512KB and "Overwrite events older than 7 days" to 64000KB and Overwrite as needed. This will allow the users to view Security audits and errors much further into the past with a minimal increase in space - and it will never bloat your server.

    Change event log defaults

    Figure : Change from this ridiculously small log size (Bad)...

    Change event log defaults

    Figure : ... To a much more reasonable log size (Good)



  19. Do you remove unnecessary permissions on databases?

    What goes for permissions to Windows objects also goes for SQL Server objects. Remove all permissions but the bare minimum required to operate your application.



  20. Do you use Performance Alerts?

    Performance alerts work well for problems that need to be discovered before they occur.

    For example, one problem you may encounter is database file growth. Since databases are set to grow by a certain amount, you need to configure an alert to let you know when a database draws close to that threshold. You can configure a performance alert that fires when the file reaches 80% of the threshold. Here is an example of what you can do:

    To configure an alert to trigger an application, perform the following steps:

    1. Start the Microsoft Management Console (MMC) Performance snap-in (Start, Programs, Administrative Tools, Performance).
    2. Expand Performance Logs and Alerts, and select Alerts.
    3. Right-click in the right pane, and select New Alert Settings.
    4. Enter a name for the setting that reflects what the alert will monitor, and click OK.
    5. On the General tab, add the counter the alert will monitor and specify the values that will trigger the action.
    6. On the Action tab, select the Run this Program checkbox.
    7. Click the Browse button, and select the name of the application you want to run.
    8. Click OK.

    Set a new Performance Alert

    You have just configured an application to run in response to an alert. Unfortunately, because the program doesn't interact with the desktop, it runs in the background, visible only in Task Manager. To enable the program to run interactively, perform the following steps:
    1. Start the MMC Services snap-in (Start, Programs, Administrative Tools, Services).
    2. Right-click Performance Logs and Alerts, and select Properties.
    3. On the Log On tab, specify the "Local System account" and select the "Allow service to interact with desktop" checkbox.


  21. Do you make sure you use a consistent Collation server-wide?

    Collation is the combination of language and sort orders, and you typically don't notice it until you start running cross database queries.

    It would make development simpler if the whole world spoke one language but even if you are using English, you will still encounter collation issues if you are not careful. The most common issue is the dreaded 'Cannot resolve collation conflict for equal to operation' error when joining on columns that have different collation orders. Collation is a great feature for international companies, but if you are not consciously using it then you should have ALL the objects in ALL the databases on ALL the servers using a consistent collation.

    Flexibility with collation orders has increased a lot since SQL 7.0:

    • SQL 7: Back in SQL Server 7, you could only define the collation at the server level and, once it was set, you could not change it without rebuilding the master database.
    • SQL 2000: This added the ability to have Column level collation which allows you to set it at the database or column level.

    However, with this column-level flexibility come additional issues. It is ideal for those who only want, say, the 'FirstName' column sorted with an accent-insensitive order. However, one of the side effects, if you are not paying attention to collation, is that you end up with many different collations on many different databases.

    We feel the only time you need inconsistent collations is when you have a rogue 3rd party application, like Microsoft Great Plains, that enforces its own collation.

    See these Knowledge Base articles for more information about the issues you will encounter when you have inconsistent collations:

    • Q211874 - Why do I get the error 'Cannot resolve collation conflict for equal to operation'?

      The database collation differs from the SQL Server default collation because it was attached or created with a different collation order. This causes issues when you attempt to join tables in databases that have different collation orders. For example, if your tempdb database and Northwind each have a different collation you will get the following error 'Cannot resolve collation conflict for equal to operation' when you attempt to do a join between tables from these databases


    • Q711843 - How do I change the collation order in my SQL Server 2000 or 7.0 database?

      There is no 'recommended' collation as different collations will be used in different countries but as a guideline, installations in the United States and installations that require compatibility with SQL Server 7 databases should use the SQL_Latin1_General_Cp1_CI_AS collation. Non-United States installations in English speaking countries should use the Latin1_General_CI_AS collation.

    SQL 2005

    Figure : Setting the collation in SQL 2005 Setup - Choose Case Insensitive(CI), Accent Sensitive (AS)
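
    To track down inconsistencies, queries like these help (sys.databases is SQL 2005; on SQL 2000 use master.dbo.sysdatabases):

        -- The server default collation
        SELECT SERVERPROPERTY('Collation')
        -- Each database's collation (SQL 2005)
        SELECT name, collation_name FROM sys.databases
        -- The collation of every character column
        SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE COLLATION_NAME IS NOT NULL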



    Use our product SSW SQL Auditor to automatically implement this rule for you. In Tools > Options you can choose:
        Option [Current database (Recommended)] to check your selected database;
        Option [All databases] to check all databases on your selected server.
    Download it and give it a go.



  22. Do you create new databases in the default data directory?

    When trying to create a database in SQL Server 2005 from an existing create script written for SQL Server 2000, we came across a problem. Our create script was trying to determine the path to save the database file (the path to the default data store) by using the sysdevices table in the Master database; however, the schema for the Master database had changed in 2005 and our script could no longer find the column it relied on to determine this path.

    Rather than creating a new script specific to 2005, we found that by removing the optional FILENAME attribute altogether, both SQL Server 2000 and 2005 were happy, and the database files were saved into the default data directory - which is what we were after.

    The moral of the story is - keep it simple.

    When using a create script to create a new database, let SQL Server determine the filename and path from its default settings. This will help make the script simpler, more flexible, and ready to use with utilities such as MS OSQL and SSW SQL Deploy.

    DECLARE @device_directory NVARCHAR(520)

    SELECT @device_directory = SUBSTRING(phyname, 1,
        CHARINDEX(N'master.mdf', LOWER(phyname)) - 1)
    FROM master.dbo.sysdevices
    WHERE (name = N'master')

    EXECUTE (N'
    CREATE DATABASE [DatabaseName]
        ON PRIMARY
        (
            NAME = N''[DatabaseName]'',
            FILENAME = N''' + @device_directory + N'[DatabaseName].mdf''
        )
        LOG ON
        (
            NAME = N''[DatabaseName]_log'',
            FILENAME = N''' + @device_directory + N'[DatabaseName].ldf''
        )
        COLLATE SQL_Latin1_General_CP1_CI_AS
    ')
    GO
    Figure: FILENAME parameter used to specify the database path (Bad)

    CREATE DATABASE [DatabaseName]
    COLLATE SQL_Latin1_General_CP1_CI_AS
    GO

    Figure: Generic CREATE DATABASE used (Good)

    We have a program called Code Auditor that checks for this rule.



  23. Are you aware of compatibility issues between SQL Server 2000 and 2005?

    Scripts generated by SQL 2005 are not compatible with SQL 2000, so use SQL 2000 to generate your scripts if you want them to work on both versions.

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[ProcessTarget]') AND type in (N'P', N'PC'))
    drop procedure [dbo].[ProcessTarget]
    
    Figure: This script only works on SQL 2005, because 'sys.objects' is only available in that version.

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[ProcessTarget]') AND OBJECTPROPERTY(id, N'IsProcedure') = 1)
    drop procedure [dbo].[ProcessTarget]
    
    Figure: This script works on both SQL 2000 and SQL 2005.



  24. Do you have a general Contact Detail table?

    It is common to have a Contact Detail table to store contact information such as phone numbers. Below is an example of a Contact Detail table and its related tables. This design is bad because the PartyPhone table is specific to phone numbers, so you would have to add a new table to store emails or any other contact information needed in the future.

    Figure: A too-specific Contact Detail table

    We normally have a general Contact Detail table that holds all the different categories of phone numbers - whether shared or primary - plus emails, all in the same table.

    General Contact Detail Table
    Figure: A general Contact Detail table

    We use a Contact Detail Category table to store these categories.

    Details of Contact Detail Category Table
    Figure: Details of the Contact Detail Category table
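
    A minimal sketch of the general approach (table and column names are illustrative, not the exact SSW schema):

        CREATE TABLE ContactDetailCategory (
            CategoryId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            CategoryName varchar(100) NOT NULL  -- e.g. 'Mobile Phone', 'Email', 'Fax'
        )

        CREATE TABLE ContactDetail (
            ContactDetailId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            PartyId int NOT NULL,               -- FK to the owning contact/party
            CategoryId int NOT NULL REFERENCES ContactDetailCategory (CategoryId),
            ContactValue varchar(200) NOT NULL, -- the phone number or email address itself
            IsPrimary bit NOT NULL DEFAULT 0
        )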


  25. Do you know how to provide the best database schema document?

    You should not provide a database schema via screen captures - they carry little of the detail. A well formatted Word document may provide more detail, but it is hard to keep up-to-date. The best way is to automatically generate your document with a tool.

    We recommend and use Red Gate SQL Doc to produce chm help files or html pages of the database schema. SQL Doc can also run from the command line, so you can include document generation in your build process.

    We have also used these other tools in the past: SQL Scribe and Apex SQL Doc.


  26. Do you avoid collation errors?

    You don't want this error:

    "120_ClientInvoice_ClientIDRequired.sql...Column 'dbo.Client.ClientID' is not of same collation as referencing column 'ClientInvoice.ClientID' in foreig..."

    When you write a stored proc, it must work regardless of the user's collation. When you join to a temp table, you are joining two different databases (e.g. Northwind and tempdb), and they won't always have the same collation.

    The reality is that you can't tell users what collation to run their tempdb under - you can only specify the collation Northwind should use (and you don't even want to specify that - you want it to be their server default).

    Here is what you need to do:

        SELECT
            #ClientSummary.ClientID,
            DateOfLastReminder = MAX(ClientDiary.DateCreated),
            DaysSinceLastReminder = DATEDIFF(day,MAX(ClientDiary.DateCreated),getdate())
        INTO #RecentReminderList
        FROM
            ClientDiary INNER JOIN #ClientSummary
            ON ClientDiary.ClientID = #ClientSummary.ClientID COLLATE
                database_default
        WHERE
            ClientDiary.CategoryID LIKE 'DEBT-%'
        GROUP BY
            #ClientSummary.ClientID
    

    SQL Auditor will check this rule



Data Rules

  1. Don't allow Nulls in text fields

    NULLs complicate your life. To avoid having to constantly differentiate between empty strings and NULLs, you should avoid storing NULLs if you can.

    Why? Well what is wrong with this?

    SELECT ContactName
    FROM Customer
    WHERE ContactName <> ''
    Figure: Selecting on empty string

    Nothing, if your data is perfect. But if you allow Nulls in your database, then statements like this will give you unexpected results. To get it working you would have to change the last line to:

    WHERE ContactName <> '' OR ContactName Is Null 
    Figure: Allowing null strings makes queries more complex

    What about only allowing empty strings? Well we choose to block Nulls because it is a lot easier to check off a check box in Enterprise Manager than it is to put a constraint on every field that disallows empty string ('').

    SQL Server Nulls and Empty Strings
    Figure: Don't allow Nulls

    Not allowing Nulls will give you the following benefits:
    * You don't have to enforce every text field with a CHECK constraint such as ([ContactName]<>'').
    * Your queries are simpler - no extra checking for NULLs and empty strings in your stored procedures' WHERE clauses.
    * SQL Server performs better when Nulls are not being used.
    * You don't have to explicitly check DBNull.Value in the middle tier - you can always use contactRow.ContactName == String.Empty. Database Nulls in the .NET Framework are represented as DBNull.Value, which cannot be implicitly cast to any other type, so if you allow NULLs in the ContactName field the above comparison will raise an exception.
    * You avoid another nasty issue: a lot of controls in the .NET Framework have real problems binding to DBNull.Value, so you don't have to write custom controls to handle this small thing.

    However, you should always be aware that Nulls and empty strings are totally different, so if you absolutely have to have them, use them consistently. In the ANSI SQL-92 standard, an empty string ('') is never equal to Null, because an empty string can be significant in certain applications.
    For example, say you have Address1 and Address2 in your database: a Null value in Address2 means you don't know what Address2 is, but an empty string means you know there is no data for Address2. You then have to use a checkbox in the UI to explicitly distinguish the Null value from an empty string:
      
        
    Figure: A check box is required if you want to allow user to use Null value on the UI

    Some people are not going to like this rule, but this is how it works in Oracle and Access:
    * In Oracle, empty strings are turned into Nulls (which is basically what this rule is doing). Empty strings per se are not supported in Oracle (this is not ANSI compliant).
    * And talking of legacy systems :-) be aware that using Access as a data editor is a no-no. Access turns empty strings into Nulls.

    Finally, always listen to the client. Nulls have meaning over an empty string - there are exceptions where you might use them - but they are rare.

    So follow this rule, block Nulls where possible, update your NULLs with proper information as soon as possible, and keep data consistent and queries simple.

    We have a program called SSW SQL Auditor to check for this rule:

    • Alternative #1 (recommended) - use a NOT NULL constraint and allow empty strings instead of allowing Null values

    • Alternative #2 (not recommended) - allow Nulls and disallow empty strings with a CHECK constraint

    Figure: What the Customers table will look like after applying 'Alternative #1 (recommended)'

    Figure: What the Customers table will look like after applying 'Alternative #2 (not recommended)'

    Alternative #1 is always preferred, but if you do have some exceptions in your application, please put a comment in the field: "Ignored by SQL Auditor: [your reasons]".
    E.g. "Ignored by SQL Auditor: Null means unknown addresses. I have provided a check box in the UI so the user can signify that."


     



  2. Don't allow NULLs in number fields if it has the same meaning as zero

    NULLs create difficulty in the middle tier because you need to add extra handling for them. So avoid them where you can, e.g. for a Discount field, make the default 0 and don't allow NULLs.

    This rule should not be applied when a NULL value is valid data. Often, data such as the percent earnings rate on a super fund is nullable because it may not be supplied or relevant. This is very different from it being zero, and you would otherwise have no way to tell real zero values from not-supplied data. The cost of extra handling in code is offset in this case by the validity of query results.

    As a general rule of thumb, don't allow NULL unless you can distinguish it from another value.

    Q: What is the difference between NULL and 0 in a Discount field?
    A: No difference, so don't allow Nulls.

    Q: What is the difference between NULL and 0 in Tumor size?
    A: Null means unknown and 0 means no tumor, so allow Nulls.

    Note: Nulls are evil, but don't go crazy removing them. Never invent your own constant, e.g. -999, to represent a Null.



  3. Don't start data in character columns with empty line

    Character columns (char, varchar, text, nchar, nvarchar, ntext) can store data with a <Ctrl>+<Enter> in the first line and the rest of the data in the second line.

    Note: If the front-end is Microsoft Access, then the data in the second line is not shown.

    SQL Auditor will check this rule and generate a script to remove the empty first line from character type columns



  4. Don't start data in character columns with spaces

    Text in character columns (char, varchar, text, nchar, nvarchar, ntext) can start with spaces, which is usually a data entry error.
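
    A sketch of the kind of cleanup script involved (the table and column names are illustrative):

        -- Find rows whose data starts with a space
        SELECT CustomerID, ContactName
        FROM Customer
        WHERE ContactName LIKE ' %'

        -- Strip the leading spaces
        UPDATE Customer
        SET ContactName = LTRIM(ContactName)
        WHERE ContactName LIKE ' %'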

    SQL Auditor will check this rule and generate a script to remove leading spaces from character data



  5. Use Identities in SQL Server (but don't use Autonumbers in Access)

    This one is going to be controversial. The bottom line is that every now and then you want to do something, and then you curse and wish your database didn't have identities. So why use them? Let's look at the problems first:

    Cons:

    • You can't manually change a Primary Key and let Cascade Update do its work, e.g. an InvoiceID
    • Hassles when importing data into related tables where you want to control the Primary Key, e.g. Orders and Order Details
    • With replication you will get conflicts

    In Microsoft Access you have autonumbers and there is no way around them so never use them.
    But in SQL Server you have identities and we have these procs:

    • DBCC CHECKIDENT - Checks the current identity value for the specified table and, if needed, corrects the identity value
    • SET IDENTITY_INSERT { table } { ON | OFF } - Allows explicit values to be inserted into the identity column of a table

    Pros:

    • Less programming - letting the database take care of it
    • Replication - identities are supported by SQL Server with ranges, so when you want replication, no coding
    • Avoiding concurrency errors on high INSERT systems so no coding

    So the only Con left is the importing of data but we can use one of the above procs to get around it. See grey box.

    The best way to import messy data into SQL Server (with identities)
    E.g. inserting data into the Orders and Order Details tables:
    1. Use an .adp to copy the first record to Excel
    2. Get the data into the same column order
    3. SET IDENTITY_INSERT Orders ON --this will allow manual identity INSERTs
    4. Copy and Paste Append the Orders
    5. SET IDENTITY_INSERT Orders OFF --as it can only be on for one table at a time
    6. SET IDENTITY_INSERT [Order Details] ON --this will allow manual identity INSERTs
    7. Copy and Paste Append the [Order Details]
    8. SET IDENTITY_INSERT [Order Details] OFF
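
    In pure T-SQL the same pattern looks like this (a sketch against Northwind's Orders table; the column list is abbreviated and the values are just examples):

        SET IDENTITY_INSERT Orders ON
        -- An explicit column list is required when inserting into an identity column
        INSERT INTO Orders (OrderID, CustomerID, OrderDate)
        VALUES (10248, 'VINET', '1996-07-04')
        SET IDENTITY_INSERT Orders OFF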

    Automatic Identity Range Handling

    The simplest way of handling identity ranges across replicas is to allow SQL Server 2000 to manage identity range handling for you. To use automatic identity range handling, you must first enable the feature at the time the publication is created, assign a set of initial Publisher and Subscriber identity range values, and then assign a threshold value that determines when a new identity range is created.
    For example, assign an identity range from 1000 through 2000 to the Publisher, a range from 2001 through 3000 to the first Subscriber, a range from 3001 through 4000 to the next Subscriber, and so on.

    SQL Auditor will check this rule



  6. Don't delete records - just flag them as Inactive

    When users are deleting a lot of records as part of normal operations - they can and do make mistakes. Instead of the painful process of having to go to a backup to get these records, why not simply flag the records as Inactive?

    Advantages
    • You do not have to delete all related records, e.g. Customers, Orders, Order Details. Instead, you can just flag the parent record as deleted with an "IsDeleted" bit field.
    • You do not lose historical data, e.g. how many products one of your previous clients purchased
    • You can actually see who deleted the record, as your standard audit columns (e.g. DateUpdated, UserUpdated) are still there. The record does not just vanish
    • It is simple to implement - particularly when using a code generator. For example, our code generator produces views and stored procedures, plus all data access layer code. With all data access done through the data layer views, we simply had to add a filter to all views ("WHERE IsDeleted = 0"). Our autogenerated delete stored procedures simply set the "IsDeleted" column to true.
    Disadvantages
    • Depending on your interface design, you may have to join to parent tables to ensure that deleted child records do not appear. Typically, the interface would be designed in such a way that you would not be able to create new records based on the deleted items (e.g. you cannot create a new order for a customer that is deleted). Performance of queries can potentially suffer if you have to do these joins.
    • While storage space is very cheap, you are not removing records from your database. You may need to archive records if the number of deleted records becomes large.
    Also see Using Audit Tools for alternatives to this approach using 3rd party auditing tools.
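
    A minimal sketch of the pattern (all names are illustrative):

        -- The data layer view hides flagged rows
        CREATE VIEW vCustomer AS
            SELECT CustomerID, CompanyName
            FROM Customer
            WHERE IsDeleted = 0
        GO

        -- The "delete" stored procedure only flags the row
        CREATE PROCEDURE CustomerDelete @CustomerID int AS
            UPDATE Customer
            SET IsDeleted = 1
            WHERE CustomerID = @CustomerID
        GO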


  7. Date - Make sure you have valid date data in your database

    SQL Server dates can range from the year 1900 up to the year 9999. However, certain date data in your database just wouldn't make any sense in the context of your business. For example, if your company started trading in 2005, you should not have any dates in your database before 2005 (unless you are tracking, say, start dates of your clients - but that is an exception). An invoice date of 2003 wouldn't make sense at all. You should run validation queries to ensure no rubbish date data gets into your database.
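
    A sketch of such a validation query (the table, column and cut-off date are illustrative):

        -- Invoices dated before trading began, or in the future, are suspect
        SELECT InvoiceID, InvoiceDate
        FROM Invoice
        WHERE InvoiceDate < '20050101'
           OR InvoiceDate > GETDATE()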

    SQL Auditor will check this rule and generate a script to see data that violates user-defined date range



  8. Date - DateTime fields must be converted to universal time

    Any DateTime fields must be converted to universal time from the application to the stored procedures when storing data into the database.

    When retrieving data from the database it must be converted back to the local time of the user.
    That way you get an accurate representation of e.g. the time someone entered data into the database (i.e. the DateUpdated field).

    The exception to this rule, however, is for already existing databases that deal with DateTime as part of their queries.
    e.g. SSW Time PRO.NET is an application that allows employees to enter their timesheet. The table used for storing this information has an important field that has a DateTime data type.
    This cannot be converted to UTC in the database because that would mean:
     

    1. Converting every single entry since entries began being stored (in SSW's case since 1996) to keep information consistent;
    2. Other separate applications currently using the timesheet information in the database for reporting will also have to be entirely modified.

    Currently there will be an issue if, for example, someone from the US (Pacific time) has a 19 hour difference between her local time and our servers.

    Example: Sally in the US enters a timesheet for 21/04/05 (which will default to a time of 12:00:00 AM, since the time was not specified).
    Our servers will store it as 21/04/05 19:00:00 - in other words 21/04/05 07:00:00 PM - because the .NET Framework automatically converts the time for our Web Service.
    Therefore our servers have to take the date component of the DateTime and add the time component as 12:00:00 AM, so it is stored in our local time format.

    [WebMethod]
    public double GetDateDifference(DateTime dateRemote)
    {
    	DateTime dateLocal = dateRemote.Date;
    	return (dateRemote.TimeOfDay.TotalHours - 
    	dateLocal.TimeOfDay.TotalHours);
    }

    Figure: When dateRemote is passed in from the remote machine, .Net Framework will have already converted it to the UTC equivalent for the local server (i.e. the necessary hours would have been added to cater for the local server time).

    In the above code snippet, the .Date property would cut off the Time portion of the DateTime variable and set the Time portion to "12:00:00 AM" as default.

    This is for applications we currently have that:

    1. Consider the DateTime component integral for the implementation of the application
    2. That will be used world wide.

     


     
  9. Do you use 3rd party audit tools to audit data changes?

    In many cases, there are legal requirements to audit all updates to financial records. In other cases, you will want to be able to track and undo deletes to your database. Some solutions we have seen in the past are:

    1. Manually adding triggers on all database tables to log every change
    2. Having the business objects or stored procedures write to two tables - the main table, such as Customer, and an audit table, such as CustomerAudit
    3. Using a logging utility to audit database changes
    For ease of reporting and the ability to undo, we recommend that you use a logging utility such as Lumigent Log Explorer. This means that you can devote your development time to areas other than auditing. Also, unlike other utilities which use triggers (such as ApexSQL Audit), there is no performance overhead because it relies upon log files already created by SQL Server. If required, you can export the log information to SQL Server, so you can perform advanced queries on it. It even allows you to recover previously deleted tables.


  10. Do you avoid invalid characters in object identifiers?

    We believe it is bad practice to use invalid characters (mostly symbol characters, like ",;"\/(", etc.) in object identifiers. Though legal, they are easily confused and will probably cause errors when you run scripts against these objects.

    We have a program called SQL Auditor that checks for this rule.



  11. Do you check invalid characters in character data?

    Always avoid invalid characters in your data (mostly symbol characters, like ",;"\/\n\r", etc.). They usually get into your database by mistake - often from copying and pasting out of Word.
    They can be costly; here is an example of an error you can get.


    What could this be? Well in this case the html source of http://www.ssw.com.au/ssw/Download/Download.aspx?GroupCategoryID=5BUS had this:

    <script language="Javascript">
    

    document.write('SSW Smart Tags for Word ');

    ... </script>

    There is an unwanted [return char] at the end of [SSW Smart Tags for Word]. You then trace it to a database record and find that the CategoryName field is 'SSW Smart Tags for Word?' in that record. After the errant characters were removed, the bug was fixed - all very expensive and all very avoidable.
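
    A sketch of a query to hunt down such characters (the table and column names are illustrative):

        -- Find rows containing carriage returns, line feeds or tabs
        SELECT CategoryID, CategoryName
        FROM Category
        WHERE CategoryName LIKE '%' + CHAR(13) + '%'
           OR CategoryName LIKE '%' + CHAR(10) + '%'
           OR CategoryName LIKE '%' + CHAR(9) + '%'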

    We have a program called SQL Auditor that checks for this rule.



  12. Do you use a URL instead of an image in your database?

    We recommend that you store a URL instead of an image in your database. This will:

    1. stop the size of your database growing too quickly (which brings a series of problems - performance, log and disk space, etc.);
    2. make the image easy to validate and change.
    SQL Auditor will check this rule and generate a report of image columns



Table and Column Rules

  1. Only use Unicode datatypes (nchar, nvarchar and ntext) in special circumstances

    Columns defined using the nchar, nvarchar and ntext datatypes can store any character defined by the Unicode Standard, which includes all of the characters defined in the various English and non-English character sets. These datatypes take twice as much storage space per character as non-Unicode datatypes.

    It is not the disk space cost that is the real concern - it is the 8060 byte row-size limit; please refer to Maximum Capacity Specifications for SQL Server for details.

    If your database stores only English characters, this is a waste of space. Don't use Unicode double-byte datatypes such as nchar, nvarchar and ntext unless you are doing multilingual applications.
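
    Changing an existing column is straightforward (a sketch - the table, column and size are illustrative; check first that no data would be truncated):

        ALTER TABLE Customer
        ALTER COLUMN ContactName varchar(100) NOT NULL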

    We have a program called SSW SQL Auditor that can produce a report of the Unicode datatype columns in your database. It can also generate a SQL script to change those columns from nchar to char and from nvarchar to varchar. To change the datatype of ntext columns, we recommend using Enterprise Manager.


    Figure: SQL Auditor will report all unicode columns like nvarchar and ntext. It will give you a SQL script for the nvarchar and advise you to use Enterprise Manager to change ntext columns to text.



  2. Do you always use Varchar?

    Use VARCHAR instead of CHAR, unless your data is almost always of a fixed length or is very short - for example, a Social Security/Tax File Number which is always 9 characters. These situations are rare. SQL Server fits a whole row on a single page and will never split a row across two pages. The general rule is that the shorter the row length, the more rows you fit on a page and the smaller the table will be; running DBCC SHOWCONTIG against your tables shows how many pages each table occupies. A smaller table saves disk space, and any retrieval operation, such as SELECT COUNT(*), runs much quicker against it.
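
    For example, against Northwind:

        -- Reports pages used, average page density and fragmentation for the table
        DBCC SHOWCONTIG ('Customers')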



  3. Do you have standard Tables and Columns?

    1. All tables should have the following fields (see the CREATE TABLE sketch after this list):

       Field          | SQL Server Field Properties
       ---------------|---------------------------------------------------
       CreatedUtc     | datetime2, Allow Nulls=False, Default=GETUTCDATE()
       CreatedUserId  | Foreign Key to Users table, Allow Nulls=False
       ModifiedUtc    | datetime2, Allow Nulls=False, Default=GETUTCDATE()
       ModifiedUserId | Foreign Key to Users table, Allow Nulls=False
       Concurrency    | rowversion, Allow Nulls=False

       The first three rows in the figure below are examples of bad table records. The last one shows how this table structure should be entered.

       Good Bad Practices Example SQL Fields
       Figure: 3 bad examples and 1 good example of row auditing

       Note #1: Never set the CreatedUtc field in code - instead use a default of GETUTCDATE()

       Note #2: These fields offer basic row auditing that will cover the majority of applications. When an application has specific auditing requirements, analyse whether this approach is sufficient.

    2. All databases should have a table with one record to store application Defaults. This table should be called 'Control'.

       If the settings are not application-wide but per-user, then an XML file (do not use an INI file) might be better for simple things. Examples are saving the 'User' for logon, the 'Select Date Range' for a report, form positions, etc.

       .NET programs have Application.Configuration, which persists to an XML file (app.config) automatically. It works very well and deployment is very simple. It's integrated right into the Visual Studio .NET designer as well.


    3. All databases should have a version table to record structural changes to tables. See SSW Rules to Better Code
       
    4. Lookup tables that have just two columns should be consistent and follow this convention: CategoryId (int) and CategoryName (varchar(100)).

      The benefit is that a generic lookup form can be used. You just need to pass the generic lookup form the TableName, Column1 and Column2.

      Note #1: The problem with this naming is that the primary keys don't match

      Note #2: The benefit with the character primary key columns is that queries and query strings have meaning eg. http://www.ssw.com.au/ssw/Download/Download.aspx?GroupCategoryID=5BUS from this URL I can guess that it is in the business category.

    SQL Auditor will check this rule and generate a script to add standard columns specified in Tools > Options


  4. Do you use Bit/Numeric data type correctly?

    1. Bit data type

      Bit data can be 0 or 1 (2 values only). Storage size is 1 byte.
      Columns of type bit cannot have indexes on them. Also, SQL Server 7 only allows True or False values in a bit column. SQL 2000 introduced the ability to store NULL as well. Applications built for SQL Server 7 often do not expect this behaviour and may hit subtle runtime errors. [more information on bit data type]

      Columns of type bit should be prefixed with "Is" or "Should", ie. IsInvoiceSent (y/n) or ShouldInvoiceBeSent (y/n), so you can easily tell which way the boolean is directed. [more information on naming conventions]

      This being said, fields of this type should generally be avoided, because often such a field can hold a date instead, ie. DateInvoiceSent (Date/Time) is preferred over InvoiceSent (y/n). If a date is inappropriate then we still recommend an int field over a bit field anyway, because bits are a pain in the butt :-)

    2. Tinyint data type

      Integer data from 0 through 255. Storage size is 1 byte.

    3. Smallint data type

      Integer data from -2^15 (-32,768) through 2^15-1 (32,767). Storage size is 2 bytes.

    4. Int data type

      Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31-1 (2,147,483,647). Storage size is 4 bytes. The SQL-92 synonym for int is integer.

    5. Bigint data type

      Integer (whole number) data from -2^63 (-9223372036854775808) through 2^63-1 (9223372036854775807). Storage size is 8 bytes.

    Recommended:
    • Use smallint datatype instead of bit datatype - so it can be indexed;
    • Use int datatype, where possible, instead of bigint datatype - for saving disk space;
    • Use smallint datatype, where possible, instead of int datatype - for saving disk space;
    • Use tinyint datatype, where possible, instead of smallint datatype - for saving disk space;
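
    As a sketch of these recommendations (the table and columns are examples only):

    CREATE TABLE OrderItem
    (
        OrderItemID int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- int: may grow past 32,767 rows
        Quantity smallint NOT NULL,  -- smallint: quantities stay well under 32,767
        StatusID tinyint NOT NULL,   -- tinyint: a handful of statuses fits in 0-255
        IsShipped smallint NOT NULL  -- smallint rather than bit, so it can be indexed
    )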
    SQL Auditor will check this rule and generate a script to replace with the proper data type;


  5. What sort of primary keys do you use - Natural or Surrogate?

    Now this is a controversial one. Which one do you use?

    1. A "Natural" (or "Intelligent") key is actual data
      • Surname, FirstName, DateOfBirth
    2. An "Acquired Surrogate" (or "Artifical" or "System Generated") key is NOT derived from data eg. Autonumber
      • eg. ClientID 1234
      • eg. ClientID JSKDYF
      • eg. ReceiptID 1234
    3. A "Derived Surrogate" (or "User Provided") key is indirectly derived from data eg. Autonumber
      • eg. ClientID SSW (for SSW)
      • eg. EmpID AJC (for Adam Jon Cogan)
      • eg. ProdID CA (for Code Auditor)
    4. A "GUID" key automatically generated by SQL Server

    The problems with Natural Keys:

    • Because they have a business meaning, if that meaning changes (eg. a person changes their surname), then that value needs to be changed. Changing a value that has data attached is a little hard - but a lot easier with Cascade Update.
    • The main problem is that the key is large and combined and this needs to be used in all joins

    The Problem with Acquired Surrogate Keys:

    • A surrogate key has no meaning to a user
    • It always requires a join when browsing a child table, eg. the InvoiceDetail table

    The Problem with Derived Surrogate

    • The user needs to enter a unique value
    • Because they have a business meaning, if that meaning changes (eg. the company changes its name), then that value may need to be changed. Changing a value that has data attached is a little hard - but a lot easier with Cascade Update
    • More likely to have a problem with Merge Replication

    The Problem with GUID key

    We like GUID keys. However, GUID generation produces essentially random values, which makes them poor primary keys: new rows are inserted at random positions in the table, leading to extremely slow inserts once the table grows to even a moderate size. Inserting into the middle of a table with a clustered index, rather than appending to the end, can force the database to move large portions of the data to make room for the new row. This can be very slow.

    Recommendations

    1. We don't ever use Natural keys
    2. We use Acquired Surrogate for some tables
      • eg. Invoice table
      • eg. Receipt table
    3. A combination of Acquired Surrogate and Derived Surrogate for other tables
      • eg. Customer table
      • eg. Employee table
      • eg. Product table

      We say "combination" because if the user doesn't enter a value, we put a random value in (via a middle-tier function, so it works with Access or SQL Server), eg. ClientID JSKDYF.

      The user can then change the value to anything else; we validate that it is not already used, and then perform a cascade update - or, if the key is used more than 3 levels deep, we execute a stored proc. Unfortunately this is a complicated proc that cycles through all related tables and performs an UPDATE. Here is an example

      The Derived Surrogate has the benefit of being easy for people to remember, and it can be used in the interface or even the query string

    Over the years, experience has led me to the opinion that the natural vs surrogate key argument comes down to a style issue. If a client or employer has a standard one way or the other, fine - use it. If not, use whichever method you prefer, recognising that there may be some annoyances down the road. But don't let somebody criticise you because your style doesn't fit their preconceived notions.


  6. Maximum row size for a table

    A table's maximum row size should be less than the size of a single SQL Server data page (8,060 bytes). Otherwise, data entry forms can give errors if data is not validated correctly.

    SQL Auditor will check this rule.


  7. Do you create primary key on your tables?

    When you specify a PRIMARY KEY constraint for a table, SQL Server enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries.
    Although, strictly speaking, a primary key is not essential - you can update tables in SQL Enterprise Manager without one - we recommend all tables have a primary key (except tables that have a high volume of continuous transactions). Especially when you have a client like Access, a primary key helps you avoid problems.



  8. Do you create clustered index on your tables?

    You're allowed one clustered index per table, so unless you are never going to query a table, you may as well choose a field to be part of a clustered index. Basically,
    1) Every table should have a clustered index;
    2) The clustered index should be a unique clustered index where possible;
    3) The clustered index should be on a single column where possible;

    So how do you choose the right field? It depends on the usage pattern of the table. If sets of related records are regularly retrieved together in an application, a clustered index on the joining key can dramatically improve performance.

    For example, in an Order to OrderDetails relationship with OrderID as the joining key, items in an order are regularly retrieved in a bundle. A clustered index on the OrderID column in OrderDetails table will improve the performance of the application significantly.

    Another example, if a table is frequently used for reporting, and a date range is used to define the time scope of the report, a clustered index on the date column is suitable. In more technical terms, if queries such as

    SELECT * FROM ReportTable WHERE ItemDate BETWEEN '20030101' AND '20030201'
    is executed frequently, ItemDate is a good candidate column for a clustered index. [more information on Clustered Indexes]
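
    A sketch of the corresponding index statements (the index names are ours):

    -- Items in an order are regularly retrieved in a bundle
    CREATE CLUSTERED INDEX IXC_OrderDetails_OrderID ON OrderDetails (OrderID)

    -- A reporting table regularly filtered by date range
    CREATE CLUSTERED INDEX IXC_ReportTable_ItemDate ON ReportTable (ItemDate)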
    SQL Auditor will check this rule and generate a script to create a clustered index on the columns selected in the wizard


  9. Use smalldatetime datatype, where possible, instead of datetime datatype

    Most applications do not require the range and precision offered by the DateTime data type. When was the last time you needed to enter an order past the year 2079? Using SmallDateTime also gives you better data integrity, because most business applications never need dates outside the range 1900-2079.

    More Information:
    In addition (though I don't really care about this), you get a smaller database.
    DateTime type takes up 8 bytes. It can store dates ranging from January 1, 1753, to December 31, 9999, with time values rounded to increments of .000, .003, or .007 milliseconds.

    A SmallDateTime type takes up only 4 bytes; as a consequence, it can only store dates ranging from January 1, 1900, through June 6, 2079, with accuracy to the minute. With a million records, each with two date fields, you could save 8MB of storage space. More space can be saved if you have indexes on those columns. So that is about 1 cent worth today :-)

    SQL Auditor will check this rule and generate a script to change the datatype of datetime columns to smalldatetime


  10. Do not use indexes on RowGuid column

    RowGuids (uniqueidentifier) are large fields (16 bytes) and are basically going to ALWAYS be unique.

    SQL Server adds a RowGUID column to all tables if you are using Merge Replication (but doesn't add an index).

    RowGuids in general slow things down. Some people may consider using a RowGuid as their primary key. This is a bad idea: the index will be quite slow, because you are searching a large field. It goes without saying, NEVER put a clustered index on a RowGuid column.

    Another little annoyance with RowGuids is when you are searching for one. You can't use > or < on a RowGuid column.

    Note: There are not many cases where a RowGuid should have an index on it. An exception is SSW SQL Total Compare, a tool that compares and syncs data via rowguids - an index makes it a lot faster. Be aware that SQL Server adds this column when you perform Merge Replication.

    SQL Auditor will check this rule and generate a script to drop any indexes on rowguid columns


  11. Do you have a timestamp column?

    The SQL Server timestamp data type has nothing to do with times or dates. SQL Server timestamps are binary numbers that indicate the relative sequence in which data modifications took place in a database.

    All tables should have a timestamp column to aid concurrency checking. A timestamp improves update performance because only one column needs to be checked when performing a concurrency check (instead of checking all columns in a table for changes).

    Be aware that when replicating with a SQL Server CE Pocket PC device using SQL server, a timestamp column is added automatically.
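
    A minimal sketch (the Client table and parameter names are examples): add the column, then check only it when updating:

    -- Add a timestamp column for concurrency checking; SQL Server populates it automatically
    ALTER TABLE Client ADD Concurrency timestamp NOT NULL

    -- An update then detects other users' changes by checking this one column
    UPDATE Client
    SET CoName = @pstrCoName
    WHERE ClientID = @pintClientID AND Concurrency = @ptsConcurrency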

    SQL Auditor will check this rule and generate a script to add a timestamp column to tables that do not have a timestamp column


  12. Use FillFactor of 90% for indexes and constraints

    Indexes should generally have a fillfactor of 90%. If the amount of data stored in the database does not prohibit rebuilding indexes, a fillfactor of 90% should be maintained to increase performance of inserts.

    A table that expects a lot of insert operations could use a lower fillfactor.
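
    As a sketch (the index and table names are examples):

    -- Create an index leaving 10% free space in each leaf page
    CREATE INDEX IX_Client_CoName ON Client (CoName) WITH FILLFACTOR = 90

    -- Or rebuild all indexes on a table to a 90% fillfactor
    DBCC DBREINDEX ('Client', '', 90)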

    SQL Auditor will check this rule and generate a script to change FillFactor to 90% for indexes and constraints


  13. Do not use table names longer than 24 characters

    If a SQL Server table name is longer than 24 characters and is linked to an Access front-end, characters after the 24th will be truncated [more...]

    SQL Auditor will check this rule.


  14. Do you always have version tracking tables?

    We always use two tables for tracking versioning information:

    • _zsDataVersion tracks the schema changes, and which update script we are up to. This helps tremendously in determining which version of the scripts are still required between development, test, and production databases.
    • _zsVersionLatest tracks which version the front-end client should be. This allows us to give a warning to (or even deny) users who are connecting to the database while not using the right version of the front-end client.

    Please see "Is a Back-end structural change going to be a hassle?" on our Rules to Successful Projects.



  15. Do you validate your denormalized fields correctly?

    90% of the databases that SSW works with make use of denormalized fields, and we believe with good reason. However, several precautions should be taken to ensure that the data held within these fields is reliable - particularly when several applications are updating your denormalized data. To illustrate, let's say that we want to show all Customers with a calculated field totalling their order amount (ie Customer.OrderTotal).

    With this example in mind, the main reasons we use denormalized fields are:

    • Reducing development complexity. A denormalized field can mean that all SELECT queries in the database are simpler. Power users find it easier to use for reporting purposes - without the need for a cube. In our example, we would not need a large view to retrieve the data (as below).
      SELECT Customer.CustomerID, SUM(SalesOrderDetail.OrderQty * (SalesOrderDetail.UnitPrice
      - SalesOrderDetail.UnitPriceDiscount)) AS DetailTotal, Customer.SalesPersonID, Customer.TerritoryID,
      Customer.AccountNumber, Customer.CustomerType, Customer.ModifiedDate, Customer.rowguid
      FROM Customer INNER JOIN SalesOrderHeader ON Customer.CustomerID = SalesOrderHeader.CustomerID
      INNER JOIN SalesOrderDetail ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
      GROUP BY Customer.CustomerID, Customer.SalesPersonID, Customer.TerritoryID, Customer.AccountNumber,
      Customer.CustomerType, Customer.ModifiedDate,Customer.rowguid ORDER BY Customer.CustomerID
      Figure: A view to get customer totals when no denormalized fields are used

      If we had a denormalized field, the user or developer would simply run the following query:

      SELECT Customer.CustomerID, Customer.OrderTotal 
      AS DetailTotal 
      FROM Customer 
      ORDER BY Customer.CustomerID
      				   
      Figure: Queries are much simpler with denormalized fields

      Note that this is not a particularly complicated example. However, you can see why it can simplify development greatly when working with a large number of tables

    • Performance is better for read-intensive reports - particularly when reporting on data with a cube
    • When there are multiple tables in a SQL Server view, they cannot be updated in one hit - they must be updated one table at a time
    • It is a built-in validation device. For example, if records are accidentally deleted directly in the database, there is still a validation check for the correct totals. The value of this is reduced when there is a full audit log on the database

    However, there are reasons against using denormalized fields

    • They have to be maintained - and can potentially get out of sync. This can make them unreliable, particularly if several applications are updating the denormalized fields incorrectly. UPDATEs, INSERTs and DELETEs are more complicated because they must also update the denormalized fields
    • They can be seen as an unnecessary waste of space

    All in all, we choose to still use denormalized fields because they can save development time. We do this with some provisos. In particular, they must be validated correctly to ensure the integrity of the data.

    Here is how we ensure that this data is validated:

    1. Change the description of any denormalized field to include ";Denormalized" - eg. ";Denormalized: Sum(OrderTotal) FROM Orders" in the column's description in Enterprise Manager
    2. Create a view that lists all the denormalized fields in the database - based on the description field.
      CREATE VIEW dbo.vwValidateDenormalizedFields
      AS
      SELECT OBJECT_NAME(id) AS TableName, COL_NAME(id, smallid) AS ColumnName,
      CAST([value] AS VARCHAR(8000)) AS Description,
      'procValidate_' + OBJECT_NAME(id) + '_' + COL_NAME(id, smallid) as
      ValidationProcedureName
      FROM dbo.sysproperties
      WHERE (name = 'MS_Description') AND (CAST([value] AS VARCHAR(8000))
      LIKE '%;Denormalized:%')
      Figure: Standard view for validation of denormalized fields validators
    3. Create a stored procedure (based on the above view) that validates whether all denormalized fields have a stored procedure that validates the data within them
      CREATE PROCEDURE procValidateDenormalizedFieldValidators AS
      --Checks whether there is a validator for the denormalized field (DDK 6/6/2005)
      SELECT ValidationProcedureName AS MissingValidationProcedureName
      FROM vwValidateDenormalizedFields
      WHERE ValidationProcedureName NOT IN
      (
          SELECT ValidationProcedureName
          FROM vwValidateDenormalizedFields
          LEFT JOIN sysobjects
              ON vwValidateDenormalizedFields.ValidationProcedureName = OBJECT_NAME(sysobjects.id)
          WHERE id IS NOT NULL
      )
      Figure: Standard stored procedure for validation of denormalized fields validators


    *Note: Moved to http://sharepoint.ssw.com.au/Standards/SoftwareDevelopment/RulesToBetterSQLServerSchemaDeployment/Pages/DoYouCheckYourDenormalizedFieldIsStillThereWithprocValidate.aspx
  16. Do you avoid using user-schema separation?

    User-schema separation is a new feature introduced in SQL 2005.

    In SQL 2000
    • All objects are owned by users
    • If a user is deleted, all these objects must be deleted or have the owner reassigned
    • In script the naming convention is databaseName.ownerName.objectName
    • You need to update all scripts when a user changes.

    User-schema separation solves this problem by adding another level of naming, and shifting ownership of database objects to the schema, not the user. So, is it worth doing? Unless you are working with a very large database (100+ tables), the answer is "no". Most smaller databases have all objects with owner "dbo", which is fine in most cases.

    Figure: AdventureWorks using user schema - instead, keep it simple and avoid using user schema unnecessarily
    Figure: Adventure works with user schema cleaned out (Good). Much simpler and more readable


  17. Do you use triggers for denormalized fields?

    I believe that de-normalised fields are not a bad thing. When used properly and sparingly, they can actually improve your application's performance. As an example:

    • I have an Orders table containing one record per order
    • I also have an OrderItems table which contains line items linked to the main OrderID, as well as subtotals for each line item
    • In my front end I have a report showing the total for each order
    To generate this report, I can either
    1. Calculate the Order total by summing up every single line item for the selected Order every time the report is loaded, or
    2. Store the Order subtotal as a de-normalised field in the Orders table, updated by triggers
    The second option will save me an expensive JOIN query each time because I can just tack the denormalised field onto the end of my SELECT query.
    1. Code: Alter Orders table
      ALTER TABLE Orders
      ADD SumOfOrderItems money NULL
      				
    2. Code: Insert trigger
      ALTER TRIGGER tri_SumOfOrderItems
      ON dbo.OrderItems
      FOR INSERT
      AS
      --Use the inserted pseudo-table directly so the trigger also works for multi-row inserts
      UPDATE Orders
      SET Orders.SumOfOrderItems = Orders.SumOfOrderItems +
          (SELECT ISNULL(SUM(ItemValue), 0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
      WHERE Orders.OrderID IN (SELECT OrderID FROM inserted)
      				
    3. Code: Update trigger
      ALTER TRIGGER tru_SumOfOrderItems
      ON dbo.OrderItems
      FOR UPDATE
      AS
      --Adjust by the delta between the new (inserted) and old (deleted) rows
      UPDATE Orders
      SET Orders.SumOfOrderItems = Orders.SumOfOrderItems
          + (SELECT ISNULL(SUM(ItemValue), 0) FROM inserted WHERE inserted.OrderID = Orders.OrderID)
          - (SELECT ISNULL(SUM(ItemValue), 0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
      WHERE Orders.OrderID IN (SELECT OrderID FROM inserted UNION SELECT OrderID FROM deleted)
      
    4. Code: Delete trigger
      ALTER TRIGGER trd_SumOfOrderItems
      ON dbo.OrderItems
      FOR DELETE
      AS
      UPDATE Orders
      SET Orders.SumOfOrderItems = Orders.SumOfOrderItems -
          (SELECT ISNULL(SUM(ItemValue), 0) FROM deleted WHERE deleted.OrderID = Orders.OrderID)
      WHERE Orders.OrderID IN (SELECT OrderID FROM deleted)
      
    5. Code: Maintenance stored procedure
      --Stored Procedure for Maintenance
      Alter Procedure dt_Maintenance_SumOfItemValue
      As
      UPDATE Orders
      SET Orders.SumOfOrderItems = Isnull((SELECT SUM (ItemValue) FROM OrderItems WHERE OrderItems.OrderID = Orders.OrderID),0)
      



  18. Do you create a consistent primary key column on your tables?

    Make sure you create a consistent primary key column, named [TableName] + "ID", on your tables.

                                        Employee.EmployeeID
    Figure: good example.
                                        Employee.ID, Employee.Employee_Code, Employee.Employee
    Figure: bad example.
    SQL Auditor will check this rule.


  19. Do you use separate lookup tables rather than one large lookup table for your lookup data?

    Advantage: Simplifies ORM Mapping

    We prefer multiple lookup tables so they make more sense in ORM tools. E.g. you could have either:

    1. OrderType

    Or

    2. LookupTable

    But when you are obtaining the OrderType for an order, you would have

    Either

    Order.OrderType.OrderTypeID (Good)

    Or

    Order.LookupTable.Value (Not great as it is not clear what the nature of the lookup table is). If you have multiple lookups to the one table, you would need to do your mappings manually rather than using a tool.

    Advantage: Maintains complete referential integrity without the need for triggers

    The other advantage of having separate lookup tables rather than one large one is that referential integrity is maintained.

    One issue with having one large table is that you can still enter invalid values in the Order.OrderTypeID column - eg. if OrderTypeIDs range from 1-3 and CustomerTypeIDs range from 4-10.

    If I put OrderTypeID = 10, then I will not get referential integrity errors (even though I should) because I have entered a value which exists in the lookup table (even though it is for the wrong type).

    If I want to enforce referential integrity so I can only enter the correct type for my lookup table, then I would need to resort to triggers or a (fallible) coded data tier.

    Advantage: You can add new columns specific to each lookup table

    For example, if a Lookup table (e.g. CustomerType) has an associated value (e.g. the field MaximumDebtAmount), we don't need to add a field that is irrelevant to all the other lookup tables. We can just add it to the individual lookup table.

    Disadvantage: Multiple tables make maintenance slightly more difficult, especially when making changes directly via Management Studio.

    It is simpler to Administer one table than multiple tables, but you can reduce this problem with a good Generic Administration Page UI.
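
    As a sketch, separate lookup tables with real referential integrity might look like this (the table and column names are examples):

    CREATE TABLE OrderType
    (
        OrderTypeID int NOT NULL PRIMARY KEY,
        OrderTypeName varchar(100) NOT NULL
    )

    CREATE TABLE CustomerType
    (
        CustomerTypeID int NOT NULL PRIMARY KEY,
        CustomerTypeName varchar(100) NOT NULL,
        MaximumDebtAmount money NULL -- a column specific to this lookup table
    )

    -- Orders.OrderTypeID can now only hold genuine order types
    ALTER TABLE Orders ADD CONSTRAINT FK_Orders_OrderType
        FOREIGN KEY (OrderTypeID) REFERENCES OrderType (OrderTypeID)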

  20. Do you avoid de-normalized fields with computed columns?

    We should always use computed columns (in SQL Server 2005 and later they can be persisted) to avoid these types of denormalized columns.

    Figure: bad example.
    Figure: good example.

    Computed columns have some limitations - they cannot access fields in other tables, or other computed columns in the current table.

    We use user defined functions (UDFs) to encapsulate our logic in reusable functions. This allows one computed column to use a function that calls another function.

    Use the suffix Computed to clearly distinguish that this field is a computed field.

    ALTER FUNCTION [dbo].[udfEmpTime_TimeTotalComputed]
    (
        @TimeStart as DateTime,
        @TimeEnd as DateTime
    )
    RETURNS DECIMAL(8,6)
    AS
    BEGIN
        -- This function returns the time difference in hours - decimal(8,6)
        RETURN (round(isnull(CONVERT([decimal](8,6), @TimeEnd - @TimeStart, (0)) * (24), (0)), (2)))
    END
    Figure: This is the user defined function.
    Figure: Setting up computed column in table designer.
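
    The equivalent of the designer setup in T-SQL might look like this (the EmpTime table and its TimeStart/TimeEnd columns are assumed from the function name):

    -- Add a computed column that calls the UDF
    -- (in SQL Server 2005 and later it can also be marked PERSISTED if the function is deterministic)
    ALTER TABLE EmpTime
        ADD TimeTotalComputed AS dbo.udfEmpTime_TimeTotalComputed(TimeStart, TimeEnd)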
  21. Do you add zs prefix to table name?

    Any table that does not contain application data should be prefixed with zs. So when another application (e.g. SSW SQL Deploy) or the programmer populates the table, it should be prefixed with zs (e.g. zsDate - the program populates it; zsVersion - the programmer populates it).

Views Rules

  1. Don't have views as redundant objects

    Don't have views as redundant objects, e.g. vwCustomers as SELECT * FROM Customers. This is unnecessary - instead, views should generally be used for security.



Stored Procedure Rules

  1. Do your stored procedures return a value indicating the status?

    Make sure your stored procedures always return a value indicating the status. All stored procedures should return the error number (if an error) or a 0 to indicate no errors (ie success).



  2. Do you standardize on the return values of stored procedures for success and failures?

    Standardize on the return values of stored procedures for success and failures.



  3. If you need to return the value of variables, do you use OUTPUT parameters?

    The RETURN statement is meant for returning the execution status only, not data. If you need to return the value of variables, use OUTPUT parameters. There is a compelling reason for this: if you use return values rather than OUTPUT parameters to return data, money values will be silently truncated.
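
    A minimal sketch (the table and names are ours): the data comes back via an OUTPUT parameter, while RETURN carries only the status:

    CREATE PROCEDURE dbo.procOrderTotalGet
    @pintOrderID int,
    @pcurOrderTotal money OUTPUT -- money comes back intact via OUTPUT, not truncated by RETURN
    AS
    SELECT @pcurOrderTotal = SUM(ItemValue)
    FROM OrderItems
    WHERE OrderID = @pintOrderID

    RETURN 0 -- execution status only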

    For more information, see the following KB article
     



  4. Do you check the global variable @@ERROR after executing a data manipulation statement?

    Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can roll back the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important because, by default, SQL Server will not roll back all the previous changes within a transaction if a particular statement fails. This behaviour can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role in determining how many rows were affected by a previous data manipulation (or retrieval) statement; based on that, you could choose to commit or roll back a particular transaction.
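
    A sketch of the pattern, as a fragment from inside a stored procedure (the table and variable names are examples):

    BEGIN TRANSACTION

    UPDATE Account SET Balance = Balance - @pcurAmount WHERE AccountID = @pintFromID
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN 1 -- FAILURE
    END

    UPDATE Account SET Balance = Balance + @pcurAmount WHERE AccountID = @pintToID
    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRANSACTION
        RETURN 1 -- FAILURE
    END

    COMMIT TRANSACTION
    RETURN 0 -- SUCCESS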



  5. Do you use SCOPE_IDENTITY() to get the most recent row identity?

    When inserting a row in a stored procedure, always use SCOPE_IDENTITY() if you want to get the ID of the row that was just inserted. A common error is to use @@IDENTITY, which returns the most recently created identity for your current connection, not necessarily the identity for the recently added row in a table. You could have a situation where there is a trigger that inserts a new record in a Logs Table, for example, when your Stored Procedure or INSERT SQL Statement inserts a record in the Orders Table. If you use @@IDENTITY to retrieve the identity of the new order, you will actually get the identity of the record added into the Log Table and not the Orders Table, which will create a nasty bug in your data access layer. To avoid the potential problems associated with someone adding a trigger later on, always use SCOPE_IDENTITY() to return the identity of the recently added row in your INSERT SQL Statement or Stored Procedure.

    Behold this example from SQL Server Books online.

    USE tempdb
    GO
    CREATE TABLE TZ (
       Z_id  int IDENTITY(1,1)PRIMARY KEY,
       Z_name varchar(20) NOT NULL)
    
    INSERT TZ
       VALUES ('Lisa')
    INSERT TZ
       VALUES ('Mike')
    INSERT TZ
       VALUES ('Carla')
    
    SELECT * FROM TZ
    
    --Result set: This is how table TZ looks.
    Z_id   Z_name
    -------------
    1      Lisa
    2      Mike
    3      Carla
    
    CREATE TABLE TY (
       Y_id  int IDENTITY(100,5)PRIMARY KEY,
       Y_name varchar(20) NULL)
    
    INSERT TY (Y_name)
       VALUES ('boathouse')
    INSERT TY (Y_name)
       VALUES ('rocks')
    INSERT TY (Y_name)
       VALUES ('elevator')
    
    SELECT * FROM TY
    --Result set: This is how TY looks:
    Y_id  Y_name
    ---------------
    100   boathouse
    105   rocks
    110   elevator
    
    /*Create the trigger that inserts a row in table TY 
    when a row is inserted in table TZ*/
    CREATE TRIGGER Ztrig
    ON TZ
    FOR INSERT AS 
       BEGIN
       INSERT TY VALUES ('')
       END
    
    /*FIRE the trigger and determine what identity values you obtain 
    with the @@IDENTITY and SCOPE_IDENTITY functions.*/
    INSERT TZ VALUES ('Rosalie')
    
    SELECT SCOPE_IDENTITY() AS [SCOPE_IDENTITY]
    GO
    SELECT   @@IDENTITY AS [@@IDENTITY]
    GO
    

    Notice the difference in the result sets. As you can see, it's crucial that you understand the difference between the 2 commands in order to get the correct ID of the row you just inserted.

    SCOPE_IDENTITY
    4
    /*SCOPE_IDENTITY returned the last identity value in the same scope. This was the insert on table TZ.*/
    
    @@IDENTITY
    115
    /*@@IDENTITY returned the last identity value inserted to TY by the trigger. This fired because of an earlier insert on TZ.*/
    


  6. Do you SET NOCOUNT ON for production and NOCOUNT OFF for development/debugging purposes?

    You should use SET NOCOUNT ON for production and SET NOCOUNT OFF for development/debugging purposes (i.e. when you want the rowcounts to display as messages from your T-SQL).

    According to SQL Server Books Online:
    "For stored procedures that contain several statements that do not return much actual data, this can provide a significant performance boost because network traffic is greatly reduced."

    Example: A procedure that returns a scalar value (a ClientID generated by an insert statement) should use the OUTPUT keyword (not RETURN) to pass back data. This is how you should return a generated ClientID from the procedure, and also return a status value:

    CREATE PROCEDURE procClientInsert
    /*
    '----------------------------------------------
    ' Copyright 2001 SSW
    ' www.ssw.com.au All Rights Reserved.
    ' VERSION AUTHOR  DATE COMMENT
    ' 1.0     DDK      17/12/2001
    '
    'Calling example
    'DECLARE @pintClientID int
    'DECLARE @intReturnValue int
    'exec @intReturnValue = procClientInsert 'TEST Entry', @pintClientID OUTPUT
    'PRINT @pintClientID
    'PRINT @intReturnValue
    '----------------------------------------------
    */

    @pstrCoName varchar (254),
    @pintClientID int OUTPUT

    AS

    DECLARE @intError int, @intRowCount int

    --IF ONE THING FAILS, ROLLBACK
    SET XACT_ABORT ON

    --THE COUNT WILL NOT NORMALLY DISPLAY IN AN APPLICATION IN PRODUCTION.
    --GET RID OF IT BECAUSE IT IS EXTRA TRAFFIC, AND CAN CAUSE
    --PROBLEMS WITH SOME CLIENTS
    SET NOCOUNT ON

    --Generate a random number
    SET @pintClientID = (SELECT CAST(RAND() * 100000000 AS int))

    INSERT INTO Client (ClientID, CoName) VALUES (@pintClientID, @pstrCoName)

    --Capture the status immediately - the next statement would reset @@ERROR and @@ROWCOUNT
    SELECT @intError = @@ERROR, @intRowCount = @@ROWCOUNT

    SET XACT_ABORT OFF

    IF @intRowCount = 1
        RETURN 0 -- SUCCESS
    ELSE
        BEGIN
            IF @intError = 0
                RETURN 1 -- FAILURE
            ELSE
                RETURN @intError -- FAILURE
        END
    

    This procedure will return 0 or the error number to indicate success or failure. You should base your actions on this return code.

    This separates return values from actual data so that other programmers know what to expect.

    Note:
    If you are using SQL Server stored procedures to edit or delete data using a SqlDataAdapter, make sure that you do not use SET NOCOUNT ON in the stored procedure definition. This causes the rows affected count returned to be zero, which the DataAdapter interprets as a concurrency conflict. In this event, a DBConcurrencyException will be thrown.



  7. Do you keep your Stored Procs simple?

    If you are using the .NET Framework, put validation and defaults in the middle tier. The backend should have the required fields (Allow Nulls = False), but no complicated constraints. The following are examples that work with the Products table (with an added timestamp field called Concurrency) from Northwind.

    1. Code: Select Procedure
      ALTER PROCEDURE dbo.ProductSelect
      @ProductID int
      AS
      SELECT ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock,
      UnitsOnOrder, ReorderLevel, Discontinued, Concurrency
      FROM Products
      WHERE (ProductID= @ProductID)
                                              
    2. Code: Insert Procedure
      ALTER PROCEDURE dbo.ProductInsert
      @ProductName nvarchar(40),
      @SupplierID int,
      @CategoryID int,
      @QuantityPerUnit nvarchar(20),
      @UnitPrice money,
      @UnitsInStock smallint,
      @UnitsOnOrder smallint,
      @ReorderLevel smallint,
      @Discontinued bit
      AS
      INSERT INTO Products (ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice,
      UnitsInStock, UnitsOnOrder, ReorderLevel, Discontinued)
      VALUES (@ProductName, @SupplierID, @CategoryID, @QuantityPerUnit, @UnitPrice, @UnitsInStock,
      @UnitsOnOrder, @ReorderLevel, @Discontinued)
      SELECT Scope_Identity() AS [SCOPE_IDENTITY]  --If table has identity column
      --SELECT @@ROWCOUNT  --If table doesn't have identity column
      -- Note: The middle tier must check the ROWCOUNT = 1
    3. Code: Update Procedure
      ALTER PROCEDURE dbo.ProductUpdate 
      @ProductID int, 
      @ProductName nvarchar(40), 
      @SupplierID int, 
      @CategoryID int, 
      @QuantityPerUnit nvarchar(20), 
      @UnitPrice money, 
      @UnitsInStock smallint, 
      @UnitsOnOrder smallint, 
      @ReorderLevel smallint, 
      @Discontinued bit, 
      @Concurrency timestamp
      AS
      
      UPDATE Products 
      SET ProductName = @ProductName,
      SupplierID = @SupplierID,
      CategoryID = @CategoryID,
      QuantityPerUnit = @QuantityPerUnit,
      UnitPrice = @UnitPrice,
      UnitsInStock = @UnitsInStock,
      UnitsOnOrder = @UnitsOnOrder,
      ReorderLevel = @ReorderLevel,
      Discontinued = @Discontinued
       
      WHERE (Concurrency = @Concurrency) AND (ProductID= @ProductID) --Note the double criteria to ensure concurrency 
      SELECT @@ROWCOUNT 
       
      -- Note: The middle tier must check the ROWCOUNT = 1
    4. Code: Delete Procedure
      ALTER PROCEDURE dbo.ProductDelete 
      @ProductID int, 
      @Concurrency timestamp 
      AS 
      DELETE FROM Products 
      WHERE (ProductID= @ProductID) AND (Concurrency = @Concurrency)
       
      --Note the double criteria to ensure concurrency 
      SELECT @@ROWCOUNT 
       
      --Note: The middle tier must check the ROWCOUNT = 1


  8. Do not start user stored procedures with system prefix "sp_" or "dt_"

    System stored procedures are created and stored in the master database and have the sp_ prefix. System stored procedures can be executed from any database without having to qualify the stored procedure name fully using the database name master. It is strongly recommended that you do not create any stored procedures using sp_ as a prefix. SQL Server always looks for a stored procedure beginning with sp_ in this order:

    1. The stored procedure in the master database.
    2. The stored procedure based on any qualifiers provided (database name or owner).
    3. The stored procedure using dbo as the owner, if one is not specified.
    Therefore, although the user-created stored procedure prefixed with sp_ may exist in the current database, the master database is always checked first, even if the stored procedure is qualified with the database name.

    Important: If any user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.

    SQL Auditor will check this rule and generate a script to replace user-defined stored procedures prefix of 'sp_' or 'dt_' with prefix defined in Tools > Options



  9. Use company standard description in your stored procedures

    All stored procedures must have the company standard description.

    SQL Auditor will check this rule



  10. Do you avoid using SELECT * when inserting data?

    Using a statement like "INSERT tableName SELECT * FROM otherTable" makes your stored procedures vulnerable to failure: once either of the two tables changes, the stored procedure won't work. Not only that, when the target table has an identity column, such a statement will cause an error - "An explicit value for the identity column in table 'ParaRight' can only be specified when a column list is used and IDENTITY_INSERT is ON."

    USE [ParaGreg]
    GO
    /****** Object:  StoredProcedure [dbo].[procMove]    Script Date: 08/08/2008 12:18:33 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    AS
    IF @direction = 0
    BEGIN
          INSERT INTO ParaRight
          SELECT * FROM ParaLeft
          WHERE ParaID = @id
          DELETE FROM ParaLeft
          WHERE ParaID = @id
    END
    ELSE IF @direction = 1
    BEGIN
          INSERT INTO ParaLeft
          SELECT * FROM ParaRight
          WHERE ParaID = @id
          DELETE FROM ParaRight
          WHERE ParaID = @id
    END
    
                    
                    
    Bad example: Using SELECT * when inserting data. Also, this stored procedure should have an ELSE section to raise an error when no condition is satisfied.

    USE [ParaGreg]
    GO
    /****** Object:  StoredProcedure [dbo].[procMove]    Script Date: 08/08/2008 12:18:33 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    ALTER PROCEDURE [dbo].[procMove]
    @id AS Char,
    @direction AS INT
    AS
    IF @direction = 0
    BEGIN
          INSERT INTO ParaRight
          SELECT Col1,Col2 FROM ParaLeft
          WHERE ParaID = @id
          DELETE FROM ParaLeft
          WHERE ParaID = @id
    END
    ELSE IF @direction = 1
    BEGIN
          INSERT INTO ParaLeft
          SELECT Col1,Col2 FROM ParaRight
          WHERE ParaID = @id
          DELETE FROM ParaRight
          WHERE ParaID = @id
    END
    ELSE BEGIN PRINT 'Please use a correct direction'
        END
    
                    
                    
    Good example: Using concrete columns instead of * and providing an ELSE section to raise an error.
  11. Do you use transactions for complicated stored procedures?

    A transaction is an atomic operation: it ensures that either all operations within it succeed, or the transaction cancels them all and rolls the database back to its original state - no dirty data or mess is left in the database. So if a stored procedure has many steps that depend on one another, it is strongly recommended that you encapsulate the procedure in a transaction.

    ALTER PROCEDURE [dbo].[procInit]
    AS
        DELETE ParaLeft
        DELETE ParaRight
        INSERT INTO ParaLeft (ParaID)
        SELECT ParaID FROM Para
                    
    Bad example: No transaction here - if any operation fails, the database will be only partially updated, producing an unwanted result.

    ALTER PROCEDURE [dbo].[procInit]
    AS
        --Roll the whole transaction back if any statement fails
        SET XACT_ABORT ON

        BEGIN TRANSACTION

        DELETE ParaLeft
        DELETE ParaRight
        INSERT INTO ParaLeft (ParaID)
        SELECT ParaID FROM Para

        COMMIT
                    
    Good example: Using a transaction to ensure that all operations within it succeed as a unit; otherwise, the database rolls back to its original state.
  12. Do you know SQL stored procedure names should be prefixed with the owner?

    Always specify the schema prefix when creating stored procedures. This way you know that it will always be dbo.procedure_name no matter who is logged in when it is created.

    There are 2 other benefits to including the schema prefix on all object references:

    1. It prevents the database engine from checking for an object under the user's schema first
    2. It also avoids the issue where multiple plans are cached for the exact same statement/batch just because they were executed by users with different default schemas

    Aaron Bertrand agrees with this rule - see My stored procedure "best practices" checklist.

    CREATE PROCEDURE procCustomer_Update
       @CustomerID     INT,
      …..
    BEGIN
                    
    Figure: Bad example
    CREATE PROCEDURE dbo.procCustomer_Update
       @CustomerID     INT,
      …..
    BEGIN
                    
    Figure: Good example

    We have a program called SSW Code Auditor to check for this rule.

Function Rules

  1. None to speak of at the moment

Relationship Rules

  1. Turn on referential integrity in relationships

    Cascading referential integrity constraints allow you to define the actions SQL Server takes when a user attempts to delete or update a key to which existing foreign keys point. The REFERENCES clauses of the CREATE TABLE and ALTER TABLE statements support ON DELETE and ON UPDATE clauses:

    • [ ON DELETE { CASCADE | NO ACTION } ]
    • [ ON UPDATE { CASCADE | NO ACTION } ]
    NO ACTION is the default if ON DELETE or ON UPDATE is not specified.

    Relationships should always have referential integrity turned on. If you turn it on only after data has been added, you may have data in your database that violates your referential integrity rules.
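
    As a sketch (the table names are examples; the ON UPDATE / ON DELETE choices anticipate the next two rules):

    ALTER TABLE Orders ADD CONSTRAINT FK_Orders_Customer
        FOREIGN KEY (CustomerID) REFERENCES Customer (CustomerID)
        ON UPDATE CASCADE
        ON DELETE NO ACTION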

    Referential Integrity Check
    Figure: Recommended referential integrity constraints

    SQL Auditor will check this rule.


  2. Use ON UPDATE CASCADE clause when creating relationship

    The ON UPDATE CASCADE feature of SQL Server 2000 and above can save you time in writing application and stored procedure code. We recommend that you take advantage of it. It is also more efficient than using triggers to perform these updates.

    SQL Auditor will check this rule and generate a script to add the ON UPDATE CASCADE clause to relationships


  3. Do not use ON DELETE CASCADE clause when creating relationship

    SQL Server's ON DELETE CASCADE functionality can be very dangerous. We recommend not using it. Imagine someone deletes a customer and all of that customer's orders are silently deleted. If you need to delete records in related tables, do it in code in the application, as that gives you more control.

    SQL Auditor will check this rule and generate a script to remove the ON DELETE CASCADE clause from relationships


  4. Use NOT FOR REPLICATION clause when creating relationship

    When NOT FOR REPLICATION is used with a Foreign Key relationship, the integrity of the relationship is not checked while the Replication Agent is logged in and performing replication operations. This allows changes to the data (such as cascading updates) to be propagated correctly.

    SQL Auditor will check this rule and generate a script to add the NOT FOR REPLICATION clause to relationships


  5. Do columns ending with 'ID' have FOREIGN KEY constraints?

    Figure: Missing relationships



General Rules

  1. Object name should not be a reserved word

    SQL Server reserves certain keywords for its exclusive use. It is not legal to include the reserved keywords in a Transact-SQL statement in any location except that defined by SQL Server. No objects in the database should be given a name that matches a reserved keyword. If such a name exists, the object must always be referred to using delimited identifiers. Although this method does allow for objects whose names are reserved words, it is recommended that you do not name any database objects with a name that is the same as a reserved word. In addition, the SQL-92 standard implemented by Microsoft SQL Server defines a list of reserved keywords.

    Avoid using SQL-92 reserved keywords for object names and identifiers, ie. User, Count, Group, etc. They can be used if joined with other words.
    What are reserved words for SQL Server 2000?
    Why avoid reserved words and spaces in object names?

    SQL Auditor will check this rule and generate a script to rename object names that represent reserved words


  2. Object name should not contain spaces

    Spaces should be avoided. If an object name contains a space, it can cause problems later on for developers because the developer must remember to put the object name inside square brackets when referencing it.
    Why avoid reserved words and spaces in object names?

    We aim to never have to use square brackets in any of our databases.

    SQL Auditor will check this rule and generate a script to remove spaces from object names


  3. Do not use "sp_rename" to rename objects like stored procedures, views and triggers

    The object name should be the same as the name used in the object's script (e.g. the CREATE script for stored procedures, views and triggers). Inconsistency can happen when an object is renamed with sp_rename but its script is not updated.

    SQL Auditor will check this rule and generate a script to re-create objects that have mismatched names. The object name will be used as the new name in CREATE script, not the name originally specified.


  4. Object name should follow your company Naming Conventions

    1. SQL Server Object Naming Standard - SSW's standard for naming SQL Server Objects.
    2. SQL Server Stored Procedure Naming Standard - SSW's standard for naming Stored Procedures.
    3. SQL Server Indexes Naming Standard - SSW's standard for naming Indexes.
    4. SQL Server Relationship Naming Standard - SSW's standard for naming Relationships.
    5. Use decreasing generality for table names ie. Client and ClientInvoice, then ClientInvoiceDetail.
    6. Don't use underscores, instead use upper and lower case ie. ClientInvoice is preferred over Client_Invoice.
    7. Table names should not use plurals ie. Client is preferred over Clients.
    8. Generally don't use abbreviations. But there are a few words that are so commonly used that they can be abbreviated. These are:
      • Quantity = Qty
      • Amount = Amt
      • Password = Pwd
    9. Prefix all Date fields with 'Date' ie. DateInvoiced. One extra use of this is you can have generic code that enables a Date control on this field.
    10. Suffix Percent fields with 'Pct' ie. SalesTaxPct.
    11. Only use alphabet characters. ie. don't use AustraliaListA$. Avoid the following characters in your object names in SQL Server. If you do not do this, you will need to constantly identify those ill-named objects with bracketed or quoted identifiers - otherwise, unintended bugs can arise. What characters and symbols should I avoid using when naming objects in Access and SQL Server databases
    12. Don't use reserved words on their own. ie. User, Count, Group, etc. They can be used if joined with other words. What are reserved words for SQL Server 2000?
    SQL Auditor will check this rule and generate a script to append prefix to object names according with current Naming Convention. Refer to 'Naming Conventions' tab in Tools>Options.


  5. Every object name should be owned by dbo

    The reason is that you avoid ownership chain problems: where Mary owns an object, Fred can read the object and creates a proc giving Tom permission to execute it - but Tom can't, because the chain of ownership is broken.

    CREATE PROCEDURE [dbo].[Sales by Year]
    @Beginning_Date DateTime,
    @Ending_Date DateTime AS
    SELECT Orders.ShippedDate
    ,Orders.OrderID
    ,"vwOrderSubTotals".Subtotal
    ,DATENAME(yy,ShippedDate) AS Year
    FROM Orders
    INNER JOIN "vwOrderSubTotals"
    ON Orders.OrderID = "vwOrderSubTotals".OrderID
    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date
    Figure: good example.
    CREATE PROCEDURE [Adam Cogan].[Sales by Year]
    @Beginning_Date DateTime,
    @Ending_Date DateTime AS
    SELECT Orders.ShippedDate
    ,Orders.OrderID
    ,"vwOrderSubTotals".Subtotal
    ,DATENAME(yy,ShippedDate) AS Year
    FROM Orders
    INNER JOIN "vwOrderSubTotals"
    ON Orders.OrderID = "vwOrderSubTotals".OrderID
    WHERE Orders.ShippedDate Between @Beginning_Date And @Ending_Date
    Figure: bad example.
    SQL Auditor will check this rule and generate a script to re-create objects that don't have owner dbo.


  6. Naming convention for use on database server test and production

    Generally, every client should have a dev and a test database. The dev database needs the suffix "Dev" and the test database the suffix "Test" (e.g. SSWCRMDev, SSWCRMTest). The production database doesn't need any suffix.

    Figure: Database with bad names
    Figure: Database with standard names


Middle Tier Rules

  1. Do you submit all dates to SQL Server in ISO format?

    All dates submitted to SQL Server must be in ISO format. This ensures that language or database settings do not interfere with inserts and updates of data. You should NEVER need to change the default language of users or of the database in SQL Server. For example, any insert into a SQL Server database from Visual Basic should call Format(ctlStartDate,"yyyy-mm-dd"), or in VB.NET CType(ctlStartDate.Text,Date).ToString("yyyy-MM-dd"), before attempting the insert or update. This ensures consistent treatment of dates in your SQL Server backend.

    SET DATEFORMAT mdy
    
    print convert( datetime, '2003-07-01' )
    -- returns Jul 1 2003 12:00AM
    print convert( datetime, '01/07/2003' )
    -- returns Jan 7 2003 12:00AM
    print convert( datetime, '20030701' )
    -- returns Jul 1 2003 12:00AM
    SET DATEFORMAT dmy
    
    print convert( datetime, '2003-07-01' )
    -- returns Jan 7 2003 12:00AM, opposite of above
    print convert( datetime, '01/07/2003' )
    -- returns Jul 1 2003 12:00AM, opposite of above
    print convert( datetime, '20030701' )
    -- returns Jul 1 2003 12:00AM, only one which is same as above
    Code - ISO format date is the best.

    The extended format can still mix up month & day in some circumstances, whereas the basic format is the only one that always works correctly.

    To be even more pedantic, when you include the time as well as the date, the value isn't really an ISO value at all! The ISO representation of a date/time would be '20030701T0958', whereas for SQL you should send it as '20030701 09:58'. This isn't even the extended ISO format as it is missing the obligatory "T" character (ref. section 5.4.1 of the standard).

    (The standard does allow the "T" to be omitted in applications where there is no risk of confusion, but it doesn't allow you to add a space or mix the basic date format with the extended time format.)

    So, if you want to be absolutely correct, it may be best to remove the reference to ISO, so that your rule works for date/time values as well as just dates.

    The technical term used in the SQL help is "Unseparated String Format" (easily searched for).

    The help specifies that this format is unaffected by the SET DATEFORMAT command (which depends on any locale settings for SQL Server or the computer it is installed on).

    "The SET DATEFORMAT session setting does not apply to all-numeric date entries (numeric entries without separators). Six- or eight-digit strings are always interpreted as ymd."

    What is ISO format date?



  2. Do you implement business logic in middle tier?

    Business logic/rules should be implemented in an object oriented language such as VB.NET or C#. This dramatically increases the adaptability, extensibility and maintainability of the application.

    Implementing business logic in stored procedures has the disadvantage of being hard to test, debug and evolve; therefore, stored procedures should only implement basic data access logic.

    With the exception of some very heavy data oriented operations, it is excusable to use stored procedures to carry out some logic for performance reasons.

    Triggers are even more difficult, as their behaviour is event-based. It is okay to use triggers for non-functional/infrastructural features such as logging changes, or maintaining more complex relational integrity that cannot be enforced by a simple relationship.



Performance Tuning

  1. Be Sure To Clear SQL Server Cache When Performing Benchmark Tests

    When you are tuning SQL statements, you tend to play in SQL Server Management Studio for a while. During this time SQL Server caches your queries and execution plans.
    That is all well and good, but when you are trying to speed up an existing query that is taking some time, you may not actually be making a difference even though your execution times are way down.

    You really need to clear SQL's cache (or buffer) every time you test the speed of a query. This prevents the data and/or execution plans from being cached, thus corrupting the next test.

    To clear SQL Server's cache, run DBCC DROPCLEANBUFFERS, which clears all data from the cache. Then run DBCC FREEPROCCACHE, which clears the stored procedure cache.
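
    For example (running CHECKPOINT first is a common addition, so that dirty pages are written to disk and can actually be dropped):

    CHECKPOINT                -- flush dirty pages to disk
    DBCC DROPCLEANBUFFERS     -- clear all data from the buffer cache
    DBCC FREEPROCCACHE        -- clear the stored procedure (plan) cache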

  2. Clearing the cache

    Figure: The first call is after clearing the cache; the second is without clearing the cache (26 seconds vs 2 seconds)


Acknowledgements

Adam Cogan

David Klein

Daniel Hyles

