Browsing articles from "December, 2010"

Identity Column Issues and Trade-offs

Dec 1, 2010   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, DLBlog  //  4 Comments

This post is based on a post I previously blogged on 30 May 2008…but like most design approaches, it still holds true.

If you’ve heard me present, then you know that one of my primary design mantras is:

Everything is a trade off.  You need to be able to explain, in both business and technical terms, the cost, benefits, and risks associated with every design decision.

If you can’t do that, then you aren’t doing design: you are copying design patterns from other designs and hoping for the best outcome, even if your current design has different requirements and risk tolerances.

I am often told that a database design must have surrogate keys on every table and that “right” way to implement surrogate keys is to use the Identity property in SQL Server (or the RowID in Oracle, or the Identity property in DB2).  These sorts of features of modern DBMSs do vary from vendor to vendor, so the cost, benefit, and risk associated with each varies by version and by vendor.  Most IT pros I speak with don’t realize that.

Many people I speak with assume, incorrectly, that Identity Property = Surrogate Key = Primary Key= Unique Index but this is not true. It is important for database designers to understand the differences between these concepts.

Nigel Rivett has written a great article over at Simple-talk.com about SQL Server’s identity property and some of the interesting “features” of this incrementing function:

Note: An identity column is not guaranteed to be unique nor consecutive. You should always place a unique index on an identity column if your system requires uniqueness.

Note: The next value is the step added to the current seed; not one more than the max value in the table, or even the step from the last or maximum value.

I’m guessing that 6+ out of 10 of experienced SQL Server designers and developers are not clear on these two gotchas.  Other DBMSs have technical issues as well.  When I’m questioned on why I don’t just slap on an identity property, call it a PK an move on to the next change request, I usually start asking questions:

  • Will the application need to assume that the values are always sequential?  What if the sequence is missing a few steps?  Will that break the code?
  • Will the Identity column value be displayed anywhere outside the database, such as on a report or on a screen?
  • Will the users be confused if a sequence is missing or if they “restart” in the middle of an order?
  • Will the Identity column value be used outside this database? How?  Where?
  • What are our plans for dealing with rows that exceed the maximum number of identity values (in some versions of DB2, identity values maxed out at 32k or so)?

Identity properties can be useful, but as in every design decision, there are uses that are appropriate and uses that are not — it all comes down to cost, benefit, and risk.

Pages:«123

Subscribe via E-mail

Use the link below to receive posts via e-mail. Unsubscribe at any time. Subscribe to www.datamodel.com by Email


Recent Comments

Categories

Archive

UA-52726617-1