Browsing articles tagged with " Primary Key"

The Key to Keys SQL Saturday Silicon Valley

Apr 9, 2016   //   by Karen Lopez   //    //  No Comments

sqlsat493_web

 

So many of us have learned database design approaches from working with one database or data technology. We may have used only one data modeling or development tool. That means our vocabularies around identifiers and keys tend to be product specific. Do you know the difference between a unique index and a unique key? What about the difference between RI, FK and AK?

These concepts span data activities and it’s important that your team understand each other and where they, their tools and approaches need to support these features.

We’ll look at the generic and proprietary terms for these concepts, as well as where they fit in the database design process. We’ll also look at implementation options in SQL Server and other DBMSs.

The Key to Keys at the North Texas SQL Server User Group – 17 March

Mar 15, 2016   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, DLBlog, Speaking, SQL Server  //  No Comments

I’m visiting Dallas this week to speak at the North Texas SQL Server User Group this Thursday.  I’ll be speaking about keys: primary keys, surrogate keys, clustered keys, GUIDs, SEQUENCEs, alternate keys…well, there’s a lot to cover about such a simple topic.  The reason I put this presentation together is I see a lot of confusion about these topics. Some of it’s about terminology (“I can’t find anything about alternate keys in SQL Server…what the heck is that, anyway”), some of it is misunderstandings (“what do you mean IDENTITIES aren’t unique! of course they are…they are primary keys!”), some of it is just new (“Why the heck would anyone want to use a SEQUENCE?”).

We’ll be chatting about all these questions and more on Thursday, 17 March at the Microsoft venue in Irving, Texas starting at 6PM.

Attendance is free, but you need to register at http://northtexas.sqlpass.org/ to help organizers plan for the event.

Don’t worry if you don’t know about SQL Server or don’t use it: this presentation will focus on some SQL Server specific features, but the discussion is completely portable to other DBMSs.

So many of us have learned database design approaches from working with one database or data technology. We may have used only one data modeling or development tool. That means our vocabularies around identifiers and keys tend to be product specific. Do you know the difference between a unique index and a unique key? What about the difference between RI, FK and AK? These concepts span data activities and it’s important that your team understand each other and where they, their tools and approaches need to support these features. We’ll look at the generic and proprietary terms for these concepts, as well as where they fit in the database design process. We’ll also look at implementation options in SQL Server and other DBMSs.

Hope to see you there!

She’s Cute, So it’s Okay…

Jul 12, 2012   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, Snark  //  No Comments

image

…and, yes, this is quote from one of my former DBAs.

Clustered Indexes and UniqueIdentifiers

Jul 11, 2012   //   by Karen Lopez   //   Blog, Database, Database Design, Fun, Snark  //  No Comments

Tom LaRock (@sqlrockstar) posted today about clustered indexes on uniqueidentifiers.  He has a great image there, which got me thinking about making my own….

 

image

New in SQL Server 2012 SEQUENCEs: Why They Aren’t Just for Surrogate Keys

Feb 16, 2012   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, DLBlog, SQL Server  //  5 Comments

imageStarting with SQL Server 2012, database designers will have the ability to use SEQUENCEs to generate numbers, such as for populating surrogate keys (primary Keys with no meaning). Sounds like the IDENTITY property, right?  It’s not quite the same.  In SQL Server (and other DBMSs), the IDENTITY property is a characteristic of a column in a table.  A database designer can set this property up to automatically generate a value in that column and specify certain features such as what number to start with and what number to increment by.  These properties are set for a specific column and only one column in a table can have this identity property.  That’s why IDENTITY is almost always used as a surrogate primary key on a table.

To create a column with the IDENTITY property in SQL Server 2012:

CREATE TABLE 
    [ database_name . [ schema_name ] . | schema_name . ] table_name 
    [ AS FileTable ]
    ( { <column_definition> | <computed_column_definition> 
        | <column_set_definition> | [ <table_constraint> ] [ ,...n ] } )
    [ ON { partition_scheme_name ( partition_column_name ) | filegroup 
        | "default" } ] 
    [ { TEXTIMAGE_ON { filegroup | "default" } ] 
    [ FILESTREAM_ON { partition_scheme_name | filegroup 
        | "default" } ]
    [ WITH ( <table_option> [ ,...n ] ) ]
[ ; ]
 
<column_definition> ::=
column_name <data_type>
    [ FILESTREAM ]
    [ COLLATE collation_name ] 
    [ NULL | NOT NULL ]
    [ 
        [ CONSTRAINT constraint_name ] DEFAULT constant_expression ] 
      | [ IDENTITY [ ( seed ,increment ) ] [ NOT FOR REPLICATION ] 
    ]
    [ ROWGUIDCOL ] [ <column_constraint> [ ...n ] ] 
    [ SPARSE ] 

http://msdn.microsoft.com/en-us/library/ms174979(v=sql.110).aspx

I’ve highlighted the syntax that applies the IDENTITY to a column.  SQL Server will know which column has the identity property (and you can only have one of these columns in the same table).  Notice that there are only three options for an IDENTITY.

To create a SQUENCE in SQL Server 2012:

CREATE SEQUENCE [schema_name . ] sequence_name
    [ AS [ built_in_integer_type | user-defined_integer_type ] ]
    [ START WITH <constant> ]
    [ INCREMENT BY <constant> ]
    [ { MINVALUE [ <constant> ] } | { NO MINVALUE } ]
    [ { MAXVALUE [ <constant> ] } | { NO MAXVALUE } ]
    [ CYCLE | { NO CYCLE } ]
    [ { CACHE [ <constant> ] } | { NO CACHE } ]
    [ ; ]

http://msdn.microsoft.com/en-us/library/ff878091(v=sql.110).aspx

Notice that there is no TABLE or COLUMN referenced in that TSQL.  You are just creating a SEQUENCE.  SQL Server will have no idea what you are going to use it for or where it will be used.  In fact, you could create a SEQUENCE and never use it. Notice there are several more options for a SEQUENCE.

SEQUENCEs are separate objects in a database and can be used to populate more than one column even across several tables.  Because they aren’t bound to a table in any way, they can be used in ways that IDENTITY columns can’t be used.  For  instance:

  • One SEQUENCE can be used across more than one table. For instance, a Control Number or Document Number on several tables.  I’ve seen this used in other DBMSs to generate document numbers across subtypes, say Purchase Orders, Shipping Notices, Return to Vendor Numbers, etc.  There’s a requirement that they be unique across all types of documents.  This could not be done with IDENTITY.
  • Using a SEQUENCE allows tables to have more than one auto-generated number in multiple columns. Think of Order Number and Cancellation Number, both on the Order table. You can’t do that with IDENTITY.
  • Since sequences are generated completely outside a table, you can chose to store the resulting value in a different format, say VARCHAR or CHAR. Or, heaven forbid, you might want to append data to it or do other kinds of evil to it before storing it.
  • Since sequences are generated completely outside a table, you can store the results in a NULLable column.

I think the confusion sets in because we tend to think of auto-generated number as useful only as surrogate keys. But we have real data needs for generating numbers other than just row identifiers, even if that’s the primary use for this. In fact, one could have a table with a PK that uses IDENTITY and an unlimited number of columns that use a SEQUENCE to set their values.  Of course, SEQUENCEs can be used for surrogate primary keys, too.  The existence of SEQUENCEs gives us another option over IDENTITY.

So SEQUENCE is a feature that just generates numbers based on some characteristics. A designer has many ways to use the number once it is generated and tables are not dependent on that method to get that number, nor do they even “know” about how that value was provided.

This separation of the COLUMN from how the data is populated is a feature of using SEQUENCEs. How one uses this separation depends on the technical and business requirements for the data.

Identity Column Issues and Trade-offs

Dec 1, 2010   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, DLBlog  //  4 Comments

This post is based on a post I previously blogged on 30 May 2008…but like most design approaches, it still holds true.

If you’ve heard me present, then you know that one of my primary design mantras is:

Everything is a trade off.  You need to be able to explain, in both business and technical terms, the cost, benefits, and risks associated with every design decision.

If you can’t do that, then you aren’t doing design: you are copying design patterns from other designs and hoping for the best outcome, even if your current design has different requirements and risk tolerances.

I am often told that a database design must have surrogate keys on every table and that “right” way to implement surrogate keys is to use the Identity property in SQL Server (or the RowID in Oracle, or the Identity property in DB2).  These sorts of features of modern DBMSs do vary from vendor to vendor, so the cost, benefit, and risk associated with each varies by version and by vendor.  Most IT pros I speak with don’t realize that.

Many people I speak with assume, incorrectly, that Identity Property = Surrogate Key = Primary Key= Unique Index but this is not true. It is important for database designers to understand the differences between these concepts.

Nigel Rivett has written a great article over at Simple-talk.com about SQL Server’s identity property and some of the interesting “features” of this incrementing function:

Note: An identity column is not guaranteed to be unique nor consecutive. You should always place a unique index on an identity column if your system requires uniqueness.

Note: The next value is the step added to the current seed; not one more than the max value in the table, or even the step from the last or maximum value.

I’m guessing that 6+ out of 10 of experienced SQL Server designers and developers are not clear on these two gotchas.  Other DBMSs have technical issues as well.  When I’m questioned on why I don’t just slap on an identity property, call it a PK an move on to the next change request, I usually start asking questions:

  • Will the application need to assume that the values are always sequential?  What if the sequence is missing a few steps?  Will that break the code?
  • Will the Identity column value be displayed anywhere outside the database, such as on a report or on a screen?
  • Will the users be confused if a sequence is missing or if they “restart” in the middle of an order?
  • Will the Identity column value be used outside this database? How?  Where?
  • What are our plans for dealing with rows that exceed the maximum number of identity values (in some versions of DB2, identity values maxed out at 32k or so)?

Identity properties can be useful, but as in every design decision, there are uses that are appropriate and uses that are not — it all comes down to cost, benefit, and risk.

Subscribe via E-mail

Use the link below to receive posts via e-mail. Unsubscribe at any time. Subscribe to www.datamodel.com by Email


Categories

Archive

UA-52726617-1