Browsing articles tagged with " Database Design"

You’re Doing it Wrong: Generalizations about Generalizations

What Could Go Wrong Complicated Data Model Thumbnail Plus Darth Vader

I have a couple of presentations where I describe how generalized data modeling can offer both benefits and unacceptable costs.  In my Data Modeling Contentious Issues presentation, the one where we vote via sticky notes, we debate the trade-offs of generalization in a data model and database design.  In 5 Classic Data Modeling Mistakes, I talk about over-generalization.

Over the last 20 some years (and there’s more “some” there than ever before), I’ve noticed a trend towards more generalized data models.  The means that instead of having a box for almost every noun in our business, we have concepts that have categories.  Drawing examples from the ARTS Data Model, instead of having entities for:

  • Purchase Order
  • Shipping Notice
  • Receipt
  • Invoice
  • etc

…we have one entity for InventoryControlDocument that has a DocumentType instance of Purchase order, Shipping Notice, Receipt, Invoice, etc.

See what we did there?  We took metadata that was on the diagram as separate boxes and turned them into rows in a table in the database.  This is brilliant, in some form, because it means when the business comes up with a new type of document we don’t have to create a new entity and a new table to represent that new concept.  We just add a row to the DocumentType table and we’re done.  Well, not exactly…we probably still have to update code to process that new type…and maybe add a new user interface for that…and determine what attributes of InventoryControlDocument apply to that document type so that the code can enforce the business rules.

Ah! See what we did there this time?  We moved responsibility for managing data integrity from the data architect to the coders.  Sometimes that’s great and sometimes, well, it just doesn’t happen.

So my primary reason to raise generalization as an issue is that sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems.  Just because you engineered a requirement from a table to a row does not mean it is no longer your responsibility.  I’ve even seen architects become so enamoured with moving the work from their plate to another’s that they have generalized the heck out of everything while leaving the data quality responsibility up to someone else.  That someone else typically is not measured or compensated for data integrity, either.

Sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems

Alec Sharp has written a few blog posts on Generalizations. These posts have some great examples of his 5 Ways to Go Wrong with Generalisation.   I especially like his use of the term literalism since I never seem to get the word specificity out when I’m speaking. I recommend you check out his 5 reasons, since I agree with all of them.

1 – Failure to generalize, a.k.a. literalism

2 – Generalizing too much

3 – Generalizing too soon

4 – Confusing subtypes with roles, states, or other multi-valued characteristics

5 – Applying subtyping to the wrong entity.

By the way, Len Silverston and Paul Agnew talk about levels of generalization in their The Data Model Resource Book, Vol 3: Universal Patterns for Data Modeling book (affiliate link).  Generalization isn’t just a yes/no position.  Every data model structure you architect has a level of generalization.

Every data model structure you architect has a level of generalization.

I’m wondering how many of you who have used a higher level of generalization and what you’ve done to ensure that the metadata you transformed into data still has integrity?

Leave your recommendations in the comments.

Update: I updated the link to Alec’s blog post.  Do head over there to read his points on generalization.

What Your Database Security Design….

Jul 15, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database Design, DLBlog, Fun, Snark, WTF  //  3 Comments

…looks like to me.

Sure, you’ve got your own home-grown database security system all designed and working in development.  And then you ask me to confirm that it’s “safe”. I’ll tell ya “it’s safe as long as you don’t actually put any data in it”.

Hacking Database Design

Jul 14, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, DLBlog, Fun, Snark, WTF  //  4 Comments


I get asked to help teams increase the performance of their database (hint: indexes, query tuning and correct datatypes, in that order)  or to help the scale it out for increasing workloads.  But when I open it up to take a look, I see something that looks more like this meme.

All those cheats, workarounds and tricks they’ve used are going to make the engine optimizers work harder, make the tuning of queries that much harder and in the end it’s going to cost so much more to make it “go faster” or “go web scale”.

Where are the nail clippers in your data models and databases?

27 June Big Challenges with Data Modeling: Data Modelers and DBAs

This month’s webinar (2PM EDT/11AM PDT) focuses on how DBAs and Data Modelers can collaborate better.  You know what it’s like: we both love our data and what databases to purr.  But what happens when our points of views differ?  Is performance more important that some data quality?  Should we all be using the same tools?  Who should be calling the shots?  Who should be “in charge”?  What deliverable should we be preparing?

Our webinars are very interactive.  We open the chat to everyone and offer a formal Q&A for asking questions and giving feedback to the panelists.  You can even join early to chat with us and we tend to stay about after to have “off the record” follow up conversations.  So this isn’t a lecture, but an online community event to chat about what we can do as a team to provide data and database solutions to our organizations.

We are sponsored this month by CA Technologies, the makers of CA ERwin®.  Thanks, ERwin team!

You’ll need to pre-register, but it’s free.  Do it now.  It will be fun.


About the Presentation

We invite you to join us in this monthly DATAVERSITY webinar series, “Big Challenges with Data Modeling” hosted by Karen Lopez. Join Karen and two or more expert panelists each month to discuss their experiences in breaking through these specific data modeling challenges. Hear from experts in the field on how and where they came across these challenges and what resolution they found. Join them in the end for the Q&A portion to ask your own questions on the challenge topic of the month.

This Month’s Panelists

  • Yanni Robel, Lead DBA, Database Operations Team at
    • Yanni (@yannirobel)  manages the Database Operations team at in Seattle, WA. She is a very active Senior Database Administrator as well, leveraging more than a dozen years of experience working with Microsoft SQL Server products and technologies.Yanni specializes in bridging communications between the business, the developers, the sysadmins, and the DBAs to help everyone understand what it is that they depend on. As an active member in the SQL community, previous developer, and parent, Yanni knows the importance of knowing what you don’t know and leveraging the collective knowledge of others and the community to learn and grow. Yanni is pleased to give back to peers, coworkers (past & present), and recruiters by sharing knowledge.
  • Javed Mati, Sales Director, Myriad Solutions UK Ltd
    • [Bio Pending]
  • Paul Agnew
    • Paul Agnew is an author, consultant and speaker with more than 20 years experience in the data modeling and data integration fields in many different industries. He is the co-author (With Len Silverston) of The Data Model Resource Book Volume 3: Universal Patterns in Data Modeling, which describes universal data modeling patterns. Mr. Agnew, has been a featured speaker at the prestigious DAMA International conference, and has presented papers on many different aspects of data management field.Paul has extensive, hands-on experience working in many different industries, including financial services, insurance and re-insurance, health care, healthcare informatics, sales and marketing and manufacturing. Paul is an expert in solution architecture, data architecture, data quality, master data management, data warehousing, big data and data governance.


Metadata Stuffing: Why I Hate tbl_ for Table Names

Oct 10, 2012   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, Snark, SQL Server  //  13 Comments


Karen Lopez Quote Table and View


This week was Canadian Thanksgiving so this post on metadata stuffing is timely.

Today Thomas LaRock (@sqlrockstar | blog ) posted a rant about our Database Design Throwdown topic on naming standards for tables and other database objects.  Tom is a fan of what I call “metadata stuffing” in object names.  That’s basically shoving as much additional information as one can into object names so that one does not have to go find out that information from its rightful place.

My reason for wanting to use prefixes is simple enough: I want to know if I am looking at a table or a view when reviewing code.

Karen doesn’t believe that anyone should be using object names as a place to store meta data about the objects themselves. I would like to agree with her but then we’d both be wrong.

He even created the nice graphic of my quote I use here. Thanks, Tom.

Object Confusion Abounds

That quote is indeed one I use during our debate.  It’s a snarky introduction to what I think the real problem is: our tools aren’t doing enough to help us with this potential confusion of tables and views.    When you are writing or looking at a query, the syntax for referencing a table or a view is exactly the same.

When a developer writes:

USE AdventureWorks;
SELECT Name, ProductNumber, ListPrice AS Price
FROM Production.Product 

He has no idea for certain if Product is a table or a view.  The syntax is the same.  And if the developer is working in a text editor of some sort, or with some native tools, there’s no tooltip or other help that the he can use to check what type of object they are querying.

Why does this matter?  Tom has a great presentation where he shows the impacts of trying to make stuff work and you don’t know what you are looking at.  That’s why Tom wants to do this with his objects:

FROM Production.tbl_Product


FROM Production.view_Product

So that the object type is injected into the object name.  That seems so innocuous, doesn’t it?  What could go wrong?

I’m here to tell you that this is a slippery slope.  One of the most egregious examples of this sort of meta data stuffing I’ve run across is one that required all this meta data to be prefixed in front of every table name:

  • tbl_ prefix
  • Primary Systems that managed the table
  • Primary subject area that the table belongs to in the data model
  • Classification of the role the table plays in the database (Associative Entity, Domain Entity, Master Data, Reference Data, Log Data, etc.)
  • Three letter login of the DBA responsible for administering this table (I KID YOU NOT).

So in this wonderful naming scheme, we’d get:

FROM Production.tbl_MFGR_ORDERPROC_DOM_KQL_Product

I have found that once an organization starts thinking of stuffing, their designs become turkeys really fast.  It’s ugly.  Think about the tools you use, with all those nifty object lists on the left side. To find a table you need to know all that great metadata as you scroll through the list, hoping that PRODUCT Is buried in there, somewhere.  And what the heck was the name of that guy that dressed funny who did all that data stuff for the company before he won the lottery?

All that metadata that should have been managed elsewhere, not prefixed in front of the “real” name of the table.  In fact, it was.  In the data model and in the system catalog. Every time any of that data changed (DBA assignment changes, DBA wins the lottery, whatever), we had to rename the table and change all the code and reports that referenced it.  Sure we could have isolated systems by this change by using views and or aliases but that is additional complexity for no performance gain, either.  Refactoring might have helped, but eventually we’d still have to change all the code and queries.

Why I Hate Metadata Stuffing

  1. It’s redundant data.  Just like with business data, the reason we want to minimize redundant data is because we then have to worry about updating the data in multiple places. There’s cost and added risk for that.
  2. It changes.  I don’t know about you, but I don’t have the luxury of taking down a production system just to update  a change in the name of the DBA’s favourite TV show or whatever lame naming scheme someone thought up. Sure, tables can’t change into views or indexes, but all the other type of stuffings will change.
  3. It takes up real estate. I get all kinds of flack from developers and DBAs for the length of object names when I want the names to be meaningful.  It’s funny how spelling out CUSTOMER is unacceptable, but adding the exact same characters in front of every object of its type is A-OKAY.  What’s up with that?  Somehow optimizing names for developers is more important than loving your data?  Show me where it says that in the Project Charter.
  4. tbl_ is a tell for bad database design.  I don’t know where this particular naming scheme originated, but when I do a database design review and I see this naming scheme, I know that the designer learned design in a one hour webinar “training course” and has not really mastered the complexities of enterprise database design and maintenance. The design will be less than best practice 90% of the time.  This naming scheme is prominent in programming books, introduction to database books, presentations by non-database people, and uninformed blog posts, by far.  It’s not popular with people who do professional database design. Sure, some products use this, too, but do you really want to take database design best practices from vendors? How many professional data architects do you think they have on staff?  I will most likely see a database design that is highly optimized to make development go faster.  Not for data integrity or loving data.
  5. It’s not needed “for consistency”. One of Tom’s points is that if we are going to prefix views, we have to prefix tables to be consistent.  Actually, no, we don’t.   If we have to bite the bullet and prefix views because our tools let us down, we can choose not to clutter up tables names just to punish those objects, too.  I’m assuming that since Tom prefixes tables and views, he prefixes columns, too, right?  its just being consistent.  < snort>
  6. It gets in the way of using the data. Tables and Column names are the most user-facing parts of a database design.  When we in IT insist on munging up these names with a bunch of systemese, we make it more difficult for business users to get at their data.  It shows that we have optimized the database design to help a relatively small number of technical users (developers, DBAs, ETL folks) over the needs of the business.  Ultimately, we build databases to manage data.  For the business.

What’s the Cost, Benefit and Risk?

One of my Splendid Truths is that all design decisions should assess cost, benefit and risks.  In the overall scheme of things, just prefixing “tbl_” in front of a table name isn’t that costly and it isn’t that risky.  Tom assesses his designs based only on potential for performance harm according to his post.  He “laughs” at my position.  I’m happy that my stance on metadata stuffing brings happiness to his day.  But performance is only one data point out of many for making a design decision.  Usability, clarity, business goal support are other factors that a database architect needs to consider when assigning a name to an object.  If we optimize something for a subsystem, we do it as the expense of other subsystems.

Our Tools Should Help Us More

Select statement view hover tooltip

Having said that, I feel the pain of people having to work with sub-standard tools or having to use tools that just refuse to help.  Tom showed how SQL Server Management Studio tooltips can help.  But all those command line “I don’t need any stinking help” aficionados are left on their own to know what they are looking at.

Oh, and Tom:

ProperNoun_Tom, Pronoun_you Verb_made Pronoun_me Verb_laugh Preposition_with Possessive_your Noun_post.

(See how all this stuffing gets in the way?)

One of my other Splendid Truths about database design is:

Your tools will impact your data models and database designs more than you can imagine.

We shouldn’t sit back and let that happen. Stuffing is great with Tofurky, not with databases.

Join me at SQL Saturday San Diego #SQLSat157 – 15 Sept 2012

Sep 11, 2012   //   by Karen Lopez   //   Blog, Data, Database, Database Design, Events, Speaking  //  1 Comment

I’ll be doing two sessions at SQL Saturday San Diego this weekend:


The first is with co-presenter, Tom LaRock (@sqlrockstar | blog), where we debate, whine (Tom) and win (me)  several database design approaches and methods in front of a live audience (you!).  This is a warm up for our PASS Summit spotlight presentation.

Database Design Throw Down

Karen and Tom debate about the options and best practices of common and advanced design issues, such as: * Natural vs. Surrogate keys * NULL vs NOT NULL * Datatypes * Agile Database Design * Database Refactoring * Identity Crisis ? …and others. Bring your votes, your debates, and your opinions. Help us figure out who’s right and who wrong…or less right.

Session Level: Intermediate

My second presentation is on career management.

Career Management for Data Professionals

Career Success in Data Management during Turbulent Times: A workshop on issues and ideas that today’s data professionals can do to build their careers and networking skills with other data management professionals. Workshop topics will include: • Demonstrating your expertise • Building a portfolio of your success stories • Getting others to sell your skills and business value • Building & extending your data management skill set • 10 Steps to highlighting you and your work Bring your thoughts, ideas, and experiences.

Session Level: Beginner

There are many great speakers at this event and it’s FREE for a full day of learning. Registration is still open, but it is common for these events to sell out before the event. Register now!

Please Clean Up…

Aug 13, 2012   //   by Karen Lopez   //   Blog, Data, Database, Database Design, Fun, Project Management, Snark  //  2 Comments




Subscribe via E-mail

Use the link below to receive posts via e-mail. Unsubscribe at any time. Subscribe to by Email