
You’re Doing it Wrong: Generalizations about Generalizations

Nov 17, 2010   //   by Karen Lopez   //   Blog, Data, Data Modeling  //  3 Comments

I have a couple of presentations where I describe how generalized data modeling can offer both benefits and unacceptable costs.  In my Data Modeling Contentious Issues presentation, the one where we vote via sticky notes, we debate the trade-offs of generalization in a data model and database design.  In 5 Classic Data Modeling Mistakes, I talk about over-generalization.

Over the last 20-some years (and there’s more “some” there than ever before), I’ve noticed a trend towards more generalized data models.  This means that instead of having a box for almost every noun in our business, we have concepts that have categories.  Drawing examples from the ARTS Data Model, instead of having entities for:

  • Purchase Order
  • Shipping Notice
  • Receipt
  • Invoice
  • etc

…we have one entity, InventoryControlDocument, that has a DocumentType instance of Purchase Order, Shipping Notice, Receipt, Invoice, etc.

See what we did there?  We took metadata that was on the diagram as separate boxes and turned them into rows in a table in the database.  This is brilliant, in some form, because it means when the business comes up with a new type of document we don’t have to create a new entity and a new table to represent that new concept.  We just add a row to the DocumentType table and we’re done.  Well, not exactly…we probably still have to update code to process that new type…and maybe add a new user interface for that…and determine what attributes of InventoryControlDocument apply to that document type so that the code can enforce the business rules.
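To make that trade-off concrete, here’s a minimal sketch of the generalized design.  The table and column names are illustrative, not taken from the ARTS model, and I’m using SQLite from Python purely for demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Generalized design: document types live as rows, not as separate tables.
cur.executescript("""
CREATE TABLE DocumentType (
    document_type_id INTEGER PRIMARY KEY,
    name             TEXT NOT NULL UNIQUE
);
CREATE TABLE InventoryControlDocument (
    document_id      INTEGER PRIMARY KEY,
    document_type_id INTEGER NOT NULL REFERENCES DocumentType,
    document_date    TEXT NOT NULL
);
""")

# Seed the original document types.
cur.executemany("INSERT INTO DocumentType (name) VALUES (?)",
                [("Purchase Order",), ("Shipping Notice",),
                 ("Receipt",), ("Invoice",)])

# A brand-new business concept needs no new table -- just a new row.
cur.execute("INSERT INTO DocumentType (name) VALUES ('Credit Memo')")
conn.commit()

print(cur.execute("SELECT COUNT(*) FROM DocumentType").fetchone()[0])  # 5
```

Adding “Credit Memo” required no DDL change at all, which is exactly the appeal of the pattern.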

Ah! See what we did there this time?  We moved responsibility for managing data integrity from the data architect to the coders.  Sometimes that’s great and sometimes, well, it just doesn’t happen.

So my primary reason to raise generalization as an issue is that sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems.  Just because you engineered a requirement from a table to a row does not mean it is no longer your responsibility.  I’ve even seen architects become so enamoured with moving the work from their plate to another’s that they have generalized the heck out of everything while leaving the data quality responsibility up to someone else.  That someone else typically is not measured or compensated for data integrity, either.
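One way to keep that responsibility in the database rather than scattered through application code is to make the per-type rules declarative.  Here’s a sketch, again with illustrative names and a hypothetical rule (“an Invoice must have a vendor”), showing a trigger doing the enforcement that a separate-entities design would have gotten from NOT NULL for free:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE DocumentType (
    document_type_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE InventoryControlDocument (
    document_id      INTEGER PRIMARY KEY,
    document_type_id INTEGER NOT NULL REFERENCES DocumentType,
    vendor_id        INTEGER  -- applies only to some document types
);
INSERT INTO DocumentType (name) VALUES ('Purchase Order'), ('Invoice');

-- Keep the per-type rule in the database: reject an Invoice with no vendor.
CREATE TRIGGER invoice_requires_vendor
BEFORE INSERT ON InventoryControlDocument
WHEN NEW.vendor_id IS NULL
     AND NEW.document_type_id =
         (SELECT document_type_id FROM DocumentType WHERE name = 'Invoice')
BEGIN
    SELECT RAISE(ABORT, 'Invoice requires a vendor_id');
END;
""")

try:
    cur.execute("""INSERT INTO InventoryControlDocument
                   (document_type_id, vendor_id)
                   VALUES ((SELECT document_type_id FROM DocumentType
                            WHERE name = 'Invoice'), NULL)""")
except sqlite3.IntegrityError as e:
    print(e)  # Invoice requires a vendor_id
```

Every new row in DocumentType potentially means another rule like this, which is the governance work that too often never gets done.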

Alec Sharp has written a few blog posts on generalization, with some great examples of his 5 Ways to Go Wrong with Generalisation.   I especially like his use of the term literalism, since I never seem to get the word specificity out when I’m speaking. I recommend you check out his 5 ways, since I agree with all of them.

1 – Failure to generalize, a.k.a. literalism

2 – Generalizing too much

3 – Generalizing too soon

4 – Confusing subtypes with roles, states, or other multi-valued characteristics

5 – Applying subtyping to the wrong entity.

By the way, Len Silverston and Paul Agnew talk about levels of generalization in The Data Model Resource Book, Vol. 3: Universal Patterns for Data Modeling.  Generalization isn’t just a yes/no question.  Every data model structure you architect has a level of generalization.

I’m wondering how many of you have used a higher level of generalization, and what you’ve done to ensure that the metadata you transformed into data still has integrity.  Leave your recommendations in the comments.

Gift Me Baby One More Time: A Nifty Idea

Nov 17, 2010   //   by Karen Lopez   //   Blog, Fun, Reviews  //  1 Comment

A few weeks ago I was tweeting about my regular use of GoToMeeting.  Nearly all my projects for the last few years have been with geographically dispersed teams and being able to collaborate on documents, models and other artefacts in real time has been a vast improvement over conference calls where someone says “at the bottom of page 42”…”oh, you’ve printed on A4? I don’t know what page it would be then”…etc.

So the folks at GoToMeeting noticed that I tend to tweet about their product, and they sent me a gift via twitgift on Twitter.  What was unique about this gift was that they sent it to my Twitter ID, @datachick, instead of ordering something to be sent to my address.  Their order sent me a tweet saying they had sent me a gift, and twitgift also contacted me via Twitter to ask for my contact information so that they could ship the gift to me.

Tweet Gift Notice

Tweet Accepted gift

 Tweet Shipping Notice

In fact, the mailing label had just @datachick on it.

Tweet Shipping Label @datachick

The first obstacle for me in claiming my gift was that right now twitgift only ships to US addresses. I could have directed the gift to another address (regifting FTW!), but I really wanted to see how this all worked out.  So I provided the address of our friends Yanni and John, and they delivered it to me when we were in Seattle for SQLPASS. Did I tell you what great friends they are?

In the tradition of unboxing blog posts, I bring you the twitgift…

Outside box

Packed in cotton balls, playing up the clouds/birds theme.  There was also confetti and stickers.  A gift within a gift…

Inside Box

The inner box contained chocolate chip cookies, wrapped two by two.

Gift goodies: cookies, confetti, stickers

Cookies and stickers, again, FTW.

As of the time of this writing, twitgift offers only the cookies as a gift, but they are hoping to expand to other merchandise. Their FAQ also says that they hope to expand to other countries soon.

I did not and do not receive any compensation for this post…other than the gift from GoToMeeting.  I chose to write this post as a thank you for the thank you gift*, to help a new Twitter-related service that I believe has a great idea.  I guess technically I did get something, but not for writing this post.  Got it?

If I ever do write a sponsored tweet or blog post, I will definitely let you know, right up front.

Do you have Twitter friends you’d like to send an affordable gift to?  Now you know how to do that without the creepy “where do you live” question to someone you only know through Twitter.

* I’m told that giving a thank you for a thank you (a metathankyou) means you end up in an infinite loop, so I won’t do that. But thanks….

Creating an ERwin Data Modeler Template

Nov 16, 2010   //   by admin   //   Blog, Data, Data Modeling  //  3 Comments

You probably have some default settings or shop standards (Yes, I have standards!) for your ERwin data models.  A time-saving way to ensure that you don’t have to set those defaults for every model file is to create a data model template, then use it when you create a new model.

The steps for creating and using a template:

  1. Open ERwin Data Modeler with a blank model (choose the type: Logical, Physical, or Logical/Physical)
  2. Set your defaults, create standard objects like domains or UDTs or UDPs – anything you’d like to appear or be enforced in future data models that make use of this template.
  3. Save the model as a template:

    Save As Template

    Notice that I’ve chosen to use underscores in the template file name.  ERwin seems to force the name to all upper case, so spaces or underscores will be needed to make the file name readable when you go to open it again.  The file will be saved with a special extension, .erwin_tmpl.  This tells ERwin it is a template file.

    Once you have saved this, it will be available under the FILE, NEW menu option for use when you create new models:

  4. Once you have this open, you’ll need to remember to do an immediate FILE, SAVE AS… to save it as a regular .erwin file.

At this point you have created an ERwin template file and used it to create a model that will have the reusable objects you want to share as well as enforce some standards.

Setting the Default Datatype in ERwin Data Modeler

Nov 15, 2010   //   by Karen Lopez   //   Blog, Data, Data Modeling  //  1 Comment

Out of the box, CA ERwin Data Modeler has a default datatype of char(18).  This default is used when you create an attribute or column in a data model and don’t specifically assign a datatype to that attribute.

I don’t know this for a fact, but I’m guessing the developers of ERwin chose this odd length so it would stand out, reminding modelers to choose a “real” datatype and length at some point.  You can see in the figure below that I’ve added an attribute named test to the MOVIE entity, but did not assign a datatype, so ERwin gave it the default datatype of char(18).

MOVIE Entity

However, one of the problems with this is that it is still not odd enough.  One of the tips I picked up while speaking at the BeNeLux Modeling User Group was that you can set the default datatype in ERwin to something especially funky: char(-1).  That datatype is certain to:

  • Stand out in both the data model diagram and any DDL
  • Not generate valid DDL
  • Fail any DDL testing
  • Make a DBA really mad
  • Motivate database designers to set real datatypes

Sounds to me like the perfect datatype for being a default.

Defaults in ERwin are model-specific, meaning that you set them on a .erwin file basis.  To do this in ERwin 7.3, you must first open your model, then go to MODEL, MODEL PROPERTIES:


MODEL, MODEL Properties


This will bring up the properties dialog.  Choose the DEFAULTS tab.

Defaults Tab

In the Default datatype section above, you can set the default datatype to char(-1) or whatever makes sense for your project.  I’ve seen some people make a good guess at the most generic datatype, something like varchar(50), but I prefer to assign an oddball datatype so that I can find these “unassigned” attributes and columns easily.  Note that you can also set separate defaults for the logical and physical models.  Since I’m setting the default to an invalid datatype, I set them the same.

Having said this, I rarely create attributes manually. I usually create attributes via the domain approach, where domains have a “best guess” standard datatype that makes sense for the type of data contained in them.  For instance, a domain for email address would normally be varchar(254) in a model, since that is the Internet standard for email addresses.  However, attributes do get created without a domain all the time, so it makes sense to have a default datatype that meets your needs.

Having set my default datatype in my model to char(-1), when I create a new attribute and don’t assign a datatype it will be set to the default as shown in the MOVIE entity below:

Movie Entity

The key to using an invalid datatype is to ensure that you’ve circled back and set them all to real datatypes and lengths before you derive a new model, generate DDL, or compare to a database.

You can use a report or the ERwin Query Tool to find all the attributes that have this datatype:

ERwin Query Tool

More on the ERwin Query Tool in another blog post soon.
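If you’d rather catch the sentinel outside the modeling tool as well, a quick script over the generated DDL works too.  This is just a sketch — the DDL text and the char(-1) sentinel match the example above, but the scan itself is mine, not an ERwin feature:

```python
import re

# Sentinel datatype used as the model default (see above).
SENTINEL = re.compile(r"char\(\s*-1\s*\)", re.IGNORECASE)

def find_unassigned(ddl: str):
    """Return (line_number, line) pairs whose datatype is still the sentinel."""
    return [(n, line.strip())
            for n, line in enumerate(ddl.splitlines(), start=1)
            if SENTINEL.search(line)]

# Example DDL as a modeling tool might generate it, with one column
# left at the sentinel default.
ddl = """
CREATE TABLE MOVIE (
    movie_id   integer NOT NULL,
    title      varchar(200),
    test       char(-1)
);
"""

for n, line in find_unassigned(ddl):
    print(f"line {n}: {line}")
```

Run against your real DDL files, this gives you a checklist of “unassigned” columns before the script ever reaches a DBA.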

Getting SQL Server Help from the Best, Simply the Best

Nov 15, 2010   //   by Karen Lopez   //   Blog, Database, Social Networking  //  2 Comments

"Tina Turner" at SQLPASS

Do you have a question about SQL Server database design or how things work?  Are you experienced in another RDBMS and trying to apply a DB2/Oracle/whatever design principle to SQL Server?

If you have a Twitter account you can post your question (in 140 characters or less) with the #SQLHelp hashtag and most likely someone will jump in to answer the question.

Thousands of SQL Server experts (many of whom do not dress as Tina Turner) want you to be a SQL Server success, so they monitor this hashtag to answer your question or give you tips on how to address an issue.  Of course the format of Twitter means that you probably won’t want to ask about the meaning of life (A:42) or big complex issues, but even then there’s probably a blog post that someone can point you to.

Some recent examples of questions and answers to this tag:


I’ve made use of this “service” on Twitter a few times when I needed help with a physical design consideration.  It’s great that the SQL community is willing to help like this.  Perhaps we should be posting questions under hashtags like #ERwinHelp and #ERStudioHelp, too.

Computer Engineer Barbie attends the PASS Women in IT Lunch

Nov 15, 2010   //   by Karen Lopez   //   Blog, Fun  //  1 Comment

In addition to Buck and Lil Buck, Computer Engineer Barbie attended the PASS WIT Luncheon with me.

Computer Engineer Barbie at PASS WIT

Buck Woody and Lil Buck





I’m SQLPASSed out – My Trip Report from Seattle

Nov 15, 2010   //   by Karen Lopez   //   Blog, Database, Professional Development, Social Networking, Speaking  //  2 Comments


I spent last week in Seattle attending the Professional Association for SQL Server (PASS) Summit (as if you could not tell from my tweet stream). Like most successful conferences, there was more to it than a bazillion sessions.  There were so many events: vendor events, unofficial events, pre-cons, post-cons, impromptu events and more.  I feel as if I couldn’t stretch myself far enough to take advantage of them all.  Did I mention there were sessions, too?

We arrived in Seattle on Friday so that we could do our obligatory cross-border shopping at our favourite not-available-in-Canada retailers: Fry’s Electronics, Kohl’s, Target and Trader Joe’s.  It is my goal to shop at every Fry’s in the US; so far I have tackled more than half of them.  We didn’t buy much there, but it’s like visiting a gadget museum.  Geekiness for the win.  We spent time on Saturday shopping as well.  We were doing our best to help stimulate the US economy.

We also spent time visiting with friends, including Yanni Robel, John Robel, Corey Smith and Eva Smith.  We know Yanni and John from the SQLCruise.  We know Corey and Eva from numerous DAMA events.  But we built these friendships through the conversations we have via Twitter and Facebook.

On Sunday I participated in Freecon, a special event for bloggers and social media celebrities from the SQL Server world.  This event, hosted by Brent Ozar, covered topics such as developing content, dealing with plagiarism, monetizing writing, and consulting skills.  I’d love to see this event develop into a writers’ guild for those of us putting content out there. 

Monday was the opening session for the PASS Summit.  There was a Quiz Bowl “panel”, where SQL “elites” played a Jeopardy-like trivia game.  I’d love to see Enterprise Data World (EDW) offer a similar opening session event.

I’m not going to list all the sessions I attended, but I wanted to point out that there were several sessions that focused on data architecture topics.  My own session on “Starting with More than a Blank Page” focused on how to best adopt data model patterns/industry standard data models.  A session by Louis Davidson covered Intro to Database Design.  Normalization is a tough topic to cover in just over an hour and I thought he did a great job helping people visualize why the normal forms exist. 

It took me a while, but I managed to meet up with Neil Buchwalter, Product Manager of CA ERwin Data Modeler.  CA is a founding organization of PASS and Neil has a spot on the PASS Board.

Wednesday was SQLKilt day.  What a hoot. There were many attendees dressed in kilts, mostly just because they could, but also in honour of the Women in IT.  In fact, Jen McCown of MidnightDBA fame produced t-shirts to link the kilts to WIT.  That’s Sean McCown there, sporting a kilt and the t-shirt that asks “What are you doing” for women in IT.

Did I mention that several people found new jobs/gigs right at the conference?  There was a job board in the exhibits area, but most of these “connections” happened ad hoc during the lunches, coffee breaks and informal get-togethers that surrounded the event.  As I tweeted, #Network to #Getwork.

Network to Getwork

As I have blogged before, one of the best things about attending SQLPASS was that Rob and I knew hundreds of the attendees, even though it was my first time attending this event.  The connections we made on Twitter, Facebook and LinkedIn prior to the conference paid off in spades for us during the event.  Sure, I was a first timer, but it wasn’t a bunch of first conversations. For me, this is the real power of tying together virtual and in-person events.

I’m bringing home a DVD of the new SQL Server version, codenamed “Denali”.  I’m looking forward to getting it installed, then moving on to all the evaluation licenses I received for third-party tools.

I will definitely be back for next year’s PASS Summit, again in Seattle in October 2011, and for SQLRally 2011 in Orlando in May.  I want to continue learning about SQL Server and making connections with industry leaders in this space.
