Browsing articles tagged with " Database Design"

How Deep is My Non-Love? Nested Dependencies and Overly Complex Design

Dec 4, 2017   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, SQL Server, WTF  //  No Comments

Relational databases have this nifty concept of objects (just things, not code objects) being dependent upon other things.  Sometimes those dependencies exist due to foreign key constraints, others via references to other things.  One example of the latter can be found in VIEWs.  A database VIEW is an object that references TABLEs or other VIEWS.  Of course, if that VIEW references other VIEWs, then that view must reference TABLEs or another VIEW.  And it’s that or another VIEW that can get modelers into trouble.

I reviewed a database design that had massively dependent VIEWs.  How did I know that? I used a proper data modeling tool to look at all the dependencies for one central VIEW.  And this is what my data modeling tool showed me:

Data Model with hundreds of dependencies (lines) between a handful of objects (squares)

That diagram shows how ONE VIEW is related to a whole bunch of other VIEWs and TABLEs in that design.  In reviewing the model, I saw that many of the VIEWs appeared to be duplicates or had very high overlap of content with other VIEWs. 

How do VIEWs Like This Happen?

There are many reasons one would created a nested VIEW.  Like anything in a hierarchy, you could have objects that could be used independently and as part of a group on a regular basis.  But that only explains one level of a VIEW hierarchy (nest).   What about VIEWs that are nested dozens are levels deep?  And why would a database have such a complex design around one VIEW?  These are the most common reasons I run into bad practices with VIEWs:

  • Designers who don’t understand the massive performance loss for massively nested VIEWS
  • Designers who design for theory, not for real world data stories
  • Designers who have no idea they are referencing another VIEW when they design their VIEW
  • Designers who are following a worst practice of creating a VIEW for every report and every window in an application
  • Designers who don’t collaborate with other designers and create their own set of VIEWs and dependencies
  • Designers who are compensated for doing work fast and not well
  • Designers who use DDL to do design, therefore never seeing the complexity of their designs
  • Data Governance policies that let anyone create objects in a database
  • A team environment were “everyone is a generalist”.

I could go on.  While I can’t go into details here, in my review I recommended complete refactoring of this overly complex design.  It is my guess this complexity was contributing to performance problems experienced in this application.  I also recommended that professional designer was used to refactor other issues with the database design.  I have no idea if this happened.  But I doubted that this application was going to meet its large scale web application goals.

Why Am I Sharing This?

Because so many design issues I find in reviews have the same causes for performance and data quality issues I’ve listed above.  I find that not using a real data modeling or design tool is the main contributing factor.  There’s a reason why physical world architects and engineers use drawings and architectural diagrams. Models are also how they make modifications successful to the items they build.

Yes, physical objects are different than software/application/database objects. My position is that these latter objects need models at least as much as buildings and devices do.  We need tools to reverse engineer objects, to view the dependencies, to search, and to assess.  In other words, to model.  Engineering data solutions requires engineering tools like data modeling tools.  And, yes, data engineers to understand how to use those tools and how to model out the unnecessary complexity.

The Key to Keys at the North Texas SQL Server User Group – 17 March

Mar 15, 2016   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, DLBlog, Speaking, SQL Server  //  No Comments

I’m visiting Dallas this week to speak at the North Texas SQL Server User Group this Thursday.  I’ll be speaking about keys: primary keys, surrogate keys, clustered keys, GUIDs, SEQUENCEs, alternate keys…well, there’s a lot to cover about such a simple topic.  The reason I put this presentation together is I see a lot of confusion about these topics. Some of it’s about terminology (“I can’t find anything about alternate keys in SQL Server…what the heck is that, anyway”), some of it is misunderstandings (“what do you mean IDENTITIES aren’t unique! of course they are…they are primary keys!”), some of it is just new (“Why the heck would anyone want to use a SEQUENCE?”).

We’ll be chatting about all these questions and more on Thursday, 17 March at the Microsoft venue in Irving, Texas starting at 6PM.

Attendance is free, but you need to register at http://northtexas.sqlpass.org/ to help organizers plan for the event.

Don’t worry if you don’t know about SQL Server or don’t use it: this presentation will focus on some SQL Server specific features, but the discussion is completely portable to other DBMSs.

So many of us have learned database design approaches from working with one database or data technology. We may have used only one data modeling or development tool. That means our vocabularies around identifiers and keys tend to be product specific. Do you know the difference between a unique index and a unique key? What about the difference between RI, FK and AK? These concepts span data activities and it’s important that your team understand each other and where they, their tools and approaches need to support these features. We’ll look at the generic and proprietary terms for these concepts, as well as where they fit in the database design process. We’ll also look at implementation options in SQL Server and other DBMSs.

Hope to see you there!

Database Design Throwdown, Texas Style

Jan 21, 2016   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, Database Design, DLBlog, Events, Fun, Snark, Speaking, SQL Server  //  3 Comments

SQLSaturday #461 - Austin 2016

It’s a new year and I’ve given Thomas LaRock (@@sqlrockstar | blog ) a few months to recover and ramp up his training since our last Throwdown.  The trophies from all my wins are really cluttering my office and I feel back that Tom has not yet had a chance to claim victory.  So we will battling again in just a few days.

I’ll be dishing out the knowledge along with a handkerchief for Tom to wipe up his tears at SQL Saturday #461 Austin, TX on 30 January 2016.  This full day community-driven event features real database professionals giving free presentations on SQL Server and Data Platform topics.  All you need to do is register (again, it’s free) before all the tickets are gone.

Database Design Throwdown

Speaker(s):  Karen Lopez Thomas LaRock

Duration: 60 minutes

Track: Application & Database Development

Everyone agrees that great database performance starts with a great database design. Unfortunately, not everyone agrees which design options are best. Data architects and DBAs have debated database design best practices for decades. Systems built to handle current workloads are unable to maintain performance as workloads increase.Attend this new and improved session and join the debate about the pros and cons of database design decisions. This debate includes topics such as logical design, data types, primary keys, indexes, refactoring, code-first generators, and even the cloud. Learn about the contentious issues that most affect your end users and how to avoid them.

One of the other great benefits of attending these events is that you get to network with other data professionals who are working on project just like yours…or ones you will likely work on at some point.

Join us an other data pros to talk about data, databases and projects. And make sure you give a #datahug to Tom after the Throwdown. He’s gonna need it.

You’re Doing it Wrong: Generalizations about Generalizations

What Could Go Wrong Complicated Data Model Thumbnail Plus Darth Vader

I have a couple of presentations where I describe how generalized data modeling can offer both benefits and unacceptable costs.  In my Data Modeling Contentious Issues presentation, the one where we vote via sticky notes, we debate the trade-offs of generalization in a data model and database design.  In 5 Classic Data Modeling Mistakes, I talk about over-generalization.

Over the last 20 some years (and there’s more “some” there than ever before), I’ve noticed a trend towards more generalized data models.  The means that instead of having a box for almost every noun in our business, we have concepts that have categories.  Drawing examples from the ARTS Data Model, instead of having entities for:

  • Purchase Order
  • Shipping Notice
  • Receipt
  • Invoice
  • etc

…we have one entity for InventoryControlDocument that has a DocumentType instance of Purchase order, Shipping Notice, Receipt, Invoice, etc.

See what we did there?  We took metadata that was on the diagram as separate boxes and turned them into rows in a table in the database.  This is brilliant, in some form, because it means when the business comes up with a new type of document we don’t have to create a new entity and a new table to represent that new concept.  We just add a row to the DocumentType table and we’re done.  Well, not exactly…we probably still have to update code to process that new type…and maybe add a new user interface for that…and determine what attributes of InventoryControlDocument apply to that document type so that the code can enforce the business rules.

Ah! See what we did there this time?  We moved responsibility for managing data integrity from the data architect to the coders.  Sometimes that’s great and sometimes, well, it just doesn’t happen.

So my primary reason to raise generalization as an issue is that sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems.  Just because you engineered a requirement from a table to a row does not mean it is no longer your responsibility.  I’ve even seen architects become so enamoured with moving the work from their plate to another’s that they have generalized the heck out of everything while leaving the data quality responsibility up to someone else.  That someone else typically is not measured or compensated for data integrity, either.

Sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems

Alec Sharp has written a few blog posts on Generalizations. These posts have some great examples of his 5 Ways to Go Wrong with Generalisation.   I especially like his use of the term literalism since I never seem to get the word specificity out when I’m speaking. I recommend you check out his 5 reasons, since I agree with all of them.

1 – Failure to generalize, a.k.a. literalism

2 – Generalizing too much

3 – Generalizing too soon

4 – Confusing subtypes with roles, states, or other multi-valued characteristics

5 – Applying subtyping to the wrong entity.

By the way, Len Silverston and Paul Agnew talk about levels of generalization in their The Data Model Resource Book, Vol 3: Universal Patterns for Data Modeling book (affiliate link).  Generalization isn’t just a yes/no position.  Every data model structure you architect has a level of generalization.

Every data model structure you architect has a level of generalization.

I’m wondering how many of you who have used a higher level of generalization and what you’ve done to ensure that the metadata you transformed into data still has integrity?

Leave your recommendations in the comments.

Update: I updated the link to Alec’s blog post.  Do head over there to read his points on generalization.

What Your Database Security Design….

Jul 15, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database Design, DLBlog, Fun, Snark, WTF  //  3 Comments

…looks like to me.

Sure, you’ve got your own home-grown database security system all designed and working in development.  And then you ask me to confirm that it’s “safe”. I’ll tell ya “it’s safe as long as you don’t actually put any data in it”.

Hacking Database Design

Jul 14, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, DLBlog, Fun, Snark, WTF  //  4 Comments

DBDesignAdaptor

I get asked to help teams increase the performance of their database (hint: indexes, query tuning and correct datatypes, in that order)  or to help the scale it out for increasing workloads.  But when I open it up to take a look, I see something that looks more like this meme.

All those cheats, workarounds and tricks they’ve used are going to make the engine optimizers work harder, make the tuning of queries that much harder and in the end it’s going to cost so much more to make it “go faster” or “go web scale”.

Where are the nail clippers in your data models and databases?

27 June Big Challenges with Data Modeling: Data Modelers and DBAs

This month’s webinar (2PM EDT/11AM PDT) focuses on how DBAs and Data Modelers can collaborate better.  You know what it’s like: we both love our data and what databases to purr.  But what happens when our points of views differ?  Is performance more important that some data quality?  Should we all be using the same tools?  Who should be calling the shots?  Who should be “in charge”?  What deliverable should we be preparing?

Our webinars are very interactive.  We open the chat to everyone and offer a formal Q&A for asking questions and giving feedback to the panelists.  You can even join early to chat with us and we tend to stay about after to have “off the record” follow up conversations.  So this isn’t a lecture, but an online community event to chat about what we can do as a team to provide data and database solutions to our organizations.

We are sponsored this month by CA Technologies, the makers of CA ERwin®.  Thanks, ERwin team!

You’ll need to pre-register, but it’s free.  Do it now.  It will be fun.

 

About the Presentation

We invite you to join us in this monthly DATAVERSITY webinar series, “Big Challenges with Data Modeling” hosted by Karen Lopez. Join Karen and two or more expert panelists each month to discuss their experiences in breaking through these specific data modeling challenges. Hear from experts in the field on how and where they came across these challenges and what resolution they found. Join them in the end for the Q&A portion to ask your own questions on the challenge topic of the month.

This Month’s Panelists

  • Yanni Robel, Lead DBA, Database Operations Team at Amazon.com
    • Yanni (@yannirobel)  manages the Database Operations team at Amazon.com in Seattle, WA. She is a very active Senior Database Administrator as well, leveraging more than a dozen years of experience working with Microsoft SQL Server products and technologies.Yanni specializes in bridging communications between the business, the developers, the sysadmins, and the DBAs to help everyone understand what it is that they depend on. As an active member in the SQL community, previous developer, and parent, Yanni knows the importance of knowing what you don’t know and leveraging the collective knowledge of others and the community to learn and grow. Yanni is pleased to give back to peers, coworkers (past & present), and recruiters by sharing knowledge.
  • Javed Mati, Sales Director, Myriad Solutions UK Ltd
    • [Bio Pending]
  • Paul Agnew
    • Paul Agnew is an author, consultant and speaker with more than 20 years experience in the data modeling and data integration fields in many different industries. He is the co-author (With Len Silverston) of The Data Model Resource Book Volume 3: Universal Patterns in Data Modeling, which describes universal data modeling patterns. Mr. Agnew, has been a featured speaker at the prestigious DAMA International conference, and has presented papers on many different aspects of data management field.Paul has extensive, hands-on experience working in many different industries, including financial services, insurance and re-insurance, health care, healthcare informatics, sales and marketing and manufacturing. Paul is an expert in solution architecture, data architecture, data quality, master data management, data warehousing, big data and data governance.

RegisterNow

Pages:123»

Subscribe via E-mail

Use the link below to receive posts via e-mail. Unsubscribe at any time. Subscribe to www.datamodel.com by Email


Categories

Archive

UA-52726617-1