Data Modeling is Iterative. It’s not Waterfall

Mar 7, 2014   //   by Karen Lopez   //   Blog, Data, Data Governance, Data Modeling, Database Design, DLBlog  //  7 Comments

Sure, data modeling is taught in many training classes as a linear process for building software.  It usually goes something like this:

  1. Build a Conceptual Data Model.
  2. Review it with users.
  3. Build a Logical Data Model.
  4. Review it with users.
  5. Build a Physical Data Model.
  6. Give it to the DBA.
  7. GOTO step one on another project.

And most team members think it looks like this:


Training classes work this way because it’s a good way to learn notations, tools and methods.  But that’s not how data modeling works when the professionals do it on a real project.

Data modeling is an iterative effort. Those iterations can be sprints (typical for my projects) or longer intervals. Sometimes the iterations happen just within the effort to complete the data models, prior to generating a database.  But it's highly iterative, just like the software development part of the project. 

In reality, data modeling looks more like this:

Data Model Driven Development - Karen Lopez

This is Data Model-Driven Development.  The high-level steps work like this:

  1. Discuss requirements.
  2. Develop data models (all of them, some of them, one of them).
  3. Generate Databases, XML schemas, file structures, whatever you might want to physically build. Or nothing physical, if that’s not what the team is ready for. 
  4. Refine.
  5. Repeat.
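To make step 3 concrete, here is a minimal sketch, in Python, of deriving one kind of physical artifact (SQL DDL) from a logical model. The `Entity` and `Attribute` classes and the type map are hypothetical, not taken from any particular modeling tool:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a tiny logical model that can regenerate
# its physical DDL on every iteration of the loop above.

@dataclass
class Attribute:
    name: str
    logical_type: str   # e.g. "integer", "text", "date"
    nullable: bool = True

@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)

# Illustrative logical-to-physical type mapping.
TYPE_MAP = {"integer": "INT", "text": "VARCHAR(255)", "date": "DATE"}

def generate_ddl(entity: Entity) -> str:
    """Derive a CREATE TABLE statement from one logical entity."""
    cols = []
    for attr in entity.attributes:
        col = f"  {attr.name} {TYPE_MAP[attr.logical_type]}"
        if not attr.nullable:
            col += " NOT NULL"
        cols.append(col)
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(cols) + "\n);"

customer = Entity("customer", [
    Attribute("customer_id", "integer", nullable=False),
    Attribute("full_name", "text", nullable=False),
    Attribute("signup_date", "date"),
])
print(generate_ddl(customer))
```

The point of the sketch is the loop, not the generator: when the model is refined in step 4, re-running the generation keeps the model and the database in sync instead of letting them drift apart.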

These, again, are small intervals, not the waterfall steps of an entire project.  In fact, I might do this several times even in the same sprint. Not all modeling efforts lead to databases or physical implementations.  That’s okay.  We still follow an iterative approach.  And while the steps here look like the same waterfall list, they aren’t the same.

  • There isn’t really a first step.  For instance, I could start with an in-production database and move around the circle from there.
  • We could start with existing data models. In fact, that’s the ideal starting point in a well-managed data model-driven development shop.
  • The data models add value because they are kept in sync with what’s happening elsewhere – as a natural part of the process, not as a separate deliverable.
  • The modeling doesn’t stop.  We don’t do a logical model, then derive a physical model, throwing away the logical model.
  • Data modelers are involved in the project throughout its lifecycle, not just some arbitrary phase. 
  • Modeling responsibilities may be shared among more roles.  In a strong data model-driven process, it is easier for DBAs and BAs to be hands-on with the data models.  Sometimes even users.  Really.

By the way, this iterative modeling approach isn't unique to data models.  All the models we might work on for a project should follow this process: class diagrams, sequence diagrams, use cases, flow charts, etc. should all be kept current this way to deliver the value that has been invested in them.  That's what Agile means by "the right amount of [modeling] documentation". Data model-driven development means that models are "alive".  

If you are a modeler and you are reinforcing the wrong perception that data modeling needs a waterfall-like approach, you are doing it wrong.  You might be causing more pain for yourself than for anyone else on your project.

Data models aren’t just documentation checklist items.  They model the reality of a living, breathing system at every point in its life.  They deliver value because they are accurate, not because they are “done”.


  • I like the diagram very much! It’s pretty naive to think that IT development is waterfall. Even when it was supposed to be, it wasn’t. This has many reasons. It is mostly because IT development is in discovery mode most of the time. Requirements are uncovered and addressed by developers at all points in the project life-cycle. Data modelers need to embrace this iterative approach. Keeping in sync and current is a top priority for our models to have value and not just be a picture hanging on a cubicle wall.
    Tom Bilcze recently posted: The 5 W’s and the conceptual data model

    • Thanks, Tom. I agree. When I come on to a project, I have to clear this up, sometimes even with the modelers. I want to blog some more about what this means for the modeler in their day-to-day jobs.
      Karen Lopez recently posted: Rob Ford’s Achilles Data Management

  • How do you see the real-life workflow for data warehouse data model development when it has to happen at the same time as the source system data model work (in various scenarios: changes to an existing model or a brand new one)? We have tried approaches where we stayed a month or two behind the source, so that their design is stable and we have some data (from their QA) to increase our understanding of the source data models. But the timelines did not work for our business users, who needed reports as soon as source system changes were launched into production. We now follow along with the source systems, and that causes so much rework in the data warehouse model design and frustrates everyone. Plus there is the risk that assumptions about the source data models turn out to be wrong because the source developers did not understand the source data modeler’s design intent.

    Add Agile to the mix and things get more complicated. We have, at least for now, not experienced a project where the source is doing work in Agile and so is the data warehouse model work.

  • Great article Karen,
    I do agree that data modeling is and always has been iterative, given that requirements change and due dates loom. I’ve also participated in waterfall modeling when we get involved very early. The requirements have not been started, but there is a high-level Business Justification or Definition Document. We use that to create the CDM, but then it may be a couple of months before we get the details needed to complete a first-draft LDM. That puts us in waterfall mode. But that is short-lived, and we go into iterative mode as requirements continue to evolve. The physical designers are often creating the PDM at the same time that the LDM is being created. In a few cases, they wait for the LDM and then forward engineer the PDM.

    When resources are limited, we have to move on to the next two projects and miss the opportunity to work more closely with the physical designers to see the end result of what was implemented. Whether waterfall or iterative, through data modeling we gain additional business knowledge that helps in understanding requirements and communicating with our business partners when the next opportunity presents itself.

    • I’d say that’s a typical approach. I do expect, though, that data modelers still have roles on the physical design phases. I think the process is broken when they don’t. Builders are too motivated to break the logical design and requirements when there are no architects measuring and monitoring their work.

      I realize that this is often a resourcing issue, but that just means that there’s a resource issue that management needs to fix.
      Karen Lopez recently posted: Your #1 Job….

  • Yes, I agree.

