My March Dataversity webinar will focus on one of the most challenging aspects of data governance and data modeling in 2016 — working on agile and other modern development method projects while maintaining data stewardship, data quality and data protection as project teams are sprinting by….with or without you.
Most of us learned data modeling via a waterfall-driven methodology lens. Yet Agile and other modern development methods have for the most part assumed that data governance is an anti-pattern to just getting things (software) done. We’ll look at questions such as:
- Are Agile and Data Governance Enemies?
- How can we get stuff done AND get systems delivered?
- And what do we do about existing systems delivered without data governance attention?
We’ll also look at how data modeling fits in the answers to these questions.
If you join us live during the webinar, you have the opportunity to engage with all the other attendees, share your stories, ask your questions and be part of the discussion. If you can’t attend the live event, you can still register to get the slide deck and links to the recording when they are available.
You can join a bit early to listen in to the pre-show and stay a bit later when I do off-the-record Q&A. Just between the few of us.
I get asked to help teams increase the performance of their database (hint: indexes, query tuning and correct datatypes, in that order) or to help them scale it out for increasing workloads. But when I open it up to take a look, I see something that looks more like this meme.
All those cheats, workarounds and tricks they’ve used are going to make the engine optimizers work harder, make the tuning of queries that much harder and in the end it’s going to cost so much more to make it “go faster” or “go web scale”.
Where are the nail clippers in your data models and databases?
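As a small illustration of why indexes come first in that hint, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` table, the `ix_customer_email` index and the sample data are hypothetical, invented only for this example. It compares the query plan for the same lookup before and after an index exists.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan text

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM customer WHERE email = ?"
plan_before = query_plan(conn, lookup, ("user42@example.com",))

# Add an index on the column used in the WHERE clause, then check again.
conn.execute("CREATE INDEX ix_customer_email ON customer (email)")
plan_after = query_plan(conn, lookup, ("user42@example.com",))

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan is a full table scan; with it, the plan names the index. The exact wording of the plan text varies between SQLite versions, and other engines report plans differently.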
22 May 2014, 2PM EDT
It’s May, which sets this former Hoosier thinking of racetracks and Indy cars. I’m also a runner and that means I’m always thinking about pace and timings…and feeling guilty about not training hard enough.
This got me musing about how data modelers can speed up the data modeling process, not just during development projects, but at all points in our work day. So let’s have a discussion about
In this month’s webinar, we’ll talk about:
- The Need for Speed
- Sprints, marathons and training
- Race cars, horses, carts, and feet
- Qualifiers and Races
- Pace cars
- Backseat drivers
- Rules, tickets and enforcement
- Fads, gadgets and automation
- Red, yellow, green and checkered flags
- How do you know when to stop racing?
Joining me in the discussion will be two wonderful panellists:
Donna Burbank, VP, Information Management Services at Enterprise Architects ( @donnaburbank )
Carol Lehn, MDM Database Designer at PepsiCo ( @lehnca )
And as usual, our attendees will have the opportunity to participate via chat and Q&A as our final panellist.
Sure, data modeling is taught in many training classes as a linear process for building software. It usually goes something like this:
- Build a Conceptual Data Model.
- Review that with users
- Build a Logical Data Model
- Review that with users
- Build a Physical Data Model
- Give it to the DBA
- GOTO step one on another project.
And most team members think it looks like this:
Training classes work this way because it’s a good way to learn notations, tools and methods. But that’s not how data modeling works when the professionals do it on a real project.
Data modeling is an iterative effort. Those iterations can be sprints (typical for my projects) or have longer intervals. Sometimes the iterations exist just between efforts to complete the data models, prior to generating a database. But it’s highly iterative, just like the software development part of the project.
In reality, data modeling looks more like this:
This is Data Model-Driven Development. The high-level steps work like:
- Discuss requirements.
- Develop data models (all of them, some of them, one of them).
- Generate Databases, XML schemas, file structures, whatever you might want to physically build. Or nothing physical, if that’s not what the team is ready for.
These, again, are small intervals, not the waterfall steps of an entire project. In fact, I might do this several times even in the same sprint. Not all modeling efforts lead to databases or physical implementations. That’s okay. We still follow an iterative approach. And while the steps here look like the same waterfall list, they aren’t the same.
- There isn’t really a first step. For instance, I could start with an in-production database and move around the circle from there.
- We could start with existing data models. In fact, that’s the ideal starting point in a well-managed data model-driven development shop.
- The data models add value because they are kept in sync with what’s happening elsewhere – as a natural part of the process, not as a separate deliverable.
- The modeling doesn’t stop. We don’t do a logical model, then derive a physical model, throwing away the logical model.
- Data modelers are involved in the project throughout its lifecycle, not just some arbitrary phase.
- Modeling responsibilities may be shared among more roles. In a strong data model-driven process, it is easier for DBAs and BAs to be hands-on with the data models. Sometimes even users. Really.
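The "generate" step in the cycle above can be sketched in a few lines of Python. This is a toy illustration, not how any particular modeling tool works: the `Column` and `Entity` classes, the `to_ddl` function and the `customer` entity are all hypothetical names made up for this example. The point is only that when the model is the source, the database definition can be regenerated every iteration instead of being maintained by hand.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    datatype: str
    nullable: bool = True

@dataclass
class Entity:
    name: str
    columns: list
    primary_key: list = field(default_factory=list)

def to_ddl(entity):
    """Render one entity of the (hypothetical) model as a CREATE TABLE statement."""
    lines = []
    for col in entity.columns:
        null_clause = "" if col.nullable else " NOT NULL"
        lines.append(f"    {col.name} {col.datatype}{null_clause}")
    if entity.primary_key:
        lines.append(f"    PRIMARY KEY ({', '.join(entity.primary_key)})")
    body = ",\n".join(lines)
    return f"CREATE TABLE {entity.name} (\n{body}\n);"

# A tiny model; in practice this would come from the modeling tool.
customer = Entity(
    name="customer",
    columns=[
        Column("customer_id", "INTEGER", nullable=False),
        Column("name", "TEXT", nullable=False),
        Column("email", "TEXT"),
    ],
    primary_key=["customer_id"],
)

ddl = to_ddl(customer)
print(ddl)

# Check that the generated statement is valid by running it on a scratch database.
scratch = sqlite3.connect(":memory:")
scratch.execute(ddl)
```

Changing the model and running the generation again is the whole loop. Going the other way, reading an in-production database back into a model, is the reverse-engineering entry point mentioned in the list.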
By the way, this iterative modeling approach isn’t unique to data models. All the models we might work on for a project should follow this process. Class diagrams, sequence diagrams, use cases, flow charts, etc. should all follow this process to deliver the value that has been invested in them. That’s what Agile means in “the right amount of [modeling] documentation”. Data model-driven development means that models are “alive”.
If you are a modeler and reinforcing the wrong perceptions of needing a waterfall-like approach to data modeling, you are doing it wrong. You might be causing more pain for yourself than anyone else on your project.
Data Models aren’t just documentation checklist items. They model the reality of living, breathing systems at all points in their lives. They deliver value because they are accurate, not because they are “done”.
I’ll be joining host Graeme Simsion (@graemesimsion) and panelists Terry Buino, John Giles, and Chris Woodruff for a lively and (I’m hoping) contentious discussion about Data Modeling in an Agile Environment. This free webinar event will be hosted by Dataversity.net as part of their monthly series on data management issues.
We invite you to join us in this monthly DATAVERSITY webinar series, “Big Challenges with Data Modeling” hosted by Graeme Simsion. Join Graeme and two or more expert panelists each month to discuss their experiences in breaking through these specific data modeling challenges. Hear from experts in the field on how and where they came across these challenges and what resolution they found. Join them in the end for the Q&A portion to ask your own questions on the challenge topic of the month.
We four panelists come from a variety of backgrounds. Two of us are Microsoft MVPs, one just wrote a book on Agile Data Modeling and another calls himself a born-again agilist. Graeme is always the stimulating and controlling jocular host at these events…he’d have to be to manage the characters he has to herd in just a short period of time.
Many data architects and modelers tell me that they can’t or won’t work on agile projects. Or that they’ve heard that there are no data models in agile approaches. Worse, they’ve attended presentations by certain industry pundits who have been so anti-architecture that they don’t understand why anyone would attempt this. But it’s not that way in practice. Most of my projects over the last few years have been agile or SCRUM. So I’m going to bring my experiences and stories to the discussion, along with tips that I have for working on a software-focused delivery methodology.
Find out whether agile is your friend or frenemy.
All you have to do is register to attend. It’s free, too.
My friend Argenis Fernandez (blog | @DBArgenis) is the host of this TSQL Tuesday and he’s chosen the topic of Jack of All Trades, Master of None. This is one of my favourite discussions about the IT industry. My interest stems from the Agile Manifesto that says:
The best architectures, requirements, and designs emerge from self-organizing teams.
This one statement, which sounds wonderful to me, is often interpreted to mean:
Agile teams must be made up of generalists: no specialists allowed.
Another interpretation is that anyone on the team must be able to do any job on the project. Most rational agile teams don’t take that extreme interpretation. Or at least they never repeat that mistake more than a couple of times. They learn that having people who understand things at more than a surface level will make work go faster and with less rework.
Specialists vs. Generalists…or is it Specialists vs. Speed?
This is a very strong belief of one prominent Agilisto who is extremely vocal about this principle. His articles and posts are full of scathing attacks on people who specialize. Sure, he allows people to have a couple of specializations just for fun, but he’s clear that specialists impede the speed of a project, hold back production and generally lead to diva princesses, his name for me when I appear in debates with him. There’s also a prominent consultancy that tells clients that no one with the words "analyst, audit, architect, administrator" will be allowed to speak to anyone on the project; teams must be just business users and team members (developers). This consultancy is also adamant that only a developer and a clerk be able to decide on requirements and implementation issues.
I’ve been on those projects. We ended up with the clerk and the generalist designing and implementing a brand new way of doing accounting, with a brand new chart of accounts. Just for one project. They couldn’t get into QA testing because their solution was not going to pass audit, integration, security or generally accepted accounting practices reviews. But dang, they sure did it fast. And their new way of doing accounting led to inaccurate accounts, but that didn’t matter. They were fast. However, they were sent back to do it all over again. It was painful for everyone.
These are the types of things an architect, auditor, administrator and analyst would have slowed them down with by pointing out gaps in their solution. But dang, they sure did it fast.
Over Specialization and Over Generalization
I do recognize that people can be over-specialized. You see those people all over…if you ask a question, their answer always involves the same solution or tool. They can’t see any other way of doing something than what they know. I also know people who are fabulous at many, many things on my projects. But in my opinion, the all generalist meme really translates to:
Our team needs to be staffed with people who are specialists in everything.
Think about that for a few seconds. What would that mean, just on your current project? Someone who knows these topics at a professional level: database, network, security, design, data, storage, development, coding, planning, estimation, capacity planning, UX, reporting, analytics, scalability, reliability, availability, quality, testing, compliance, legislation, localization, globalization, privacy, accessibility for people with disabilities, methodology, development environments.
Now insert the words for all the technologies used on an enterprise system. All of them. We need professional level people to work with all of them. Note that professional doesn’t mean expert; it means someone who can get something done with minimal supervision.
Then insert all the words for all the activities your entire enterprise does. Do you have a few hundred words? A few thousand? Imagine trying to hire someone who meets all those criteria at the professional level. Even if you could find that person, which I don’t believe you can, how much are you going to have to pay her? Does your company have enough spare zeros hanging around to do that? What are they going to do when they need 100 more people, just like that?
Why I Hire Specialists who Make Great Generalists
So what does this mean? I want to hire people who have a broad understanding of IT development. I want them to have a good literacy-level understanding of most of the things we do and use. If they don’t have that knowledge, then they need to be able to pick it up as we go. But I need specialists. I don’t have time on my projects to train and mentor someone who is going to build the database on the difference between foreign keys, alternate keys, surrogate keys, primary keys and Florida Keys. Now if someone else on the team wants to know that, I’m happy to point to resources where they can find that out. However, my database designer needs to be able to work under minimal supervision to be able to do that. In fact, I’d prefer that they know how to implement it in our specific technology. They should be able to rely on external resources, but they shouldn’t have to sit at their desks with a (virtual) book open before them showing them how to do every step. That’s a recipe for disaster. It will take longer and be more error prone.
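For anyone on the team who does want the short version of those key types (minus the Florida ones), here is a minimal sketch using Python's built-in sqlite3 module. The `customer` and `customer_order` tables and their columns are hypothetical, and other database engines use different syntax and defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,      -- surrogate key, used as the primary key
    tax_number  TEXT NOT NULL UNIQUE,     -- alternate key: a second way to identify a row
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
        REFERENCES customer (customer_id), -- foreign key to the parent table
    ordered_on  TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customer (tax_number, name) VALUES ('TX-001', 'Acme')")

def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# The alternate key stops a second customer with the same tax number.
duplicate_alternate_key = rejected(
    "INSERT INTO customer (tax_number, name) VALUES ('TX-001', 'Other')"
)
# The foreign key stops an order for a customer that does not exist.
orphan_foreign_key = rejected(
    "INSERT INTO customer_order (customer_id, ordered_on) VALUES (999, '2014-05-22')"
)
print(duplicate_alternate_key, orphan_foreign_key)
```

Both inserts are refused by the database itself, which is the practical difference between declaring keys in the design and hoping application code gets it right.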
Be Both a Specialist and a Generalist
How can you do that? By ensuring that your professional development plan (you have one, don’t you?) includes activities that strengthen both your specializations and your overall technical and non-technical skills. That means you read about things outside your specialization. You actually sit through a DAMA meeting or a SQLSaturday session that isn’t part of your "today job". You expand the depth and breadth of your knowledge. Heck, you even attend a session on professional development. Then you make sure your plan gets executed, even if it means paying for your own training or getting up early on a Saturday to attend free training at a SQLSaturday. Maybe it means starting up a series of brown bag lunches at your company, where every group takes a turn presenting 20 minutes on their favourite topic for other groups.
If you are a data architect, it means learning more about process modeling, database implementations and development tools in your shop. If you are a DBA, it means learning more about data modeling and data compliance. If you are a developer, it means you learn more about all of the above. It’s up to you. You need to take care of both your inner generalist and inner specialist.
Generalists are great…in general. You can’t master everything, no matter what people tell you. But your specialization won’t be much value if you can’t apply that knowledge within the context of the overall project. You need to be both.
I’ve been doing this data modeling stuff for a really long time. So long that I just put "20+ years" on my slides…and I’m well past that 20. However, having done this for a long time does not qualify me as an expert. What qualifies me is that I know how to adapt my tools, methods and approaches as development tools and methods change. What worked for projects in 1986 doesn’t necessarily work now. Not only do we have radically more advanced features in our tools, we have many more platforms to support. We have a greater variety of development approaches. Our architectures are much more distributed and much more complex than they were in the nineties.
Back in the mid-eighties I worked in the US Defense consulting business. The somewhat serious joke was that consultants were paid $1 million for each inch of documentation they delivered. It didn’t matter what was in the models or whether they were correct; it only mattered how many reams of paper were delivered. That sort of "all documentation is great" mindset still existed well into 1999, long past its usefulness. The world has changed and we data architects need to find a replacement for the publication equivalent of shoulder pads, thongs, leggings, skinny ties, and lace fingerless gloves.
Those differences mean that the deliverables we produce need to be radically different from what they were in 1999. Our team members are no longer going to open up a 175-page prose document to find out how to use and understand the data model or database. They don’t want all that great metadata trapped in a MS Word document or, worse, buried in an image. They can’t easily search those and they can’t import that knowledge into their own models and tools.
As much as I don’t want this to be true, no one wants to read or use massive narratives any longer. Sure, if they are really, really stuck a quick write up might be helpful, but sometimes a quick video or screencast would be faster and easier to produce. If you are spending days or weeks writing big wordy documents, the sad truth is there is a high likelihood that absolutely no one except your mother is going to read or appreciate it…and she isn’t actually going to read it.
I’ve significantly changed what I produce when I release a data model. I constantly monitor how these deliverables are used. I look at server logs or general usage reports to continually verify that the time I’m spending on data-model related deliverables is adding value to the project and organization. The main way I gauge usefulness of deliverables is by how urgently my team members start bugging me to get them up on the intranet where they are published.
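Checking usage from server logs does not need much tooling. The following is a minimal sketch that counts successful requests per published deliverable. The log lines, the `/datamodels/` path prefix and the `count_hits` function are hypothetical; a real intranet server may use a different log layout, so the pattern would need adjusting.

```python
import re
from collections import Counter

# Matches the request and status parts of a common-log-format line.
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

def count_hits(log_lines, prefix="/datamodels/"):
    """Count successful GET requests for each path under the given prefix."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match["status"] == "200" and match["path"].startswith(prefix):
            hits[match["path"]] += 1
    return hits

# Made-up sample lines, for illustration only.
sample_log = [
    '10.0.0.5 - - [22/May/2014:14:00:01 -0400] "GET /datamodels/sales/index.html HTTP/1.1" 200 5120',
    '10.0.0.7 - - [22/May/2014:14:02:13 -0400] "GET /datamodels/sales/index.html HTTP/1.1" 200 5120',
    '10.0.0.7 - - [22/May/2014:14:03:40 -0400] "GET /datamodels/hr/index.html HTTP/1.1" 200 2048',
    '10.0.0.9 - - [22/May/2014:14:05:02 -0400] "GET /datamodels/old/report.doc HTTP/1.1" 404 312',
    '10.0.0.9 - - [22/May/2014:14:05:30 -0400] "GET /wiki/standards HTTP/1.1" 200 900',
]

hits = count_hits(sample_log)
for path, count in hits.most_common():
    print(count, path)
```

A deliverable that never shows up in a count like this is a candidate for the "stop producing it" pile.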
Here are my top 10 recommendations for approaching your data model deliverables:
- Get formal, lab-based hands-on training for staff. Or use staff that are already trained in the tools and version of the tools they are using. You may be missing out on features in the tools that make publishing data models much easier than the methods you are currently using. I had a client who was struggling to support an elaborate custom-developed application that they didn’t really know how to use or maintain. It used a deprecated data model format to build an HTML-based report of the data model. Sound familiar? Almost all tools provide a feature to generate such reports in seconds.
- Move away from very large, manual documentation. Think in terms of publishing data and models, not documents. Prose documents are costly to produce and maintain. They do more harm than no documentation at all when they are not maintained. They are difficult to search, share, and use. This is not how the vast majority of IT staff want to consume information. Team members want their data model data (metadata) in a format that is consumable, that can be used on a huge variety of platforms and that is interactive, not locked only in a PDF.
- Know how long it takes to produce every deliverable. Having this information makes it easier for you and your team to prioritize each deliverable. I once had a project manager ask if cutting back the automatically generated reports could save time for getting data modeling completed. I could show her that the total time to put the documents on the intranet was only about 5 minutes. My document production data also helps other modelers estimate how long a release will take to produce.
- Stop exporting data for features that can be done right in the tool. Move data model content that is locked in MS Word documents into the models or stop producing it. Exporting diagrams as images and marking them up with more images means all that mark-up is unsearchable. It also means that every change to the data model, even a trivial one, triggers a new requirement to recreate all those images. Modern tools have drawing and mark-up features in them. The cost/benefit of exporting and annotating outside the modeling tool means you’ll always be spending more than you "earn". You’re creating a data model deficit.
- Stop producing deliverables that require a complete manual re-write every time there is a new release. Unless, of course, these sorts of things are HIGHLY valued by your team and you have evidence that they are used. I’m betting that while people will say that they love them, they don’t love them as much as getting half as much data architecture work done.
- Focus on deliverables that are 100% generated from the tools. Leave manually developed deliverables to short overviews and references to collaborative, user-driven content. Wikis, knowledgebases, and FAQs are for capturing user-generated or user-initiated content.
- Focus on delivering documentation that can be used by implementers of the model. That would be Data Architects, DBAs, and Developers. Probably in reverse order of that list in priority. Yes, you want to ensure that there are good definitions and good material so that any data architect in your company can work with the model even after you’ve won the lottery, but the largest number of people who will work with the models are business users, developers, and DBAs. Think of them as your target audience.
- Automate the generation of those deliverables via features right in the tools – APIs, macros, etc. Challenge every deliverable that will cost more to produce once than the value it will deliver to every single team member.
- Move supporting content that is locked into MS Word documents (naming standards, modeling standards and such) to a collaborative environment like a wiki or knowledgebase. Don’t delay releases just to deliver those items. These formal deliverables are important, but from a relative value point of view, they should not hold up delivering data models.
- Focus now on making a process that is more agile and less expensive to execute while also meeting the needs of the data model users. If it is taking you more time to publish your data architectures than actually architecting them, you are doing it wrong.
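Two of the recommendations above, generating deliverables 100% from the model and knowing how long each one takes to produce, can be sketched together. This is a toy example under stated assumptions: the `model` dictionary stands in for metadata exported from a modeling tool, and `render_data_dictionary` is a hypothetical function, not a feature of any real product. Most modeling tools have their own report generators or APIs that should be preferred.

```python
import html
import time

def render_data_dictionary(model):
    """Render table/column definitions as a simple, searchable HTML table."""
    rows = []
    for table, columns in model.items():
        for column, definition in columns.items():
            rows.append(
                f"<tr><td>{html.escape(table)}</td>"
                f"<td>{html.escape(column)}</td>"
                f"<td>{html.escape(definition)}</td></tr>"
            )
    header = "<tr><th>Table</th><th>Column</th><th>Definition</th></tr>"
    return "<table>\n" + header + "\n" + "\n".join(rows) + "\n</table>"

# Hypothetical metadata; in practice this comes out of the modeling tool.
model = {
    "customer": {
        "customer_id": "Surrogate identifier for a customer.",
        "name": "Legal name of the customer.",
    },
    "customer_order": {
        "order_id": "Surrogate identifier for an order; values < 1000 are test data.",
    },
}

started = time.perf_counter()
report = render_data_dictionary(model)
elapsed_seconds = time.perf_counter() - started

print(report)
print(f"generated in {elapsed_seconds:.4f} seconds")
```

Recording the elapsed time for each release gives the kind of production data that makes it easy to answer a project manager who asks whether dropping a report would save modeling time.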
While data modeling standards have stayed relatively stable over the last 10-20 years, our methods and tools haven’t. If you are still modeling like it’s 1980, you stopped delivering value sometime around 1999.