I don’t usually blog about politics here, but when bad data management and bad people mix, it’s time for a post…
Toronto Star reporter Robyn Doolittle has reported that my world famous (infamous?) mayor, Rob Ford, may have lost all the data from his previous election campaign.
Councillor Doug Ford claims the mayor’s former campaign manager, Nick Kouvalis, is refusing to turn over valuable 2010 voter database information.
Kouvalis, who also served for a time as Ford’s chief of staff, is now working for the John Tory campaign. The man who actually ran the database aspect of Rob Ford’s first mayoral campaign says the Fords were given everything right after the election.
“I made two DVDs with all of the data from the campaign — entire voters’ list with contact info, supporters, non-supporters, signs, volunteers, all voter contact records, etc. — and gave them both to Doug Ford,” said Conservative data expert Mitch Wexler.
If it is in fact gone, it would be a serious blow to the mayor’s re-election hopes. Numerous political strategists involved in the 2010 race say what helped set Ford apart was that voter intelligence, much of it collected by Ford himself over his 10 years as a councillor in Etobicoke.
I’ll try not to comment on the use of the term “voter intelligence.” Just in case you’ve been hiding under a rock (not a crack rock, I presume) our mayor has been in a heap of trouble (NSFW) since he was elected. Actually, even before he was elected. This isn’t a partisan thing when I say I’m not a fan of my mayor. This is all about not respecting his behaviour. But back to the data thing….
Where Rob Ford’s Data Management Went Wrong
Well, pretty much every single thing he has done has been wrong. At least it feels that way. And sounds and views that way. But if we focus on today’s issue of his reported data loss, I’m thinking he messed up by:
- Giving source data to an external party without a backup. When Ford handed over those record boxes full of 10 years of handwritten notes, he lost his source data. All data deserves protection, even handwritten notes. We in IT sometimes ignore paper data, but we shouldn’t. It’s still data.
- Storing personally identifiable and sensitive data insecurely. I’m betting those file boxes were sitting next to his desk. Sure, his desk is in city hall and I’m betting they have decent physical security. But file boxes aren’t exactly locked cabinets. They also have a way of getting disposed of incorrectly.
- Outsourcing data and database management without getting copies of data on a regular basis. It’s sort of crazy to hand over critical data to a third party for management and not insist that you get copies of it on a regular basis. Even if your relationship is strong, people leave companies or they stop working for you (as we see in Rob Ford’s case). Have you been getting data, models, code, documents from your vendors on a regular basis? You should.
- Using data collected for a specific reason for another reason. Allegedly this data was collected by Ford in fulfilling his duties as city councillor. I’m not sure whether that means it can be used for fundraising and vote elicitation. Sounds off to me. I wonder if all those people who called Ford asking for help with their trash collection and dead raccoon needs knew they were being added to a campaign database.
- Waiting until he needed the data to ask for it. It appears that the Ford brothers waited until it was time to campaign to play “who has the data”. It would be entirely possible (maybe even legally or ethically required) for the outsourcer to destroy all copies of the data when their work ended and the data was given back to Ford.
- Getting copies of data and losing them. It’s reported that the data was provided to Rob Ford’s brother, Councillor Doug Ford. But it appears he lost the data. That’s not good. Where are those DVDs now? Again, this indicates that private and sensitive data probably wasn’t treated with the respect it deserves.
As data professionals, I believe it’s our job to ensure that all data is properly managed and protected. That means monitoring paper and digital data, ensuring that good data management practices are followed, and ensuring that these practices are followed even when we outsource these activities. Please, go find out if anyone in your organization is doing a better job than Rob Ford is. You might be shocked at what you find.
Today is Valentine’s Day in many parts of the world. That means either you are looking forward to a happy day full of fun and a night full of …fun… or you are planning on catching up with Frank Underwood on Netflix. Both sound great to me.
Last year I wrote about 5 Naughty and Nice Ways to Love Your Data. This year I’m going to focus on ways you can romance your data for a stronger, more lasting relationship. So I’m assuming in the past you’ve followed my advice and have long since left the honeymoon phase of your data coffee dates. But where are you now? Are you starting to feel like maybe you need some more passion with your bits and bytes? I’m here to help.
1. Tell your data you love it. Often.
Heck, even show it you love it. Maybe one of the reasons your data has let itself go is that you haven’t told it how much you love it. Do you even remember the things you used to say to woo your data when you first met? Do you have actively managed data models: conceptual, logical, and physical? Do you give your database objects great names? Do you keep good metadata about this data? Do you follow data model-driven development? If you did all these in your early years of your relationship, are you still doing all that now? Are you doing all this in a modern way, not just the way you did it in 1980? Do you just talk a good game, but fail when it comes to actively showing it love?
Some day, when I’m awfully low,
When the query is slow,
I will feel a glow just charting you
And the way you look tonight.
You’re lovely, with your axes so true
And your bars so blue
There is nothing for me but to report you,
And the way you look tonight.
With each crow’s foot your normalization grows,
Tearing my pages apart
And that CHAR that wraps your text,
Touches my foolish heart.
Yes you’re lovely, never, ever refactor
Keep that structured charm.
Won’t you never change it?
‘Cause I love you
Just the way you look tonight.
2. Stop with the games.
We’ve all seen it in personal relationships. One person makes everything a game. Do you store your data in one format, say ZIPCodes as INTEGERS, but have to pad out all those missing leading zeros every time you have to deal with North Eastern United States postal codes? Stop doing that. Do you pretend that doing something faster is always better than doing it good enough? Forget perfect. Good enough. Do you tell management you have data models but all you really do is reverse engineer them? It’s all games.
I don’t know, I don’t know if numbers are REAL
Been a LONG night and something ain’t right
You won’t SHOWPLAN, you won’t SHOWPLAN how you feel
No DATETIME ever seems right
To talk about the reasons why CAST and I fight
It’s DATETIME to end the TIMESTAMP
Put an end to this game before it’s too late
Data games, it’s you and me baby
Data games, and I can’t take it anymore
Data games, I don’t wanna play the…
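Here’s a minimal Python sketch of the ZIP code game from point 2. The ZIP value 02134 and the helper name zip_from_int are illustrations I picked, not anything from a real system. It shows the leading zero vanishing the moment the value becomes an integer, and the padding chore you sign up for afterward.

```python
def zip_from_int(value: int) -> str:
    """Re-pad a 5-digit ZIP code that was stored as an integer (the workaround)."""
    return f"{value:05d}"


# Hypothetical example value: a ZIP code with a leading zero.
stored_as_text = "02134"
stored_as_int = int(stored_as_text)  # the leading zero is lost here

print(stored_as_int)                # the integer no longer looks like a ZIP code
print(zip_from_int(stored_as_int))  # padded back every single time it is read
print(stored_as_text)               # stored as text, nothing to repair
```

Storing the value as text in the first place means there is nothing to pad, and nothing to forget to pad.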
3. Know where your data lives.
Do you have an active inventory of what data resides where? No? How can you romance data you don’t know about? If a server walked out the door of your organization, how long would it take you to figure out what was on it? If a user had a legal need to access all the data the company held about a customer, would you be able to tell them? If you really wanted a happy strong relationship with your data, you’d know. Yes, it’s a lot of data to track where your data is. That’s why they invented tools that do this. And why data professionals are expected to use them.
Data is bigger
It’s bigger than the drives and they are not PB
The servers it is spread to
The bits in your drives
Oh no, I’ve duplicated too much
I set it up
That’s me in the ETL
That’s me in the database
Losing my governance
Trying to keep up with it all
And I don’t know if I can do it
Oh no, I’ve deployed too much
I haven’t documented enough
I thought that I heard you laughing
I thought that I heard you coughing
I think, I thought, I saw you cry
4. Stop faking it.
Yeah, sometimes little white lies are good for a relationship (BTW, You DO Look Beautiful!). But the big ones? Nope, never. The paranoia about NULLs often leads to a lot of lying. Do you pretend that NULLs don’t exist by giving them various fake values like 999999 or N/A, UNKNOWN, WHO KNOWS or __ ? Does every developer get to choose their own NULL Imposter Text? Are your aggregates all a huge lie due to all those zeros and 1980s dates you use to lie to your database? Stop it. It’s not helping that your queries are 2 ms faster when the data is one big lie.
Late at night a big database gets slower
I guess every normal form has its price
And it breaks her data to think her love is
Only given to a user with queries as fragile as ice
So it tells me it all adds up just fine
To aggregate the sales numbers for every town
But only the dev knows where those NULLs have been killed
And it’s headed for the cheatin’ UNKNOWN town
You can’t hide your lyin’ nines
And your N/A is a thin disguise
I thought by now you’d realize
There ain’t no way to hide your lyin underlines….
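To see how a NULL imposter distorts an aggregate (point 4), here is a small sketch using Python’s built-in sqlite3 module. The table names, towns, and amounts are made up for illustration; the only thing that matters is that one town’s sales figure is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: same data, but one uses 999999 where the value is unknown.
conn.execute("CREATE TABLE sales_fake (town TEXT, amount INTEGER)")
conn.execute("CREATE TABLE sales_null (town TEXT, amount INTEGER)")

conn.executemany(
    "INSERT INTO sales_fake VALUES (?, ?)",
    [("Town A", 100), ("Town B", 200), ("Town C", 999999)],  # 999999 = "unknown"
)
conn.executemany(
    "INSERT INTO sales_null VALUES (?, ?)",
    [("Town A", 100), ("Town B", 200), ("Town C", None)],  # None is stored as NULL
)

# AVG skips NULLs, but it has no way to know that 999999 is a lie.
avg_fake = conn.execute("SELECT AVG(amount) FROM sales_fake").fetchone()[0]
avg_null = conn.execute("SELECT AVG(amount) FROM sales_null").fetchone()[0]

print("average with imposter value:", avg_fake)
print("average with real NULL:", avg_null)
```

The same thing happens in reverse with zeros: they drag averages down instead of up, and nobody can tell a real zero from a fake one.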
5. Protect it.
Do you let just anyone throw code at your data without ensuring it’s treated right? Do you participate in security and privacy reviews of application code? You have those, right? Do you have metadata that describes the privacy and sensitive data requirements for each data element? Do you ensure that things like SQL injection tests happen for every application?
Oh where, oh where can my data be?
The dev took her away from me.
She’s gone to pastebin, so I’m gonna be sad,
So I can see my data, by now I’m so mad.
We were out on a date in my modelling tool,
I had been too much a fool.
There in the database, all laid out,
a data was there, the database queried by a lout.
The dev allowed the inject, the data failed to be right.
I’ll never forget, the sound that night–
the screamin users, the bustin app,
the painful scream that I– heard crash.
Oh where, oh where can my data be?
The dev took her away from me.
She’s gone to pastebin, so I’m gonna be sad,
So I can see my data when my new job is had.
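If you’ve never seen what a SQL injection test is actually checking for (point 5), here is a rough sketch with Python’s built-in sqlite3 module. The customers table, the two function names, and the attack string are all hypothetical; this is one simple illustration, not a complete security test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)", [("Alice",), ("Bob",)]
)


def find_unsafe(name):
    # Vulnerable: user input is pasted straight into the SQL text.
    sql = "SELECT name FROM customers WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_safe(name):
    # Parameterized: the driver passes the input as a value, never as SQL.
    return conn.execute(
        "SELECT name FROM customers WHERE name = ?", (name,)
    ).fetchall()


attack = "x' OR '1'='1"        # hypothetical malicious input
leaked = find_unsafe(attack)   # the OR clause matches every row
blocked = find_safe(attack)    # treated as a literal name, so no match

print("unsafe query returned", len(leaked), "rows")
print("safe query returned", len(blocked), "rows")
```

A review that only asks “does the query return the right customer?” will pass both versions. You have to feed it hostile input to see the difference.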
Keep saying it. Keep doing it.
There’s so much more you can do to revitalize your relationship with data. But if you do these, your data will keep on loving you back. I promise. Remember, your data wants to love you back. It’s up to you to make sure it’s still there in the morning.
I’ve been attending Enterprise Data World for more than 15 years. This event, focused on data architectures, data management, data modeling, data governance and other great enterprise-class methods, is part technical training and part revival for data professionals. It’s just that good.
This year the big bash is being held in Austin, TX, a thriving tech-oriented community, 27-April to 1 May. And this year’s theme is “The Transformation to Data-Driven Business Starts Here.”
And right now there’s a $200 Early Bird Discount going…plus if you use coupon code “DATACHICK” you can save $200 more on a multi-day registration or fifty bucks on a one day pass. There. I just saved you $400. And no, I get no kickbacks with this discount code. I don’t need them. I need you to be at this event, sharing your knowledge and meeting other data professionals. I need you to be part of the community of data professionals.
Top 10 Reasons You Need to Go to EDW 2014
- Data is HOT HOT HOT. I deemed 2013 The Year of Data and I see no signs that organizations are going to go back to software-is-everything thinking. 2014 is still going to be a year full of data. There’s even an executive, invitation-only CDOvision event co-located.
- Not Just Bullet Points. There are over 20 hours of scheduled networking events for you to chat with other data-curious people. Chatting with other data professionals is my favourite part of this event. Bring your business cards…er… .vcf contact file.
- Lots of Expertise. Not just data celebrities, but also other data professionals with thousands of hours of hands-on experiences, sharing their use cases around data. And not just data modeling. Big Data. Analytics. Methods. Tools. Open Data. Governance. NoSQL. SQL. RDBMS. Fun.
- Certifications. You can take advantage of the Pay-Only-If-You-Pass option for the CDMP on-site certification testing.
- Workshops. I’m doing a half day tutorial on Driving Development Projects with Enterprise Data Models. I’ll be talking about how data models fit within real-life, practical, get-stuff-done development projects. No ivory towers here.
- SIGs. There are special interest groups on data modeling products, industries and methods. You can meet people just like you and share your tips and tricks for data lovin. I will be leading the ER/Studio SIG.
- Ice Cream. This conference has a tradition of the ice cream break on the exhibit floor. Nice ice cream, even.
- Austin. Austin is one of the more vibrant cities in Texas. So cool, it even has a Stevie Ray Vaughan statue. Museums, Theatres, indoor golf, clubs. There’s a reason why SxSW is held here.
- Vendors. Yes, we love them, too. Meet the product teams of the makers of the tools you use every day. Or meet new teams and ask for a demo. They are good people.
- Love Your Data. There’s no better way to show your love than to network with other data professionals and learn from industry leaders.
Come learn how to help your organization love data better. You might even see me in a lightning talk holding a martini. Or taking impromptu pics of @data_model and other data professionals. Or debating data management strategy with people from around the globe. In other words, talking data. With people who love their data. Join us.
On Thursday 5 February at 5:00 PM EST I’ll be moderating a panel on Myths, Misunderstandings and Successes in Data Analytics as part of the PASS Business Analytics 24 Hours of PASS preview. It’s free, but you need to register. And I have a fantastic set of panelists: Stacia Misner, Joey D’Antoni and Lynn Langit.
Duration: 60 minutes
Track: Strategy and Architecture
Big Data, Business Analytics, Data Analytics, NoSQL, Relational . . . do we even agree on what we mean by those terms? In this panel session, industry thought leaders will discuss and debate the most common myths, truths, and mostly-truths of new and traditional approaches for enterprise data management and analytics.
We’ll be leaving time for questions from the audience, so come ready with your myths and stories.
Yes, there are 24 hours of goodness spread out over 2 days, so check out the other sessions.