I have a great topic and panel for this month’s Big Challenges in Data Modeling webinar on Thursday, 24 April 2014, 2:00 PM EDT. It’s free, but you have to register to get the login information.
Ethical Issues in Data Modeling
We’ll be talking about the nature of ethics, data and data modeling. I bet all of you have been placed in a tough situation before, either by other IT professionals or by business users who ask you to do something that you aren’t sure is ethical. Maybe it’s legal, maybe it isn’t. Maybe it’s about protecting data or data quality.
Some of the topics I hope we can discuss:
- What is the nature of ethics?
- How do ethics differ from morality? Legality?
- Can ethics be taught?
- Where does ego come into play here?
- What about Codes of Ethics and Codes of Conduct?
- Is there one right answer? Is there an always wrong answer?
- What’s the difference between a whistleblower and a tattletale?
- What tools do we have in making ethical decisions?
- How should we deal with unethical co-workers? Management? Customers?
- What does it all mean, anyway?
Ethical Situations in Data and Data Modeling
- If the answer is always “it depends”, what does it depend on?
- What if faster data means lesser data quality?
- Have you ever been asked to falsify a status report?
- Have you had to deal with someone else who provided incorrect information to a business user or management?
- Have you ever been asked to look the other way when security policies are being broken?
- Have you raised an issue of data protection that was ignored? Or minimized?
- What about using production data for testing and development?
- What if the data is right, but the transformations or reporting are wrong?
- What if it’s intentionally wrong or misleading?
- Have you ever had to deal with someone else’s ego?
- Have you escalated an ethical issue? What about a legal one? A moral one?
- Do data modelers have distinct areas that we need to watch out for when it comes to ethics?
- Have you ever left a job or project due to ethical reasons?
YOU! Our webinars consider attendees as panelists. You’ll have the opportunity to ask questions, chat with other attendees and tell your own stories. You can even arrive early and stay late for our pre-show and after-show discussions.
Register now and bring your ethical questions and comments.
Today is Valentine’s Day in many parts of the world. That means either you are looking forward to a happy day full of fun and a night full of …fun… or you are planning on catching up with Frank Underwood on Netflix. Both sound great to me.
Last year I wrote about 5 Naughty and Nice Ways to Love Your Data. This year I’m going to focus on ways you can romance your data for a stronger, more lasting relationship. So I’m assuming in the past you’ve followed my advice and have long since left the honeymoon phase of your data coffee dates. But where are you now? Are you starting to feel like maybe you need some more passion with your bits and bytes? I’m here to help.
1. Tell your data you love it. Often.
Heck, even show it you love it. Maybe one of the reasons your data has let itself go is that you haven’t told it how much you love it. Do you even remember the things you used to say to woo your data when you first met? Do you have actively managed data models: conceptual, logical, and physical? Do you give your database objects great names? Do you keep good metadata about this data? Do you follow data model-driven development? If you did all these in your early years of your relationship, are you still doing all that now? Are you doing all this in a modern way, not just the way you did it in 1980? Do you just talk a good game, but fail when it comes to actively showing it love?
Some day, when I’m awfully low,
When the query is slow,
I will feel a glow just charting you
And the way you look tonight.
You’re lovely, with your axes so true
And your bars so blue
There is nothing for me but to report you,
And the way you look tonight.
With each crow’s foot your normalization grows,
Tearing my pages apart
And that CHAR that wraps your text,
Touches my foolish heart.
Yes you’re lovely, never, ever refactor
Keep that structured charm.
Won’t you never change it?
‘Cause I love you
Just the way you look tonight.
2. Stop with the games.
We’ve all seen it in personal relationships. One person makes everything a game. Do you store your data in one format, say ZIP Codes as INTEGERs, but have to pad out all those missing leading zeros every time you have to deal with Northeastern United States postal codes? Stop doing that. Do you pretend that doing something faster is always better than doing it good enough? Forget perfect. Good enough. Do you tell management you have data models but all you really do is reverse engineer them? It’s all games.
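Here is a minimal sketch of the ZIP Code game, just to make it concrete. The helper name `zip_from_int` and the sample value are my own illustrations, not anything from a real system, and it assumes plain 5-digit US ZIP Codes.

```python
def zip_from_int(value: int) -> str:
    """Rebuild a 5-digit ZIP Code from an integer by restoring leading zeros."""
    return f"{value:05d}"


# Stored as an INTEGER: the leading zero is lost on the way in.
zip_as_int = int("02134")

# Stored as text: nothing to repair later.
zip_as_text = "02134"

print(zip_as_int)                # the integer has no leading zero
print(zip_from_int(zip_as_int))  # padded back out, every single time you read it
print(zip_as_text)               # text keeps what was entered
```

Every query, report and export that touches the integer column has to remember to call something like `zip_from_int`. Store it as text and the game is over.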
I don’t know, I don’t know if numbers are REAL
Been a LONG night and something ain’t right
You won’t SHOWPLAN, you won’t SHOWPLAN how you feel
No DATETIME ever seems right
To talk about the reasons why CAST and I fight
It’s DATETIME to end the TIMESTAMP
Put an end to this game before it’s too late
Data games, it’s you and me baby
Data games, and I can’t take it anymore
Data games, I don’t wanna play the…
3. Know where your data lives.
Do you have an active inventory of what data resides where? No? How can you romance data you don’t know about? If a server walked out the door of your organization, how long would it take you to figure out what was on it? If a user had a legal need to access all the data the company held about a customer, would you be able to tell them? If you really wanted a happy strong relationship with your data, you’d know. Yes, it’s a lot of data to track where your data is. That’s why they invented tools that do this. And why data professionals are expected to use them.
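To show the kind of questions an inventory should answer, here is a toy sketch. The server, database and table names are hypothetical, and a real inventory would live in a proper metadata or data catalog tool rather than a Python list.

```python
# Hypothetical inventory rows: (server, database, table, holds_customer_data)
INVENTORY = [
    ("srv-01", "crm", "customer", True),
    ("srv-01", "crm", "product", False),
    ("srv-02", "billing", "invoice", True),
]


def contents_of(server: str) -> list[str]:
    """What was on the server that just walked out the door?"""
    return [f"{db}.{table}" for srv, db, table, _ in INVENTORY if srv == server]


def customer_data_locations() -> list[str]:
    """Where would we look to answer a customer's data access request?"""
    return [f"{srv}:{db}.{table}" for srv, db, table, has_cust in INVENTORY if has_cust]


print(contents_of("srv-01"))
print(customer_data_locations())
```

If you can't answer those two questions in minutes, you don't have an inventory yet.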
Data is bigger
It’s bigger than the drives and they are not PB
The servers it is spread to
The bits in your drives
Oh no, I’ve duplicated too much
I set it up
That’s me in the ETL
That’s me in the database
Losing my governance
Trying to keep up with it all
And I don’t know if I can do it
Oh no, I’ve deployed too much
I haven’t documented enough
I thought that I heard you laughing
I thought that I heard you coughing
I think, I thought, I saw you cry
4. Stop faking it.
Yeah, sometimes little white lies are good for a relationship (BTW, You DO Look Beautiful!). But the big ones? Nope, never. The paranoia about NULLs often leads to a lot of lying. Do you pretend that NULLs don’t exist by giving them various fake values like 999999 or N/A, UNKNOWN, WHO KNOWS or __ ? Does every developer get to choose their own NULL Imposter Text? Are your aggregates all a huge lie due to all those zeros and 1980s dates you use to lie to your database? Stop it. It’s not helping that your queries are 2 ms faster when the data is one big lie.
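A small sketch of how one fake value turns an aggregate into a lie. The tables, towns and amounts are made up for illustration, and it uses an in-memory SQLite database from Python's standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fake (town TEXT, amount INTEGER)")
conn.execute("CREATE TABLE sales_null (town TEXT, amount INTEGER)")

# Two known amounts and one unknown amount (hypothetical data).
known = [("Ajax", 100), ("Barrie", 200)]

# The liar stores 999999 for "unknown"; the honest table stores NULL.
conn.executemany("INSERT INTO sales_fake VALUES (?, ?)", known + [("Cobourg", 999999)])
conn.executemany("INSERT INTO sales_null VALUES (?, ?)", known + [("Cobourg", None)])

# AVG skips NULLs, but it has no idea 999999 is a placeholder.
avg_fake = conn.execute("SELECT AVG(amount) FROM sales_fake").fetchone()[0]
avg_null = conn.execute("SELECT AVG(amount) FROM sales_null").fetchone()[0]

print(avg_fake)  # average polluted by the imposter value
print(avg_null)  # average of the two known amounts only
```

The database already knows how to treat NULL in aggregates. It can't do the same for every developer's favourite imposter.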
Late at night a big database gets slower
I guess every normal form has its price
And it breaks her data to think her love is
Only given to a user with queries as fragile as ice
So it tells me it all adds up just fine
To aggregate the sales numbers for every town
But only the dev knows where those NULLs have been killed
And it’s headed for the cheatin’ UNKNOWN town
You can’t hide your lyin’ nines
And your N/A is a thin disguise
I thought by now you’d realize
There ain’t no way to hide your lyin underlines….
5. Protect it.
Do you let just anyone throw code at your data without ensuring it’s treated right? Do you participate in security and privacy reviews of application code? You have those, right? Do you have metadata that describes the privacy and sensitive data requirements for each data element? Do you ensure that things like SQL injection tests happen for every application?
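For anyone who hasn't seen what a SQL injection test looks like, here is a minimal sketch using Python's built-in `sqlite3` module. The `customer` table, the two lookup functions and the attack string are all illustrative assumptions, not code from any real application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer (name) VALUES (?)", [("Alice",), ("Bob",)])


def find_unsafe(name: str):
    # Vulnerable: user input is pasted straight into the SQL text.
    return conn.execute(
        f"SELECT name FROM customer WHERE name = '{name}'"
    ).fetchall()


def find_safe(name: str):
    # Parameterized: the input is bound as a value, never parsed as SQL.
    return conn.execute(
        "SELECT name FROM customer WHERE name = ?", (name,)
    ).fetchall()


# A classic injection string that turns the WHERE clause into "always true".
attack = "x' OR '1'='1"

print(len(find_unsafe(attack)))  # every row leaks
print(len(find_safe(attack)))    # no customer has that literal name
```

A test like this is cheap to run against every lookup an application exposes. If the unsafe version is what's in production, your data is one form field away from pastebin.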
Oh where, oh where can my data be?
The dev took her away from me.
She’s gone to pastebin, so I’m gonna be sad,
So I can see my data, by now I’m so mad.
We were out on a date in my modelling tool,
I had been too much a fool.
There in the database, all laid out,
a data was there, the database queried by a lout.
The dev allowed the inject, the data failed to be right.
I’ll never forget, the sound that night–
the screamin users, the bustin app,
the painful scream that I– heard crash.
Oh where, oh where can my data be?
The dev took her away from me.
She’s gone to pastebin, so I’m gonna be sad,
So I can see my data when my new job is had.
Keep saying it. Keep doing it.
There’s so much more you can do to revitalize your relationship with data. But if you do these, your data will keep on loving you back. I promise. Remember, your data wants to love you back. It’s up to you to make sure it’s still there in the morning.
Some people believe that in an age of Facebook, Foursquare and Twitter, we should give up all our expectations of privacy. While I agree that I’ve been shocked by the amount of personal information that people share (sometimes even how much I share), I still believe that organizations need to have the right technologies, policies and training in place to prevent abuse of personal and sensitive data.
In a wilful privacy breach in 2011, a clerk at British Columbia’s insurance bureau (ICBC) accessed customer data in order to intimidate employees of another organization. One of the victims has launched legal proceedings against ICBC for failing to have suitable data protections in place. ICBC is a sort of universal automobile insurance organization in BC – everyone who wants a driver’s license there must get their insurance via this organization, so their data collection covers most adult BC residents.
Annette Oliver isn’t just worried about sensitive information being made public, but about how that data was used to terrorize her family and co-workers.
Annette Oliver alleges in her lawsuit that her husband’s van was torched on April 17, 2011, at about 2 a.m., which police believe was an arson.
Then on June 1, 2011, Oliver claims, she was at home when she heard three loud bangs at about 5 a.m. and discovered three bullet holes in the front of her house.
Oliver says her husband and two daughters were home at the time.
This wasn’t an isolated case: others had their cars burned and homes shot.
Three months later, on Dec. 14, 2011, the RCMP revealed the investigation had found a link to an ICBC employee, who allegedly accessed personal information of 65 people, including 13 identified as victims who were targeted.
ICBC said at the time the employee under investigation was a woman who had been at ICBC for 15 years before she was fired in August 2011.
It appears from the lawsuit that ICBC did not use monitoring technologies to monitor access. Or that they weren’t using them correctly. I’m always surprised by organizations that steward customer data and don’t do much to properly care for that data. We’ll see in the end whether or not ICBC had suitable protections.
Myths about Data Protection
- Data privacy breaches don’t really hurt people. This one makes me mad. Even something less physically harmful like having their identities stolen can cause years of trouble for your customers, not to mention great financial harm. But data breaches can and do physically harm people.
- Data privacy is about secrecy. No, data privacy protection is about controlling the usage of data for only the reasons for which it was collected. Among other things.
- If the data is available elsewhere, it doesn’t need to be protected in our database. No, IT professionals still have a duty to protect personal and sensitive data in their care.
- Data wants to be free, so we shouldn’t control how it’s used within the organization. Yeah? My cats want to be free, too. And we still don’t let them outside.
- Data protection is just a technology issue. Data protection is just a training issue. No, data protection requires technological, process and people-based solutions.
- Encryption is all we need to do. No, because if people can read the data or download it, it’s not encrypted any more. Encryption helps when people walk away with the data. But people who use the data don’t see encrypted data.
- Data privacy requirements can be applied after the system goes into production. This one drives me crazy. Data protection requires effort at all phases of a project. There are architectural, design, development, deployment and maintenance components to be addressed. There are policies and procedures to be developed. There is monitoring and alerting to be practiced.
You know my mantra. Love your data because it’s not really yours. You have a professional duty to ensure it’s safe.
Read the full story at Metronews
Tomorrow, Thursday 28 February at 2:00 PM EST, I’ll be moderating a panel of expert data modelers as part of my Big Challenges in Data Modeling Series at Dataversity.net. In this month’s webinar, we’ll be debating the role of data architects in how we can best support business processes related to data privacy, data security and compliance. We’ll start by talking about recent data breaches and privacy issues.
One of the more contentious debates I have on projects is whether or not data modelers and architects should even have a role in these processes.
Joining me for this month’s panel are:
- Eva Smith ( @datadeva | blog ) is Director of Information Technology at Edmonds Community College (EdCC), where she oversees college IT functions and serves on the IT Commission for the Washington State Community and Technical College system. Eva also volunteers for DAMA International on the Editorial Board for the Data Management Body of Knowledge (DMBOK) Version 1, and as DAMA-I liaison to the Institute for Certification of Computing Professionals (ICCP).
- Loretta Mahon Smith ( @silverdata ) is currently the IBM Global Business Services, Business Analytics & Optimization Lead for the Data Modeling Center of Excellence. She has an extensive background in the financial services industry and is also a long-time DAMA volunteer.
- Peggy Schlesinger is a well-respected Master Enterprise Architect with Intel Corporation with a long history in Master Data Management. She is currently working on the Semantic Definition for the enterprise to improve and accelerate Business Intelligence, and is moving the environment toward Self-Service Business Intelligence.
As always, our last panelist is YOU! Unlike many webinars, we run these as highly interactive events. We have a formal Q&A for when you want to ask a question of the panel, but we also have a peer-to-peer chat open so that you can discuss what you are hearing in real time. We try to keep track of what’s going on in the chat so that we can comment and address the points being raised there. I love this feature and hope you will join us to be part of this event.
If you have a topic or question you’d like us to address, leave a comment below and we’ll try to work it in.
Also, if you are unable to make the webinar, you can register now anyway and listen to the recording later. So get registered now.
I’ve blogged about this data breach before: Federal Department Bans Use of Portable Devices (YAFF). To add insult to injury, a “printer error” has led to recipients of notifications about the breach receiving letters intended for other victims.
The federal government is blaming a printing error for the fact that some student loan recipients who received letters to say their personal information had gone missing along with a portable hard drive also got letters addressed to someone else.
Human Resources and Skills Development Canada revealed last month that a hard drive containing the personal information of some 583,000 Canadians had gone missing. The data included social insurance numbers and dates of birth of people who had received student loans between 2002 and 2006.
Sure, these sorts of errors do happen, especially when using automated printing and envelope stuffing equipment. I’ve got to say, though, that the timing on this error is more than … difficult. I’m wondering if the IT teams are being blamed here, or just the outsourcing company that provides mailing services.
I’ve been blogging about health data breaches lately, but I’m not sure if there are more of them or if the reporting requirements are more strict. I suspect the latter.
One of the things I’ve noticed is that many of the breaches seem to be of multiple exposures by the same organization, which has led to recent legislative changes to the HITECH Act. You can see from the quote below that not only has the limit to the penalty been increased, but the penalties for repeat violators are higher.
Given the sensitive nature of health data, I’m still thinking that we need to move more towards criminal penalties for wilful neglect and repeat violations.
In addition to redefining the scope and liabilities of business associates in the healthcare industry, the final HIPAA omnibus rule includes revisions to the penalties applied to each HIPAA violation category. While the American Recovery and Reinvestment Act of 2009 (ARRA) initially established a tiered penalty structure, it hasn’t been revised until now.
Section 160.404 refers to the amount of civil monetary penalty as administered under the HITECH (Health Information Technology for Economic and Clinical Health) Act. The original penalty structure used to be:
Do you think companies are bearing enough of the responsibility for protecting our data? Do you as a data professional get enough support from management to ensure that data is protected?
I thought I had blogged about this Canadian data breach, but I guess not. All these data breaches are coming so fast it’s hard to keep up. In this report, we have another YAFF: a portable hard drive being used as a backup device.
It looks like Human Resources and Skills Development Canada (HRSDC) will be taking a three-pronged approach to protecting our data: first, a new policy banning portable storage devices; second, use of data loss protection technologies and third, establishing consequences for staff that cause a data breach.
OTTAWA — The federal department at the centre of a massive data breach says it is banning the use of portable data devices in its offices, using new technology to prevent information from being easily removed from the network and warning any staff that violation of the new rules could mean the loss of their job.
Human Resources and Skills Development Canada (HRSDC) said Monday that it will start using “data loss technology,” which would allow the department to restrict when, where and which staff can remove information from government systems. Reviews have already started to see what risks the use of secured, portable data devices, such as USB memory sticks, carry in the department’s work and whether there are enough safeguards to prevent another massive breach of personal information from happening again.
Their loss of more than half a million student loan borrowers’ data has led to class action lawsuits. A missing external hard drive is the hardware piece of this breach; the fact that this drive contained unencrypted backups is the behavioural issue. Perhaps we need to start thinking about how to train end users on the consequences of moving data from “the system” to any place else, even for backup purposes.
Is there a solution?
I have more questions than solutions here, though. Usually enterprise backup solutions involve software plus a server or external service. I’m not sure why HRSDC was using a portable hard drive for backup. They are harder to manage, they tend to walk away, and they aren’t that reliable. So I’m going to guess here that this device was a personal device or being used to sneakernet files from one location to another. Perhaps from office to home, or from office to office. Both of those scenarios bother me because they most likely were not official methods for doing these tasks.
I don’t think there’s one answer. Training, policy, inspections, consequences, real monitoring and protection, more training, more inspections, some tough decisions. It’s a complex issue that will require complex responses. I’d like to hear what other organizations are doing to mitigate data breaches.