
Database Design Throwdown, Texas Style

Jan 21, 2016   //   by Karen Lopez   //   Blog, Data, Data Modeling, Database, Database Design, DLBlog, Events, Fun, Snark, Speaking, SQL Server  //  3 Comments

SQLSaturday #461 - Austin 2016

It’s a new year and I’ve given Thomas LaRock (@sqlrockstar | blog) a few months to recover and ramp up his training since our last Throwdown.  The trophies from all my wins are really cluttering my office, and I feel bad that Tom has not yet had a chance to claim victory.  So we will be battling again in just a few days.

I’ll be dishing out the knowledge along with a handkerchief for Tom to wipe up his tears at SQL Saturday #461 Austin, TX on 30 January 2016.  This full day community-driven event features real database professionals giving free presentations on SQL Server and Data Platform topics.  All you need to do is register (again, it’s free) before all the tickets are gone.

Database Design Throwdown

Speaker(s): Karen Lopez, Thomas LaRock

Duration: 60 minutes

Track: Application & Database Development

Everyone agrees that great database performance starts with a great database design. Unfortunately, not everyone agrees which design options are best. Data architects and DBAs have debated database design best practices for decades. Systems built to handle current workloads are unable to maintain performance as workloads increase. Attend this new and improved session and join the debate about the pros and cons of database design decisions. This debate includes topics such as logical design, data types, primary keys, indexes, refactoring, code-first generators, and even the cloud. Learn about the contentious issues that most affect your end users and how to avoid them.

One of the other great benefits of attending these events is that you get to network with other data professionals who are working on projects just like yours…or ones you will likely work on at some point.

Join us and other data pros to talk about data, databases and projects. And make sure you give a #datahug to Tom after the Throwdown. He’s gonna need it.

I’m Going to be TECHUnplugged in Austin

Jan 20, 2016   //   by Karen Lopez   //   Blog, Cloud, Data, DLBlog, Events, Speaking, Training  //  3 Comments

…and you should join me.


On 2 February I’ll be speaking at TECHUnplugged Austin, Texas.  This event, which has free registration, focuses on how technology innovation is changing business and IT.

TECHunplugged is a full day conference focused on cloud computing and IT infrastructure.

Its innovative formula combines three essential parts of the industry for an exceptional exchange of information, insights and education:

– A group of independent, insightful and well-recognized influencers
– Leading disruptive technology vendors,
– End users who manage rich technology environments.

The ultimate goal of TECHUnplugged Conference is to bring quality information to IT decision makers by bringing them together with independent influencers and industry vendors, to engage, debate and be informed through open discussions on topics such as IT infrastructure, virtualization, cloud computing and storage.

I’m going to be talking about how data has changed over the years and how data quality issues can become obstacles to business innovation.

If you are in IT and would like to attend, use the registration form below.  If you use my special code, you’ll be entered to win a special prize of an Amazon Echo (I SO LOVE MINE!) at the event.

My promotional code is:

infoadvisors

Yes, all lowercase.

 

I hope to see you in Austin.  Maybe we can have tacos.

Database Design Evils 

Dec 11, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database Design, DLBlog, Fun  //  2 Comments

I posted a tongue-in-cheek Twitter poll about some database design features that tend to polarize team members.

Of course, every design decision comes down to cost, benefit and risk. So there aren’t any evil choices.

How would you vote?

Poll results: GUIDS,  NULLs, Joins, Satan

Twitter Poll

 

What You Are Missing by Sticking with SQL Server 2005: Data Architects Version

Dec 1, 2015   //   by Karen Lopez   //   Blog, Data Modeling, Database, Database Design, DLBlog, SQL Server  //  1 Comment

On 12 April 2011 it was Yuri’s Night — the night we space fans celebrate Yuri Gagarin’s history-making 1961 flight into space.  In 2011 we were celebrating 50 years of manned spaceflight. On that same day in 2011, we reached the end of mainstream support for SQL Server 2005 SP4. On 12 April 2016 we will reach the end of extended support for SQL Server 2005. That means no more hotfixes, no help from Microsoft and no love for your data still living in SQL 2005 databases.

I’m hoping your organization is already on its way to upgrading and migrating data and applications to newer versions of SQL Server.  SQL Server 2016 is already being used in production by early access customers. No matter which version you will be migrating to, I want to share with you some of the features and perks you’ll have available to you now that you are moving away from a dead version. Of course there are hundreds of enhancements that have happened since 2005, but today I’m focusing on those that a data architect would want to use in a database design for enhanced performance, security and data quality.

Columnstore Indexes

If you are designing for data warehouse type solutions, this is the closest thing we have for a "turbo switch" for SQL Server. Columnstore Indexes achieve high compression rates since they store columns together instead of storing rows together. They also support much faster query performance for batch and aggregation queries. They typically achieve 10x performance increases, sometimes even more. This feature was introduced in SQL Server 2012, but you’ll want the advances to this feature that came with SQL Server 2014.
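As a sketch of what this looks like, here is a minimal example (the table and column names are mine, purely for illustration). The comments note the 2012 vs. 2014 difference mentioned above:

```sql
-- Hypothetical fact table; names are illustrative only.
CREATE TABLE dbo.SalesFact (
    SaleDate   DATE          NOT NULL,
    ProductID  INT           NOT NULL,
    Quantity   INT           NOT NULL,
    Amount     DECIMAL(18,2) NOT NULL
);

-- SQL Server 2012: nonclustered columnstore only, and it makes the table read-only.
-- SQL Server 2014+: clustered columnstore, and the table remains updatable.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesFact ON dbo.SalesFact;
```

Aggregation queries over `dbo.SalesFact` (sums and counts grouped by date or product) are where the batch-mode speedups show up.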

SEQUENCEs

SEQUENCEs have been around in other DBMSs for a while, but were introduced to SQL Server in SQL Server 2012.  These special objects work much like IDENTITY columns, but offer more flexibility and use cases.  The main feature is that you can grab a sequence value (or a range of them) before you insert a row.  Many developers use GUIDs for similar reasons, but GUIDs are much longer and therefore have performance downsides. SEQUENCEs are integer types.
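A quick sketch of the "grab a value before the insert" pattern (the `Orders` table and its columns are hypothetical):

```sql
CREATE SEQUENCE dbo.OrderNumber
    AS INT
    START WITH 1000
    INCREMENT BY 1;

-- Unlike IDENTITY, you can fetch the value first and reuse it:
DECLARE @NextOrder INT = NEXT VALUE FOR dbo.OrderNumber;

INSERT INTO dbo.Orders (OrderID, CustomerName)
VALUES (@NextOrder, N'Contoso');

-- sp_sequence_get_range can reserve a whole block of values in one call.
```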

New Data Types

So many new data types have been introduced since SQL Server 2005, but the ones that really stand out for me are DATE, TIME, DATETIMEOFFSET, the geospatial types, and the deprecation of timestamp.

It wasn’t until SQL Server 2008 that we had access to data types that comprised only the DATE or TIME portion of a point in time. So we had to do all kinds of conversions just to strip out unwanted data (00:00:00). We also had to make room to store that unwanted precision. Storing millions of rows of unneeded zeros hurts performance, both operationally and for backup and recovery.

SQL Server 2008 also introduced DATETIMEOFFSET, which allows us to track data in context of its time zone. If you remember the days when meeting invites did not include this important piece of information, you’ll know why this is important.
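The difference is easiest to see in a column list. A sketch, with made-up column names:

```sql
CREATE TABLE dbo.Appointment (
    AppointmentDate  DATE,               -- date only, 3 bytes vs. 8 for DATETIME
    StartTime        TIME(0),            -- time only, with the precision you choose
    ScheduledAt      DATETIMEOFFSET(0)   -- a point in time plus its time zone offset
);

-- No more stripping out the unwanted 00:00:00 portion:
SELECT CAST(SYSDATETIMEOFFSET() AS DATE);
```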

The spatial data types GEOGRAPHY and GEOMETRY have added a new and feature-rich way of tracking places and their geometry, plus special features that make it much easier to answer questions like "which is the closest?" or "is this address located in this neighbourhood?"  If your data story includes location points, you’ll want to use these.
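For example, a "which is the closest?" query becomes a one-liner with `STDistance` (the `Stores` table here is hypothetical; `GEOGRAPHY::Point` takes latitude, longitude, and a spatial reference ID):

```sql
-- Austin, TX as a GEOGRAPHY point, using SRID 4326 (WGS 84)
DECLARE @office GEOGRAPHY = GEOGRAPHY::Point(30.2672, -97.7431, 4326);

-- Closest store, with the distance in meters
SELECT TOP (1) StoreName, Location.STDistance(@office) AS Meters
FROM dbo.Stores
ORDER BY Location.STDistance(@office);
```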

SQL Server was always an oddball when it came to the data type TIMESTAMP. In other DBMSs, this data type includes date and time, to a very high precision. In SQL Server, TIMESTAMP is a type of row version identifier that has nothing to do with time. So data architects migrating from other DBMSs were often bitten when they used the wrong data type. Microsoft announced in 2008 that it was deprecating TIMESTAMP and recommending the use of ROWVERSION, which is similar (but not the same) in functionality.
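A sketch of the distinction, with hypothetical table and column names:

```sql
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL,
    RowVer     ROWVERSION            -- auto-updated version stamp; NOT a date or time
);
-- If you actually need a point in time, use DATETIME2 or DATETIMEOFFSET,
-- not TIMESTAMP/ROWVERSION.
```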

Encryption

SQL Server 2016 includes support for Always Encrypted, a feature that does just that: it supports encryption of data from application to database and back, so that it is better protected than in solutions that only encrypt data once it is written to the database. I’m always reminding you that keeping your CIO out of jail is part of your job description, right?
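From the database-design side, an encrypted column looks like this. A sketch only: it assumes a column master key and a column encryption key (`CEK_Auto1` here) have already been provisioned, and deterministic encryption on character columns requires a BIN2 collation:

```sql
CREATE TABLE dbo.Patient (
    PatientID INT IDENTITY PRIMARY KEY,
    SSN CHAR(11)
        COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Auto1,
            ENCRYPTION_TYPE = DETERMINISTIC,   -- DETERMINISTIC allows equality lookups
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        )
);
```

The encryption and decryption happen in the client driver, so the plaintext never reaches the server.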

Always Encrypted flow

Data Archiving

As our data gets bigger and bigger, the size of our databases is growing as well. That means that performance takes a hit. Developers want us to take shortcuts on data quality to improve performance because size matters. One of the ways to help manage data volumes is to move "cold" data to other storage locations. Starting in SQL Server 2016, we can stretch a database to Azure, which means that data that isn’t accessed as often can be stored in the cloud and retrieved when needed. This keeps our hot data local and fast, while the cooler data is more economical to store yet still available, and your application doesn’t even have to manage the difference.
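The Stretch Database setup is, in broad strokes, a database-level and then a table-level switch. A sketch, based on my reading of the SQL Server 2016 syntax (the server name, credential, and table are placeholders):

```sql
-- Allow the instance to use remote data archiving
EXEC sp_configure 'remote data archive', 1;
RECONFIGURE;

-- Point the database at an Azure server (names are placeholders)
ALTER DATABASE SalesDB
    SET REMOTE_DATA_ARCHIVE = ON
        (SERVER = 'myserver.database.windows.net',
         CREDENTIAL = AzureStretchCred);

-- Start migrating a table's cold rows to Azure
ALTER TABLE dbo.OrderHistory
    SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
```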

JSON Support

In SQL Server 2016 we are getting support for JSON processing.  This isn’t the same as a JSON data type like we have with XML, but a set of import and export features for providing relational data as JSON documents and bringing JSON data into SQL Server.  Now you won’t have to manage all those curly brackets on your own.
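The two directions look like this (the `Customer` table is hypothetical):

```sql
-- Relational rows out as a JSON document:
SELECT CustomerID, Name
FROM dbo.Customer
FOR JSON PATH;

-- JSON text back into relational rows:
SELECT *
FROM OPENJSON(N'[{"CustomerID":1,"Name":"Contoso"}]')
WITH (CustomerID INT, Name NVARCHAR(100));
```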

One Last Thing…

As vendors withdraw support for their products, third party tool makers do so as well. If you are supporting older, out of support versions of databases, it’s likely that your data modeling, data quality and data integration tools are also dropping support for these solutions. You’ll be left supporting database systems without vendor support and without professional enterprise class modeling and design tools.  I know how hard it is to keep track of databases that my tools can’t connect with.  Don’t let sticking with an old version be the end of data modeling support for that data.

If you like geeking out about space and data types, you might want to check out my 1 April 2014 post on a new data type.

Show Your Data Some Love

These are just a tiny sample of the features that will be available to you when you upgrade to modern versions of SQL Server. The advent of security, data quality and performance features is leaving your old solutions behind, putting your data at risk and leaving your customer data feeling unloved.  There’s a data space race going on. Don’t leave your company using old technology to manage data. Go fix that!

Datatypes or Data Types?

Nov 23, 2015   //   by Karen Lopez   //   Blog, Data Modeling, DLBlog, Fun  //  No Comments

I conducted a Twitter poll last week about how to spell DATATYPE (or is it DATA TYPE?).  Many compound words start out as two separate words, then get hyphenated, then concatenated into a new word.  We saw this with:

data base –> data-base –> database

I keep seeing data types spelled both ways (and never as data-type).

Ted Codd used DATA TYPE in his 12 Rules for a Relational Database Product.

Embarcadero ER/Studio and CA ERwin Data Modeler use DATATYPE in their products and occasionally use DATA TYPE in their help or documentation.

Oracle uses both spellings in their documentation.  Microsoft sticks heavily to DATA TYPE.

Twitter polls last for 24 hours and not all clients can see or vote on them. So consider this more of a fun question on social media.


How do you spell this concept? Are there other words you find with a variety of spellings?

You’re Doing it Wrong: Generalizations about Generalizations


I have a couple of presentations where I describe how generalized data modeling can offer both benefits and unacceptable costs.  In my Data Modeling Contentious Issues presentation, the one where we vote via sticky notes, we debate the trade-offs of generalization in a data model and database design.  In 5 Classic Data Modeling Mistakes, I talk about over-generalization.

Over the last 20 some years (and there’s more “some” there than ever before), I’ve noticed a trend towards more generalized data models.  This means that instead of having a box for almost every noun in our business, we have concepts that have categories.  Drawing examples from the ARTS Data Model, instead of having entities for:

  • Purchase Order
  • Shipping Notice
  • Receipt
  • Invoice
  • etc

…we have one entity for InventoryControlDocument that has a DocumentType instance of Purchase order, Shipping Notice, Receipt, Invoice, etc.

See what we did there?  We took metadata that was on the diagram as separate boxes and turned them into rows in a table in the database.  This is brilliant, in some form, because it means when the business comes up with a new type of document we don’t have to create a new entity and a new table to represent that new concept.  We just add a row to the DocumentType table and we’re done.  Well, not exactly…we probably still have to update code to process that new type…and maybe add a new user interface for that…and determine what attributes of InventoryControlDocument apply to that document type so that the code can enforce the business rules.
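A minimal sketch of that generalized shape (the table and column names are mine, not taken from the ARTS model):

```sql
-- Document kinds live as rows, not as separate tables.
CREATE TABLE dbo.DocumentType (
    DocumentTypeID INT PRIMARY KEY,
    TypeName       NVARCHAR(50) NOT NULL  -- 'Purchase Order', 'Shipping Notice', ...
);

CREATE TABLE dbo.InventoryControlDocument (
    DocumentID     INT  PRIMARY KEY,
    DocumentTypeID INT  NOT NULL REFERENCES dbo.DocumentType,
    DocumentDate   DATE NOT NULL
    -- Attributes that apply only to some document types now need their rules
    -- enforced in code or CHECK constraints, not by the table structure itself.
);
```

Supporting a new document type is now `INSERT INTO dbo.DocumentType …` instead of a new table, which is exactly where the governance question in the next paragraph comes from.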

Ah! See what we did there this time?  We moved responsibility for managing data integrity from the data architect to the coders.  Sometimes that’s great and sometimes, well, it just doesn’t happen.

So my primary reason to raise generalization as an issue is that sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems.  Just because you engineered a requirement from a table to a row does not mean it is no longer your responsibility.  I’ve even seen architects become so enamoured with moving the work from their plate to another’s that they have generalized the heck out of everything while leaving the data quality responsibility up to someone else.  That someone else typically is not measured or compensated for data integrity, either.

Sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems

Alec Sharp has written a few blog posts on Generalizations. These posts have some great examples of his 5 Ways to Go Wrong with Generalisation.   I especially like his use of the term literalism since I never seem to get the word specificity out when I’m speaking. I recommend you check out his 5 reasons, since I agree with all of them.

1 – Failure to generalize, a.k.a. literalism

2 – Generalizing too much

3 – Generalizing too soon

4 – Confusing subtypes with roles, states, or other multi-valued characteristics

5 – Applying subtyping to the wrong entity.

By the way, Len Silverston and Paul Agnew talk about levels of generalization in their The Data Model Resource Book, Vol 3: Universal Patterns for Data Modeling book (affiliate link).  Generalization isn’t just a yes/no position.  Every data model structure you architect has a level of generalization.

Every data model structure you architect has a level of generalization.

I’m wondering how many of you have used a higher level of generalization, and what you’ve done to ensure that the metadata you transformed into data still has integrity.

Leave your recommendations in the comments.

Update: I updated the link to Alec’s blog post.  Do head over there to read his points on generalization.

Idera to Buy Embarcadero Technologies–Update

Sep 18, 2015   //   by Karen Lopez   //   Blog, Breaking News, Data Modeling, DLBlog  //  No Comments


This news arrived today:

http://www.reuters.com/article/2015/09/17/sponsored-mmvolume-idUSL1N11N1L820150917

Jefferies is also leading a US$425m covenant-lite credit to back Idera’s acquisition of Embarcadero Technologies. Idera is backed by TA Associates. The deal, which launches on Thursday, includes a US$25m revolving credit, a US$300m first-lien term loan and a US$100m second-lien term loan.

So last year we had Embarcadero attempting to purchase ERwin from CA, now today we have Idera, makers of SQL Server focused database-related solutions, moving towards buying Embarcadero.

The Embarcadero-buying-ERwin deal fell through, in part, due to regulatory concerns over market consolidation of the database/data modeling tool business.  I’m wondering how regulators will feel about this consolidation of tools.

I’ve worked with both vendors in the past.  Both are based in Austin, TX.  Standing by to see what happens next.

UPDATE:  I’m now seeing official communications about the sale, with a very aggressive closing date.  This is in contrast to the prolonged, ultimately failed acquisition attempt by Embarcadero for CA’s ERwin Data Modeler product.

Thoma Bravo, a leading private equity investment firm, today announces the sale of Embarcadero Technologies, a leading provider of software solutions for application and database development, to Idera, Inc., an application and server management software provider. The exit represents the culmination of a long and successful partnership between Thoma Bravo and Embarcadero Technologies. The deal is expected to close in mid-October, subject to normal closing conditions and approvals.

http://www.businesswire.com/news/home/20151007006216/en/Thoma-Bravo-Announces-Sale-Embarcadero-Idera#.VhfF5vlVhBc

http://www.theregister.co.uk/2015/10/08/embarcadero_database_and_app_dev_company_sold_to_idera/
