I conducted a Twitter poll last week about how to spell DATATYPE (or is it DATA TYPEs?). Many compound words start out as two separate words, then get hyphenated, then concatenated to a new word. We saw this with:
data base –> data-base –> database
I keep seeing data types spelled both ways (and never as data-type).
Ted Codd used DATA TYPE in his 12 Rules for a Relational Database Product.
Embarcadero ER/Studio and CA ERwin Data Modeler use DATATYPE in their products and occasionally use DATA TYPE in their help or documentation.
Oracle uses both spellings in their documentation. Microsoft sticks heavily to DATA TYPE.
Twitter polls last for 24 hours and not all clients can see or vote on them. So consider this more of a fun question on social media.
How do you spell this concept? Are there other words you find with a variety of spellings?
What could that datatype possible mean?
Read my blog post on an ER/Studio secret over on Embarcadero’s Community site and find out.
Today is the general availability release date for the newest version of SQL Server, aptly named SQL Server 2014. I’m excited about many of the new features being rolled out today, but the ones that will impact data architects, modelers and database designers are the new datatypes that will be introduced. But first, for those of you who have their heads stuck in the deep piping and spit-managing of databases, some background about datatypes:
A datatype is a categorization of data items, based on the range and types of data that it can contain and a set of actions that can be validly taken against that data.
As such, applying a datatype to a column in a database makes it work as another type of constraint. A tinyint column can’t hold my Starbucks name (Kitty) because it constrains the values to integers and only a subset of all integers, for example.
The number and type of datatypes (yes, I’m being meta there) varies depending on the strength and quality of the tequila the DBMS product management teams were drinking at their last
Vegas Blow Out team building retreat, as called for in the ISO Standards for databases, AKA
ISO/IEC JTC 1/SC 32 – Data management and interchange.
One of the things that developers and DBAs will tell you is that choosing the right datatype is important for performance reasons. And by that, they mean the smallest datatype you can fit most of the data in. And maybe a bit smaller. Soooo much bad info out there, I know. When Knowledge Conquers Fear, we can love our data. Thank the Cosmos you have me here to help you out.
What’s new in SQL Server 2014: A New Datatype
This new datatype is exciting for me as a data & space enthusiast. The new feature finally allows modern database designers to properly specify the constraints for tracking time and location data in the same column. Yes, this means that your developers and DBAs no longer have to use comma-delimited values in their relational database designs when they need to track how much time and personal space they need to get up to speed on professional database design. And it’s big enough to store that many man-hours. Yeah. I said that.
BTW, it seems that Stack Overflow is *the* place to find info on how to implement comma-delimited values in database columns. Kids, don’t get your database design knowledge from random forums on the Internet.
Anyway, back to the news!
The new feature makes so much sense with Microsoft’s push to the Cloud, it’s embracing of NoSQL technologies and all. It’s AWESOME.
Defines a time and location in a universe.
SQL Server 2014
spacetime [(fractional seconds precision)], (universe, 5DGeometry)
CREATE TABLE Table1 ( Column1 spacetime (1000, 2014.12.0.2000.8
–∞ to +∞ and beyond
[you’ll need a 5D monitor to view this range.]
Timezone offset range
Thank Venus, no, nope, never. We are scientists here. Use Multiuniversal Universal Time Coordinates (UTMC).
Daylight saving aware
Oh, for Carl’s sake. Do you really think something like spacetime needs to be sullied by DST?
If you have to ask, you don’t ever need to use this datatype. Seriously.
+/- 10 Plancks. Depending on how far your server is from the Sun. Earth’s Sun, that is.
Microsoft Azure DB Support
|Yes, of course. But only in Premium plans and higher.|
Special Considerations and Gotchas
Some gotchas with this new datatype:
- Due to the highly multi-dimensional, multiuniversal nature of this datatype, there isn’t any backwards compatibility. Unless, of course, you can fold spacetime and go back and change earlier versions of SQL Server. But if you could do that, you wouldn’t be reading my blog, would you?
- Just like the confusion over timestamps, you can’t really treat this like a date or time datatype. It’s special. And spatial.
- This means you can’t convert it to date, time, datetime, timestamp or spatial datatypes, either.
- The 5D geometry thing is way too complex to explain in a single blog post. But for those of you that managed to stick it out through some college level math, it involves parsecs (the correct usage of the term) and the double declining balance method of space depreciation. In this first rollout of spacetime, the geometry completely ignores most OctoDeca Bands. Except for Miller tracks.
- You can’t use normal date and geometrical math on data in the columns. You can bend or fold the values, but since space has no center, and time has no beginning or end, spacetime has no beginning or end. It is infinite. So the usual infinity rules apply.
- This datatype is only available via O365, but that makes sense since as announced today, SQL Server 2014 is also only available via O365 subscriptions.
- This datatype is only available at O365 plans at U3 and higher. Wait, I don’t think I should have said anything about the new Universe O365 plans. Forget I said anything. That’s probably not going to be a rule in our universe. Seriously. No NDA broken. I think.
Some of this post may have been inspired by some bad veggie April Fish (poisson d’avril) I had last night. If you want to get some real information about the new features of SQL Server 2014, you probably shouldn’t read random blogs on the internet on launch day. Especially when it’s 1 April.
Did you catch all the special references in this post? Let me know.
Out of the box, CA ERwin Data Modeler has a default datatype of char(18). This default is used when you create an attribute or column in a data model and don’t specifically assign a datatype to that attribute.
I don’t know this for a fact, but I’m guessing the developers of ERwin chose this odd length so it would stand out as being a bit odd, reminding modelers to choose a “real” datatype and length at some point. You can see in the figure below that I’ve added an attribute named test to the MOVIE entity, but did not assign a datatype, so ERwin gave it the default datatype of char(18).
However, one the problems with this is that it is still not odd enough. One of the tips I picked up while speaking at the BeNeLux Modeling User Group was that you can set the default datatype in ERwin to something especially funky, char(-1). That datatype is certain to:
- Stand out in both the data model diagram and any DDL
- Not generate valid DDL
- Fail any DDL testing
- Make a DBA really mad
- Motivate database designers to set real datatypes
Sounds to me like the perfect datatype for being a default.
Defaults in ERwin are model-specific, meaning that you set them on a .erwin file basis. To do this in ERwin 7.3, you must first open your model, then go to MODEL, MODEL PROPERTIES:
This will bring up the properties dialog. Choose the DEFAULTS tab.
In the Default datatype section above, you can set the default datatype to char(-1) or whatever makes sense for you project. I’ve seen some people make a good guess at the most generic datatype, something like varchar(50), but I prefer to assign an oddball datatype so that I can find these “unassigned” attributes and columns easily. Note that you can also set separate defaults for the logical and physical models. Since I’m choosing to set the default to an invalid default, I set them the same.
Having said this, I rarely create attributes manually. I usually create attributes via the domain approach, where domains have a “best guess” standard datatype that makes sense for the type of data contained in them. For instance, a domain for email address would normally be varchar(254) in a model, since that is the Internet standard for email addresses. However, attributes do get created without a domain all the time, so it makes sense to have a default datatype that meets your needs.
Having set my default datatype in my model to char(-1), when I create a new attribute and don’t assign a datatype it will be set to the default as shown in the MOVIE entity below:
The key to using an invalid datatype is to ensure that you’ve circled back and set them all to a real datatypes and lengths before you derive a new model from them, generated DDL, or compared to a database.
You can use a report or the ERwin Query Tool to find all the attributes that have this datatype:
More on the ERwin Query Tool in another blog post soon.
Subscribe via E-mail
- September 2016
- August 2016
- June 2016
- May 2016
- April 2016
- March 2016
- February 2016
- January 2016
- December 2015
- November 2015
- September 2015
- July 2015
- June 2015
- May 2015
- April 2015
- March 2015
- February 2015
- January 2015
- December 2014
- November 2014
- October 2014
- August 2014
- July 2014
- June 2014
- May 2014
- April 2014
- March 2014
- February 2014
- January 2014
- December 2013
- November 2013
- October 2013
- September 2013
- August 2013
- July 2013
- June 2013
- May 2013
- April 2013
- March 2013
- February 2013
- January 2013
- December 2012
- November 2012
- October 2012
- September 2012
- August 2012
- July 2012
- June 2012
- May 2012
- April 2012
- March 2012
- February 2012
- January 2012
- December 2011
- November 2011
- October 2011
- September 2011
- August 2011
- July 2011
- June 2011
- May 2011
- April 2011
- March 2011
- February 2011
- January 2011
- December 2010
- November 2010
- September 2010
- August 2010
- July 2010
- February 2009