Why MySQL’s (SQL) DATETIME can and should be avoided

This post was written by eli on March 15, 2009
Posted Under: MySQL, Software

SQL DATETIME sucks

MySQL, among other databases, has a column type called DATETIME. Its name seems to mislead people into thinking that it’s suitable for storing time of events. Or suitable for anything.

This is a general SQL thing, by the way, but I’ll demonstrate it on MySQL.

I often find this column type in other people’s database schemas, and I wonder if the designer gave it a thought before using it. It’s true, that in the beginning it looks simple:

mysql> CREATE TABLE stupid_date ( thedate DATETIME, PRIMARY KEY (thedate) );
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO stupid_date(thedate) VALUES ( NOW() );
Query OK, 1 row affected (0.03 sec)

mysql> SELECT * FROM stupid_date;
+---------------------+
| thedate             |
+---------------------+
| 2009-03-15 14:01:43 |
+---------------------+
1 row in set (0.00 sec)

That was way too cute, wasn’t it? We also have the NOW() function, which fits in exactly, and puts the current time! Yay! Yay! And if the timestamp looks old-fashioned to you, I suppose there is a reason for that.

But wait, there are two major problems. The first one is that the time is given in the host’s local time. That was fair enough before the internet was invented. But today a web server can be a continent away. DATETIME will show you the local time of the server, not yours. There are SQL functions to convert timezones, of course. Are you sure that you want to deal with them? What happens when you want to move your database to a server in another timezone? What about daylight saving time? Local time is one big YUCK.

(Update: As commented below, the real stupidity is to use NOW(), and not UTC_TIMESTAMP(). The latter gives the UTC time, as its name implies)

Problem number two: Most applications don’t care what the absolute time is. The current time is commonly used to calculate how much time has elapsed since a certain event. To filter elements according to if they were X seconds before now. Is the user still logged in? Has 24 hours elapsed since the last warning email was sent? And so on.

“Solution”: The SQL language supplies a variety of COBOL-like functions to calculate whatever we can ask for. And also an opportunity to get things horribly wrong, because the SQL statement became way too complicated.

Use POSIX time() instead

Sorry, didn’t mean to scare you off. It’s really simple: Any modern operating system, even Windows, will readily supply you with the number of seconds since January 1, 1970, midnight, UTC (that is, more or less GMT). This is also called “seconds since the Epoch” or “UNIX time”.

No matter where the computer is, what timezone it uses or what programming language you’re using, this simple integer representation will show the same number at any given moment.

You can, in fact, obtain this number from MySQL directly:

mysql> SELECT UNIX_TIMESTAMP(thedate) FROM stupid_date;
+-------------------------+
| UNIX_TIMESTAMP(thedate) |
+-------------------------+
|              1237118503 |
+-------------------------+
1 row in set (0.00 sec)

This means, that 1237118503 seconds elapsed since the Epoch (which is a global time point) until 14:01:43 in Israeli LOCAL time of the day I wrote this post. So now we have an integer number to work with, which is handy for calculations, but things will still get messy if we try to move the database to another server.

Store the number instead

If we are interested in working with integers, why not store the integer itself in the database? We could go:

mysql> CREATE TABLE smart_date ( thedate INTEGER UNSIGNED, PRIMARY KEY (thedate) );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO smart_date(thedate) VALUES (1237118503);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM smart_date;
+------------+
| thedate    |
+------------+
| 1237118503 |
+------------+
1 row in set (0.00 sec)

That wasn’t very impressive, was it? The first question would be “OK, how do I get this magic number, now that I don’t have the NOW() function?”

The short and not-so-clever answer is that you could always use MySQL’s UNIX_TIMESTAMP( NOW() ) for this. The better answer is that no matter which scripting or programming language you’re using, this number is very easy to obtain. I’ll show examples below.

As for the magnitude of this number, yes, it’s pretty big. But it will fit a signed 32-bit integer until year 2038. I presume that nobody will use 32-bit integers by then.

And finally, one could argue that DATETIME is convenient when reading from the database directly. True. But for that specific issue we have the FROM_UNIXTIME() function:

mysql> SELECT FROM_UNIXTIME(thedate) FROM smart_date;
+------------------------+
| FROM_UNIXTIME(thedate) |
+------------------------+
| 2009-03-15 14:01:43    |
+------------------------+
1 row in set (0.00 sec)

And again, this is given in the computer’s local time. Which is fine, because it’s intended to be read by humans. In particular, humans who easily translate time differences between their server and themselves.

Obtaining Epoch time

Just to prove that it’s easy to know what the “Epoch time” is in any language, here are a few examples. Wherever it’s really simple, I’m showing how to convert this format to human-readable format.

In Perl:

print time();
print scalar localtime time(); # Local time for humans

In PHP:

<?php
echo time();
echo date('r', time() ); // Local time for humans
?>

In Python:

from time import time;
print time();

(note that the time is returned as a float number with higher precision)

In C:

#include <time.h>
#include <stdio.h>

int main () {
  int now = time(NULL);

  printf("%d seconds since the Epoch\n", now);
  return 0;
}

In JavaScript:

<script language="JavaScript" type="text/javascript">
now = new Date();
alert( now.getTime() / 1000 );
</script>

In this case, the time is shown with a fractional resolution.

The JavaScript example is not really useful for a database application, because the time is measured at the computer showing the page. In a website application, this is just anybody’s computer clock, which may be wrong. But it’s yet another example of how this time representation is widely available.

Conclusion

Drop those DATETIME columns from your tables, and use a simple, robust and handy format to represent time. Don’t let the database play around with a sensitive issue like time, and don’t risk getting confused by different functions when calculating time differences. Just because the DATETIME column type exists, it doesn’t mean there is a reason to use it.

Enjoy the database on what it’s best at: Storing and collecting information.

Reader Comments

Hi,

I read this article and although I find you have some valid remarks there I cannot shake off the impression that you are overlooking some aspects.

- NOW() vs. UTC_TIMESTAMP()

In case you really want to be independant of the timezone I suggest NOT using NOW() as that is localtime. UTC_TIMESTAMP() is what you need to use. It makes no sense saying that the webserver is half a globe away and still use NOW in those examples. Maybe you know this, maybe not but it might be worth mentioning as you focus on timezones.

- Mysql Partitioning schema’s. TIMESTAMP just doesn’t work (as in mysql 5.1) for partitioned tables at the moment, the engine isn’t optimized voor BETWEEN type queries. When using DATETIME and TO_DAYS function you can partition on that collumn. So dropping DATETIME from schema’s isn’t always feasable nor desirable.

But thanks for sharing the different insight there..

Glenn

#1 
Written By Glenn Plas on May 26th, 2009 @ 16:30

Thanks for that remark. I added a small remark about UTC_TIMESTAMP() in the post, which indeed solves the time zone issue.

As for partitioning, that’s beyond me (for now, at least).

#2 
Written By eli on May 26th, 2009 @ 20:56

Sorry, I don’t quite follow. I thought DATETIME, TIMESTAMP, and INT only differ in the way they store data; what you put into them depends on your usage and not the table structure syntax.

In other words, couldn’t I create/use a DATETIME field and on every new record insert UTC_TIMESTAMP() instead of NOW()?
If that is the case, the real issue is storing universal time vs local time, isn’t it?

Thanks.

#3 
Written By Michael B on May 28th, 2009 @ 19:54

Yes, you can create a DATETIME to store UTC time. To be honest, I wasn’t aware of UTC_TIMESTAMP() when I wrote the post.

The bottom line is that nothing stops you from using DATETIME to store UTC time, and nothing stops you from displaying Epoch time as something meaningful within MySQL.

My main point still remains: There is nothing to DATETIME which gives it an advantage over just Epoch time, but somehow I feel safer when time is stored as a plain integer (and I’m sure MySQL won’t think about fiddling with it).

Except for the last remark about partitioning, which I can’t comment much about.

#4 
Written By eli on May 28th, 2009 @ 20:10

Hi, I’m back. I can see why some people would prefer to store dates in INT formats (Unix Timestamps) because it could simplify some calculations and data conversions from one medium to another.
But I looked at the MySQL site and realized 2 important limitations (right now) with using INT or TIMESTAMP that doesn’t occur with DATETIME. A valid range for UNIX Timestamp (right now) is 1970 through ‘2038-01-09 03:14:07′. That is because it is stored as a 32 bit value. So, you might not be able to use it if you were writing some historical type of application, such as dealing with historical birth dates or astrological events (which may need granularity to the second). Also, you couldn’t deal with dates past 2038, until they change the format to 64 bits.
On the other hand, using TIMESTAMPS saves half the amount of space (only 4 bytes as opposed to DATETIME’s 8 bytes) which could speed up calculations and record getting.

So, in summary, DATETIME should be avoided unless you need to store historical or far future dates right now.

Source: http://dev.mysql.com/doc/refman/5.0/en/datetime.html

#5 
Written By Michael B on May 29th, 2009 @ 17:53

The 32 bit limitation is indeed an issue. Still, year 2038 occurs when the epoch time reaches 2^31-1, which is the maximum for a *signed* 32-bit integer.

And given the invasion of 64-bit computers, I would expect the UNIX standard to shift pretty soon.

Until then, if your database deals with pension funds, stick to DATETIME. ;)

#6 
Written By eli on May 29th, 2009 @ 18:02

Hi Eli, I think your post is has a misleading subject. As you have already noticed, DATETIME can and should be used, depending on the situation. Also, the issue you report regarding time zone effect you will also run through it when you use TIMESTAMP, as it’s also timezone affected (thus meaning that the Epoch (UTC) offset will still be affected by time zone conversion).

There are also performance considerations to bear in mind when choosing the datatype:

Check this out:

http://gpshumano.blogs.dri.pt/2009/07/06/mysql-datetime-vs-timestamp-vs-int-performance-and-benchmarking-with-myisam/

This was a test I made to study performance on a “from human readable to database” and the opposite. As you’ll see, DATETIME beats TIMESTAMP like ‘always’ for large datasets.

Hope that helps, and maybe you could fix the subject :)

#7 
Written By Nuno Tavares on September 7th, 2009 @ 13:20

Hello.

Thanks for your comment. The title isn’t misleading, it reflects my opinion. ;)

I suggested to obtain the time in the script, and pass it to the database as an integer. The benchmark you pointed at is interesting, but it shows that INT is slower when the database itself calculates the Epoch time for each line it works on. It also shows that INT is faster when the epoch time is precalculated, which is the realistic case.

#8 
Written By eli on September 7th, 2009 @ 13:42

אחי אחלה פוסט!

#9 
Written By gal on November 3rd, 2009 @ 13:26

Hey, nice post, just looking into this myself.
I was wondering the same things about the timestamp format.
Reading this I’m definately going the signed int way!
Storing in the database as unixtime and taking out
and manipulating with perl, sounds like the way :)

thanks again

David

#10 
Written By David on January 8th, 2010 @ 16:09

I never tought abaut this aspect of precision before. Thanks a lot for your assististment, to find the right decision for my future one and only date format in databases.
Gino

#11 
Written By Josef G, Knust on January 21st, 2010 @ 04:31

Calling UNIX_TIMESTAMP() without any parameters returns the current time in UNIX time. It’s that easy!

#12 
Written By Tim E on February 9th, 2010 @ 23:40

Yes, I agree for keeping record of a table which acts as a log. However, some DBs won’t allow you to put a hard date into a timestamp. So if you’re doing something like keeping track of the date something is due and the date something ships and the due date is entered by an order processor then you’ve got to use the DATE type.

My point really is that there are no hard and fast rules just personal opinions when it comes to writing software or designing databases.

#13 
Written By Vinnie on February 13th, 2010 @ 18:01

Well, my point *was* that there is one simple fit-all rule, which is: use INTEGER UNSIGNED, period. If the date is not “now”, let the script convert the relevant date to an integer, and again, store it as such.

#14 
Written By eli on February 13th, 2010 @ 18:46

I’m going to have to agree with ElI and say you should not use DATETIME.

Sure you can /try/ to store a UTC date in a DATETIME column, but it’s only PRETEND because mysql is really treating it as local. So if you try to get your time back as an integer in UNIX time by calling UNIX_TIMESTAMP(mydate) it treats the stored date as local and does a local->UTC conversion that screws it up.

#15 
Written By sshock on February 15th, 2010 @ 03:58

First of all - Great post, Thanks!
One question remains unclear to me regarding the DST-
assuming I’m using TIMESTAMP on event start date in my website, please follow the scenario:
1. On the server it’s now 23:00 and someone is adding an event for tomorrow at 13:00.
2. at 2:00 on server time, due to DST, the time is going back an hour to 1:00.
3. our ’someone’ is going to his calendar and checking his event.
-What will he see? will it be 12:00, 13:00 or 14:00?
Thanks

#16 
Written By Guy on March 8th, 2010 @ 19:16

Hi, I’ve written extensively about mysql datetime and timestamp columns on my blog and long story short, while I think you raise some interesting points, I couldn’t disagree with you more. At no point did you acknowledge the huge advantage the intrinsic types have in terms of basic readability or date range queries.

Comparing a unix timestamp to a datetime is very much an apples to oranges comparison — an integer is 32 bits, whereas a datetime is an 8 byte column. With the extra storage, you can represent date values far beyond the range of the unix timestamp.

While not exactly the same, the mysql timestamp is essentially equivalent to an integer storing the unix timestamp, but with the benefits of allowing for the use of mysql date functions.

#17 
Written By David on March 10th, 2010 @ 06:55

Thanks for the article! The pitfalls of DATETIME columns was something that got by me somehow. In my case to avoid massive code changes we’re converting to TIMESTAMP columns. Either way you go, POSIX integers or TIMESTAMP columns the potential evil of using DATETIME columns is something nobody should ignore. For anyone who plans on converting to TIMESTAMP, make SURE you read the manual carefully about setting defaults, especially when you want either nullable columns or if you don’t want the TIMESTAMP to change when the record is updated.

One thing I didn’t see mentioned here is the fact that a DATETIME stored based on the servers current time zone, unless I’m missing something, CANNOT be reliably converted to UTC. At the point that clocks get turned back you end up with a number of local times that can literally represent more than one UTC time. If I’m wrong, someone correct me, but I can’t imagine that this is possible…ugly indeed.

#18 
Written By Tom on July 20th, 2010 @ 17:42

Hi everyone!
I’ve recognized that some of you are worried about the 32bit-thing. What if you simply take away the website’s int start time (because we don’t want negative integers, do we), from every current time you store. Then you have a much smaller int to work with, and still you can store the start time somewhere in your scripts or in mysql.
Start time: the date when the server begin to run in this aspect.

Wish it’s understandable, it’s late at night ;)

#19 
Written By mcchicken on August 24th, 2010 @ 01:15

I still don’t see the point in this post. I use epoch a lot, but there is nothing here to convince me that I should always use it instead of datetime.

#20 
Written By Andrew McGregor on August 25th, 2010 @ 12:23

Trackbacks

Add a Comment

required, use real name
required, will not be published
optional, your blog address