"Time is that great gift of nature
which keeps everything from happening at once."--C.J. Overbeck(?)
All
but the most rudimentary database applications store one or more expressions of
time. They may record when an event occurred, its duration, the expected
occurrence of a future event, or a combination of these. The values themselves
are frequently less important than the calculations that depend on them: How
much did we sell this month? How many hits did we get today? Is that hotel room
available two weeks from now? How long before this asset is fully depreciated?
SQL
Server has respectable facilities for storing, calculating, and reporting time
and date information. If they do not meet your needs, the server is flexible
enough to handle nearly any system you improvise.
This
article won't discuss SQL Server's online analytical processing (OLAP)
services. Regardless, these services almost always require time values obtained
from an online transaction processing (OLTP) database. It will also omit the extensive
application of time in administering SQL Server.
Out of the Box
SQL
Server has four basic capabilities with regard to time values: Storage and
retrieval; calculations and comparisons; formatted display; and conversion to
and from other data types.
System data types
SQL
Server provides two time data types: datetime and smalldatetime. They are
distinguished by their range and precision, and corresponding storage
requirements. Both reckon dates as the number of days either from or to January
1, 1900, the base date. Both reckon time as
the period elapsed since midnight. Datetime and smalldatetime combine date and
time into a single value--there is no distinct time data type in SQL Server.
You can store only time values by omitting dates, or vice versa.
Datetime stores the range
January 1, 1753, to December 31, 9999 with one three-hundredth of a second
accuracy--it rounds to the nearest .000, .003, or .007 seconds. It requires
eight bytes, four for the date, and four for the time
Smalldatetime covers the range
January 1, 1900, through June 6, 2079 with one-minute resolution. It requires
four bytes, two for the date, and two for the time.
CREATE TABLE #T(
id INT IDENTITY,
dt DATETIME NOT NULL DEFAULT GETDATE(),
sdt SMALLDATETIME NOT NULL DEFAULT GETDATE())
GO
INSERT #T VALUES(DEFAULT,DEFAULT)
INSERT #T VALUES('6/24/2001 7:05:01.544', '6/24/2001 7:05:01.544')
INSERT #T VALUES('6/24/2001', '6/24/2001')
INSERT #T VALUES('6/24/01', '6/24/01')
INSERT #T VALUES('6-24-2001','6-24-2001')
INSERT #T VALUES('20010624', '20010624') -- this is ISO/international format
INSERT #T VALUES('June 24, 2001', 'June 24, 2001')
INSERT #T VALUES('7:05:01.544', '7:05:01.544')
INSERT #T VALUES('7:05:01.544 PM', '7:05:01.544 PM')
GO
SELECT * FROM #T
On
my 7.0 system, I got the following results:
Example 2:
id dt sdt
--- --------------------------- ---------------------------
1 2001-06-24 07:36:56.910 2001-06-24 07:37:00
2 2001-06-24 07:05:01.543 2001-06-24 07:05:00
3 2001-06-24 00:00:00.000 2001-06-24 00:00:00
4 2001-06-24 00:00:00.000 2001-06-24 00:00:00
5 2001-06-24 00:00:00.000 2001-06-24 00:00:00
6 2001-06-24 00:00:00.000 2001-06-24 00:00:00
7 2001-06-24 00:00:00.000 2001-06-24 00:00:00
8 1900-01-01 07:05:01.543 1900-01-01 07:05:00
9 1900-01-01 19:05:01.543 1900-01-01 19:05:00
Note
the difference in precision, and the effect of rounding. Also note the various
means of populating the columns. The default value uses the GETDATE() function; the
remaining rows use a variety of character strings enclosed in single quotes.
The
server will transparently convert properly formatted strings into their correct
datetime values. These examples do not cover the full scope of its
capabilities. Be warned that formatting of string constants is very particular
if the database must support multiple languages. See BOL for
internationalization information. Also pay attention to the settings for DATEFORMAT and DATEFIRST.
With
regard to two-digit years, the server uses the two digit year cutoff setting
specified by sp_configure or the Enterprise
Manager. The default is 2049--'6/24/49' is equivalent to '6/24/2049,' and
'6/24/50' is equivalent to '6/24/1950.' There is little rationale for using
two-digit years, and they should be avoided in the interest of clarity and
consistency.
Conversion and Formatting
Implicit Conversion
SQL
Server implicitly converts datetime and smalldatetime both among
themselves and among the various char data types. I've already shown
conversions from char. Things become interesting
when mathematical operators are involved.
--This works fine
DECLARE @c CHAR(20)
SET @c = GETDATE()
SELECT 'The time is now ' + @c
-- This doesn't
SELECT 'The time is now ' +
GETDATE()
The
first example converts a datetime (smalldatetime , actually) to a char local variable. The server can then
concatenate the variable with a string constant and announce the time.
The
second example fails because the server considers the overloaded plus sign an
addition operator, not a concatenation operator, and is trying to add two
incompatible types. The way around this is to use the CAST function to convert the datetime to a string.
SELECT 'The time is now ' +
CAST(GETDATE() AS CHAR)
CAST
CAST has two uses as
regards dates: It explicitly converts converts the date data types to and from
most of the other system data types, and it changes the output's length,
usually to truncate it.
SELECT 'Today is ' + CAST(GETDATE()
AS CHAR(11))
You
can also use CAST to parse a text
string into a time datatype.
DECLARE @c CHAR(20), @d DATETIME
SET @c = '06242001073656910'
SET @d = CAST(SUBSTRING(@c,5,4) +
+ SUBSTRING(@c,1,4) + ' '
+ SUBSTRING(@c,9,2) + ":"
+ SUBSTRING(@c,11,2) + ":"
+ SUBSTRING(@c,13,2) + "."
+ SUBSTRING(@c,15,3)
AS DATETIME)
SELECT @d
CAST has minimal
formatting capability and isn't supported prior to 7.0. Unless you need ANSI
compliance, consider using the CONVERT function instead.
CONVERT
CONVERT provides a number of
styles corresponding to prevailing date format of each of a number of
countries. Most styles allow either two or four digit years.
--CONVERT
(data_type[(length)],expression [, style])
--See BOL for list of styles
DECLARE @d DATETIME
SET @d = '20010624 21:48'
SELECT NULL, CONVERT(VARCHAR,@d)
SELECT 1, CONVERT(VARCHAR,@d,1)
SELECT 101, CONVERT(VARCHAR,@d,101)
SELECT 2, CONVERT(VARCHAR,@d,2)
SELECT 104, CONVERT(VARCHAR,@d,104)
SELECT 108, CONVERT(VARCHAR,@d,108)
-- time only
SELECT 112, CONVERT(VARCHAR,@d,112)
SELECT 121, CONVERT(VARCHAR,@d,121)
This
gives the following output on my system:
NULL Jun 24 2001 9:48PM
1 06/24/01
101 06/24/2001
2 01.06.24
104 24.06.2001
108 21:48:00
112 20010624
121 2001-06-24 21:48:00.000
Like
CAST, CONVERT can truncate values
if desired. Just use the appropriate data type; i.e., CHAR(4), and style. You can
also nest CONVERT. The following
strips the time off a datetime and converts it back
to datetime.
SELECT
CONVERT(DATETIME,CONVERT(CHAR(8),GETDATE(),112))
To
put it mildly, there are large number of ways to handle conversion and
formatting. You do an implicit or explicit conversion, and then use one or more
functions that handle that data type. You'll find some combinations run much
faster than others. For instance, the above example could be rewritten as
follows. The rewrite runs more than twice as quickly on my single user system.
SELECT
CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))
Date Functions
SQL
Server has a good collection of specialized functions for returning parts of
dates. Most are also useful in calculations. The functions are DATENAME, DATEPART, DAY, MONTH,
and YEAR. DATENAME returns a character
string; the rest return integers. See the BOL for a full list of supported date
parts and their abbreviations.
DECLARE @c DATETIME
SET @c = '20010624 07:36:56.910'
SELECT 'This is Week ' +
DATENAME(WEEK, @c) + ' of FY' + DATENAME(YEAR, @c)
SELECT DATEPART(QUARTER, @c) AS
QUARTER
SELECT DATEPART(YEAR, @c) AS YEAR,
DATEPART(MONTH, @c) AS MONTH,
DATEPART(DAY, @c) AS DAY
SELECT YEAR(@c) AS YEAR, MONTH(@c)
AS MONTH, DAY(@c) AS DAY
This is Week 26 of FY2001
QUARTER
-----------
2
YEAR MONTH DAY
----------- ----------- -----------
2001 6 24
YEAR MONTH DAY
----------- ----------- -----------
2001 6 24
A
few additional notes about conversion and formatting. Watch for unintentional
loss of precision converting among data types. The effect will probably be
subtle.
Datetime and smalldatetime have the highest
precedence of all the data types. If you mix them with other data types using
an operator, the server will attempt implicit conversion of the lower precedence
data types to datetime or smalldatetime as required. If it
can't do the conversion, it returns an error message. See the initial
discussion of CAST.
With
regard to formatting, if you intend to import date values into another program,
keep that program's requirements in mind. In many cases, that means no
formatting at all--just export the data as datetime or smalldatetime values, by casting
if necessary. Otherwise, the program may treat the "dates" as nothing
more than text and fail to sort or group them properly. You'll save yourself
much grief with Excel pivot tables, among other things.
Calculations and Comparisons
Basic Arithmetic
SQL
Server can handle addition and subtraction of time without resorting to
functions, if you want. Use whichever is clearer.
-- Add 1.5 days to current time
SELECT GETDATE() AS Start,
GETDATE()+1.5 AS 'Estimated Finish'
-- Done all your shopping yet?
SELECT
CEILING(CAST('20011225'-GETDATE() AS FLOAT)) AS 'Shopping days left'
Functions
There
are two workhorse time functions, DATEADD and DATEDIFF. As their names
suggest, they add (or subtract) periods to a date, or return the difference
between two dates, respectively.
DATEADD
DATEADD adds the specified
interval to a date and returns a datetime or smalldatetime. You can use any of
the usual collection of date parts.
DECLARE @C SMALLDATETIME
SET @c = '20010624'
SELECT @C AS 'Start'
SELECT DATEADD(S,172800,@c) AS '+172800 seconds',
DATEADD(HH,48,@c) AS '+48 hours',
DATEADD(D,2,@c) AS '+2 days'
SELECT DATEADD(WK,-3,@c)as '-3 weeks'
SELECT DATEADD(M,1,CONVERT(CHAR(8),@c,112))-DAY(@c) as 'End of Month'
SELECT DATEADD(YY,1,'20000229') AS '1 yr. after Feb. 29, 2000'
-- Constants are evaluated at runtime. The following won't work.
SET ARITHABORT OFF -- Set this to ON and run again
SELECT DATEADD(D,1,'20010229') AS '1 day after Feb. 29, 2001'
Start
---------------------------
2001-06-24 00:00:00
+172800 seconds +48 hours +2 days
---------------------------
--------------------------- ---------------------------
2001-06-26 00:00:00 2001-06-26 00:00:00 2001-06-26 00:00:00
-3 weeks
---------------------------
2001-06-03 00:00:00
End of Month
---------------------------
2001-06-30 00:00:00.000
1 yr. after Feb. 29, 2000
---------------------------
2001-02-28 00:00:00.000
1 day. after Feb. 29, 2001
---------------------------
NULL
Arithmetic overflow occurred.
An
interesting use for DATEADD is converting Julian
dates to Gregorian ones. Strictly speaking, a Julian date is the
number of days since noon, Universal Time on January 1, 4713 BCE. Some
applications use Julian dates, but with a different base date. My employer's
accounting system uses 1 A.D. as the base. December 31, 2000 was 730485. Since
the dates we work with are within datetime's range, DATEADD can do the
conversion.
DECLARE @j INT
SET @j = 730636
SELECT @j AS 'Julian',
DATEADD(D,@j-730485,'12/31/2000') AS 'Gregorian'
Julian Gregorian
-----------
---------------------------
730636 2001-05-31 00:00:00.000
DATEADD isn't useful for
business days. It doesn't know about weekends or holidays. Rather than going
through gyrations to avoid Sunday and New Years, many applications build lookup
tables in advance and reference them instead.
DATEDIFF
DATEDIFF returns an integer
representing the number of date parts between two dates. If the start date
falls after the end date, the result is negative. Note that it only counts
boundaries. If you ask it how many years elapsed between December 31, 1999, and
January 1, 2001, it will reply, "2." To rehash some prior examples:
SELECT
DATEDIFF(D,GETDATE(),'20011225') AS 'Shopping days left'
-- Convert from Gregorian to Julian. Must know base date and be in range.
SELECT DATEDIFF(D,'12/31/2000',GETDATE())+730485 as 'Julian'
The
big gotcha with DATEDIFF is date boundaries.
Even calculating age isn't straightforward.
DECLARE @BIRTH SMALLDATETIME, @AGEAT
SMALLDATETIME
SELECT @BIRTH = '12/10/1990', @AGEAT
= '6/24/2001'
-- This is the wrong way
SELECT DATEDIFF(YY,@BIRTH,@AGEAT) AS
'Age using straight DATEDIFF'
-- This is one correct way
SELECT DATEDIFF(YY,@BIRTH,@AGEAT) -
CASE WHEN (MONTH(@BIRTH) = MONTH(@AGEAT) AND DAY(@BIRTH) >
DAY(@AGEAT)
OR MONTH(@BIRTH) > MONTH(@AGEAT))
THEN 1 ELSE 0 END AS 'Age correcting DATEDIFF'
Age using straight DATEDIFF
---------------------------
11
Age correcting DATEDIFF
-----------------------
10
DATEDIFF increments whenever
the year, or whatever date part you specified, changes. This could be useful
for calculating how many model years old a vehicle is, but not helpful when
you're trying to calculate anniversaries. It can trip up experts, as this discussion shows. (If
the link breaks, look for a thread called "Finding anniversary dates - Ken
Henderson Book" on Devdex, Google, etc.) Interestingly, not only is
Henderson's query wrong, so is the "corrected" one propounded. See if
you can understand how the second query works, and why it erroneously omits
January rows once the system date reaches December 2.
Comparisons
SELECT statements to
retrieve rows based on dates are straightforward. The main concerns are string
constants, which were previously discussed, handling combined date and time
values, and query optimization. Following is a simple table and some queries
against it.
CREATE TABLE work_study(
id INT IDENTITY,
r_type CHAR(1) NOT NULL,
r_time SMALLDATETIME NOT NULL)
GO
INSERT work_study
VALUES('A','1/5/2000')
INSERT work_study
VALUES('B','1/5/2000 14:10')
INSERT work_study
VALUES('B','1/5/2000 14:20')
INSERT work_study VALUES('A','2/8/2000')
INSERT work_study
VALUES('B','2/8/2000 6:05')
INSERT work_study
VALUES('B','2/8/2000 6:07')
INSERT work_study
VALUES('A','4/10/2001')
INSERT work_study
VALUES('B','4/10/2001 15:21')
INSERT work_study
VALUES('B','4/10/2001 15:27')
SELECT COUNT(*) as '= April 10,
2001' FROM work_study WHERE r_time = '4/10/2001'
SELECT * FROM work_study WHERE
r_time = '4/10/2001'
SELECT COUNT(*) as 'All April 10,
2001'
FROM work_study
WHERE r_time BETWEEN '4/10/2001' AND
'4/10/2001 23:59:59'
SELECT COUNT(*) AS 'FY2000 Rows'
FROM work_study
WHERE YEAR(r_time) = 2000 and r_type
= 'B'
--Use like to extract times
SELECT r_time as 'Like 6:05' FROM
work_study WHERE r_time LIKE '%6:05%'
SELECT DATENAME(DW,r_time) as
'Weekday', COUNT(*) as 'Count'
FROM work_study
WHERE r_type = 'B'
WHERE r_type = 'B'
GROUP BY DATENAME(DW,r_time)
--Crosstabs: See
http://support.microsoft.com/support/kb/articles/Q175/5/74.ASP
SELECT SUM(CASE WHEN YEAR(r_time) =
2000 THEN 1 ELSE 0 END) AS '2000 Rows',
SUM(CASE WHEN YEAR(r_time) = 2001 THEN
1 ELSE 0 END) AS '2001 Rows'
FROM work_study
WHERE r_type = 'B'
= April 10, 2001
----------------
1
id r_type r_time
----------- ------
---------------------------
7 A 2001-04-10 00:00:00
All April 10, 2001
------------------
3
FY2000 Rows
-----------
4
Like 6:05
---------------------------
2000-02-08 06:05:00
Weekday Count
------------------------------
-----------
Tuesday 4
Wednesday 2
2000 Rows 2001 Rows
----------- -----------
4 2
Some
things to note:
- Unless you never
store time values with your dates, you have to make provisions for
extracting all rows for the specified date.
- String constants
must be correctly formatted for your locale or you will either get errors
or strange results. Using YYYYMMDD should be foolproof.
- You can use LIKE to extract
times. See BOL.
- You can use all
of the usual date functions to do selection, sorting, and grouping.
One
potential big issue with date functions is performance. Date columns should be
indexed if date will used as a search argument. Unfortunately, the server won't
use the index, except perhaps for index scans, once you use a function.
Rewriting queries to avoid functions can lead to cumbersome code, but it's
worth it if the query runs often. In extreme cases, splitting the date parts
into separate, indexed columns, is beneficial.
-- Don't do this often.
SELECT COUNT(*)
FROM employees
WHERE YEAR(hire_date) = 1998
-- Do this instead.
SELECT COUNT(*)
FROM employees
WHERE hire_date BETWEEN '19980101' AND '19981231'
-- Same thing here. Try to avoid this.
SELECT COUNT(*)
FROM employees
WHERE MONTH(hire_date) = 11
-- If it's not too long, this is very effective.
SELECT COUNT(*)
FROM employees
WHERE hire_date BETWEEN '19901101' AND '19901130'
OR hire_date BETWEEN '19911101' AND '19911130'
OR hire_date BETWEEN '19921101' AND '19921130'
OR hire_date BETWEEN '19931101' AND '19931130'
OR hire_date BETWEEN '19941101' AND '19941130'
OR hire_date BETWEEN '19951101' AND '19951130'
OR hire_date BETWEEN '19961101' AND '19961130'
OR hire_date BETWEEN '19971101' AND '19971130'
OR hire_date BETWEEN '19981101' AND '19981130'
OR hire_date BETWEEN '19991101' AND '19991130'
OR hire_date BETWEEN '20001101' AND '20001130'
OR hire_date BETWEEN '20011101' AND '20011130'
I
experimented while writing this article and got a consistent six-fold
improvement using constants instead of functions. Your mileage may vary. The
difference widened a little as the number of rows increased. Don't give up on
functions, though. Queries on indexed columns should run quickly, no matter
what approach you take. It's when you run the query 1,000 times per day that
the milliseconds add up.
Miscellany
Keys
Dates
are probably a bad idea for keys. You're betting heavily that you'll never need
to insert two or more identical values into a column. Granted, datetime's resolution is
1/300th of a second, but why take the bet?
Constraints
Other
than keys, constraints are particularly useful with date columns. Default
constraints can record the time when each row is inserted. Check constraints
can meet business needs, such as prohibiting time values in a column. That way,
a hire date can never be '7/2/2001 10:00.' They can meet legal needs, such as
prohibiting transactions on Sunday.
Summary
Almost
any non-trivial database requires working with date or time values. SQL Server
has a good collection of tools for manipulating them; however, with this power,
comes complexity. The developer should strive to understand the server's
approach to dates so they may get the results and performance they need.
No comments:
Post a Comment