Having a clear audit trail is a must-have for any financial application. With an ever-increasing number of corporations under pressure for stricter corporate governance and compliance to the multitude of acts, such as Sarbanes-Oxley, HL7, and ISO-17799 to name a few, just about every development project to come down the pipeline includes some type of auditing requirement.
In this article, I will demonstrate a strategy to implement auditing that allows for easy snapshots at a table level. Our goal will be to track all changes to data, including inserts, updates, and deletes on the table, in a manner that allows for us to easily recreate a snapshot of the table at a particular time in the past. This particular solution can be used on any existing table with data or a brand new table. My particular implementation makes use of two simple triggers. The choice to use triggers or stored-procedures with the auditing code is entirely up to you. I chose to use triggers as I then don’t have to update any existing stored-procedures or update my application code to use new ones. It also ensures that regardless of how the table is being updated, an audit trail will be maintained.
Scenario
We have the following table that is being used by a resource management application.
1 CREATE TABLE [dbo].[tbWorkOff]
2 (
3 [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
4 [SubJobID] [int] NOT NULL,
5 [DisciplineID] [int] NOT NULL,
6 [PayPeriod] [datetime] NOT NULL,
7 [Hours] [decimal](18, 0) NULL,
8 [HoursPerPeriod] [decimal](18, 0) NULL,
9 [WorkOffTypeID] [int] NOT NULL
10 )
Business has put forth a requirement to be able to view the state of the table at any date and time in the past. At present, this is being accomplished by backing up the database on a nightly basis and then recovering the database from the backup done on the particular date.
Step 1
First, we will add an EffectiveDate column to tbWorkOff. We will set a DEFAULT constraint on this column with a value of GETDATE(). This means that each row inserted in this table will get the current datetime for its EffectiveDate.
1 ALTER TABLE [dbo].[tbWorkOff]
2 ADD [EffectiveDate].[DATETIME] NULL CONSTRAINT [DF_tbWorkOff_EffectiveDate] DEFAULT (GETDATE())
Next, we will run an UPDATE statement to initialize all existing rows with an EffectiveDate.
1 UPDATE [dbo].tbWorkOff]
2 SET [EffectiveDate] = '2007-01-01 12:00:00 AM'
Finally, we will update the EffectiveDate column to not allow null values. This will ensure that there is always an EffectiveDate present for each row.
1 ALTER TABLE [dbo].[tbWorkOff]
2 ALTER COLUMN [EffectiveDate].[DATETIME] NOT NULL
Step 2
Next, we will add a History table to audit tbWorkOff. This table has all the columns of the table to audit plus a HistoryID(identity) and an EndDate(datetime) column. The EndDate column has a default constraint with a value of GETDATE().
1 CREATE TABLE [dbo].[tbWorkOffHistory]
2 (
3 [HistoryID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
4 [ID] [int] NOT NULL,
5 [SubJobID] [int] NOT NULL,
6 [DisciplineID] [int] NOT NULL,
7 [PayPeriod] [datetime] NOT NULL,
8 [Hours] [decimal](18, 0) NULL,
9 [HoursPerPeriod] [decimal](18, 0) NULL,
10 [WorkOffTypeID] [int] NOT NULL,
11 [EffectiveDate] [datetime] NOT NULL,
12 [EndDate] [datetime] NOT NULL CONSTRAINT [DF_tbWorkOffHistory_EndDate] DEFAULT (GETDATE())
13 )
Now that we have our tbWorkOffHistory table in place, our objective is straight forward. Here are the three scenarios:
1. Insert new row in tbWorkOff:
The EffectiveDate column will be set to the current datetime.
2. Update existing row in tbWorkOff:
In this case, we must copy the row to be updated to tbWorkOffHistory where the EndDate column will get set to the current datetime. This allows for the row state before the update to be captured in tbWorkOffHistory and the EffectiveDate and EndDate columns tell us the lifetime of this row in tbWorkOff. Next, we update the row in tbWorkOff with the new values and provide the current datetime as the EffectiveDate.
3. Delete existing row in tbWorkOff:
We perform the exact same steps as 2 above, except this time we delete the row from tbWorkOff.
So how do we do this with minimum impact on the existing system? Answer: With two AFTER triggers like so:
AFTER-UPDATE TRIGGER:
1 CREATE TRIGGER [dbo].[trg_AfterUpdateTBWorkOff]
2 ON [dbo].[tbWorkOff]
3 AFTER UPDATE
4 AS
5 BEGIN
6 SET NOCOUNT ON;
7
8 INSERT INTO tbWorkOffHistory
9 (ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate)
10 SELECT ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate
11 FROM DELETED
12
13 UPDATE tbWorkOff SET EffectiveDate = GETDATE()
14 WHERE ID IN (SELECT ID FROM DELETED)
15
16 END
AFTER-DELETE TRIGGER:
1 CREATE TRIGGER [dbo].[trg_AfterDeleteTBWorkOff]
2 ON [dbo].[tbWorkOff]
3 AFTER DELETE
4 AS
5 BEGIN
6 SET NOCOUNT ON;
7
8 INSERT INTO tbWorkOffHistory
9 (ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate)
10 SELECT ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate
11 FROM DELETED
12
13 END
Step 3
Good job so far. At this stage you can verify that auditing is taking place by adding, updating, and deleting rows in tbWorkOff and watch the entries get created in tbWorkOffHistory. Now the fun part is getting the auditing information out in various formats. I’ve created a base UDF that can be used to retrieve a snapshot of the table at a certain datetime in the past. I chose to use a multi-valued UDF over a stored procedure since the UDF can be used in Select statements. This UDF will return you all the rows that were in the table at the specified datetime with the values at that datetime regardless of whether rows have been added, updated, or deleted from the table since then. You can easily customize the UDF for your own tables by renaming the table names and the structure of the returning table. The UDF is undoubtedly complex and if you can think of a more elegant solution…I’m all ears.
1 CREATE FUNCTION dbo.udf_tbWorkOffSnapShot(@EffectiveDate DATETIME)
2 RETURNS @tbWorkOffSnapShot TABLE
3 (
4 [ID] [int] PRIMARY KEY NOT NULL ,
5 [SubJobID] [int] NOT NULL,
6 [DisciplineID] [int] NOT NULL,
7 [PayPeriod] [datetime] NOT NULL,
8 [Hours] [decimal](18, 0) NULL,
9 [HoursPerPeriod] [decimal](18, 0) NULL,
10 [WorkOffTypeID] [int] NOT NULL,
11 [EffectiveDate] [datetime] NOT NULL
12 )
13 AS
14 BEGIN
15 WITH SnapShot(ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate)
16 AS
17 (
18 --Return rows that have not been inserted or modified since the EffectiveDate passed in
19 SELECT ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate
20 FROM [tbWorkOff]
21 WHERE EffectiveDate <= @EffectiveDate
22 UNION
23 --Retrieve the row values for rows that have been modified or deleted
24 SELECT ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate
25 FROM tbWorkOffHistory history1
26 WHERE ID NOT IN (SELECT ID
27 FROM tbWorkOff
28 WHERE EffectiveDate <= @EffectiveDate)
29 AND EffectiveDate = (SELECT MAX(EffectiveDate)
30 FROM tbWorkOffHistory history2
31 WHERE history1.ID = history2.ID and EffectiveDate <= @EffectiveDate)
32 AND HistoryID NOT IN (SELECT HistoryID
33 FROM tbWorkOffHistory history1
34 WHERE ID NOT IN (SELECT ID
35 FROM tbWorkOff)
36 AND EndDate = (SELECT MAX(EndDate)
37 FROM tbWorkOffHistory history2
38 WHERE history1.ID = history2.ID)
39 AND EndDate <= @EffectiveDate)
40 )
41 INSERT @tbWorkOffSnapShot
42 SELECT ID, SubJobID, DisciplineID, PayPeriod, Hours, HoursPerPeriod, WorkOffTypeID, EffectiveDate
43 FROM SnapShot
44 RETURN
45
46 END;
Summary
In short, with a few small tweaks, we can implement an auditing strategy for any table. For highly transactional systems with a lot of updates, the history table can grow pretty fast. In that case, one can implement a policy to delete rows with EndDate prior to x years in the past. If a requirement is present to maintain all data regardless of it’s age, as in line-of-business financial applications, then one can utilize partitioned views and distribute the history data over multiple tables where each table may hold historical data for a particular year.