From: Christoffer on 28 Apr 2010 02:36 Hello, one or two of my colleague has managed to break a foreign key in our database, I'd like some help to understand how this have happened. This might be a long story so bear with me :) There are two tables involved in this. The "DataRepository" that holds data that are to be sent to a "DataAddress". Thus the DataRepository has a foreign key to DataAddress. The DataRepository has a couple of columns, but the important ones is Dar_Id (PK, bigint) and the Dar_Daa_Id (FK, bigint). The DataAddres has a Daa_Id (PK, bigint). The foreign key is set between Dar_Daa_Id and Daa_Id. The story is that the DataRepository and DataAddress tables grew quickly in our production systems because a daily clean up job wasn't started. This was discovered when the server was low on disk space. The clean up job was started and ran a stored procedures that did two things: DELETE FROM [DataRepository] WHERE [Dar_Id] NOT IN (SELECT [Meq_Dar_Id] FROM [MessageQueue]) AND [Dar_CreateDate] < DATEADD(hour, -1, getdate()) DELETE FROM [DataAddress] WHERE [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Sender] FROM [DataRepository] WHERE [Dar_Daa_Id_Sender] IS NOT NULL) AND [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Receiver] FROM [DataRepository] WHERE [Dar_Daa_Id_Receiver] IS NOT NULL) The first statement pretty much deletes data from DataRepository that is not in use and is older than one hour. The second deletes data from DataAddress that isn't used be DataRepository. After the clean up job (took about 55 minutes), we ended up with 20 rows in DataRepository that does not exist in DataAddress(!). The rows were created when the clean up job had just finished and were created in less than a minute. I've checked the sys.foreign_keys table and both is_disabled and is_not_trusted is 0. I'm not sure what the next step is, how can this happen? Cheers, Chris
From: John Bell on 28 Apr 2010 09:30 On Tue, 27 Apr 2010 23:36:01 -0700, Christoffer <Christoffer(a)discussions.microsoft.com> wrote: >Hello, one or two of my colleague has managed to break a foreign key in our >database, I'd like some help to understand how this have happened. This might >be a long story so bear with me :) > >There are two tables involved in this. The "DataRepository" that holds data >that are to be sent to a "DataAddress". Thus the DataRepository has a foreign >key to DataAddress. The DataRepository has a couple of columns, but the >important ones is Dar_Id (PK, bigint) and the Dar_Daa_Id (FK, bigint). The >DataAddres has a Daa_Id (PK, bigint). The foreign key is set between >Dar_Daa_Id and Daa_Id. > >The story is that the DataRepository and DataAddress tables grew quickly in >our production systems because a daily clean up job wasn't started. This was >discovered when the server was low on disk space. The clean up job was >started and ran a stored procedures that did two things: > >DELETE FROM [DataRepository] >WHERE [Dar_Id] NOT IN >(SELECT [Meq_Dar_Id] FROM [MessageQueue]) AND [Dar_CreateDate] < >DATEADD(hour, -1, getdate()) > >DELETE FROM [DataAddress] >WHERE [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Sender] FROM [DataRepository] >WHERE [Dar_Daa_Id_Sender] IS NOT NULL) >AND [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Receiver] FROM [DataRepository] >WHERE [Dar_Daa_Id_Receiver] IS NOT NULL) > >The first statement pretty much deletes data from DataRepository that is not >in use and is older than one hour. The second deletes data from DataAddress >that isn't used be DataRepository. > >After the clean up job (took about 55 minutes), we ended up with 20 rows in >DataRepository that does not exist in DataAddress(!). The rows were created >when the clean up job had just finished and were created in less than a >minute. I've checked the sys.foreign_keys table and both is_disabled and >is_not_trusted is 0. > >I'm not sure what the next step is, how can this happen? > >Cheers, >Chris Hi You don't say how the data is inserted! To be consistent the delete statements should be in a single transaction. Posting DDL would be a lot clearer and easier to understand than your description. Example data to re-create the scenario would also be useful. What does not seem to fit is that you are deleting from DataAddress when Daa_id does not match Dar_Daa_id_Sender or Dar_Daa_id_Receiver not Dar_Daa_ID which is where you say the FK defined. John
From: Christoffer on 6 May 2010 10:30 > On Tue, 27 Apr 2010 23:36:01 -0700, Christoffer > <Christoffer(a)discussions.microsoft.com> wrote: > > >Hello, one or two of my colleague has managed to break a foreign key in our > >database, I'd like some help to understand how this have happened. This might > >be a long story so bear with me :) > > > >There are two tables involved in this. The "DataRepository" that holds data > >that are to be sent to a "DataAddress". Thus the DataRepository has a foreign > >key to DataAddress. The DataRepository has a couple of columns, but the > >important ones is Dar_Id (PK, bigint) and the Dar_Daa_Id (FK, bigint). The > >DataAddres has a Daa_Id (PK, bigint). The foreign key is set between > >Dar_Daa_Id and Daa_Id. > > > >The story is that the DataRepository and DataAddress tables grew quickly in > >our production systems because a daily clean up job wasn't started. This was > >discovered when the server was low on disk space. The clean up job was > >started and ran a stored procedures that did two things: > > > >DELETE FROM [DataRepository] > >WHERE [Dar_Id] NOT IN > >(SELECT [Meq_Dar_Id] FROM [MessageQueue]) AND [Dar_CreateDate] < > >DATEADD(hour, -1, getdate()) > > > >DELETE FROM [DataAddress] > >WHERE [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Sender] FROM [DataRepository] > >WHERE [Dar_Daa_Id_Sender] IS NOT NULL) > >AND [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Receiver] FROM [DataRepository] > >WHERE [Dar_Daa_Id_Receiver] IS NOT NULL) > > > >The first statement pretty much deletes data from DataRepository that is not > >in use and is older than one hour. The second deletes data from DataAddress > >that isn't used be DataRepository. > > > >After the clean up job (took about 55 minutes), we ended up with 20 rows in > >DataRepository that does not exist in DataAddress(!). The rows were created > >when the clean up job had just finished and were created in less than a > >minute. I've checked the sys.foreign_keys table and both is_disabled and > >is_not_trusted is 0. > > > >I'm not sure what the next step is, how can this happen? > > > >Cheers, > >Chris > > > Hi > > You don't say how the data is inserted! To be consistent the delete > statements should be in a single transaction. > > Posting DDL would be a lot clearer and easier to understand than your > description. Example data to re-create the scenario would also be > useful. > > What does not seem to fit is that you are deleting from DataAddress > when Daa_id does not match Dar_Daa_id_Sender or Dar_Daa_id_Receiver > not Dar_Daa_ID which is where you say the FK defined. > > John > . > Heya The data is inserted by using INSERT over a period of time (no batch moves). The underlying service using the database is a distribution network. So each time a package arrives it is inserted in the DataRepository, the sender and receiver are inserted in the DataAddress table. I also noticed that the DELETE statements aren't in a transaction, but that "should" not be a problem. They can be used separately. And sorry about the column names for the FK, there is no Dar_Daa_Id, its Dar_Daa_Id_Receiver and Dar_Daa_Id_Sender I'm not sure how I can provide you with data to recreate the problem since inserting the data would yield a error :) I'll post one of the rows below though. Scripting the tables and stored procedures gives me this DDL. CREATE TABLE [dbo].[DataAddress]( [Daa_Id] [bigint] IDENTITY(10,1) NOT NULL, [Daa_AdapterId] [uniqueidentifier] NOT NULL, [Daa_ClientUserName] [nvarchar](256) NOT NULL, CONSTRAINT [PK_DataAddress] PRIMARY KEY CLUSTERED ( [Daa_Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO CREATE TABLE [dbo].[DataRepository]( [Dar_Id] [bigint] IDENTITY(10,1) NOT NULL, [Dar_Key] [binary](32) NOT NULL, [Dar_InitialVector] [binary](16) NOT NULL, [Dar_CreateDate] [datetime] NOT NULL, [Dar_Daa_Id_Sender] [bigint] NOT NULL, [Dar_Daa_Id_Receiver] [bigint] NULL, [Dar_DataTrackingId] [uniqueidentifier] NULL, [Dar_Data] [varbinary](max) NULL, CONSTRAINT [PK_DataRepository] PRIMARY KEY CLUSTERED ( [Dar_Id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO ALTER TABLE [dbo].[DataRepository] WITH CHECK ADD CONSTRAINT [FK_DataRepository_DataAddress_Receiver] FOREIGN KEY([Dar_Daa_Id_Receiver]) REFERENCES [dbo].[DataAddress] ([Daa_Id]) GO ALTER TABLE [dbo].[DataRepository] CHECK CONSTRAINT [FK_DataRepository_DataAddress_Receiver] GO ALTER TABLE [dbo].[DataRepository] WITH CHECK ADD CONSTRAINT [FK_DataRepository_DataAddress_Sender] FOREIGN KEY([Dar_Daa_Id_Sender]) REFERENCES [dbo].[DataAddress] ([Daa_Id]) GO ALTER TABLE [dbo].[DataRepository] CHECK CONSTRAINT [FK_DataRepository_DataAddress_Sender] GO CREATE PROCEDURE [dbo].[DataAddress_Insert] ( @Daa_AdapterId uniqueidentifier, @Daa_ClientUserName nvarchar(256), @Daa_Id bigint out ) AS BEGIN DECLARE @IdTable TABLE ([Daa_Id] bigint) INSERT INTO [DataAddress] ( [Daa_AdapterId], [Daa_ClientUserName] ) OUTPUT INSERTED.[Daa_Id] INTO @IdTable VALUES ( @Daa_AdapterId, @Daa_ClientUserName ) SELECT @Daa_Id = [Daa_Id] FROM @IdTable END GO CREATE PROCEDURE [dbo].[DataRepository_Insert] @Dar_Key binary(32), @Dar_InitialVector binary(16), @Dar_Daa_Id_Sender bigint, @Dar_Daa_Id_Receiver bigint, @Dar_DataTrackingId uniqueidentifier, @Dar_Data varbinary(max), @Dar_Id bigint OUTPUT AS BEGIN DECLARE @IdTable TABLE ([Dar_Id] bigint) INSERT INTO [DataRepository] ( [Dar_Key], [Dar_InitialVector], [Dar_CreateDate], [Dar_Daa_Id_Sender], [Dar_Daa_Id_Receiver], [Dar_DataTrackingId], [Dar_Data] ) OUTPUT INSERTED.[Dar_Id] INTO @IdTable VALUES ( @Dar_Key, @Dar_InitialVector, getdate(), @Dar_Daa_Id_Sender, @Dar_Daa_Id_Receiver, @Dar_DataTrackingId, @Dar_Data ) SELECT @Dar_Id = [Dar_Id] FROM @IdTable END GO Dar_Id: 10010453 Dar_Key: 0x636E134BA5327071447F5CBD60BC48EB6A1FFB838F5DE65007CDDDF6C6F027BA Dar_InitialVector: 0xEF395BD72247C3B5F4C506DBAE31EEFA Dar_CreateDate: 2010-04-22 09:20:27.630 Dar_Daa_Id_Sender: 14786105 Dar_Daa_Id_Receiver: 14786106 Dar_DataTrackingId: 46212D94-349D-4289-95BA-178BAD7B6AE7 Dar_Data: NULL Hope this helps :) /Chris
From: John Bell on 7 May 2010 06:27 On Thu, 6 May 2010 07:30:01 -0700, Christoffer <Christoffer(a)discussions.microsoft.com> wrote: >> On Tue, 27 Apr 2010 23:36:01 -0700, Christoffer >> <Christoffer(a)discussions.microsoft.com> wrote: >> >> >Hello, one or two of my colleague has managed to break a foreign key in our >> >database, I'd like some help to understand how this have happened. This might >> >be a long story so bear with me :) >> > >> >There are two tables involved in this. The "DataRepository" that holds data >> >that are to be sent to a "DataAddress". Thus the DataRepository has a foreign >> >key to DataAddress. The DataRepository has a couple of columns, but the >> >important ones is Dar_Id (PK, bigint) and the Dar_Daa_Id (FK, bigint). The >> >DataAddres has a Daa_Id (PK, bigint). The foreign key is set between >> >Dar_Daa_Id and Daa_Id. >> > >> >The story is that the DataRepository and DataAddress tables grew quickly in >> >our production systems because a daily clean up job wasn't started. This was >> >discovered when the server was low on disk space. The clean up job was >> >started and ran a stored procedures that did two things: >> > >> >DELETE FROM [DataRepository] >> >WHERE [Dar_Id] NOT IN >> >(SELECT [Meq_Dar_Id] FROM [MessageQueue]) AND [Dar_CreateDate] < >> >DATEADD(hour, -1, getdate()) >> > >> >DELETE FROM [DataAddress] >> >WHERE [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Sender] FROM [DataRepository] >> >WHERE [Dar_Daa_Id_Sender] IS NOT NULL) >> >AND [Daa_Id] NOT IN (SELECT [Dar_Daa_Id_Receiver] FROM [DataRepository] >> >WHERE [Dar_Daa_Id_Receiver] IS NOT NULL) >> > >> >The first statement pretty much deletes data from DataRepository that is not >> >in use and is older than one hour. The second deletes data from DataAddress >> >that isn't used be DataRepository. >> > >> >After the clean up job (took about 55 minutes), we ended up with 20 rows in >> >DataRepository that does not exist in DataAddress(!). The rows were created >> >when the clean up job had just finished and were created in less than a >> >minute. I've checked the sys.foreign_keys table and both is_disabled and >> >is_not_trusted is 0. >> > >> >I'm not sure what the next step is, how can this happen? >> > >> >Cheers, >> >Chris >> >> >> Hi >> >> You don't say how the data is inserted! To be consistent the delete >> statements should be in a single transaction. >> >> Posting DDL would be a lot clearer and easier to understand than your >> description. Example data to re-create the scenario would also be >> useful. >> >> What does not seem to fit is that you are deleting from DataAddress >> when Daa_id does not match Dar_Daa_id_Sender or Dar_Daa_id_Receiver >> not Dar_Daa_ID which is where you say the FK defined. >> >> John >> . >> > >Heya > >The data is inserted by using INSERT over a period of time (no batch moves). >The underlying service using the database is a distribution network. So each >time a package arrives it is inserted in the DataRepository, the sender and >receiver are inserted in the DataAddress table. > >I also noticed that the DELETE statements aren't in a transaction, but that >"should" not be a problem. They can be used separately. And sorry about the >column names for the FK, there is no Dar_Daa_Id, its Dar_Daa_Id_Receiver and >Dar_Daa_Id_Sender > >I'm not sure how I can provide you with data to recreate the problem since >inserting the data would yield a error :) I'll post one of the rows below >though. Scripting the tables and stored procedures gives me this DDL. > >CREATE TABLE [dbo].[DataAddress]( > [Daa_Id] [bigint] IDENTITY(10,1) NOT NULL, > [Daa_AdapterId] [uniqueidentifier] NOT NULL, > [Daa_ClientUserName] [nvarchar](256) NOT NULL, >CONSTRAINT [PK_DataAddress] PRIMARY KEY CLUSTERED >( > [Daa_Id] ASC >)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = >OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] >) ON [PRIMARY] >GO > >CREATE TABLE [dbo].[DataRepository]( > [Dar_Id] [bigint] IDENTITY(10,1) NOT NULL, > [Dar_Key] [binary](32) NOT NULL, > [Dar_InitialVector] [binary](16) NOT NULL, > [Dar_CreateDate] [datetime] NOT NULL, > [Dar_Daa_Id_Sender] [bigint] NOT NULL, > [Dar_Daa_Id_Receiver] [bigint] NULL, > [Dar_DataTrackingId] [uniqueidentifier] NULL, > [Dar_Data] [varbinary](max) NULL, >CONSTRAINT [PK_DataRepository] PRIMARY KEY CLUSTERED >( > [Dar_Id] ASC >)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = >OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] >) ON [PRIMARY] >GO > >ALTER TABLE [dbo].[DataRepository] WITH CHECK ADD CONSTRAINT >[FK_DataRepository_DataAddress_Receiver] FOREIGN KEY([Dar_Daa_Id_Receiver]) >REFERENCES [dbo].[DataAddress] ([Daa_Id]) >GO > >ALTER TABLE [dbo].[DataRepository] CHECK CONSTRAINT >[FK_DataRepository_DataAddress_Receiver] >GO > >ALTER TABLE [dbo].[DataRepository] WITH CHECK ADD CONSTRAINT >[FK_DataRepository_DataAddress_Sender] FOREIGN KEY([Dar_Daa_Id_Sender]) >REFERENCES [dbo].[DataAddress] ([Daa_Id]) >GO > >ALTER TABLE [dbo].[DataRepository] CHECK CONSTRAINT >[FK_DataRepository_DataAddress_Sender] >GO > >CREATE PROCEDURE [dbo].[DataAddress_Insert] >( > @Daa_AdapterId uniqueidentifier, > @Daa_ClientUserName nvarchar(256), > @Daa_Id bigint out >) >AS >BEGIN > >DECLARE @IdTable TABLE ([Daa_Id] bigint) > >INSERT INTO [DataAddress] >( > [Daa_AdapterId], > [Daa_ClientUserName] >) >OUTPUT INSERTED.[Daa_Id] INTO @IdTable >VALUES >( > @Daa_AdapterId, > @Daa_ClientUserName >) > >SELECT @Daa_Id = [Daa_Id] FROM @IdTable > >END >GO > >CREATE PROCEDURE [dbo].[DataRepository_Insert] > @Dar_Key binary(32), > @Dar_InitialVector binary(16), > @Dar_Daa_Id_Sender bigint, > @Dar_Daa_Id_Receiver bigint, > @Dar_DataTrackingId uniqueidentifier, > @Dar_Data varbinary(max), > @Dar_Id bigint OUTPUT >AS >BEGIN > >DECLARE @IdTable TABLE ([Dar_Id] bigint) > >INSERT INTO [DataRepository] >( > [Dar_Key], > [Dar_InitialVector], > [Dar_CreateDate], > [Dar_Daa_Id_Sender], > [Dar_Daa_Id_Receiver], > [Dar_DataTrackingId], > [Dar_Data] >) >OUTPUT INSERTED.[Dar_Id] INTO @IdTable >VALUES >( > @Dar_Key, > @Dar_InitialVector, > getdate(), > @Dar_Daa_Id_Sender, > @Dar_Daa_Id_Receiver, > @Dar_DataTrackingId, > @Dar_Data >) > >SELECT @Dar_Id = [Dar_Id] FROM @IdTable > >END >GO > >Dar_Id: 10010453 >Dar_Key: 0x636E134BA5327071447F5CBD60BC48EB6A1FFB838F5DE65007CDDDF6C6F027BA >Dar_InitialVector: 0xEF395BD72247C3B5F4C506DBAE31EEFA >Dar_CreateDate: 2010-04-22 09:20:27.630 >Dar_Daa_Id_Sender: 14786105 >Dar_Daa_Id_Receiver: 14786106 >Dar_DataTrackingId: 46212D94-349D-4289-95BA-178BAD7B6AE7 >Dar_Data: NULL > >Hope this helps :) >/Chris Hi Chris It's interesting that the FKs are created with CHECK as altering the constraint to set CHECK is not necessary. If the FKs were created with NOCHECK and there is was invalid data before they were created, then altering the constraints to set CHECK would still work, but if this occurred is_not_trusted would be 1 in sys.foreign_keys which you say is not the case. Therefore if you drop the foreign key and then re-created them it should fail. I'm not sure if you are going to find out how this occurred and I don't think you are going to find out now. Going forward I would make sure that your transactions are correctly formed and that your error checking is thorough. John
|
Pages: 1 Prev: How to keep format when importing Excel into SQL. Next: double mail sent |