Lowering priority: maintaining unique keys when offline

Monday, March 19, 2012

maintaining unique keys when offline

If you have a "Orders" table that is being sync'd to subscribers that are ocassionaly offline, and the subscribers add rows to their local Orders table. When they go online to sync with the published "Orders" table, how do you handle keeping the "OrderId" field unique?

Example:

Both salespeople sync the following data down:

OrderId Desc

1 Order 1

2 Test Order

Both salespeople go offline and add orders

Salesperson 1 adds:

OrderId Desc

3 Joes Order

Salesperson 2 adds:

OrderId Desc

3 Kathys Order

Now, when they go back online, they both will sync their orders up to the main database and they both have the OrderId of 3.

The main problem with using an int identity as a primary key is that it gets assigned by the database on insert; as you're discovering, assigning it outside of the database creates key collisions.

There are several different approaches you can use. All of them have problems:

1) If the row contains some combination of values that are always unique, use this combination as the primary key.

Problems with this approach: Not always possible. If the table's going to be referenced as part of a foreign-key relationship, you have to replicate all of the parts of the key in the referencing rows.

2) Use a composite primary key with two columns, or a primary key that's a concatenation of two values. One is a token that's unique to each process that's creating records; the second is a sequentially-assigned value that each process is responsible for assigning. In your example, you'd use the salesperson as the token. So you could either make salesperson, order id the primary key, or you could create nvarchar keys like "joe:1".

Problems with this approach: The token has to be invariant, i.e. changing the salesperson on the order can create key collisions. The database can no longer assign primary keys, since the PKs are being assigned offline, outside of the database.

3) Use a primary key of type uniqueidentifier (i.e.a GUID). All GUIDs are unique, so you'll never have a key collision.

Problems with this approach: your identifier won't be usable by humans. Sorting by GUIDs is useless. GUIDs use 32 bytes of storage, as opposed to 4 for int.

4) Use a temporary local key (using either of the above 2 approaches) when creating records on the client, and assign the real key when the row is inserted.

Problems with this approach: The client won't automatically know what the keys are of the rows it just inserted into the database, and will have to re-query the database to get their values.

|||

SQL replication also has the ability to assign identity ranges to subscribers. You might want to look into it as a solution. The only problem is in a high subscriber, high volume scenario is assigning an appropriate range to each subscriber.

I looked into this solution for the company I currently work for.We decided using a GUID was a much better solution.It eliminates the need to monitor the identity ranges.

Monday, March 19, 2012

maintaining unique keys when offline

No comments:

Post a Comment

Lowering priority

Blog Archive

About Me