Prev: fillfactor gets set to zero for toast tables
Next: Parameter oddness; was HS/SR Assert server crash
From: Peter Crabtree on 14 May 2010 14:56 Recently, in preparation for migrating an application to postgres, I got to this part of the manual (which is *excellent* so far, by the way): http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html A quick check with the folks on #postgresql confirmed my understanding, which was that the locking semantics of setval() and nextval() make this unsafe: SELECT setval('my_seq', nextval('my_seq') + 500); Now, I was reminded that I could simply do this: SELECT nextval('my_seq') FROM generate_series(1, 500); But of course then I would have no guarantee that I would get a contiguous block of ids, which means if I'm using this to do a mass insert of records which refer to each others' ids (example: storing a directed, linear graph), I either have to do a correlated update on the client side, after transferring the keys (consider the cost of doing this for a few million records - 4 MB in keys per million records, for, in extreme cases, 12 MB of data to be inserted -- 33% overhead in the worst case, presuming symmetric bandwidth), or I have to insert into a temporary table, then have the db backend do the update, then insert from there to the real table. Both are imperfect options in terms of performance and complexity. Thus, before I start work on it, I propose an extension to the current nextval(): SELECT nextval('my_seq', 500); This would increment the my_seq sequence by its interval * 500, and return the first valid key. This both makes client code that needs a bunch of PKs simpler to implement, and saves in performance, since the client can just replace all its PKs (presuming they're currently a contiguous block from 1 to n) with my_starting_pk + current_pk, so this: pk | next_node ----+----------- 0 | 1 1 | 2 2 | 0 can be easily updated like this: SELECT nextval('my_seq', (SELECT count(*) FROM my_table)); UPDATE my_table SET pk = currval('my_seq') + pk, next_node = currval('my_seq') + next_node; to something like this: pk | next_node --------+---------- 521650 | 521651 521651 | 521652 521652 | 521650 This is a net gain of performance and ease of implementation in many cases where a large number of ids from a sequence are needed -- with a small added benefit of the keys being guaranteed to be contiguous. I don't see any technical problems with this; postgres already can "pre-allocate" more than one key, but the number is semi-static (the CACHE parameter to CREATE SEQUENCE). This might break existing user code if they've defined a nextval(regclass, integer), but I don't see any way to Finally, I've checked sequence.c -- this looks pretty straightforward to implement, but I figured checking with this list was wise before starting work. Apologies if I've been overly wordy. Peter -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
|
Pages: 1 Prev: fillfactor gets set to zero for toast tables Next: Parameter oddness; was HS/SR Assert server crash |