Code versus Int [Server Configuration]

Prev: sp_releaseapplock timeout expired?
Next: SQL Server 2008 - user wants to directly backup database to his desktop

From: Arjen on 12 May 2010 17:22

Hi,

Web applications often use a parameter to select a specific record. Hotmail,
for example, uses a code like "ZIaXbPiFjyPUfJG44nag9A%3d%3d" to retriev a
specific mail message.

Question 1)
Is an Int type not prefered?

Question 2)
What is a smart way to generate unique random numbers?
Date + number??? 051320101 = 05-13-2010-1

Thanks!
Arjen

From: Erland Sommarskog on 12 May 2010 17:50

Arjen (boah123(a)hotmail.com) writes:
> Web applications often use a parameter to select a specific record.
> Hotmail, for example, uses a code like "ZIaXbPiFjyPUfJG44nag9A%3d%3d" to
> retriev a specific mail message.
>
> Question 1)
> Is an Int type not prefered?

Not always. There are a couple of problem with using numbers. One is
that sequential numbers can be easy to guess. Which can be helpful
for intruders.

In case of big web sites, they have farms of servers and the ids are
generated on the web servers. Thus ids must be generated without risk
for collision. This can be difficult with numbers, even with 64-bit
integers. A common device are GUIDs, which are 128 bits, and generated
in a random way so that the risk for collision is so small that it is
negligible. If Hotmail uses a GUID or just a hash of some sort, I don't
know.

> Question 2)
> What is a smart way to generate unique random numbers?
> Date + number??? 051320101 = 05-13-2010-1

In SQL Server, checksum(newid()) is a good random generator. But if
you want your series to not include duplicates, you will need to
track generated numbers in a table. If you generate 100000 random
32-bit numbers, you have a likelyhood of 0.3 of having at least one
duplicate, if memory serves. Or if it was 0.3 for not getting any
dups at all.

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: Arjen on 15 May 2010 14:03

"Erland Sommarskog" <esquel(a)sommarskog.se> schreef in bericht
news:Xns9D76F27C26394Yazorman(a)127.0.0.1...
> Arjen (boah123(a)hotmail.com) writes:
>> Web applications often use a parameter to select a specific record.
>> Hotmail, for example, uses a code like "ZIaXbPiFjyPUfJG44nag9A%3d%3d" to
>> retriev a specific mail message.
>>
>> Question 1)
>> Is an Int type not prefered?
>
> Not always. There are a couple of problem with using numbers. One is
> that sequential numbers can be easy to guess. Which can be helpful
> for intruders.
>
> In case of big web sites, they have farms of servers and the ids are
> generated on the web servers. Thus ids must be generated without risk
> for collision. This can be difficult with numbers, even with 64-bit
> integers. A common device are GUIDs, which are 128 bits, and generated
> in a random way so that the risk for collision is so small that it is
> negligible. If Hotmail uses a GUID or just a hash of some sort, I don't
> know.
>
>> Question 2)
>> What is a smart way to generate unique random numbers?
>> Date + number??? 051320101 = 05-13-2010-1
>
> In SQL Server, checksum(newid()) is a good random generator. But if
> you want your series to not include duplicates, you will need to
> track generated numbers in a table. If you generate 100000 random
> 32-bit numbers, you have a likelyhood of 0.3 of having at least one
> duplicate, if memory serves. Or if it was 0.3 for not getting any
> dups at all.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000:
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>

Thanks for your reply!

When I use GUIDs as key in a relational database then I need to copy the
specific guid into other tables (as reference to the main table). Correct?

From a performance perspective, is sql server using int values as index for
this guid-relationship in the background?

Thanks!

From: Erland Sommarskog on 15 May 2010 16:36

Arjen (boah123(a)hotmail.com) writes:
> When I use GUIDs as key in a relational database then I need to copy the
> specific guid into other tables (as reference to the main table). Correct?

Correct. This is no different from other keys.

> From a performance perspective, is sql server using int values as index
> for this guid-relationship in the background?

No. Operations with GUIDs are likely to be somewhat less efficient than
operations on ints, since few computers have 128-bit architecture. However,
since GUIDs are just bits, they are far more effcient than character data.
(For which lots of rules for case-insensitity etc applies.

And it would made little sense to try to distill GUIDs to integers. As
I mentioned, with 32-bit integers you have a fair chance of collision
with 100000 values - which is not much in a relational database.

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: Arjen on 16 May 2010 06:01

"Erland Sommarskog" <esquel(a)sommarskog.se> schreef in bericht
news:Xns9D79E5F4BF92FYazorman(a)127.0.0.1...
> Arjen (boah123(a)hotmail.com) writes:
>> When I use GUIDs as key in a relational database then I need to copy the
>> specific guid into other tables (as reference to the main table).
>> Correct?
>
> Correct. This is no different from other keys.
>
>> From a performance perspective, is sql server using int values as index
>> for this guid-relationship in the background?
>
> No. Operations with GUIDs are likely to be somewhat less efficient than
> operations on ints, since few computers have 128-bit architecture.
> However,
> since GUIDs are just bits, they are far more effcient than character data.
> (For which lots of rules for case-insensitity etc applies.
>
> And it would made little sense to try to distill GUIDs to integers. As
> I mentioned, with 32-bit integers you have a fair chance of collision
> with 100000 values - which is not much in a relational database.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000:
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>

Erland, thanks again!

About my second question / your second answer: I will keep using GUIDs
(unique identifiers). It was not my intention to distill GUIDs to
integers... I thought that maybe sql server distill guids to ints when the
guid is a relationship... in other words, when the relationship "connects"
to guids sql server will make an index based on ints because that might be
fast to search in.

Arjen

| Next | Last
Pages: 1 2
Prev: sp_releaseapplock timeout expired?
Next: SQL Server 2008 - user wants to directly backup database to his desktop