Prev: CTE with multiple anchors
Next: Q;re Descriptive stats
From: Just D. on 5 Mar 2010 00:47 There is a simple task - to return the records from table1 where the key table1.key1 doesn't exist in the table2. The first idea was to write it as: SELECT * FROM table1 WHERE NOT table1.key1 IN (SELECT table2.key2 FROM table2) Simple, but it's way too inefficient for sure. Is there some trick that I'm missing, something like an opposite to INNER JOIN? I doubt it exists, but who knows... There were two other ideas - to use a status column and change the status when the record was sent to the table2, using this column in the query, but for some reason my boss doesn'tlike this idea, explaining that we'll need to restructure the whole table, change the related logic, table is huge, any database structure corrections are painful, since the database is around 300 GBytes now and keeps growing, etc. I tried to explain that scannign the same table again and again is not the best way, but... Adding another table where we could track this status was also aworking idea, but it was put on hold for a while. Any other ideas? Just D.
From: Erland Sommarskog on 5 Mar 2010 03:15 Just D. (no(a)spam.please) writes: > There is a simple task - to return the records from table1 where the key > table1.key1 doesn't exist in the table2. The first idea was to write it > as: > > SELECT * FROM table1 WHERE NOT table1.key1 IN (SELECT table2.key2 FROM > table2) > > Simple, but it's way too inefficient for sure. It may be, but it could also be very efficient, all depending on keys, indexes and cardinalities. Some people like to use a left join: SELECT ... FROM table1 LEFT JOIN table2 ON table1.key1 = table2.key2 WHERE table.key2 IS NULL Sometimes this may give a better plan. It may also give exactly the same plan. Personally I prefer: SELECFT * FROM table1 WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.key = table2.key) This usually gives the same plan as NOT IN, but there is a trap with NULL values if you use NOT IN, that you cannot walk into with NOT EXISTS. > There were two other ideas - to use a status column and change the > status when the record was sent to the table2, using this column in the > query, but for some reason my boss doesn'tlike this idea, explaining > that we'll need to restructure the whole table, change the related > logic, table is huge, any database structure corrections are painful, > since the database is around 300 GBytes now and keeps growing, etc. I > tried to explain that scannign the same table again and again is not the > best way, but... Adding another table where we could track this status > was also aworking idea, but it was put on hold for a while. Now you are venting thoughts about the underlying business problem, and maybe it offers some better ideas, but you would need to give more information than just glimpses, if you want some input from this angle. -- Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
From: Tom on 5 Mar 2010 08:37 select * from table1 left join table2 on table1.key on table2.key where table2.key is null Just D. wrote: > There is a simple task - to return the records from table1 where the key > table1.key1 doesn't exist in the table2. The first idea was to write it as: > > SELECT * FROM table1 WHERE NOT table1.key1 IN (SELECT table2.key2 FROM > table2) > > Simple, but it's way too inefficient for sure. Is there some trick that I'm > missing, something like an opposite to INNER JOIN? I doubt it exists, but > who knows... > > There were two other ideas - to use a status column and change the status > when the record was sent to the table2, using this column in the query, but > for some reason my boss doesn'tlike this idea, explaining that we'll need to > restructure the whole table, change the related logic, table is huge, any > database structure corrections are painful, since the database is around 300 > GBytes now and keeps growing, etc. I tried to explain that scannign the same > table again and again is not the best way, but... Adding another table where > we could track this status was also aworking idea, but it was put on hold > for a while. > > Any other ideas? > > Just D.
From: Tom on 5 Mar 2010 08:46 Should be select * from table1 left join table2 on table1.key = table2.key where table2.key is null Tom wrote: > select * > from table1 > left join table2 on table1.key on table2.key > where table2.key is null > > Just D. wrote: > > There is a simple task - to return the records from table1 where the key > > table1.key1 doesn't exist in the table2. The first idea was to write it as: > > > > SELECT * FROM table1 WHERE NOT table1.key1 IN (SELECT table2.key2 FROM > > table2) > > > > Simple, but it's way too inefficient for sure. Is there some trick that I'm > > missing, something like an opposite to INNER JOIN? I doubt it exists, but > > who knows... > > > > There were two other ideas - to use a status column and change the status > > when the record was sent to the table2, using this column in the query, but > > for some reason my boss doesn'tlike this idea, explaining that we'll need to > > restructure the whole table, change the related logic, table is huge, any > > database structure corrections are painful, since the database is around 300 > > GBytes now and keeps growing, etc. I tried to explain that scannign the same > > table again and again is not the best way, but... Adding another table where > > we could track this status was also aworking idea, but it was put on hold > > for a while. > > > > Any other ideas? > > > > Just D.
From: Just D. on 5 Mar 2010 16:04
Erland, Thanks so much! That was very helpful! Just D. "Erland Sommarskog" >> There is a simple task - to return the records from table1 where the key >> table1.key1 doesn't exist in the table2. The first idea was to write it >> as: >> >> SELECT * FROM table1 WHERE NOT table1.key1 IN (SELECT table2.key2 FROM >> table2) >> >> Simple, but it's way too inefficient for sure. > > It may be, but it could also be very efficient, all depending on keys, > indexes and cardinalities. > > Some people like to use a left join: > > SELECT ... > FROM table1 > LEFT JOIN table2 ON table1.key1 = table2.key2 > WHERE table.key2 IS NULL > > Sometimes this may give a better plan. It may also give exactly the same > plan. > > Personally I prefer: > > SELECFT * FROM table1 > WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.key = table2.key) > > This usually gives the same plan as NOT IN, but there is a trap with NULL > values if you use NOT IN, that you cannot walk into with NOT EXISTS. > > >> There were two other ideas - to use a status column and change the >> status when the record was sent to the table2, using this column in the >> query, but for some reason my boss doesn'tlike this idea, explaining >> that we'll need to restructure the whole table, change the related >> logic, table is huge, any database structure corrections are painful, >> since the database is around 300 GBytes now and keeps growing, etc. I >> tried to explain that scannign the same table again and again is not the >> best way, but... Adding another table where we could track this status >> was also aworking idea, but it was put on hold for a while. > > Now you are venting thoughts about the underlying business problem, and > maybe it offers some better ideas, but you would need to give more > information than just glimpses, if you want some input from this angle. > > -- > Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx |