From: Gregor Kovač on 20 May 2010 00:31 Hi! Today we had a hot discussion about the performance of LOAD vs. IMPORT. I'd like to hear what you have to say: - Should LOAD be faster that IMPORT in every circumstance? - Is IMPORT more suited for small files and LOAD for large files? - Is LOAD more suited for table that have many indexes and IMPORT for tables that have few or no indexes? Best regards, Kovi
From: Frederik Engelen on 20 May 2010 04:27 On May 20, 6:31Â am, Gregor KovaÄ <gregor.ko...(a)mikropis.si> wrote: > Hi! > > Today we had a hot discussion about the performance of LOAD vs. IMPORT. > I'd like to hear what you have to say: > - Should LOAD be faster that IMPORT in every circumstance? > - Is IMPORT more suited for small files and LOAD for large files? > - Is LOAD more suited for table that have many indexes and IMPORT for tables > that have few or no indexes? > > Best regards, > Â Â Â Â Kovi Hello Kovi, Interesting questions, although I'm not sure they are always the rigth ones to ask. There are factors that can influence your choice for IMPORT or LOAD that might be more important: - recoverability of the operation - for moving data between databases, load from cursor has a lot of advantages - availability of data during/after insert - presence of generated columns - compression on tables - ... I'll still try to give a shot at your questions, though: > - Should LOAD be faster that IMPORT in every circumstance? I tried a small table with a lot of indexes. A load operation takes consistently 4 times longer than the corresponding insert. Another thing to consider for no-so large tables: how fast can you type 'SET INTEGRITY...'? :-) > - Is IMPORT more suited for small files and LOAD for large files? I think you need to consider at least the points above before taking into account the size of the file. > - Is LOAD more suited for table that have many indexes and IMPORT for tables According to my little test on a small table with lots of indexes (15 on a 3-column table), it's exactly the inverse. I didn't do the test for large tables, there it might also depend on the indexing mode you choose (incremental or rebuild). I suggest you just give it a try yourself. -- Frederik Engelen Realdolmen
From: Serge Rielau on 20 May 2010 10:29 Kovi, Feedback from backstage: Load is faster than import in most of cases. Better to describe cases where import can be faster: 1) Correct about small files. Load has a startup and shutdown cost. 2) Parallel import can catch up and beat load (ie, running lots of parallel import command with the data split already) specifically when XML is in the table. More indexes will likely hurt import more than load, because the data isn't sorted, and the index inserts are a single row at a time. Load will be faster with larger files regardless of the number of indexes -- Serge Rielau SQL Architect DB2 for LUW IBM Toronto Lab
From: kenfar on 21 May 2010 09:09 On May 20, 8:29 am, Serge Rielau <srie...(a)ca.ibm.com> wrote: > More indexes will likely hurt import more than load, because the data > isn't sorted, and the index inserts are a single row at a time. Load > will be faster with larger files regardless of the number of indexes I've found that indexes are my enemy with the load utility - even though I can load the data in parallel and only do incremental index builds, the index build phase isn't parallel. Rebuilding indexes isn't a great strategy for real-time data warehousing, and I don't yet have enough funding to go to DPF. My current server has a decent number of rather slow CPUs. So, in this scenario I can split a file into 8 individual pieces then import each simultaneously and get better overall performance. Unfortunately. Think I might be missing anything? Thanks, Ken
From: Serge Rielau on 21 May 2010 23:36 On 5/21/2010 9:09 AM, kenfar wrote: > On May 20, 8:29 am, Serge Rielau<srie...(a)ca.ibm.com> wrote: > >> More indexes will likely hurt import more than load, because the data >> isn't sorted, and the index inserts are a single row at a time. Load >> will be faster with larger files regardless of the number of indexes > > I've found that indexes are my enemy with the load utility - even > though I can load the data in parallel and only do incremental index > builds, the index build phase isn't parallel. Rebuilding indexes > isn't a great strategy for real-time data warehousing, and I don't yet > have enough funding to go to DPF. My current server has a decent > number of rather slow CPUs. So, in this scenario I can split a file > into 8 individual pieces then import each simultaneously and get > better overall performance. Unfortunately. Think I might be missing > anything? Just a hunch. You are on DB2 9.1 or Db2 8? I.e. prior to the multi-threaded engine? Cheers Serge -- Serge Rielau SQL Architect DB2 for LUW IBM Toronto Lab
|
Next
|
Last
Pages: 1 2 Prev: HADR is failing during online backup Next: CLP giving weird error code for simple SQL |