From: esmith2112 on 14 Dec 2009 10:53

In a corporate consolidation of servers, our DB2 server was migrated to a new shared environment consisting of an IBM P6 with 2 LPARs, 9 CPUs, 25 GB of memory, and an EMC Symmetrix RAID 5 SAN for storage. We're running AIX 5.3 and DB2 9.5. Before the migration we ran on a slower 16-CPU machine with a fully mirrored disk system. In general, performance is better on the new machine for transactional processing. The exception is one database backup, which is now 2 to 3 times slower for the actual "backup database" command. What's strange, to me at least, is that two other databases in the 20-30 GB size range have backup times similar to their times on the old server. The slow database is just under 200 GB in size and backs up in 6 hours instead of 2 hours and change.

The new system was created by imaging the old system and laying down a copy on the new box, with all directory structures identical. All tablespace storage is SMS. We suspected this had something to do with the SAN component and played with registry settings like DB2_PARALLEL_IO, and also tried altering the backup command to specify explicit PARALLELISM values, all to no avail. Since it's just the one database, we thought there was something special about the way it was defined, but failed to turn up any significant differences (not that there aren't any; we just couldn't find them).

What other parameters should we be looking at? My skill set leans more toward applications than system administration, so I'm at a bit of a loss here.

Thanks,

Evan
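For scale, the figures in the post above can be turned into rough average throughput. This is only a minimal sketch: it assumes the database is exactly 200 GB, reads "2 hours and change" as 2 hours, and takes 1 GB as 1024 MB. The function name is illustrative, not part of any DB2 tooling.

```python
def backup_throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average backup rate in MB/s for size_gb gigabytes written in `hours` hours."""
    return size_gb * 1024 / (hours * 3600)

# Assumed inputs: 200 GB database, ~2 h on the old server, 6 h on the new one.
old_rate = backup_throughput_mb_s(200, 2)
new_rate = backup_throughput_mb_s(200, 6)

print(f"old: {old_rate:.1f} MB/s, new: {new_rate:.1f} MB/s")
# prints "old: 28.4 MB/s, new: 9.5 MB/s"
```

Under these assumptions the new server averages under 10 MB/s for the large backup, which gives a baseline to compare against whatever the storage monitoring tools report.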
From: stefan.albert on 15 Dec 2009 05:28

The first thing I would look at is the backup device and how it is physically attached. You don't say anything about it, so my first guess would be: if read performance improved, the backup should also be faster, provided you use the same backup device AND (this is my second guess) the backup target is not the same place where the data lives. The data disks were mirrored and are now RAID 5; if you write your backup to the same disks, the write rate can be slower because of parity generation and writing. This could explain the drop in write performance. What is your write performance for the data itself, and where do you write your backup to?

Anyway: for security reasons, the backup should not be on the platform where the data lives.
From: esmith2112 on 15 Dec 2009 10:15

Oops, I guess the picture isn't complete without describing the target. The files are indeed backed up to disk on the same SAN device where the data resides, then picked up by TSM and written to tape. We find it odd that the other databases, backed up in the same fashion, don't suffer similar performance hits. We suspected it was something particular to the instance or database itself in relation to the SAN.

But you may be on to something with the parity generation. After posting, I loaded data via the IMPORT command into a table in the database in question. It was 32K rows (50 bytes each); that takes approximately 20 seconds on the old server but over 10 minutes on the new server. Could the RAID overhead cause such a hit?
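The IMPORT numbers above can be put side by side as rates. A minimal sketch, assuming "32K rows" means 32,000 rows and "over 10 minutes" is taken as exactly 10 minutes; the variable names are illustrative.

```python
ROWS = 32_000        # "32K rows" read as 32,000 (assumption)
ROW_BYTES = 50       # row size stated in the post
OLD_SECONDS = 20     # old server
NEW_SECONDS = 10 * 60  # new server, "over 10 minutes" taken as a lower bound

payload_mb = ROWS * ROW_BYTES / 1_000_000   # total data volume in MB
old_rows_per_s = ROWS / OLD_SECONDS
new_rows_per_s = ROWS / NEW_SECONDS
slowdown = NEW_SECONDS / OLD_SECONDS

print(f"payload: {payload_mb:.1f} MB")
print(f"old: {old_rows_per_s:.0f} rows/s, new: {new_rows_per_s:.0f} rows/s")
print(f"slowdown: at least {slowdown:.0f}x")
```

With only about 1.6 MB of payload, a slowdown of this size is hard to explain by data volume alone, so per-row or per-commit write latency may be worth checking alongside raw bandwidth.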
From: darko on 15 Dec 2009 13:07

You did not state clearly whether the backup is written to the same disks (in RAID 5) that contain the database. If it is, the slower backup might be due to the disk layout. Regarding RAID 5, everyone should at least look at www.baarf.com.

RAID 5 should not be the performance-limiting factor for backup operations, since most writes during a backup should be full-stripe writes, which avoid the RAID 5 write penalty. It would not be good practice to put everything (especially tablespaces and logs) on the same disks in a RAID 5 configuration, although Symmetrix storage systems have massive caches.

However, I doubt that disk layout alone is to blame for the data load slowing from 20 seconds to over 10 minutes. You will probably have to look for additional suspects.

Darko Krstic
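The difference between full-stripe and partial writes mentioned above can be sketched as back-end disk I/Os per logical block written. This is a simplified textbook model that ignores controller caching; the 3 data disks + 1 parity geometry is a hypothetical example, not the layout of the array in this thread.

```python
def raid5_partial_write_ios_per_block() -> float:
    # Read-modify-write: read old data, read old parity,
    # write new data, write new parity.
    return 4.0

def raid5_full_stripe_ios_per_block(data_disks: int) -> float:
    # Parity is computed from the new data alone, so the array writes
    # data_disks data blocks plus one parity block and reads nothing.
    return (data_disks + 1) / data_disks

def mirror_write_ios_per_block() -> float:
    # Each logical block is written once to each of the two mirror copies.
    return 2.0

DATA_DISKS = 3  # hypothetical 3+1 RAID 5 group

print(f"RAID 5 partial write: {raid5_partial_write_ios_per_block():.2f} I/Os per block")
print(f"RAID 5 full stripe:   {raid5_full_stripe_ios_per_block(DATA_DISKS):.2f} I/Os per block")
print(f"Mirror:               {mirror_write_ios_per_block():.2f} I/Os per block")
```

In this model large sequential backup writes cost less on RAID 5 than on a mirror, while small random writes (such as row inserts and log writes) cost twice as much, which fits the point that the backup and the IMPORT slowdown may have different causes.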
From: stefan.albert on 17 Dec 2009 04:49

Hmm, that's a difficult one...

One thing comes to mind: block sizes. Maybe the block sizes of DB / OS / EMC² don't match, so much more data is written than is actually needed.

You can try to monitor the traffic from the DB disks to your server and then the traffic back to the disks where the backup lives. If both go through the same adapters, the traffic is mixed, but you might still be able to tell the reads (DB->server) from the writes (server->backup). Different adapters would be easier to monitor. Alternatively, you may be able to monitor the traffic per file system. On AIX (I don't know your OS) you could use nmon...

If you see much more writing than reading (with only the backup active on the server), I would assume that the page sizes don't match.

I don't know whether the EMC² box has an internal traffic monitor; that would also be a good thing to look at...
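The read-versus-write comparison suggested above could be scripted once the monitoring numbers are collected. A minimal sketch, assuming the samples are (read KB/s, write KB/s) pairs taken at a fixed interval while only the backup is running; the sample values and the 1.5 threshold are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

def write_to_read_ratio(samples: List[Tuple[float, float]]) -> float:
    """Return total KB written divided by total KB read.

    Each sample is a (read_kb_per_s, write_kb_per_s) pair taken at a
    fixed interval, so summing the rates preserves the overall ratio.
    """
    total_read = sum(read for read, _ in samples)
    total_write = sum(write for _, write in samples)
    if total_read == 0:
        raise ValueError("no read traffic in samples")
    return total_write / total_read

# Hypothetical samples, not measurements from this thread.
samples = [(9000.0, 9500.0), (10000.0, 10200.0), (9500.0, 9800.0)]
SUSPICIOUS_RATIO = 1.5  # arbitrary threshold for illustration

ratio = write_to_read_ratio(samples)
print(f"write/read ratio: {ratio:.2f}")
if ratio > SUSPICIOUS_RATIO:
    print("writes far exceed reads; check for a block/page size mismatch")
else:
    print("writes roughly match reads")
```

A ratio close to 1 would argue against the block-size theory, since an uncompressed backup to disk should write roughly as much as it reads.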