Prev: Web replication
Next: Snapshot agent won't start
From: Aramid on 14 Dec 2009 11:29 It's been a heated discussion in the office now whether our Distribution database should be on FULL or SIMPLE (default) recovery model, and I hope anyone of you could shed some light. Our infrastructure is composed of four (4) P2P sites, each one located in a different geographical area. We are using SQL 2005 x64 EE, on 32GB 8-way machines. Each site has a 2-node cluster for the production database, and another 2-node cluster solely dedicated for the distribution db. The shared storage for both clusters in all sites are SANs. There's a relatively large amount of traffic between the sites, with about 100mb of data updates per site every hour. Our application must run 24/7. Synch with backup is enabled on the distributor, but not on the publication. Right now, we decided to put the distribution database in FULL recovery mode, thinking that frequent t-log backups (hourly) will help truncate the publication logs quickly, which are also backed up hourly. Some engineers in the organization argues that the distribution database should be set back to SIMPLE (the default) and argues that this is better in terms of overall performance and ease of management. Others claim that having the distribution database in full recovery helps in keeping it trim, with a fixed-size transaction log. I've searched the web for days and could not see any authoritative discussion on this, other than Hilary Cotters opinion that it should be set in full to keep the t-log size static. What should be the recovery model for the distribution database relative to our scenario? Thanks for your help! Aramid
From: hilary on 14 Dec 2009 14:53 When it comes down to it, it really depends on what your recoverability/DR goals are. There is little performance advantage to full over simple. I always like to use full as it helps me to maintain the tlogs. I have run into situations with simple where the log does grow uncontrollably and I don't feel comfortable with that in a production environment. What you need to focus on is what will happen if you loose one of your nodes. When you restore you can expect a lot of write activty as each node backfills missing transactions with the node which has just come back online. If you can only restore this database back to the time of last nights backup there will be a lot of read and write activity occuring. Minimizing this window by restoring only the tlogs will help. While your engineers are correct - simple does mean less management, in my opinion I would rather do the minimally more management (essentially one script which can be scheduled) and be protected from a disaster, as opposed to using simple and having to wait hours to catch up from a disaster. Full does not really guanrantee you a fixed size of log. Your log will vary depending on the amount of data you log. If you do a reindexing your logs can be huge. But over time their growth will stabilize, ie you should reach a steady state where the log file size is generally fixed, but the log backups can vary. You want to avoid log growth, but with both simple and full you should reach a steady state as long as your workloads remain the same. The sync with backup option will not help you at all if you only have it enabled for the distribution database. you will need to use it in conjunction with the publication database as well. HTH Hilary "Aramid" <Aramid(a)discussions.microsoft.com> wrote in message news:F61C97B0-4544-4CFB-B4A3-987BCEE482F8(a)microsoft.com... > It's been a heated discussion in the office now whether our Distribution > database should be on FULL or SIMPLE (default) recovery model, and I hope > anyone of you could shed some light. > > Our infrastructure is composed of four (4) P2P sites, each one located in > a > different geographical area. We are using SQL 2005 x64 EE, on 32GB 8-way > machines. Each site has a 2-node cluster for the production database, and > another 2-node cluster solely dedicated for the distribution db. The > shared > storage for both clusters in all sites are SANs. There's a relatively > large > amount of traffic between the sites, with about 100mb of data updates per > site every hour. Our application must run 24/7. > > Synch with backup is enabled on the distributor, but not on the > publication. > Right now, we decided to put the distribution database in FULL recovery > mode, thinking that frequent t-log backups (hourly) will help truncate the > publication logs quickly, which are also backed up hourly. > > Some engineers in the organization argues that the distribution database > should be set back to SIMPLE (the default) and argues that this is better > in > terms of overall performance and ease of management. Others claim that > having the distribution database in full recovery helps in keeping it > trim, > with a fixed-size transaction log. > > I've searched the web for days and could not see any authoritative > discussion on this, other than Hilary Cotters opinion that it should be > set > in full to keep the t-log size static. What should be the recovery model > for > the distribution database relative to our scenario? > > Thanks for your help! > > Aramid
|
Pages: 1 Prev: Web replication Next: Snapshot agent won't start |