From: David Brown on 26 Oct 2009 15:52

Gabriel Knight wrote:
> Hi all I need a free program to backup a ubuntu server for my school class,
> it has to be as good or better than Rdiff and Rsync the server will use SSH,
> MYSql and be a file and web server and do a couple of other things. I need
> it to be either a gui or text box program.
>

As others have said, the obvious choice here would be ... rsync.

Far and away the biggest issue with backups is recovery. If you haven't tested your recovery system, or if it is inconvenient or unreliable, then your backups are useless. The great thing about rsync is that you have a straight copy of your data - recovery is a simple file copy.

rsync has a lot of options and functionality, which can be useful for automated backups. In particular, when combined with hard-link copies you can get snapshot backups where each backup takes only the space needed to store the differences (like a traditional incremental backup), yet you've still got everything directly available.

<http://www.mikerubel.org/computers/rsync_snapshots/>

rsnapshot is an example automation of this sort of system.

For rsync backups of things like databases, the best idea is to do a database dump before the rsync. However, if you can't conveniently arrange for that sort of thing, it may not actually be necessary.

Running an rsync on a database's data directory will give you a snapshot of the database files at the time. If you were then to stop the database server and copy those files back again (simulating a restore), and start the server, it would seem to the server that it had suffered a system crash or power cut, and it would recover the data using the journals and logs in these files. You will very likely lose some of the latest changes to the database, and it's conceivable that there might be inconsistencies due to writes to the files during the time taken to do the rsync, but the server should fix these on startup. Whether this level of backup is acceptable or not depends on the sort of data you are dealing with, and how often it is changing. Proper dumps are normally a better choice, but simple file copies are better than nothing.
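A minimal sketch of that hard-link snapshot scheme, in the spirit of the article linked above (all paths are invented for illustration; rsnapshot automates essentially this):

    #!/bin/sh
    # Dated snapshots: files unchanged since the previous snapshot are
    # hard-linked rather than copied, so each run only stores what changed.
    SRC=/srv/data/               # hypothetical source directory
    DEST=/backup/snapshots       # hypothetical snapshot root
    TODAY=$(date +%Y-%m-%d)

    # On the first run "latest" does not exist yet; rsync warns and
    # simply makes a full copy.
    rsync -a --delete --link-dest="$DEST/latest" "$SRC" "$DEST/$TODAY/"

    # Repoint "latest" at the new snapshot for the next run.
    rm -f "$DEST/latest"
    ln -s "$TODAY" "$DEST/latest"

Recovery is then an ordinary file copy out of whichever snapshot directory you want.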
From: Keith Keller on 26 Oct 2009 18:37

On 2009-10-26, David Brown <david.brown(a)hesbynett.removethisbit.no> wrote:
>
> Running an rsync on a database's data directory will give you a snapshot
> of the database files at the time. If you were then to stop the
> database server and copy those files back again (simulating a restore),
> and start the server, it would seem to the server that it had suffered a
> system crash or power cut, and it would recover the data using the
> journals and logs in these files. You will very likely lose some of the
> latest changes to the database, and it's conceivable that there might be
> inconsistencies due to writes to the files during the time taken to do
> the rsync, but the server should fix these on startup. Whether this
> level of backup is acceptable or not depends on the sort of data you are
> dealing with, and how often it is changing. Proper dumps are normally a
> better choice, but simple file copies are better than nothing.

In my opinion the most important part of the above is "Proper dumps are normally a better choice". I would expand that to *much* better. If your database is small, you might as well just dump it. And if it's large, then a filesystem rsync (without table locking or similar) might result in odd issues with your data that might not be resolved by the dbms' recovery system.

For taking a backup of MyISAM tables in MySQL, you can use mysqlhotcopy, which properly locks the tables, copies the raw database files, then unlocks the tables. This way you get a consistent database snapshot without having to generate and store a full text dump, which might take longer and use more disk space. (The downside is that, since the backup is stored as binary data, not text, a backup system that works off diffs won't be as efficient.)

--keith

-- 
kkeller-usenet(a)wombat.san-francisco.ca.us (try just my userid to email me)
AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt
see X-headers for PGP signature information
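For reference, the mysqlhotcopy workflow Keith describes looks roughly like this (database name, user and password are placeholders):

    # Lock the tables of "schooldb", copy its raw table files into
    # /backup/mysql, then unlock (hypothetical names throughout).
    mysqlhotcopy --user=backup --password=secret schooldb /backup/mysql

    # The plain-text alternative, for comparison: a locked logical dump.
    mysqldump --user=backup --password=secret --lock-tables \
        schooldb > /backup/mysql/schooldb.sql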
From: johnny bobby bee on 26 Oct 2009 22:19

Gabriel Knight wrote:
> Hi all I need a free program to backup a ubuntu server for my school class,
> it has to be as good or better than Rdiff and Rsync the server will use SSH,
> MYSql and be a file and web server and do a couple of other things. I need
> it to be either a gui or text box program.

Grsync.
From: David Brown on 27 Oct 2009 04:53

Keith Keller wrote:
> On 2009-10-26, David Brown <david.brown(a)hesbynett.removethisbit.no> wrote:
>> Running an rsync on a database's data directory will give you a snapshot
>> of the database files at the time. If you were then to stop the
>> database server and copy those files back again (simulating a restore),
>> and start the server, it would seem to the server that it had suffered a
>> system crash or power cut, and it would recover the data using the
>> journals and logs in these files. You will very likely lose some of the
>> latest changes to the database, and it's conceivable that there might be
>> inconsistencies due to writes to the files during the time taken to do
>> the rsync, but the server should fix these on startup. Whether this
>> level of backup is acceptable or not depends on the sort of data you are
>> dealing with, and how often it is changing. Proper dumps are normally a
>> better choice, but simple file copies are better than nothing.
>
> In my opinion the most important part of the above is "Proper dumps are
> normally a better choice". I would expand that to *much* better. If
> your database is small, you might as well just dump it. And if it's
> large, then a filesystem rsync (without table locking or similar) might
> result in odd issues with your data that might not be resolved by the
> dbms' recovery system.

It depends somewhat on your needs, your database usage, and your database server. I don't know about MySQL, but postgresql seems to be good at working with such straight file-copy backups. If you don't write much to your database, the chances of a corrupt copy become smaller (and if a copy is corrupt, you just use the previous backup instead). If you are using a system like rsnapshot, with multiple backup copies that are hard-linked when the files are unchanged, you get very small incremental costs per backup. With monolithic dump files, even a single change to the database means that the whole file must be saved for each backup (though rsync will still minimise the traffic transferred in the copy).

You can take it a stage further by putting your database files on an LVM volume and taking an LVM snapshot, which you then use for the backup. Your files are then consistent exactly as though the power had been cut at the moment the snapshot was taken - and a database server worth the name should be able to recover from that (see the sketch below).

> For taking a backup of MyISAM tables in MySQL, you can use mysqlhotcopy,
> which properly locks the tables, copies the raw database files, then
> unlocks the tables. This way you get a consistent database snapshot
> without having to generate and store a full text dump, which might
> take longer and use more disk space. (The downside is that, since the
> backup is stored as binary data, not text, a backup system that works
> off diffs won't be as efficient.)

You get a fully consistent database snapshot that way (and, as I said, proper dumps like this are normally the best choice). But there are several downsides. First, you have extra processes to run and synchronise (not a problem with cron and a script, but it might be an issue for people wanting a simpler system, or if the backup is initiated from a different computer). Second, you are locking all the tables during the backup - that may or may not be an issue. Third, as you say, the backups may take more time, and they take more space, since you can't take advantage of hard-linked copies (diff-based backups are horrible - I wouldn't recommend them).
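A minimal sketch of that LVM approach, assuming the database files live on a hypothetical logical volume /dev/vg0/dbdata:

    # Freeze a point-in-time image of the volume, with 1G of
    # copy-on-write space for changes made while the snapshot exists.
    lvcreate --snapshot --size 1G --name dbsnap /dev/vg0/dbdata

    # Mount it read-only and back it up; the image is crash-consistent,
    # as though the power had been cut at the moment of the snapshot.
    mkdir -p /mnt/dbsnap
    mount -o ro /dev/vg0/dbsnap /mnt/dbsnap
    rsync -a /mnt/dbsnap/ /backup/db/

    # Discard the snapshot.
    umount /mnt/dbsnap
    lvremove -f /dev/vg0/dbsnap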
It's all a matter of balancing your needs. If you are already doing an rsync backup of everything else on the machine, then including the database files as well is very simple, and it may be good enough for you. Doing database dumps is the "right" way to do the backups, so that should be the method to use unless you have good reason not to. But wanting a simple and easy solution, without having to learn about dumps, may well count as a good enough reason. Whatever method you choose, do a practice restore to make sure you can recover your data!
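To make that practice restore concrete: load the dump taken earlier into a scratch database and inspect it (names and credentials are placeholders):

    # Create a throwaway database and load the dump into it.
    mysqladmin --user=root --password=secret create schooldb_test
    mysql --user=root --password=secret schooldb_test \
        < /backup/mysql/schooldb.sql

    # After checking the data, drop the scratch copy
    # (--force skips the confirmation prompt).
    mysqladmin --force --user=root --password=secret drop schooldb_test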
From: Keith Keller on 27 Oct 2009 14:04
On 2009-10-27, David Brown <david(a)westcontrol.removethisbit.com> wrote:
>
> It depends somewhat on your needs, your database usage, and your
> database server. I don't know about MySQL, but postgresql seems to be
> good at working with such straight file-copy backups.

I don't know if this is true of newer versions of postgresql, but in the versions I've used, the data store was not portable across different architectures. Indeed, there was no guarantee of portability even across different machines of the same architecture, even using the same version of PostgreSQL! (I think I've done it, but that would have been a long time ago. I did get caught by this issue, though: my main home server's power supply died, and the only other machine I had available was a ppc box. My naive solution, simply using the data store from the old machine's hard drive, didn't work, and I had to resort to an old dump (because I'd recently broken my backup script). Fortunately this was not a huge problem for me, but I could imagine it being a big problem if you had no dumps at all.)

So I think it's not wise to depend on this mode as a primary backup - it could serve as a desperation backup, but the primary backup should be a proper dump (or mysqlhotcopy if you can and desire). For MyISAM tables, MySQL works fine with filesystem backups; I believe those files are portable across architectures, but don't quote me on that. I don't know how portable filesystem snapshots are for InnoDB tables.

> If you don't
> write much to your database, the chances of a corrupt copy become
> smaller (and if a copy is corrupt, you just use the previous backup
> instead). If you are using a system like rsnapshot, with multiple
> backup copies that are hard-linked when the files are unchanged, you
> get very small incremental costs per backup. With monolithic dump
> files, even a single change to the database means that the whole file
> must be saved for each backup (though rsync will still minimise the
> traffic transferred in the copy).

This is all true, but I don't see a way to reliably get a good backup otherwise. And you'll only have real difficulties if your database is quite large - most typical dbs should not create enormous dump files anyway.

> If you are already doing an
> rsync backup of everything else on the machine, then including the
> database files as well is very simple, and it may be good enough for
> you. Doing database dumps is the "right" way to do the backups, so
> that should be the method to use unless you have good reason not to.
> But wanting a simple and easy solution, without having to learn about
> dumps, may well count as a good enough reason.

Well... I think that anyone intent on using a database regularly should not allow themselves to get lazy and rely on filesystem backups. They should start the "right" way, and only use a filesystem backup if they really don't care all that much about their databases in the first place (or can recreate them quickly from what they already have on their filesystem).

> It's all a matter of balancing your needs.
> Whatever method you choose, do a practice restore to make sure you can
> recover your data!

Double-plus-yes to the above! In addition, do a practice restore to a different machine if at all possible.

--keith

-- 
kkeller-usenet(a)wombat.san-francisco.ca.us (try just my userid to email me)
AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt
see X-headers for PGP signature information
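The portability worry above applies to PostgreSQL's binary data store, not to logical dumps: a plain-SQL dump reloads across architectures, and generally into newer server versions as well. A minimal sketch, with placeholder names:

    # On the source machine: dump one database as plain SQL.
    pg_dump --username=backup mydb > /backup/pgsql/mydb.sql

    # On any machine, any architecture: recreate and reload it.
    createdb --username=backup mydb
    psql --username=backup mydb < /backup/pgsql/mydb.sql

(pg_dumpall does the same for a whole cluster, including roles.)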