From boxbackup-dev at boxbackup.org Mon Jun 1 10:47:35 2009
From: boxbackup-dev at boxbackup.org (David Sommerseth)
Date: Mon, 01 Jun 2009 11:47:35 +0200
Subject: [Box Backup-dev] Backup of PostgreSQL database
Message-ID: <4A23A3B7.6060903@topphemmelig.net>

Hi!

I've been using boxbackup-0.10 for over a year in a production
environment. And it really makes me sleep quite well at night. But
there is one thing which can cause some unclear nightmares. Proper
backup of PostgreSQL.

Right now, I'm running the worst scenario of DB backup - backing up the
raw DB files without shutting down the database server. I have done
restores from it this way, but it is far from ideal as the WAL might not
be consistent. This can cause troubles if WAL data is not written to
the table files yet.

PostgreSQL does support some kind of Point-In-Time-Recovery, with special
commands for preparing files for backup.

http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

Has anyone looked into such support in BoxBackup? If not, I'd be glad
to spend some time digging into this. With proper documentation in
BoxBackup on how to configure BoxBackup and PostgreSQL, linking in libpq
and calling the pg_start_backup() and pg_stop_backup() SQL functions is
how I see the implementation being done.

Any thoughts about this? I do not have plenty of time, but if nobody is
looking into it I believe I could have some suggestions for solutions
ready within some months.

Another aspect which I know some financial industries are interested in
as well is some kind of stream-backup of all SQL queries which modify
the database in whichever way (DELETE/UPDATE/INSERT/GRANT/CREATE/DROP).
So each query is then logged and backed up immediately. There are some
commercial solutions already for Oracle, where this is done to a tape
streamer. Not sure if BoxBackup would be suitable for such an approach.
kind regards,

David Sommerseth

From boxbackup-dev at boxbackup.org Mon Jun 1 11:01:35 2009
From: boxbackup-dev at boxbackup.org (David Sommerseth)
Date: Mon, 01 Jun 2009 12:01:35 +0200
Subject: [Box Backup-dev] Soft-RAID support
Message-ID: <4A23A6FF.3080401@topphemmelig.net>

Hi!

I was reading through the BoxBackup documentation, and one crucial point
of why I chose BoxBackup seems to change ...

"The server currently supports a kind of RAID 5 in userland for extra
reliability. It is designed to use three separate paths which are
mounted from three separate physical disks (not partitions on the same
disk!). This is deprecated and will be removed in a future version. We
recommend that you disable it instead, otherwise you may lose your
stored data when this feature is removed. "
http://www.boxbackup.org/trac/wiki/ConfiguringAServer

Is there any reason this will be changed?

I am currently using this feature for one reason: Safe distributed
backup. I have several clients connecting to the BoxBackup storage
server via local network or VPN. The next phase I'm about to implement
is to distribute each of these 3 data folders to 3 different physical
locations. This way I don't need to worry too much if one remote
location gets compromised, as the theory explained in the documentation
is that you need a minimum of two sets to rebuild the third set and make
the data usable. If one set gets compromised, we will have plans for how
to rebuild the backup storage, distribute three completely new sets to
new locations and destroy the two remaining remote sets.

When the data is encrypted in addition, it really provides a good
solution for secure distributed remote backup. Of course, the main
backup server has all three directories available. But the security
level of this server location is also higher.

My idea for those 3 remote storages was to locate them in physically
remote places within the organisation, or maybe in some cases at some
associates' homes.
As the backup data is encrypted and you need to restore 2/3
of the directories, this is considered safe enough.

I evaluated BoxBackup and set it up before this part of the
documentation changed. Anyhow, there's also a contradictory sentence
later on in the same URL:

"NOTE Running the server in non-RAID mode has not been tested as
extensively as in RAID file mode."

kind regards,

David Sommerseth

From boxbackup-dev at boxbackup.org Tue Jun 2 19:46:15 2009
From: boxbackup-dev at boxbackup.org (Chris Wilson)
Date: Tue, 2 Jun 2009 21:46:15 +0300 (EAT)
Subject: [Box Backup-dev] Backup of PostgreSQL database
In-Reply-To: <4A23A3B7.6060903@topphemmelig.net>
References: <4A23A3B7.6060903@topphemmelig.net>
Message-ID:

Hi David,

On Mon, 1 Jun 2009, David Sommerseth wrote:

> I've been using boxbackup-0.10 for over a year in a production
> environment. And it really makes me sleep quite well at night. But
> there is one thing which can cause some unclear nightmares. Proper
> backup of PostgreSQL.
>
> Right now, I'm running the worst scenario of DB backup - backing up the
> raw DB files without shutting down the database server. I have done
> restores from it this way, but it is far from ideal as the WAL might not
> be consistent. This can cause troubles if WAL data is not written to
> the table files yet.
>
> PostgreSQL does support some kind of Point-In-Time-Recovery, with special
> commands for preparing files for backup.
>
> http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

With MySQL I think people tend to do a database dump and back that up.
It's guaranteed consistent for InnoDB databases. But if you have a huge
database, I guess it's inefficient.

With Postgres, it looks like you could use the NotifySysadmin script to
issue the CHECKPOINT and SELECT pg_start_backup() commands before the
backup starts, and SELECT pg_stop_backup() when it finishes. Does that
meet your requirements? Can you easily back up the entire postgres data
directory?
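
The hook idea above could be sketched roughly as follows. This is only an
illustration under stated assumptions, not a tested Box Backup integration:
the event names "backup-start" and "backup-finish", the label "boxbackup"
and the psql connection options are assumptions, and pg_start_backup() only
works when WAL archiving is configured as described in the PostgreSQL
document linked above.

```python
#!/usr/bin/env python3
"""Sketch of a notify hook that brackets a backup run with
pg_start_backup() / pg_stop_backup(). Event names are assumptions."""
import subprocess
import sys
from typing import Callable, List


def sql_for_event(event: str, label: str = "boxbackup") -> List[str]:
    """Return the SQL statements to issue for a given hook event."""
    if event == "backup-start":      # assumed event name
        return ["CHECKPOINT;", f"SELECT pg_start_backup('{label}');"]
    if event == "backup-finish":     # assumed event name
        return ["SELECT pg_stop_backup();"]
    return []                        # ignore every other event


def run_psql(statement: str) -> None:
    """Run one statement through the psql command-line client.
    User and database names here are placeholders."""
    subprocess.run(
        ["psql", "-X", "-U", "postgres", "-d", "postgres", "-c", statement],
        check=True,  # raise if psql reports an error
    )


def handle_event(event: str, run: Callable[[str], None] = run_psql) -> List[str]:
    """Issue the statements for this event and return them."""
    statements = sql_for_event(event)
    for statement in statements:
        run(statement)
    return statements


if __name__ == "__main__":
    # The hook is expected to receive the event name as its first argument.
    handle_event(sys.argv[1] if len(sys.argv) > 1 else "")
```

Passing the runner as a parameter keeps the statement selection separate
from the database call, so the logic can be checked without a running
PostgreSQL server.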
> Another aspect which I know some financial industries are interested in
> as well is some kind of stream-backup of all SQL queries which modify
> the database in whichever way (DELETE/UPDATE/INSERT/GRANT/CREATE/DROP).
> So each query is then logged and backed up immediately. There are some
> commercial solutions already for Oracle, where this is done to a tape
> streamer. Not sure if BoxBackup would be suitable for such an approach.

The most obvious implementation would result in the creation of millions
of small "files" (one per query), which Box Backup is not particularly
good at, since it stores each one as a separate file on the server. So
the restore would be slow and the store would use a lot of disk space.

However, you can stream an object to the server with on-the-fly
encryption, so one could write code to keep a connection open and
encrypt each command and write it to the server as it happens. I'd be a
little concerned about what would happen if the connection was
interrupted for any reason. I think the server would discard the whole
file (query log). Perhaps it could be made not to do that with some flag
to the StoreFile command.

So Box Backup could probably be made to do something like what you want,
but I think it would require a significant programming effort. Perhaps
the "financial industries" would be interested in paying for it?

Cheers, Chris.
-- 
_ ___ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |

From boxbackup-dev at boxbackup.org Tue Jun 2 19:54:31 2009
From: boxbackup-dev at boxbackup.org (Chris Wilson)
Date: Tue, 2 Jun 2009 21:54:31 +0300 (EAT)
Subject: [Box Backup-dev] Soft-RAID support
In-Reply-To: <4A23A6FF.3080401@topphemmelig.net>
References: <4A23A6FF.3080401@topphemmelig.net>
Message-ID:

Hi David,

On Mon, 1 Jun 2009, David Sommerseth wrote:

> I was reading through the BoxBackup documentation, and one crucial point
> of why I chose BoxBackup seems to change ...
>
> "The server currently supports a kind of RAID 5 in userland for extra
> reliability. It is designed to use three separate paths which are
> mounted from three separate physical disks (not partitions on the same
> disk!). This is deprecated and will be removed in a future version. We
> recommend that you disable it instead, otherwise you may lose your
> stored data when this feature is removed. "
> http://www.boxbackup.org/trac/wiki/ConfiguringAServer
>
> Is there any reason this will be changed?

Support for it was never finished (no recovery procedure), it is pretty
limited (only supports RAID 5 and three devices) and it was written at a
time when OS/software and hardware RAID were not as ubiquitous or well
supported as they are now.

I can see your point about the usefulness of this for distributed
encrypted backup. However I'm not convinced about the overall merits of
storing the data in three separate locations. It's already encrypted to
the point where a server compromise could get virtually no useful
information out of the backups. You could achieve what you want with
distributed OS-level RAID on iSCSI, ATA over Ethernet or NBD devices.

> I evaluated BoxBackup and set it up before this part of the
> documentation changed.
> Anyhow, there's also a contradictory sentence
> later on in the same URL:
>
> "NOTE Running the server in non-RAID mode has not been tested as
> extensively as in RAID file mode."

Strictly speaking, in my mind, this is not contradictory as it doesn't
say that userland RAID is better or recommended, just more tested.

However I think it may no longer be true. I suspect that few people are
using the userland RAID feature in production. If anyone except David
is, please speak up!

Cheers, Chris.
-- 
_ ___ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |

From boxbackup-dev at boxbackup.org Sat Jun 6 12:00:01 2009
From: boxbackup-dev at boxbackup.org (boxbackup-dev at boxbackup.org)
Date: Sat, 6 Jun 2009 12:00:01 +0100 (BST)
Subject: [Box Backup-dev] Current open tickets
Message-ID: <20090606110001.8B7CD326029@www.boxbackup.org>

Note: to view an individual ticket, use:
https://www.boxbackup.org/trac/ticket/(number)

The following is a listing of current problems submitted by Box Backup
users. These represent problem reports covering all versions including
experimental development code and obsolete releases.
S Ticket Owner  Component     Summary
- ------ ------ ------------- ------------------------------------------------------------
n      4 martin box libraries Port Box Backup to AIX
n      6        box libraries Contribute code: SMTP client, HTTP server, Database drivers,
n      7        box libraries Improve restore speed on local repositories
n      8 chris  box libraries Improve handling of directories with many files
n     13 chris  bbackupd      Fix file locking on Windows
n     14 chris  bbackupd      Fix large file issues on Windows
n     16 chris  bbackupquery  Restore deleted directories may fail
a     17 chris  bbackupquery  List files using wildcards
a     20 chris  bbackupctl    bbackupctl reload reports prior settings
n     45 ben    bbackupd      File diff performance patch (reduced disk IO and wall time
n     46 chris  bbackupd      bbackupd only ever saves reverse diffs, corrupted files on s
n     47 chris  bbackupd      Account numbers greater than 2^31 (0x7fffffff) do not work c
n     48 chris  bbackupd      Locations that don't exist on first run are never tried agai
n     49 chris  bbackupd      ID map (rename tracking) broken since [288]
n     50 chris  bbackupquery  No way to capture stderr under Windows
n     51 chris  bbackupd      No way to force bbackupd to re-upload files under Windows
n     52 chris  bbackupd      Unable to control the maintenance of old vs. deleted files
n     53 chris  bbackupd      Comparing root directory locations does not work under Windo
n     54 chris  bbackupd      Locations not found on disk (e.g. unmounted filesystems) can
n     55 chris  bbackupd      Should store and preserve directory timestamps

20 tickets total.

From boxbackup-dev at boxbackup.org Sun Jun 7 12:01:00 2009
From: boxbackup-dev at boxbackup.org (James O'Gorman)
Date: Sun, 7 Jun 2009 12:01:00 +0100
Subject: [Box Backup-dev] Point in time restores?
Message-ID: <82F7370C-2380-4059-A852-2225ABFF174F@netinertia.co.uk>

Hi,

How feasible would it be to implement a point in time restore feature? e.g.
I have a group of files or directories which have been deleted/modified,
but I want to restore the data as it was backed up on a certain date.

This is a feature I use quite a lot with TSM at work and can be quite
handy, but I'm not sure if Box Backup stores enough information to be
able to do this?

I'm aware we can restore "old" objects, but this is only one at a time,
I think...

James

From boxbackup-dev at boxbackup.org Mon Jun 8 19:03:24 2009
From: boxbackup-dev at boxbackup.org (Chris Wilson)
Date: Mon, 8 Jun 2009 21:03:24 +0300 (EAT)
Subject: [Box Backup-dev] Point in time restores?
In-Reply-To: <82F7370C-2380-4059-A852-2225ABFF174F@netinertia.co.uk>
References: <82F7370C-2380-4059-A852-2225ABFF174F@netinertia.co.uk>
Message-ID:

Hi James,

On Sun, 7 Jun 2009, James O'Gorman wrote:

> How feasible would it be to implement a point in time restore feature?
> e.g. I have a group of files or directories which have been
> deleted/modified, but I want to restore the data as it was backed up on
> a certain date.
>
> This is a feature I use quite a lot with TSM at work and can be quite
> handy, but I'm not sure if Box Backup stores enough information to be
> able to do this?
>
> I'm aware we can restore "old" objects, but this is only one at a time,
> I think...

I'm afraid it doesn't store enough information to do it accurately. In
particular, the deletion date of files is never stored, and if a file is
missing from the store, or only newer versions are available, it's not
possible to know whether it did exist at that point in time (and was
later removed by housekeeping) or not.

I've started work on implementation of snapshots, but it's a really big
and tricky job and I'm really busy at the moment so it won't be ready in
a hurry.

Cheers, Chris.
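
The ambiguity described in this reply can be illustrated with a toy model.
The sketch below is not Box Backup's store format: the StoredFile record,
its fields and the Answer values are hypothetical names chosen for
illustration. It only assumes what the message states, namely that stored
versions carry a time, that deletion is known only as a flag without a
date, and that housekeeping may have removed older versions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Answer(Enum):
    CANDIDATE = "best stored version at or before the requested time"
    UNKNOWN_NO_OLDER_VERSION = "no stored version is old enough"
    UNKNOWN_DELETION_TIME = "file is deleted, but not known when"


@dataclass
class StoredFile:
    """Hypothetical view of one file in the store."""
    version_times: List[int]   # times of the versions still stored
    deleted: bool              # flag only; no deletion date is kept


def version_as_of(f: StoredFile, when: int) -> Tuple[Answer, Optional[int]]:
    """Pick the version to restore for time `when`, or say why we cannot."""
    older = [t for t in f.version_times if t <= when]
    if not older:
        # Either the file did not exist yet, or housekeeping removed the
        # version that was current at `when`. The store cannot tell.
        return Answer.UNKNOWN_NO_OLDER_VERSION, None
    best = max(older)
    if f.deleted and best == max(f.version_times):
        # The file was deleted some time after `best`, possibly before
        # `when`, so it may or may not have existed at that moment.
        return Answer.UNKNOWN_DELETION_TIME, best
    # Still only a candidate: an intermediate version may have been
    # removed by housekeeping.
    return Answer.CANDIDATE, best
```

For example, a file with stored versions at times 100, 200 and 300 yields
the version from time 200 when asked about time 250, but nothing definite
when asked about time 50.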
-- 
_ ___ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |

From boxbackup-dev at boxbackup.org Tue Jun 9 19:06:54 2009
From: boxbackup-dev at boxbackup.org (James O'Gorman)
Date: Tue, 9 Jun 2009 19:06:54 +0100
Subject: [Box Backup-dev] Point in time restores?
In-Reply-To:
References: <82F7370C-2380-4059-A852-2225ABFF174F@netinertia.co.uk>
Message-ID:

Hi Chris,

On 8 Jun 2009, at 19:03, Chris Wilson wrote:

> I'm afraid it doesn't store enough information to do it accurately.
> In particular, the deletion date of files is never stored, and if a
> file is missing from the store, or only newer versions are
> available, it's not possible to know whether it did exist at that
> point in time (and was later removed by housekeeping) or not.

Ah, right. Gotcha.

> I've started work on implementation of snapshots, but it's a really
> big and tricky job and I'm really busy at the moment so it won't be
> ready in a hurry.

It'd be a really nice thing to have, but I understand it's a big job.
I'm struggling to keep up with stuff outside work too...

Looks like I'll have to get around my Thunderbird-fail the hard way
then... it deleted all mail over 90 days old :(

James

From boxbackup-dev at boxbackup.org Sat Jun 13 12:00:00 2009
From: boxbackup-dev at boxbackup.org (boxbackup-dev at boxbackup.org)
Date: Sat, 13 Jun 2009 12:00:00 +0100 (BST)
Subject: [Box Backup-dev] Current open tickets
Message-ID: <20090613110000.9B1EB325046@www.boxbackup.org>

Note: to view an individual ticket, use:
https://www.boxbackup.org/trac/ticket/(number)

The following is a listing of current problems submitted by Box Backup
users. These represent problem reports covering all versions including
experimental development code and obsolete releases.
S Ticket Owner  Component     Summary
- ------ ------ ------------- ------------------------------------------------------------
n      4 martin box libraries Port Box Backup to AIX
n      6        box libraries Contribute code: SMTP client, HTTP server, Database drivers,
n      7        box libraries Improve restore speed on local repositories
n      8 chris  box libraries Improve handling of directories with many files
n     13 chris  bbackupd      Fix file locking on Windows
n     14 chris  bbackupd      Fix large file issues on Windows
n     16 chris  bbackupquery  Restore deleted directories may fail
a     17 chris  bbackupquery  List files using wildcards
a     20 chris  bbackupctl    bbackupctl reload reports prior settings
n     45 ben    bbackupd      File diff performance patch (reduced disk IO and wall time
n     46 chris  bbackupd      bbackupd only ever saves reverse diffs, corrupted files on s
n     47 chris  bbackupd      Account numbers greater than 2^31 (0x7fffffff) do not work c
n     48 chris  bbackupd      Locations that don't exist on first run are never tried agai
n     49 chris  bbackupd      ID map (rename tracking) broken since [288]
n     50 chris  bbackupquery  No way to capture stderr under Windows
n     51 chris  bbackupd      No way to force bbackupd to re-upload files under Windows
n     52 chris  bbackupd      Unable to control the maintenance of old vs. deleted files
n     53 chris  bbackupd      Comparing root directory locations does not work under Windo
n     54 chris  bbackupd      Locations not found on disk (e.g.
unmounted filesystems) can
n     55 chris  bbackupd      Should store and preserve directory timestamps

20 tickets total.

From boxbackup-dev at boxbackup.org Sat Jun 6 13:04:59 2009
From: boxbackup-dev at boxbackup.org (David Sommerseth)
Date: Sat, 06 Jun 2009 14:04:59 +0200
Subject: [Box Backup-dev] Soft-RAID support
In-Reply-To:
References: <4A23A6FF.3080401@topphemmelig.net>
Message-ID: <4A2A5B6B.3050501@sommerseths.net>

Chris Wilson wrote:
>> "The server currently supports a kind of RAID 5 in userland for extra
>> reliability. It is designed to use three separate paths which are
>> mounted from three separate physical disks (not partitions on the same
>> disk!). This is deprecated and will be removed in a future version. We
>> recommend that you disable it instead, otherwise you may lose your
>> stored data when this feature is removed. "
>> http://www.boxbackup.org/trac/wiki/ConfiguringAServer
>>
>> Is there any reason this will be changed?
>
> Support for it was never finished (no recovery procedure), it is pretty
> limited (only supports RAID 5 and three devices) and it was written at a
> time when OS/software and hardware RAID were not as ubiquitous or well
> supported as they are now.

I would be willing, with some guidance, to look into such a tool, if
that is the main criterion for dropping this support. The soft-raid
solution itself seems to work flawlessly and seems to only need this
recovery tool. Or are there any other, not too well known issues with
the soft-raid which should make me worried? Are there any critical bugs
related to the current implementation?

> I can see your point about the usefulness of this for distributed
> encrypted backup. However I'm not convinced about the overall merits of
> storing the data in three separate locations. It's already encrypted to
> the point where a server compromise could get virtually no useful
> information out of the backups.
> You could achieve what you want with
> distributed OS-level RAID on iSCSI, ATA over Ethernet or NBD devices.

Regarding encryption, yes, that is one key element. But if the
organisation loses one remote storage with the complete backup
directory, that storage contains all the information needed to begin
cracking the encryption. If you need a minimum of 2 sets to be able to
crack the encryption, you have another layer of security. And it was
this combination which caught my attention. When you add locally
encrypted disks, you have the third layer of security.

In general, to achieve the best security, you need as many layers on top
of each other as possible (within your acceptable performance limits, of
course). And when you need to have 2 separate datasets to make the
backup data readable and useful, it is a very good security layer which
you are about to remove, compared to just having the data and the
storage partition encrypted.

With encryption, you are always dependent on the progress of CPU power.
What was considered to be good encryption 3 years ago is not as good
today, because it's much easier to crack due to increased CPU power.
That's why spreading the encrypted data is just as important.

Regarding iSCSI, ATAoE or NBD, that will require more bandwidth. Those
remote sites I was about to set up will not have the capacity on the
connection for such a setup to work efficiently. By using rsync between
the master and the slaves, the transfer is much more efficient.

>> I evaluated BoxBackup and set it up before this part of the
>> documentation changed. Anyhow, there's also a contradictory sentence
>> later on in the same URL:
>>
>> "NOTE Running the server in non-RAID mode has not been tested as
>> extensively as in RAID file mode."
>
> Strictly speaking, in my mind, this is not contradictory as it doesn't
> say that userland RAID is better or recommended, just more tested.

Yes, exactly. And that was also why I chose to set up the soft-raid
solution.
Increased possibilities for security, and better tested.

> However I think it may no longer be true. I suspect that few people are
> using the userland RAID feature in production. If anyone except David
> is, please speak up!

I would also be interested in hearing others' experiences! If I'm the
only one, I agree, there's not much point in continuing this support in
BoxBackup. Then I would need to figure out another way to solve this. I
will not continue on this path if soft-raid disappears for sure in
BoxBackup.

kind regards,

David Sommerseth
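
The "any two of the three sets rebuild the third" property discussed in
this thread can be illustrated with XOR parity. The sketch below is a toy
model under simplifying assumptions, not Box Backup's actual userland RAID
file format: it splits a block into two halves instead of interleaving
stripes, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def xor_bytes(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def split_with_parity(data: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split data into two stripes plus one parity stripe."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\x00")  # pad so both stripes match
    # Note: a and b are plain halves of the (already encrypted) input;
    # only the parity stripe is a combination of both.
    return a, b, xor_bytes(a, b)


def rebuild(a: Optional[bytes], b: Optional[bytes],
            parity: Optional[bytes], length: int) -> bytes:
    """Reassemble the original data when at most one stripe is missing."""
    if [a, b, parity].count(None) > 1:
        raise ValueError("at least two of the three stripes are required")
    if a is None:
        a = xor_bytes(b, parity)   # a = b XOR parity
    elif b is None:
        b = xor_bytes(a, parity)   # b = a XOR parity
    return (a + b)[:length]        # drop the padding again
```

Usage: after `a, b, p = split_with_parity(block)`, each of
`rebuild(None, b, p, len(block))`, `rebuild(a, None, p, len(block))` and
`rebuild(a, b, None, len(block))` returns the original block, while
supplying only one stripe raises an error.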