[Box Backup-dev] COMMIT r235 - box/chris/boxi/bin/bbstored

Chris Wilson boxbackup-dev at fluffy.co.uk
Thu Dec 15 15:36:09 GMT 2005


Hi Ben,

>>  When the store is corrupt on the server (e.g. missing raidfile directory), 
>>  bbstored will die when trying to read it, which gives the client a 
>>  TLSReadFailed. I couldn't think of a less helpful error to send to the 
>>  client, so I tried to improve it.
>> 
>>  I wanted the client to be able to phone/email the server operator and say 
>>  something more useful than "it doesn't work" or "the server disconnected 
>>  me". Perhaps they shouldn't care about RAID files, but the error should at 
>>  least indicate that the store is corrupt and the server admin should run 
>>  bbstoreaccounts check on it.
>
> Surely the server admin will be monitoring their server? If not, then you 
> should not be trusting them anyway.

Ideally, yes. But Real Server Admins(TM) are not perfect: they make 
mistakes, are overworked and overstretched, and don't always have time 
to read all their logs, let alone act on them.

As a client I would verify my backups regularly rather than trusting the 
server operator to read and understand their logs. Or even better, I 
would expect my backup software to do that for me. And I would expect a 
better error message than TLSReadFailed if the store really was corrupt - 
which, as you say, a good admin should already have spotted and fixed
before allowing me to connect again.

Then again, I'm probably not the average client, and I'm willing to listen 
to others' views and do whatever I can to improve things in such a way 
that nobody disagrees.

> In most cases, that error will not happen because of corrupt stores. Do 
> you really want the client to think their backups are corrupt when most 
> likely they're not?

I just wanted to point out that it was a possibility, along with network 
issues. I didn't think the client would automatically blame the server 
admin for a corrupt account when that was only the third possibility 
listed, after network issues. But I'm happy with the other, potentially 
less scary message as well.

>> >  In which case, checking for a RaidFile exception isn't quite right, 
>> >  because you're missing the case where the object ID refers to a file.
>> 
>>  Sorry, what case? Surely that's the client's business. It always asks the 
>>  server for an object ID.
>
> An object can be a file or a directory. If you try to use a file object ID 
> where a directory is expected, you'll get a different error.

But that wasn't the error I was trying to catch. I made the least invasive 
change that stopped the server from crashing when I had a particular 
problem with the store. I'm not yet familiar enough with the code, or what 
exceptions could be thrown by that code path, to write a completely 
general error handler for that List command. Do you want me to try anyway?
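
For illustration only, here is a rough sketch of what such a narrow handler 
could look like. It is not the actual bbstored code: StoreCorruptError, 
Reply, ReadDirectory and HandleList are all hypothetical names, and the 
storeDamaged flag only simulates a damaged store. The idea is to catch the 
one known failure, keep the RAID file details on the server side, and send 
the client a reply it can pass on to the server operator.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; none of these names come from the Box Backup source.
struct StoreCorruptError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct Reply {
    bool ok;
    std::string message;
};

// Simulated directory read that fails the way a missing raidfile
// directory might. The storeDamaged flag exists only for this sketch.
std::string ReadDirectory(long objectId, bool storeDamaged) {
    if (storeDamaged) {
        throw StoreCorruptError("cannot read object from store");
    }
    return "listing for object " + std::to_string(objectId);
}

// Handle a List request: turn the one known store failure into an error
// reply instead of letting the exception end the connection.
Reply HandleList(long objectId, bool storeDamaged) {
    try {
        return Reply{true, ReadDirectory(objectId, storeDamaged)};
    } catch (const StoreCorruptError &) {
        // Implementation details such as RAID files stay out of the reply.
        return Reply{false,
            "Store may be damaged; ask the server operator to run "
            "bbstoreaccounts check"};
    }
}
```

Only the one exception type is caught here, so any other failure (for 
example a file object ID used where a directory is expected) would still 
propagate as before. A fully general handler would need to know every 
exception that code path can throw.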

> And surely the fact that it's using a RaidFile is an implementation detail 
> which the client has no need to know, and in fact, just obscures the real 
> problem?

Which is what, a corrupt store?

> That's the one.

OK, sorry, fixed in branch.

Cheers, Chris.
-- 
_ ___ __     _
  / __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |



