
First, there is a FreeBSD Errata Notice for this that offers a nice, quick collection of the various bugs and subsequent repairs, with links to summaries, for anyone who is catching up on this issue:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275308

Second, I don't like the editorialization of this title ("In OpenZFS and Btrfs, everyone was just guessing") at all. No, nobody was "just guessing", but as far as I know there is no featureful FS that has undergone formal verification. It's a large codebase started by a now long-defunct company that solved critical problems and delivered a lot of value, but there absolutely can be issues lurking over long time periods. If anything, it's a testament to the project that work and usage did uncover the issue, that it wasn't brushed away but instead drilled down on and solved in short order, and that the already extensive test suites are now being expanded again in an organized way. And speaking of "critical problems":

Third, a lot of the commentary around these sorts of things seems to indulge in noticing the rare misses while ignoring the many hits. Among other reasons, a core part of my motivation for switching to ZFS 100% in 2010/2011 or so, and doggedly sticking with it ever since, was precisely that I experienced data rot (permanent corruption) with my data under previous filesystems where that WAS NOT A BUG. HFS/UFS/NTFS/XFS/whatever, none of them offer any guarantees of data integrity in the first place! Bit flips somewhere, hardware has issues, copying has noise, whatever? RAID-5 write hole? Those and lots of other things are not bugs at all in old filesystems, because that's just how those primitive things were. I've been carrying data forward since my first Apple IIe, and when I went back to find some of my old work, early digital photos and drawings I cared about, somewhere along the line it had gotten mucked up. I know not where or when, because there was no chain of checksumming and trust that would have had a chance to alert me. It's impossible for a human to keep up with terabytes of data manually; it has to be fully automated, baked in. At least in ZFS, any corruption at all is a drop-everything, big-deal problem that is incredibly rare and niche, and that serious people will care very much about. Pretending the old stuff was good or even acceptable is pure bullshit.

It doesn't have to be mathematically perfect to deliver value, and importantly to be far superior to everything that came before. And it's inarguable that it's been battle tested very very hard for a very long time at this point. It's certainly saved my bacon a few times.
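
To make the "chain of checksumming" point concrete, here is a minimal userland sketch of the idea: record a SHA-256 digest per file once, then re-verify later to find out which files no longer match. The function names (`file_digest`, `build_manifest`, `find_damaged`) are made up for illustration, and this is not how ZFS does it. ZFS checksums every block inside the filesystem and can repair from redundancy, whereas a manifest like this can only tell you that something changed, not whether the change was rot or a legitimate edit.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root, manifest):
    """Return relative paths that are missing or whose digest changed."""
    root = Path(root)
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            damaged.append(rel)
    return damaged
```

You would call `build_manifest` on known-good data, store the result somewhere safe, and run `find_damaged` against it on a schedule. The obvious weakness is that the manifest itself has to be trusted and kept current by hand, which is exactly the bookkeeping that has to be automated and baked in.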



Yep, this is how I ended up running ZFS. Because with XFS the answer to "oh no a power cut" was "lol, a bunch of files might be zeros. Figure out which ones."

The whole weird implication that there's something else which is perfect is always bizarre: there isn't, everyone knows that, it's extremely non-trivial to do that.
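
For what it's worth, the "figure out which ones" step can at least be scripted. A rough sketch, assuming the damage shows up as non-empty files consisting entirely of NUL bytes (the helper names are hypothetical, and this heuristic will miss files that were only partially zeroed, and will flag files that are legitimately all zeros):

```python
from pathlib import Path


def is_all_zeros(path, chunk_size=1 << 20):
    """True if the file is non-empty and every byte is 0x00."""
    seen_data = False
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            seen_data = True
            # bytes.count(0) counts NUL bytes; any other byte means real data
            if chunk.count(0) != len(chunk):
                return False
    return seen_data


def find_zeroed_files(root):
    """Return sorted relative paths of files under root that are all zeros."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and is_all_zeros(p)
    )
```

Empty files are deliberately not reported, since a zero-length file is usually intentional rather than a symptom of a truncated write.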


> Because with XFS the answer to "oh no a power cut" was "lol, a bunch of files might be zeros. Figure out which ones."

Got bitten by that one once as well. I recall the XFS FAQ had something in it like: "If you know it hurts, then don't do it."


Wasn't this XFS problem solved by write barriers?

I remember that happening to me once long ago, but never on RHEL 7 or above.


I just want to say that you can do checksums on... I mean under XFS/ext etc. on Linux with dm_integrity. I've talked to many admins who have never heard of this device-mapper target.



