Aside from the author dismissing tape because "drives are expensive," then concluding that the primary disadvantage of optical is that "the discs are expensive" (wow, big surprise...), without ever bothering to actually calculate the costs of either...
...and not describing how they tested for errors...
...and not having particularly rigorous testing methodology (the DVD was apparently left for three months before he got around to checking it)...
...no organization looks at a requirement like "archive these records for 50 years" and then commands its IT department to only store the data on media that will be expected to survive 50 years.
What you do is store the data in such a way that your chance of losing data per year stays within acceptable margins, on the most suitable storage method at the time, and migrate it periodically when that system's chance of data loss falls outside acceptable margins, becomes too impractical to operate/maintain, or is overtaken by improvements in reliability or practicality elsewhere.
That can include factors like cost, availability of knowledgeable labor (ask the IRS how finding COBOL programmers is going), parts/service availability/cost, and so on.
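The per-year framing above can be made concrete: if a medium loses data with probability p in any given year, the chance of surviving N years is (1 - p)^N. A quick sketch, with made-up illustrative probabilities (not measured figures for any real medium):

```python
# Cumulative chance of data loss over an archive's lifetime, given a
# fixed per-year loss probability. The numbers below are illustrative.

def loss_over_years(p_per_year: float, years: int) -> float:
    """Probability of at least one loss event within `years` years."""
    return 1.0 - (1.0 - p_per_year) ** years

# A medium with a 1% annual loss chance is nowhere near "99% safe"
# over a 50-year retention requirement:
print(round(loss_over_years(0.01, 50), 3))   # ~0.395

# Two independent copies (both must fail in the same year, assuming
# annual verification and replacement of a failed copy):
print(round(loss_over_years(0.01 ** 2, 50), 4))   # ~0.005
```

This is why the requirement is framed as acceptable loss per year with periodic verification and migration, rather than as a single medium's rated lifetime.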
Other comments:
> HDDs, especially NAS disks, have an “always on” assumption
Even in the early 2000's, commercial NASes were offering systems that could power down portions of the array that were unused. There was significant interest in minimizing opex, and that's mostly power. Nowadays unraid and other solutions allow for the same.
> And if you ever wanted to take disks offline and store them on a shelf, you don’t really know how long they’ll survive - unless you plug them in every now and then.
"you don't know if that HDD is gonna work if it's sitting on a shelf" is irrelevant because your backups shouldn't be sitting on a shelf for any significant period of time, regardless of the media/mechanism. See discussion above: you should be rotating your media off-site periodically, bringing it back for tests. The biggest problem with HDDs is that SATA connectors are not rated for frequent use; their connect/disconnect rating is often in the range of "hundred or so." (This can be partially addressed by using a cable that stays plugged into the drive, and replacing the cable when it is past a certain number of cycles.)
> if you ever wanted to take disks offline and store them on a shelf, you don’t really know how long they’ll survive - unless you plug them in every now and then.
> And yet, the expectation is, that you will be able to read the disks without problem - even with zero maintenance. Indeed, that was the outcome of my ~15 year experiment!
That's not the expectation, at all? CD-Rs used to have a lifetime measured in months before they started showing failures, especially if they had an adhesive label. I don't understand how the author possibly could be reading ten-year-old CD-Rs. That's basically impossible.
DVD-Rs fared a bit better, and Blu-ray discs better still.
The issue with HDDs sitting on a shelf is the lubrication of the spindle motor.
> any NAS will automatically check for errors, and notify if problems are found.
No. No. No. NO. This is probably one of the biggest myths of RAID and NAS devices.
Just because you have a RAID array doesn't mean it is configured to scrub itself; scrubbing often has to be enabled explicitly.
Just because you have a RAID array scrubbing doesn't mean you're going to find out about problems. Make sure your reporting actually works, ideally report on success rather than only on failure, and have something else that alerts you when a report fails to arrive at all.
Last but not least:
Just because your RAID array controller (or software) finds a parity error or unreadable block doesn't mean it does what you expect, and most people expect "beep bop boop, parity/mirror inconsistency detected. Fixed, meatsack! Pat yourself on the back for being smart in using RAID."
Reality: "beep bop boop, parity or mirror inconsistency detected, so go and verify your files or restore from backup, meatsack."
RAID arrays cannot self-repair mirror or parity errors because there is no way to tell which side is wrong. The array controller has no idea whether it's the parity that is bad or the data. In a mirror, it has no idea which drive holds the correct copy, only that the copies disagree. RAID is not for data consistency.
This is why ZFS is usually the top pick for data consistency: it stores a checksum for every block of data and metadata, in addition to the on-disk redundancy via mirrored copies or parity. When it finds an inconsistency, it can check each candidate (either side of a mirror, or the data versus a parity reconstruction) against the stored checksum and keep the one that verifies.
> I don't understand how the author possibly could be reading ten year old CD-Rs.
The cyanine-dye CD-Rs were the most unreliable, but the later phthalocyanine and azo discs are more robust. Mine all read back fine, so long as they have been stored well without sunlight exposure.
Small addendum: The default installs of OpenZFS automatically configure monthly scrubs (typically on the first Sunday of the month), which is a great thing.
Unfortunately, depending on your local mail setup and your user account setup, the mail sent on errors may be delivered to the root account and never actually reach you.