Well, it looks like I’ve got some new drives coming in for my home built NAS. This will significantly increase the amount of space available and also gives me the chance to geek out a little and play with a newer setup.
Why?
Over the years, I’ve accumulated a bunch of digital content. People sometimes use a bunch of external drives but I’ve always preferred to centralize and for many years, I’ve used a NAS.
Since law school (2006), I’ve used a Linux-based NAS. That was after a catastrophic data loss back when I was living in Austin, on a system whose drives were configured as plain JBOD. The total available space has of course increased over time, but I’ve consistently used mdraid and XFS as my foundation, with a few years where I incorporated LVM.
In the last few years, I’ve also gotten more into photography and now regularly have several gigabytes of photos to pull off of memory cards every week or three. That, and I’ve been getting into film as well. Oftentimes, the output from the digital cameras (Nikon D850 and Leica Q2) and from my Epson V600 scanner isn’t worth keeping, but there are middling keepers that I’ve found can be good material for use later.
I still use custom desktop computers I put together. The current system is seven years old but still performs admirably. It has always used an SSD, and that has allowed for significantly greater longevity. Not long ago, I was getting frustrated with its speed. This pain was felt most acutely when importing pictures into Lightroom, which is a taxing process no matter what the computer. But then Adobe followed through with long-promised performance updates and the performance has been more acceptable. I also recently got an NVIDIA GTX 1660 that was a significant upgrade over my older Radeon 7850. So processing power wise, the current setup is sufficient. Don’t get me wrong, Ryzen processors perform substantially better than the Intel i5-4570 Haswell I’m running, and I admit we are living in the system’s twilight years. But that’s not the pressure point.
Rather, it is all these pictures. I’ve already upgraded the system’s SSD to provide more space for high-speed system storage. No matter what I do, Lightroom demands that the catalog and the previews are on the local system. So that piece will always remain. But with the NAS I currently have and because the NAS and my desktop are wired to the router, the original RAW files need not stay on the desktop.
The goal of this upgrade was therefore to increase the available drive space and let me rely on the NAS as a central repository.
How?
Over the years, I’ve become very familiar with the capabilities of XFS along with its limitations. It handles small files better now and even deletes files quickly (yes, that was once a problem). But one fundamental issue with the design of XFS is that it provides no protection against bitrot. Since I run a RAID, I practice good data hygiene and scrub regularly. We are, however, now in the 21st century and we have better tools.
For example, now we have filesystems like ZFS and BTRFS that checksum the data while simplifying the software stack so that LVM is no longer needed. I’ve always been curious about ZFS and that filesystem’s alleged ability to avoid bitrot.
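To make the checksum idea concrete, here is a minimal sketch of how a checksumming filesystem can notice bitrot and fall back to a good copy. It is only an illustration: the function names are made up, and it uses Python’s `zlib.crc32` as a stand-in for whatever checksum the real filesystem uses.

```python
import zlib

def store(data):
    """Return the block together with its checksum, as written to disk."""
    return data, zlib.crc32(data)

def read_verified(copies):
    """Return the first copy whose data still matches its stored checksum."""
    for data, checksum in copies:
        if zlib.crc32(data) == checksum:
            return data
    raise IOError("all copies failed checksum verification")

# One healthy copy and one copy where a byte silently changed on disk.
good = store(b"RAW photo bytes")
rotted = (b"RAW photo bytez", good[1])

# The rotted copy is rejected and the healthy mirror is returned instead.
result = read_verified([rotted, good])
```

Plain mdraid with XFS has no stored checksum to compare against, so a mismatch between mirrors can be detected during a scrub but there is no way to tell which copy is the correct one.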
But of course, ZFS’s license (the CDDL) is incompatible with the GPL that covers the Linux kernel. And Linus seems to dislike it as well. But to be fair, ZFS on Linux is quite successful, and Ubuntu (my favored distro) officially supports ZFS. There’s no practical reason why implementing ZFS on my NAS should be difficult.
In comparison, BTRFS is built into the Linux kernel. BTRFS is certainly not as mature as ZFS, which was developed in the last years at Sun Microsystems. Stories of data corruption and data loss haunt the Internet, particularly for RAID5/6. But BTRFS has a killer capability: performing RAID on the chunk level instead of the device level. This allows devices of different sizes to be used together in one filesystem. For a home NAS, this is an appealing possibility.
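A rough way to see why chunk-level RAID helps with mismatched drives is to simulate it. The sketch below is a simplified model, not the actual BTRFS allocator: it assumes RAID1, treats every chunk as one unit of space, and assumes each new chunk is mirrored onto the two devices that currently have the most free space. The function name is hypothetical.

```python
def raid1_usable(sizes):
    """Simulate chunk-level RAID1 over devices of different sizes.

    sizes: free space per device, in whole units (e.g. TB).
    Returns how many units of data fit when every chunk is mirrored
    onto the two devices with the most remaining free space.
    """
    if len(sizes) < 2:
        raise ValueError("RAID1 needs at least two devices")
    free = list(sizes)
    usable = 0
    while True:
        # Pick the two devices with the most free space.
        order = sorted(range(len(free)), key=lambda i: free[i], reverse=True)
        a, b = order[0], order[1]
        if free[b] < 1:
            break  # no second device left to hold the mirror copy
        free[a] -= 1
        free[b] -= 1
        usable += 1
    return usable

mixed = raid1_usable([10, 4, 4, 2])   # all 20 units end up used, 10 usable
lopsided = raid1_usable([10, 2, 2])   # the big drive cannot mirror onto itself
```

In this model the mixed set of drives wastes nothing, while the lopsided set is limited by how much space exists outside the largest drive.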
Did it Work?
Over the course of the last week, I’ve put the drives in and gotten my system back online. Last weekend I ran some diagnostics to confirm that the drives were of the right size, and then shucked them. Although some people run a full test where they read/write every block of the drive, I’m too impatient for that. And my planned migration strategy would do a significant amount of that anyway. Don’t forget, you have to disable the 3.3V pin on these shucked drives with some Kapton tape or another technique.
I pulled one array of drives out of my Fractal Design Node 804 and replaced the 2TB Western Digital Red drives with the new 10TB Western Digital White drives. While I was in there, I rewired and rearranged some things. The SSD now sits up at the front of the case in a built-in holder. I put two of the old 2TB drives at the bottom of the case in the motherboard bay. These two drives will likely become built-in backups for critical data. Given the positioning of the drives, I had to go buy some left-angle connectors so that I could wire them up. But that’s a small price to pay to have these two drives as available storage.
I’ve set up the new array (4x 10TB) as a BTRFS volume using RAID5 for data and RAID1C3 for metadata. One reason I took the plunge at this time was that Linux 5.5 merged BTRFS support for RAID1C3. Given this configuration, my system can survive the loss of one drive and, should there be an issue during the rebuild, a second error in the metadata, which is kept in three copies. This should be sufficient to rescue data if failure comes for me. For the older array that’s still online (4x 4TB), I’ve migrated the data over to the new array. How did I get the data from my other array (4x 2TB) off? I used one of the new 10TB drives to back up the information from the array. After that, I shucked the drives and installed them in the system. I built the BTRFS array using the other three drives and transferred data from the fourth drive over the last few days. This was a good way to test the system and make sure things worked properly.
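For a sense of what this layout yields, here is the back-of-the-envelope capacity arithmetic as a small sketch. It assumes equal-sized drives and ignores the space taken by the three metadata copies and other filesystem overhead, so the real number will be somewhat lower. The helper names are made up for illustration.

```python
def raid5_usable_tb(drive_tb, drives):
    """Usable data space for single-parity RAID over equal-sized drives.

    One drive's worth of space goes to parity, the rest holds data.
    """
    if drives < 2:
        raise ValueError("single parity needs at least two drives")
    return drive_tb * (drives - 1)

def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40

usable_tb = raid5_usable_tb(10, 4)   # 4x 10TB with one drive's worth of parity
usable_tib = tb_to_tib(usable_tb)    # what tools reporting in TiB will show
print(f"{usable_tb} TB usable, about {usable_tib:.2f} TiB")
```

The TB-to-TiB conversion is worth keeping in mind, since drive labels use decimal units while many Linux tools report binary ones.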
Software wise, there have been glitches. BTRFS notoriously complicates the calculation of free space. Several days ago, my array was happily humming along. Then it ran into free space issues when I began copying my photo archives onto the system. Apparently this was once upon a time an issue with BTRFS but had been largely corrected. Except that there were some issues in kernel 5.5. Fortunately, Ubuntu allows you to install mainline kernels if you’re crazy, so I’ve gone down this route until Ubuntu officially supports a kernel that works for me.
Overall the array performs at least as fast as the old XFS setup, if not faster. Lightroom has taken to the NAS quite nicely and so far I’m not noticing any performance loss from storing originals on the NAS. Here’s hoping this setup doesn’t end up being too fragile.
What will I do with the older drives and array? The 4x 4TB array will probably be reformatted in the near future to also use BTRFS. I’m hesitant to bring these into the other array since 1) those drives are older and 2) additional drives would just increase the likelihood that the overall array (i.e., the 4x 4TB drives and the 4x 10TB drives together) would fail. So I’ll probably segregate that data for certain types of content.
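The second concern can be put into rough numbers. The sketch below assumes every drive fails independently with the same probability over some period, and that a single-parity array is lost as soon as two or more drives fail. The 5% figure is purely hypothetical, and the model ignores drive age and rebuild windows.

```python
def p_single_parity_loss(n, p):
    """Probability that two or more of n drives fail.

    p is the assumed independent failure probability per drive.
    The array survives only if zero drives or exactly one drive fails.
    """
    survive = (1 - p) ** n + n * p * (1 - p) ** (n - 1)
    return 1 - survive

p_four = p_single_parity_loss(4, 0.05)    # the 4x 10TB array alone
p_eight = p_single_parity_loss(8, 0.05)   # both sets of drives in one array
print(f"4 drives: {p_four:.2%}, 8 drives: {p_eight:.2%}")
```

Under these assumptions, doubling the drive count in one single-parity array roughly quadruples the chance of losing it, which is why keeping the two sets of drives separate looks attractive.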
The 2x 2TB drives? Those will be live backup drives I think, maybe in RAID1. Or maybe not. If they’re straight up backups then who cares. That leaves 2x 2TB drives that are outside of the case. I’m not sure what’ll happen to those.
The dorkness will continue…