gemini://tobykurien.com/articles/2022-05-17-smol-data-centre.gmi

I've been listening to several podcasts and watching some YouTube videos by tech hobbyists, and many of them talk about setting up a data centre at home for running various things (like a NAS, Home Assistant, etc.). Almost all of them seem to want to go as over the top as possible, e.g. having terabytes of NVMe storage with ZFS or RAID or some other complicated setup, running Kubernetes clusters on their Pi's, complaining about how they aren't getting their full 10 gigabits of throughput from their NAS even though they re-wired the house with Cat6a, how their Starlink isn't roaming so they have to suffer the indignity of using 5G, etc.

Meanwhile, back at my house, I'm very happily routing my (capped) LTE internet connection through a Pi 2, connected to a 2.4 GHz WiFi AP at 100 Mbit/s, with total shared storage of about 120 GB (of which more than 50% is unused). I didn't want to waste my Pi 4 by using it in place of the bottleneck that is the Pi 2, even though it would increase the throughput of my internet connection, because it's fast enough as it is already. That is to say, my setup is the polar opposite of what seems to be the norm these days among tech enthusiasts, even though I consider myself a tech enthusiast.

I get that for many of these enthusiasts they want to use the hobby as an opportunity to learn some enterprise-grade technology. However, I still clearly remember the days before technology got "good enough" (by my standards anyway), and so I started thinking deeply about what I would consider a minimum viable home data centre, or as I'm calling it here, a Smol Data Centre (inspired by the smolnet). The intention is to host smolnet stuff like Gemini content, but also maybe run my mail server, calendar/contact DAV server, and other small services at home.

My over-arching guiding philosophy in this endeavour is to make it as cheap, simple, and efficient as possible. Unfortunately "simple" can mean different things to different people, so I'd like to define it a bit more clearly. For me, simplicity means:

Tangentially, not being reliant on a specific product or service that may not exist in 10 years' time

This basically translates to: I would prefer to use simple UNIX command line tools that have been around forever, rather than YAML incantations to big shiny new monoliths. Also, for efficiency, I would like to have as few "moving parts" as possible.

Any spare Raspberry Pi's I find in my drawers, which currently consist of several Pi 1's, Pi Zero's, a Pi 2, and a Pi 3 A+

Mount them vertically (like dishes in a dish rack) for passive cooling and minimal dust accumulation

Put a "roof" over them to protect them from dust and possible water leaks from the ceiling.

Powered by car phone chargers (i.e. 12V to 5V USB), connected to a 12V Lithium battery that is charged constantly from mains. This acts as a strong DC UPS at low cost. I could have used powerbanks, but the 12V battery also supplies my modem, switch, and home alarm.

The Pi's boot off a microSD card, and mount a USB thumbdrive as their main storage (/home partition). USB thumbdrives are really cheap, and I find 32 GB to be more than enough for my needs. I have several spares that I can swap in when needed.

The Pi's with ethernet jacks are wired into a switch, but the Pi 3 A+ is connected via WiFi (I know! The horror! It's placed close to the AP though).

I would have liked to run OpenBSD on all the Pi's, but alas it only runs on a Pi 3 or Pi 4. I have it running on the Pi 3. One thing I really like about OpenBSD is that it's easy to get my head around and to administer. As an example, running `mount`, `set`, `export`, or even `ps ax` returns a sane screenful or less of understandable output, unlike in most modern Linuxes, where you have to `grep` the result to find the needle in the messy haystack.

For the older Pi's, I chose to run Alpine Linux, which is also small, stable, and easy to understand.

Where possible, I make the OS microSD card read-only (for reliability), and mount `tmpfs` for things like `/var`. I have to temporarily make the filesystem read-write to do OS updates.
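The read-only/read-write toggle described above can be sketched in a few lines. This is only an illustration under stated assumptions: it targets Linux (so the Alpine machines, not OpenBSD), it reads mount information in the `/proc/mounts` format, and the function names and sample device names are made up for the example.

```python
# Sketch: check whether a mount point is read-only and build the remount
# command needed before/after an OS update. Linux-only assumption: the
# input text follows the /proc/mounts layout.

def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text."""
    options = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry overrides an earlier one
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is currently mounted with the 'ro' option."""
    return "ro" in mount_options(mounts_text, mount_point)


def remount_command(mount_point, writable):
    """Build the argv for remounting mount_point read-write or read-only."""
    mode = "rw" if writable else "ro"
    return ["mount", "-o", "remount," + mode, mount_point]


# Hypothetical sample in /proc/mounts format (device names are illustrative).
SAMPLE_MOUNTS = """\
/dev/mmcblk0p2 / ext4 ro,relatime 0 0
tmpfs /var tmpfs rw,nosuid,nodev 0 0
/dev/sda1 /home ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    print("root read-only:", is_read_only(SAMPLE_MOUNTS))
    print("before update:", " ".join(remount_command("/", writable=True)))
    print("after update: ", " ".join(remount_command("/", writable=False)))
```

On a real machine the text would come from reading `/proc/mounts`, and the remount command would need to run as root.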

I'm not using RAID or ZFS, but plain old ext4 on a USB thumbdrive. Sounds crazy, but I've got thumbdrives from over 15 years ago that are still working reliably! As long as it's backed up, a failure shouldn't be a problem. Here are my thoughts on backups:

I prefer backups to be interactive and manual, rather than automated, for several reasons: I back up to hard drives so I'd like them to be powered off when not in use (reliability and security); I'd have to enter passwords to do backups (security); I get to see if there are errors; I know for sure the backups ran.

I back up to 2 or more drives and create checksums of all backup images/sets to verify integrity.

I don't encrypt backups (sensitive files should be encrypted on the original storage) for reliability and future-proofing (got bitten by TrueCrypt)

Offsite backups: how "offsite" do I really need my backup to be? I'd simply like to protect my backups from fire, theft, or minor flooding. So my "offsite" backup can be one of the backup hard drives stored in the garage, or outhouse, or maybe in a Tupperware container inside the pool motor housing outside!
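The checksum step from the list above could look something like the following. It is a sketch, not the exact procedure used here: the choice of SHA-256, the manifest name `SHA256SUMS`, and the function names are all assumptions. The manifest uses the same "hash, two spaces, path" layout that `sha256sum` prints, so it should also be checkable with `sha256sum -c` from inside the backup directory.

```python
# Sketch: write a checksum manifest for a backup set and verify it later.
# SHA-256 and the manifest file name are assumptions for this example.
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backup images need little RAM."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(backup_dir, manifest_name="SHA256SUMS"):
    """Record a checksum for every file under backup_dir; return the manifest path."""
    backup_dir = Path(backup_dir)
    manifest = backup_dir / manifest_name
    lines = []
    for path in sorted(backup_dir.rglob("*")):
        if path.is_file() and path != manifest:
            rel = path.relative_to(backup_dir).as_posix()
            lines.append(sha256_of(path) + "  " + rel)
    manifest.write_text("\n".join(lines) + "\n")
    return manifest


def verify_manifest(backup_dir, manifest_name="SHA256SUMS"):
    """Return the relative paths that are missing or whose checksum changed."""
    backup_dir = Path(backup_dir)
    failures = []
    for line in (backup_dir / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        expected, rel = line.split("  ", 1)
        target = backup_dir / rel
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel)
    return failures
```

Running `write_manifest` once per backup drive right after a backup, and `verify_manifest` before trusting a restore, keeps the process manual and visible: an empty list means every file still matches.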

In order to simplify connecting my servers to the Internet, I decided to use a tiny VPS at a provider. I am currently just using `autossh` to maintain a constant SSH port forward from each server to the VPS. I would like to learn and use WireGuard, but that adds a lot of complexity and for now the port forward works fine. For web-based services, I have `nginx` on the VPS, which manages the Let's Encrypt SSL certificate, reverse-proxies to my various servers, and adds rate-limiting and other protections. This greatly simplifies the admin required on each of my servers, as it's centrally managed.
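To make the port forward concrete, here is a small sketch that builds an `autossh` command line. The assumptions are mine, not taken from the actual setup: it assumes a reverse forward (`-R`), so that a port on the VPS leads back to a service on the home server, and the host name, user name, and port numbers are placeholders. The `-M 0` option turns off autossh's own monitoring port and relies on SSH's `ServerAlive` options instead.

```python
# Sketch: build the argv for a persistent reverse SSH port forward.
# Host, user, and ports below are hypothetical placeholders.
import shlex


def autossh_reverse_forward(vps_host, vps_port, local_port,
                            user="tunnel", local_host="localhost"):
    """Return an autossh argv that exposes local_port on the VPS as vps_port."""
    return [
        "autossh",
        "-M", "0",                         # no monitor port; use SSH keepalives
        "-N",                              # no remote command, forwarding only
        "-o", "ServerAliveInterval=30",    # probe the connection every 30 s
        "-o", "ServerAliveCountMax=3",     # give up after 3 missed probes
        "-o", "ExitOnForwardFailure=yes",  # exit (so autossh retries) if the forward fails
        "-R", "{}:{}:{}".format(vps_port, local_host, local_port),
        "{}@{}".format(user, vps_host),
    ]


if __name__ == "__main__":
    # Example: a Gemini server on local port 1965, reachable on the VPS at 11965.
    argv = autossh_reverse_forward("vps.example.org", 11965, 1965)
    print(" ".join(shlex.quote(part) for part in argv))
```

With a forward like this, `nginx` on the VPS would proxy to the forwarded port on its own loopback interface, which is where `sshd` binds remote forwards by default.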

Other layers of security can be added to each server, for example, instead of running services inside Docker containers, I'm running them in sandboxes created using BubbleWrap. I think `bwrap` is a very under-appreciated tool and I've been using it to great effect even for my dev workflow, but that's a topic for another post.
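As a rough idea of what a `bwrap` sandbox for a service might involve, the sketch below assembles a command line. It is an illustration only: the set of read-only system directories, the `/data` mount point inside the sandbox, and the example service path are assumptions, and a real service will usually need a few more paths (for instance parts of `/etc`).

```python
# Sketch: assemble a bwrap command that runs a service with a read-only
# system, a private /tmp, and one writable data directory.
# The directory choices are assumptions for illustration.

def bwrap_command(service_argv, data_dir, share_net=True,
                  ro_paths=("/usr", "/lib", "/bin")):
    """Return the bwrap argv that wraps service_argv."""
    cmd = ["bwrap", "--unshare-all", "--die-with-parent"]
    if share_net:
        # Must follow --unshare-all: it re-enables the host network namespace.
        cmd.append("--share-net")
    for path in ro_paths:
        cmd += ["--ro-bind", path, path]   # system files visible but not writable
    cmd += [
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",                 # private scratch space, gone on exit
        "--bind", data_dir, "/data",       # the only writable host directory
        "--chdir", "/data",
    ]
    return cmd + list(service_argv)


if __name__ == "__main__":
    # Hypothetical service binary and data directory.
    print(" ".join(bwrap_command(["/usr/bin/my-gemini-server"], "/home/gemini")))
```

The point of the design is that the service sees only what is explicitly bound in, so a compromise is limited to the one data directory.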

This is an ongoing exploration for me, and I'm struggling to find online discussions around the topic of minimal self-hosting at home. If you have ideas or want to have a discussion, do reply to this post on Antenna or get in touch. I'd love to hear from you!

Smol Data Centre

Simplicity

Physical

OS

Backups

Security and connecting to the Internet

Thoughts?