Lab Infra Rebuild Part 1


I’ve been posting on socials a bit about rebuilding my lab and some opinions I had on tools, approaches and more. Some people have asked for a way to keep up with my efforts, so I figured it might be time to post here for the first time since 2018!

In this post I’ll focus on what came before, a bit of a recap of my previous setup. In addition to a general software refresh, I have now been in Malta for 8 years and a lot of my office hardware was purchased around the time of moving here, so we’ll also cover replacing NAS servers and more.

My previous big OS rebuild was around the CentOS 7 days, so that’s about 3 years ago now. High time to revisit some choices.

My infra falls in the following categories:

  • Development machines: usually 10 or so Virtual Machines that I use mainly in developing Choria.io
  • Office support equipment: NAS, Printers, Desktops, Laptops etc
  • Networking equipment: mainly home networking stuff for my locations
  • Hosting publicly visible items: This blog, DNS, Choria Package repos etc
  • Management infrastructure: Choria, Puppet, etc
  • Monitoring infrastructure: Prometheus and friends
  • Backups: for everything
  • General all-purpose things like source control etc
  • My actual office

Below I’ll do a quick run through all the equipment, machines, devices etc. I use regularly. I’ve largely replaced it all and will detail that in the following posts. It’s not huge infra or anything, all told about 20 to 30 instances in 5 or 6 locations.

Development Machines

I’ve had 3 x Intel NUC machines, each with 512GB solid state and 8GB RAM since 2016. They have served me well and really I could see them going for a while longer.

They were each a KVM host and had on them 3 to 8 CentOS VMs. One had some bigger ones as that was my general shell machine and the others were mainly there to add some servers to my node count for Choria development. Choria is inherently a distributed system, so having some more nodes helps.

For my needs they were great but unfortunately they have a hardware problem: their BIOS batteries die and they are soldered onto the board, so not exactly user serviceable.

Two of these are just retiring - I will remove the SSDs and see what they can be used for, more on that later - but otherwise they are done.

One of them is actually a bit newer so it’s finding a new home as a little desktop for my 6 y/o mainly for Scratch and Minecraft etc.

I have been using Red Hat since the version 0.9 Halloween Beta release, then used CentOS for ages - even donated hardware - and after they seemed to be intent on self-destruction I started moving to Alma Linux.

Public Hosting and Management

I host my blog and a few other public bits myself. When I rebuilt all my things 3 years ago I wanted to get some more real Kubernetes experience so I got 3 x 8GB droplets from Digital Ocean with a Kubernetes cluster.

There I hosted the sites, used their managed MySQL, used their Object store and volumes.

It’s been fine generally, I don’t think I like the complexity for what I needed but it was good experience - more on this later. This cluster started with Linode managed KE but it was early days for them in Kubernetes and so I bailed - their support was terrible at the time wrt Kubernetes.

This was surprising as I’ve been a Linode customer since almost their day one and they have been consistently amazing. But when they launched managed Kubernetes it seemed their EU timezone support people were level 1 only and I always ended up waiting till US time to get things resolved, several times with outages lasting many hours. It’s a sad outcome. At the time I moved everything I felt comfortable moving away from them, fearing it was the start of a downward slide in quality. Apart from the things I am reluctant to move, Linode has essentially lost me as a customer.

So I moved to Digital Ocean. It was fine, I have only really one pain with them - they do forced updates of the Kubernetes version (fine) but the way they do it always results in 2 full upgrades and node replacements. This means my outgoing IPs for things like Prometheus kept changing which was extremely annoying.

The Kubernetes infra also ran a 3 node Choria Broker cluster, Prometheus+Alert Manager and Graphite. Alerts went to VictorOps, which I pay for.

Apart from that I had Linode machines for Puppet, Name Servers, Choria Package repos and some general use machines that I should have killed years ago.

I guess the bills for this came to like $400 a month, helps to have a company to bill this to.

Office Equipment

My main office desktop is one of the last Intel iMacs with an old, like 2009 era, Apple Thunderbolt display. I’ve had iMacs since the very first one and they were one of my favourite form factors. It’s a bit sad about this one as it was a really nice machine but Apple is moving fast to stop supporting Intel so now there are no more OS updates.

The screen being ridiculously old and Thunderbolt only - no HDMI - is now just useless.

The current iMacs do look nice but the problem is there is no screen I can put next to them that doesn’t look like crap, and I want either a big monitor or 2. It’s just not working anymore to use iMacs.

I have an old tank Brother printer that is maybe 15 to 20 years old. It just refuses to die but I think its paper feed has now dried up so maybe time to go.

Other than things actually stuck to my office I use a range of new MacBooks - currently a 16 inch M2 Pro with 32GB RAM.

For file storage I use Dropbox of course but QNAP devices for larger storage. I have an old TS-439 that is now 14 years old and still getting security updates (!!!), this is at my office. At home I have a TS-451+. Each provides around 4TB of usable space.

At home I also have a 34 inch ultra-wide display that I put my Laptop on when I sit at the desk here. I got this 7 years ago now so getting a bit old.

Networking Equipment

I have 4 locations I work from often with bits of equipment scattered everywhere. Some time ago I had a mix of networking equipment but I’ve been standardising on Ubiquiti. I am still sad about Apple retiring their Wi-Fi range. Mikrotik was nice enough but I wanted something a bit more polished.

In Malta I have 3 Dream Machines (the round tube one) and around 16 switches, APs, etc. My house is 450 years old with 1.5 meter thick walls so every room gets an AP. I like the single vendor system as everything shares a single management pane and generally things like VPNs etc. just work.

In Latvia I have a Dream Machine Pro as I have a set of surveillance cameras there. I am considering upgrading to the Pro at my main house at least, as I’d like to add a camera outside.

In Malta I used to be on Melita for everything, now on Go who have a really great new all fibre to the house network. I have one location left to move and then I’ll have 1Gb links everywhere. In Latvia a 4G link that’s actually quite surprisingly good (190 Mbps down 11 up).

Firewalls tended to be a mix of the Ubiquiti devices and iptables.

General Infra

My mail has been hosted at Fastmail for years and they are amazing. The web UI is a bit of a mess to be honest but at least it doesn’t keep changing. Their mail hosting features are rock solid though.

I used GitHub for everything since they introduced unlimited Private repos. I do pay for GitHub though.

Backups

I have a 30TB Hetzner machine that runs Bacula. It does daily backups of everything I have and does a full 3 month rotation of Full backups with 1 month of Incrementals onto a RAID-10 disk setup.
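
A rough sketch of how a retention rule like that could be expressed, purely for illustration. This is not Bacula configuration; the function name and the 90 and 30 day approximations of "3 months" and "1 month" are assumptions.

```python
from datetime import date, timedelta

# Assumed windows: "3 months" of Fulls and "1 month" of Incrementals,
# approximated here as fixed day counts.
FULL_RETENTION = timedelta(days=90)
INCREMENTAL_RETENTION = timedelta(days=30)


def is_retained(level: str, taken: date, today: date) -> bool:
    """Return True if a backup of this level is still inside its window.

    `level` is either "full" or "incremental" (hypothetical labels).
    Backups dated in the future are treated as not retained.
    """
    windows = {
        "full": FULL_RETENTION,
        "incremental": INCREMENTAL_RETENTION,
    }
    age = today - taken
    return timedelta(0) <= age <= windows[level]
```

With these assumed windows, a Full taken 60 days ago would still be kept, while an Incremental of the same age would already have been rotated out.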

The main QNAP syncs my Dropbox onto its disks daily.

The main QNAP is set up as 2 x RAID-1 volumes. The main volume is synchronised daily to the office QNAP and I do monthly manual disk rot checks and sync between the RAID-1 volumes. This gives me a 1 month old full backup of the entire QNAP at home and daily off-sites. Monthly I also sync the QNAP to the Hetzner machine.

This means every file hits 9 drives in 3 locations over 2 countries with access to file versions going back 3 months. Ample time to recover from user error like deleting the wrong files and also a lot of redundancy built in.

Physical Office

I’ve had an Office in a town called Mosta here for 4 years now, but I have not used it much because parking around it was hell. On paper it was 5 minutes from School and I would stay there while the boy is at School - in practice it could take me 40 minutes to get parking and 12 to drive back home. And when I found parking it could be very far from the office - walking along pavements in Summer is no joke here.

It just never worked, I ended up working from home most of the time. I should have bailed out of it years ago but just never got around to it.

Conclusion

So that about rounds it up, keep reading to hear what literally everything is being replaced with and what new things are being added to the mix to boot!