  • There have been some great responses here; I’m adding my own.

    I’ve been in the Infrastructure/Security space for 22 years and have been homelabbing for at least that long. I’ve run everything from a 42U rack in my apartment and a dedicated server room in a previous home down to a single Raspberry Pi 3 hosting what I needed. Things change, and since it’s a hobby for me it’s definitely fun to experiment, but here’s what I’ve learned over the years.

    In an enterprise/production environment, especially with today’s “cloud” everything, machines that are not being utilized are wasted resources. In both a professional and a homelab setting, I don’t look at adding machines until I’ve hit 70-80% utilization of CPU and RAM. If I’m constantly at 30-40% utilization, the server(s) are doing their job with plenty of headroom. When utilization starts to sit consistently at 70-80%, I ask myself the following question.

    “Do I need to add another machine, or am I simply hosting things that I don’t use or need?” The answer decides whether I scale out my lab or reduce the workloads.

    I don’t think there’s a best practice here. It’s going to come down to an individual answering that question for themselves.

    If you don’t have it already, I would highly recommend adding some monitoring to your system, something like the Prometheus/Grafana/Node Exporter stack. This will give you a good feel for what’s happening on your Proxmox host and in each VM. From there you have some data to drive your decision.
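    As a rough sketch of what that data could look like, here are two PromQL expressions over standard Node Exporter metrics, wrapped in a small Python helper that builds a URL for Prometheus’s instant-query HTTP endpoint. The server address and the helper name are assumptions for illustration; adjust them to your own setup.

```python
from urllib.parse import urlencode

# Assumed address of the Prometheus server; change this for your lab.
PROMETHEUS_URL = "http://localhost:9090"

# Percent of CPU time not spent idle, averaged per host over 5 minutes.
CPU_BUSY_PERCENT = (
    "100 - (avg by (instance) "
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
)

# Percent of memory in use, based on what the kernel reports as available.
MEM_USED_PERCENT = (
    "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100"
)


def instant_query_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build a URL for an instant query (hypothetical helper name)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Print the URLs; fetch them with any HTTP client or paste into a browser.
    print(instant_query_url(CPU_BUSY_PERCENT))
    print(instant_query_url(MEM_USED_PERCENT))
```

    The same two expressions can be pasted straight into a Grafana panel if you would rather look at graphs than raw JSON.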

    One thing to keep in mind: let’s say you have a 4 core, 8 thread machine running Proxmox. The host’s kernel scheduler time-slices the VMs’ virtual CPUs across those 8 threads as fast as it can, and the kernel reclaims memory back to the system where it can. So you could have 8-10 VMs on your Proxmox host, each with 1-2 cores, and unless they are all being heavily utilized at the same time, overprovisioning CPU cores like that is totally fine. Memory matters more in that regard, as does having fairly quick underlying VM storage to avoid IO delay.
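    To put numbers on that, here is a minimal sketch of the arithmetic. The VM mix and memory sizes are made-up examples, and the function names and the 2 GB host reserve are assumptions, not anything Proxmox defines.

```python
def vcpu_overcommit_ratio(vcpus_per_vm, host_threads):
    """Total vCPUs handed out divided by the host's hardware threads."""
    return sum(vcpus_per_vm) / host_threads


def memory_fits(vm_ram_gb, host_ram_gb, host_reserve_gb=2):
    """True if all VM allocations fit in RAM with some left for the host.

    host_reserve_gb is an assumed cushion for the hypervisor itself.
    """
    return sum(vm_ram_gb) + host_reserve_gb <= host_ram_gb


# Example: ten VMs on a 4 core / 8 thread host, five with 1 vCPU, five with 2.
vcpus = [1] * 5 + [2] * 5
ratio = vcpu_overcommit_ratio(vcpus, host_threads=8)
print(ratio)  # 15 vCPUs over 8 threads

# Example: the same ten VMs with 2 GB each on a 32 GB host.
print(memory_fits([2] * 10, host_ram_gb=32))
```

    A CPU ratio above 1.0 just means the VMs share threads, which is fine while they are mostly idle. Memory is the one to keep honest, since a VM that cannot get its RAM hurts far more than one that waits a moment for a core.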

    The setup I’m working on codifying at the moment is an 8 core N305 Mini PC with 16 GB RAM and 500 GB M.2 storage running Proxmox. This machine has dual NICs, so I can use a trunk for VLANs and whatnot. I keep all networking services there, such as multiple resolvers (Pi-hole or whatever), the UniFi controller, and anything that’s not coupled directly to my homelab services.

    The second machine I have is an i5-8279U (4c/8t) mini PC with 32 GB RAM and 2 TB of M.2 PCIe 3 NVMe storage running Proxmox. It runs all of the main services in my lab/home and does quite well. If I want to test a new piece of software or just experiment, I tend to use it temporarily to better understand the workload needs of the deployment/VM. If I decide that something is very resource intensive and will push the overall host to 70-80% utilization, I start looking at adding another node.

    I treat my “fleet” very much as a utility. So for instance, if I have Home Assistant (which I’m currently not on, but moving back to soon), I might allocate 2 cores, 4 GB of RAM, and 40 GB of storage, and leave it at that. Keeping things smallish and monitoring very closely over a period of weeks and months helps me determine whether I’m “rightsized” well enough for whatever it is I’m running. Having the monitoring stack in place really helps me understand what’s going on and how I should pivot, if at all.

    If I see small recurring spikes in utilization (say 5-10 minutes a few times a day), I simply ignore them, because that’s just how things go. But if utilization is pegged at that level for a long time, 12-24 hours as an example, then it’s time to start going down the path of resource allocation, be it changing the Proxmox settings or maybe adding another machine.
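    That spike-versus-pegged distinction can be sketched as a small check over utilization samples. This is only an illustration: the 70% threshold and 12 hour window come from the numbers above, while the function names and the sample data are made up.

```python
def longest_run_seconds(samples, threshold=70.0):
    """Longest continuous stretch at or above threshold.

    samples: list of (unix_seconds, percent) tuples, sorted by time.
    """
    longest = 0
    run_start = None
    for ts, pct in samples:
        if pct >= threshold:
            if run_start is None:
                run_start = ts  # a new busy stretch begins here
            longest = max(longest, ts - run_start)
        else:
            run_start = None  # dropped below threshold, stretch ends
    return longest


def needs_attention(samples, threshold=70.0, sustained_seconds=12 * 3600):
    """True only if utilization stayed high for the whole sustained window."""
    return longest_run_seconds(samples, threshold) >= sustained_seconds


# One sample every 5 minutes for 24 hours (288 samples).
# A 10 minute spike to 95% in an otherwise quiet day:
spiky = [(i * 300, 95.0 if 10 <= i <= 12 else 30.0) for i in range(288)]
# Pegged at 85% for the first 13 hours:
pegged = [(i * 300, 85.0 if i <= 156 else 30.0) for i in range(288)]

print(needs_attention(spiky))   # short spike, ignore it
print(needs_attention(pegged))  # sustained load, time to look at resources
```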

    I really don’t think about growing my homelab. I just think about what I need to get the job done and what I need to experiment, and move on from there.

    I hope this was helpful.