Hello everyone. I have a small service that I have been developing as a side gig for the last year, and I want to self-host it: I got some new hardware, and I want to dive deeper into the world of self-hosting and DevOps (I’m a dev).

For now, I have it set up on a couple of EC2 instances, but things are getting expensive. A friend was decommissioning some workstations (they are moving to a new place and have a bigger budget for a data center), so I got 2 Lenovo workstations, each with a Ryzen 9 5900X, 32GB RAM, and a 1TB NVMe drive.

I want to set up HA for this service and maybe add another computer as a NAS, so my question is: how should I go about doing this? I was thinking of running Proxmox on both nodes and setting up HA, but I think that needs a common data store (I will probably use the NAS for this). How should I go about setting up the HTTP server? I also have some stuff that runs in Docker containers, and I was thinking of having one HTTP server manage all that traffic, with DNS and so on. And what about monitoring? Any help is much appreciated.

  • sanwfa@alien.top

    Think of backup solutions / separate storage as well for all your data.

  • ElevenNotes@alien.top

    Can your service run as HA without hypervisor HA? For example, is it a web service with a database backend?

  • wing03@alien.top

    How about TrueNAS Enterprise with cluster mirroring, giving you HA iSCSI for your hypervisor storage?

    I’m getting out of the mom-and-pop web and mail hosting business and doing a very slow, long shutdown myself by moving out of co-location and into my basement. So for me it’s just TrueNAS Core, plus ESXi with vSphere Essentials to do the VMware snapshots for that end of the business.

  • basicallybasshead@alien.top

    > but I think that needs a common Data store (probably will be using the NAS for this)

    Note that a single shared NAS could be a single point of failure. For true HA, you might want to consider deploying two storage boxes with replication (SAN or NAS), or configuring a hyper-converged infrastructure using solutions like VMware vSAN, Ceph, or StarWind VSAN.
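
    To put rough numbers on the single-point-of-failure concern, here is a back-of-the-envelope sketch. The 99% availability figures are made up for illustration, and the math assumes failures are independent, which real hardware sharing a power feed or switch will not fully satisfy.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components must be up: multiply the availabilities."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """At least one redundant component must be up."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Illustrative assumptions, not measurements.
NODE = 0.99  # each Proxmox node
NAS = 0.99   # each storage box

compute = parallel(NODE, NODE)                        # two HA nodes
single_nas = series(compute, NAS)                     # both nodes depend on one NAS
replicated_nas = series(compute, parallel(NAS, NAS))  # two replicated storage boxes

print(f"two nodes + single NAS:     {single_nas:.4%}")
print(f"two nodes + replicated NAS: {replicated_nas:.4%}")
```

    With these assumed figures, the single-NAS setup ends up slightly below 99%, so no better than the NAS on its own, while the replicated-storage variant lands around 99.98%.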

    • BadrEddine456@alien.top (OP)

      True, I was thinking about using TrueNAS for this, but I don’t know if TrueNAS has any way of doing replication with another node.

  • dazchad@alien.top

    HA involves many factors: service uptime, link uptime, DB uptime, etc. I’d probably put a reverse proxy in front and use the servers as upstreams. Web servers tend to be more reliable, so in your case a single instance ought to suffice.
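
    As a rough illustration of what using the servers as upstreams means, the sketch below rotates requests across the upstreams and skips any that are marked down. A real reverse proxy handles this for you; the `UpstreamPool` class and the `node-a:8080` / `node-b:8080` addresses are hypothetical names for the example.

```python
from typing import Optional, Sequence


class UpstreamPool:
    """Round-robin selection over upstreams, skipping ones marked as down."""

    def __init__(self, upstreams: Sequence[str]) -> None:
        if not upstreams:
            raise ValueError("at least one upstream is required")
        self._upstreams = list(upstreams)
        self._healthy = set(self._upstreams)
        self._next = 0

    def mark_down(self, upstream: str) -> None:
        self._healthy.discard(upstream)

    def mark_up(self, upstream: str) -> None:
        if upstream in self._upstreams:
            self._healthy.add(upstream)

    def pick(self) -> Optional[str]:
        """Return the next healthy upstream, or None if all are down."""
        for _ in range(len(self._upstreams)):
            candidate = self._upstreams[self._next]
            self._next = (self._next + 1) % len(self._upstreams)
            if candidate in self._healthy:
                return candidate
        return None


# Hypothetical addresses for the two workstations.
pool = UpstreamPool(["node-a:8080", "node-b:8080"])
first, second = pool.pick(), pool.pick()  # alternates between the nodes
pool.mark_down("node-a:8080")
after_failure = pool.pick()               # only node-b is eligible now
print(first, second, after_failure)
```

    The proxy itself stays a single instance here, as suggested above; only the application servers behind it are redundant.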

    Aside from actual HA tools, your most important assets at this stage are an uptime check service that pings your server every n seconds, a reliable backup/restore procedure, and a one-button deployment strategy.
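
    A minimal sketch of such an uptime check, using only the Python standard library. The function names, the 30-second interval, the three-failure threshold, and the example URL are assumptions for illustration; an existing monitoring tool would normally do this job, and the checker should run somewhere other than the machines it watches.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional


def check_once(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def monitor(
    check: Callable[[], bool],
    alert: Callable[[str], None],
    interval_s: float = 30.0,
    failures_before_alert: int = 3,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run `check` every `interval_s` seconds; alert once per outage."""
    consecutive_failures = 0
    checks_done = 0
    while max_checks is None or checks_done < max_checks:
        if check():
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            # Fire exactly when the threshold is reached, not on every failure.
            if consecutive_failures == failures_before_alert:
                alert(f"{consecutive_failures} consecutive failed checks")
        checks_done += 1
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical health endpoint; replace print with a real notifier.
    monitor(lambda: check_once("https://example.com/health"), print)
```

    Requiring several consecutive failures before alerting avoids paging on a single dropped request, at the cost of a slower first alert.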

    Shit can and probably will happen. What are you going to do when it does? And how fast can you respond? I say this because you most likely won’t get HA right on the first, second, or third try, unless you already have tons of experience behind you. Embrace failure and plan accordingly.

    • BadrEddine456@alien.top (OP)

      True. In my current setup on EC2 I have two Postgres DBs, one of which is just replicating. But I had an incident where a bug I wrote in my Spring app ate all the available RAM and the VM got stuck, and I lost about 6 hours’ worth of user data. That’s why I’m thinking HA could help: if a VM on one node is stuck or blocked or something of that kind, the hypervisor will spin up another one on a different node. Or am I wrong here?

      • dazchad@alien.top

        If you had monitoring, you wouldn’t have taken 6 hours to catch it.

        I’d say learn HA anyway because it’s a good skill, but that doesn’t prevent you from having the other parts I mentioned. I say this because, again, unless you are experienced with HA, there will be edge cases where it’s not going to do what you thought it would do, and your service will be down all the same. Monitoring/alerting and a one-click/shell-script install will be much more valuable in the short to mid term.

        • BadrEddine456@alien.top (OP)

          True that. I did have Uptime Kuma for simple monitoring, but the 6-hour downtime was on me: I was unreachable, inside a factory doing some troubleshooting with no service, and I forgot to ask for the Wi-Fi creds 🤦‍♂️ (I was stressed that day). As for rollback and install, I run those through GitHub Actions for CI/CD.