I know, I know, clickbaity title, but in a way it did. It also caused the situation in the first place, but I’m just going to deliberately ignore that. Quick recap:

  1. I came home from the city at 3pm and my internet at home didn't work.
  2. I checked multiple devices; phones worked when off the wifi, so I figured I needed to restart the router.
  3. I logged in to the router and it responded totally normally, but my local network didn't. (It's always DNS, I know.)
  4. I checked the router log and saw hundreds of login attempts over the past couple of days.
  5. I panicked and pulled the plug, then tried to get into my server by hooking up an old monitor. That worked, and I saw many errors about DNS.
  6. My wife googled on her phone: it seems I had HTTPS login from outside enabled and someone found the correct port. It's disabled now.
  7. Obviously the local network was still down. I replugged everything and SSHed into the server, which runs Pi-hole as DNS.
  8. Pi-hole wouldn't start DNS, whatever I did.
  9. I used history and found I had "chmod 700"ed the dnsmasq directory instead of putting it in a docker volume…
  10. I checked pihole.log: nothing.
  11. I checked the FTL log, and there was the issue.
  12. I returned it to 777 and everything is hunky-dory again.
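
The permission mix-up in steps 9 to 12 can be illustrated with a small check. This is only a sketch: it assumes the DNS service runs as a user that is neither the owner nor in the group of the directory, and the helper name `others_can_enter` is made up for illustration. A directory needs both the read and the execute bit for a user class before that class can list it and open files inside it.

```python
import stat


def others_can_enter(mode: int) -> bool:
    """Return True if users outside owner and group may list and traverse
    a directory that has this permission mode."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


if __name__ == "__main__":
    # Compare the mode from step 9, a common default, and the mode from step 12.
    for mode in (0o700, 0o755, 0o777):
        print(oct(mode), others_can_enter(mode))
```

Under these assumptions 755 would probably have been enough to get the service reading its config again, so 777 is likely more permissive than needed.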

Now I feel very stupid, but I found a very dangerous mistake because my LAN failed due to a less dangerous one, so I’ll take this as a win.

Thanks for reading and have a good day! I hope this helps someone some day.

    • Possibly linux@lemmy.zip · 47 up, 1 down · 8 months ago

      Because you don’t have a way to know what’s been compromised. Take only your data, and make sure to verify nothing has been tampered with.

      Trust me, it will be better in the long run.

      • haui@lemmy.giftedmc.com (OP) · 3 up, 26 down · 8 months ago

        Yeah, I don’t feel like setting up a whole cloud infrastructure on a hunch. I’m running like 15 different services and they are all compartmentalized. It would take weeks to reset all this. So far nobody got anywhere, from what I can see.

        • khorak@lemmy.dbzer0.com · 13 up · 8 months ago

          One word of advice. Document the steps you take to deploy things. If your hardware fails or you make a simple mistake, it will cost you weeks of work to recover. This is a bit extreme, but I take my time when setting things up and automate as much as possible using Ansible. You don’t have to do this, but the ability to just scrap things and redeploy gives great peace of mind.

          And right now you are reluctant to do this because it’s gonna cost you too much time. This should not be the case. I mean, just imagine things going wrong in a year or two and you can’t remember most things you know now. Document your setup and write a few scripts. It’s a good start.

          • haui@lemmy.giftedmc.com (OP) · 4 up, 1 down · 8 months ago

            I get your point. Ansible is quite interesting too. I do document most of the things I do, but I have to admit I have been slacking a bit recently. There is just so much stuff that needs doing and a lot of interesting projects to learn about that sometimes stuff gets forgotten.

            My personal impression of the Linux space is still that folks get dumped on by the community for not being immersed in the nitty-gritty, though.

            That’s neither fun nor going to get more people interested in Linux. People make mistakes; learn to help without judging.

            Have a good one.

            • khorak@lemmy.dbzer0.com · 2 up · 8 months ago

              I know what you mean. Most people mean well, some are a bit too aggressive, but probably also mean well. I honestly sometimes roll my eyes when I start reading about Tailscale, Cloudflare tunnels etc. The main thing is not to expose anything you don’t absolutely need to expose.

              For access from the outside, the most you should need is a random high port forwarded for SSH into a dedicated host (can be a VM / container if you don’t have a spare Raspberry Pi), plus WireGuard on a host that updates the server package regularly. So probably not on your router, unless the vendor is on top of things.

              Regarding ansible and documenting, I totally get your point. Ten years ago I was an absolute Linux noob and my flatmate had to set up an IRC bouncer on my RPi. It ran like that for a few years and I dared not touch anything. Then the SD card died and took down the bouncer, dynDNS and a few other things running on it.

              It takes me a lot of time to write and test my Ansible playbooks and custom roles, but every now and then I have to move services between hosts, and then this is an absolute life saver. Whenever I’m really low on time and need to get something up and running, I write things down in a readme in my infra repository, and occasionally I go through my backlog when I have nothing better to do.

              • haui@lemmy.giftedmc.com (OP) · 1 up · 8 months ago

                Thanks for elaborating! This is very helpful and I appreciate it. Will definitely check out Ansible.

                I think I’m probably on my way there anyway, as I’m setting up my own git forge and starting to use proper versioning.

                Then I’ll probably try out Ansible on some VM or new device. Have a good one!

          • haui@lemmy.giftedmc.com (OP) · 4 up, 7 down · 8 months ago

            Wow, a lot of people would set up a new server because of intrusion attempts in a log, I guess. If I did that in a job I’d get fired for doing nothing else but resetting everything every week.

            As an admin, you have to keep the CTO from using “master” or “admin” as the SSH password on a production server. Just so you know what level of stupidity makes the big bucks out there.

            • prettybunnys@sh.itjust.works · 3 up · 8 months ago

              As an admin I’d question why the CTO has a login on a production server.

              You would do well to listen more when you ask for advice.

              • lando55@lemmy.world · 1 up · 8 months ago

                For Chiefly reasons, of course. Now whether or not that server is active in the cluster is another matter entirely, but hey, if it makes him/her feel important /shrug

        • teawrecks@sopuli.xyz · 1 up, 2 down · 8 months ago

          You’re saying you see a bunch of login attempts on your router, but you don’t think they actually got into it?

    • Lem453@lemmy.ca · 9 up · 8 months ago

      If you have everything on docker compose, migrating to another host is pretty easy. I could probably migrate my 11 stacks of 36 containers in 2 to 3 hrs.
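
A rough sketch of what that redeploy step could look like once the compose files and data have been copied to the new host. The layout is an assumption, not something from this thread: one subdirectory per stack under a folder called `stacks`, each holding a `docker-compose.yml`, and the function name `compose_up_commands` is made up. The script only builds and prints the commands, so nothing is started by accident.

```python
from pathlib import Path


def compose_up_commands(stacks_root: Path) -> list[list[str]]:
    """Build one `docker compose up -d` command per stack directory.

    Assumes every stack lives in its own subdirectory of stacks_root and
    keeps its definition in a file named docker-compose.yml.
    """
    commands = []
    for compose_file in sorted(stacks_root.glob("*/docker-compose.yml")):
        commands.append(["docker", "compose", "-f", str(compose_file), "up", "-d"])
    return commands


if __name__ == "__main__":
    # Print the commands for review; hand each list to
    # subprocess.run(cmd, check=True) to actually start the stack.
    for cmd in compose_up_commands(Path("stacks")):
        print(" ".join(cmd))
```

Most of the real time would likely go into moving volumes and pulling images, which this sketch does not cover.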

      • dutchkimble · 4 up · 8 months ago

        Why would it take 2 to 3 hrs? Download time of container images?

        • lando55@lemmy.world · 3 up · 8 months ago

          Figure ~45 minutes to run to the liquor store for a decent single malt, another ~25 minutes for the pizza rolls, quick power nap, wake up and redeploy. That’s about 2 hours.

          • Lem453@lemmy.ca · 4 up · 8 months ago

            Pretty much this. Lots of padding in those numbers, or waiting for some manual things to install, etc.

      • haui@lemmy.giftedmc.com (OP) · 2 up, 1 down · 8 months ago

        If everything works well, I could probably do that too. But I’ve had too many obscure little things happen that 10x the amount of time needed, so I always plan for the worst case.

        Also, my point was that people are massively overreacting to the fact that my logs showed signs of attacks, not intrusion.

        I run many servers, and I am much slower and more careful with the commercial ones. Every public-facing service has attacks in its logs and I deal with them. I know what experience you guys have but it’s not hosting public services.

        The arrogance with which people suggest someone is incompetent is baffling. Not talking about you, but quite a number of comments were condescending af.

        Thanks for the advice with ansible. I might actually give this a go.
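
On the distinction between attack attempts and an actual intrusion drawn above, here is a minimal sketch of how one might separate the two in a log. The log format is entirely hypothetical (lines containing `login failed from <ip>` or `login succeeded from <ip>`); a real router log will look different and the pattern would need adapting. The function name `summarize_logins` is made up as well.

```python
import re
from collections import Counter

# Hypothetical line format, for example:
# "2024-01-05 12:00:01 login failed from 203.0.113.7"
LOGIN_LINE = re.compile(r"login (?P<result>failed|succeeded) from (?P<ip>\S+)")


def summarize_logins(lines):
    """Count failed and successful logins per source address."""
    failed, succeeded = Counter(), Counter()
    for line in lines:
        match = LOGIN_LINE.search(line)
        if match is None:
            continue  # not a login line in the assumed format
        bucket = failed if match["result"] == "failed" else succeeded
        bucket[match["ip"]] += 1
    return failed, succeeded


if __name__ == "__main__":
    sample = [
        "2024-01-05 12:00:01 login failed from 203.0.113.7",
        "2024-01-05 12:00:03 login failed from 203.0.113.7",
        "2024-01-05 18:30:00 login succeeded from 192.168.1.10",
    ]
    failed, succeeded = summarize_logins(sample)
    print("failed:", dict(failed))
    print("succeeded:", dict(succeeded))
```

Any successful login from an address you do not recognize would be the thing to look at first. The summary only reflects what the log actually recorded.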

    • surewhynotlem@lemmy.world · 5 up · 8 months ago

      Why would I start fresh?

      If they had access to a machine, the first thing they would do is install some kind of rootkit so they can get access again later. This could be as small as modifying an existing binary to do things it isn’t supposed to do.

      If they didn’t access any machine, you’re fine.
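
One way to look for the "modified existing binary" case is to compare file hashes against a baseline. This is a sketch under a big assumption: that a baseline of SHA-256 hashes was recorded while the system was known to be clean and was stored somewhere else. The function names and the baseline structure (relative path mapped to hex digest) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or whose hash differs from the baseline."""
    changed = []
    for rel_path, expected in sorted(baseline.items()):
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel_path)
    return changed
```

A check like this is most trustworthy when run from a clean environment, since a rootkit can interfere with tools running on the affected machine.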

      • haui@lemmy.giftedmc.com (OP) · 1 up · 8 months ago

        Thanks for elaborating. I appreciate it. To my knowledge, nobody had access to the network or the machine.