So I’m trying to get Unraid set up for the first time. I’m still running the free trial, and assuming I can get this working I do plan on purchasing it, but I’m starting to get frustrated and could use some help.

I previously had a Drobo, but that company went under, and I decided to switch to an Unraid box because, at least as far as I can tell, it’s the only other NAS solution that will let me upgrade my array over time with mixed drive capacities.

Initially everything was fine. I popped in all my empty drives, set up the array with 1x20TB drive as parity and 1x20TB plus 5x14TB drives as data disks, and started to move stuff over from the backups I had from the previous Drobo NAS, using the Unassigned Devices plugin and a USB 3 hard drive dock.

Well, what has happened twice now is that a device will randomly go disabled with no indication of what’s wrong. The blank drives are only about a year old and have shown absolutely no signs of failing before this. They are also still passing the SMART tests with no errors. The first time this happened, it was the brand new 20 terabyte drive in the parity slot. I was able to resolve this by unassigning that drive, starting the array, and then stopping it to reassign the drive. That’s what a forum post I was able to find suggested, and it seemed to work, but it started a whole new parity sync that was estimated to take a whole week. The thing is, I don’t have a week to waste, so I went ahead and started moving files back onto the system again, but now the same thing has happened to disk 3, one of the 14 terabyte drives.
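As a rough sanity check on that one-week estimate, here is a small sketch that converts a drive size and an estimated duration into the average throughput it implies. It assumes decimal terabytes (1 TB = 10^12 bytes) and a constant rate, the function names are made up for illustration, and the 150 MB/s figure is only an assumed comparison point, not a measured one.

```python
SECONDS_PER_DAY = 86_400


def implied_throughput_mb_s(drive_tb: float, days: float) -> float:
    """Average MB/s needed to pass over a drive of `drive_tb` TB in `days` days."""
    total_bytes = drive_tb * 1e12  # decimal terabytes
    return total_bytes / (days * SECONDS_PER_DAY) / 1e6


def sync_days(drive_tb: float, mb_per_s: float) -> float:
    """Days needed to pass over a drive of `drive_tb` TB at a steady `mb_per_s`."""
    total_bytes = drive_tb * 1e12
    return total_bytes / (mb_per_s * 1e6) / SECONDS_PER_DAY


if __name__ == "__main__":
    # 20 TB parity drive, one-week estimate from the post
    print(f"{implied_throughput_mb_s(20, 7):.1f} MB/s")  # 33.1 MB/s
    # Same drive at an assumed steady 150 MB/s
    print(f"{sync_days(20, 150):.2f} days")              # 1.54 days
```

A week for 20 TB works out to roughly 33 MB/s on average, which is low for a sequential pass over a modern drive. That may point at something slowing the sync down, such as copying data in at the same time or a flaky link.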

I’m at my wits’ end. The first time it happened I couldn’t figure it out, so I just wiped the array, reinstalled Unraid, and tried again. Are there any common pitfalls that anyone could recommend checking? Or is anyone just willing to help?

My hardware: Ryzen 7 5700G; 64GB of 3200 ECC DDR4 (2x16GB is currently installed, and another 2x16GB that shipped late just arrived yesterday and still needs to be installed); 8 NAS bays in the front of the case.

Blank drives: 2x20TB, 5x14TB, 1x8TB

Drives with data on them: 2x20TB, 1x10TB, 1x8TB (totaling around 40 TB of data across them)

Once the data is moved off the drives that have data on them, I do intend to add them to the array. My NAS case has eight bays and two internal SSDs that are separate from the NAS bays: one SATA SSD set up as a cache, the other an NVMe M.2 set up as a space for appdata.

As of last night before I went to bed, I had about 3 terabytes of data moved onto the array, but during the overnight copy something happened to disk 3 that caused the device to be marked as offline. I couldn’t find any error messages explaining why.
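When nothing obvious shows up in the web UI, one thing that might help is filtering the system log for lines that mention disk or link trouble and counting them per device. The sketch below is only a starting point: the keyword patterns are assumptions about what such messages tend to look like (they vary by kernel and controller), and the `/var/log/syslog` path in the usage note is assumed to be where the log lives on the box.

```python
import re
from collections import Counter

# Keyword patterns that often appear when a disk or its link misbehaves.
# These are illustrative assumptions, not an exhaustive or authoritative list.
SUSPECT_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"hard resetting link", re.IGNORECASE),
    re.compile(r"(read|write) error", re.IGNORECASE),
]

# Device-looking tokens such as sdd, disk3, or ata5.
DEVICE_RE = re.compile(r"\b(sd[a-z]+|disk\d+|ata\d+)\b")


def suspect_lines(lines):
    """Return the log lines that match any of the suspect patterns."""
    return [ln.rstrip("\n") for ln in lines
            if any(p.search(ln) for p in SUSPECT_PATTERNS)]


def count_by_device(lines):
    """Count suspect lines per device-looking token mentioned in them."""
    counts = Counter()
    for ln in suspect_lines(lines):
        for dev in set(DEVICE_RE.findall(ln)):
            counts[dev] += 1
    return counts


def scan_file(path):
    """Convenience wrapper: run count_by_device over a log file on disk."""
    with open(path, errors="replace") as fh:
        return count_by_device(fh)
```

Calling `scan_file("/var/log/syslog").most_common()` would list the devices with the most suspect lines first. If one device or one group of ports stands out, that is a hint about which cable or connector to look at.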

The parity drive was already invalid because I was copying data in while the parity sync was happening, and now I can’t get the array to start at all.

I tried something that a forum post recommended: start the array in maintenance mode with the disk unassigned, then stop the array and restart it with disk 3 reassigned to the correct drive. But it refuses to do so. It tells me that there are too many wrong or missing disks.
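One way to reason about that message is a toy rule in which each parity slot can absorb one problem disk, and a parity drive that has not finished syncing counts as a problem itself because it cannot stand in for a missing data disk. To be clear, this is a simplified mental model and not Unraid's actual check; the function and parameter names are invented for illustration.

```python
def array_can_start(parity_slots: int, parity_valid: bool, missing_data_disks: int) -> bool:
    """Toy rule: the array starts only if problems do not exceed parity slots.

    An assigned-but-invalid parity set is counted as a problem, since it
    cannot be used to emulate a missing data disk.
    """
    invalid_parity = 0 if parity_valid else parity_slots
    problems = missing_data_disks + invalid_parity
    return problems <= parity_slots


if __name__ == "__main__":
    # Valid single parity, one data disk dropped: the disk can be emulated.
    print(array_can_start(1, True, 1))   # True
    # Parity sync unfinished AND one data disk dropped.
    print(array_can_start(1, False, 1))  # False
```

Under this model, one parity slot with an unfinished sync plus a dropped disk 3 adds up to two problems against one slot, which is consistent with the array refusing to start.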

The weirdest part is that I know disk 3 still has the data on it, because if I unassign disk 3 and then mount that drive using the Unassigned Devices plugin, I can see all the data is still there and browsable.

I’m starting to feel real dumb cuz I don’t know what I’m doing wrong. I feel like there’s got to be something simple that I’m doing wrong and I just can’t figure out what it is.

  • dmtalon@infosec.pub
    10 months ago

    Has all your current hardware been stress tested? Are the SATA cables known good?

    I’ve never had unRAID actually disable a disk before and I’ve got 11+yo drives still plugging along. I’ve gone through a few hardware swaps over the years but ultimately any issues I’ve had were cable related so far. (Generally I’ve seen CRC errors in the past with cable issues)

    I guess except for the Ryzen power supply adjustment in the BIOS (it was causing reboots and wasn’t cable related). But everything else so far has been SATA cables for me.

    Sounds like you’re in a hurry, but if you haven’t yet, run a long memory test and check, reseat, or replace the SATA cables, etc. Check the memory voltages, and make sure the BIOS is powering the memory and setting its speed correctly. All the hardware basics.

    Good luck.

    • focusforte@lemmy.worldOP
      10 months ago

      I have seen a few messages about CRC errors. Since then I swapped out one of the SAS to 4x SATA cables, based on some advice I got when I posted this on a different platform, and so far it seems to be working reliably. So I think that might have been my problem.

      To keep working in the short term (because I don’t have a week to wait for a parity sync…), I have removed the parity drive from the array so that I can move data on without having to wait for the parity sync to finish. My intention is to copy all of the data onto the array, then add the parity drive back in and let it do a parity sync, and once that sync is done, wipe those 20 terabyte drives (the ones that currently have data on them) and add them to the array.

      Does this seem like a worthwhile plan?

      • dmtalon@infosec.pub
        10 months ago

        Hrm…

        I guess if you’ve got a safe backup that might work, but once the parity rebuild starts you’ll wanna let it sit and do that. Any array usage will be quite slow I believe.

        I’ve never run without a parity and rebuilt it after.

        • focusforte@lemmy.worldOP
          10 months ago

          Yeah, I was running with parity, but something happened (I think it was the SATA cable issue) and the parity drive had to be reset. The parity sync was going, but I continued to onboard data to the array, and I think this is where one of my problems happened. Array usage wasn’t slow, I was able to onboard data just fine, but that was when I had that overnight issue where one of the array drives just dropped out.

          So my solution, so that I didn’t have to start completely over again, was to remove the parity drive, get my array drive back in, and then start bringing data back on. If I add the parity drive, then I can’t bring my data on until the sync is over, which had an estimated time of about a week. I don’t have a week.

          But yeah, I’m not moving anything off the drives that have data on them, I’m copying things onto the array. So worst case, if something happens to the array, all of the original data is still on the separate drives. Then I’ll put the parity drive in and let it build parity for a week, and after the parity is built I will wipe the drives the data is on right now and add them to the array.
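          Since the plan ends with wiping the source drives, it may be worth confirming the copies first. Here is a minimal sketch that walks a source folder and compares SHA-256 hashes against the matching files in a destination folder. The function names and the example paths in the usage note are placeholders, not anything Unraid provides.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large media files are not loaded into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(src_root, dst_root):
    """Return (relative_path, reason) for source files missing or different at the destination."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    problems = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_sha256(src) != file_sha256(dst):
            problems.append((rel.as_posix(), "different"))
    return problems
```

          Something like `find_mismatches("/mnt/disks/old_drive", "/mnt/user/media")` (both paths hypothetical) returning an empty list would mean every source file has a byte-identical copy on the array. Hashing around 40 TB takes a long time, so running it one source drive at a time is probably more practical.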

  • pedroparamo@lemm.ee
    10 months ago

    I’ll add my two cents. I had a similar issue, and it was because I had all of my drives connected to one power connector using those daisy-chain extender cables. So each time I did a parity check, all kinds of errors would pop up and disable the drives.

    • focusforte@lemmy.worldOP
      10 months ago

      The case I’m using has a backplane where I connect the SATA data and the Molex power. It’s got one Molex power connector per 4 drives. Currently both of those are on a single cable running off the power supply. I’ll swap that so each has a dedicated cable.

  • Nogami@lemmy.world
    10 months ago

    Sounds like a hardware problem. If you’re going with motherboard SATA connectors, don’t.

    Look into a professional SAS adapter and use that instead. You’ll be super happy you did. Very cheap on eBay.

    • focusforte@lemmy.worldOP
      10 months ago

      I do have an HBA. It is an LSI 9300-16i, and I’ve got a fan zip-tied to it so that the heat sink has proper cooling. I’ve got SAS to SATA cables running to the backplane of the NAS bays; each one converts a SAS port to 4x SATA plugs.

  • atmur@lemmy.world
    10 months ago

    I had a similar issue, it ended up being a hardware problem (my SAS expander specifically). Make sure everything connecting your drives is working correctly.

  • acosmichippo@lemmy.world
    10 months ago

    I decided to switch to an unraid box because at least as far as I can tell, it’s the only other NAS solution that will let me upgrade my array over time with mixed drive capacities.

    Other NASes like Synology will do it. It just has to redistribute the data, which can take some time and a lot of HDD cycles.

    https://kb.synology.com/en-us/DSM/tutorial/how_to_expand_storage

    I agree with the other posts, you most likely have a hardware issue that is causing the disks to disconnect.

    • focusforte@lemmy.worldOP
      10 months ago

      Are there any software methods I can use to try to narrow that down? I don’t exactly have a ton of spare hardware to just swap in to test it 😅
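      One software-only check that might help narrow it down is the drive's interface CRC error counter, which tends to climb when a cable, backplane, or connector is flaky rather than the disk itself. The sketch below pulls SMART attribute 199 (UDMA_CRC_Error_Count on most SATA drives) out of `smartctl -A` output. It assumes smartmontools is installed and that the drive reports the usual ten-column attribute table; the helper names are made up.

```python
import subprocess

CRC_ATTRIBUTE_ID = "199"  # UDMA_CRC_Error_Count on most SATA drives


def crc_error_count(smartctl_output: str):
    """Return the raw value of attribute 199 from `smartctl -A` text, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[0] == CRC_ATTRIBUTE_ID:
            return int(fields[9])
    return None


def read_crc_error_count(device: str):
    """Run `smartctl -A <device>` and parse the CRC counter (needs root and smartmontools)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return crc_error_count(result.stdout)
```

      The raw value is cumulative, so a non-zero number on its own is not alarming. What matters is whether it keeps increasing after a cable swap. Noting the value for each drive, copying data for a while, and checking again shows which bays or cables are still producing errors.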

        • focusforte@lemmy.worldOP
          10 months ago

          I’ve done that a couple of times, but I’ll definitely do that again next time I pop the thing open and put in the two additional RAM sticks that finally arrived.