
Current Ubuntu. 

24-drive server
2 RAID-Z2 vdevs

 


        115G resilvered, 0.73% done, 1 days 19:31:12 to go
config:

        NAME                        STATE     READ WRITE CKSUM
        Leyline                     DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            sda                     DEGRADED     0     0     4  too many errors
            sdb                     DEGRADED     0     0     4  too many errors
            sdc                     DEGRADED     0     0     4  too many errors
            sdd                     DEGRADED     0     0     4  too many errors
            sde                     DEGRADED     0     0     4  too many errors
            sdf                     DEGRADED     0     0     4  too many errors
            sdg                     DEGRADED     0     0     4  too many errors
            sdh                     DEGRADED     0     0     4  too many errors
            replacing-8             DEGRADED     0     0     4
              5779891482370295872   UNAVAIL      0     0     0  was /dev/sdi1/old
              12188090009805835136  UNAVAIL      0     0     0  was /dev/sdi1/old
              sdi                   ONLINE       0     0     0  (resilvering)
            sdj                     DEGRADED     0     0     4  too many errors
            replacing-10            DEGRADED     0     0     4
              11848766891228109773  UNAVAIL      0     0     0  was /dev/sdk1/old
              sdk                   ONLINE       0     0     0  (resilvering)
            sdl                     DEGRADED     0     0     4  too many errors
          raidz2-1                  DEGRADED     0     0     0
            sdm                     DEGRADED     0     0   647  too many errors
            sdn                     DEGRADED     0     0   647  too many errors
            scsi-35000cca2526063e4  DEGRADED     0     0   647  too many errors
            sdp                     DEGRADED   222     0   428  too many errors
            sdq                     DEGRADED     0     0   647  too many errors
            replacing-5             DEGRADED     0     0   649
              3361907636893199834   OFFLINE      0     0     0  was /dev/sdr1/old
              sdr                   ONLINE       0     0     0  (resilvering)
            sds                     DEGRADED     0     0   647  too many errors
            sdt                     DEGRADED     0     0   647  too many errors
            sdv                     DEGRADED     0     0   647  too many errors
            17338618180947265615    OFFLINE      0     0     0  was /dev/sdu1  (awaiting resilver)
            sdw                     DEGRADED     0     0   647  too many errors
            sdx                     DEGRADED     0     0   647  too many errors
          sdu                       ONLINE       0     0     0
        logs
          nvme1n1                   ONLINE       0     0     0
          nvme2n1                   ONLINE       0     0     0



I know I have lost data, but I am trying to replace sdu1 and I think I F-d up and added the NEW drive as a separate vdev or something. Now I can't offline it, as that just offlines the old sdu.

What do I do?

Link to comment
https://linustechtips.com/topic/1515553-zpool-drive-replacement-issues/
Yeah, it looks like sdu is at the top level of the pool. And you can't remove it, so it's just going to have to stay there.

Do you have a bad disk controller or something?

This looks like a case of fixing the drive controller to clear up the errors on all the drives, then replacing the minimum number of drives needed to get the pool mounted, copying all the data off, and reformatting.
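
To make the top-level placement of sdu easier to spot, here is a minimal Python sketch that reads the config section of zpool status output and lists any top-level vdev that is a plain disk rather than a raidz or mirror group. It assumes the usual layout where each nesting level is indented by two spaces; the function names and the trimmed sample (cut down from the output posted above) are illustrative only.

```python
# Names of grouping vdevs as they appear in `zpool status` (e.g. raidz2-0, mirror-1).
GROUP_PREFIXES = ("raidz", "mirror", "draid", "replacing", "spare")


def top_level_vdevs(status_text, pool):
    """Return (name, is_group) for each top-level data vdev listed under `pool`."""
    pool_indent = None
    result = []
    for line in status_text.expandtabs(8).splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        name = stripped.split()[0]
        if pool_indent is None:
            if name == pool:
                pool_indent = indent  # the pool's own row anchors the indentation
            continue
        if indent <= pool_indent:
            break  # reached "logs", "cache", "spares", or the end of the config
        if indent == pool_indent + 2:  # assumption: two spaces per nesting level
            result.append((name, name.startswith(GROUP_PREFIXES)))
    return result


def single_disk_vdevs(status_text, pool):
    """Top-level vdevs with no redundancy of their own."""
    return [name for name, is_group in top_level_vdevs(status_text, pool) if not is_group]


SAMPLE = """\
        NAME                        STATE     READ WRITE CKSUM
        Leyline                     DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            sda                     DEGRADED     0     0     4  too many errors
            sdb                     DEGRADED     0     0     4  too many errors
          raidz2-1                  DEGRADED     0     0     0
            sdm                     DEGRADED     0     0   647  too many errors
            17338618180947265615    OFFLINE      0     0     0  was /dev/sdu1  (awaiting resilver)
          sdu                       ONLINE       0     0     0
        logs
          nvme1n1                   ONLINE       0     0     0
"""

print(single_disk_vdevs(SAMPLE, "Leyline"))  # ['sdu']
```

Anything this reports sits next to the raidz2 vdevs as its own stripe member, which is why losing that one disk would take the pool with it.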


23 minutes ago, Electronics Wizardy said:

Yeah, it looks like sdu is at the top level of the pool. And you can't remove it, so it's just going to have to stay there.

Do you have a bad disk controller or something?

This looks like a case of fixing the drive controller to clear up the errors on all the drives, then replacing the minimum number of drives needed to get the pool mounted, copying all the data off, and reformatting.

No, just old drives beginning to fail.
F***K, seriously. So if that single vdev fails, the whole pool crashes?

I am resilvering, but damn, I had almost 80 TB. This sucks so much.

 


11 minutes ago, Bmoney said:

No, just old drives beginning to fail.
F***K, seriously. So if that single vdev fails, the whole pool crashes?

I am resilvering, but damn, I had almost 80 TB. This sucks so much.

 

Yup, that single drive would take the whole pool down. Probably time to reformat.

It seems very unlikely to me that this is the drives alone. I'd run zpool clear to reset those stats and see which ones really have issues.
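
As a rough way to see which devices stand out, before or after resetting the counters, the READ/WRITE/CKSUM columns can be pulled out of the status text. This is only a sketch: the function names are made up for illustration, the sample rows are copied from the raidz2-1 section posted above, and rows whose counters are abbreviated (zpool status can print large values with a suffix such as K) are skipped rather than parsed.

```python
def error_counts(status_text):
    """Map device name -> (read, write, cksum) for rows with plain numeric counters."""
    counts = {}
    for line in status_text.splitlines():
        fields = line.split()
        # A device row looks like: NAME STATE READ WRITE CKSUM [notes...]
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            counts[fields[0]] = tuple(int(f) for f in fields[2:5])
    return counts


def with_io_errors(counts):
    """Devices reporting read or write errors, not just checksum errors."""
    return sorted(name for name, (rd, wr, _ck) in counts.items() if rd or wr)


SAMPLE = """\
        NAME                        STATE     READ WRITE CKSUM
          raidz2-1                  DEGRADED     0     0     0
            sdm                     DEGRADED     0     0   647  too many errors
            sdn                     DEGRADED     0     0   647  too many errors
            scsi-35000cca2526063e4  DEGRADED     0     0   647  too many errors
            sdp                     DEGRADED   222     0   428  too many errors
            sdq                     DEGRADED     0     0   647  too many errors
"""

counts = error_counts(SAMPLE)
print(with_io_errors(counts))  # ['sdp']
```

In the posted output, sdp is the only device with READ errors, while the checksum counts are the same on nearly every other disk in each vdev, which fits the suspicion that something shared is involved rather than a dozen independent drive failures.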


3 minutes ago, Electronics Wizardy said:

Yup, that single drive would take the whole pool down. Probably time to reformat.

It seems very unlikely to me that this is the drives alone. I'd run zpool clear to reset those stats and see which ones really have issues.

Where can I offload 80 TB?
At the very least, 30 TB is not replaceable.

Can I rent a file server? Because the internet is slow here.

 


2 hours ago, Bmoney said:

At the very least, 30 TB is not replaceable.

Then how do you not have backups already?

 

Either rent a Backblaze Fireball (or an AWS Snowball), or use the money you would spend on that to get some more drives and move the irreplaceable data over. With that kind of money you should be able to get about 40 TB worth of drives. This should allow you to set up a new ZFS pool, to which you can copy your irreplaceable data. Then, once that's safe, remove the single-drive vdev from the pool, figure out which drives are faulty, and get them out of your pool. Now you should be left with enough space to juggle your data and rebuild your original pool. Then use the new drives to implement proper backups of your irreplaceable data. I'd recommend using Borg Backup.


Also, set up weekly scrubs and an automated way of alerting you to failures.

I've found it advantageous in the past to add a hot spare to the zpool. That's how I'm running currently: weekly scrubs, which will immediately trigger the hot spare to be used for a resilver at the first I/O errors on the drives.
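
For the alerting part, one simple approach is a small script run from cron that checks zpool status -x, which only reports pools that have problems and otherwise prints "all pools are healthy". The sketch below is one possible shape, not a finished tool: the function names are hypothetical, the weekly schedule and the way the report gets delivered (here it is just printed, so cron can mail it) are assumptions, and OpenZFS's own event daemon (zed) is another route to notifications.

```python
import shutil
import subprocess

HEALTHY_MESSAGE = "all pools are healthy"


def needs_alert(status_x_output):
    """True when `zpool status -x` output reports anything other than healthy."""
    return status_x_output.strip() != HEALTHY_MESSAGE


def scrub_command(pool):
    """argv for starting a scrub; meant to be scheduled weekly, e.g. from cron."""
    return ["zpool", "scrub", pool]


def check_pools():
    """Run `zpool status -x`; return its output if an alert is warranted, else None."""
    if shutil.which("zpool") is None:
        return None  # zpool is not installed on this machine; nothing to check
    result = subprocess.run(
        ["zpool", "status", "-x"], capture_output=True, text=True
    )
    output = result.stdout + result.stderr
    return output if needs_alert(output) else None


if __name__ == "__main__":
    report = check_pools()
    if report is not None:
        # Hook point: replace with email, webhook, or whatever alerting you use.
        print(report)
```

Run daily, this stays silent while everything is healthy and produces output as soon as a pool degrades, so a disk dropping out gets noticed long before a second or third one follows.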
