Help! Getting my 8TB drive working with H200

Jelly-Monster · June 25, 2020

Hi everyone,

I recently upgraded the PERC 6 RAID controller in my Dell R710 to an H200, which I've flashed to IT mode. My current drive setup is:

(2x) 1TB hard drives in RAID1

(1x) 2TB hard drive

(1x) 8TB hard drive which I purchased from eBay at the same time as the H200

I can boot into Proxmox fine and see all 4 disks. However, when I click "Initialize Disk with GPT" on the 8TB, I get "command '/sbin/sgdisk /dev/sdb -U R' failed: exit code 2".

Also, when I try to use fdisk I get "fdisk: cannot open /dev/sdb: Input/output error". I've also booted an Ubuntu live image and get the same error with fdisk.

The drive was advertised as "new", but I guess it could be faulty.

Does anyone have any ideas?

WereCatf · June 25, 2020

2 minutes ago, Jelly-Monster said:

Does anyone have any ideas?

Have you tried the disk without the RAID-controller?

Jelly-Monster · June 25, 2020

2 minutes ago, WereCatf said:

Have you tried the disk without the RAID-controller?

Unfortunately I haven't got another way to test it. I could order a SATA-to-SAS adapter and try it in my gaming PC.

Master Disaster · June 25, 2020

24 minutes ago, Jelly-Monster said:

Unfortunately I haven't got another way to test it. I could order a SATA-to-SAS adapter and try it in my gaming PC.

Try unplugging the 8TB entirely then swapping the 2TB over to the port the 8TB was in and see if the 2TB stops working.

Jelly-Monster · June 25, 2020

41 minutes ago, Master Disaster said:

Try unplugging the 8TB entirely then swapping the 2TB over to the port the 8TB was in and see if the 2TB stops working.

I've swapped both, the 8TB is now /dev/sdc and still the same input/output error.

Master Disaster · June 25, 2020

26 minutes ago, Jelly-Monster said:

I've swapped both, the 8TB is now /dev/sdc and still the same input/output error.

Then it's more than likely a faulty drive. However, it would still be good to test it outside of the RAID controller just to be 100% sure.

Jelly-Monster · June 25, 2020

2 minutes ago, Master Disaster said:

Then it's more than likely a faulty drive. However, it would still be good to test it outside of the RAID controller just to be 100% sure.

Yeah, I'm thinking the same. I've got a SATA-to-SAS adapter coming next week, so I guess we'll see what happens.

Aragorn- · June 25, 2020

Check your kernel logs (dmesg) and look at /proc/partitions to see if the machine is actually recognising the drive properly and whether it's producing any errors.

We have multiple H200/H310s at work (in actual Dell servers), and I have one at home, reflashed to stock LSI firmware, in my home media server. All of them happily talk to 8TB drives. It's very unlikely to be a controller issue.

Oddly enough, I purchased an 8TB SAS drive for my home setup last year from eBay, which was pretty much DOA. The drive spun up and was detected by Linux, but you couldn't do anything with it and it just spewed out errors. I was going to return it to the eBay seller, then realised it was still under warranty with WD, so I sent it back to them and they replaced it, no probs.
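For anyone who wants to do the /proc/partitions check programmatically, here is a minimal Python sketch. It assumes the usual four-column layout (major, minor, #blocks, name) with #blocks counted in 1 KiB units. The sample text and the function name are made up for illustration and are not output from the server in this thread.

```python
def parse_partitions(text):
    """Return {device_name: size_in_bytes} from /proc/partitions text."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        # The header row is skipped because its first field is not a number.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        blocks_kib = int(fields[2])  # #blocks is counted in 1 KiB units
        sizes[fields[3]] = blocks_kib * 1024
    return sizes


# Illustrative sample only (hypothetical device sizes).
SAMPLE = """\
major minor  #blocks  name

   8        0  976762584 sda
   8        1  976761543 sda1
   8       32 7814026584 sdc
"""

if __name__ == "__main__":
    for name, size in parse_partitions(SAMPLE).items():
        print(f"{name}: {size / 1e12:.2f} TB")
```

On a real machine you would pass in open("/proc/partitions").read() instead of SAMPLE and compare the reported size with what the drive label says.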

Jelly-Monster · June 25, 2020

17 minutes ago, Aragorn- said:

Check your kernel logs (dmesg) and look at /proc/partitions to see if the machine is actually recognising the drive properly and whether it's producing any errors.

We have multiple H200/H310s at work (in actual Dell servers), and I have one at home, reflashed to stock LSI firmware, in my home media server. All of them happily talk to 8TB drives. It's very unlikely to be a controller issue.

Oddly enough, I purchased an 8TB SAS drive for my home setup last year from eBay, which was pretty much DOA. The drive spun up and was detected by Linux, but you couldn't do anything with it and it just spewed out errors. I was going to return it to the eBay seller, then realised it was still under warranty with WD, so I sent it back to them and they replaced it, no probs.

Good shout. I'm no good at reading the logs, though. It does seem to be spewing out some errors:

[14624.392973] mpt2sas_cm0: log_info(0x3112043b): originator(PL), code(0x12), sub_code(0x043b)
[14624.393010] sd 0:0:2:0: [sdc] Unaligned partial completion (resid=113928, sector_sz=512)
[14624.393016] sd 0:0:2:0: [sdc] tag#3096 CDB: Read(32)
[14624.393021] sd 0:0:2:0: [sdc] tag#3096 CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[14624.393025] sd 0:0:2:0: [sdc] tag#3096 CDB[10]: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00
[14624.393033] sd 0:0:2:0: [sdc] tag#3096 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_SENSE
[14624.393038] sd 0:0:2:0: [sdc] tag#3096 Sense Key : Illegal Request [current]
[14624.393043] sd 0:0:2:0: [sdc] tag#3096 Add. Sense: Logical block guard check failed
[14624.393051] sd 0:0:2:0: [sdc] tag#3096 CDB: Read(32)
[14624.393058] sd 0:0:2:0: [sdc] tag#3096 CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[14624.393062] sd 0:0:2:0: [sdc] tag#3096 CDB[10]: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00
[14624.393066] blk_update_request: protection error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 17 prio class 0

No idea what this means though?
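One way to read the log is to decode the CDB bytes it prints. Opcode 0x7f is a variable-length CDB and service action 0x0009 is READ(32), which, as far as I know, is the read command the Linux sd driver uses for drives formatted with Type 2 protection. The sketch below is a hedged illustration: the field offsets follow my understanding of the SBC READ(32) layout, and the function and key names are my own.

```python
def decode_read32(cdb_hex):
    """Decode the main fields of a READ(32) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 32 or cdb[0] != 0x7F:
        raise ValueError("not a 32-byte variable-length CDB")
    service_action = int.from_bytes(cdb[8:10], "big")
    if service_action != 0x0009:
        raise ValueError("service action is not READ(32)")
    return {
        "additional_cdb_length": cdb[7],
        # RDPROTECT lives in the top three bits of byte 10.
        "rdprotect": cdb[10] >> 5,
        "lba": int.from_bytes(cdb[12:20], "big"),
        "expected_ref_tag": int.from_bytes(cdb[20:24], "big"),
        "transfer_length": int.from_bytes(cdb[28:32], "big"),
    }


# The two CDB lines from the dmesg output above, joined together.
CDB = ("7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00 "
       "00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00")

if __name__ == "__main__":
    for field, value in decode_read32(CDB).items():
        print(f"{field}: {value}")
```

Decoded this way, the failing command is a read of 256 blocks starting at LBA 0 with protection checking requested, which lines up with the "Logical block guard check failed" sense data and the "protection error" on sector 0.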

Master Disaster · June 25, 2020

5 minutes ago, Jelly-Monster said:

Good shout. I'm no good at reading the logs, though. It does seem to be spewing out some errors:

[14624.392973] mpt2sas_cm0: log_info(0x3112043b): originator(PL), code(0x12), sub_code(0x043b)
[14624.393010] sd 0:0:2:0: [sdc] Unaligned partial completion (resid=113928, sector_sz=512)
[14624.393016] sd 0:0:2:0: [sdc] tag#3096 CDB: Read(32)
[14624.393021] sd 0:0:2:0: [sdc] tag#3096 CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[14624.393025] sd 0:0:2:0: [sdc] tag#3096 CDB[10]: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00
[14624.393033] sd 0:0:2:0: [sdc] tag#3096 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_SENSE
[14624.393038] sd 0:0:2:0: [sdc] tag#3096 Sense Key : Illegal Request [current]
[14624.393043] sd 0:0:2:0: [sdc] tag#3096 Add. Sense: Logical block guard check failed
[14624.393051] sd 0:0:2:0: [sdc] tag#3096 CDB: Read(32)
[14624.393058] sd 0:0:2:0: [sdc] tag#3096 CDB[00]: 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00
[14624.393062] sd 0:0:2:0: [sdc] tag#3096 CDB[10]: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00
[14624.393066] blk_update_request: protection error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 17 prio class 0

No idea what this means though?

Quote


The sector size of the block layer is 512 bytes, but integrity interval
size might be different (in case of 4K block size of the media). At the
initiator side the virtual start sector is the one that was originally
submitted by the block layer (512 bytes) for the Reftag usage. The
initiator converts the Reftag to integrity interval units and sends it to
the target. So the target virtual start sector should be calculated at
integrity interval units. prepare_fn() and complete_fn() don't remap
correctly the Reftag when using incorrect units of the virtual start
sector, which leads to the following protection error at the device:

"blk_update_request: protection error, dev sdb, sector 2048 op 0x0:(READ)
flags 0x10000 phys_seg 1 prio class 0"

To fix that, set the seed in integrity interval units.

Might help? I've got zero experience with SAS, so apologies if it's irrelevant.
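To make the quoted description a bit more concrete: the reference tag seed has to be counted in integrity-interval units, not in the block layer's 512-byte sectors. The Python sketch below only illustrates that unit conversion. The function name is hypothetical, it is not the kernel code, and it may well be unrelated to the actual fault in this thread.

```python
SECTOR_SHIFT = 9  # the block layer counts in 512-byte sectors (2**9)


def ref_tag_seed(sector_512, interval_bytes):
    """Convert a 512-byte sector number to integrity-interval units."""
    shift = interval_bytes.bit_length() - 1  # log2 for powers of two
    if interval_bytes < 512 or (1 << shift) != interval_bytes:
        raise ValueError("interval must be a power of two >= 512")
    # The reference tag is a 32-bit field, so keep only the low 32 bits.
    return (sector_512 >> (shift - SECTOR_SHIFT)) & 0xFFFFFFFF


if __name__ == "__main__":
    # Sector 2048 is the one named in the quoted error message.
    print(ref_tag_seed(2048, 512))
    print(ref_tag_seed(2048, 4096))
```

With a 512-byte interval the seed equals the sector number, while with a 4096-byte interval it is eight times smaller. That is the mismatch the quoted text describes.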

Aragorn- · June 25, 2020

Google some of the errors and see what you find...

This post for instance:

https://serverfault.com/questions/971722/dmesg-full-of-i-o-errors-smart-ok-four-disks-affected

Suggests a similar error caused by a bad cable.

I would look at the logs from boot time and make sure the controller sees the drive with the correct capacity. I would also check /proc/partitions to confirm the capacity is correct. You say you've flashed it, so there shouldn't be any issues, but I'm pretty sure older LSI 2008 firmware had issues with drives over 2TB, and from memory the IT flashing process is a multi-stage affair that involves flashing older versions and then newer ones.

I believe the mpt2sas driver prints the version number when it initialises the card:

root@anduin:~# dmesg | grep mpt2sas |grep FW
[    2.944049] mpt2sas_cm0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)

From memory, 20 is the newest; I'm running 19. But part of the reflashing process requires you to flash version 7 or something along the way.

Similarly, when it detects the drive you'll get output like this:

root@anduin:~# dmesg | grep "sd 0:0:0:0"
[    3.588604] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    3.596850] sd 0:0:0:0: [sda] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    3.624694] sd 0:0:0:0: [sda] Write Protect is off
[    3.659857] sd 0:0:0:0: [sda] Mode Sense: f7 00 10 08
[    3.678043] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    3.755552] sd 0:0:0:0: [sda] Attached SCSI disk

You can remove the 0:0:0:0 bit in the command to show all the drives.

If you like, do a fresh boot and then post the whole boot log and we can look through it.
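If you would rather script the two checks above than eyeball them, here is a small Python sketch that pulls the firmware version and the per-disk capacity out of dmesg text. The regular expressions are assumptions based only on the two log lines shown in this post, so other kernel versions may word things differently.

```python
import re

# Patterns modelled on the log lines quoted in this post (assumed format).
FW_RE = re.compile(r"FWVersion\(([\d.]+)\)")
BLOCKS_RE = re.compile(r"\[(sd[a-z]+)\] (\d+) (\d+)-byte logical blocks")


def firmware_version(log):
    """Return the first FWVersion string found in the log, or None."""
    match = FW_RE.search(log)
    return match.group(1) if match else None


def disk_capacities(log):
    """Return {device: capacity_in_bytes} from sd 'logical blocks' lines."""
    return {m.group(1): int(m.group(2)) * int(m.group(3))
            for m in BLOCKS_RE.finditer(log)}


LOG = """\
[    2.944049] mpt2sas_cm0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)
[    3.596850] sd 0:0:0:0: [sda] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
"""

if __name__ == "__main__":
    print("firmware:", firmware_version(LOG))
    for dev, size in disk_capacities(LOG).items():
        print(f"{dev}: {size / 1e12:.2f} TB / {size / 2**40:.2f} TiB")
```

The capacity is simply block count times block size, so an 8TB drive should come out at roughly 8.00 TB if the controller and kernel see it correctly.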

Jelly-Monster · June 26, 2020

12 hours ago, Aragorn- said:

This post for instance:

https://serverfault.com/questions/971722/dmesg-full-of-i-o-errors-smart-ok-four-disks-affected

Suggests a similar error caused by a bad cable.

I did think about the cables. The Dell-branded cables are so expensive that I got some from AliExpress instead. But I've moved the drive into a known working slot and still get the same issue. Do you think it could still be a bad cable?

12 hours ago, Aragorn- said:
I would look at the logs from boot time and make sure the controller sees the drive with the correct capacity. I would also check /proc/partitions to confirm the capacity is correct. You say you've flashed it, so there shouldn't be any issues, but I'm pretty sure older LSI 2008 firmware had issues with drives over 2TB, and from memory the IT flashing process is a multi-stage affair that involves flashing older versions and then newer ones.

I believe the mpt2sas driver prints the version number when it initialises the card:
root@anduin:~# dmesg | grep mpt2sas |grep FW
[    2.944049] mpt2sas_cm0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)
From memory, 20 is the newest; I'm running 19. But part of the reflashing process requires you to flash version 7 or something along the way.

I'm running firmware version 20.00.07.00-IT, and the SAS topology shows the 8TB drive. I've attached screen captures that might help.

12 hours ago, Aragorn- said:
Similarly, when it detects the drive you'll get output like this:
root@anduin:~# dmesg | grep "sd 0:0:0:0"
[    3.588604] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    3.596850] sd 0:0:0:0: [sda] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
[    3.624694] sd 0:0:0:0: [sda] Write Protect is off
[    3.659857] sd 0:0:0:0: [sda] Mode Sense: f7 00 10 08
[    3.678043] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    3.755552] sd 0:0:0:0: [sda] Attached SCSI disk
You can remove the 0:0:0:0 bit in the command to show all the drives.

If you like, do a fresh boot and then post the whole boot log and we can look through it.

I've attached the logs from a fresh boot, along with the output filtered with grep for "sd". Hopefully it makes more sense to you.

I'll do a bit more Googling of the errors in the meantime.

Attachments: DMESG, DMESG GREP SD

Jelly-Monster · June 26, 2020

I've potentially found the solution! Whilst playing around with smartctl, I noticed the drive was formatted with type 2 protection. Not really knowing what this is, I did a bit of Googling and found this:

http://talesinit.blogspot.com/2015/11/formatted-with-type-2-protection-huh.html

I'm currently in the process of formatting (it's at 1.26%), which feels like progress.
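For anyone landing here with the same symptoms, a quick way to spot this is to look for the protection line in the smartctl -i output. Below is a minimal Python sketch that does this. The sample text is illustrative rather than copied from my drive, and the function name is made up. The exact format command isn't shown in this thread. A common way to drop protection information is sg_format from sg3_utils with --format --fmtpinfo=0, which erases the drive, so check that against the linked post before running anything.

```python
import re

# smartctl prints a line like this for SCSI/SAS drives with T10 protection.
PROT_RE = re.compile(r"Formatted with type (\d) protection")


def protection_type(smartctl_info):
    """Return the protection type (1-3) reported by smartctl, or 0 if none."""
    match = PROT_RE.search(smartctl_info)
    return int(match.group(1)) if match else 0


# Illustrative sample only, trimmed to the relevant lines.
SAMPLE_INFO = """\
Logical block size:   512 bytes
Formatted with type 2 protection
Rotation Rate:        7200 rpm
"""

if __name__ == "__main__":
    ptype = protection_type(SAMPLE_INFO)
    if ptype:
        print(f"Drive is formatted with type {ptype} protection")
    else:
        print("No protection information reported")
```

To use it for real, feed it the text from smartctl -i /dev/sdc, for example via subprocess.run with capture_output=True.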

Aragorn- · June 26, 2020

Obscure for sure. I guess these drives have come out of an enterprise SAN-type system. I ran into a similar issue about 15 years ago with some recovered Fibre Channel disks, but those actually reported 520-byte sectors to the OS, and the disk tools clearly reported the sector size as the issue. Clearly things have evolved, but the errors have become more obscure.

Fingers crossed that sorts it for you

Jelly-Monster · June 27, 2020

21 hours ago, Aragorn- said:

Obscure for sure. I guess these drives have come out of an enterprise SAN-type system. I ran into a similar issue about 15 years ago with some recovered Fibre Channel disks, but those actually reported 520-byte sectors to the OS, and the disk tools clearly reported the sector size as the issue. Clearly things have evolved, but the errors have become more obscure.

Fingers crossed that sorts it for you

Yeah, that's what I'm thinking. I bought the drive from eBay, and although it was advertised as new, it didn't come in sealed packaging.

The format took about 30 hours, and thankfully worked.

Thanks everyone for the help

Nick7 · July 1, 2020

On 6/26/2020 at 8:54 PM, Aragorn- said:

Obscure for sure. I guess these drives have come out of an enterprise SAN-type system. I ran into a similar issue about 15 years ago with some recovered Fibre Channel disks, but those actually reported 520-byte sectors to the OS, and the disk tools clearly reported the sector size as the issue. Clearly things have evolved, but the errors have become more obscure.

Fingers crossed that sorts it for you

Beware of enterprise disks that come from storage systems.

Some time ago I had a chance to play with several disk enclosures whose disks had been used in an HP EVA8400 storage system.

I had them connected to an x86 server running Linux.

I got them working with ZFS, and all was good.

But there was one interesting thing: when the disks encountered read errors, they would just send a soft error warning and actually give WRONG data to the OS! I guess the EVA8400 firmware knows how to handle this properly, but the same disks connected to an x86 server behaved this way.

Luckily, I was using RAID6 and ZFS, and ZFS itself noticed the checksum errors (yay for checksums!) and corrected them.
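The idea behind that rescue can be shown with a toy Python sketch: keep a checksum separately from the data, verify every read against it, and repair bad copies from a good one. This is only an illustration of the principle, using SHA-256 and plain mirrored copies. It is not how ZFS is actually implemented, and the names are made up.

```python
import hashlib


def checksum(data):
    """Checksum stored separately from the data block itself."""
    return hashlib.sha256(data).digest()


def read_verified(copies, expected):
    """Return (good_data, repaired_count), fixing copies that fail the check."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("no copy matches the stored checksum")
    repaired = sum(1 for c in copies if c != good)
    copies[:] = [good] * len(copies)  # overwrite bad copies in place
    return good, repaired


if __name__ == "__main__":
    block = b"important data"
    stored_sum = checksum(block)
    # The first "disk" silently returns wrong data, as described above.
    mirror = [b"importent data", block]
    data, fixed = read_verified(mirror, stored_sum)
    print(data, "repaired copies:", fixed)
```

A disk that hands back wrong data without a hard error is invisible to a plain RAID read, which is why the separately stored checksum matters here.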
