We have been running an HP ML350 G6 with ESXi 5.5 for some time now. Yesterday the RAID 5 array that holds all the VM disks stopped working. I opened the configuration screen of the RAID controller, a P410i, and looked for an answer, only to find this:

https://drive.google.com/open?id=0BwpBULNj2r4BamRZZHVfZ0RfZWs

When the server boots, it shows the following message:

https://drive.google.com/open?id=0BwpBULNj2r4BeVg5SS1hR013SW8

 

The array appears to be fine to me, but the controller doesn't recognize it. The message scares us to the bone, and we are avoiding pressing F2 for now. From my searching so far, nothing makes clear whether this is recoverable, even though all the drives in the array appear to be OK on the controller screen. Any ideas? In all our brilliance, we have no backup of this array, by the way...

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/

I haven't worked with HP servers for a while, but I've had issues that went away after I turned off the server, removed the RAID card, and reseated it.

 

Try it. It could work for you. :) 


Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9895916

3 minutes ago, Abdul201588 said:

I haven't worked with HP servers for a while, but I've had issues that went away after I turned off the server, removed the RAID card, and reseated it.

 

Try it. It could work for you. :) 

Any risk of losing our data with this move?

 

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9895935

21 minutes ago, Renegate said:

Any risk of losing our data with this move?

 

It shouldn't be, as long as you do a safe shutdown and don't mess anything up when you put it back together. I would do a bit more research just to be 100% sure if this is an important server.

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9896009

There's not much you can do other than press F2. I've seen that message quite a lot and have never seen it damage the array or any data. Usually it's caused by a disk temporarily disappearing, which makes the RAID card panic.

 

If all drives are present and healthy you won't have any problems.

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9896061

4 minutes ago, leadeater said:

There's not much you can do other than press F2. I've seen that message quite a lot and have never seen it damage the array or any data. Usually it's caused by a disk temporarily disappearing, which makes the RAID card panic.

 

If all drives are present and healthy you won't have any problems.

Some folks recommended taking the drives, plugging them into a Linux machine, and making a dd image of each disk as a recovery point, so that we could reconstruct the array on the drives in case the card makes changes that lose the data.
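
For anyone following along, the per-disk imaging idea can be sketched roughly like this in Python. This is only an illustration of what a dd-style copy does, not a replacement for dd itself: the function name `image_disk` and the paths in the usage note are made up, and it assumes the source can be opened read-only like a regular file.

```python
import hashlib

def image_disk(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source_path to image_path chunk by chunk.

    Returns (bytes_copied, sha256_hex) so the image can be verified later.
    Hypothetical helper for illustration only.
    """
    digest = hashlib.sha256()
    copied = 0
    # The source is opened read-only so the original disk is never written to.
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of the source reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

It would be called once per drive, for example `image_disk("/dev/sdb", "/mnt/backup/disk1.img")` (both paths are placeholders). Note that this sketch simply stops at the first read error, so a physically failing drive would need a tool designed to skip and retry bad areas.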

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9896088

5 minutes ago, Renegate said:

Some folks recommended taking the drives, plugging them into a Linux machine, and making a dd image of each disk as a recovery point, so that we could reconstruct the array on the drives in case the card makes changes that lose the data.

That takes a huge amount of time and doesn't really protect you that much. If something does go wrong, which is unlikely, you'd have to copy the raw data back to the correct disks, recreate the array without initializing it, and hope that it works. More likely, if there is a problem, you'll end up using a recovery tool to pull the data off the disks. Write down the RAID stripe settings now, by the way, as you'll need them for that.
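
To show why the stripe settings matter to a recovery tool, here is a rough Python sketch of how a logical block maps to a physical disk in a RAID 5 set. It assumes a left-symmetric parity layout, which is one common arrangement; the layout the P410i actually uses is not stated in this thread, and the function name `locate_block` is made up for the example.

```python
def locate_block(logical_block, num_disks, stripe_blocks):
    """Map a logical block number to (disk_index, block_on_disk).

    Assumes a left-symmetric RAID 5 layout (an assumption, not the
    controller's documented behaviour). stripe_blocks is the stripe
    unit size expressed in blocks.
    """
    data_disks = num_disks - 1                 # one chunk per row holds parity
    chunk = logical_block // stripe_blocks     # which data chunk overall
    offset = logical_block % stripe_blocks     # position inside that chunk
    stripe = chunk // data_disks               # which row across the disks
    index_in_stripe = chunk % data_disks       # which data chunk within the row
    # Parity starts on the last disk and moves one disk to the left per row.
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    # Data chunks start right after the parity disk and wrap around.
    disk = (parity_disk + 1 + index_in_stripe) % num_disks
    return disk, stripe * stripe_blocks + offset
```

With four disks, `locate_block(130, 4, 128)` lands on disk 1 while `locate_block(130, 4, 64)` lands on disk 2, so a tool given the wrong stripe size or disk order would read the right blocks from the wrong places.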

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-9896105

  • 3 weeks later...
On 5/27/2017 at 6:23 PM, leadeater said:

That takes a huge amount of time and doesn't really protect you that much. If something does go wrong, which is unlikely, you'd have to copy the raw data back to the correct disks, recreate the array without initializing it, and hope that it works. More likely, if there is a problem, you'll end up using a recovery tool to pull the data off the disks. Write down the RAID stripe settings now, by the way, as you'll need them for that.

Just a follow-up on the incident.

 

Two drives in the array had died, and the health check on the HP was showing them as healthy! We had the array shipped to a recovery company, which repaired the motor on one of the disks so that we could access the array again. As of now I officially hate RAID on HP. Even though the array was recovered, all the data on it was partially or totally corrupted. We managed to scrape together bits and pieces of our data, but it's a total mess.

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10008324

3 hours ago, Renegate said:

Just a follow-up on the incident.

 

Two drives in the array had died, and the health check on the HP was showing them as healthy! We had the array shipped to a recovery company, which repaired the motor on one of the disks so that we could access the array again. As of now I officially hate RAID on HP. Even though the array was recovered, all the data on it was partially or totally corrupted. We managed to scrape together bits and pieces of our data, but it's a total mess.

Yeah, this is one of the general flaws with RAID, which is why people are migrating file data storage away from traditional RAID to ZFS and the like. Failure to detect drive faults can happen on LSI controllers too.

 

Because you are running ESXi and using local storage, you don't really have much choice in the matter for RAID. To prevent that type of corruption, you would need to transition to external NFS datastores hosted on ZFS.

 

Edit:

The data corruption most likely happened when the array tried to rebuild itself: the second disk had gone faulty but was not detected as such, so the rebuild wrote incorrect data across the array based on bad parity calculations.
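
As a rough illustration of that failure mode, here is a small Python sketch of how RAID 5 parity reconstruction works and how it goes wrong. It is a simplified model with made-up block contents and helper names (`xor_blocks`, `rebuild_missing`), not what the P410i firmware actually runs: parity is the XOR of the data blocks in a stripe, and a missing block is rebuilt by XOR-ing everything that survives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks):
    """Rebuild the one missing block of a stripe from the survivors (data + parity)."""
    return xor_blocks(surviving_blocks)

# Made-up stripe contents for three data disks plus parity.
d0, d1, d2 = b"VM-DATA0", b"VM-DATA1", b"VM-DATA2"
parity = xor_blocks([d0, d1, d2])

# Disk 0 lost, the others healthy: the rebuild recovers d0 exactly.
good = rebuild_missing([d1, d2, parity])

# Disk 0 lost and disk 1 silently returning bad data: the rebuild
# produces garbage and would write it to the replacement disk.
bad_d1 = b"VM-DAT\x00\x00"
bad = rebuild_missing([bad_d1, d2, parity])
```

Here `good` equals `d0`, while `bad` does not. Plain XOR parity cannot tell which input was wrong, so a drive that is reported healthy but returns bad data poisons whatever is rebuilt from it.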

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10009059

11 hours ago, leadeater said:

Out of interest, were those 300GB SAS disks 15K RPM ones?

Should I be worried about the 2x 300GB 15K SAS drives I have in my ESXi server? ;) 


Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10012392

2 minutes ago, leadeater said:

Unfortunately, kinda yes. 15K SAS disks are known to be particularly unreliable and to have a short service life.

Well I have 4 of them (2x 146GB and 2x 300GB) - I actually don't remember whether I got them from work or somewhere else, but so far they've been reliable enough.

 

I'm using them in RAID1 so I'm less worried about failure compared to say RAID5.


Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10012411

1 minute ago, dalekphalm said:

Well I have 4 of them (2x 146GB and 2x 300GB) - I actually don't remember whether I got them from work or somewhere else, but so far they've been reliable enough.

 

I'm using them in RAID1 so I'm less worried about failure compared to say RAID5.

Yeah, build-quality-wise they are no worse than 10K SAS; it's just that the extra 5K RPM really does add much more strain, so they fail more easily and sooner. Good drives will still last for ages; there's just less tolerance for not-so-great ones.

Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10012426

43 minutes ago, leadeater said:

Yeah, build-quality-wise they are no worse than 10K SAS; it's just that the extra 5K RPM really does add much more strain, so they fail more easily and sooner. Good drives will still last for ages; there's just less tolerance for not-so-great ones.

Fair enough - for the 300GB drives, one is Dell branded and the other, I think, is Seagate branded. The 146GB drives are both Dell branded. I'm really only using them because they're there.


Link to comment
https://linustechtips.com/topic/785202-hp-raid-stopped-working/#findComment-10012574