VMware ESXi 6.0 host hard shutting down every few days

Solved by PlenumRated.

Hey everyone!

I've got a cluster of HP DL380 G7s and HP DL380 G8s that run ESXi. The G7s have 6.0 and 6.0 (planning to upgrade). The G8s have 7.0. Both are running the HPE versions of ESXi. The homelab servers are licensed with enterprise keys.

For some reason my 6.0 host keeps hard shutting down every few days, and I haven't been able to pinpoint the cause. I've ensured that the VMs are either activated or on evaluation keys. I have glanced over the logs but can't find anything significant. Has anyone else had this issue?

The hardware should be working. When the HP shuts down, it has red LEDs on both PSUs. Before you say the PSUs are bad, I had the same issue with 6.0 on a G6 HPE server. It can't be a hardware fault since these machines were pulled from a working enterprise. I am also using iLO on the machines.

Any ideas?

I am able to re-pull logs if needed.

Thank you!

"Just another day at the office" Captain Price -Call of Duty

I would start by checking the iLO logs. If it's a hardware issue, you should be able to pinpoint it in the IPMI logs. If there is nothing there, then we would need to check the VMware logs, specifically /var/run/log/vmksummary
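For anyone who wants to script that check, here is a rough sketch of one way to scan a copy of the vmksummary log for boots that were not preceded by a clean reboot or power-off. It assumes lines look like "<UTC timestamp>Z <tag>: <message>" with "heartbeat" and "bootstop" tags; that layout, the sample lines and the function name are assumptions for illustration, so compare against your own log before trusting the result.

```python
import re
from datetime import datetime

# Assumed line shape: "<ISO-8601 UTC timestamp>Z <tag>: <message>".
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z\s+(\w+):\s+(.*)$"
)


def find_unclean_boots(lines):
    """Return (last_seen, booted_at) pairs for boots with no clean stop before them."""
    unclean = []
    last_ts = None       # timestamp of the most recent parsed line
    clean_stop = False   # True once a reboot/power-off message has been seen
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not fit the assumed layout
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        tag, message = match.group(2), match.group(3)
        if tag == "bootstop":
            if "has booted" in message:
                # A boot with earlier activity but no clean stop suggests power loss or a crash.
                if last_ts is not None and not clean_stop:
                    unclean.append((last_ts, ts))
                clean_stop = False
            elif "rebooting" in message or "powering off" in message:
                clean_stop = True
        last_ts = ts
    return unclean


# Hypothetical sample lines, not taken from the poster's host.
SAMPLE_LINES = [
    "2021-05-20T10:00:01Z heartbeat: up 1d0h0m0s, 3 VMs",
    "2021-05-20T11:00:01Z heartbeat: up 1d1h0m0s, 3 VMs",
    "2021-05-20T13:12:40Z bootstop: Host has booted",
    "2021-05-20T14:00:01Z heartbeat: up 0d0h47m21s, 3 VMs",
    "2021-05-21T09:00:00Z bootstop: Host is rebooting",
    "2021-05-21T09:05:00Z bootstop: Host has booted",
]

for last_seen, booted_at in find_unclean_boots(SAMPLE_LINES):
    print(f"Possible hard shutdown between {last_seen} and {booted_at}")
```

The gap between the last heartbeat and the next boot gives a window you can line up against the iLO event log timestamps.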

20 hours ago, BloodKnight7 said:

I would start by checking the iLO logs. If it's a hardware issue, you should be able to pinpoint it in the IPMI logs. If there is nothing there, then we would need to check the VMware logs, specifically /var/run/log/vmksummary

Thank you! I actually read somewhere about the iLO causing issues. I wonder what fault could have hit the iLO. I'll check and get back to you.

15 hours ago, PlenumRated said:

Thank you! I actually read somewhere about the iLO causing issues. I wonder what fault could have hit the iLO. I'll check and get back to you.

Update the iLO to the latest available version on all hosts; that should fix most bugs.

I've also seen a certain version of the iLO driver for ESXi cause crashes. Updating to the latest version available for ESXi 6.0 should fix that (or upgrading to ESXi 6.5/6.7 using the HPE customised image for the G6/G7).

PC Specs - AMD Ryzen 7 5800X3D MSI B550M Mortar - 32GB Corsair Vengeance RGB DDR4-3600 @ CL16 - ASRock RX7800XT 660p 1TBGB & Crucial P5 1TB Fractal Define Mini C CM V750v2 - Windows 11 Pro

 

On 5/28/2021 at 10:04 AM, NelizMastr said:

Update the iLO to the latest available version on all hosts; that should fix most bugs.

I've also seen a certain version of the iLO driver for ESXi cause crashes. Updating to the latest version available for ESXi 6.0 should fix that (or upgrading to ESXi 6.5/6.7 using the HPE customised image for the G6/G7).

Hello!

iLO was already up to date. I was going to try to update to 6.5 or 6.7 U3.

However, a question: can I still use the vSphere software instead of the web UI? I'd prefer to use the vSphere software.

[Attached image]

4 hours ago, PlenumRated said:

Hello!

iLO was already up to date. I was going to try to update to 6.5 or 6.7 U3.

However, a question: can I still use the vSphere software instead of the web UI? I'd prefer to use the vSphere software.

[Attached image]

The old C# client cannot be used with 6.7 or 6.5 beyond U1. Since version 6.5 U2 the web UI isn't using Flash anymore but full HTML5.

Features specific to 6.0 and beyond are also not available when not using the web UI, so this is a good time to switch.

Keep in mind you should do the upgrade to 6.5/6.7 with the specific HPE custom image from VMware. Don't use the generic update tooling (i.e. via SSH), as you'll miss out on some HPE-specific drivers and fixes.

6 hours ago, NelizMastr said:

The old C# client cannot be used with 6.7 or 6.5 beyond U1. Since version 6.5 U2 the web UI isn't using Flash anymore but full HTML5.

Features specific to 6.0 and beyond are also not available when not using the web UI, so this is a good time to switch.

Keep in mind you should do the upgrade to 6.5/6.7 with the specific HPE custom image from VMware. Don't use the generic update tooling (i.e. via SSH), as you'll miss out on some HPE-specific drivers and fixes.

Great information. I already had the HPE images downloaded; I'll just have to burn them to a USB.

Thank you! I have cleared the iLO log since the last hard shutdown. If it shuts down again before I can update, I can take a look.

  • 2 weeks later...

Hello everyone, the server shut down again.

This is the iLO log (hopefully you can see it):

[Attached image: iLO log]

I think I'll upgrade my second host to 6.7 U3 HPE and see what happens. Does anyone have any thoughts on the random shutdowns? I've seen them on multiple machines from HPE. I'm using the HPE ESXi version.

Well... that is interesting. I haven't touched an HP server in about 9 years, but from what I can see, those brown-out events mean that your server lost power temporarily. This could mean something is wrong with the motherboard or the PSUs, or even the UPS system you have the server connected to could be at fault.

Try having just one of the power supplies plugged in; if it fails, try the other one. Also... why is your onboard clock not set to the proper date and time? Maybe your battery is dead?

1 minute ago, BloodKnight7 said:

Well... that is interesting. I haven't touched an HP server in about 9 years, but from what I can see, those brown-out events mean that your server lost power temporarily. This could mean something is wrong with the motherboard or the PSUs, or even the UPS system you have the server connected to could be at fault.

Try having just one of the power supplies plugged in; if it fails, try the other one. Also... why is your onboard clock not set to the proper date and time? Maybe your battery is dead?

Great info. I've started to think it could be my UPS. I actually swapped one PSU over to direct line voltage. We'll see if that fixes it. If one PSU goes out, then we know the UPS is bad. I'll get into the iLO and see if I can get the onboard clock set properly. It's possible the battery is dead.

3 hours ago, PlenumRated said:

Great info. I've started to think it could be my UPS. I actually swapped one PSU over to direct line voltage. We'll see if that fixes it. If one PSU goes out, then we know the UPS is bad. I'll get into the iLO and see if I can get the onboard clock set properly. It's possible the battery is dead.

Well... best of luck with the testing. If you don't find the culprit in the PSUs, the PDUs or the UPS, then I'm afraid there could be an issue with the motherboard, which would be normal given the server's age; electronic components start to degrade.

  • 2 weeks later...
On 6/18/2021 at 12:08 AM, BloodKnight7 said:

Well... best of luck with the testing. If you don't find the culprit in the PSUs, the PDUs or the UPS, then I'm afraid there could be an issue with the motherboard, which would be normal given the server's age; electronic components start to degrade.

Thank you! I think the issue was related to my UPS, although none of my other network gear or hardware had any issues. I have an R710 with Hyper-V 2019 on it and saw no issues there.

I'm going to spin up my other HP with the HPE VMware 6.7 image and give that machine a shot.

Thank you all!

16 hours ago, PlenumRated said:

Thank you! I think the issue was related to my UPS, although none of my other network gear or hardware had any issues. I have an R710 with Hyper-V 2019 on it and saw no issues there.

I'm going to spin up my other HP with the HPE VMware 6.7 image and give that machine a shot.

Thank you all!

No issues, glad we could be of service! 🙂 Maybe this older server needs more amps/watts and that was overloading the UPS. Anyway, good luck with your future endeavors.
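If you want to sanity-check that overload theory, a quick back-of-the-envelope calculation is to add up the measured draw of everything on the UPS and compare it with the UPS's watt rating. The sketch below does that arithmetic. Every figure in it is a made-up placeholder (nobody in this thread posted real numbers), and the 80% ceiling is only a common rule of thumb, not a vendor specification.

```python
def ups_headroom_watts(ups_capacity_w, loads_w, max_utilisation_pct=80):
    """Return (total_load, usable_budget, headroom) in watts.

    max_utilisation_pct is an assumed safety ceiling, not a manufacturer figure.
    """
    total = sum(loads_w.values())
    budget = ups_capacity_w * max_utilisation_pct / 100
    return total, budget, budget - total


# Hypothetical figures for illustration only; measure your own gear
# (for example from the iLO power meter readings) before drawing conclusions.
EXAMPLE_UPS_CAPACITY_W = 900
EXAMPLE_LOADS_W = {
    "DL380 G7": 300,
    "DL380 G8": 280,
    "R710": 250,
    "network gear": 60,
}

total, budget, headroom = ups_headroom_watts(EXAMPLE_UPS_CAPACITY_W, EXAMPLE_LOADS_W)
print(f"Total load: {total} W, usable budget: {budget:.0f} W, headroom: {headroom:.0f} W")
if headroom < 0:
    print("Load exceeds the assumed safe budget; the UPS may sag or drop under peaks.")
```

A negative headroom does not prove the UPS caused the brown-outs, but it is a cheap first check before swapping hardware.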
