Windows Server random shutdowns, nothing in event viewer....

I'm running a Windows Server VM on Proxmox and it randomly shuts down. The most recent shutdown was the first one I was actually there to witness, but it's happened a total of 6 times:
[screenshot]

(this is filtered to just show the unexpected shutdowns)

 

When I go back in the log and look around the time the shutdown occurs there's nothing useful.
[screenshot]

The only things in the log before it happens are a bunch of 'service x entered running state' entries and one warning: 'No valid response received from manually configured peer time.windows.com'.

This is basically a fresh install of Server 2022; the only things running were some HFS servers, a Discord bot, and a Minecraft server, all of which have run before without issue on the old install of Server 2019.

why no dark mode?
Current:

Watercooled Eluktronics THICC-17 (Clevo X170SM-G):
CPU: i9-10900k @ 4.9GHz all core
GPU: RTX 2080 Super (Max P 200W)
RAM: 32GB (4x8GB) @ 3200MTs

Storage: 512GB HP EX NVMe SSD, 2TB Silicon Power NVMe SSD
Displays: Asus ROG XG-17 1080p@240Hz (G-Sync), IPS 1080p@240Hz (G-Sync), Gigabyte M32U 4k@144Hz (G-Sync), External Laptop panel (LTN173HT02) 1080p@120Hz

Asus ROG Flow Z13 (GZ301ZE) W/ Increased Power Limit:
CPU: i9-12900H @ Up to 5.0GHz all core
- dGPU: RTX 3050 Ti 4GB

- eGPU: RTX 3080 (mobile) XGm 16GB
RAM: 16GB (8x2GB) @ 5200MTs

Storage: 1TB NVMe SSD, 1TB MicroSD
Display: 1200p@120Hz

Asus Zenbook Duo (UX481FLY):

CPU: i7-10510U @ Up to 4.3 GHz all core
- GPU: MX 250
RAM: 16GB (8x2GB) @ 2133MTs

Storage: 128GB SATA M.2 (NVMe no worky)
Display: Main 1080p@60Hz + ScreenPad Plus 1920x515@60Hz

Custom Game Server:

CPUs: Ryzen 7 7700X @ 5.1GHz all core

RAM: 128GB (4x32GB) DDR5 @ whatever it'll boot at xD (I think it's 3600MTs)

Storage: 2x 1TB WD Blue NVMe SSD in RAID 1, 4x 10TB HGST Enterprise HDD in RAID Z1


Have you seen it in the process of shutting down?

 

Does the PROXMOX log or console show any information about the VM when it shuts down?

 

Do you have any hardware devices passed through to the VM or is everything virtualized?


3 minutes ago, Windows7ge said:

Have you seen it in the process of shutting down?

 

Does the PROXMOX log or console show any information about the VM when it shuts down?

 

Do you have any hardware devices passed through to the VM or is everything virtualized?

Bingo, PROXMOX log reports this:

Jun 24 17:57:27 proxmox QEMU[3753287]: KVM: entry failed, hardware error 0x80000021
Jun 24 17:57:27 proxmox QEMU[3753287]: If you're running a guest on an Intel machine without unrestricted mode
Jun 24 17:57:27 proxmox QEMU[3753287]: support, the failure can be most likely due to the guest entering an invalid
Jun 24 17:57:27 proxmox QEMU[3753287]: state for Intel VT. For example, the guest maybe running in big real mode
Jun 24 17:57:27 proxmox QEMU[3753287]: which is not supported on less recent Intel processors.
Jun 24 17:57:27 proxmox kernel: set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
Jun 24 17:57:27 proxmox QEMU[3753287]: EAX=00003330 EBX=32c91180 ECX=00000001 EDX=00000000
Jun 24 17:57:27 proxmox QEMU[3753287]: ESI=de455040 EDI=32c9d240 EBP=00000000 ESP=c9645c00
Jun 24 17:57:27 proxmox QEMU[3753287]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Jun 24 17:57:27 proxmox QEMU[3753287]: ES =0000 00000000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: CS =bc00 7ffbc000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: SS =0000 00000000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: DS =0000 00000000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: FS =0000 00000000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: GS =0000 00000000 ffffffff 00809300
Jun 24 17:57:27 proxmox QEMU[3753287]: LDT=0000 00000000 00000000 00000000
Jun 24 17:57:27 proxmox QEMU[3753287]: TR =0040 32ca0000 00000067 00008b00
Jun 24 17:57:27 proxmox QEMU[3753287]: GDT=     32ca1fb0 00000057
Jun 24 17:57:27 proxmox QEMU[3753287]: IDT=     00000000 00000000
Jun 24 17:57:27 proxmox QEMU[3753287]: CR0=00050032 CR2=006a0f80 CR3=001ae002 CR4=00000000
Jun 24 17:57:27 proxmox QEMU[3753287]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Jun 24 17:57:27 proxmox QEMU[3753287]: DR6=00000000ffff0ff0 DR7=0000000000000400
Jun 24 17:57:27 proxmox QEMU[3753287]: EFER=0000000000000000
Jun 24 17:57:27 proxmox QEMU[3753287]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
Jun 24 17:57:28 proxmox kernel: fwbr101i0: port 2(tap101i0) entered disabled state
Jun 24 17:57:28 proxmox kernel: fwbr101i0: port 2(tap101i0) entered disabled state
Jun 24 17:57:28 proxmox kernel:  zd208: p1 p2 p3 p4
Jun 24 17:57:28 proxmox systemd[1]: 101.scope: Succeeded.
Jun 24 17:57:28 proxmox systemd[1]: 101.scope: Consumed 12h 56min 8.125s CPU time.
Jun 24 17:57:29 proxmox qmeventd[1652953]: Starting cleanup for 101
Jun 24 17:57:29 proxmox kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
Jun 24 17:57:29 proxmox kernel: vmbr0: port 3(fwpr101p0) entered disabled state
Jun 24 17:57:29 proxmox kernel: device fwln101i0 left promiscuous mode
Jun 24 17:57:29 proxmox kernel: fwbr101i0: port 1(fwln101i0) entered disabled state
Jun 24 17:57:29 proxmox kernel: device fwpr101p0 left promiscuous mode
Jun 24 17:57:29 proxmox kernel: vmbr0: port 3(fwpr101p0) entered disabled state
Jun 24 17:57:29 proxmox qmeventd[1652953]: Finished cleanup for 101
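
Since the Windows event log showed nothing, the host journal is where the crash is visible. A minimal Python sketch for pulling every `KVM: entry failed` line out of a saved journal dump is below. The function name, the record type, and the two-line `sample` are made up for illustration, and the regex assumes the same syslog-style line layout as the excerpt above.

```python
import re
from typing import NamedTuple


class KvmEntryFailure(NamedTuple):
    """One 'KVM: entry failed' record (hypothetical helper type)."""
    timestamp: str
    host: str
    qemu_pid: int
    error_code: str


# Assumes lines shaped like: "Jun 24 17:57:27 proxmox QEMU[3753287]: KVM: entry failed, ..."
_PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+QEMU\[(?P<pid>\d+)\]:\s+"
    r"KVM: entry failed, hardware error (?P<code>0x[0-9a-fA-F]+)"
)


def find_kvm_entry_failures(log_text: str) -> list[KvmEntryFailure]:
    """Return every KVM entry failure found in a block of journal text."""
    failures = []
    for line in log_text.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            failures.append(
                KvmEntryFailure(
                    match["ts"], match["host"], int(match["pid"]), match["code"]
                )
            )
    return failures


# Two lines copied from the log above, used only as sample input.
sample = (
    "Jun 24 17:57:27 proxmox QEMU[3753287]: KVM: entry failed, hardware error 0x80000021\n"
    "Jun 24 17:57:28 proxmox systemd[1]: 101.scope: Succeeded.\n"
)

for failure in find_kvm_entry_failures(sample):
    print(failure.timestamp, failure.host, failure.qemu_pid, failure.error_code)
```

Running this over the journal for each of the 6 shutdowns would show whether they all carry the same error code or whether some have a different cause.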


Edit: everything is virtualized


So the VM remained up and running for just under 13hrs which is a rather random period of time for it to give out. Rules out sleep mode (though the server distros usually have this disabled by default).
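
The "just under 13hrs" figure comes from the `Consumed 12h 56min 8.125s CPU time` line in the log. A small sketch for converting that kind of systemd time span into seconds is below; the function name is made up, and only the units that appear in spans like this one are handled. Keep in mind that systemd reports CPU time consumed by the VM's scope there, which is not necessarily the same as wall-clock uptime, so treat the result as a rough estimate.

```python
import re

# Seconds per unit, for the units handled by this sketch.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "min": 60, "s": 1, "ms": 0.001}

# "min" and "ms" are listed before "s" so the longer unit names win.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*(min|ms|d|h|s)\b")


def systemd_span_to_seconds(span: str) -> float:
    """Convert a span such as '12h 56min 8.125s' to seconds."""
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in _TOKEN.findall(span))


seconds = systemd_span_to_seconds("12h 56min 8.125s")
print(f"{seconds} s is about {seconds / 3600:.2f} h of CPU time")
```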

 

At the moment I'm reading through a PROXMOX Forum thread related to the hardware error 0x80000021 entry in your logs and it describes exactly the problem you're having.

 

Most of the conversation on the first page alone consists of talking about nested virtualization, Proxmox 7.2, kernel versions, and Skylake or similar Intel based CPUs.

  • Are you running any nested virtualization?
  • What CPU are you using?
  • What version of PROXMOX?
  • Kernel?
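
Most of those questions can be answered from the Proxmox host's shell. A rough sketch that collects the CPU model, the running kernel, and the KVM module's `nested` parameter is below; it assumes a Linux host, the function names are made up, and anything it can't read is reported as "unknown". The Proxmox version itself is not covered here; the `pveversion` command on the host prints that.

```python
import platform
from pathlib import Path


def read_nested_flag(module: str = "kvm_intel") -> str:
    """Read the KVM module's 'nested' parameter, or 'unknown' if unavailable."""
    path = Path(f"/sys/module/{module}/parameters/nested")
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"


def cpu_model_from_cpuinfo(cpuinfo_text: str) -> str:
    """Return the first 'model name' value from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return "unknown"


def collect_host_facts() -> dict[str, str]:
    """Gather the host details asked about above (hypothetical helper)."""
    try:
        cpuinfo = Path("/proc/cpuinfo").read_text()
    except OSError:
        cpuinfo = ""
    return {
        "cpu_model": cpu_model_from_cpuinfo(cpuinfo),
        "kernel": platform.release(),
        "nested_kvm_intel": read_nested_flag("kvm_intel"),
        "nested_kvm_amd": read_nested_flag("kvm_amd"),
    }


for name, value in collect_host_facts().items():
    print(f"{name}: {value}")
```

Note that the `nested` parameter only says whether the host allows nested virtualization; whether the guest actually uses it (Hyper-V, WSL2 and so on) still has to be checked inside Windows.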

You can read into it more if you think it could help.


Just now, Windows7ge said:
  • Are you running any nested virtualization?
  • What CPU are you using?
  • What version of PROXMOX?
  • Kernel?

No, made sure Hyper-V was off.

[screenshot]

 

I did change the CPU on the VM to 'kvm64' as suggested by some forum, can't really tell if it did anything until it either shuts down or doesn't....

 


Are you using Legacy BIOS or EFI? 440FX or Q35 chipset?

 

Are you able to replicate the same error with a stock image of 2022?


11 minutes ago, Windows7ge said:

Are you using Legacy BIOS or EFI? 440FX or Q35 chipset?

 

Are you able to replicate the same error with a stock image of 2022?

EFI, Q35.

No, but it hasn't happened since switching it to kvm64 (though it's only been a few hours, so I'm still waiting a bit to see what happens).
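
For anyone comparing setups: the firmware, machine type, and CPU type discussed here live in the VM's config file on the Proxmox host (`/etc/pve/qemu-server/101.conf` for VM 101) as plain `key: value` lines. A minimal sketch of reading them is below. The `example_conf` text is a made-up stand-in rather than the actual config from this VM (the `cores` and `memory` values in particular are invented), and the parser only looks at the main section before any snapshot sections.

```python
def parse_vm_config(text: str) -> dict[str, str]:
    """Parse the main section of a Proxmox 'key: value' VM config."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            break  # snapshot sections start with "[name]"; stop at the first one
        if not line or line.startswith("#"):
            continue  # skip blank lines and description comments
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical example matching the answers in this thread (EFI, Q35, kvm64).
example_conf = """\
bios: ovmf
cores: 4
cpu: kvm64
machine: q35
memory: 8192
"""

config = parse_vm_config(example_conf)
for key in ("bios", "machine", "cpu"):
    print(f"{key}: {config.get(key, '(not set, Proxmox default applies)')}")
```

The same CPU type change can also be made from the host shell with `qm set 101 --cpu kvm64` instead of the web UI.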


11 hours ago, Mnky313 said:

EFI, Q35.

No, but it hasn't happened since switching it to kvm64 (though it's only been a few hours, so I'm still waiting a bit to see what happens).

Given everything worked fine with 2019 it implies there was a change with 2022 that QEMU/KVM doesn't like but I don't know what that would be. I've been using PROXMOX for years but this is an issue I've not run into before.

 

Monitor it and report back if it does it within 24hrs.


2 hours ago, Windows7ge said:

Given everything worked fine with 2019 it implies there was a change with 2022 that QEMU/KVM doesn't like but I don't know what that would be. I've been using PROXMOX for years but this is an issue I've not run into before.

 

Monitor it and report back if it does it within 24hrs.

Nothing's happened yet. It's still running.

2019 was on bare metal so now that I know it was caused by something with proxmox that kind of doesn't matter...


2 hours ago, Mnky313 said:

Nothing's happened yet. It's still running.

2019 was on bare metal so now that I know it was caused by something with proxmox that kind of doesn't matter...

Reading deeper into that PROXMOX forum thread will probably be worthwhile if the issue persists. The two situations closely resemble each other so it might prove to be a good resource.

 

How long have you had this new configuration going? A day? A week? A month? It's the intermittent problems that are the biggest bitch to diagnose and fix. Just when you think you've solved it it crops up again.


1 hour ago, Windows7ge said:

How long have you had this new configuration going? A day? A week? A month? It's the intermittent problems that are the biggest bitch to diagnose and fix. Just when you think you've solved it it crops up again.

About a week but it seemed to get more common in the past day or 2.

 

Yeah, I'm having another intermittent issue where my laptop randomly shuts down under load *sometimes* and it's a real pain. I'm assuming it's something internally overheating (not the CPU/GPU) because it only seems to happen under load and only randomly.... sometimes it will work for days without issue and sometimes it'll happen multiple times an hour... I'm out of ideas pretty much xD.


22 minutes ago, Mnky313 said:

About a week but it seemed to get more common in the past day or 2.

 

Yeah, I'm having another intermittent issue where my laptop randomly shuts down under load *sometimes* and it's a real pain. I'm assuming it's something internally overheating (not the CPU/GPU) because it only seems to happen under load and only randomly.... sometimes it will work for days without issue and sometimes it'll happen multiple times an hour... I'm out of ideas pretty much xD.

The more frequently it happens the better. The faster we should be able to diagnose it. Give it time. See what happens. Go from there.

 

I've dealt with laptops having battery/charging issues. Does the laptop ever stop recognizing the charger? Has it shut down with the charger attached? It could be an issue of current draw and something not supplying it, resulting in an immediate shutdown.


2 minutes ago, Windows7ge said:

I've dealt with laptops having battery/charging issues. Does the laptop ever stop recognizing the charger? Has it shut down with the charger attached? It could be an issue of current draw and something not supplying it, resulting in an immediate shutdown.

It's only happened when plugged in, but I also basically only use it when plugged in.

I doubt it's a current draw issue as it has 2x 280W bricks, though I guess it's possible when the CPU and GPU pull ~400W combined under full load...
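
The headroom arithmetic behind that, as a tiny sketch. The function names are made up, the 400W is the rough combined CPU + GPU figure quoted above, and conversion losses, the rest of the system, and short transient spikes are all ignored, so this is a best-case number.

```python
def adapter_headroom_watts(brick_count: int, brick_watts: float, load_watts: float) -> float:
    """Rated adapter capacity minus the estimated load, in watts."""
    return brick_count * brick_watts - load_watts


def adapter_utilization(brick_count: int, brick_watts: float, load_watts: float) -> float:
    """Fraction of the rated adapter capacity used by the estimated load."""
    return load_watts / (brick_count * brick_watts)


headroom = adapter_headroom_watts(2, 280, 400)
utilization = adapter_utilization(2, 280, 400)
print(f"headroom: {headroom} W, utilization: {utilization:.0%}")
```

On paper that leaves plenty of margin, which is why a sustained over-draw seems unlikely; it says nothing about brief spikes or a degraded brick.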

It doesn't have any issue recognizing the chargers. If it starts happening frequently I can try power limiting.


1 hour ago, Mnky313 said:

I doubt it's a current draw issue as it has 2x 280W bricks, though I guess it's possible when the CPU and GPU pull ~400W combined under full load...

The idea here would be hardware degradation. Sometimes a blip in output power under high load can cause the system to shut down. The theory depends on how old the laptop and power bricks are though.


2 hours ago, Windows7ge said:

The idea here would be hardware degradation. Sometimes a blip in output power under high load can cause the system to shut down. The theory depends on how old the laptop and power bricks are though.

Possible. The problem is it's hard to focus specifically on power draw to try and narrow it down, and getting new bricks is $$$ (like $350+).

Also, in theory it shouldn't crash on sudden power loss, I can unplug the bricks while it's running and it doesn't have any problems (it does significantly throttle on battery though).


14 hours ago, Mnky313 said:

Possible. The problem is it's hard to focus specifically on power draw to try and narrow it down, and getting new bricks is $$$ (like $350+).

Also, in theory it shouldn't crash on sudden power loss, I can unplug the bricks while it's running and it doesn't have any problems (it does significantly throttle on battery though).

You mentioned you don't think it's due to overheating. Have you ever seen it thermal throttling? Hitting TJmax?

 

Sudden power loss or blips/instability in power delivery under different loads can manifest in many different forms. Issues like this can be traced down to VRMs on the motherboard, not just the power source itself. I can attest from personal experience over the years that unstable power delivery can cause both BSODs and sudden shutdowns.

 

The next theory I would have isn't thermal throttling but the possibility that temperature is still associated with the problem. It's not impossible that thermal expansion could be causing a short somewhere due to a design/manufacturing imperfection. Are you usually able to power the laptop back on immediately or is there a process you have to go through to restart it? A cool down period perhaps?


5 hours ago, Windows7ge said:

You mentioned you don't think it's due to overheating. Have you ever seen it thermal throttling? Hitting TJmax?

 

Sudden power loss or blips/instability in power delivery under different loads can manifest in many different forms. Issues like this can be traced down to VRMs on the motherboard, not just the power source itself. I can attest from personal experience over the years that unstable power delivery can cause both BSODs and sudden shutdowns.

 

The next theory I would have isn't thermal throttling but the possibility that temperature is still associated with the problem. It's not impossible that thermal expansion could be causing a short somewhere due to a design/manufacturing imperfection. Are you usually able to power the laptop back on immediately or is there a process you have to go through to restart it? A cool down period perhaps?


Neither the CPU nor the GPU hits TJMax when it happens.

And yes, I can turn it back on immediately, and it doesn't seem to shut off any more often right afterwards than it would normally.


13 minutes ago, Mnky313 said:


Neither the CPU nor the GPU hits TJMax when it happens.

And yes, I can turn it back on immediately, and it doesn't seem to shut off any more often right afterwards than it would normally.

Hmm... that's a tricky one you've got on your hands then. You've never observed it shut down while idle or doing something easy? Only under load? I can't immediately think of any other probable causes, except maybe try a single stick of RAM? Swap the RAM? Beyond that, the only other thing I can think of is swapping the motherboard, but that's not realistic with most laptops.


11 minutes ago, Windows7ge said:

You've never observed it shut down while idle or doing something easy? Only under load? I can't immediately think of any other probable causes, except maybe try a single stick of RAM? Swap the RAM? Beyond that, the only other thing I can think of is swapping the motherboard, but that's not realistic with most laptops.

Never happens at idle, only under load. The reason I assumed it was something on the board overheating was that it didn't appear to happen before I swapped in the watercooled heatsink.

I haven't done any real long-term testing to see whether it works fine with the air cooled one, though. The last time I thought I'd fixed it by adjusting thermal pads, it ran fine for weeks, even under a full stress test for multiple days, only to start having problems again later on. (I only had the air cooled heatsink in for a few hours as I kept having it crash a bunch in quick succession, which didn't happen on the air cooled one.)

Haven't tried pulling RAM, but if it starts really acting up again I might. It's also still *technically* in warranty, so if I can narrow it down to the board I could have them swap the board.


On 6/25/2022 at 4:35 PM, Windows7ge said:

Reading deeper into that PROXMOX forum thread will probably be worthwhile if the issue persists. The two situations closely resemble each other so it might prove to be a good resource.

 

How long have you had this new configuration going? A day? A week? A month? It's the intermittent problems that are the biggest bitch to diagnose and fix. Just when you think you've solved it it crops up again.

well shit.

It just happened again....

Guess I'm trying some other stuff now.
 
