@Lefy8880 did you end up figuring out your problem? It might be a little early to celebrate, but I think I might have found a solution.
First of all, your post was one of the most complete I've found on this problem. I've been troubleshooting it for 2 weeks already; I built a new computer in the past week (6th June to be more precise). My specs are very similar to yours (not quite identical, but they are in terms of architecture and power consumption):
CPU: AMD Ryzen 7 3700X (With the stock cooler)
GPU: MSI RX 5700 Mecha OC GP 8GB GDDR6
MOBO: MSI B450-A PRO MAX (Latest stable BIOS)
RAM: G.Skill Aegis 2x16GB - 3200MHz
PSU: Seasonic Focus GX-650, 650W 80+ Gold, Full-Modular
DISK: SSD M.2 Western Digital Blue SN550 NVMe
My symptoms are (or at least were, I hope) very similar to yours. I could be gaming, browsing or even working on the computer for 4-5 hours and then it would reboot by itself without any warning. I would get around 1 to 2 restarts per day. Just like I've seen reported everywhere, when I opened the Event Viewer or the Reliability History, the only critical errors I would get were Kernel-Power issues (which are very generic) and RadeonSoftware.exe application crashes.
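For anyone who wants to track how often these restarts happen, here is a minimal Python sketch that counts unexpected-restart entries per day from an Event Viewer log exported as CSV. It rests on a few assumptions that are not from the post above: that the "kernel power" entries are Kernel-Power Event ID 41, that the export uses the column names "Date and Time", "Source" and "Event ID", and that the date and time are separated by a space. The sample rows are made up for illustration, not real logs.

```python
import csv
import io
from collections import Counter


def count_unexpected_restarts(csv_text, source="Kernel-Power", event_id="41"):
    """Count matching events per day in an Event Viewer CSV export.

    Assumed column names: "Date and Time", "Source", "Event ID".
    """
    per_day = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        row_source = row.get("Source") or ""
        row_event_id = (row.get("Event ID") or "").strip()
        if source in row_source and row_event_id == event_id:
            # Keep only the date part (assumes "date time" split by a space).
            day = (row.get("Date and Time") or "").split(" ")[0]
            per_day[day] += 1
    return dict(per_day)


# Made-up sample rows, only to show the expected input shape.
sample = """Level,Date and Time,Source,Event ID,Task Category
Critical,6/10/2020 9:14:02 PM,Microsoft-Windows-Kernel-Power,41,(63)
Error,6/10/2020 9:10:00 PM,Application Error,1000,(100)
Critical,6/10/2020 3:02:11 PM,Microsoft-Windows-Kernel-Power,41,(63)
Critical,6/11/2020 8:45:30 PM,Microsoft-Windows-Kernel-Power,41,(63)
"""

print(count_unexpected_restarts(sample))
```

Comparing the per-day counts before and after a change (new cables, different driver) gives a rough idea of whether that change actually helped.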
I've tried every possible solution I could find on forums, Reddit and elsewhere:
- disabling hardware acceleration in browsers;
- turning off all settings in Wattman/Radeon Software;
- underclocking/undervolting the GPU;
- using 2 separate PCI-E cables instead of one daisy-chained (master/slave) cable;
- different driver versions, Windows versions and Windows settings, using DDU and the AMD software to remove old drivers;
- a clean Windows install.
Trust me, I've tried EVERYTHING possible, you name it. Some solutions would get me more time between random restarts, but I really think that was because they lowered the card's power consumption.
I've tried hardware solutions just like yours. I took out the PSU (my case is a bit odd/smallish for the ATX form factor, and the PSU is vertically mounted), mounted it outside the case, redid all the cables and all that stuff. I never actually tried another PSU because the alternatives I had were lower wattage. I also tried a different Nvidia card, which worked fine, but its power usage was way lower than the RX 5700's.
I've also done temperature monitoring and stress tests with AIDA64, OCCT, Prime95, Unigine (different tests) and FurMark. Tests could run for HOURS without crashing or rebooting. I would stop the test, open a game like Valorant or Destiny 2 and boom: black screen, fans off, reboot.
My last approach was to redo all the cables, like from scratch. Every ... Single ... Cable... I have before and after pictures. I remounted the GPU. I also removed the drivers and installed the AMD Enterprise drivers (although I'd tried those previously). I can happily say that I've been restart-free since Tuesday. I have an uptime (with the computer sleeping in between) of over 3 days. I actually rebooted the system today just to be sure it would hold up, and so far so good. I hope I'm not jinxing it.
TLDR: tried every solution; I think the problem was actually the CPU fan/cooler cable, which was hidden near the heatsink and was also touching one or two capacitors. Dunno if it was overheating or grounding, but that could have been causing the random restarts. That, together with the Enterprise driver version, has gotten me at least some stability.