
Weird Vega 56 crash issues that I can't pinpoint.

Vortagh

Heya, I have a fucked up issue with my Vega 56 and I'm at my wits' end. Any help would be welcome.

 

My buddy gave me his Vega 56 (ROG-STRIX-RXVEGA56-O8G-GAMING) yesterday, because he upgraded to a 3080 and I only had a lowly 970.

It ran perfectly fine at his place. Absolutely rock stable, no noise, he did no tweaking, nothing. He literally plugged it in and let it run, never touching ANYTHING. Ran fine for two or three years. With me? Won't stop crashing. 

 

I ran DDU from safe mode, letting it get rid of anything it could find, then installed the 20.11.2 drivers.

The first thing I did was run Fire Strike and Time Spy in 3DMark, which ran without a hitch, but BOY is that card a turbine when running Time Spy. I have never heard something in a PC get that loud.

 

But hey, it's a benchmark. I didn't care. Booted up Battlefront II and continued my campaign. I forgot to change any graphics options, so they were still at a nice but not ultra setting from the 970. Played a few minutes, then I had to do some shooting, ran around aaaand....crash. Ok. Weird. Let's try another game. Shadow of the Tomb Raider. Did the benchmark. Nice, 80 - 165 fps, barely any noise from the fans. Then I realized I was running "only" on high, so I switched to ultra. It started out less than 10 fps below the high settings, nice. Nope, crashes after about 10 seconds in the benchmark. Tried again, crashes on entering the benchmark menu. Tried again, back to crashing at the beginning of the benchmark. Installed GPU Tweak II and ran it again. This time it ran fine at first, but the turbine started again (although less loud) and it crashed about two thirds into the benchmark.

 

What the hell is going on here? I checked AMD's performance panel and GPU Tweak and neither showed me anything that seemed wrong. Sure, GPU is at 100%, duh. Power was hard to tell, because the software is a POS and rescales the axis in real time, so I had to quickly jump from game to desktop and check. Around 264W max. Seems to be in order. Idle temps are around 25-35°C, and about 55°C under load, it seems. The software says the fans are at 0 RPM, which makes sense at idle, but judging by the noise they made, I really doubt that's correct in game. GPU Tweak II also says 0 RPM in the OSD, though.

 

It's connected to a be quiet! BQT P9-750W, using two separate 8-pin cables on PCIe 1 and 3, which means it draws power from the two 25A 12V rails, in addition to the "normal" 12V 20A motherboard rail. That should be more than enough...
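For anyone who wants to redo that arithmetic, here is a minimal sketch. It assumes the rail ratings quoted in this post (25 A at 12 V per PCIe rail), the PCIe limits of 150 W per 8-pin connector and 75 W from the slot, and it treats the 264 W reading as whole-card power, which is an assumption. The function names are made up for illustration. It only compares steady averages and says nothing about short transient spikes, which software readouts would not show anyway.

```python
# Rough 12 V power-budget check. All names here are illustrative.
RAIL_VOLTAGE_V = 12.0      # nominal rail voltage
PCIE_8PIN_LIMIT_W = 150.0  # PCIe limit per 8-pin connector
PCIE_SLOT_LIMIT_W = 75.0   # PCIe limit for an x16 slot


def rail_capacity_w(amps, volts=RAIL_VOLTAGE_V):
    """Continuous power a rail can deliver at its rated current."""
    return amps * volts


def connector_budget_w(n_8pin):
    """Power available to a card with n 8-pin plugs plus the slot."""
    return n_8pin * PCIE_8PIN_LIMIT_W + PCIE_SLOT_LIMIT_W


def headroom_w(measured_w, budget_w):
    """Positive result means the measured draw is below the budget."""
    return budget_w - measured_w


if __name__ == "__main__":
    per_rail = rail_capacity_w(25)   # rail rating quoted in the post
    budget = connector_budget_w(2)   # two 8-pin cables plus the slot
    peak = 264.0                     # peak reading from the post (assumed whole-card)
    print(f"per rail: {per_rail:.0f} W")
    print(f"connector budget: {budget:.0f} W")
    print(f"headroom at peak: {headroom_w(peak, budget):.0f} W")
```

On paper that leaves headroom, but a paper budget cannot rule out a rail that sags under fast load changes.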

Gigabyte Aorus Ultra, with Ryzen 5600 and 32GB 3200. There's no OC anywhere in the system.

 

Wtf is going on here? The card was fine. There's no way it died on the way here (it was in an anti-static bag and handled properly). There's no OC involved, the PSU is more than enough. It's connected correctly, seated correctly, the drivers aren't beta or anything, the rest of the system is perfectly fine.

 

Halp pls?


To be honest, I have no idea what's happening, but I sometimes experience something similar on my Vega 64, although mine doesn't crash.


A few additional notes:

 

- The crashes don't always happen right away. Most of the time they happen pretty fast, but on rare occasions they let me play for a bit. Other times it crashes right away while starting the game/software.

- Contrary to my first belief, it also happens during 3DMark. I must have been lucky at the beginning (see above).

- There is NO error message. I just get a quick freeze, bit of a black screen and I'm back at my desktop.

- I've tested drivers from 17.7.1 to the newest. It makes absolutely NO difference whatsoever.

- I've tried disabling DX12.

 

So I tested some more and realized that, when these crashes happen, it's almost always with the same message in event viewer. It states that amdxc64.dll is a "faulty module". Why though? This can't be a bad driver install, after literally half a dozen DDU cleans and reinstalls with several versions of AMD's driver.
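To check whether it really is "almost always" the same module, the crash entries can be tallied. This is a minimal sketch that assumes the Event Viewer entries have been exported to plain text and that each crash entry contains a line of the form `Faulting module name: <file>, ...`. That wording is an assumption about the export, since the post only quotes "faulty module", and `count_faulting_modules` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed line format in the exported log (not confirmed by the post):
#   Faulting module name: <file>, version: ...
FAULTING_MODULE_RE = re.compile(
    r"Faulting module name:\s*([^,\r\n]+)", re.IGNORECASE
)


def count_faulting_modules(log_text):
    """Return a Counter of lower-cased module names found in the text."""
    names = (
        match.group(1).strip().lower()
        for match in FAULTING_MODULE_RE.finditer(log_text)
    )
    return Counter(names)
```

Calling `count_faulting_modules(text).most_common()` on the exported text lists the modules by how often they show up, so a single dominant entry would back up the impression from scrolling through Event Viewer.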

 

I googled it, but I can't find any report from other people that matches what I'm seeing and that I can replicate.

 

And everything else on that system is brand new - literally one week old today - and every driver is up to date. 

 


With that card I'd check the VRM temps, they are notorious for that... Try running HWiNFO and check the temps there.


Thanks, will test that!

 

However, as I said, this card ran absolutely fine for more than two years without anything being touched, and the only thing that changed was that it went from my friend's PC into mine.


Well, I did check using HWiNFO and there's no big temp spike or anything. 

 

This is what I'm getting when I start the game and it crashes right away:

[Attached image: Screenshot (7)]

 

This is "in game, but crashes right before I can start the benchmark":

[Attached image: Screenshot (8)]

 

And this is "crashes at the point where the benchmark gets to the part with the lowest fps":

[Attached image: Screenshot (9)]

 

Can anyone see anything in there that sticks out to them? Because I don't. Temps seem normal to me, including the GPU hot spot. At least from what little I have learned about the Vega series cards. The only thing I was wondering about: is it normal that the GPU core current is at 140 amps?
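To put the 140 A figure in perspective, power is just voltage times current. A minimal sketch, assuming a core voltage somewhere around 0.9 to 1.1 V under load; those voltages are assumed for illustration and are not read from the screenshots, and `core_power_w` is a made-up name.

```python
def core_power_w(current_a, voltage_v):
    """Electrical power in watts from current (A) and voltage (V)."""
    return current_a * voltage_v


if __name__ == "__main__":
    core_current_a = 140.0  # reading mentioned in the post
    # Assumed core voltages, NOT taken from the screenshots.
    for voltage_v in (0.9, 1.0, 1.1):
        power_w = core_power_w(core_current_a, voltage_v)
        print(f"{voltage_v:.1f} V x {core_current_a:.0f} A = {power_w:.0f} W")
```

Under those assumptions, 140 A works out to somewhere in the region of 140 W for the core alone, which would be plausible next to the roughly 264 W peak mentioned earlier. Large currents are expected because the core voltage is so low.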


I used GPU Tweak and pushed the card down to 1300 MHz with a power target of 50%. No changes.

 

Just did a fresh install of Win 10. No changes. 

 

Wtf is going on with this card? 


7 hours ago, Vortagh said:

I used GPU Tweak and pushed the card down to 1300 MHz with a power target of 50%. No changes.

 

Just did a fresh install of Win 10. No changes. 

 

Wtf is going on with this card? 

 

The lack of +12V monitoring on that card is laughable. Even the R9 290X had PCIe +12V monitoring...

 

Easy test, and you're not going to want to hear this:

 

Go borrow your friend's power supply, install it in your own computer (please do it this way), then test your Vega.

If there are no crashes, you've found the source of the problem.

 

Yes I am fully aware it's inconvenient and you're not going to want to bother with all that work.  But you have a choice to make.  Pinpoint the problem and fix it, or complain on forums about a crashing card.  What path will you choose?


My Vega 64 crashes sometimes as well. All I do is increase the voltage and decrease the clock speed by a tiny margin for the lower states, and that seems to solve the problem. :)
