
AMD Radeon RX 6000 GPUs Mysteriously Start Dying, German Repair Shop Receives 48 Cards With Cracked Chips - Is it from a Driver Update? (Updated)

Some random thoughts:

 

1) I think it is very suspicious that this report is coming from a single repair shop. I would assume that if this was a widespread issue then we would be hearing about it from more sources. Maybe we will hear from others in the coming days, but right now it seems very suspicious to say the least.

 

2) I doubt a driver update is the cause. We humans like to see patterns and draw connections between things, even when no such connection actually exists. The reasons I don't think a driver did it are:

A) I don't see how a driver could cause this type of damage to the die itself.

B) If it was a driver we should have seen this being more widespread.

 

3) I seriously doubt this is mining related. We haven't seen cards die like this from mining before, and I suspect that if it was a single person bringing in 48 dead cards the repair shop would probably have noted that. Plus, miners tend to take care of their cards because they are an asset that makes them money, and doing things like making sure they run at peak efficiency, for example by undervolting, makes them more profitable. That is unlike cards used for gaming, which get repeatedly temperature cycled and overclocked, and sit in a probably dusty case under a table all day long.

 

4) I would like to know if this was from 48 different customers, and if the cards were all the same model/brand.

 

5) In before a bunch of YouTubers make clickbait videos titled "this driver will kill your graphics card!?" or "Is AMD in big trouble!?" (or some variation of a vague and alarmist question that preys on people's fears), with them making very exaggerated faces in the thumbnail, either super serious or super surprised.


2 hours ago, LAwLz said:

5) In before a bunch of YouTubers make clickbait videos titled "this driver will kill your graphics card!?" or "Is AMD in big trouble!?" (or some variation of a vague and alarmist question that preys on people's fears), with them making very exaggerated faces in the thumbnail, either super serious or super surprised.

You won't believe what this software does to your PC...  😮

 

The thing I hate about these news stories is that my son just spent some hard-earnt money on a 6700. He hasn't had a chance to use it yet because he has to save for a PSU to run it, but the thought of it dying anytime soon due to a driver or manufacturing-run fault is not nice.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


4 hours ago, LAwLz said:

3) I seriously doubt this is mining related. We haven't seen cards die like this from mining before, and I suspect that if it was a single person bringing in 48 dead cards the repair shop would probably have noted that. Plus, miners tend to take care of their cards because they are an asset that makes them money, and doing things like making sure they run at peak efficiency, for example by undervolting, makes them more profitable. That is unlike cards used for gaming, which get repeatedly temperature cycled and overclocked, and sit in a probably dusty case under a table all day long.

My theory was that these cards were originally mining cards but were transported/packaged poorly to Germany; maybe on the way there they hit a huge bump or got thrown around. That's why it's not widespread.

| Intel i7-3770@4.2Ghz | Asus Z77-V | Zotac 980 Ti Amp! Omega | DDR3 1800mhz 4GB x4 | 300GB Intel DC S3500 SSD | 512GB Plextor M5 Pro | 2x 1TB WD Blue HDD |
 | Enermax NAXN82+ 650W 80Plus Bronze | Fiio E07K | Grado SR80i | Cooler Master XB HAF EVO | Logitech G27 | Logitech G600 | CM Storm Quickfire TK | DualShock 4 |


3 minutes ago, xAcid9 said:

My theory was that these cards were originally mining cards but were transported/packaged poorly to Germany; maybe on the way there they hit a huge bump or got thrown around. That's why it's not widespread.

 

As noted previously vis a vis overpressuring the cooler, even a hard knock shouldn't just shatter the die like this. It's a really weird failure.


10 minutes ago, xAcid9 said:

My theory was that these cards were originally mining cards but were transported/packaged poorly to Germany; maybe on the way there they hit a huge bump or got thrown around. That's why it's not widespread.

It's more likely that it was cooled too quickly. That is what causes hot things to crack. So that puts the blame on two scenarios:

 

The GPUs were already running hot and then were suddenly cooled. Consider the failure condition:

Quote

all 48 cards come with shorted SOC rail, shorted memory rail and a shorted memory controller rail. Upon asking users what they were specifically doing with the cards, most of them responded differently

This is clearly indicative of power being the reason.

 

So these GPUs likely ran extremely hot and then were suddenly cooled when the power was cut. Fans don't continue to run when power to the card stops, so the logical reason for the cracked die is that the die partly melted and suddenly cooled.

 


8 hours ago, LAwLz said:

1) I think it is very suspicious that this report is coming from a single repair shop. I would assume that if this was a widespread issue then we would be hearing about it from more sources. Maybe we will hear from others in the coming days, but right now it seems very suspicious to say the least.

Yes, that is kinda strange. But on the other hand, there are probably not many repair shops that do board-level repairs on graphics cards. I mean, even the big brands don't usually do it.

Desktop: i9-10850K [Noctua NH-D15 Chromax.Black] | Asus ROG Strix Z490-E | G.Skill Trident Z 2x16GB 3600Mhz 16-16-16-36 | Asus ROG Strix RTX 3080Ti OC | SeaSonic PRIME Ultra Gold 1000W | Samsung 970 Evo Plus 1TB | Samsung 860 Evo 2TB | CoolerMaster MasterCase H500 ARGB | Win 10

Display: Samsung Odyssey G7A (28" 4K 144Hz)

 

Laptop: Lenovo ThinkBook 16p Gen 4 | i7-13700H | 2x8GB 5200Mhz | RTX 4060 | Linux Mint 21.2 Cinnamon


7 hours ago, Kisai said:

It's more likely that it was cooled too quickly. That is what causes hot things to crack. So that puts the blame on two scenarios:

 

The GPUs were already running hot and then were suddenly cooled. Consider the failure condition:

This is clearly indicative of power being the reason.

 

So these GPUs likely ran extremely hot and then were suddenly cooled when the power was cut. Fans don't continue to run when power to the card stops, so the logical reason for the cracked die is that the die partly melted and suddenly cooled.

 

As the card is failing due to heat or some other reason, it's also possible that the die's power and thermal protections fail, and then it can no longer protect itself. The kind of damage here is only seen when the die is not being cooled and a huge amount of power goes through it. One of the GN/EVGA videos showed them killing a die and making it burn itself up.

 

That video is probably a good starting point to re-watch to hear their discussion around that. Like I mentioned, I had a cooler pop right off the die in the middle of gaming at 250W+, and there is no visible die damage at all; it's totally dead now, though.

 

 

It's also possible all these cards were running XOC BIOS/Firmware which disables all these protections.


8 hours ago, CarlBar said:

As noted previously vis a vis overpressuring the cooler, even a hard knock shouldn't just shatter the die like this. It's a really weird failure.

I've dropped my R9 270X more times than I can remember. Still going strong.

 With all the Trolls, Try Hards, Noobs and Weirdos around here you'd think i'd find SOMEWHERE to fit in!


Update to the story 

 

TL;DW - Kris contacted all 48 owners; only 25 replied to him, and only 2 have the original invoice. Turns out most of them just bought the card second-hand last December.

He suspects these cards got pressure washed and didn't get to dry properly before they were stored.

 



8 hours ago, xAcid9 said:

Update to the story 

 

TL;DW - Kris contacted all 48 owners; only 25 replied to him, and only 2 have the original invoice. Turns out most of them just bought the card second-hand last December.

He suspects these cards got pressure washed and didn't get to dry properly before they were stored.

 

 

From VideoCardz:

 

Quote

The long story short is that it was not the driver after all. As it turns out, the graphics cards that were sent over to KrisFix also had one other thing in common that was previously not mentioned. Many of these cards were sold by the same seller, with a strong suspicion that these cards were previously used for cryptomining.

 

However, the main issue was not the fact they were used for mining, as many post-mining cards are used by gamers around the globe every day. It is the humid condition that these cards were kept in, that might have been the cause of the problem. Prolonged storage of graphics cards in a warehouse with high moisture is a very plausible explanation for GPU cracking.

 

Such conclusions were shared after conducting 150 hours of tests with multiple cards, including AMD MBA (Made by AMD) and custom designs. These tests did not provide many answers, but a more detailed survey among customers definitely did.

 

https://videocardz.com/newz/radeon-gpu-cracking-not-caused-by-drivers-storing-conditions-and-cryptomining-to-blame

 

Going to update the OP.

