Long time reader, first time poster. This is a long story, and I couldn't find a similar issue in the last few days of searching, so I thought I'd throw it out to the crowd for thoughts.

 

I live in Vanuatu, so everything is air freight, about 6 weeks from order usually. 

 

System Spec.

Ryzen 7 5800X

Asus Prime X570-Pro (BIOS 3202)

64GB (2 x 32GB G.Skill Trident Z Royal 4000MHz) clocked at 3800MHz

Zotac 3090 Trinity

Samsung 970 evo m2, Samsung 970 evo ssd, 18TB seagate

Custom loop, EK CPU block, GPU block, 360 rad and res/pump combo

Seasonic Ultra Platinum 1000w

Thermaltake P3 case.

 

Threw the system together on the bench with the stock cooler on the GPU and an old CPU cooler, did the BIOS update to 3001 for the 5800X and it posted straight up. Happy days, all drives found, no worries.

 

Proceeded to build the system and loop, installed win10, all running fine, all drives found, great.

 

Grabbed the latest Nvidia drivers, installed, and disco time. Screens having an epileptic fit: black, on, black, on, black, at random intervals. It happens at the desktop, is worse in a browser, and loading a benchmark or game BSODs. Disabled the Nvidia display adapter and it's fine. Checked Event Viewer: nvlddmkm.sys has timed out and reset. Went through all the suggestions I found online, none worked.

 

Went into the BIOS and forced it to PCIe 3.0. Flickering is less now, and I even ran a full Heaven benchmark, but games still crashed, and some boots were worse than others, same error. So I'm pretty convinced at this point that this has got to be a PCIe problem. The BIOS is also showing the GPU at 8x native, whether it's plugged directly into the mobo or on a riser, so I start the RMA process with Zotac and ASUS. Might as well send everything back and start again...remember this was going to be a 3 month RMA process at best due to location.
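
For anyone wanting to check the negotiated link from inside the OS rather than the BIOS, here is a rough Python sketch around `nvidia-smi`. It assumes `nvidia-smi` is on the PATH and that the four `pcie.link.*` query fields come back as plain numbers; the function names are made up for illustration. Note that the "max" values reflect the GPU and system together, and the current generation can read lower at idle because of link power saving, so it's worth checking under load.

```python
import subprocess

# Query fields supported by nvidia-smi's --query-gpu option.
QUERY = "pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current,pcie.link.width.max"


def parse_link_rows(csv_text):
    """Parse nvidia-smi CSV rows (noheader, nounits) into dicts of ints."""
    keys = ("gen_current", "gen_max", "width_current", "width_max")
    rows = []
    for line in csv_text.strip().splitlines():
        values = [int(v.strip()) for v in line.split(",")]
        rows.append(dict(zip(keys, values)))
    return rows


def describe(row):
    """One-line summary, flagging a link narrower than the reported maximum."""
    status = "OK" if row["width_current"] == row["width_max"] else "REDUCED WIDTH"
    return (
        f"Gen{row['gen_current']} x{row['width_current']} "
        f"(max Gen{row['gen_max']} x{row['width_max']}): {status}"
    )


def query_gpu_links():
    """Run nvidia-smi (assumed to be installed) and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_link_rows(out)
```

Calling `for row in query_gpu_links(): print(describe(row))` on a machine with the driver installed prints one line per GPU. A card sitting at x8 in an x16 slot would show up as `REDUCED WIDTH`.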

 

Anyway, I found a random topic (one of thousands) about the same Nvidia driver timing out and resetting, and his culprit was RAM, so I diligently booted into memtest with both sticks and ran it for 24 hours. All clear, zero errors. Next up was testing the RAM slots, so out goes a stick and boot up. All 4 slots with this stick give the same issues. Stick goes out, stick 2 goes in, boot, and suddenly I'm stable. Black screen of epilepsy is gone. (This is the point I check the QVL on the mobo and RAM and realise that I really shouldn't have bought these sticks.)

 

So unfortunately I'm now running on single channel, but the system's stable, so into the BIOS: OC the CPU to 4800 all core, FCLK to 1900 and RAM to 3800, ditch the ultra-conservative Zotac vBIOS for something with a bit of a higher power ceiling and OC that. Temps are sweet, maxing out at 60 deg on the GPU at full song and around 75 on the CPU, and Time Spy scores are in the low 18000s, so all is good...except I'm still only running on half the lanes, which still indicates a GPU, mobo or CPU problem. It was stable for about a month.

 

So I ordered a QVL-listed set of RAM and another motherboard, and seeing as we were stable and I was running on PCIe 4.0, the 8 lanes I could live with while the RAM and mobo turned up.
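
As a rough sanity check on why 8 lanes of PCIe 4.0 is liveable, here is a small sketch of the theoretical per-direction bandwidth. It uses the per-lane rates from the PCIe specs (8 GT/s for 3.0, 16 GT/s for 4.0) and 128b/130b encoding, and ignores protocol overhead, so real-world throughput will be lower. The function name is just for illustration.

```python
# Per-lane raw transfer rates in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING = 128 / 130  # 128b/130b line encoding used by PCIe 3.0 and 4.0


def link_bandwidth_gb_s(gen, lanes):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


for gen, lanes in [(3, 8), (3, 16), (4, 8), (4, 16)]:
    print(f"PCIe {gen}.0 x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

On paper, PCIe 4.0 x8 comes out the same as PCIe 3.0 x16 (about 15.75 GB/s each way), which is why the half-width link on its own shouldn't hurt much. The worry is what the missing lanes say about the hardware.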

 

Anyway, on the weekend, during a real solid session (GPU had been at 90%+ for around 9 hours), the display goes dark. I still have audio and other people can still hear me, but the game itself has disconnected me as well. Reset button, boots straight in, no problem, 5 min later same again. Event Viewer is now showing that DWM.exe has crashed and kindly requested the system take the GPU along with it. Anyway, went to bed, loaded up next day, same again, even on the desktop. DWM errors. At this point Zotac has agreed to RMA the card. I'd been hoping it wasn't the card, since it had been stable after I found the bad (yet, according to memtest86, not bad) RAM module, so I'd left the RMA open, and now I've resigned myself to sending the card back.

 

Revert the vBIOS to Zotac's, pull out the card, strip off the EK block, re-install the stock cooler and backplate, drop it into the system, boot up and.....no post. WTAF, I'm so done with tech...the more you spend the worse it gets. Anyway, clear CMOS, and it posts. Check GPU info in the BIOS and WTF...16x native. Boot into Windows, no longer crashing, 6 back-to-back stress tests on 3DMark all came back 97-98% (worst was 69.9 and a fail, but I round that up to 97 and a pass)...but she's warm (75, well within spec) and she's slow compared to the OC (runs at 1650-1700MHz vs 1950-2000 on the OC)...and it's on 16 lanes. Memory is still buggered, but I've moved on. Everything is good.

 

Having no clue what I did, off comes the stock cooler and on goes the EK water block...posts...and 8x native.  Didn't even bother testing in windows.

 

So off comes the EK block and just the stock heat pipes and fans go on, posts with 16x native.  

 

At this point the internet is not helping me: a few posts of GPUs failing with waterblocks, etc., but nothing about a card working half-assed with a waterblock.

 

So I'm thinking maybe Zotac has some kind of limp mode (a half-lane mode) built in for when it doesn't see fans connected, so I pull the fan plugs off the stock cooler, power on, and 16x native.

 

Put back on the EK water block, posts at 8x native. 

 

Now I've just spent the past 2 hours going over the back of the PCB trying to work out if the block is somehow shorting out the card...in a minor, but still annoying way.  There's nothing obvious (other than how bad EK's standoff assembly is), so I put 0.5mm plastic washers between the standoffs and the PCB on all but the die mounting screws in an attempt to insulate any potential shorting, but not totally destroy the tension on the die.  Re-install and posts at 16x native.  Take out the washers and posts at 8x native.  
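
To keep the swaps straight, the results so far can be written down as a small table. This is only a sketch that tabulates the link widths reported in this post, in the order they were tried; the labels and the helper function are made up for illustration.

```python
# (configuration, link width shown in the BIOS), in the order tried in this post.
observations = [
    ("stock cooler and backplate", 16),
    ("EK block, no washers", 8),
    ("stock heat pipes and fans only", 16),
    ("stock cooler, fans unplugged", 16),
    ("EK block, no washers", 8),
    ("EK block, 0.5mm plastic washers", 16),
    ("EK block, no washers", 8),
]


def configs_at_width(obs, width):
    """Return the distinct configurations seen at a given width, in first-seen order."""
    seen = []
    for config, w in obs:
        if w == width and config not in seen:
            seen.append(config)
    return seen


print("x8: ", configs_at_width(observations, 8))
print("x16:", configs_at_width(observations, 16))
```

The only configuration that ever posts at x8 is the EK block mounted directly on its standoffs. Every stock cooler variant, and the EK block with washers, posts at x16.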

 

Just ran a bunch of stress tests, and it's stable...but it runs about 5 degrees hotter (63-65 degrees), which I assume is because the tension on the die is slightly lighter.

 

I guess my real question, and the purpose of this post, is: has anyone experienced anything like this with GPU water blocks before? I can imagine that if I wasn't having the memory issues I might never have even noticed that I was running on half the lanes, which was a symptom of potentially a much more significant issue.

 

Edit: I've gone and ordered a Bykski GPU block and backplate, just to see if it's any better.

 

Thanks for reading.


Please post your Bykski GPU block experience too, it helps us all.

I was personally going to order the EK GPU block in a month or so, and your post just saved me quite a lot of time.

And thanks for writing it all up for the community.


5 minutes ago, Abdullah Bhutta said:

Please post your Bykski GPU block experience too, it helps us all.

I was personally going to order the EK GPU block in a month or so, and your post just saved me quite a lot of time.

And thanks for writing it all up for the community.

Will do. I'm going to try a thinner insulator on top of the standoffs in the meantime...maybe some tape, to reduce the tolerances.

 

It very well could be the Zotac board rather than the fault of the block, as the blocks are just milled to the reference design spec, and I assume the Bykski will be similar.

 

Going over the board now...none of the mounting points of the board, aside from the 4 die mounts, mount PCB to metal.  They all mount from a plastic carrier through the PCB to the backplate, which is insulated from the board by thermal pads over the memory modules.
