
Random NVIDIA Driver Crashes with new GTX 980

This is likely to be a long one, so bear with me, as I feel that it's important to share as many details as possible to fully describe the specific problem I've got going on.

 

I've been on a roller coaster ride of problems and solutions and I'm now at my wit's end. If you stick it out and read through to my final analysis, you'll see that after loads of effort, I will likely have to make some hardware swaps/RMAs. So, I'm having problems with my NVIDIA graphics driver where at seemingly random times (both under full load and under light, internet-browsing-type loads) my display driver crashes, resulting in the message "Display driver nvlddmkm stopped responding and has successfully recovered." with event ID "4101" in the Windows System Event Viewer. When the crash happens, my screen usually locks up for several seconds (sometimes longer than others, but usually still just a few seconds) while I can still hear what's going on in the background, and then it resumes where it was. A couple of times, the display never actually recovered but rather just unfroze, leaving me staring at my desktop: I could see that the application I had open was still running, my mouse moved, I could open other windows, and I could hear the music/sound effects from the open application, but I couldn't select (or alt-tab to) the game and open it back up. Keep in mind that this particular result may just be a Windows problem after the unexpected crash.
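For anyone who wants to see how often this error is being logged, here is a minimal sketch in Python. It only builds (and, on Windows, runs) a `wevtutil` query for event ID 4101 in the System log. The function names and the default count of 10 are made up for illustration and are not part of the troubleshooting described in this post.

```python
import platform
import subprocess

# Event ID quoted in the post for the "nvlddmkm stopped responding" message.
EVENT_ID = 4101


def build_query_command(event_id: int = EVENT_ID, count: int = 10) -> list[str]:
    """Build a wevtutil command that lists the newest matching System events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # filter by event ID
        f"/c:{count}",               # return at most `count` events
        "/rd:true",                  # newest events first
        "/f:text",                   # plain-text output instead of XML
    ]


def recent_driver_resets(count: int = 10) -> str:
    """Run the query and return its text output (Windows only)."""
    if platform.system() != "Windows":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query_command(count=count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    if platform.system() == "Windows":
        print(recent_driver_resets())
    else:
        # Elsewhere, just show the command that would be run.
        print(" ".join(build_query_command()))
```

Comparing the timestamps in the output against what was running at the time may help show whether the resets really are as random as they feel.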

 

 

So, here's a total background of my PC and what has led me to this point (feel free to skim read as this is very thorough):

 

When I entered college in Fall of 2012 for an engineering degree, I realized after a few months that my four-year-old laptop was entirely inadequate. Having grown up in an extremely rural community where outside activities thrived and inside activities (gaming) were seen as being "lazy", my computer knowledge was so limited that I would argue today that I literally knew nothing about computers. Having learned enough by early March 2013 to at least know what I was looking for, I settled on a pre-built Dell XPS 8500 Desktop with an OEM version of Windows 8, an Intel i7 3770, an OEM GTX 660, 12GB of RAM, a 500W PSU, and a 1TB 7200RPM HDD.

 

For a decent amount of time (at least a year) this served me very well and I was content with it. My previous experience with computers was mainly through that old laptop, which was rocking Windows Vista, an Intel Pentium processor, 2GB of RAM, and a fragmented hard drive that was beyond repair, so this desktop was like a gift from the heavens. However, over time, I didn't/couldn't stop learning about computer parts and I soon realized that I was very much nearing the status of "Computer Geek" - at least when it came to the hardware. I built my first PC a little over a year ago when one of my roommates entrusted me to assemble his new $1700 gaming rig with the parts that I picked out for him. All went well and he's had zero problems and no complaints. Early this past winter, I built (basically) an entirely new computer for myself. This build included:

 

-Corsair 750D case - cooled by two 140mm front intakes, one 140mm rear exhaust, and two 120mm exhaust "push" radiator fans

-Corsair CX750M Power Supply ---> Remember this item for later (single 12V rail, semi-modular, 750W power supply)

-Corsair H100i CPU cooler

-ASUS Z97 Maximus VII Hero motherboard

-Intel i7 4790k CPU

-4X4GB G.Skill Ripjaws DDR3 2133MHz system memory

-120GB Samsung 840 EVO SSD (this item was actually purchased several months earlier)

 

Notice how this parts list does not include a new GPU. The main reason for building this computer was that I wanted to do it (who doesn't enjoy building a new computer?) and I really didn't trust the incredibly cheap motherboard that came with the Dell. So, adding a new GPU was simply not in the budget and, in all honesty, the OEM version of the GTX 660 was really very impressive in most games given its release date and specs relative to the current era of 900 series cards. I built the computer and reused my old HDD so I didn't have to deal with storage crap (I despise dealing with storage, a.k.a. transferring files and programs from one device to the other). As with my last computer build, all went well and I had zero problems. My CPU overclock was pretty average for a 4790k: a stable 4.6GHz with the H100i cooler. Since that time (early January 2015) I have had zero hardware issues, no crashes, no problems at all.

 

Now, we're finally very near the current date. Right before the release of the mystery product that clearly no one knew was coming (the 980Ti), many of the regular GTX 980s were going on some pretty good sales with rebates and whatnot. By this time (May 2015), my 660 was finally starting to show its age - as game tech advanced, my fps were dropping. Needless to say, I was very interested in the 980s. I did a pretty reasonable amount of research even though I really didn't need to, since I'm so up to date with LinusTechTips, JayzTwoCents, TekSyndicate, etc. on the releases and reviews of new hardware. I was semi set on an EVGA SSC 980 when I noticed that the normally $700 EVGA GTX 980 Classified (which I honestly hadn't even bothered looking at before due to its normal price) was on sale at Amazon for $589, included a $20 rebate towards the card, and came with codes for Witcher 3 and the new Batman. As you might have guessed, I finally bought a new GPU to match my reasonably "upper-tier" PC... the EVGA GTX 980 Classified.

 

It came within a couple of days (gotta love Amazon Prime) on June 1, 2015, and as soon as I could, I tore apart my computer and added my new power-hungry dual 8-pin GPU monster - up from a single 6-pin GPU. I booted the computer up and immediately went to NVIDIA's website to download the latest driver, making sure that the old one was uninstalled properly. I selected the most recent driver for the 980 and downloaded it (it happened to be a driver that was less than one day old - Driver Version 353.06). After that, I couldn't help myself but fire up the 3DMark Demo on Steam and see how much I had actually gained from this card. I had MSI Afterburner installed and running on my other monitor so that I could monitor various elements as the test ran. To my pleasant surprise, the card naturally boosted to 1430MHz and maintained a relatively low temp during the Firestrike benchmark. *It should be noted that I have two Dell U2312HM monitors (1080p res., 60Hz refresh rate, IPS panel), each connected to the GPU via DVI.* As expected, my Firestrike score greatly increased - what I didn't expect was that it nearly tripled!

It wasn't too long until I wanted to see what more I could squeeze out of the card. Without touching any voltages, I maxed out the power limit slider and began gradually adjusting the core and memory speeds. It was doing well... doing well... kept going... no artifacting... and then the display crashed. This is when I was first greeted with the error message I listed above: "Display driver nvlddmkm stopped responding and has successfully recovered." I simply chalked this up as the limit for my OC, backed the speeds down a fair bit, and tried again. It ran just fine. Remember that these crashes I get are pretty random.

 

That evening, I started playing The Witcher 3 for the first time and not long into it, I had another display driver crash. Seeing how it was getting late and I just wanted to relax, I opened up MSI Afterburner and returned the card to its default settings. After another 20 minutes or so I had yet another display driver crash. This raised the little red flag in my head - I was now running at the stock, out-of-the-box speeds and had just gotten another crash. Something was wrong and I didn't know what or why. My first reaction was that it was a problem with this very new game, but it wasn't long until it crashed again, this time while not in the game. This is where the roller coaster ride that I mentioned at the beginning of this novel began. I began researching, researching, and researching some more (all while I should have been studying for my midterms and doing homework for my summer classes, mind you) to the point where I found countless explanations of my issue, potential fixes, and causes. Thanks to the nature of the internet (-_-), out of all of the causes/effects that I found, each one was the sole explanation and solution to the problem according to that individual... everyone else was wrong (sarcasm). It took more sifting through forums and Google searches than I care to admit, but I finally found a very helpful resource that I feel actually explains the legitimate source of my problem. It can be found here.

 

Upon reading and understanding that article as best I could, I checked on a number of solutions - none of what I have tried has solved my problem as today I am still getting display driver crashes at random times. 

 

Here is a list of things I have tried in hopes of solving this issue and that ultimately still led to a crash:

-I first uninstalled and then did a clean install of the latest NVIDIA driver

-I then did this again, this time leaving out the NVIDIA audio software

-I tried rolling back several driver versions

-I uninstalled MSI Afterburner and GeForce Experience

-I double checked numerous power settings in my motherboard's BIOS, thinking I might be suffocating the card thanks to some who-knows-what ASUS power saving setting

-I made sure the card was in a PCIe x16 Gen 3 slot

-I tried using EVGA Precision-X as my monitoring software

-I "underclocked" the core clock speed - essentially removing the EVGA factory overclock

-I ran a system memory test - it reported zero problems/errors

-I restored my CPU overclock and voltages to their original settings with a motherboard BIOS settings reset

-I even backed up my entire computer, formatted my SSD and my HDD, and installed a clean (legitimate) version of Windows 8.1 Pro

----->Then I updated all drivers once again to their newest versions and moved only necessary files back (documents, pictures, etc) leaving behind things I could download from the internet later as needed.

-------> to be fair, this was a good idea anyways since I was still using the OEM version of Windows that was still on my original HDD from Dell. This also allowed me to finally move my OS to my SSD.

 

Nothing worked for my driver crashes.

 

An interesting note to consider here is that whenever I was running my monitoring software, whether MSI Afterburner or EVGA Precision X, the GPU's reported power usage never once approached 100% of the power limit. The highest I ever saw it reach was 91%, and that was one of the few times it managed to keep from crashing with a moderate overclock while 3DMark Firestrike was running. It normally sits around 80-85% (give or take a little) under heavy, near-100% GPU usage. Maybe I'm reading too much into this one little point, but I feel that with an overclock on this card I should at least be coming close to 100% out of the available 125% (I even gently ramped up the voltage on the card once in an effort to draw more power, but nothing changed).

 

At this point, I'm basically concluding that the GPU is in some way or another faulty. The crashes are definitely most frequent when I try overclocking the GPU, which leads me to believe that whatever my issue is, it's a power delivery problem caused either by my Corsair CX750M, which up until now has performed perfectly fine (albeit with a much less power-hungry GPU), or by a fault in EVGA's custom PCB on my specific Classified card. Remember that I never once had an issue when I was still using my GTX 660.

 

 

 

The entire point of this post wasn't so much to ask for answers or solutions (though I certainly do not want to discourage people from listing their thoughts/suggestions!) but rather to create a post on the LinusMediaGroup Forums to make the community more aware of this type of problem, since it appears to be so common and is so hard to diagnose. Due to the nature of PC building and the extreme customization of hardware and software, it's nearly impossible to make everything run perfectly all the time, so things like this can happen sometimes, I suppose. It seems (from my research) that most of the time this is a software-related issue, and troubleshooting should start from a software angle before jumping to the conclusion that it's faulty hardware. That's exactly what I tried to do before sending the GPU away to EVGA and waiting on a replacement.

 

If you've read even just 50% of this post, kudos to you! It kept growing larger and larger each time I read through it, so I apologize for that - especially given the lackluster conclusion to this story. I hope everyone can learn something from my experiences here, and I encourage you to please post any similar experience you may have had or dealt with!

 

Thanks!


Holy crap this is a long post.

 

EDIT: 2346 WORDS!!!! That's longer than my buyer's guide!

 My Buyer’s Guide!   

Build:                                               

CPU: Intel Core i5 4690K Cooler: Cryorig R1 Ultimate RAM: Kingston Fury White Series 8GB SSD: OCZ 100 ARC 240GB HDD: Seagate Barracuda 1TB Motherboard: MSI Z97S SLI Krait Edition Graphics Card: Powercolor PCS+ R9 390 Case: Phanteks Enthoo Pro (White) Power Supply: EVGA G2 750W Monitor: LG 29UM67-P 29" 21:9 Freesync Sexiness Mouse: Razer Deathadder Chroma Keyboard: Razer Blackwidow 2014 Headset: Turtle Beach Ear Force XP400


At this point, with the amount you've tested, you should be taking this over to the GeForce forums.

I've been having random driver crashes with my 970 for a while now, but I haven't had them whilst in a game.

The Internet is the first thing that humanity has built that humanity doesn't understand, the largest experiment in anarchy that we have ever had.


@Logman

 

I hate to say it, but this post is way too long and I think a lot of people are going to be turned away by that. I know I will not be reading a post two and a half times the size of my most recent science report.

 

Might want to shorten it up a bit.


At this point, with the amount you've tested, you should be taking this over to the GeForce forums.

Agreed, they have more time to read shit too.


Try to roll back the drivers to even older ones; it might work. Otherwise I think the problem is hardware based.

BTW, good info you gave us.

I don't have any problem reading a lot of text, but some might.

My Gaming PC

|| CPU: Intel i5 4690@4.3Ghz || GPU: Dual ASUS gtx 1080 Strix. || RAM: 16gb (4x4gb) Kingston HyperX Genesis 1600Mhz. || Motherboard: MSI Z97S Krait edition. || OS: Win10 Pro
________________________________________________________________

Trust me, Im an Engineer


There is some talk about the latest Nvidia drivers causing multiple issues for people. Thankfully I was too lazy to update the drivers. I'd roll back.

In Placebo We Trust - Resident Obnoxious Objective Fangirl (R.O.O.F) - Your Eyes Cannot Hear
Haswell Overclocking Guide | Skylake Overclocking Guide | Can my amp power my headphones?


Wow, what a post. I would say use DDU or CCleaner, or maybe even an sfc scan, but what you already did would have solved those issues. I would RMA. I can't believe you haven't. At least contact support, and every time they call you "Mr. ??", respond "Doctor ?? is fine"; it gets more respect. Sorry for the shit post, you did more than most would.


Okay, so to start out I am going to guess that you're using Windows 7. If so, this is caused by the registry configs (or lack of them in Win7). In Windows 8 and up they did an overhaul to better handle the higher-end cards. Basically, what I am saying is that you're going to need to patch the Windows 7 registry in order for your high-end card to work properly.

To start, open regedit.exe (just type it in the Start menu and it should pop up). Go ahead and back up the registry two times just to be safe; if anything happens, just boot into safe mode and load one of the two backups.

After that, locate HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers, right-click on the GraphicsDrivers folder, and click New > DWORD (32-bit) Value. Make six of them (and yes, use the 32-bit type even if you have a 64-bit OS). Right-click and rename them, then double-click each one to set its value (NOTE: all of these are hex based):

-TdrDdiDelay: set to 5

-TdrDebugMode: set to 2

-TdrDelay: set to 2

-TdrLevel: set to 3

-TdrLimitCount: set to 5

-TdrLimitTime: set to 60

(Note: go to the DCI folder below the GraphicsDrivers folder and set Timeout to 2, then raise it by 2 at a time (max is 10) until your display driver stops crashing.)

 

 

This is what I am currently doing on a friend's PC and it worked for him. The sources I got this from are https://msdn.microsoft.com/en-us/library/windows/hardware/ff569918(v=vs.85).aspx

and https://forums.geforce.com/default/topic/569661/tdr-registry-settings/?offset=1

 

Update 2: If you're on Windows 8, go under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers, click on the DCI folder, and see if a Timeout value is already there. If so, set Timeout to 2 and raise it by 2 at a time (max is 10) until your display driver stops crashing. (I would recommend doing a registry backup before you edit it.)

 

Update one: If I am not mistaken, the OS pings the video card every 2 seconds, and if it does not receive a response within 2 seconds the display driver is reset (the "stopped responding" crash). Increasing the "TdrDelay" value lengthens the time the card is allowed to take to respond, thus potentially preventing the error if the video card responds within, say, 10 seconds.
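The manual regedit steps in this post could also be written out as a .reg file. Below is a minimal Python sketch that only generates the file text and does not touch the registry. The function name `render_reg_file` is hypothetical, and the numbers are read as decimal, which is an assumption: regedit's DWORD dialog defaults to hexadecimal, so typing 60 there would actually store 96. Going by Microsoft's TDR registry key documentation, these six numbers appear to match the defaults, so raising TdrDelay is probably the change that would make a difference.

```python
GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers"
)

# Values as listed in the post, interpreted as decimal numbers (an assumption).
TDR_VALUES = {
    "TdrDdiDelay": 5,
    "TdrDebugMode": 2,
    "TdrDelay": 2,
    "TdrLevel": 3,
    "TdrLimitCount": 5,
    "TdrLimitTime": 60,
}


def render_reg_file(key: str, values: dict[str, int]) -> str:
    """Return .reg file text that sets each name to a 32-bit DWORD under `key`."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, number in values.items():
        if not 0 <= number <= 0xFFFFFFFF:
            raise ValueError(f"{name}={number} does not fit in a 32-bit DWORD")
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{number:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reg_file(GRAPHICS_DRIVERS_KEY, TDR_VALUES))
```

Saving the printed text as a .reg file should give something regedit can import. As with the manual steps, back up the registry first.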


Thanks for all of the awesome feedback everyone! I know the post got REALLY long so I do appreciate the time many of you seem to have spent actually reading through it! Hopefully, if I get the chance today, I'm going to be contacting EVGA Customer Support to see what they have to say. I'm very much anticipating being told to send it back to them at this point. 


I'm having similar issues since moving to a GTX 980 from a GTX 660, though it's specific to certain games. In my case, I did not change any other hardware or software (other than the display driver for the new card). I built a machine for a friend that also included a 980 and he's getting the exact same problem (it's even crashing Fallout 3 in the exact same places it did for me). To me, this makes it look more like a driver issue than an OS issue; were it truly an issue with the way the OS resets video adapters, it should have been doing this on the 660 as well.


I have a similar issue with my Gigabyte GTX 970 Gaming G1

(the rest of the spec is an i5 4690K, HyperX Fury 2x4GB 1866, Gigabyte Gaming 5 motherboard, CS650M PSU, 2TB Seagate HDD + 120GB Samsung EVO SSD)

Those problems (random crashing, sometimes with artifacting, sometimes needing a hard reset) started right after I installed new drivers (353.06), so I thought it must be a bad driver installation. I uninstalled them with DDU in safe mode and clean installed older drivers (347.52), but that didn't work, so I tried some other solutions I found on the internet (disabling HD audio, disabling Windows Aero, changing PhysX to CPU, a clean reinstall of Windows, removing the GPU and reseating it), but nothing helped... According to monitoring software (GPU-Z, HWMonitor, RealTemp), the GPU's max temp under load is 65ºC... I don't know if I should send it back or try more things to make it work normally...
