
Salticid

Member
  • Posts

    55
  • Joined

  • Last visited

Awards

This user doesn't have any awards

System

  • CPU
    i9 9900K
  • Motherboard
    ASUS ROG Maximus XI Hero Wifi
  • RAM
    G.Skill Trident Z 32 GB DDR4-3600
  • GPU
    EVGA 2080 TI FTW3
  • Case
    Thermaltake View 71 TG
  • Storage
    Samsung 970 Evo Plus 250 GB M.2-2280, Samsung 860 Evo 1 TB 2.5", Seagate Barracuda 2 TB 7200 RPM 3.5"
  • PSU
    EVGA SuperNOVA P2 1200 W 80+ Platinum
  • Display(s)
    ACER Predator Z1, Samsung 32" secondary
  • Cooling
    Custom loop: EK Supremacy Evo CPU block, EVGA Hydrocopper GPU Block, EK 360 Rads x2, Bitspower fittings & OD 16mm PETG pipe, Thermaltake D5 pump & res combo, Thermaltake Riing 120mm static pressure fans x6
  • Keyboard
    ROG Strix Flare
  • Sound
    Sound Blaster X Katana
  • Operating System
    Windows 10 Pro


  1. I wanted to post this here and close the topic, in case my solution one day helps others. The issue was in fact the nvlddmkm Error, Event 14. It turns out this error is not anything specific but a catch-all for anything that causes the driver to time out: hardware failure, driver failure, firmware/software conflicts, data corruption. That appears to be why there are so many different issues out there with similar symptoms and the same message but different "fixes", all of which are really workarounds, because until you know the actual cause you can't fix it.
     I don't know why I defaulted to this being a hardware issue; there was a time I would have looked at software first. Once I was pointed at software, though, I traced the specific error in my event log back to the day it started (see the event-log sketch after this list for one way to do that). That day I had installed some software from my employer for volunteer work I had signed up to do. That first event was actually catastrophic, but it corrected itself. I didn't see any more issues directly for weeks, and after that what I did see were slow, creeping issues that did not seem immediately related (but were), getting worse over time until last week.
     The software, by the way, was Citrix Workspace, which was at one time known to cause issues and conflicts on dual-monitor workstations that used GPUs. That was over 10 years ago, but I'm sure we're all familiar with "it's fixed" being only partly, often barely, true. Once I realized when everything started, I decided on one last-ditch effort. I had already uninstalled the culprit software, but that hadn't fixed the issue, so something must have been left over in the system - unsurprising. By the time I figured this out I was close to RMA-ing my card. I had tried every "fix" for this out there; none worked and some made it worse.
     Last ditch: format the C drive and reinstall Windows. I also got a new surge protector with a tiny UPS in it, 2 new certified DP cables (I was using 1 HDMI and 1 DP cable of suspect quality before), and completely redid my peripheral cable management. My system has been stable and working perfectly, with no errors or warnings in the event log at all, for 48 hours. Temps are even a little cooler. In my case it was corruption caused by software. Always check out that possibility: trace the events back to the date they first appeared and try to remember what you did that day.
  2. I see there are no new suggestions or testing thoughts/advice? I have some updates, though. I posted in the EVGA forums since my card/cooling plate and PSU are EVGA, and I was pointed to this thread: https://forums.evga.com/Comprehensive-Windows-10-Black-Screen-Trouble-shooting-Guide-m3131813.aspx which got me to look at my Event Viewer. Silly me, I had defaulted straight to hardware. Lo and behold: nvlddmkm Error, Event 14, followed by a Display warning, at exactly the times I've been logging.
     Looking this up further, and because the thread above is mainly for the RTX 30 series, I found this: Of the many threads I have found on the topic across nVidia, EVGA and Reddit so far, it is the most comprehensive. I'm going to investigate it more and see if there's a "fix" I haven't tried yet. The issue appears to be hardware acceleration in applications that idle a lot and do not put much load on the card - the game I play most fits that bill too, considering I can play it on the CPU graphics. At 7 FPS, but it does play. If anyone has run into this and has the answer that worked, please save me a lot of reading, stress, and a potential OS re-install day and let me know?
     Meanwhile, I implemented most of the "fixes" from the first thread - the ones the RTX 20-series owners all agreed on. Many of them I already had in place; some I did not. Last night the system was pretty stable, but this morning I had a driver failure that lost the whole desktop and I had to reboot. I'm starting to back everything up in prep for a whole system re-install. I'm also looking at re-arranging some furniture and my cable management temporarily to see if I can plug the computer directly into a wall outlet, and at buying some new certified DP cables for the monitors to throw that possibility out the window too.
  3. Edit to update: I ran the OCCT PSU stress, GPU stress, and VRAM pattern tests. The stress tests I only ran for 5 minutes each. I suppose I could go longer, but that's stressing my system far more than anything I put it through to cause this power fluctuation, and the EVGA tech said I should know pretty quickly, within the 5 minutes I ran it, if the PSU or GPU were the issue - within those 5 minutes they should cause the same problem or cut out. I was super nervous regardless. My CPU temps got really high on the PSU stress test, so I really don't want to run more than those 5 minutes unless there's a very good reason. I could test the VRAM more; that didn't actually put a whole lot of pressure on the system and didn't find any issues. Anyway, none of the tests found any issues and my machine did not replicate the problem. So not a PSU or a GPU problem after all? What does that leave? CPU and RAM? And then, as I was writing this, it happened again and OCCT failed to reload. I didn't have Afterburner going at the same time. I am really at the end of my proverbial rope with this... Edit: Welp, I tried to update my last post but somehow ended up quoting myself, and now I can't delete this post and add it to the last one. That's odd and annoying... My apologies.
  4. Another dupe post, apologies, but I'd like to ask if anyone has opinions on OCCT testing? I just got off the phone with EVGA tech support. Both the PSU and the GPU are EVGA products and within warranty, so I contacted them to ask for help and also to see about RMA possibilities. I am going to continue running the graphics off only the CPU for the rest of today, but my concern is that the real power draw on the PSU is the GPU; without one being used in place of mine, it's not really a valid test of the PSU to see if that's the problem. The person I spoke with suggested I try the OCCT PSU test and also their GPU test - based on which one fails, we'll know where the problem lies. I can get an RMA on either component, though availability of a new card may be a different issue entirely. I've never really heard of these tests, though it sounds like a good idea...
  5. Apologies for the double post, but I wanted to update with some interesting info. I am indeed able to disable the GPU without removing it. The Asus BIOS has a setting under Advanced that lets the user choose which GPU to use, either integrated or PCIe. There are a few combinations of choices, but what I did was enable the dual-monitor setup on the onboard graphics, previously set to disabled; the setting for which one to use preferentially then switches automatically to Onboard. Save and reset, switch my cables to the mobo I/O as it reboots, then go into Device Manager and disable the nVidia card. Currently GPU-Z, HWiNFO64 and Afterburner are not registering the nVidia card and are only showing the onboard Intel graphics. I have never actually wanted that to happen before. Until now. I really don't want it to be the GPU; the PSU is far easier to replace. Crossing my fingers.
  6. I'd wondered about the possibility of it being the surge protector or something... It's certainly a thought; I may give it a try. I went all morning without an incident and was getting confident, so over lunch I fired up a game to push it and see if it was really stable. About 15 minutes in, it was triggered again. A couple of Afterburner shots show the power drop on both CPU and GPU. There are no temperature or usage spikes before it drops, and it doesn't drop enough to shut the machine off - just enough for the monitors to blank and the game video to not reload. There is a quick spike after everything comes back up. The drop always lasts 2 seconds, and something I'm beginning to notice is that it seems to happen at the same 2 seconds of the minute each time. I'm not yet certain whether the hour and minute are also the same, so I have started a log (see the timing sketch after this list for how I plan to check it). We shall see, and hopefully I get this sorted before the list is long enough to show a real pattern. So the issue isn't that software I had installed, and it doesn't appear to be the driver. But it's just weird to me that it's recent. I'm going to try disabling my GPU and running off the integrated graphics now, see if that works. And if I manage it, I'll see if it happens again.
  7. I'm pretty sure there's a setting in the BIOS that allows me to bypass the card so I can force use of the integrated graphics. I'll look into that more deeply if I end up needing to. Thank you for your help so far!
  8. OK, so the system had those power dropouts again last night after the BIOS update. Boo. I did not use MemTest86 because I didn't have time to create a bootable flash drive yesterday; I might be able to do that tonight if problems still persist. I ran the memtest Jurrunio suggested in 12 instances (sorry, I know you said 14), each set to 2670 MB, and let them go all night (see the sizing sketch after this list for how the RAM gets split). This memtest found no problems. Screenshot attached.
     I had a couple of ideas overnight about possible software issues that might be causing this, because the problem really only started after I installed some software (Citrix Workspace) for some volunteer work I was doing. So I uninstalled that and a few other things, and also adjusted my sleep settings to "never". I updated my graphics driver - a new one just came out - and ran a Windows update to boot. I'll be monitoring my system with CPU-Z and Afterburner today, so if a power dropout happens again I can catch it on more than just the GPU - I have seen it before across the whole system, but the only screencap I have shown is of the GPU.
     Assuming I will need to move forward with disabling the graphics card and running on the onboard Intel graphics... can I simply unplug the GPU from the PSU so it gets no power? Do I also need to disable the device in Device Manager, or would that be enough to make the card undetected and force use of the onboard graphics? Because of course I did some severe cable management through the back of the case, so while the cables would reach had I not done that, as things stand? Nope. Hell, I am not entirely sure I can get the PSU out of the case without draining the system and pulling some pipes anyway. WOOO! Go me! Way to plan ahead! Aesthetics or die! I might be able to get it out, but I'll have to be super patient and gentle. Or it could be an excuse to clean and change out my coolant.
  9. Thank you! I figured that because you mentioned stress testing the RAM, Prime95 would be better, but I'm happier running memtest unwatched overnight if need be. I just updated to the latest BIOS. Hopefully that fixes it and that's that. Fingers crossed! If I have any issues tonight, I will run memtest overnight and see if it comes up with anything.
  10. Hmm, no, I didn't check the RAM. I can run Prime95 tonight and see how that goes. It looks like there have been a few new BIOS versions since I last updated... It's something I rarely think of, because I try to leave the BIOS alone, but this is a case where it makes sense. Thank you, I'll give it a try.
  11. Update: OK, so I guess that was a dumb question - yes, I can disable the GPU from the BIOS and then Device Manager. I'll give that a try when I have a bit of time. If it is the PSU, it seems I'm still in trouble, because the pipes look like they may block removal of the PSU... maybe. Sigh. Welp, that's unfortunately not a possible course of action. First, I can't borrow one of these from someone else; IDK anyone who has one, and IDK any gamers near me anyway. Second, this is a liquid-cooled rig - the GPU is liquid cooled too, with hardlines - so I can't just pull components out. And even if I could, I have no air cooler and fan to replace the liquid block on the CPU while I test that way. Limitations of the choice I made going with liquid over air, I guess. Is there a way to simply disable the GPU without taking it out of the machine? Otherwise I may have to just swap out the PSU, I guess.
  12. If it were a VRM issue, I wouldn't see this on the CPU as well, would I? So assuming it's the PSU, any ideas for tests I could run to prove it?
  13. Hi! I've been trying to solve this for weeks and am stumped about what to do next. I could really use some help. Essentially, I am happily toodling along - working, gaming, whatever - and both my monitors just cut out. It's completely random. If I am gaming, when they come back up the game window often will not reload and I have to force close the game and re-log.
      Specs:
      Asus ROG MAXIMUS XI HERO (WI-FI) ATX LGA1151
      Intel Core i9-9900K 3.6 GHz 8-Core Processor
      EVGA GeForce RTX 2080 Ti 11 GB FTW3 ULTRA
      G.Skill Trident Z Royal 32 GB (4 x 8 GB) DDR4-3600, CL17
      EVGA P2 1200 W 80+ Platinum Certified Modular ATX
      OS Drive: Samsung 970 Evo Plus 250 GB M.2-2280 NVMe SSD
      OS: Microsoft Windows 10 Pro 64-bit
      Dual monitors: one is an Acer Predator connected via DVI, the other a Samsung connected via HDMI
      Everything is on a strong surge protector. GPU and CPU are liquid cooled.
      I have cleaned with DDU in Safe Mode and re-installed the graphics drivers. I benchmarked with Heaven and Superposition - stable, with good framerates and respectable scores considering I'm not overclocking. There's no overheating or spike in usage; even benchmarking on extreme the CPU doesn't really get above 50 °C and the GPU stays at about 55 °C. And it happens even when idling anyway. Monitoring with Afterburner and GPU-Z, when this happens it looks like there's a power drop-off throughout the system. Attached is a GPU-Z screenshot of the issue; I notice the same thing in Afterburner on the CPU. I'm starting to suspect the PSU and am unhappy about that prospect in so many ways. I have no way to test it and no backup...
  14. I didn't say that. I said it becomes mostly moot when dealing with a single closed loop. Things change a bit when dealing with dual, split loops, because there are things you can definitely do to help optimize heat exchange with those. You are talking about two different CPU coolers here, a Corsair H55 and a Corsair H100i. If I understand correctly, you would be using the H55 on your current rig, but when you switch over to Ryzen you would move up to the H100i? In either build, the best heat exchange for good airflow in the case would probably be to put the CPU cooler fan(s) on intake and the GPU radiator on exhaust. This is because the GPU loop will usually be dissipating more heat than the CPU, and it's better to avoid dumping that heat into the case airflow if at all possible - it improves cooling for your chipset, VRMs and RAM. These AIO rads are 120s or 240s, so if you have room for another 120 mm fan or two alongside either rad, add them and run them in the same direction as that rad - intake with intake and exhaust with exhaust. The temperature difference may only be a few degrees, but if you can easily get those few degrees, go for it.
  15. That video only shows that it makes a difference if and only if you are running an air-cooled GPU with an open-air shroud and your radiator is on your exhaust fans. A more accurate test would have added steps: change the top mount to intake and measure, and change the front mount to exhaust and measure. An open-air shroud inside the case releases more heat directly into the case airflow. I would expect the issue is less about where the radiator is placed in the box and more about passing hotter air over the radiator versus pulling fresher, cooler air over the fins. The more heat-soaked the air, the less heat it can take up from the fluid in the loop, and thus from the component the loop is trying to cool (see the rough heat-transfer sketch after this list). So a front-mounted exhaust versus a top one I would expect to be almost the same, maybe a degree or two under that 86-degree measurement. But he didn't do that test, so we don't know for sure. Conclusions at 12:54.
      To your question about adding liquid cooling to your GPU: if your configuration is a single closed loop, then no, in most cases your rad position becomes pretty moot. Far less heat is released directly into the case because the GPU is on the water loop - though your loop and rads will definitely be hotter from the GPU, so you can expect slightly higher CPU temps than you would see on a dedicated CPU loop, or on dual loops with one loop for the CPU and one for the GPU. RAM and motherboard chips will still be releasing heat and will still need airflow and cooling, of course, but they don't release nearly the heat into the case that an open-air GPU does. The GPU is the biggest heater of any component in a system, and in a small room some can even change the room's overall temperature over time.
      You always need intake for airflow and you always need exhaust, and in most cases each side of the flow is going to have some radiator on it - at the very least a 240 on each - unless your case is specifically designed with more airflow options that let you pull good intake somewhere without radiators. Without a concept case like that, I'm curious how one could avoid pulling intake over a rad and still maintain good airflow and positive-to-neutral air pressure. But everything I've seen shows that passing intake over a rad in a full custom loop doesn't make more than a couple of degrees of difference, whereas overall temperatures are affected far more by the ambient temperature of the room the computer is in, the clock speeds of the GPU and CPU, and the speed of the fans on the rads than by the actual positioning of the rads and fans themselves.
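For the event-log tracing mentioned in posts 1 and 2, here is a minimal sketch of one way to pull the nvlddmkm entries oldest-first so you can see the date they started. It assumes Windows 10's built-in wevtutil tool and Python; the provider name matches the errors described above, but the event count is just a default to adjust.

```python
# Minimal sketch: list nvlddmkm events from the Windows System log, oldest
# first, so the first hit shows the date the errors began. Assumes Windows 10
# and the built-in wevtutil tool; adjust the provider/count for your own case.
import subprocess

QUERY = "*[System[Provider[@Name='nvlddmkm']]]"

def dump_nvlddmkm_events(max_events: int = 50) -> str:
    """Return the oldest nvlddmkm events from the System log as plain text."""
    result = subprocess.run(
        [
            "wevtutil", "qe", "System",
            f"/q:{QUERY}",       # filter to the nvlddmkm provider
            "/f:text",           # human-readable output instead of XML
            f"/c:{max_events}",  # cap the number of events returned
            "/rd:false",         # oldest first, so the first entry is the start date
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(dump_nvlddmkm_events())
```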
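Post 6 mentions keeping a log of the dropouts to see whether they recur at the same second of the minute. This is a small sketch, assuming a hand-kept text file of timestamps (the file name dropouts.txt and the timestamp format are my own placeholders), of how that log could be tallied:

```python
# Tally which second of the minute each logged dropout landed on.
# Assumes a text file with one timestamp per line, e.g. "2021-02-03 14:37:42".
from collections import Counter
from datetime import datetime

def second_of_minute_histogram(path: str = "dropouts.txt") -> Counter:
    """Count how many logged dropouts landed on each second of the minute."""
    counts = Counter()
    with open(path) as log:
        for line in log:
            line = line.strip()
            if not line:
                continue
            stamp = datetime.strptime(line, "%Y-%m-%d %H:%M:%S")
            counts[stamp.second] += 1
    return counts

if __name__ == "__main__":
    for second, hits in second_of_minute_histogram().most_common():
        print(f":{second:02d}  x{hits}")
```

If the drops really do cluster at one or two seconds of the minute, the histogram will show one bucket dominating; a flat spread would suggest the timing pattern is coincidence.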
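Post 8 splits the RAM test across 12 memtest instances. As a rough sketch of that arithmetic, assuming 32 GB of total RAM and an arbitrary 3000 MB left for the OS (both numbers are assumptions, not what the original run used):

```python
# Rough sizing for multiple memtest instances: leave some RAM for the OS and
# whatever else is running, then split the remainder evenly across instances.
def per_instance_mb(total_ram_mb: int = 32 * 1024,
                    os_headroom_mb: int = 3000,
                    instances: int = 12) -> int:
    """Return how many MB each memtest instance should be told to test."""
    usable = total_ram_mb - os_headroom_mb
    return usable // instances

if __name__ == "__main__":
    size = per_instance_mb()
    print(f"{size} MB per instance across 12 instances "
          f"({size * 12} MB covered in total)")
```

With these assumptions it lands a bit below the 2670 MB per instance used in the post; the idea is just to cover nearly all free RAM without starving the OS.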
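Post 15 argues that the temperature of the air reaching the radiator matters more than where the radiator sits. Here is a back-of-the-envelope sketch of that claim using the simple air-side relation Q ≈ ε·ṁ·c_p·ΔT; the effectiveness, airflow and temperatures below are all made-up but plausible assumptions, not measurements.

```python
# Back-of-the-envelope estimate of how much heat a radiator can dump into the
# airstream: the hotter the air hitting the fins, the smaller the temperature
# difference driving the exchange, so less heat is carried away per unit time.

AIR_DENSITY = 1.2     # kg/m^3, air near room temperature
AIR_CP = 1005.0       # J/(kg*K), specific heat of air
EFFECTIVENESS = 0.5   # assumed heat-exchanger effectiveness for a slim rad

def radiator_heat_rejection_watts(airflow_cfm: float,
                                  coolant_temp_c: float,
                                  intake_air_temp_c: float) -> float:
    """Rough upper estimate of heat (W) a radiator can reject to the airstream."""
    airflow_m3s = airflow_cfm * 0.000471947        # CFM -> m^3/s
    mass_flow = airflow_m3s * AIR_DENSITY          # kg/s of air through the rad
    delta_t = coolant_temp_c - intake_air_temp_c   # driving temperature difference
    return EFFECTIVENESS * mass_flow * AIR_CP * delta_t

if __name__ == "__main__":
    # Same assumed 360 mm rad and fans: fresh 24 C intake vs ~30 C case air.
    for label, air_in in (("fresh intake", 24.0), ("heat-soaked case air", 30.0)):
        q = radiator_heat_rejection_watts(airflow_cfm=150, coolant_temp_c=38.0,
                                          intake_air_temp_c=air_in)
        print(f"{label}: ~{q:.0f} W")
```

With the same airflow, shrinking the air-to-coolant difference from 14 °C to 8 °C cuts the estimated heat rejection by roughly 40 percent, which is the "heat-soaked air" effect described in the post.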