Second graphics card randomly completely disconnecting

ThorntonStrolia

I am wondering if someone has an idea why this is happening. Now, I cannot say for certain it is not my motherboard, which is relatively old and has seen better days; it is possible it is just the second PCIe slot. But if there is a known reason for this that I just can't find, I am hoping to start there instead of jumping to parts, since I don't have another motherboard lying around to test with and be sure. I have two RTX 30 series cards, a 3070 and a 3090 (side note before I continue: this also happens with my 10 and 20 series cards). Whatever card is plugged into the bottom of my two PCIe slots works for a while, like a couple of hours, and then randomly cuts out. It appears to still be getting power, but Windows won't recognize it. I have a 1300 W PSU, so I am not too concerned about power limitations, although I suppose it is possible. My UPS reads only 530 W at peak draw even with both 30 series cards, so I don't think that is it. I had heard recently that some power supplies don't like the power requirements of 30 series cards and can have stability problems with them. I don't know if that is true, or how it would explain why one card cuts out and not the other, but maybe?
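As a rough sanity check on the power question, here is a back-of-the-envelope budget sketch. The TDP figures below are nominal board/package numbers and the "other" headroom is an assumption; real transient spikes on 30 series cards can briefly exceed nominal draw by a wide margin, which a UPS wattage readout will not catch.

```python
# Rough power budget sketch. TDP figures are nominal board/package
# power; transient spikes can be much higher than these numbers.
TDP_W = {
    "RTX 3070": 220,   # nominal board power
    "RTX 3090": 350,
    "i7-7700K": 91,    # CPU package TDP
}
OTHER_W = 75           # assumed allowance for drives, fans, board

def budget(psu_watts, parts):
    """Return (nominal draw, remaining headroom) for a parts list."""
    draw = sum(TDP_W[p] for p in parts) + OTHER_W
    return draw, psu_watts - draw

draw, headroom = budget(1300, ["RTX 3070", "RTX 3090", "i7-7700K"])
print(f"nominal draw ~{draw} W, headroom ~{headroom} W")
```

On these assumed numbers a 1300 W unit has plenty of steady-state headroom, which is consistent with the 530 W UPS reading; it doesn't rule out millisecond-scale transients tripping a rail.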

 

I may be misremembering, but I vaguely recall first experiencing the issue when I got my 3070, which I got before the 3090. I was running the 1080 in slot 1 for my monitors and the 3070 in slot 2, and the 3070 cut out. I freaked out for a few minutes, then swapped the cards; it worked for about a week and then the 1080 cut out. I thought the 1080 had died, but having not tested it to be sure yet, I am now not so positive.

 

Anyway, I know the only way to be SURE would be to test everything on a new board and run stability checks, and if that yields nothing, try a different PSU. I HAVE already tried different cables, and swapping cables seems to make no difference. I am just wondering if someone more knowledgeable than me about the system as a whole may know of issues like this relating to 30 series cards, since that's about when I remember this starting. Or, if you think this is something else, help me get an idea of what it might be and what specific things I should troubleshoot for. One way or the other I do plan to get a new motherboard sooner or later, but with the new Ryzen stuff coming I will probably hold off a short while for that.

Why are you running multiple video cards?

Also, have you simply tested each one in each slot, one at a time to see what happens?

NOTE: I no longer frequent this site. If you really need help, PM/DM me and my e-mail will alert me. 

31 minutes ago, Radium_Angel said:

Why are you running multiple video cards?

Also, have you simply tested each one in each slot, one at a time to see what happens?

I am a computer animator. For me, more RT cores + more CUDA cores + more minimum VRAM (that 3070 is really killing me with its 8 GB) means more better. That is the short version. Yeah, all the cards (barring the 1080 Ti, which I haven't tested) work in the top slot all the time. Once the issue happens, the card in the bottom slot will still send a video signal during POST, but strangely it will not be recognized by Windows OR the BIOS. I will get a video signal for those first few seconds while it is asking me if I want to boot into the BIOS. But all this is ONLY if I don't remove the card from the bottom slot once the issue starts. If I take both cards out and run a new one fresh in the bottom slot, it will just keep working. The issue only happens after two cards have been working and registered by Windows for a few hours; then the second card just drops, like a thumb drive that got pulled out incorrectly.

2 hours ago, ThorntonStrolia said:

Yeah all the cards (barring the 1080 ti which I haven't tested) work in the top slot all the time

And if you try just in the bottom slot?

3 hours ago, ThorntonStrolia said:

My UPS reads only 530 watts at peak draw even with both 30 series

That seems way too low with a 3070 and a 3090. 

What are your system specs? CPU, motherboard, any and all SATA/PCIe devices, and the exact model of PSU. Maybe that can give us some hints.

A 3070 and a 3090 can potentially need up to 5 PCIe power connectors. How many run back to the PSU as their own cable? Are you using pigtails in here? 

Do you have a second PSU you could use in a dual-PSU setup for testing, to make sure it's not cutting out due to a lack of connectors?

Does Event Log show anything? 

 

Edit: Just a side note on multi-GPU setups. My F@H rig has a 2070, two 1080s, and a 1070. I had a weird thing where I could not for the life of me get the 1070 and a 1080 to be recognized or used. After days of messing with it, one restart later it all just decided to work and has since. 

I'm not actually trying to be as grumpy as it seems.

I will find your mentions of Ikea or Gnome and I will /s post. 

Project Hot Box

CPU 13900k, Motherboard Gigabyte Aorus Elite AX, RAM CORSAIR Vengeance 4x16gb 5200 MHZ, GPU Zotac RTX 4090 Trinity OC, Case Fractal Pop Air XL, Storage Sabrent Rocket Q4 2tb, CORSAIR Force Series MP510 1920GB NVMe, CORSAIR Force Series MP510 960GB NVMe, PSU CORSAIR HX1000i, Cooling Corsair XC8 CPU block, Bykski GPU block, 360mm and 280mm radiator, Displays Odyssey G9, LG 34UC98-W 34-Inch, Keyboard Mountain Everest Max, Mouse Mountain Makalu 67, Sound AT2035, Massdrop 6xx headphones, Go XLR 

Oppbevaring

CPU i9-9900k, Motherboard, ASUS Rog Maximus Code XI, RAM, 48GB Corsair Vengeance LPX 32GB 3200 mhz (2x16)+(2x8) GPUs Asus ROG Strix 2070 8gb, PNY 1080, Nvidia 1080, Case Mining Frame, 2x Storage Samsung 860 Evo 500 GB, PSU Corsair RM1000x and RM850x, Cooling Asus Rog Ryuo 240 with Noctua NF-12 fans

 

Why is the 5800x so hot?

 

 

14 minutes ago, IkeaGnome said:

That seems way too low with a 3070 and a 3090. 

What are your system specs? CPU, motherboard, any and all SATA/PCIe devices, and the exact model of PSU. Maybe that can give us some hints.

A 3070 and a 3090 can potentially need up to 5 PCIe power connectors. How many run back to the PSU as their own cable? Are you using pigtails in here? 

Do you have a second PSU you could use in a dual-PSU setup for testing, to make sure it's not cutting out due to a lack of connectors?

Does Event Log show anything? 

No pigtails. 7700K, Asus Z270E, Samsung NVMe, EVGA 1300g. And it is not that weird to me at all; I should say MY peak draw, not a stress test. When I am rendering, maxing out the RAM, I sit right around 530 W average on both cards. Sometimes it goes up or down by like 80 watts, especially as it is initializing a new frame (increases to the low 600s) or ends a frame (may decrease to the high 400s; not as low as full idle, but that is because the scene is still loaded in VRAM). And yes, it is 5 of my 8 PCIe power connectors. Again, no pigtails, and it doesn't seem to matter what combo of cards I use. One in the top slot, fine. One in the bottom slot, fine. One in the top AND bottom, fine until the computer goes to sleep, or usually when I just stop looking at it for like 30 minutes. I come back and the second card is gone. At this point, if I take the top card out (whatever the top card is), the bottom card will still send video to show the Asus loading screen, but after that, nothing. Move it to the top slot, it works perfectly. BUT, if I move it back to the bottom slot and put NO card in the top slot, it works perfectly again, no issues.

 

Event Log does show, I believe, a warning. (I got bored, so I decided to finally ask about this. I have been running just the 3090 for about a month now, so I haven't had any issues recently; idle wattage is around 190 W and render wattage averages around 390-400 W or so. In fact, I am running Octane right now and my live preview is reading 235 W at 11 GB of VRAM usage.) If I recall, the warning says a device was disconnected, but it doesn't do anything special really. I don't recall the code it provided off the top of my head, but nothing that makes Windows shut down. Whatever card shuts off, the fans turn off, but the card is clearly still active because it reaches stupid temps. I took a contactless thermometer to one and it was reading around 70°C at idle.
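To pin down exactly when the card drops (and correlate it with sleep or idle time), a small polling script can timestamp the moment a GPU disappears. This is a sketch; it assumes `nvidia-smi` is on PATH and uses its standard CSV query output:

```python
import subprocess, time, datetime

def parse_gpus(csv_text):
    # "0, NVIDIA GeForce RTX 3090" -> {(0, "NVIDIA GeForce RTX 3090"), ...}
    gpus = set()
    for line in csv_text.strip().splitlines():
        idx, name = line.split(",", 1)
        gpus.add((int(idx), name.strip()))
    return gpus

def list_gpus():
    """Return the set of GPUs nvidia-smi can currently see."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
        capture_output=True, text=True,
    ).stdout
    return parse_gpus(out)

def watch(interval_s=60):
    """Log a timestamp whenever a GPU from the baseline set vanishes."""
    baseline = list_gpus()
    while True:
        missing = baseline - list_gpus()
        if missing:
            print(f"{datetime.datetime.now()}: GPU dropped: {missing}")
        time.sleep(interval_s)
```

Left running in the background, the printed timestamp would show whether the drop lines up with a sleep event in Event Viewer or just with idle time.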

4 minutes ago, ThorntonStrolia said:

asus z270e

Just to confirm, when you are talking about top and bottom slot, you mean x16/x8_1 and x4_1 right? Are you going into the bios and forcing that bottom slot into x4? 

If your GPUs have space for it, I'd start with putting one in the x8_1 slot and the x8_2 slot. Go into BIOS and set those to x8 mode. The top one might be trying to run at x16 still. See if that gets you a bit more stable. 

[BIOS screenshots of the PCIe settings]

Here's a bit of what I did to get my stuff working again. Forcing PCIe to Gen 3 in the BIOS rather than "Auto" seems to be what worked for me. I feel like I remember forcing them all into x8 as well, though. 

 

8 minutes ago, IkeaGnome said:

Just to confirm, when you are talking about top and bottom slot, you mean x16/x8_1 and x4_1 right? Are you going into the bios and forcing that bottom slot into x4? 

If your GPUs have space for it, I'd start with putting one in the x8_1 slot and the x8_2 slot. Go into BIOS and set those to x8 mode. The top one might be trying to run at x16 still. See if that gets you a bit more stable. 

[BIOS screenshots of the PCIe settings]

Here's a bit of what I did to get my stuff working again. It seems forcing BIOS into gen 3 and not "Auto" for PCIE worked for me. I feel like I remember also forcing them all into x8 slots as well though. 

 

It is an older mobo, so I do force it into Gen 3 just to be safe. I will definitely need a riser to get to the x4 slot; the 3090 takes up like three and a half slots, and not even a 1080 would fit in that slot to begin with, so it may be a while before I can do that sort of test. I had a power surge about a year ago that fried my old PSU and a few other parts, so my best guess is just that something is broken in the mobo, and likely a new one will fix it. But given that I have the cards now and I don't plan to upgrade the CPU and mobo for a few months, I am mostly just curious whether other people are having issues like this or if this is an isolated thing. It is sounding like the latter so far.

1 minute ago, ThorntonStrolia said:

3 and a half slots and not even a 1080 would fit in that slot to begin with so, it may be a while before I can do that sorta test.

Does it act up if you do the 3070 and the 1080? They might allow for use of that slot. 

Even if the 10 and 20 series cards act up, I'd put them in there to see if they act up with that second slot occupied and the top one forced into x8. 

4 minutes ago, ThorntonStrolia said:

It is sounding like the latter so far

I think you'd be surprised at how many issues people with non-SLI, mixed-GPU setups run into. It's a small portion of people that do it in the first place, but it seems like stuff just never wants to work quite right. 

 

5 minutes ago, ThorntonStrolia said:

PSU and a few other parts so my best guess is just that something is broken in the mobo and likely a new one will fix it.

What other parts? There's a chance that something in that bottom slot got all borked. But you'd think it'd act up more often.

 

I know you said the bottom slot randomly disconnects, but there's a pattern to it somewhere. Is it when you first start loading a frame and that slot would be pulling more power? Is it more when you're finishing up and it's not going back to idle correctly? 

12 minutes ago, IkeaGnome said:

Does it act up if you do the 3070 and the 1080? They might allow for use of that slot. 

Even if the 10 and 20 series cards act up, I'd put them in there to see if they act up with that second slot occupied and the top one forced into x8. 

I think you'd be surprised at how many issues people with non-SLI, mixed-GPU setups run into. It's a small portion of people that do it in the first place, but it seems like stuff just never wants to work quite right. 

 

What other parts? There's a chance that something in that bottom lane got all borked. But, you'd think it'd act up more often.

 

I know you said the bottom slot randomly disconnects. There's a pattern to it somewhere. Is it when you first start loading a frame and that slot would be bogged down with more power? Is it more when you're finishing up and it's not going back to idle correctly? 

Strangely, no. I think it is power related, because I never have the issue if I keep using the computer from the moment I plug the card in and power up. But whether I use the computer for 10 minutes or 50 hours, without fail, if I step away for 30 minutes or it goes to sleep and I come back, the second card is gone, even in the BIOS. I don't know a whole lot about the PCIe slot connectors, but if there is a pin in the connector that sends a signal when the computer is waking up or going to sleep, I wonder if that pin got shot, so that slot goes to sleep and just doesn't come back. Sometimes restarting the computer fixes it, but usually the only fix is to take the card out, power on, power off, put the card back in, power on, and it reads the second slot again.
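Since the drop seems to correlate with sleep/idle, one thing worth ruling out before blaming the slot hardware is PCIe Link State Power Management, which can put an idle link into a low-power state. On Windows, a sketch using the standard powercfg aliases (run from an elevated prompt) would be:

```shell
:: Set "PCI Express -> Link State Power Management" to Off
:: on the current power plan (run as administrator).
powercfg /setacvalueindex SCHEME_CURRENT SUB_PCIEXPRESS ASPM 0
powercfg /setdcvalueindex SCHEME_CURRENT SUB_PCIEXPRESS ASPM 0
powercfg /setactive SCHEME_CURRENT
```

If the card still vanishes with link power management off, that points back toward the slot or board rather than a power-state handshake.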

 

As far as the 3070 and 1080 Ti, they don't quite fit. Just by a hair. All the cards are big triple-fan coolers, so yeah, not exactly the smallest things in the world. They really need that SLI spacing.

 

I really didn't know most people don't run two cards without SLI. It is pretty normal not to SLI or NVLink cards in 3D rendering, unless you are linking Quadros in a workstation or Teslas in a server or something.

 

A few SSDs died as well. Anything that wasn't Samsung was dead on the spot. I have used Samsung drives ever since, because even the only one that died still works externally once in a while, the other couple of Samsung drives lived, and all the other drives were toast. One graphics card was shot. How badly, I don't know, but it was an oldie and still worked for a few weeks after, so I didn't think much of it until one day I smelled a weird smell while rendering and realized the fans weren't running on the cooler. Cool. Then after I restarted, it just never turned back on.

1 minute ago, ThorntonStrolia said:

I wonder if that pin got shot and then that lane goes to sleep and just doesn't come back.

Easy way to test that would be to use the cables from the 3090 on the 3070. If you swap the two off the 3070 onto the 3090, the 3090 should become the one with the problem. 

You could also give both ends of that cable the old visual inspection. Look for discoloration, broken pins, etc. 

[PCIe pinout diagram]

If the problem is always in the bottom slot, I'd guess damage to that slot, damaged cable, or BIOS isn't set up correctly. 

If you go into BIOS>Tool>Graphics Card information, how are they showing up? When it crashes, does the bottom slot GPU even show up in BIOS if you don't do the reset stuff? 

 

6 minutes ago, IkeaGnome said:

Easy way to test that would be to use cables from the 3090 on the 3070. If you swap the two off the 3070 onto the 3090, it should start being the one with that problem. 

You could also give both ends of that cable the old visual inspection. Look for discoloration, broken pin etc. 

[PCIe pinout diagram]

If the problem is always in the bottom slot, I'd guess damage to that slot, damaged cable, or BIOS isn't set up correctly. 

If you go into BIOS>Tool>Graphics Card information, how are they showing up? When it crashes, does the bottom slot GPU even show up in BIOS if you don't do the reset stuff? 

 

I meant in the PCIe slots themselves. The PSU that I had is long gone, dead and buried; it wouldn't stay on anymore after that. It was a Corsair 850. I am on the EVGA now because, when I got it, it was the only thing that wasn't 2000 W that could handle my graphics cards' needs. And no, it doesn't show up in the BIOS. When it works it does, but once it stops working, the BIOS shows nothing plugged into the slot.

1 minute ago, ThorntonStrolia said:

No doesn't show up in bios.

Does it look like it's turning on at the same time as the other GPU? 

2 minutes ago, IkeaGnome said:

Does it look like it's turning on at the same time as the other GPU? 

I haven't tried recently enough to say for sure. I mean, yeah, to the best of my recollection, all the RGB and things come on. Fans don't start in that state, but whatever card is in the second slot will be doing SOMEthing, because it gets hot as hell regardless of which card it is. Everything sort of seems to come on all at once.

9 minutes ago, ThorntonStrolia said:

all the RGB and things comes on. Fans don't start in that state but whatever card is in the second slot will be doing SOMEthing because it gets hot as hell regardless of which card it is.

That sounds like an initialization error to me. You don't have a 4x card you could put in there next time it acts up, do you? WiFi card etc. 

If it's messed up somewhere in the x8 or x16 slot, you might be able to get away with a riser. Throw an x4 riser in and put the GPU in that. That would at least bypass the slot itself. I'm not sure how much of the x8 bandwidth you need, or if that would hamper what you're doing with editing. 

19 minutes ago, IkeaGnome said:

That sounds like an initialization error to me. You don't have a 4x card you could put in there next time it acts up, do you? WiFi card etc. 

If it's messed up somewhere in the x8 or x16 slot, you might be able to get away with a riser. Throw a x4 slot riser in, put the GPU in that. That would avoid motherboard at least. I'm not sure how much of the x8 bandwidth you need or if that would hamper what you're doing with editing. 

It wouldn't. You can render at x2 just fine. It is not uncommon to run four or more cards on an i9 or something. People have been screaming "don't do it" forever, as if it is going to slow anything down. With 4000-7000 CUDA cores per card, it will slow down the initialization of a render by a couple of seconds, but then the render time is pretty much directly proportional to your CUDA cores, so it is frequently one step backwards and 4000 forwards, so to speak. More CUDA, more better.
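That scaling argument can be written as a toy model: a fixed per-frame initialization cost plus work that divides across the available CUDA cores. The work and timing numbers below are made up for illustration; only the core counts (5888 for a 3070, 10496 for a 3090) are real.

```python
def frame_time(work_units, cuda_cores, init_s=3.0, unit_s=2000.0):
    """Toy model: fixed init cost + work split evenly across cores.

    work_units/unit_s are illustrative, not benchmarked numbers."""
    return init_s + work_units * unit_s / cuda_cores

one_card = frame_time(100, 5888)           # 3070 alone
two_cards = frame_time(100, 5888 + 10496)  # 3070 + 3090 pooled
print(round(one_card, 1), round(two_cards, 1))  # 37.0 15.2
```

The fixed `init_s` term is the "one step backwards": it doesn't shrink with more cards, but the dominant per-frame term does, which is why a slower PCIe link mostly costs you during scene upload rather than during the render itself.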
