
Intel Skylake CPU flaw is making systems crash under heavy load.

GoodBytes

1) For anyone worried that Prime95 kills the CPU: set a maximum power limit in the BIOS (I use 100 watts for my i7-4790K). Under Prime95 it stays pinned at 100 watts and downclocks by about 100 MHz, but it doesn't go wild.

2) Prime95 runs some of its tests by calculating FFTs. The FFT is one of the most widely used functions in signal processing. Sure, real code isn't just a loop around an FFT, since you also use the results, but it still has to compute one hundreds of times per second. It had better not crash.

3) A CPU is an extremely complex product. With multiple cores, cache management and millions of possible race conditions, it's very difficult to catch every possible event. It's bad for us consumers, but from an engineering point of view I can understand it.

4) I prefer a fix with a 5% performance hit over an unstable system. Also, a crash is better than an instruction returning a wrong result without the system noticing.
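
To make points 2 and 4 a bit more concrete, here is a minimal sketch of FFT-based multiplication with a self-check. It is not Prime95's actual code, just a toy illustration of the idea: multiply two integers through an FFT, then compare against exact integer arithmetic so a wrong result gets noticed instead of passing silently. The function names (fft, fft_multiply, self_check) are made up for this example.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT. len(values) must be a power of two.

    With invert=True this returns the unscaled inverse transform;
    the caller divides by the length.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_multiply(a, b):
    """Multiply two non-negative integers by convolving their decimal digits."""
    digits_a = [int(d) for d in str(a)][::-1]  # least significant digit first
    digits_b = [int(d) for d in str(b)][::-1]
    size = 1
    while size < len(digits_a) + len(digits_b):
        size *= 2
    fa = fft(digits_a + [0] * (size - len(digits_a)))
    fb = fft(digits_b + [0] * (size - len(digits_b)))
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # Scale the inverse transform and round away floating-point noise.
    coeffs = [round(c.real / size) for c in product]
    # Summing coeff * 10**power performs the carry propagation implicitly.
    return sum(coeff * 10 ** power for power, coeff in enumerate(coeffs))


def self_check(a, b):
    """Return True if the FFT product matches Python's exact integer product."""
    return fft_multiply(a, b) == a * b


if __name__ == "__main__":
    print(self_check(123456789, 987654321))
```

A real stress test would loop over checks like this with much larger inputs. The point is only that a mismatch in self_check is a detectable failure, which is the kind of failure you want.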

Mineral oil and 40 kg aluminium heat sinks are a perfect combination: 73 cores and a Titan X, Twenty Thousand Leagues Under the Oil

Link to comment
Share on other sites

Link to post
Share on other sites

The more I read posts on this forum about Prime95 and exploding CPUs, the more apparent it is that people have absolutely no idea what Prime95 is, what it does, and what it was even invented for x.x


The more I read posts on this forum about Prime95 and exploding CPUs, the more apparent it is that people have absolutely no idea what Prime95 is, what it does, and what it was even invented for x.x

I have no idea where they got these things from. I think it was all made up here to justify unstable CPU overclocks or something like that; it's pretty specific to this forum.


Since there's no competition, Intel has started slacking on its CPUs. No performance gain is one thing, but a bad Haswell and now a bad Skylake? Meh, they aren't doing their best at all.


Well, if it's something a BIOS update can fix, I guess no harm done.

What I'm interested in is whether they'll take this opportunity to axe overclocking on locked CPUs... :rolleyes:

Edit: Wow I should read the whole post...

Pixelbook Go i5 Pixel 4 XL


I have no idea where they got these things from. I think it was all made up here to justify unstable CPU overclocks or something like that; it's pretty specific to this forum.

Prime95 can override manual voltage settings in the BIOS. It can potentially fry your CPU, as it will request more voltage than is allowed. It has nothing to do with justifying unstable overclocks. It is a very real danger.


The more I read posts on this forum about Prime95 and exploding CPUs, the more apparent it is that people have absolutely no idea what Prime95 is, what it does, and what it was even invented for x.x

 

JFC the cancer in this thread, please save me

 

I really haven't heard a single thing about "Skylake crashing under heavy load". The XTU benchmark is essentially Prime95, and I have boatloads of friends who bench it all the time. I've never heard them mention crashes unless the OC was unstable.

I have no idea where they got these things from. I think it was all made up here to justify unstable CPU overclocks or something like that; it's pretty specific to this forum.

 

It's because 90% of this forum are mindless sheep that just copy paste what some other uninformed idiot posted.

 

Prime95 can override manual voltage settings in the BIOS. It can potentially fry your CPU, as it will request more voltage than is allowed. It has nothing to do with justifying unstable overclocks. It is a very real danger.

With ADAPTIVE voltage on HASWELL, yes it can.

Stuff:  i7 7700k @ (dat nibba succ) | ASRock Z170M OC Formula | G.Skill TridentZ 3600 c16 | EKWB 1080 @ 2100 mhz  |  Acer X34 Predator | R4 | EVGA 1000 P2 | 1080mm Radiator Custom Loop | HD800 + Audio-GD NFB-11 | 850 Evo 1TB | 840 Pro 256GB | 3TB WD Blue | 2TB Barracuda

Hwbot: http://hwbot.org/user/lays/ 

FireStrike 980 ti @ 1800 Mhz http://hwbot.org/submission/3183338 http://www.3dmark.com/3dm/11574089


Personal experience =/= everyone's experience.

Most people I know with Skylake, myself included, haven't experienced a crash in our games, including Fallout 4. I also render and bake a lot in Unity 3D, which hammers all cores at 100% for 15 minutes while it does computations and rendering. I am pretty sure that is more demanding than playing Fallout 4: that game doesn't go above 60% load on my i7 6700K, let alone load a single core to 100%.

If you experience crashes on a common workload then this is a different issue entirely. This topic is about crashing in an almost or absolute "worst case" workload. You could have a bad sample CPU, motherboard, RAM, video card,

I have the 6700K overclocked to 4.6 GHz. I verified the overclock for 24 hours in AIDA64 and for 4 hours in RealBench. Tomb Raider crashed the very first time I attempted to play it. It never crashed again after that. I thought that was unusual, as I had installed a fresh copy of Windows 10 and the latest drivers for my GTX 980.


The Skylake bug isn't Odd, it's Prime

P95 isn't any different from other distributed computation applications

it's the AVX2 instructions that are heavily taxing on Intel chips

desktop Haswell chips increase voltage and turn into mini-supernovas

Xeon Haswells reduce clock speed but stay at the same voltage to manage the heavy hit from AVX2

I suppose the same happens on Skylake: it increases the voltage too much and triggers some on-chip protection that causes it to halt

it's just 22nm vs 14nm, and how they deal with extreme voltages is what causes the difference between the two. I wonder what happens on Broadwell chips )))

would be pretty funny if we could witness the "Halt-Catch-Fire" bug again )))

CPU: Intel i7 5820K @ 4.20 GHz | MotherboardMSI X99S SLI PLUS | RAM: Corsair LPX 16GB DDR4 @ 2666MHz | GPU: Sapphire R9 Fury (x2 CrossFire)
Storage: Samsung 950Pro 512GB // OCZ Vector150 240GB // Seagate 1TB | PSU: Seasonic 1050 Snow Silent | Case: NZXT H440 | Cooling: Nepton 240M
FireStrike // Extreme // Ultra // 8K // 16K

 


It's because 90% of this forum are mindless sheep that just copy paste what some other uninformed idiot posted.

I believe it all started with Linus's video on Haswell overclocking

CPU: Intel i7 5820K @ 4.20 GHz | MotherboardMSI X99S SLI PLUS | RAM: Corsair LPX 16GB DDR4 @ 2666MHz | GPU: Sapphire R9 Fury (x2 CrossFire)
Storage: Samsung 950Pro 512GB // OCZ Vector150 240GB // Seagate 1TB | PSU: Seasonic 1050 Snow Silent | Case: NZXT H440 | Cooling: Nepton 240M
FireStrike // Extreme // Ultra // 8K // 16K

 


The Skylake bug isn't Odd, it's Prime

P95 isn't any different from other distributed computation applications

it's the AVX2 instructions that are heavily taxing on Intel chips

desktop Haswell chips increase voltage and turn into mini-supernovas

Xeon Haswells reduce clock speed but stay at the same voltage to manage the heavy hit from AVX2

I suppose the same happens on Skylake: it increases the voltage too much and triggers some on-chip protection that causes it to halt

it's just 22nm vs 14nm, and how they deal with extreme voltages is what causes the difference between the two. I wonder what happens on Broadwell chips )))

would be pretty funny if we could witness the "Halt-Catch-Fire" bug again )))

I believe it all started with Linus's video on Haswell overclocking

 

 

It only does it on adaptive voltage; if you set the voltage manually it doesn't. Not to mention it's not the only benchmark/stress test that will overvolt. The problem is just Haswell and its stupid FIVR. Skylake has no problems running P95, from what I have seen in overclocking threads on OCN and from people selling CPUs validated at high overclocks with Prime95.

Not to mention Skylake is more voltage resilient than Haswell is; 1.4 V 24/7 is no problem with proper cooling.

Stuff:  i7 7700k @ (dat nibba succ) | ASRock Z170M OC Formula | G.Skill TridentZ 3600 c16 | EKWB 1080 @ 2100 mhz  |  Acer X34 Predator | R4 | EVGA 1000 P2 | 1080mm Radiator Custom Loop | HD800 + Audio-GD NFB-11 | 850 Evo 1TB | 840 Pro 256GB | 3TB WD Blue | 2TB Barracuda

Hwbot: http://hwbot.org/user/lays/ 

FireStrike 980 ti @ 1800 Mhz http://hwbot.org/submission/3183338 http://www.3dmark.com/3dm/11574089


As far as I can see, people just don't read on this forum; they go by their own convictions and hammer on them until everybody just gives up or the thread gets locked. :|

 

One user posted this:
http://www.tomshardware.com/news/skylake-prime-number-bug,30979.html

 

And here, I will highlight the important parts

 

 

 

Intel discovered that its latest 6th Gen (Skylake) Core processors have a bug that can cause the system to freeze or crash when calculating prime numbers.

 

So to elaborate, this isn't a Prime95-exclusive problem; it is a problem involving prime numbers. While that is certainly not a problem for the majority of regular consumers, IT IS A PROBLEM for enterprises that do this kind of research and scientific calculation, because it is not exclusive to consumer CPUs: it affects the whole Skylake architecture.

 

 

The bug is not limited to Prime95; it may also affect compute-intensive programs such as scientific and financial applications. This problem affects all Skylake processors ranging from the low-end Core M up to the top-end Xeon CPUs.

 

They will fix it soon and everything will hopefully be back to normal operation, so stop arguing about whether this is a problem on the grounds that Prime95 is so intensive that nothing is ever intended to run it, and look at the facts. This goes for you @Enderman


You don't have to go THAT far back, Broadwell has a problem too causing bluescreens when using MS Office 2013.

That was confirmed to be Microsoft's fault.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


People need to read the post again.

 

 

It says only under certain loads, so what they're saying is it needs to be a specific work load in order to cause the crash. Playing games is most likely not an issue.

System Specs:

CPU: Ryzen 7 5800X

GPU: Radeon RX 7900 XT 

RAM: 32GB 3600MHz

HDD: 1TB Sabrent NVMe -  WD 1TB Black - WD 2TB Green -  WD 4TB Blue

MB: Gigabyte  B550 Gaming X- RGB Disabled

PSU: Corsair RM850x 80 Plus Gold

Case: BeQuiet! Silent Base 801 Black

Cooler: Noctua NH-DH15

 

 

 


Feeling sad for those who experience this issue :( Wonder what Intel will do about this.

Connection200mbps / 12mbps 5Ghz wifi

My baby: CPU - i7-4790, MB - Z97-A, RAM - Corsair Veng. LP 16gb, GPU - MSI GTX 1060, PSU - CXM 600, Storage - Evo 840 120gb, MX100 256gb, WD Blue 1TB, Cooler - Hyper Evo 212, Case - Corsair Carbide 200R, Monitor - Benq  XL2430T 144Hz, Mouse - FinalMouse, Keyboard -K70 RGB, OS - Win 10, Audio - DT990 Pro, Phone - iPhone SE


wow, why do so many people make up things that I never said??

of course there is a problem with Skylake CPUs

my entire point is that this problem only happens when a specific function of the CPU is used, which is extremely rare since the only widespread program that uses that function is Prime95, and no other consumer workload will ever encounter that issue

and as far as I (and everyone else on this forum) know, there is nothing else that puts a load on a CPU like Prime95 does

there is a really big difference between "this CPU has problems running everything" and "this CPU has problems running this one specific program that you shouldn't be running in the first place"

Skylake is the latter

once again, I doubt Intel would release a fix if Prime95 is the only program with problems


Seriously :|

CPU: AMD Ryzen 5800X GPU: Nvidia RTX 3080 12GB + Motherboard: Asus Crosshair VIII Hero

  Case: Asus ROG Strix Helios Gundam Edition Power Supply: Asus ROG Thor 850P

 


Looking into the details, it's a problem in the AVX1 instruction set when calculating fast Fourier transforms of particular sizes. It's definitely not going to be just Prime95. A lot of everyday software uses FFTs; they are everywhere in business applications, video encoding, music and, well, everything!

So it's pretty bad even if rare.


Looking into the details, it's a problem in the AVX1 instruction set when calculating fast Fourier transforms of particular sizes. It's definitely not going to be just Prime95. A lot of everyday software uses FFTs; they are everywhere in business applications, video encoding, music and, well, everything!

So it's pretty bad even if rare.

Yes, FFTs are all over the place. The FFT is THE instrument for signal processing.

In fact I'm running a simulation at work right now that runs overnight, as it takes about 12 hours. It does about 100 FFTs per second. Luckily it's on an i7-4790. It would be a pain in the neck if the system crashed once a day.

Mineral oil and 40 kg aluminium heat sinks are a perfect combination: 73 cores and a Titan X, Twenty Thousand Leagues Under the Oil


because it puts an impossible load on the CPU

no regular program in the world puts that kind of load, so it is a completely unrealistic stress test

it causes specific parts of the CPU to heat up purposefully just for the point of being hot, not actually to put load

this can make the CPU reach dangerously high temperatures without proper cooling, and even with good cooling, it can cause the core voltage to increase far above safe levels to keep the CPU stable if your motherboard supports adaptive voltage

 

so basically you heat up the CPU a ton, reduce its lifespan, and potentially kill it with overvoltage just to have an unrealistic stress scenario which will not ever be achieved anywhere close to by a standard program

And that's what I want when I stress test my CPU: I want to push it to its limit even if it's "unrealistic". And about the voltage thing, from what I've seen you can just set it manually.

  ﷲ   Muslim Member  ﷲ

KennyS and ScreaM are my role models in CSGO.

CPU: i3-4130 Motherboard: Gigabyte H81M-S2PH RAM: 8GB Kingston hyperx fury HDD: WD caviar black 1TB GPU: MSI 750TI twin frozr II Case: Aerocool Xpredator X3 PSU: Corsair RM650


And that's what I want when I stress test my CPU: I want to push it to its limit even if it's "unrealistic". And about the voltage thing, from what I've seen you can just set it manually.

there's a difference between a stress test and a torture test

you should not be torturing components

a regular stress test like AIDA64 or Intel XTU is already enough to make sure it runs 100% stable for ANY workload

and the voltage is done automatically, you can't stop it

this is why intelligent people actually use the older version of Prime95 to test components, not the versions after the AVX addition

NEW PC build: Blank Heaven   minimalist white and black PC     Old S340 build log "White Heaven"        The "LIGHTCANON" flashlight build log        Project AntiRoll (prototype)        Custom speaker project

Spoiler

Ryzen 3950X | AMD Vega Frontier Edition | ASUS X570 Pro WS | Corsair Vengeance LPX 64GB | NZXT H500 | Seasonic Prime Fanless TX-700 | Custom loop | Coolermaster SK630 White | Logitech MX Master 2S | Samsung 980 Pro 1TB + 970 Pro 512GB | Samsung 58" 4k TV | Scarlett 2i4 | 2x AT2020

 


As far as I can see, people just don't read on this forum; they go by their own convictions and hammer on them until everybody just gives up or the thread gets locked. :|

 

One user posted this:

http://www.tomshardware.com/news/skylake-prime-number-bug,30979.html

 

And here, I will highlight the important parts

 

 

So to elaborate, this isn't a Prime95-exclusive problem; it is a problem involving prime numbers. While that is certainly not a problem for the majority of regular consumers, IT IS A PROBLEM for enterprises that do this kind of research and scientific calculation, because it is not exclusive to consumer CPUs: it affects the whole Skylake architecture.

 

 

They will fix it soon and everything will hopefully be back to normal operation, so stop arguing about whether this is a problem on the grounds that Prime95 is so intensive that nothing is ever intended to run it, and look at the facts. This goes for you @Enderman

My 7 year old processor can run it, so apparently something is designed for it.

If I use too little voltage, say I set the vcore to 1.19 V, it will crash with Prime95. That's what you call NOT stable. So Prime95 is not only about temperatures.


there's a difference between a stress test and a torture test

you should not be torturing components

a regular stress test like AIDA64 or Intel XTU is already enough to make sure it runs 100% stable for ANY workload

and the voltage is done automatically, you can't stop it

this is why intelligent people actually use the older version of Prime95 to test components, not the versions after the AVX addition

I ran AIDA64 for 48 hours on my Xeon X5450 at 4.4 GHz. I thought it was stable until I started playing Crysis 3. After some tweaks with Prime95 I got it stable for 48 hours and it never crashed (it just had thermal issues due to the AIO). BTW, you do know that the voltage can be set manually? And that the only CPUs damaged by Prime95 were Haswell chips running it at high overclocks, or ones owned by people with poor overclocking skills? Prime95 doesn't kill CPUs; it's overclocking them that does. For example, when running a program that fully utilises the CPU with the FMA3 instruction set, the power consumption of my i7 4790K jumps to 130 W, and it stays at 4 GHz, as going higher can cause the core voltage to exceed "safe" limits.

"We also blind small animals with cosmetics.
We do not sell cosmetics. We just blind animals."

 

"Please don't mistake us for Equifax. Those fuckers are evil"

 

This PSA brought to you by Equifacks.
PMSL


there's a difference between a stress test and a torture test

you should not be torturing components

a regular stress test like AIDA64 or Intel XTU is already enough to make sure it runs 100% stable for ANY workload

and the voltage is done automatically, you can't stop it

this is why intelligent people actually use the older version of Prime95 to test components, not the versions after the AVX addition

As previous posts mentioned, some people do use their CPUs for these intensive workloads. It also remains a fact that all Intel had to do to test whether it worked properly was run Prime95. It sort of makes you wonder whether Intel is properly testing its architectures and whether the BIOS update will affect those users.

Personally, if I run any sort of test on a hardware component (CPU, RAM or whatnot) at factory specifications and it ends up failing or crashing, I would label the product I received as broken. I don't care if it is a rare scenario, since if I'm unlucky enough I could be screwed over, and unlike any other component, the CPU is the most expensive to switch out (since it is an architecture problem). This problem could also cause a long process of RMAs (which is pretty terrible), since it is an architecture problem and not a single bad chip.
