
FMA3 bug discovered in (Ry)Zen

zMeul

source: https://www.heise.de/newsticker/meldung/Bug-in-AMD-Ryzen-Kompletter-Systemabsturz-bei-manchen-FMA3-Anwendungen-3641409.html

http://forum.hwbot.org/showthread.php?t=167605

 


 

HWBot users have discovered a reproducible hardware bug that seems to affect FMA3 instructions

the bug appears to show up when the benchmark gets to

Quote

Single-Precision - 128-bit FMA3 - Fused Multiply Add:

 

if this is an instruction set bug and not something else, it could potentially be patched via a BIOS microcode update
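for context, a fused multiply-add computes a*b + c and rounds only once at the end, instead of rounding the product first and the sum second. Below is a minimal sketch of that difference, not the benchmark's code: it emulates the fused result with exact fractions in double precision (the failing test is single precision), and the function names and input values are made up for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a*b + c with a single rounding at the end (what an FMA does)."""
    # Fractions hold the product and the sum exactly; float() then rounds once.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Plain multiply then add: the product is rounded before the addition."""
    return a * b + c


# Hypothetical inputs, chosen so that rounding the product loses the answer.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(separate_multiply_add(a, b, c))  # the product rounds to 1.0 before the add
print(fused_multiply_add(a, b, c))     # the tiny exact difference survives
```

the exact product here is 1 - 2^-60, which does not fit in a double, so the two-step version returns zero while the fused version keeps -2^-60.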

so far, this bug has been confirmed on:

Quote
  • 1800X + Asus Prime B350M-A (BIOS 0502)
  • 1700 + Asus Prime B350M-A (BIOS ???)
  • 1700 + Asus Crosshair VI Hero
  • 1700 + Asus Crosshair VI Hero (BIOS 5803) (two sets of memory G.Skill + Kingston - also fails with overvolted SOC)
  • 1800X + Asus Crosshair VI Hero (Windows 7) - One pass, mostly failures.

Heise.DE was able to reproduce the bug with a Ryzen 7 1700X on the MSI X370 XPower Gaming Titanium and a Ryzen 7 1700 on the Asus Crosshair VI Hero

 

instruction set bugs are not uncommon: Intel's own Skylake launch was troubled by an AVX bug that was discovered a couple of months afterwards

 

at this time, AMD has not responded to the inquiry made by Heise.DE

 

---

 

update: 16th of March

 

more info on this from PCPer https://www.youtube.com/watch?v=dDrOQrjJTSs

it seems that the FMA3 set isn't actually bugged; rather, it doesn't get enough power delivered to it, as 25% of the people who OCed their CPU were able to run the test successfully

 

sources inside AMD have said that it's a known issue and a BIOS update is indeed planned

Edited by zMeul

So a bug was found in a benchmark software that 

Quote

 uses highly-optimized code with modern command set extensions such as SSE and FMA3 (Fused Multiply-Add 3)

So outside these types of programs literally not a problem.

                     ¸„»°'´¸„»°'´ Vorticalbox `'°«„¸`'°«„¸
`'°«„¸¸„»°'´¸„»°'´`'°«„¸Scientia Potentia est  ¸„»°'´`'°«„¸`'°«„¸¸„»°'´


2 minutes ago, vorticalbox said:

So a bug was found in a benchmark software that 

So outside these types of programs literally not a problem.

Well who's to say other programs won't also need/use that thing?


3 minutes ago, vorticalbox said:

So a bug was found in a benchmark software that 

So outside these types of programs literally not a problem.

the bug is not in the software, it's in the CPU's FMA3 instruction set - not the same thing -_-


Well as far as i can work out they dropped FMA4 so i don't think that many programs use it...

Can be wrong tho, i have no idea how this stuff works.

If you want my attention, quote meh! D: or just stick an @samcool55 in your post :3

Spying on everyone to fight against terrorism is like shooting a mosquito with a cannon


Just now, samcool55 said:

Well as far as i can work out they dropped FMA4 so i don't think that many programs use it...

Can be wrong tho, i have no idea how this stuff works.

as far as I can tell, both AVX2 and FMA3 are used by the ZFS file system - so, that's kinda' big in the Linux world


8 minutes ago, zMeul said:

the bug is not in the software, it's in the CPU's FMA3 instruction set - not the same thing -_-

I understand that but if any given software doesn't end up calling that function then no problems. If a lot of programs did use this function there would be a shit storm right now, rather than some benchmarker that happened to find it. 

I'm not saying this isn't a massive issue, it is, it's just not going to cause that many problems.

                     ¸„»°'´¸„»°'´ Vorticalbox `'°«„¸`'°«„¸
`'°«„¸¸„»°'´¸„»°'´`'°«„¸Scientia Potentia est  ¸„»°'´`'°«„¸`'°«„¸¸„»°'´


@leadeater regarding this: https://linustechtips.com/main/topic/746289-amd-responds-to-1080p-gaming-tests-on-ryzen-supports-ecc-ram-win-10-smt-bug/?do=findComment&comment=9457973

I remember what the bug was about: it was a TSX implementation bug discovered in Haswell and Broadwell - Intel had to disable the set in the 1st two generations of Haswell and the 1st generation of Broadwell

 


3 minutes ago, vorticalbox said:

I understand that but if any given software doesn't end up calling that function then no problems. If a lot of programs did use this function there would be a shit storm right now, rather than some benchmarker that happened to find it. 

I'm not saying this isn't a massive issue, it is, it's just not going to cause that many problems.

the issue is that you called it a SW bug, which was quite incorrect

whether or not the software you use actually needs FMA3, the issue is still there


2 minutes ago, Jorgen297 said:

Would it be possible to fix this with a microcode update? Or is it embedded in the actual circuitry in the CPU? 

depends on the gravity of the bug,

it is possible to fix it via microcode update - Intel did the same with Skylake's AVX bug

but, if the issue is way too complex, AMD might be forced to disable the entire set - Intel did this for the TSX set on two generations of Haswell and the 1st gen of Broadwell


Wow, Ryzen has sure had a rocky launch... one thing after another.  Seems new problems are cropping up faster than they can fix the ones we already knew about :/ 


2 minutes ago, Ryan_Vickers said:

Wow, Ryzen has sure had a rocky launch... one thing after another.  Seems new problems are cropping up faster than they can fix the ones we already knew about :/ 

that's why you have engineering samples and test the fuck out of them before going full production

and after production you do fucking validations to confirm the fucker isn't screwing around - especially since AMD aimed to sell this as compute CPU replacement/competitor for Broadwell-E


1 minute ago, zMeul said:

that's why you have engineering samples and test the fuck out of them before going full production

and after production you do fucking validations to confirm the fucker isn't screwing around - especially since AMD aimed to sell this as compute CPU replacement/competitor for Broadwell-E

It's unfortunate though... they really need this to go well.  We all do, even if you don't buy from them - more competition only helps the market.


1 minute ago, Ryan_Vickers said:

It's unfortunate though... they really need this to go well.  We all do, even if you don't buy from them - more competition only helps the market.

they only have themselves to blame tho

I wonder who AMD will point the finger at this time xD those pesky compilers .. sure


17 minutes ago, samcool55 said:

Well as far as i can work out they dropped FMA4 so i don't think that many programs use it...

Can be wrong tho, i have no idea how this stuff works.

There were two problems with FMA4: it was only on AMD processors so already there is a limited use case, and the only person I know who tried to use it (George Woltman, of Prime95 fame) couldn't extract any performance boost from using it over not using it. 

 

FMA3 was implemented by Intel in Haswell, and it did provide a huge performance boost and gained huge traction in certain mathematical uses.

 

There seems to be some activity to update Prime95 to support Ryzen, although for the moment the latest test version I've seen adds better CPU detection and uses Intel FMA3 code still. I will say there haven't been any problems I'm aware of running the stress test on Ryzen in FMA3 mode. I don't know if things will change much if an AMD native/optimised FMA3 was implemented. My impressions so far are that if you do want high FMA3 performance, stick to Intel. Any Sky/Kabylake quad core i5 would beat an equivalently clocked R7 in this use case. At best R7 has half the IPC.


Isn't a benchmark/stress-test run like that exactly what you would use in the R&D department to test your implementation of the instruction set? How would this have gone under the radar? I HOPE this is something fixable in the code rather than the lithography.


26 minutes ago, samcool55 said:

Well as far as i can work out they dropped FMA4 so i don't think that many programs use it...

Can be wrong tho, i have no idea how this stuff works.

This bug concerns FMA3, which is different from FMA4.

 

AMD introduced FMA4 first, then switched to FMA3 when it became clear Intel was going to be using FMA3.

 

There likely isn't much software that uses FMA4, but FMA3 usage should be higher because it's what both Intel and AMD are pushing.
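Whether a given CPU advertises either extension can be checked from the feature flags the OS reports. A minimal sketch, assuming Linux-style `/proc/cpuinfo` text, where the `fma` flag denotes FMA3 and `fma4` denotes FMA4; the helper name and the sample text are made up for illustration. This only shows what the CPU reports, not whether any particular program actually uses the instructions.

```python
def fma_support(cpuinfo_text: str) -> dict:
    """Report whether the 'fma' (FMA3) and 'fma4' flags appear in cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Each logical CPU has a line like "flags : fpu vme ... fma ..."
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags.update(value.split())
    return {"fma3": "fma" in flags, "fma4": "fma4" in flags}


# Made-up sample text, not a dump from real hardware.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
print(fma_support(sample))

# On a Linux machine one could instead pass open("/proc/cpuinfo").read().
```

Splitting the flags into a set matters here: a plain substring search for "fma" would also match "fma4".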


1 minute ago, zMeul said:

AdoredTV -_-

so what's wrong with it? he shows evidence and tells the truth.


2 minutes ago, zMeul said:

please remove that trash :S

I can't ban you. I would have to ask someone.

Edited by matrix07012
grammar

Just now, porina said:

There were two problems with FMA4: it was only on AMD processors so already there is a limited use case, and the only person I know who tried to use it (George Woltman, of Prime95 fame) couldn't extract any performance boost from using it over not using it. 

 

FMA3 was implemented by Intel in Haswell, and it did provide a huge performance boost and gained huge traction in certain mathematical uses.

FMA4 and FMA3 do basically the same thing, it's just a small difference. What matters is AMD ditched FMA4 when it became clear Intel would be going for FMA3. AMD actually launched processors with FMA3 support before Intel.
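The small difference is the operand count: FMA3 instructions take three register operands, so the destination is also one of the inputs and gets overwritten, while FMA4 takes four and leaves all inputs intact. A rough sketch of that, modelling registers as a plain dict; the function names are hypothetical, the FMA3 variant shown follows the "231" operand ordering, and ordinary Python floats stand in for the packed vector values.

```python
def fma3_231(regs: dict, dst: str, src2: str, src3: str) -> None:
    """Three-operand form: dst = src2 * src3 + dst, so dst is overwritten."""
    regs[dst] = regs[src2] * regs[src3] + regs[dst]


def fma4(regs: dict, dst: str, src1: str, src2: str, src3: str) -> None:
    """Four-operand form: dst = src1 * src2 + src3, all sources preserved."""
    regs[dst] = regs[src1] * regs[src2] + regs[src3]


# Toy register file with small values, for illustration only.
regs_a = {"xmm0": 4.0, "xmm1": 2.0, "xmm2": 3.0, "xmm3": 0.0}
regs_b = dict(regs_a)

fma3_231(regs_a, "xmm0", "xmm1", "xmm2")      # the addend in xmm0 is replaced
fma4(regs_b, "xmm3", "xmm1", "xmm2", "xmm0")  # the addend in xmm0 is kept

print(regs_a["xmm0"], regs_b["xmm3"], regs_b["xmm0"])
```

Both compute the same 2 * 3 + 4; with the three-operand form, a program that still needs the old value has to copy it somewhere first.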


Just now, matrix07012 said:

I can't ban you. I'd have to ask someone.

why would you ban me? because AdoredTV is actual trash!?!? are you even aware of the background of that channel?
