
Ryzen segmentation faults when compiling heavy GCC Linux loads

14 minutes ago, leadeater said:

That could still mean there is something in GCC that needs to be fixed that exists in all current and previous versions of GCC. Until that is fixed, if it is indeed something that GCC needs to do, it'll exist in all versions until that has been done no matter the Linux kernel or even potentially compiler flags.

 

If it's happening across compilers it could be hardware or the way the compilers are handling memory access on the Zen architecture and they are doing it in similar ways which wouldn't be surprising.

 

I also have to wonder whether there are silent errors on 'good runs' too, since it's not consistent.

Since some people are seeing it while others aren't, that rules out GCC as the cause.

From what I'm reading, the bug seems to behave like the earlier FMA bug: people are saying that increasing the Load Line Calibration alleviates the issue somewhat. Some see a whole lot fewer segfaults while others don't see them anymore.

This leads me to believe parts of the CPU die don't receive as much power as they need, just like with the previous FMA bug.
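To put a number on "a whole lot fewer segfaults" before and after a change such as Load Line Calibration, one option is to count the segfault reports in the kernel log after each run. This is only a rough sketch: the function name `count_segfaults` is made up here, and it assumes the usual x86 Linux log wording ("segfault at ...").

```shell
#!/bin/sh
# Count kernel-log lines that report a user-space segfault.
# Reads log text on stdin, so it works with `dmesg` output or a saved log file.
count_segfaults() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so `|| true` keeps the function's exit status clean.
    grep -c 'segfault at' || true
}

# Usage (reading the kernel log may need root):
#   dmesg | count_segfaults
```

Comparing the count from a run at stock settings against a run with the changed setting gives at least a crude before/after figure.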


This could be interesting. I have no doubt that it will be fixed eventually, but Ryzen does seem to have had quite a few bumps in the road from release day until now. As someone who really wants AMD to compete again, even if it only amounts to better pricing and products, it seems like a real shame that there have been quite a few publicly obvious issues that it would have been nice not to have seen.

Let's hope this is the last one!


5 minutes ago, Pample said:

This could be interesting. I have no doubt that it will be fixed eventually, but Ryzen does seem to have had quite a few bumps in the road from release day until now. As someone who really wants AMD to compete again, even if it only amounts to better pricing and products, it seems like a real shame that there have been quite a few publicly obvious issues that it would have been nice not to have seen.

Let's hope this is the last one!

Actually, I see it as somewhat of a good sign. Not so much the issues themselves, but that so many people are using Ryzen that they have run into these issues. It means there's wide adoption of Ryzen by a lot of people in different use cases.

 

I'd honestly be more surprised if there were no issues at all, given that this is a brand new architecture.


2 minutes ago, Jito463 said:

Actually, I see it as somewhat of a good sign. Not so much the issues themselves, but that so many people are using Ryzen that they have run into these issues. It means there's wide adoption of Ryzen by a lot of people in different use cases.

 

I'd honestly be more surprised if there were no issues at all, given that this is a brand new architecture.

I totally agree about the adoption rate, and would love to actually see some stats from AMD about sales, etc. As you say, the more people using it the better.

I'm not for a moment saying issues aren't to be expected; Intel has just as many, as we all know. It somehow just feels like the ones Ryzen is hitting get more publicity, which is a shame. We all need AMD to continue to be competitive, and Ryzen not to be just a one-off, and I just hope things like this won't have too much of an impact on that.


6 minutes ago, LAwLz said:

 

The title is not sensationalistic, and the thread contains useful information for users who may be affected, so I am not sure why you are so upset.

For my part I don't think it is sensationalist at all, far from it. My concern would simply be that I want to see proper competition back in the market and a healthy AMD is PART of how that will happen. Just look at Intel's response to ThreadRipper. 

 

I have both Intel and AMD systems running and I'm happy with both; one is a server, the other my desktop. I have a small bias towards AMD, but only because I feel Intel has been taking the piss for a long time. Yes, AMD have perhaps not helped themselves, but nonetheless my point stands that Intel COULD have pushed CPU tech far more than it has in the last decade but has chosen not to, and charged people for the privilege.

I think all people want is for something like a single bug, which AMD are likely to fix shortly, not to be overhyped as a doom-and-gloom problem that will sink the platform.

 

I do completely agree that people need to not be so upset, focus on the facts and stick with the discussion. So, on that point I would like to point out here that it will be VERY interesting to see how AMD responds to this bug. 


Has anyone tested it with GCC 7?


18 hours ago, Curufinwe_wins said:

GCC is arguably the most important single C++ compiler in use today, so yeah this is a really big deal.

 

Especially in academia, where Ryzen otherwise has the chance to replace MUCH more expensive Intel clusters. Although if the issue doesn't occur with Threadripper as well, then things aren't all hopeless.

This. My university uses GCC quite a bit.

CPU: Ryzen 5950X Ram: Corsair Vengeance 32GB DDR4 3600 CL14 | Graphics: GIGABYTE GAMING OC RTX 3090 |  Mobo: GIGABYTE B550 AORUS MASTER | Storage: SEAGATE FIRECUDA 520 2TB PSU: Be Quiet! Dark Power Pro 12 - 1500W | Monitor: Acer Predator XB271HU & LG C1

 


Thread cleaned.

 

Another moderator asked to get back on-topic a while ago. This post will be the last warning; anything below this that derails the thread into an AMD vs Intel discussion will receive a warning.

 

This thread ISN'T about AMD vs Intel, please keep the CS in mind;

Quote

No Trolling

  • This includes flame wars such as NVIDIA vs AMD, political or religious debate.

Quote

  • Ensure a friendly atmosphere to our visitors and forum members.
  • Encourage the freedom of expression and exchange of information in a mature and responsible manner.
  • "Don't be a dick" - Wil Wheaton.
  • "Be excellent to each other" - Bill and Ted.
  • Remember your audience; both present and future.

 

If you need help with your forum account, please use the Forum Support form !


What is GCC? As far as Linux goes I've spent a few days on Ubuntu and that's about it. If I were to upgrade to Ryzen what sort of problems are there on using Linux with it?


  • 1 month later...

For people wanting to try this test on their Ryzen setups: you don't have to install Linux on the HDD. The LiveUSB method works.


Happens on both machines for me. It takes a bit longer to happen on the Gigabyte/R7-1700 setup than on the Taichi/R7-1700X. The fix so far for me is to disable the "OpCache" control in the ASRock UEFI, which I believe disables the CPU's micro-op cache. The Gigabyte AB350-Gaming 3 has no such option in its firmware, so I can't test whether disabling the micro-op cache fixes it there.

You have to let it run for at least 8 hours. On my Taichi I can get it to happen within 15-20 minutes; on the Gigabyte it took 6 hours before the build loop failed. This looks like a hardware bug: the micro-op cache is not a "software" thing.

 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

Here's a quick guide for Ryzen people who want to try this on their computer without installing Linux (but have Windows running)

 

My Computer:
ASROCK TAICHI with v3.0 of UEFI (Agesa 1.0.0.6a)

32 Gigabytes of DDR4 (2800 MT/s) - Patriot Viper Elite (Dual Rank 16GB UDIMMS)
Ryzen 7 - 1700X (stock speed)
Corsair RM650i PSU
Corsair H110i CPU Cooler (max temp for me is 55c with Prime95)
EVGA GTX 1080ti SC2

 

Note: I'll also try this on my son's Computer which has a Ryzen 1700, 16 GB of DDR4 (3000 MT/s) - Patriot Viper 4 (Dual Rank 8GB UDIMMS), Corsair CS550M Power Supply, Ryzen Wraith Led Cooler, EVGA GTX1060, F7 version of UEFI  (Agesa 1.0.0.6a)

 

BIOS Setup:
Load Defaults (save and reboot)
Load XMP Profile 1
Enable SVM (Virtualization)
Disable CSM
Save and reboot.

 

Couple of Pre-Requisites:

1) 16 GB USB Key
2) Download RUFUS Tool for Windows
3) Download  artful-desktop-amd64.iso *OR* ubuntu-17.04-desktop-amd64.iso

 

Note: daily builds of Artful Aardvark come with Kernel 4.11, while Zesty Zapus comes with Kernel 4.10.

 

4) Burn ISO image using RUFUS Tool onto USB Drive

5) Download AOMEI Partition Assistant Standard (Free)

6) Run the Partition Assistant  program and resize the FAT32 partition on the USB KEY to 4 Gigabytes, then apply the changes

7) Add a second partition with a file system type of "Unformatted" to the USB KEY, size it at 8 Gigabytes, then apply the changes.

 

Actual Procedure:

 

1) Plug  USB Drive in
2) Boot to UEFI
3) In the UEFI screen, select Exit Tab -> Boot Override -> UEFI USB Disk
4) In the GRUB screen, make sure "Try Ubuntu without installing" is highlighted, then hit the "e" key.
5) Change the words "quiet splash" to "nomodeset", then hit "F10" (note: only change those two words).

6) It should take you to the ubuntu desktop
7) Turn off screen saver


Artful Aardvark (17.10):

 

System Settings -> Power ->  Power Saving [Section] -> Blank screen = Never, then hit the [x] icon to close the window.

 

Zesty Zapus (17.04):

 

Navigate [Gear Icon] -> System Settings -> "Brightness & Lock" -> "Turn screen off when inactive for: "  = Never, then hit [x] icon to close window

 

8) Right click on the desktop and select "Open Terminal"

9) In terminal, type "sudo su -"

10) It should log you in as root and change your working directory to /root

11) Type "free" - if it shows that you have 8 Gigs of swap, skip steps 12 to 15

12) Type "lsblk" to get a list of drives and partitions available in the system

13) Locate the block device alias for your USB KEY (in my system - it was /dev/sdb)

14) Type "mkswap /dev/sd?2" - replace the question mark with the correct device alias letter

15) Type "swapon /dev/sd?2" - replace the question mark with the correct device alias letter, this will use the secondary partition on the USB KEY as system swap space

16) Type "wget http://funks.ddns.net:8080/tools/ryzen/testRyzenGCC.sh"

17) Type "chmod u+x testRyzenGCC.sh"

18) Type "./testRyzenGCC.sh" to start up the GCC 7.1.0 build loop, which will download the pre-reqs
19) If the build loop crashes or stops, then you have something funky going on with your system (test for at least 8 hours)
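The swap steps (12 to 15) can be rolled into a small helper, roughly like the sketch below. The function names `swap_partition` and `enable_usb_swap` are made up for this example, and it assumes the USB key shows up with /dev/sdX style naming so that the second partition is the disk name plus "2". Check "lsblk" first, since mkswap wipes whatever is on that partition.

```shell
#!/bin/sh
# Build the path of partition 2 for a /dev/sdX style disk (assumed naming).
swap_partition() {
    printf '%s2\n' "$1"
}

# Format the second partition of the given disk as swap and enable it.
enable_usb_swap() {
    part=$(swap_partition "$1")
    # Refuse to continue unless the path really is a block device.
    [ -b "$part" ] || { echo "not a block device: $part" >&2; return 1; }
    # Write the swap signature, turn the swap on, then show memory and swap totals.
    mkswap "$part" && swapon "$part" && free
}

# Usage (as root, after confirming the device with lsblk):
#   enable_usb_swap /dev/sdb
```

The block-device check is only there so that a mistyped device name fails before mkswap ever runs.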

 

Note: it may be a good idea to open another terminal and run "top" to see what's going on while GCC is compiling. It's bad practice to run things as root, but this is a live USB. My setup needed the "nomodeset" setting, otherwise the GUI doesn't come up (1080, 1080 Ti).
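The contents of testRyzenGCC.sh aren't shown in this post, so the following is not that script. It is just a generic sketch of what a "build loop" amounts to: run the same command over and over and stop at the first failure. The function name `run_until_failure` and the example make invocation are assumptions for illustration.

```shell
#!/bin/sh
# Run a command repeatedly until it fails or the pass limit is reached.
# Prints which pass failed and returns 1, or returns 0 if every pass succeeded.
run_until_failure() {
    max_passes=$1
    shift
    pass=1
    while [ "$pass" -le "$max_passes" ]; do
        # "$@" is the command to repeat, e.g. a full compiler build.
        if ! "$@"; then
            echo "failed on pass $pass"
            return 1
        fi
        pass=$((pass + 1))
    done
    echo "all $max_passes passes succeeded"
}

# Usage, from inside an already configured build directory (hypothetical):
#   run_until_failure 50 sh -c 'make clean >/dev/null && make -j"$(nproc)"'
```

On a healthy system the same build should succeed on every pass, so any failure here that is not reproducible on the next pass points at the machine rather than the source code.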


10 minutes ago, NvidiaIntelAMDLoveTriangle said:

AMD in a nutshell.

Dunno if that's fair. Intel just fixed an SMT issue in their microcode that affected Skylake and Kaby Lake, which also led to compiler segfaults. It took them quite a while to fix that one too (some users started reporting issues in Q2 2016).

 


3 minutes ago, UDaManFunks said:

Dunno if that's fair. Intel just fixed an SMT issue in their microcode that affected Skylake and Kaby Lake, which also led to compiler segfaults. It took them quite a while to fix that one too (some users started reporting issues in Q2 2016).

 

After using AMD products for a long long time, and having issues that couldn't be fixed, I can safely say that statement is fair.

 


3 hours ago, UDaManFunks said:

Dunno if that's fair. Intel just fixed an SMT issue in their microcode that affected Skylake and Kaby Lake, which also led to compiler segfaults. It took them quite a while to fix that one too (some users started reporting issues in Q2 2016).

That issue was fixed even before it became mainstream knowledge. It was the mobo manufacturers who were late to implement the fix.

My Gigabyte board got the microcode fix just at the start of this month ... the fuck! And they're not the only ones.


On 6/4/2017 at 0:10 PM, ARikozuM said:

How prevalent is GCC in professional use? I can think of a few, but not many.

It's practically everywhere. Every IDE I used for embedded systems that used C used GCC as the compiler. I use GCC as the compiler at work for most of the software we develop. I've yet to work with someone who doesn't use GCC for their C programs that they deliver to their customers.


On 6/4/2017 at 3:13 PM, Drak3 said:

You won't hear larger groups complaining, as there isn't a commercial version of Ryzen that fully supports desirable features like ECC memory. Nor would they jump onto a new platform that fast anyways, unless it's a well validated lineup, such as Xeon. They'll probably be on platforms like Sandy/Ivy Bridge, Haswell, or Broadwell, the HEDT variants.

Small groups and indie devs who were already doing this before Ryzen, on a fairly capable system, aren't likely to be using Ryzen either.

All Ryzen CPUs support ECC memory.  It's up to the motherboard vendor if they wish to support it and even then you could probably still use ECC.  

 

http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/75030-ecc-memory-amds-ryzen-deep-dive.html

1 Timothy 1:15


1 minute ago, f22luke said:

All Ryzen CPUs support ECC memory.

That's technically false. All Ryzen CPUs are compatible with ECC memory, but the consumer chips are not explicitly supported.

Your link even calls out something required for ECC to be explicitly supported: validation.

It then goes on to say that Ryzen is compatible, but they don't validate it to work with ECC memory.

 

It's like saying you can't overclock a Xeon or locked sku Core i. You can, but Intel will not help you should anything go wrong, under any circumstance, unless you lie about the circumstance. AMD might be more forgiving on that front, but consumer Ryzen does not explicitly support ECC.

Come Bloody Angel

Break off your chains

And look what I've found in the dirt.

 

Pale battered body

Seems she was struggling

Something is wrong with this world.

 

Fierce Bloody Angel

The blood is on your hands

Why did you come to this world?

 

Everybody turns to dust.

 

Everybody turns to dust.

 

The blood is on your hands.

 

The blood is on your hands!

 

Pyo.


1 hour ago, Drak3 said:

That's technically false. All Ryzen CPUs are compatible with ECC memory, but the consumer chips are not explicitly supported.

Your link even calls out something required for ECC to be explicitly supported: validation.

It then goes on to say that Ryzen is compatible, but they don't validate it to work with ECC memory.

 

It's like saying you can't overclock a Xeon or locked sku Core i. You can, but Intel will not help you should anything go wrong, under any circumstance, unless you lie about the circumstance. AMD might be more forgiving on that front, but consumer Ryzen does not explicitly support ECC.

Not validating it and not having the feature is still different though. Not validating usually means they aren't testing ECC ram and are not putting them on an HCL list for the product, that doesn't mean it won't work or that there will even be any problems at all.

 

It's not much different than putting a Xeon in an X99 motherboard: the CPU absolutely supports ECC, even Registered, but the X99 board on the other hand might not. There are X99 motherboards that support ECC/ECC Reg, the Asus X99-E WS being one of them.


Just now, leadeater said:

Not validating it and not having the feature is still different though.

You're correct. But supporting something requires validating it. Not supporting something doesn't require removing it though.

 

ECC on Ryzen is not supported, because it is not validated by AMD. AMD didn't cut functionality, but if it doesn't work due to the IMC on a single chip being iffy, mainboards not supporting it, or Iggilious the blue orc of deep Wumpa 7 deems it so, you don't have ground to stand on when bitching to AMD (any mainboard manufacturer that supports ECC on Ryzen boards, you do).

 

And that's it. Yes, ECC will probably work on any Zen chip. But the operative word is probably. In certain cases, probably isn't good enough. Sometimes, absolutely needs to be the operative word. You don't get that with Ryzen.

6 minutes ago, leadeater said:

Not validating usually means they aren't testing ECC ram and are not putting them on an HCL list for the product, that doesn't mean it won't work or that there will even be any problems at all.

Funny, pretty sure I said it wasn't supported, but it was still compatible.


2 hours ago, Drak3 said:

That's technically false. All Ryzen CPUs are compatible with ECC memory, but the consumer chips are not explicitly supported.

Your link even calls out something required for ECC to be explicitly supported: validation.

It then goes on to say that Ryzen is compatible, but they don't validate it to work with ECC memory.

 

It's like saying you can't overclock a Xeon or locked sku Core i. You can, but Intel will not help you should anything go wrong, under any circumstance, unless you lie about the circumstance. AMD might be more forgiving on that front, but consumer Ryzen does not explicitly support ECC.

Like I said some motherboard vendors will have ECC ram on their HCL.  AMD is just saying that they are not pushing the board makers to support it.  My statement still stands all Ryzen chips can use ECC given it's paired with the correct board.

1 Timothy 1:15


Just now, f22luke said:

Like I said some motherboard vendors will have ECC ram on their HCL.

Mainboard manufacturers validating ECC doesn't make ECC a fully fledged supported feature of Ryzen. Ryzen itself is not validated for ECC by AMD, and any issues that arise are on the board manufacturers, who aren't equipped with knowledge of the inner workings of Ryzen.

2 minutes ago, f22luke said:

My statement still stands all Ryzen chips can use ECC given it's paired with the correct board.

Never said it didn't work. I just said that it isn't supported by AMD in any capacity at any official level. Meaning, crying to AMD because your system doesn't work with ECC should not be considered an option.


17 minutes ago, Drak3 said:

Mainboard manufacturers validating ECC doesn't make ECC a fully fledged supported feature of Ryzen. Ryzen itself is not validated for ECC by AMD, and any issues that arise are on the board manufacturers, who aren't equipped with knowledge of the inner workings of Ryzen.

Never said it didn't work. I just said that it isn't supported by AMD in any capacity at any official level. Meaning, crying to AMD because your system doesn't work with ECC should not be considered an option.

Just with a few checks I can confirm that ASRock has at least two sets of ECC RAM in the QVL for the Taichi. It is not supported officially by AMD, but I don't see how this makes a difference when the board partners are willing to support it. These are the same dies as what we are seeing in the Epyc server chips, so it's not a silicon/hardware issue but rather BIOS/microcode. I guarantee the board partners do know quite a lot about the inner workings of their own BIOS/UEFI.

At this point it's just semantics. Whether or not the end user is confident in the board partner is all it comes down to. Gigabyte and ASRock have gone so far as to list ECC support in their product spec lists and QVL, so that's good enough for me. If I ran into an issue I would be talking with them and not AMD anyway.

1 Timothy 1:15


30 minutes ago, Drak3 said:

ECC on Ryzen is not supported, because it is not validated by AMD.

Actually AMD confirmed Ryzen has ECC support, it's only a board partner issue at this point.

https://www.overclock3d.net/news/cpu_mainboard/amd_confirms_that_ryzen_supports_ecc_memory/1

 

Quote

ECC is not disabled. It works, but not validated for our consumer client platform.
 
Validated means run it through server/workstation grade testing. For the first Ryzen processors, focused on the prosumer/gaming market, this feature is enabled and working but not validated by AMD. You should not have issues creating a whitebox homelab or NAS with ECC memory enabled.
 
yes, if you enable ECC support in the BIOS so check with the MB feature list before you buy.

 

Motherboard manufacturers are quite welcome to do that validation themselves, testing as many RAM kits as possible to verify which ones work, with how many slots populated, single rank or dual rank, etc. They already do this; it just means adding more products to test, which is costly. Ryzen CPUs support ECC; validation/support isn't quite what you are saying it is.

AMD just isn't putting it on their official supported feature list for their Ryzen CPUs, since then they would have to honor it. Intel gives you unlocked multipliers but they do not validate it; the difference there, however, is that it's on the official product feature list of K/X processors. Support and validation are different.
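For anyone who wants to see what their own board reports once ECC is enabled in the BIOS, here is a minimal sketch for Linux. It assumes dmidecode is installed and run as root, and the function name `ecc_type` is made up for this example. Keep in mind this only shows what the firmware claims in its DMI tables; it does not prove that errors are actually being detected or corrected.

```shell
#!/bin/sh
# Extract the "Error Correction Type" field from dmidecode output on stdin.
# Typical values include "None", "Single-bit ECC" and "Multi-bit ECC".
ecc_type() {
    # Strip the field label and leading whitespace, print only the value,
    # and keep just the first match in case several memory arrays are listed.
    sed -n 's/^[[:space:]]*Error Correction Type:[[:space:]]*//p' | head -n 1
}

# Usage (as root; DMI type 16 is the Physical Memory Array record):
#   dmidecode --type 16 | ecc_type
```

A value of "None" with ECC DIMMs installed would suggest the board is running them as plain memory.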


1 minute ago, leadeater said:

this feature is enabled and working but not validated by AMD.

1 minute ago, leadeater said:

AMD just isn't putting it on their official supported feature list for their Ryzen CPUs since then they have to honor it.

 

2 minutes ago, leadeater said:

Intel gives you unlocked multipliers but they do not validate it, the difference there however is it's on the official product feature list of K/X processors.

Intel validates that the multipliers can be changed, and that the stock clocks are stable. They don't validate clock speeds outside of stock configuration. Hence why overclocking is a supported feature, but Intel makes no guarantee on clocks.

3 minutes ago, leadeater said:

Support and validation are different.

In the context of technology, supporting something is validating that it works, and standing by that validation through warranty and providing help when something goes wrong.

 

Support and compatibility are different.

 

Ryzen is compatible with ECC memory. It isn't officially supported.


This topic is now closed to further replies.