
DerBauer hardware survey highlights issues with max boost clock on Ryzen 3000 "It's worse than I thought"

Necrotic

@5x5 I'm kinda on board with ya. I got my 3900X and love it. I never expected I'd regularly see 4.6GHz, mostly because few things are gonna use ONLY one thread on a system, and let's remember background functions still run and COULD hit another thread. What I hoped for, and GOT in spades, was improved performance over my E3-1270 v2 Ivy Bridge Xeon with 4c/8t and a max of 3.9GHz [which, despite having better cooling than needed, I also NEVER got! I hit 3.81GHz, but with all cores in use, so meh]. And guess what: it is vastly superior to my Ivy Bridge CPUs and it's great. My HAF 922, replaced by an H500 case with a front mesh panel, is comparatively dead fucking silent. I think people are griping over something small, and AMD looks to be addressing it anyway. This feels like the 5GHz rumor pissing match that occurred at launch.

 

For reference, my old rig used a Hyper 212 Evo; my new rig uses a Dark Rock 4.


6 hours ago, ravenshrike said:

Did they? Per Steve at HU there were one or two motherboards that hit the boost clocks on the previous BIOS with all the chips, and we have no clue what AMD's internal testing methodology was at launch. It is entirely possible that they saw no issues in internal testing and then had to retest stuff after it was released to figure out why some motherboards weren't hitting it.

 

41 minutes ago, leadeater said:

AGESA 1.0.0.3 came out after the product was released and went on sale. AMD validates products well in advance, on earlier microcode revisions and BIOS versions. You're treating a post-release issue as if it were a pre-release issue. Granted, 1.0.0.3 came out very early on, but AMD wasn't necessarily aware of the clock impact as soon as you/we might think.

 

Rough example:

AMD validates the product on revision A and creates product specifications around it

AMD releases the product for sale and review on revision B

AMD during release and review period addresses various issues and releases revision C (introduction of clock speed issue)

 


 

Reviews were carried out on quite a few different AGESA versions, not even just two. I don't think anyone disputes that Ryzen 3000 was released a bit early, but that doesn't actually mean these clock issues would have been spotted even with another month's delay. Unless someone notifies you of an issue or you suspect there is one, going back and re-validating a large inventory of already-manufactured product against the specification wouldn't happen; as far as you're concerned, it was already tested and passed when manufactured.

Did AMD not write and test each version of the AGESA?  Either they failed to test the ryzens on the launched AGESA or they failed to test the AGESA release after the ryzen launch. 

 

They are responsible for testing and releasing both, and I find it difficult to believe that something of this nature just slipped through at both ends of product testing.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


1 hour ago, mr moose said:

 

Did AMD not write and test each version of the AGESA?  Either they failed to test the ryzens on the launched AGESA or they failed to test the AGESA release after the ryzen launch. 

 

They are responsible for testing and releasing both, and I find it difficult to believe that something of this nature just slipped through at both ends of product testing.

You'd be surprised how many things slip through alpha and beta testing of a product. That's why both Intel and AMD release BIOS updates: you can't see the future, and there is always a chance something gets broken unintentionally. When testing AGESA 1.0.0.3 they were likely looking at other aspects of the chip and not paying attention to max boost.


Why is AMD adding stupid letters to a 4-part version number? Like, dafaq. AGESA 1.0.0.3 ABBA. Use the freaking numbers you idiots lmao, it's why you have them there.


3 hours ago, mr moose said:

Did AMD not write and test each version of the AGESA?  Either they failed to test the ryzens on the launched AGESA or they failed to test the AGESA release after the ryzen launch. 

 

They are responsible for testing and releasing both, and I find it difficult to believe that something of this nature just slipped through at both ends of product testing.

They tested it as far as required for the reported bug they were fixing, plus other general functionality. A 25MHz reduction in single-core boost isn't something you're going to be looking out for if you were already achieving the spec. Like I said, unless you suspect it to be a problem, it's not something you're going to be looking for.

 

And there isn't anything to say that AMD's engineering test equipment did not achieve the product spec clocks using AGESA 1.0.0.3; we don't actually know. All we know is that after 1.0.0.3, many third-party vendor motherboards in conjunction with Ryzen 3000 CPUs were no longer able to achieve the maximum single-core boost spec, though not every Ryzen 3000 CPU was affected.

 

Something like this is very easy to slip through when you're addressing issues right on the release of the product and it's extremely important to fix issues as quickly as possible. Before a product is released you have the luxury of time, after and within a day of release you don't have the luxury of time and you better fix reported issues quickly or people will complain.

 

Edit:

As far as I see it, reviewers were not given the products soon enough, or for long enough, to carry out tests and engage with AMD on any issues they found. You'll never find every issue before release, and it's very important to get product to vendors and get real data. That's why HPE/Dell/Lenovo/Microsoft/Google/AWS get Xeon samples so soon and for so long: to get the product as ready as possible before release.


4 hours ago, RejZoR said:

Why is AMD adding stupid letters to a 4-part version number? Like, dafaq. AGESA 1.0.0.3 ABBA. Use the freaking numbers you idiots lmao, it's why you have them there.

because no new features/improvements come with those changes


1 hour ago, cj09beira said:

because no new features/improvements come with those changes

So? Version numbers are there for a reason, unless they plan on doing AGESA 1.0.0.0 ABCDEFGHIJKLMNO...

 

And if there was a release, there WERE changes. Otherwise, why would you release a new one? Also, minor versions only advance the last number, so if they only change a bunch of small things, AGESA 1.0.0.25 would make more sense (if they advance every minor bugfix by 1) than adding a bunch of letters at the end. They'll just end up keeping 1.0.0.X the same for a whole decade lol. Just saying...

 

I also take back the complaints people are overreacting over clocks. As Gamers Nexus benchmarked, the game improvements were actually quite significant, at least in games they tested with. If AMD can further tweak it to keep clocks higher on more cores that would be pretty sweet, maybe raising performance even more. Finewine even on CPU's :D


So essentially what  I am reading is that no one thinks AMD should do anything other than isolated testing of a major product.

 

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


I have a day job in a different area of tech industry, but I see parallels. Product has known bugs, and unknown bugs will be found over time. There is a finite resource to fix those bugs. There will be a determination which bugs get fixed at what time. If you wait until the product is bug free, you'll be waiting forever. It was working "enough" at launch. It is working better now. It will work better in the future.

Main system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, Corsair Vengeance Pro 3200 3x 16GB 2R, RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


4 hours ago, mr moose said:

So essentially what  I am reading is that no one thinks AMD should do anything other than isolated testing of a major product.

Well you either run all say 10,000 unit tests for a bug fix that addresses smaller areas of the code or you run the unit tests that are most likely needed to be run that pertain to the sections of code and functionality that was changed. If all 10,000 unit tests would take 2 weeks to complete but you think you only need to run 250 of them and the product is on sale right now with people trying to use them you'll run the 250 you think you need to and release the firmware update.

 

Not everything does actually need to be tested and not everything can be (reasonably), there is only so much a company can do with engineering samples and controlled tests before actual in use feedback is necessary. Even a week longer with reviewers likely would have resulted in half the revisions of 1.0.0.3 that ended up coming out.
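To put rough numbers on the 250-versus-10,000 trade-off, here is a back-of-the-envelope sketch. It assumes every test costs about the same and that nothing runs in parallel, which is almost certainly not true of any real suite; the function name is made up for illustration.

```python
def estimated_runtime_hours(selected_tests: int, total_tests: int,
                            full_suite_hours: float) -> float:
    """Scale the full-suite runtime by the fraction of tests selected.

    Assumes every test takes roughly the same time (a simplification).
    """
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= selected_tests <= total_tests:
        raise ValueError("selected_tests must be between 0 and total_tests")
    return full_suite_hours * selected_tests / total_tests


# Figures from the post: 10,000 tests take two weeks, 250 are judged relevant.
FULL_SUITE_HOURS = 14 * 24
subset_hours = estimated_runtime_hours(250, 10_000, FULL_SUITE_HOURS)
print(f"Targeted subset: about {subset_hours:.1f} hours instead of {FULL_SUITE_HOURS}")
```

Under those assumptions the targeted run finishes within a working day rather than two weeks, which is the whole argument for shipping the fix after the smaller run.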


3 hours ago, leadeater said:

Well you either run all say 10,000 unit tests for a bug fix that addresses smaller areas of the code or you run the unit tests that are most likely needed to be run that pertain to the sections of code and functionality that was changed. If all 10,000 unit tests would take 2 weeks to complete but you think you only need to run 250 of them and the product is on sale right now with people trying to use them you'll run the 250 you think you need to and release the firmware update.

 

Not everything does actually need to be tested and not everything can be (reasonably), there is only so much a company can do with engineering samples and controlled tests before actual in use feedback is necessary. Even a week longer with reviewers likely would have resulted in half the revisions of 1.0.0.3 that ended up coming out.

I would have considered putting the whole thing under load and pushing it to be a more general test considering clock speeds have been an ongoing issue. 

 

https://www.techpowerup.com/257817/amd-readies-agesa-comboam4-1-0-0-3abb-addresses-several-issues-affecting-3rd-gen-ryzen

 

Now I am not saying these are all the same issue, but if they are having issues with clock speeds and voltage etc then that should be something they test with each new release and patch.
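The kind of per-release check being asked for could be as simple as the sketch below. To be clear, this is not how AMD tests anything as far as anyone in this thread knows; the clock samples are invented, and the 4600 MHz figure is just the 3900X's advertised 4.6GHz boost mentioned earlier in the thread.

```python
def boost_shortfall_mhz(samples_mhz, advertised_boost_mhz):
    """Return how far the best observed clock falls short of the advertised boost.

    Returns 0 when at least one sample reached (or exceeded) the advertised value.
    """
    if not samples_mhz:
        raise ValueError("need at least one clock sample")
    return max(0, advertised_boost_mhz - max(samples_mhz))


def meets_boost_spec(samples_mhz, advertised_boost_mhz, tolerance_mhz=0):
    """True if the peak observed clock is within tolerance of the advertised boost."""
    return boost_shortfall_mhz(samples_mhz, advertised_boost_mhz) <= tolerance_mhz


ADVERTISED_BOOST_MHZ = 4600  # 3900X advertised max boost

# Hypothetical single-thread clock samples, NOT real measurements.
before_update = [4550, 4600, 4575]
after_update = [4525, 4575, 4550]

print(meets_boost_spec(before_update, ADVERTISED_BOOST_MHZ))
print(meets_boost_spec(after_update, ADVERTISED_BOOST_MHZ))
print(boost_shortfall_mhz(after_update, ADVERTISED_BOOST_MHZ), "MHz short")
```

A check like this only catches the regression if it runs on the boards and workloads where the problem actually shows up, which is the other half of the disagreement here.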

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


4 hours ago, mr moose said:

I would have considered putting the whole thing under load and pushing it to be a more general test considering clock speeds have been an ongoing issue. 

True, but that may have been done and worked fine on the engineering board, or they may have just not done a single-core check since the original complaints were all about all-core loads. Who knows. We aren't dealing with a situation where 100% of chips failed to hit the boost clock.

 

There's also the classic problem of the test being too sterile and removing too many variables. AMD may have not actually checked it under Windows and used the chip verification methods, which is more accurate but.... less accurate to real world.

 

1.0.0.3AB from what I remember was the most recent AGESA you could get at release time and only from Gigabyte, this had the boost clock issues in it.

 

Edit:

Also these A, AB, ABA, ABB, ABBA suffixes are ridiculous. Like @RejZoR said there's 4 freaking number sections, damn well use them.
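For anyone who has to sort these strings in a script, a small sketch follows. It assumes the letter suffix orders the same way plain string comparison does (A before AB before ABA before ABB before ABBA), which matches the list above but is my guess at the scheme, not anything AMD has documented. The function name is made up.

```python
import re

# Four dot-separated numbers followed by an optional run of letters.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.(\d+)\s*([A-Za-z]*)$")


def agesa_sort_key(version: str):
    """Split a string like '1.0.0.3ABB' into ((1, 0, 0, 3), 'ABB') for sorting."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    *numbers, suffix = match.groups()
    # Assumption: suffixes compare lexicographically, so '' < 'A' < 'AB' < 'ABA' < 'ABB' < 'ABBA'.
    return tuple(int(n) for n in numbers), suffix.upper()


versions = ["1.0.0.3ABBA", "1.0.0.3AB", "1.0.0.2", "1.0.0.3ABB",
            "1.0.0.3", "1.0.0.3ABA", "1.0.0.3A"]
ordered = sorted(versions, key=agesa_sort_key)
print(ordered)
```

Compare that with a hypothetical 1.0.0.4 through 1.0.0.9, where comparing the four integers alone would be enough.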


Seems like the root of the problem is that AMD is rushing pretty much every Ryzen release out the door as fast as possible. As much as I like the aggressive approach, we could have avoided tons of issues (Ryzen 1000 memory incompatibilities, TR Windows scheduler bugs, etc) if they had given each iteration another month or two in the oven. 

 

Spoiler

To be clear, I know the scheduler problems didn't really show up until way later for TR, but it seems like the sort of thing that they could have picked up on with several weeks or months of intensive QC. 

 


I would like to address those saying "but AMD should've known this and never released it".  As some have mentioned in the thread, not everything gets tested all the time.  My career has mostly been as a Software/Hardware QA Engineer.  As an example at a prior company, I tested several areas of a particular product.  Each of those areas had a test matrix with things we would like to test.  To complete the SHORTEST of those matrixes fully would take 3 weeks of my time full time.  One of them was over 3 months (and yes, the majority of it was highly automated, including a lot of running in parallel).  Then on top of all of that, there was also months of performance testing.  Naturally, we went through ALL of that prior to the MAJOR release of a new product, and often multiple times when using it on various versions and other parts as well.  But that kind of time isn't always available for every little bug update.  So tests are tiered in such a way that "these mean it runs, these mean it functions in a way most people would be happy with, these are specific things we care about more right now, these are the full battery of tests".  For most of the AGESA style updates, one would normally only run the first 3, which would take FAR less time than the full battery of tests, to make sure you fixed an issue and didn't break anything major.  This is, of course, also how MS ends up shipping so many updates that break things in various configurations…so you're not wrong to say "it should be better"…but then you also wouldn't have gotten any updates until sometime mid next year at the earliest.
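The tiering described above can be sketched roughly like this. The tier names paraphrase the four levels in the post; the test names and which tier each one lands in are invented purely for illustration and say nothing about how AMD actually classifies its tests.

```python
from enum import IntEnum


class Tier(IntEnum):
    RUNS = 1            # "these mean it runs"
    FUNCTIONS = 2       # "functions in a way most people would be happy with"
    CURRENT_FOCUS = 3   # "specific things we care about more right now"
    FULL_BATTERY = 4    # "the full battery of tests"


def highest_tier_for(release_kind: str) -> Tier:
    """Major releases get everything; bug-fix updates stop at the first three tiers."""
    if release_kind == "major":
        return Tier.FULL_BATTERY
    if release_kind == "bugfix":
        return Tier.CURRENT_FOCUS
    raise ValueError(f"unknown release kind: {release_kind!r}")


def select_tests(catalog: dict, release_kind: str) -> list:
    """Return the names of all tests at or below the tier limit for this release."""
    limit = highest_tier_for(release_kind)
    return sorted(name for name, tier in catalog.items() if tier <= limit)


# Hypothetical catalog: names and tier assignments are made up.
catalog = {
    "boots_to_os": Tier.RUNS,
    "memory_training": Tier.FUNCTIONS,
    "idle_voltage_reporting": Tier.CURRENT_FOCUS,
    "single_core_max_boost": Tier.FULL_BATTERY,
}

print(select_tests(catalog, "bugfix"))
print(select_tests(catalog, "major"))
```

In this made-up catalog the boost check sits in the last tier, so a bug-fix run never touches it. Whether a check like that belongs in an earlier tier is exactly what the rest of the thread argues about.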


1 hour ago, justpoet said:

I would like to address those saying "but AMD should've known this and never released it".  As some have mentioned in the thread, not everything gets tested all the time.  My career has mostly been as a Software/Hardware QA Engineer.  As an example at a prior company, I tested several areas of a particular product.  Each of those areas had a test matrix with things we would like to test.  To complete the SHORTEST of those matrixes fully would take 3 weeks of my time full time.  One of them was over 3 months (and yes, the majority of it was highly automated, including a lot of running in parallel).  Then on top of all of that, there was also months of performance testing.  Naturally, we went through ALL of that prior to the MAJOR release of a new product, and often multiple times when using it on various versions and other parts as well.  But that kind of time isn't always available for every little bug update.  So tests are tiered in such a way that "these mean it runs, these mean it functions in a way most people would be happy with, these are specific things we care about more right now, these are the full battery of tests".  For most of the AGESA style updates, one would normally only run the first 3, which would take FAR less time than the full battery of tests, to make sure you fixed an issue and didn't break anything major.  This is, of course, also how MS ends up shipping so many updates that break things in various configurations…so you're not wrong to say "it should be better"…but then you also wouldn't have gotten any updates until sometime mid next year at the earliest.

That's all good and fine, except when do you not test a product after updates to ensure it still does all the main things? I mean, people are constantly hanging shit on MS because updates sometimes cause issues; are the people in this thread trying to excuse the lack of testing going to give MS a free pass as well? I mean, if AMD don't have to test everything then neither should Intel or MS or Apple.

 

For the record I am not saying AMD = bad, I am just saying we shouldn't be too quick to praise them unquestioningly for shortcomings that may have been because they were rushing/under the pump/whatever.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


2 hours ago, mr moose said:

That's all good and fine, except when do you not test a product after updates to ensure it still does all the main things? I mean, people are constantly hanging shit on MS because updates sometimes cause issues; are the people in this thread trying to excuse the lack of testing going to give MS a free pass as well? I mean, if AMD don't have to test everything then neither should Intel or MS or Apple.

 

For the record I am not saying AMD = bad, I am just saying we shouldn't be too quick to praise them unquestioningly for shortcomings that may have been because they were rushing/under the pump/whatever.

My point is not to praise any of them, but to suggest that something like a processor that is more complex than software is more likely to have something slip by.  Especially when it has been shown to not affect all the same way, and not even all motherboards, nor overall functionality.  That's kind of the definition of compatibility or performance edge case, which wouldn't be part of the core testing matrix (unlike, say, when MS's update decided to fail if any USB devices were present).  Since AMD makes the chipset, they probably test most things on a reference board implementation that is their own, rather than on any of the usual board partners, to make sure they don't miss something due to an oddity in a specific board partner implementation they didn't know about.


30 minutes ago, justpoet said:

Since AMD makes the chipset, they probably test most things on a reference board implementation that is their own, rather than on any of the usual board partners, to make sure they don't miss something due to an oddity in a specific board partner implementation they didn't know about.

One of the AGESA releases actually states testing was done on an AMD engineering board.


1 hour ago, justpoet said:

My point is not to praise any of them, but to suggest that something like a processor that is more complex than software is more likely to have something slip by.

You mean like this:

 

On 9/14/2019 at 7:42 AM, mr moose said:

 

And on top of that, CPUs are an extremely complicated product with literally thousands of variables that need to be considered during design and manufacture. It seems petulant at best when the tech community (especially those who are effectively journalists running YouTube channels) spends more time putting a personal spin on this shit rather than just addressing it for what it is. It doesn't matter if it's a failing of boost clocks or a security flaw, it is a similar situation that is either being addressed or it isn't.

 

With regard to this specific situation, it should just be accepted they are doing right by their customers for addressing it, but that doesn't excuse advertising something they would have tested and known not to be a thing.  

I really find it hard to believe they didn't put the CPU under a load and make sure it worked fine before releasing the latest Agesa,  regardless which board/system they used.  Especially considering nearly every AGESA update has had something to do with voltage, clocking or some other part that relates directly to the boost functions of the processor.

 

 

I am not saying you are wrong or anything like that, I'm just saying it seems way too big a part of the product to accept minimalist testing as an excuse.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


44 minutes ago, mr moose said:

I am not saying you are wrong or anything like that, I'm just saying it seems way too big a part of the product to accept minimalist testing as an excuse.

I would suggest they're NOT doing minimalist testing, just not every test, or you wouldn't see an update until mid next year.


17 minutes ago, justpoet said:

I would suggest they're NOT doing minimalist testing, just not every test, or you wouldn't see an update until mid next year.

I would call it minimalist.  I could understand the argument if it was an obscure feature like wake on lan or some minute networking protocol bug, but boost clocks is a major part of its design function and marketing.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


3 hours ago, mr moose said:

but that doesn't excuse advertising something they would have tested and known not to be a thing.  

That ignores the part where I explained that testing and product spec validation was not carried out on 1.0.0.3, I know I already covered this but that bit is actually important. They advertised what they knew and validated and created all the material based on that which is not 1.0.0.3 or any variant of it. These bugs were introduced after all the product specs were finalized and communicated to vendor partners.

 

That's the difference in saying AMD knowingly advertised a product they knew could not achieve what they said, they advertised exactly what was validated and known to be achievable which is how they created the product stack in the first place.

 

Sure, later revisions of AGESA ended up making the product not reach its advertised capabilities, which were known to be achievable, but that does not mean AMD was advertising the product knowing it could not do as advertised; quite the opposite in fact. Fixing the bug to bring it back in line with product spec wouldn't have been possible otherwise, and AMD wouldn't have even known if it were possible.

 

Later events don't apply backwards to product specifications: a bug introduced after the fact couldn't have been known and doesn't retroactively invalidate what was tested at the time. You can say AMD screwed up and introduced bugs that made the product unable to achieve its advertised capability, but that in no way means AMD was advertising the product with that knowledge. You fix the bug, not change the product spec, unless it's impossible to fix, which would be a far bigger issue, i.e. a product recall.

 

Edit:

Product specs were publicly announced back in May and I don't remember when pre-orders opened but well before July 7th. AGESA 1.0.0.3 did not exist back in May.


2 minutes ago, leadeater said:

That ignores the part where I explained that testing and product spec validation was not carried out on 1.0.0.3, I know I already covered this but that bit is actually important. They advertised what they knew and validated and created all the material based on that which is not 1.0.0.3 or any variant of it. These bugs were introduced after all the product specs were finalized and communicated to vendor partners.

 

That's the difference in saying AMD knowingly advertised a product they knew could not achieve what they said, they advertised exactly what was validated and known to be achievable which is how they created the product stack in the first place.

 

Sure, later revisions of AGESA ended up making the product not reach its advertised capabilities, which were known to be achievable, but that does not mean AMD was advertising the product knowing it could not do as advertised; quite the opposite in fact. Fixing the bug to bring it back in line with product spec wouldn't have been possible otherwise, and AMD wouldn't have even known if it were possible.

 

Later events don't apply backwards to product specifications: a bug introduced after the fact couldn't have been known and doesn't retroactively invalidate what was tested at the time. You can say AMD screwed up and introduced bugs that made the product unable to achieve its advertised capability, but that in no way means AMD was advertising the product with that knowledge. You fix the bug, not change the product spec.

My point is I am having trouble accepting that when they released the 1003abb that was designed to fix erroneous reporting/requesting for over voltage and clock speeds in monitoring software that they didn't push the CPU to boost as part of the testing.   That's like fixing the brakes on a car but not testing if they work at speeds beyond rolling it out of the garage.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


Just now, mr moose said:

My point is I am having trouble accepting that when they released the 1003abb that was designed to fix erroneous reporting/requesting for over voltage and clock speeds in monitoring software that they didn't push the CPU to boost as part of the testing.   That's like fixing the brakes on a car but not testing if they work at speeds beyond rolling it out of the garage.

Like I said the clock issue pre-dates 1.0.0.3ABB.


2 minutes ago, leadeater said:

Like I said the clock issue pre-dates 1.0.0.3ABB.

But 1003abb addressed clock and voltage issues, so why wasn't it tested?  Either it was tested and they pushed it out knowing the CPU's would boost properly or they didn't test it after tinkering with it and we are trying to say that's o.k because boost clocks are not a major part of voltage or clock reporting in monitoring software.  And this is ignoring the fact that they know full well every tester and reviewer on the planet is going to do a before and after benchmark with the new Agesa to see what changes.

 

 

 

 

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


10 minutes ago, mr moose said:

But 1003abb addressed clock and voltage issues, so why wasn't it tested?  Either it was tested and they pushed it out knowing the CPU's would boost properly or they didn't test it after tinkering with it and we are trying to say that's o.k because boost clocks are not a major part of voltage or clock reporting in monitoring software.  And this is ignoring the fact that they know full well every tester and reviewer on the planet is going to do a before and after benchmark with the new Agesa to see what changes.

1.0.0.3ABB addressed idle voltages and idle clock reporting, along with Destiny 2 and WHEA-Logger warnings; it wasn't made to address any boost clock problems. Power state voltages and clocks weren't changed.

 

As for reviewers, the official reviewer guide instructed them to use AGESA 1.0.0.2, which would have been a terrible idea; AnandTech did, so go look at that mess. Gamers Nexus was one of the few that conducted the review using 1.0.0.3AB, and that still had the boost clock reduction issue. It was impossible to do the reviews using 1.0.0.3ABB anyway since it did not exist at that time.

