AMD, threads on threads on threads?

ShawnTD

First time news poster here, so sorry in advance if I get this wrong.

 

Just ran across this on wccftech (I know, I know), so take it with large amounts of salt and whatnot.

 

Apparently the new Zen 3 architecture from AMD is rumored to have SMT4 (4 threads per core), as stated in the article here:

https://wccftech.com/rumor-amd-zen-3-architecture-to-support-up-to-4-threads-per-core-with-smt4-feature/

 

After digging into the wccftech article, it looks like they are getting their info from the German website Hardwareluxx:

https://www.hardwareluxx.de/index.php/news/hardware/prozessoren/50914-geruechtekueche-zen-3-mit-smt4-und-neue-navi-und-turing-einsteigerkarten.html

 

The article says-

Quote

Rumor has it, that AMD’s next generation CPU microarchitecture is going to have a brand new feature called SMT4, and as the name implies it’s a simultaneous multi-threading feature. Whilst the company’s Zen 2 core does improve on the SMT capability of the original Zen architecture, which was the company’s first ever design to feature SMT, Zen 3 is said to make a giant leap forward by doubling the execution thread count per core from two to four.

 

 

 

While I can't say I know very much about microarchitecture, or whether this is plausible or possible anytime soon, to me it seems like this might be the beginning of the next "core wars"?

 

 


given all the vulnerabilities that relate to SMT/Hyperthreading, this may not be the best idea.

🌲🌲🌲

 

 

 

◒ ◒ 


If the rumor is true, Microsoft needs to do something with the Task Manager CPU thread graphs. We're coming to a point where that menu is becoming unreadable. Imagine a new 64-core Zen 3 with 256 threads. OMFG. I wonder how this will play out in the consumer segment and in things like laptops. 2-core, 8-thread Ryzen 3 chips? Hm.

 

Also, Microsoft will have to address the core count limits for Windows editions. I forgot what they were exactly, but they'll have to raise that because of how AMD is pumping out core counts...
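For what it's worth, the thread counts being thrown around here are just core count multiplied by SMT width. A quick Python sketch of that arithmetic (the function name `logical_threads` is made up for illustration, and SMT4 itself is still only a rumor):

```python
def logical_threads(cores: int, smt_width: int) -> int:
    """Return the number of hardware threads the OS would see.

    smt_width is threads per core: 1 = SMT off, 2 = today's SMT2,
    4 = the rumored SMT4 (an assumption, not a confirmed feature).
    """
    if cores < 1 or smt_width < 1:
        raise ValueError("cores and smt_width must be positive")
    return cores * smt_width


if __name__ == "__main__":
    # Compare the same core counts at SMT2 and at the rumored SMT4.
    for cores in (2, 8, 64):
        print(f"{cores} cores: "
              f"{logical_threads(cores, 2)} threads with SMT2, "
              f"{logical_threads(cores, 4)} threads with SMT4")
```

So a 64-core part would show 256 graphs in Task Manager, and a 2-core laptop chip would show 8.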


13 minutes ago, VegetableStu said:

seriously though, what are the caveats for going 4t per core? o_o didn't the xeon phi(??) do something like that too?

I'd imagine it would depend greatly on the architecture itself, in terms of the number of computation units per core and the efficiency of the pipeline, as you would now have double the threads vying for the same underlying resources.


29 minutes ago, KarathKasun said:

4-way SMT is only going to be for servers/datacenters.

[Citation Required]

 

Main Rig:-

Ryzen 7 3800X | Asus ROG Strix X570-F Gaming | 16GB Team Group Dark Pro 3600Mhz | Corsair MP600 1TB PCIe Gen 4 | Sapphire 5700 XT Pulse | Corsair H115i Platinum | WD Black 1TB | WD Green 4TB | EVGA SuperNOVA G3 650W | Asus TUF GT501 | Samsung C27HG70 1440p 144hz HDR FreeSync 2 | Ubuntu 20.04.2 LTS |

 

Server:-

Intel NUC running Server 2019 + Synology DSM218+ with 2 x 4TB Toshiba NAS Ready HDDs (RAID0)


2 minutes ago, Master Disaster said:

[Citation Required]

 

I am the citation.  SMT hurts lots of consumer tasks like gaming because it adds latency to the pipeline for each thread.  This is why you can get better/more stable FPS with SMT off.

 

It helps server workloads because they are not as sensitive to the extra latency because they are generally not doing real-time tasks.


1 minute ago, KarathKasun said:

I am the citation.  SMT hurts lots of consumer tasks like gaming because it adds latency to the pipeline for each thread.  This is why you can get better/more stable FPS with SMT off.

Then you'll forgive me for calling bollocks. Until I hear it from AMD I'm remaining open to either possibility.

 



Do you think we will see 2- and 4-core chips coming back in 2020?

Like Intel's dual-core Pentium, i7, etc.

Maybe it's a solution for mobile devices.


1 minute ago, Master Disaster said:

Then you'll forgive me for calling bollocks. Until I hear it from AMD I'm remaining open to either possibility.

 

8-way SMT has been done already.  Sorry, you aren't getting 4-way SMT without Epyc or some future TR platform.


1 hour ago, VegetableStu said:

seriously though, what are the caveats for going 4t per core? o_o didn't the xeon phi(??) do something like that too?

Without direct experience we can try to extrapolate from existing SMT implementations. The practical result from having SMT compared to not is higher throughput from the core, but per thread the execution rate is less. Current Ryzen cores have a lot of execution potential, that can't be fully utilised by a single thread. If the core design were to get even more potential, more threads could help to unlock more of that. If your CPU use relies on throughput and not getting a single thread through ASAP, it could be a benefit. I think it could be a tough sell if it appeared in consumer space though.
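A toy model of that trade-off, in Python. It assumes, purely for illustration, that a single thread can only use a fixed slice of the core's execution capacity and that co-running threads split whatever the core can deliver evenly. The function name and all the numbers are invented, not measurements of any real CPU.

```python
def smt_model(core_capacity: float, single_thread_demand: float, threads: int):
    """Toy model of one SMT core. Returns (core_throughput, per_thread_rate).

    core_capacity        -- work units per cycle the core could retire (assumed)
    single_thread_demand -- work units per cycle one thread can use alone (assumed)
    threads              -- number of hardware threads active on the core
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    total_demand = single_thread_demand * threads
    # The core cannot deliver more than its capacity, however many threads ask.
    core_throughput = min(core_capacity, total_demand)
    # Assume the active threads share the delivered throughput evenly.
    per_thread_rate = core_throughput / threads
    return core_throughput, per_thread_rate


if __name__ == "__main__":
    for capacity in (4.0, 8.0):          # a "current" core and a hypothetical wider one
        for threads in (1, 2, 4):
            total, each = smt_model(capacity, 2.5, threads)
            print(f"capacity={capacity} threads={threads} "
                  f"core throughput={total} per-thread rate={each}")
```

With the made-up capacity of 4.0 and a per-thread demand of 2.5, going from one thread to two raises core throughput from 2.5 to 4.0 while each thread drops from 2.5 to 2.0. Four threads only help further if the core itself gets wider, which is the "more potential" case above.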

Main system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, Corsair Vengeance Pro 3200 3x 16GB 2R, RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


So taking a leaf out of IBM's book then?

41 minutes ago, KarathKasun said:

I am the citation.  SMT hurts lots of consumer tasks like gaming because it adds latency to the pipeline for each thread.  This is why you can get better/more stable FPS with SMT off.

 

It helps server workloads because they are not as sensitive to the extra latency because they are generally not doing real-time tasks.

*In some older, edge case games and programs

"We also blind small animals with cosmetics.
We do not sell cosmetics. We just blind animals."

 

"Please don't mistake us for Equifax. Those fuckers are evil"

 

This PSA brought to you by Equifacks.
PMSL


3 minutes ago, Dabombinable said:

So taking a leaf out of IBM's book then?

*In some older, edge case games and programs

No, across the board.  If you have the cores for what you are doing, SMT reduces performance.


2 minutes ago, KarathKasun said:

No, across the board.  If you have the cores for what you are doing, SMT reduces performance.

Whenever I've seen testing, comparisons and reviews done, that hasn't been the case. The biggest difference I have seen, BTW, was back in the days of the Pentium 4 HT through the 600 series. And even then, only a few examples had a severe performance drop-off. Some of those also had issues later on with multiple cores or CPUs (in the case of my dual Pentium III rig).



1 hour ago, KarathKasun said:

That's first gen Ryzen... and AMD's first implementation of SMT, period. What you're seeing is the same as what Intel went through with the P4. You certainly don't see that with Ryzen+ and Ryzen 2. Not only that, that's with games that are still optimised for Intel's ring bus based CPUs.

Just look at Intel's results. You can't guess/hypothesise the SMT performance of AMD's next Ryzen iteration from CPUs that are one generation and one refresh old.



They'd build either SMT3 or SMT4 into the cores, then disable part of it for Desktop and TR. There's a good thought we'd see SMT2 on Desktop and SMT3 on TR. The rumors about SMT3 or 4 on Zen3 have been going around for at least a year, and we're already hearing that AMD is showing off early Milan (Zen3 Epyc) samples.


4 hours ago, Arika S said:

given all the vulnerabilities that relate to SMT/Hyperthreading, this may not be the best idea.

This thread is about AMD, though? Seriously though, they have been able to keep security problems to a minimum, so it doesn't seem like a problem for them.

3 hours ago, KarathKasun said:

I am the citation.  SMT hurts lots of consumer tasks like gaming because it adds latency to the pipeline for each thread.  This is why you can get better/more stable FPS with SMT off.

 

It helps server workloads because they are not as sensitive to the extra latency because they are generally not doing real-time tasks.

That's no longer the case. In many games it's quite rare to see performance go down, though I agree that even more SMT would not improve things much, as there is a good chance tasks would pile onto a single core instead of spreading out, except on Linux, as it has good schedulers.


Sounds like a hyperscaler request for VM hosting, to move more of the thread scheduling down to the hardware layer. CPU thread allocation gets tricky with lots of VMs or VMs with many virtual cores, and the more CPU threads that exist, the easier it is for the hypervisor to give CPU time to a VM. This won't increase performance in many cases; rather, it will stop performance being lost to inefficiencies and co-stop, where the VM is effectively micro-stunned while waiting for concurrent access to multiple CPU threads.

 

It's pretty easy to cripple a VM host, even one with the most high-end CPUs, by over-allocating virtual CPUs (too many or too wide), to the point where even the most basic PC will outperform the VMs running on it.

 

SMT4 might just be able to double the number of VMs per host, but it will do no good for highly active VMs demanding lots of CPU time using all the execution resources in the CPU core itself.


3 hours ago, Dabombinable said:

Whenever I've seen testing, comparisons and reviews done, that hasn't been the case. The biggest difference I have seen, BTW, was back in the days of the Pentium 4 HT through the 600 series. And even then, only a few examples had a severe performance drop-off. Some of those also had issues later on with multiple cores or CPUs (in the case of my dual Pentium III rig).

Speed =/= acceleration.

One of you is talking about throughput, the other latency. It's a sliding scale. And (all else being equal) a series system is slower than a parallel due to the limits of math and physics.

Changes in IPC/architecture elsewhere (thus not all things being equal) mean that additional thread/workload types have "improved" with SMT etc. as time has progressed, but that's because the rest of the pipeline has also improved. If we went back and added SMT to those older CPUs, they would perform slower* (only slightly) with it on, and faster* with it off. Likewise, if AMD changed the design to have no SMT at all, then it would be faster (by a tiny amount) than with SMT. But yes, this also only applies to single thread vs multi thread workloads. Which is another can of worms.

 

 

*Of course, only in single threaded workloads. In multithreaded it would be the opposite. But therein lies the thing, it cannot be "both" by definition. ;)


That would be quite a change, though maybe it won't be on mainstream platform but potentially Threadripper and definitely Epyc tho. 

| Ryzen 7 7800X3D | AM5 B650 Aorus Elite AX | G.Skill Trident Z5 Neo RGB DDR5 32GB 6000MHz C30 | Sapphire PULSE Radeon RX 7900 XTX | Samsung 990 PRO 1TB with heatsink | Arctic Liquid Freezer II 360 | Seasonic Focus GX-850 | Lian Li Lanccool III | Mousepad: Skypad 3.0 XL / Zowie GTF-X | Mouse: Zowie S1-C | Keyboard: Ducky One 3 TKL (Cherry MX-Speed-Silver)Beyerdynamic MMX 300 (2nd Gen) | Acer XV272U | OS: Windows 11 |


There is also adaptive SMT, which is used on basically all modern SMT CPUs. Basically, the CPU tries to utilize as many real cores as possible before requesting additional "virtual" threads. This way you see no real performance degradation in games these days, and when you do need raw compute it'll fire up everything it has.
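A rough Python sketch of that "fill the real cores first" idea. As far as I know this kind of placement is generally handled by the OS scheduler rather than by the CPU on its own, and the function below (`place_threads`, a made-up name) is only an illustration of the ordering, not any real scheduler's algorithm.

```python
def place_threads(n_threads: int, cores: int, smt_width: int):
    """Assign software threads to (core, smt_slot) pairs.

    Every physical core gets one thread (slot 0) before any core is asked
    to run a second one (slot 1), and so on up to smt_width slots per core.
    """
    if cores < 1 or smt_width < 1:
        raise ValueError("cores and smt_width must be positive")
    if n_threads < 0 or n_threads > cores * smt_width:
        raise ValueError("thread count does not fit the hardware thread slots")
    placement = []
    for i in range(n_threads):
        # i // cores picks the SMT slot, i % cores picks the physical core.
        slot, core = divmod(i, cores)
        placement.append((core, slot))
    return placement


if __name__ == "__main__":
    # A game with 6 busy threads on a hypothetical 4-core SMT2 chip.
    for core, slot in place_threads(6, cores=4, smt_width=2):
        print(f"thread -> core {core}, SMT slot {slot}")
```

With 4 threads on 4 cores nothing shares a core, so SMT costs nothing; only the fifth and sixth threads end up as SMT siblings.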


I feel like this would have diminishing returns to the point where it wouldn't be much of a difference.

"If a Lobster is a fish because it moves by jumping, then a kangaroo is a bird" - Admiral Paulo de Castro Moreira da Silva

"There is nothing more difficult than fixing something that isn't all the way broken yet." - Author Unknown

Spoiler

Intel Core i7-3960X @ 4.6 GHz - Asus P9X79WS/IPMI - 12GB DDR3-1600 quad-channel - EVGA GTX 1080ti SC - Fractal Design Define R5 - 500GB Crucial MX200 - NH-D15 - Logitech G710+ - Mionix Naos 7000 - Sennheiser PC350 w/Topping VX-1


4 hours ago, bcredeur97 said:

I feel like this would have diminishing returns to the point where it wouldn't be much of a difference.

Which is why it's unlikely we'll see it on desktop. However, IBM's Power architecture already uses up to SMT8 ( https://en.wikipedia.org/wiki/POWER9 ) in certain die configurations. It has a very real use, especially with how "wide" Zen already is, but it's not for desktop. We might get SMT3 on desktop at some point, just because wider is always a way to go with a core design, especially after the next jump in L1 and L2 cache sizes.


Too bad most games are still not using more than 4 threads.

Specs: Motherboard: Asus X470-PLUS TUF gaming (Yes I know it's poor but I wasn't informed) RAM: Corsair VENGEANCE® LPX DDR4 3200Mhz CL16-18-18-36 2x8GB

            CPU: Ryzen 9 5900X          Case: Antec P8     PSU: Corsair RM850x                        Cooler: Antec K240 with two Noctura Industrial PPC 3000 PWM

            Drives: Samsung 970 EVO plus 250GB, Micron 1100 2TB, Seagate ST4000DM000/1F2168 GPU: EVGA RTX 2080 ti Black edition
