
Windows To Blame For Poor AMD ThreadRipper Performance

BuckGup
4 minutes ago, GoldenLag said:

AMD probably found issues and reported them to Microsoft, and as of today nothing has happened. We have known about this being an issue since launch, and Microsoft has probably known for longer than us.

Probably, maybe?!

 

Hi there guys, we launched a new CPU with these awesome features, but don't use it on Windows ..

Applause all around


1 minute ago, bitsandpieces said:

Probably, maybe?!

 

Hi there guys, we launched a new CPU with these awesome features, but don't use it on Windows ..

Applause all around

See @leadeater's response above. There is more to an issue than: "we found the reason behind a bug we only properly discovered 3 months after launch, therefore it should have been fixed at launch".


12 hours ago, BuckGup said:

Right but the cause was unknown

You need to pay more $$ per core, that's why it capped out. You're dealing with Microshit here xD


3 hours ago, bitsandpieces said:

Probably, maybe?!

 

Hi there guys, we launched a new CPU with these awesome features, but don't use it on Windows ..

Applause all around

Ahhh, I was wondering what the Windows apologists were gonna come up with this time. So AMD is responsible for the flawed way Windows handles lots of cores? There's a laugh. These problems apply to Intel CPUs too, but if you had actually done ANY amount of research you would have known that.

 

 

i7 2600k @ 5GHz 1.49v - EVGA GTX 1070 ACX 3.0 - 16GB DDR3 2000MHz Corsair Vengence

Asus p8z77-v lk - 480GB Samsung 870 EVO w/ W10 LTSC - 2x1TB HDD storage - 240GB SATA SSD w/ W7 - EVGA 650w 80+G G2

3x 1080p 60hz Viewsonic LCDs, 1 glorious Dell CRT running at anywhere from 60hz to 120hz

Model M w/ Soarer's adapter - Logitch g502 - Audio-Techinca M20X - Cambridge SoundWorks speakers w/ woofer

 


11 minutes ago, GoldenLag said:

AMD probably found issues and reported them to Microsoft, and as of today nothing has happened. We have known about this being an issue since launch, and Microsoft has probably known for longer than us.

It is likely that behind the scenes MS are working on this even if they don't give public status updates of it.

Main system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, Corsair Vengeance Pro 3200 3x 16GB 2R, RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


I would say we've known about this issue for a while, including some patches:

 

PLEASE QUOTE ME IF YOU ARE REPLYING TO ME

Desktop Build: Ryzen 7 2700X @ 4.0GHz, AsRock Fatal1ty X370 Professional Gaming, 48GB Corsair DDR4 @ 3000MHz, RX5700 XT 8GB Sapphire Nitro+, Benq XL2730 1440p 144Hz FS

Retro Build: Intel Pentium III @ 500 MHz, Dell Optiplex G1 Full AT Tower, 768MB SDRAM @ 133MHz, Integrated Graphics, Generic 1024x768 60Hz Monitor


 


1 minute ago, porina said:

It is likely that behind the scenes MS are working on this even if they don't give public status updates of it.

Very likely, seeing as high thread counts and chiplet designs are starting to become the norm, with even Intel looking at creating such chips in the near-ish future.

 

Edit: Also, the scheduler issue might be deeply integrated into Windows, which might make it difficult to replace quickly.


1 minute ago, GoldenLag said:

Very likely, seeing as high thread counts and chiplet designs are starting to become the norm, with even Intel looking at creating such chips in the near-ish future.

Enthusiasts also need to get on with understanding it... I've had various dual-socket systems so it's not new to me, but it is ever more accessible to more people. Not quite mass mainstream yet, but getting closer...



Just now, porina said:

Enthusiasts also need to get on with understanding it... I've had various dual-socket systems so it's not new to me, but it is ever more accessible to more people. Not quite mass mainstream yet, but getting closer...

Wasn't the "dual-socket support" on Threadripper mentioned in a leak? The Ryzen 3000 leaks, IIRC. Don't quote me on that, as I'm pretty unsure about that one.


40 minutes ago, GoldenLag said:

 

I get the vibe he doesn't have much of an overview of the subject. Software adapts to hardware, not hardware to software.

It's like crying that Nvidia is at fault because Valve games don't support raytracing.


Hm... yes, this shows me once again that Windows just isn't very good. I have moved to Linux for 90% of my work and even gaming, thanks to Level1Techs. The only real reason for using Windows is when I want to record the occasional video and edit it (video production is not very good on Linux)... That's it.

But I have actual reasons for not liking Windows. The updates are often buggy and unfinished. Microsoft installs beta software on the PCs of people who get updates manually (such as myself, because I like to install updates when I'm ready to actually install them), even though the Insider Program is obviously what's responsible for testing unfinished software. The OS is buggy and inconsistent, the new Settings app is just abysmal, and, the best part, a Windows license can only be used a finite number of times to reactivate Windows after a fresh install. That is just madness. I know this can be mitigated by tying the license to your MS account. Another thing that makes no sense whatsoever is that the physical Office licenses you find in stores cannot be used to renew an active subscription; they can only be used to activate a new subscription.

So yes I have had my fair share of annoyances with Microsoft and their products.  


Just now, chazragg said:

It's like crying that Nvidia is at fault because Valve games don't support raytracing.

I was going to make a similar comparison if it continued.


2 minutes ago, porina said:

Enthusiasts also need to get on with understanding it... I've had various dual-socket systems so it's not new to me, but it is ever more accessible to more people. Not quite mass mainstream yet, but getting closer...

Long, long, long time ago I used to run this at home. Super old HP/Compaq ML570 Quad Xeon 700 MHz (the Intel slot CPU generation).

[Image: front view of the ML570]

 

Absolute tank of a server, slow as shit, at least by the standard of the time when I had it lol. Great to work on though, hotswap everything including PCI-X cards (you can see the top access panel for them in the pic).


15 minutes ago, GoldenLag said:

Wasn't the "dual-socket support" on Threadripper mentioned in a leak? The Ryzen 3000 leaks, IIRC. Don't quote me on that, as I'm pretty unsure about that one.

My reference to dual socket was more about NUMA in general. I wasn't aware of talk of TR going multi-socket.

10 minutes ago, leadeater said:

Long, long, long time ago I used to run this at home. Super old HP/Compaq ML570 Quad Xeon 700 MHz (the Intel slot CPU generation).

We might be of similar vintage. My first multi-socket system was an Abit BP6 with a pair of Celerons in it. After that came a dual Xeon (can't remember what the mobo was), so I had 4 threads in 2004! I only know the date because it had a 6800 Ultra in it. Yes, that one. I actually did a swap with a friend for his 6800 GT just to get rid of the noise. Still, I think in those days multi-threaded software wasn't much of a thing, as normal consumers at best had 2 threads with a P4 HT; dual core wasn't a thing yet.



Just now, porina said:

My reference to dual socket was more about NUMA in general. I wasn't aware of talk of TR going multi-socket.

I don't think it's going multi-socket either, but I think it was mentioned in one of the Ryzen 3000 leaks. Which may or may not be true.


8 minutes ago, VegetableStu said:

I don't own one right now, but does windows see threadripper as 2 or 4 NUMA nodes? o_o

Don't know if it answers your question or not, but two CCXs come under one NUMA node on the AM4 CPUs.



4 minutes ago, VegetableStu said:

I don't own one right now, but does windows see threadripper as 2 (or 4) NUMA nodes? o_o

Depends on the model. All TR WX CPUs operate exclusively in NUMA mode; the smaller core count ones can operate in UMA mode or NUMA mode, and you can change it. So for the ones operating in NUMA mode you will have either 2 NUMA nodes or 4 NUMA nodes, based on the number of active dies: WX = 4 and non-WX = 2.

 

[Image: taskmanager.png]


32 minutes ago, VegetableStu said:

I don't own one right now, but does windows see threadripper as 2 (or 4) NUMA nodes? o_o

 

23 minutes ago, porina said:

Don't know if it answers your question or not, but two CCXs come under one NUMA node on the AM4 CPUs.

 

It's important to remember that NUMA and UMA are all about memory, not actually about CPU cores or even specifically sockets. Wherever there is a memory controller with memory attached to it, that is a NUMA node, or more correctly could be one. Edit: in relation to the CPU.

 

UMA = Uniform Memory Access

NUMA = Non-Uniform Memory Access

 

NUMA allows the OS and other subsystems to know which memory regions are local or closest to which CPU/cores/socket, and to place memory pages in optimal locations.

 

In the good/bad old days of North Bridge and South Bridge, with the memory controller sitting in the North Bridge behind the Front Side Bus, multi-socket systems were still a single NUMA node, or UMA. All CPUs connected to the FSB and had equal access to memory: equally congested, slow, high-latency access. On-die memory controllers actually introduced the NUMA problem. Fix one problem, create another.
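
For anyone who wants to see the "which memory belongs to which cores" mapping for themselves, here is a minimal sketch in Python. It is Linux-only (it reads the kernel's sysfs tree under /sys/devices/system/node, so it says nothing about how Windows reports things), and the function names `parse_cpulist` and `read_numa_nodes` are made up for this example.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # a node without CPUs has an empty cpulist
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_nodes(root="/sys/devices/system/node"):
    """Return {node_id: {"cpus": [...], "mem_kb": int}} from a sysfs-style tree."""
    nodes = {}
    for node_dir in Path(root).glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        cpus = parse_cpulist((node_dir / "cpulist").read_text())
        mem_kb = 0
        # meminfo lines look like: "Node 0 MemTotal:  65960132 kB"
        for line in (node_dir / "meminfo").read_text().splitlines():
            fields = line.split()
            if len(fields) >= 4 and fields[2] == "MemTotal:":
                mem_kb = int(fields[3])
        nodes[node_id] = {"cpus": cpus, "mem_kb": mem_kb}
    return dict(sorted(nodes.items()))


if __name__ == "__main__":
    for node_id, info in read_numa_nodes().items():
        print(f"node {node_id}: {len(info['cpus'])} CPUs, {info['mem_kb']} kB")
```

On a UMA system this should print a single node holding all the memory; a node that has CPUs but 0 kB is a group of cores with no local memory controller.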


5 hours ago, GoldenLag said:

Would be kinda funny if Microsoft never fixed this and it ended up being the end of Windows, as it can't handle the thread war going forward.

Even funnier: AMD did an isekai comeback and toppled a huge company through an unknown weakness that no one was expecting.

I would love that so much. I really just want Windows to be phased out for Linux (I say as I keep using Windows 10 lol).

I spent $2500 on building my PC and all i do with it is play no games atm & watch anime at 1080p(finally) watch YT and write essays...  nothing, it just sits there collecting dust...

Builds:

The Toaster Project! Northern Bee!

 

The original LAN PC build log! (Old, dead and replaced by The Toaster Project & 5.0)

Spoiler

"Here is some advice that might have gotten lost somewhere along the way in your life. 

 

#1. Treat others as you would like to be treated.

#2. It's best to keep your mouth shut; and appear to be stupid, rather than open it and remove all doubt.

#3. There is nothing "wrong" with being wrong. Learning from a mistake can be more valuable than not making one in the first place.

 

Follow these simple rules in life, and I promise you, things magically get easier. " - MageTank 31-10-2016

 

 


5 minutes ago, leadeater said:

It's important to remember that NUMA and UMA are all about memory, not actually about CPU cores or even specifically sockets. Wherever there is a memory controller with memory attached to it, that is a NUMA node, or more correctly could be one.

Maybe that was what @VegetableStu was getting at: is the 4-die TR 2 or 4 NUMA nodes, assuming it is operating in NUMA mode? Two dies with memory controllers, two without.

I think we got into a discussion in another thread about how the memory was attached to a Ryzen die: one per CCX, or two shared between 2 CCXs. Looking forward to Rome, is that a single NUMA node regardless of the chiplets? That doesn't exclude the possibility, even necessity, that software may want to be aware of the chiplet structure, but that's kinda separate.



3 minutes ago, porina said:

Maybe that was what @VegetableStu was getting at: is the 4-die TR 2 or 4 NUMA nodes, assuming it is operating in NUMA mode? Two dies with memory controllers, two without.

I think we got into a discussion in another thread about how the memory was attached to a Ryzen die: one per CCX, or two shared between 2 CCXs. Looking forward to Rome, is that a single NUMA node regardless of the chiplets? That doesn't exclude the possibility, even necessity, that software may want to be aware of the chiplet structure, but that's kinda separate.

I also need to correct what I said, as it was only partly correct. It is about the memory, but the number of nodes relates to which groups of cores/dies/CPUs sit together and have the same/like access to memory, where that memory is, and whether there is any in that NUMA node. That's why TR WX is 4 nodes: 2 active memory controllers with memory attached, and 4 dies. The NUMA nodes describe the memory locality for the cores contained in that NUMA node, so you'd have two nodes with 1 local memory region and 1 remote memory region, and two nodes with 2 remote memory regions.

 

[Image: AMD Ryzen Threadripper 2990WX topology]

https://www.servethehome.com/amd-ryzen-threadripper-2990wx-benchmarks-and-review/

 

Above you can see NUMA node 0 has 63GB of memory and a group of cores, node 1 has no memory and a group of cores, node 2 has 63GB of memory and a group of cores, node 3 has no memory and a group of cores.

 

So while NUMA and UMA are about memory, and that is what is important, the NUMA nodes are all about the CPU and cores and how they relate to the memory in the system.
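
To make the local/remote bookkeeping concrete, here is a tiny Python sketch of the 2990WX layout described above. The per-node memory sizes are the ones quoted from the picture; the names `NODE_MEMORY_GB` and `memory_regions_seen_by` are just made up for illustration, and this is a toy model, not how any OS actually tracks it.

```python
# Memory attached to each NUMA node, in GB, as described for the 2990WX above.
NODE_MEMORY_GB = {0: 63, 1: 0, 2: 63, 3: 0}


def memory_regions_seen_by(node, node_memory=NODE_MEMORY_GB):
    """Split the populated memory regions into local and remote for one node."""
    local = [n for n, gb in node_memory.items() if gb > 0 and n == node]
    remote = [n for n, gb in node_memory.items() if gb > 0 and n != node]
    return {"local": local, "remote": remote}


for node in sorted(NODE_MEMORY_GB):
    view = memory_regions_seen_by(node)
    print(f"node {node}: local={view['local']} remote={view['remote']}")

# Prints:
# node 0: local=[0] remote=[2]
# node 1: local=[] remote=[0, 2]
# node 2: local=[2] remote=[0]
# node 3: local=[] remote=[0, 2]
```

That matches the description: nodes 0 and 2 each have 1 local and 1 remote region, while nodes 1 and 3 have 2 remote regions and nothing local.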

 

18 minutes ago, porina said:

Looking forward to Rome, is that a single NUMA node regardless of the chiplets? That doesn't exclude the possibility, even necessity, that software may want to be aware of the chiplet structure, but that's kinda separate.

From what I have seen of the chiplet design, with the I/O die having the memory controllers, I would say yes, potentially there will not be multiple NUMA nodes per CPU/package, but I still suspect there might be.

The inter-core paths are still different between on-die and off-die, and that's still actually important for cross-thread memory access, though again it might not matter because all dies are equal and all memory is remote, soo??????????????


I have yet to watch the vid (will do when I get home), but it's a problem with the scheduler and a Windows kernel bug, no?

| Ryzen 7 7800X3D | AM5 B650 Aorus Elite AX | G.Skill Trident Z5 Neo RGB DDR5 32GB 6000MHz C30 | Sapphire PULSE Radeon RX 7900 XTX | Samsung 990 PRO 1TB with heatsink | Arctic Liquid Freezer II 360 | Seasonic Focus GX-850 | Lian Li Lanccool III | Mousepad: Skypad 3.0 XL / Zowie GTF-X | Mouse: Zowie S1-C | Keyboard: Ducky One 3 TKL (Cherry MX-Speed-Silver)Beyerdynamic MMX 300 (2nd Gen) | Acer XV272U | OS: Windows 11 |


@leadeater Papermaster said that Rome is a UMA design. So, at least to the system memory, it's a unified device.

 

However, to the Scheduler, that isn't quite true. From a scheduler standpoint, a 64c Rome is kind of like an 8 core/128 thread SMT setup. IBM hasn't quite gotten to SMT16, but AMD is functionally there.


5 hours ago, leadeater said:

Also Windows knows when it is virtualized and acts different than on a physical bare metal install.

Hm, interesting. Is there any info available on it?


1 minute ago, Taf the Ghost said:

From a scheduler standpoint, a 64c Rome is kind of like an 8 core/128 thread SMT setup. IBM hasn't quite gotten to SMT16, but AMD is functionally there.

I'm not sure I'd describe it like that. There are still 64 real cores and the CPU will still expose that fact to the OS. If they are all in a single NUMA node, so functionally UMA, that makes it really simple for schedulers: everything is equal so no worries, mostly.

 

NUMA nodes still describe the locality of the cores to each other and the caches, see the above picture of the 2990WX.

 

If Rome actually is 1 NUMA node, not functionally but actually, then you would see the number of L3 caches as however many CCXs there are under a single NUMA node.
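
If you want to check the "several L3 caches under one NUMA node" picture on real hardware, a rough Python sketch like the one below counts distinct L3 caches. It assumes Linux and its sysfs cache directories (/sys/devices/system/cpu/cpuN/cache/indexM with `level` and `shared_cpu_list` files), so it does not show what Windows sees; `count_l3_domains` is a made-up helper name.

```python
from pathlib import Path


def count_l3_domains(root="/sys/devices/system/cpu"):
    """Count distinct L3 caches by grouping CPUs on their shared_cpu_list."""
    domains = set()
    for cache_dir in Path(root).glob("cpu[0-9]*/cache/index*"):
        level_file = cache_dir / "level"
        shared_file = cache_dir / "shared_cpu_list"
        if not (level_file.exists() and shared_file.exists()):
            continue
        # Only level-3 caches matter here; CPUs that share one L3
        # report an identical shared_cpu_list string.
        if level_file.read_text().strip() == "3":
            domains.add(shared_file.read_text().strip())
    return len(domains)


if __name__ == "__main__":
    print(f"distinct L3 caches: {count_l3_domains()}")
```

Comparing that count with the number of NUMA nodes tells you whether several L3 domains sit under a single node.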
