
AMD might announce their new CPU lineup on Tuesday

IAmAndre
10 hours ago, Taf the Ghost said:

Well, there is still a penalty for moving from CCX to CCX, so the Domain would act as 3 other "regional" CCXs. Same with the Epyc or Threadripper now. It doesn't matter too much in the massive parallel environments of servers, but it will on things like Threadripper. Though I did originally think they were going to act as a 4-way NUMA node rather than a fully independent memory controller.

I don't think it's going to matter too much, because a lot of the time that hand-off between cores ends up being done through system memory, though any parallel compute tasks that have inter-core/cache dependencies should suffer a bit (or a lot) when going above the number of threads per chiplet/cluster. I'd like some more info on how those chiplet pairs are wired.


3 minutes ago, 2Buck said:

It's not just the 2990WX though, Linux also handles the 7980XE a lot better. This applies to any high core count solutions AFAIK. Yes, the 2990WX makes the situation more exaggerated due to its unusual design, but it's definitely not just the 2990WX.

Yes, I have seen that video, and I can almost guarantee that it is an application optimization issue, not Windows itself. Quad-socket servers running MSSQL have been around for a long time; it is highly optimized and happens to be one of the fastest DB engines around. Then you have other applications in that video that run faster on Windows than on Linux too.

 

Just because you see an application running poorly under an OS doesn't mean it's the OS that is causing it.


3 hours ago, M.Yurizaki said:

ECS kind of did this:

ecspf88_SIMAa9s_both.jpg

 

Windows' problem is mostly how it schedules work, which can be fixed with a kernel patch if your processor really needs special considerations.

that neon green PATA port though ?

Current Build: SD-DESK-07

 

Case: Bitfenix Prodigy // PSU: SeaSonic SS-650RM // Motherboard: P8Z77-I DELUXE // CPU: Intel Core i5 3570k // Cooler: Corsair H80i // RAM: Patriot Intel Extreme Masters 2X8GB DDR3 1600MHz // SSD: Crucial M500 240GB // Video: EVGA GeForce GTX 660Ti SC 2GB


10 hours ago, Taf the Ghost said:

@leadeater

 

Now it gets really interesting for Desktop, as there almost has to be a controller Die for it as well. Are we going to see the entire Zen 2 product stack with GPUs on them? That would explain some of the size of the Controller die. The full GPU system on the Raven Ridge APUs is about 110mm2, which would fit pretty nicely on that central die. 

Once you start stripping out all those memory controllers and I/O ports for that many chiplets, I bet that I/O die is going to be very small, so putting a GPU in it should be easy. You could even go high end and do 2 chiplets (upper left and right) and 2 HBM stacks (lower left and right), which to me sounds better than Kaby-G.


13 minutes ago, leadeater said:

Yes, I have seen that video, and I can almost guarantee that it is an application optimization issue, not Windows itself. Quad-socket servers running MSSQL have been around for a long time; it is highly optimized and happens to be one of the fastest DB engines around. Then you have other applications in that video that run faster on Windows than on Linux too.

 

Just because you see an application running poorly under an OS doesn't mean it's the OS that is causing it.

Yep, it's just a huge chain of coincidences and all these applications tested just happen to be poorly optimized for a widely adopted OS and well optimized for a niche one. And one test performed better on Windows, so the situation evens out anyway.

i7 2600k @ 5GHz 1.49v - EVGA GTX 1070 ACX 3.0 - 16GB DDR3 2000MHz Corsair Vengeance

Asus p8z77-v lk - 480GB Samsung 870 EVO w/ W10 LTSC - 2x1TB HDD storage - 240GB SATA SSD w/ W7 - EVGA 650w 80+G G2

3x 1080p 60hz Viewsonic LCDs, 1 glorious Dell CRT running at anywhere from 60hz to 120hz

Model M w/ Soarer's adapter - Logitech G502 - Audio-Technica M20X - Cambridge SoundWorks speakers w/ woofer

 


2 minutes ago, 2Buck said:

Yep, it's just a huge chain of coincidences and all these applications tested just happen to be poorly optimized for a widely adopted OS and well optimized for a niche one. And one test performed better on Windows, so the situation evens out anyway.

Most of the applications chosen in the test are Linux-native or developed for Linux first; they were originally chosen by Phoronix, after all. I'm not saying one can't be a bit better than the other, but a 3%-5% average difference, for example, isn't 'a lot better'. Performance drops on the order of 20%+ are very rarely caused by the OS.


4 hours ago, TigerHawk said:

Yeah. I believe Linus used it in several videos, some 64 core 256 thread thing. It couldn't even run windows very well though IIRC?

Xeon Phi is a great example of using the right tool for the job. For my interests, what I'd like is as much FPU power as possible, with fast cache. Everything else doesn't really matter. This is essentially a Xeon Phi. It has the potential to chew through a ton of FP instructions. Atom was used for "other stuff", presumably as it was small and low power. It was never meant to do any heavy lifting. If you just run Windows on it, oh dear, you get a bunch of slow Atom cores... and performance sucks. It's like buying a drag racer and complaining it doesn't do corners well.

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


7 minutes ago, leadeater said:

Most of the applications chosen in the test are Linux-native or developed for Linux first; they were originally chosen by Phoronix, after all. I'm not saying one can't be a bit better than the other, but a 3%-5% average difference, for example, isn't 'a lot better'. Performance drops on the order of 20%+ are very rarely caused by the OS.

Might be the libraries for compiling on Windows.


1 minute ago, Taf the Ghost said:

Might be the libraries for compiling on Windows.

In the past, open source/Linux devs writing stuff for Windows would often do manual core scheduling in C/C++ because 'they knew better' than Windows, which would piss Microsoft off to no end. Why do you think Game Mode exists for TR?
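
For anyone wondering what "manual core scheduling" looks like in practice, here is a minimal sketch of pinning a process to a chosen set of logical CPUs. It is in Python rather than C/C++ for brevity and relies on the Linux-only `os.sched_setaffinity`; the helper names `choose_cores` and `pin_to_cores` are made up for illustration, and Windows code would have to go through a different API.

```python
import os


def choose_cores(wanted, allowed):
    """Return the requested CPUs that are actually usable.

    Falls back to the full allowed set when none of the requested
    CPUs are available, so the caller never ends up with an empty mask.
    """
    usable = set(wanted) & set(allowed)
    return usable if usable else set(allowed)


def pin_to_cores(wanted):
    """Restrict the calling process to the given logical CPUs.

    Returns the resulting affinity set, or None on platforms without
    os.sched_setaffinity (anything that is not Linux-like).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # 0 means "this process"
    os.sched_setaffinity(0, choose_cores(wanted, allowed))
    return os.sched_getaffinity(0)
```

Calling something like `pin_to_cores([0, 1, 2, 3])` assumes logical CPUs 0-3 sit on the same die, which is exactly the kind of hard-coded guess that goes wrong when the topology changes, since the pinning overrides whatever the OS scheduler would have done.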


2 hours ago, NunoLava1998 said:

can some of you stop with the shitty imgflip memes from 2011

Ok.

[Hold my beer]

2lyocj.jpg (via Imgflip Meme Generator)

(Inserts meme from 2001!!!)

 

No, but really. I'm loving what AMD are doing. Pity everything I do is core speed dependent, not multi core dependent. So for now, I'm still using NVidia/Intel. :/


20 minutes ago, leadeater said:

Once you start stripping out all those memory controllers and I/O ports for that many chiplets, I bet that I/O die is going to be very small, so putting a GPU in it should be easy. You could even go high end and do 2 chiplets (upper left and right) and 2 HBM stacks (lower left and right), which to me sounds better than Kaby-G.

The APU on 7nm isn't until late 2019/early 2020, so my assumption is AMD is going to need to roll with the +1 die for Desktop. Which means there's little reason not to put at least the Media & 2D engine on it. Maybe even a cut-down baseline set of CUs (say 7 for yield reasons, but 6 active). That ~100mm2 is worth a massive amount of TAM being available for OEMs.


6 hours ago, porina said:

This is the biggest thing for me, assuming it performs as expected. Not like FMA4 before Ryzen (it did twice the work, but took twice the time, so no throughput benefit). Arguably Intel still has AVX-512, but general support for it is still going to take some time, and we probably won't see it in mainstream until they get their 10nm sorted. Still, it is present in server parts already.

I am wondering, though not expecting it, whether AMD already has or could later release a microcode update that adds AVX-512 support, since they can combine those FMA units to do larger ops. Two 256-bit FMA units could act as one 512-bit unit, much like they are doing in Zen/Zen+ now. That would only give them one 512-bit unit though; Xeon Gold/Plat has two, so you'd be back at AMD doing half the throughput.
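
The "half throughput" arithmetic can be checked with a quick back-of-the-envelope sketch. It assumes single-precision (32-bit) elements, counts each FMA as two operations per lane, and ignores clocks, ports, and memory bandwidth; the function name `peak_flops_per_cycle` and the unit configurations are illustrative only, not measured figures.

```python
def peak_flops_per_cycle(fma_units, width_bits, element_bits=32):
    """Peak floating-point ops per cycle per core from FMA units alone.

    Each FMA counts as two operations (multiply and add) per vector lane.
    """
    lanes = width_bits // element_bits
    return fma_units * lanes * 2


# Hypothetical core: two 256-bit FMA units issuing independently (AVX2).
two_by_256 = peak_flops_per_cycle(2, 256)
# The same two units fused to execute one 512-bit op per cycle.
fused_512 = peak_flops_per_cycle(1, 512)
# Two native 512-bit FMA units, as described for Xeon Gold/Plat.
two_by_512 = peak_flops_per_cycle(2, 512)

print(two_by_256, fused_512, two_by_512)  # 32 32 64
```

So fusing two 256-bit units into one 512-bit unit gives no more peak throughput than running AVX2 on both, and half of what a core with two 512-bit units can do.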


11 minutes ago, Taf the Ghost said:

The APU on 7nm isn't until late 2019/early 2020, so my assumption is AMD is going to need to roll with the +1 die for Desktop. Which means there's little reason not to put at least the Media & 2D engine on it. Maybe even a cut-down baseline set of CUs (say 7 for yield reasons, but 6 active). That ~100mm2 is worth a massive amount of TAM being available for OEMs.

So they are going APU route with this design?

 

It could get very flexible on chiplets/HBM/GPU dies I guess. But it's going to be strange figuring out which is the best balance of performance/heat/IO etc. I mean, who is going to need a Vega APU with 64 cores?

 

4 minutes ago, leadeater said:

I am wondering, though not expecting it, whether AMD already has or could later release a microcode update that adds AVX-512 support, since they can combine those FMA units to do larger ops. Two 256-bit FMA units could act as one 512-bit unit, much like they are doing in Zen/Zen+ now. That would only give them one 512-bit unit though; Xeon Gold/Plat has two, so you'd be back at AMD doing half the throughput.

Interesting. Intel, in trying to be the fastest at everything, seems to have painted itself into a corner, instead being the fastest only in specialist workloads (by adding silicon for them?).

 

AMD instead tried to be just average at everything, then ended up being fastest at the average workload. :P


Just now, leadeater said:

I am wondering, though not expecting it, whether AMD already has or could later release a microcode update that adds AVX-512 support, since they can combine those FMA units to do larger ops. Two 256-bit FMA units could act as one 512-bit unit, much like they are doing in Zen/Zen+ now. That would only give them one 512-bit unit though; Xeon Gold/Plat has two, so you'd be back at AMD doing half the throughput.

One of the leaks for AIDA64's DB has AVX512 in Zen3. Which makes some sense, as the entire instruction set isn't even fully active.


7 minutes ago, TechyBen said:

So they are going APU route with this design?

 

It could get very flexible on chiplets/HBM/GPU dies I guess. But it's going to be strange figuring out which is the best balance of performance/heat/IO etc. I mean, who is going to need a Vega APU with 64 cores?

 

Interesting. Intel, in trying to be the fastest at everything, seems to have painted itself into a corner, instead being the fastest only in specialist workloads (by adding silicon for them?).

 

AMD instead tried to be just average at everything, then ended up being fastest at the average workload. :P

The addition of a GPU on all Desktop SKUs matters in the OEM space.


31 minutes ago, leadeater said:

In the past, open source/Linux devs writing stuff for Windows would often do manual core scheduling in C/C++ because 'they knew better' than Windows, which would piss Microsoft off to no end. Why do you think Game Mode exists for TR?

Isn't Game Mode there for apps that aren't NUMA-aware, forcing the app onto one die or the other so it won't take a performance hit?


2 minutes ago, M.Yurizaki said:

Isn't Game Mode there for apps that aren't NUMA-aware, forcing the app onto one die or the other so it won't take a performance hit?

We do have adaptive mode now on Threadripper, so I assume AMD has put some effort into making it check this, though I dunno how well it works.


Would they really announce new Ryzen chips calling it "New Horizon"? No, it would be "New Horyzen". Everyone knows that.


48 minutes ago, leadeater said:

I am wondering, though not expecting it, whether AMD already has or could later release a microcode update that adds AVX-512 support, since they can combine those FMA units to do larger ops. Two 256-bit FMA units could act as one 512-bit unit, much like they are doing in Zen/Zen+ now. That would only give them one 512-bit unit though; Xeon Gold/Plat has two, so you'd be back at AMD doing half the throughput.

I'm not familiar with the practical usage details of AVX-512 vs AVX2, but at a high level AVX-512 only becomes interesting with two units, for double the throughput over AVX2. Otherwise it would be no better.

 

The only reason I can think of for Intel to implement single-unit AVX-512 is to expand the perceived availability of hardware support, so software developers may be more interested in implementing its use.



1 hour ago, M.Yurizaki said:

Isn't Game Mode there for apps that aren't NUMA-aware, forcing the app onto one die or the other so it won't take a performance hit?

Both, there are applications that can't handle so many cores and won't actually run.


46 minutes ago, M.Yurizaki said:

Unless this is the product of a programmer trying to be smart

It's this, some libraries are coded for a fixed maximum number of cores because they are "those people that know better".
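
As a rough illustration of that failure mode (a hypothetical sketch, not code from any real library): a per-core table sized by a compile-time-style constant breaks as soon as a core index exceeds the cap, while sizing it from the runtime CPU count does not. `MAX_CORES`, `fixed_slots`, `usable_cpu_count`, and `dynamic_slots` are names invented for the example.

```python
import os

MAX_CORES = 32  # hypothetical hard-coded cap, for illustration only


def fixed_slots():
    """Per-core table sized by a constant: breaks past MAX_CORES."""
    return [0] * MAX_CORES


def usable_cpu_count():
    """Number of logical CPUs this process may run on, at least 1."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def dynamic_slots(cpu_count=None):
    """Per-core table sized at runtime from the actual CPU count."""
    n = usable_cpu_count() if cpu_count is None else cpu_count
    return [0] * n
```

On a 64-thread part, `fixed_slots()[63]` raises `IndexError`, whereas `dynamic_slots(64)[63]` works, which is roughly why some applications refuse to run at all until the visible core count is cut down.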


lol

8086k

aorus pro z390

noctua nh-d15s chromax w black cover

evga 3070 ultra

samsung 128gb, adata swordfish 1tb, wd blue 1tb

seasonic 620w dogballs psu

 

 


On 11/5/2018 at 3:25 AM, RorzNZ said:

Next HoriZen

Serious missed opportunity here.

Rest In Peace my old signature...                  September 11th 2018 ~ December 26th 2018

