
Intel announces Cascade Lake AP

32 minutes ago, leadeater said:

It's not low, it's just not 32 channels of bandwidth.

The actual bandwidth between the Centaur chip and the DRAM itself is 410GB/s per socket.

 

With the next iteration of POWER9 the sustained bandwidth will go from 230GB/s to 350GB/s.


4 minutes ago, Amazonsucks said:

The actual bandwidth between the Centaur chip and the DRAM itself is 410GB/s per socket.

Doesn't really matter though; once the CPU needs to access memory it goes through those DMI interfaces, and that's where you get limited. Power9 Scale Out is 8 direct-PHY channels and Power9 Scale Up is 8 DMI interfaces; Scale Out actually has slightly higher memory bandwidth.

 

Upgrading the DMI on the scale out is, I think, the easier way to increase the memory bandwidth though; you don't have to add more DMI interfaces or wait on faster DRAM standards.


1 minute ago, leadeater said:

Doesn't really matter though; once the CPU needs to access memory it goes through those DMI interfaces, and that's where you get limited. Power9 Scale Out is 8 direct-PHY channels and Power9 Scale Up is 8 DMI interfaces; Scale Out actually has slightly higher memory bandwidth.

 

Upgrading the DMI on the scale out is, I think, the easier way to increase the memory bandwidth though; you don't have to add more DMI interfaces or wait on faster DRAM standards.

You meant to write that in reverse, right?

 

Power9 scale out (with directly attached memory) maxes out at 150GB/s per socket.

 

The Power9 scale up with buffered memory is 230GB/s per socket.

 

The scale out uses directly attached DDR4 so it can use standard cheapo DIMMs, whereas the scale up uses the insanely expensive CDIMMs.

 

And you mean upgrading the scale up? Because the Centaur buffer means you can use DDR3, DDR4, DDR5, GDDR5 or NVRAM on the DIMM and the Power9 will still be able to talk to it, since it all goes through the buffer. The scale out versions would need an entire rework with a new built-in memory controller (so entirely replacing the CPU) and new DIMMs to access new RAM.
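To make that design point concrete, here's a minimal Python sketch of the idea (the class and method names are made up for illustration, not IBM's actual interfaces): the CPU-facing link protocol stays fixed, and the DRAM-specific controller lives behind the buffer, so swapping the DRAM generation only touches the buffer side.

from abc import ABC, abstractmethod

class DramBackend(ABC):
    """DRAM-specific controller that lives on the buffer chip, not on the CPU."""
    @abstractmethod
    def read(self, addr: int) -> bytes: ...

class DDR4Backend(DramBackend):
    def read(self, addr: int) -> bytes:
        return b"data-from-ddr4"   # stand-in for a real DDR4 access

class GDDR5Backend(DramBackend):
    def read(self, addr: int) -> bytes:
        return b"data-from-gddr5"  # stand-in for a real GDDR5 access

class BufferedDimm:
    """The CPU only ever speaks this fixed link protocol; the DRAM type behind it is hidden."""
    def __init__(self, backend: DramBackend):
        self.backend = backend
    def cpu_read(self, addr: int) -> bytes:
        return self.backend.read(addr)

# Changing the DRAM generation swaps the backend, not the CPU-facing interface.
print(BufferedDimm(DDR4Backend()).cpu_read(0x1000))
print(BufferedDimm(GDDR5Backend()).cpu_read(0x1000))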


So, my dumb-ass question. Is 12-Channel memory even a thing?

CPU - Ryzen 7 3700X | RAM - 64 GB DDR4 3200MHz | GPU - Nvidia GTX 1660 ti | MOBO -  MSI B550 Gaming Plus


1 hour ago, leadeater said:

Those small blades don't come in 4 sockets because there isn't enough space for it.

Not sure I get your point then... the image you show of the 2S system, assuming it is representative (as I don't do servers), makes it look challenging to fit 12 slots per socket. There's a narrow heatsink variation, isn't there? Would that give enough extra space? I take it RAM going across would be hell for airflow, so that's not an option, which leaves sacrificing max capacity for channels, with only 6 slots per socket. Are SODIMMs a thing in enterprise? :) I presume the hit to capacity wouldn't necessarily make it worth it?

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


4 minutes ago, PocketNerd said:

So, my dumb-ass question. Is 12-Channel memory even a thing?

Better to look at it as 2x6 channel.



6 minutes ago, PocketNerd said:

So, my dumb-ass question. Is 12-Channel memory even a thing?

X-channel RAM is just a binning thing; nothing prevents you from using any kit of RAM with a PC that supports more channels, it just means you will have to buy more RAM kits. And as the post above said, it will be more like 2 CPUs, each with 6 channels.

Also, it's going to be very interesting to see the latency cost of moving between dies here vs EPYC.
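As a rough sense of what 2x6 channels per package buys you, a quick back-of-the-envelope (this assumes DDR4-2933, which is what I'd expect Cascade Lake AP to run; treat the numbers as illustrative):

# Peak-bandwidth estimate for a 2-die, 2x6-channel package.
# Assumes DDR4-2933: 2933 MT/s x 8 bytes per 64-bit channel (illustrative only).
mt_per_s = 2933e6
bytes_per_transfer = 8
channels_per_die = 6
dies_per_package = 2

per_channel = mt_per_s * bytes_per_transfer / 1e9              # ~23.5 GB/s
per_package = per_channel * channels_per_die * dies_per_package
print(f"{per_channel:.1f} GB/s per channel, {per_package:.0f} GB/s per package")
# -> ~23.5 GB/s per channel, ~282 GB/s peak per package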


Don't most manufacturers still prefer EPYC over Xeon or Intel's server offerings? I'm not sure if it has to do with the shortage or not.


2 hours ago, PocketNerd said:

So, my dumb-ass question. Is 12-Channel memory even a thing?

What do you mean? Are you asking if there are chips that have 12 or more memory channels on their memory controller? Or are you asking about the DRAM itself having some characteristic?

 

In 2013 the NEC SX-ACE chip had 16-channel DDR3 per socket, and via buffer chips IBM POWER9 CPUs have 32 memory channels per socket.

 

The Fujitsu SPARC XIfx has 128 serial links between the CPU and 8 stacks of HMC. Those are different from what makes up a normal memory channel though.


2 hours ago, Amazonsucks said:

Power9 scale out (with directly attached memory) maxes out at 150GB/s per socket.

 

The Power9 scale up with buffered memory is 230GB/s per socket.

 

The scale out uses directly attached DDR4 so it can use standard cheapo DIMMs, whereas the scale up uses the insanely expensive CDIMMs.

The documentation I read listed the scale out at 271GB/s, but it looks like that was wrong, plus it was the combined figure for both sockets. Sustained is rated at 120GB/s per socket.

 

2 hours ago, Amazonsucks said:

And you mean upgrading the scale up? Because the Centaur buffer means you can use DDR3, DDR4, DDR5, GDDR5 or NVRAM on the DIMM and the Power9 will still be able to talk to it, since it all goes through the buffer. The scale out versions would need an entire rework with a new built-in memory controller (so entirely replacing the CPU) and new DIMMs to access new RAM.

Whoops, yep, Up not Out. What I mean is that if you upgrade the on-die DMI interfaces from the current 28.8GB/s to something like 60GB/s, you keep legacy support for existing Centaur modules while newer modules would double the bandwidth; the actual memory doesn't need to change.
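Quick back-of-the-envelope on that, using the 8 DMI links per socket mentioned above (illustrative arithmetic only):

# Per-socket bandwidth through the buffer links, assuming 8 DMI links per socket.
links = 8
current_dmi = 28.8   # GB/s per link today
faster_dmi = 60.0    # GB/s per link, the hypothetical upgrade

print(links * current_dmi)  # 230.4 -> matches the ~230GB/s sustained figure above
print(links * faster_dmi)   # 480.0 -> roughly double, with the same Centaur DIMMs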


3 hours ago, porina said:

Not sure I get your point then... the image you show of the 2S system, assuming it is representative (as I don't do servers), makes it look challenging to fit 12 slots per socket. There's a narrow heatsink variation, isn't there? Would that give enough extra space? I take it RAM going across would be hell for airflow, so that's not an option, which leaves sacrificing max capacity for channels, with only 6 slots per socket. Are SODIMMs a thing in enterprise? :) I presume the hit to capacity wouldn't necessarily make it worth it?

Yeah, that's a chassis that can fit 4 of those 2S hybrid blades into it; the socket is already the narrow kind. Great for density, as you get double that of a 1U server and quadruple that of a 2U server, while still having a full complement of 24 2.5" bays in the front. You also only need 2 PSUs rather than the 8 you would need for the same number of traditional servers.
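To put numbers on the density claim (a quick sketch; it assumes the 4-blade chassis is 2U, which is what the double/quadruple comparison implies):

# Nodes per rack unit, assuming a 2U chassis holding 4 two-socket blades.
print(4 / 2)  # blade chassis: 2.0 nodes per U
print(1 / 1)  # 1U servers:    1.0 node per U  -> half the density
print(1 / 2)  # 2U servers:    0.5 nodes per U -> a quarter of the density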


6 hours ago, Arika S said:

Oh, now you've gone and done it, you mentioned TDP in an Intel thread. Are you trying to summon him?

Oh no, I practiced due diligence: I drew a pentagram on my monitor with cat blood, lit candles, and sacrificed an Athlon64 before posting that.

 

 

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


Why would they disable HT? Is that just a shady move or are they not going to have HT on Cascade Lake?

CPU - Ryzen Threadripper 2950X | Motherboard - X399 GAMING PRO CARBON AC | RAM - G.Skill Trident Z RGB 4x8GB DDR4-3200 14-13-13-21 | GPU - Aorus GTX 1080 Ti Waterforce WB Xtreme Edition | Case - Inwin 909 (Silver) | Storage - Samsung 950 Pro 500GB, Samsung 970 Evo 500GB, Samsung 840 Evo 500GB, HGST DeskStar 6TB, WD Black 2TB | PSU - Corsair AX1600i | Display - DELL ULTRASHARP U3415W |


3 minutes ago, Carclis said:

Why would they disable HT? Is that just a shady move or are they not going to have HT on Cascade Lake?

It drops TDP as well as performance, so it's not out of the picture.

 

Also, all the testing was done with hyperthreading disabled, so either HT isn't great when comparing to other systems or it doesn't have hyperthreading.


6 minutes ago, Carclis said:

Why would they disable HT? Is that just a shady move or are they not going to have HT on Cascade Lake?

They lied about it being the CPU with the most memory channels too.

 

I can't imagine them abandoning hyperthreading, but there are cases where you want it disabled. I personally have mine off.


28 minutes ago, GoldenLag said:

It drops TDP as well as performance, so it's not out of the picture.

 

Also, all the testing was done with hyperthreading disabled, so either HT isn't great when comparing to other systems or it doesn't have hyperthreading.

Well that's what I was thinking. Maybe they're using some of the lower binned 24c Xeon chips that don't hit the same frequency at low voltages and they're disabling HT to compensate for the higher power consumption. That would also allow them to cram a second die in without crazy high TDP.



Now, throw 8 of these in a system, run Cinebench and enjoy the world record, while crying inside because you just spent your retirement fund 5 times over.


52 minutes ago, Carclis said:

Why would they disable HT? Is that just a shady move or are they not going to have HT on Cascade Lake?

Why did they disable it on EPYC as well? I'm guessing it was in the name of 'fairness'. There's always that caveat of "Only our system was performance optimized, competitor systems were not" usually stated somewhere.

 

The other amusing thing is that all the listed performance figures are just projections, carried out on CentOS with a much older Linux kernel, compared to the AMD system that used Ubuntu and a later kernel with Retpoline.

 

Quote

 compared to 1-node, 2-socket 48-core Cascade Lake Advanced Performance processor projections by Intel as of 10/3/2018.

 


3 minutes ago, leadeater said:

Why did they disable it on EPYC as well? I'm guessing it was in the name of 'fairness'. There's always that caveat of "Only our system was performance optimized, competitor systems were not" usually stated somewhere.

I'm not familiar with it but I heard SMT is detrimental to performance for AMD systems running that benchmark.

4 minutes ago, leadeater said:

The other amusing thing is that all the listed performance figures are just projections, carried out on CentOS with a much older Linux kernel, compared to the AMD system that used Ubuntu and a later kernel with Retpoline.

Security updates that hit performance perhaps?



9 minutes ago, Dylanc1500 said:

Now, throw 8 of these in a system, run Cinebench and enjoy the world record, while crying inside because you just spent your retirement fund 5 times over.

These will be limited to 2-socket systems though.


1 minute ago, cj09beira said:

These will be limited to 2-socket systems though.

Yes, I know. It's just that the idea sounds great.


7 minutes ago, Carclis said:

I'm not familiar with it but I heard SMT is detrimental to performance for AMD systems running that benchmark.

Depends on the test.

http://www.crc.nd.edu/~rich/CRC_EPYC_Cluster_Build_Feb_2018/Installing and running HPL on AMD EPYC v2.pdf (on or off is fine)

http://www.crc.nd.edu/~rich/CRC_EPYC_Cluster_Build_Feb_2018/Installing and running HPL on AMD EPYC.pdf (off recommended)

 

Both are Linpack tests, but Linpack is configurable, so it does matter what you're actually running.
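For anyone curious what "configurable" means in practice, the SMT on/off choice mostly shows up in how many ranks you launch and what P x Q process grid you put in HPL's input file. A small illustrative Python helper (the core counts assume a 2-socket, 32-core-per-socket EPYC node, purely as an example):

# Picking an HPL process grid P x Q, with and without SMT.
# Assumes a 2-socket, 32-core-per-socket EPYC node (illustrative numbers).
cores = 2 * 32           # physical cores
threads = cores * 2      # hardware threads with SMT on

def grid(n):
    # HPL generally prefers a near-square grid with P <= Q and P * Q == rank count
    p = max(d for d in range(1, int(n ** 0.5) + 1) if n % d == 0)
    return p, n // p

print(grid(cores))    # (8, 8)  -> 64 ranks with SMT off
print(grid(threads))  # (8, 16) -> 128 ranks with SMT on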

 

 

