
AMD '#Rekt son' reply to Intel's "four glued together" statement

3 hours ago, RadiatingLight said:

TBH that ecosystem slide looks very unimpressive.

It may not be packed with names but the names it has are impressive. HP and Dell EMC are the largest enterprise hardware companies on the planet. Supermicro and Inventec are both massive in their own right. Then there are the other names. That small slide includes a pretty damn big portion of Epyc's intended market.


17 minutes ago, nerdslayer1 said:

not that simple 


I'm still confused by those words months later

CPU: Intel i7 7700K | GPU: ROG Strix GTX 1080Ti | PSU: Seasonic X-1250 (faulty) | Memory: Corsair Vengeance RGB 3200Mhz 16GB | OS Drive: Western Digital Black NVMe 250GB | Game Drive(s): Samsung 970 Evo 500GB, Hitachi 7K3000 3TB 3.5" | Motherboard: Gigabyte Z270x Gaming 7 | Case: Fractal Design Define S (No Window and modded front Panel) | Monitor(s): Dell S2716DG G-Sync 144Hz, Acer R240HY 60Hz (Dead) | Keyboard: G.SKILL RIPJAWS KM780R MX | Mouse: Steelseries Sensei 310 (Striked out parts are sold or dead, awaiting zen2 parts)


On 7/18/2017 at 5:49 PM, Hunter259 said:

It's better to have it be a single die with all cores together. Less latency that way. Obviously you can make a fast chip with dual dies/CCX's but it is a worse way.

Of course a single die is superior from a performance standpoint, but when it comes to practicality it really isn't.

 

You're looking at manufacturing a huge die (the 28-core Skylake Xeon die is over 670 mm²). Since yield falls off faster than linearly with die size, a die that big is going to cost a ton more per good chip. Not to mention that with one small die reused across products you don't have to tape out a separate die for each segment (which saves tens of millions of dollars), and binning is easier because you can mix and match dies.
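To put rough numbers on that yield argument, here's a minimal sketch using a simple Poisson yield model; the defect density and die areas below are illustrative assumptions, not real fab figures:

```python
import math

# Assumed numbers, purely for illustration -- not real fab data.
DEFECTS_PER_CM2 = 0.1      # hypothetical defect density
MONOLITHIC_MM2 = 677       # ~28-core Skylake-SP die area
CHIPLET_MM2 = 213          # ~one Zeppelin die; EPYC packages four of them

def poisson_yield(area_mm2, d0=DEFECTS_PER_CM2):
    """Fraction of dies expected to come out defect-free (Y = e^(-D*A))."""
    return math.exp(-d0 * area_mm2 / 100.0)

y_mono = poisson_yield(MONOLITHIC_MM2)
y_chip = poisson_yield(CHIPLET_MM2)

print(f"monolithic {MONOLITHIC_MM2} mm^2 die yield: {y_mono:.1%}")
print(f"chiplet    {CHIPLET_MM2} mm^2 die yield: {y_chip:.1%}")

# Silicon cost scales roughly with area / yield, so compare cost per package:
mono_cost = MONOLITHIC_MM2 / y_mono
mcm_cost = 4 * CHIPLET_MM2 / y_chip
print(f"monolithic package costs ~{mono_cost / mcm_cost:.2f}x the 4-die MCM")
```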

 

That's before you even count the huge complexity and R&D needed to design a die that big.

 

That's part of the reason AMD can undercut Intel so hard on price.

On 7/18/2017 at 5:51 PM, JurunceNK said:

We'll see when the actual benchmarks come out. What was true in the past may not be true today. But again, we will see when the third parties gets their hands on these EPYC CPUs and benches them.

Anandtech has already done an analysis of it: sometimes the split dies cause a performance hit when L3 cache traffic has to cross from die to die, but in other scenarios it doesn't matter at all.

Make sure to quote me or tag me when responding to me, or I might not know you replied! Examples:

 

Do this:

Quote

And make sure you do it by hitting the quote button at the bottom left of my post, and not the one inside the editor!

Or this:

@DocSwag

 

Buy whatever product is best for you, not what product is "best" for the market.

 

Interested in computer architecture? Still in middle or high school? P.M. me!

 

I love computer hardware and feel free to ask me anything about that (or phones). I especially like SSDs. But please do not ask me anything about Networking, programming, command line stuff, or any relatively hard software stuff. I know next to nothing about that.

 

Compooters:

Spoiler

Desktop:

Spoiler

CPU: i7 6700k, CPU Cooler: be quiet! Dark Rock Pro 3, Motherboard: MSI Z170a KRAIT GAMING, RAM: G.Skill Ripjaws 4 Series 4x4gb DDR4-2666 MHz, Storage: SanDisk SSD Plus 240gb + OCZ Vertex 180 480 GB + Western Digital Caviar Blue 1 TB 7200 RPM, Video Card: EVGA GTX 970 SSC, Case: Fractal Design Define S, Power Supply: Seasonic Focus+ Gold 650w Yay, Keyboard: Logitech G710+, Mouse: Logitech G502 Proteus Spectrum, Headphones: B&O H9i, Monitor: LG 29um67 (2560x1080 75hz freesync)

Home Server:

Spoiler

CPU: Pentium G4400, CPU Cooler: Stock, Motherboard: MSI h110l Pro Mini AC, RAM: Hyper X Fury DDR4 1x8gb 2133 MHz, Storage: PNY CS1311 120gb SSD + two Seagate 4tb HDDs in RAID 1, Video Card: Does Intel Integrated Graphics count?, Case: Fractal Design Node 304, Power Supply: Seasonic 360w 80+ Gold, Keyboard+Mouse+Monitor: Does it matter?

Laptop (I use it for school):

Spoiler

Surface book 2 13" with an i7 8650u, 8gb RAM, 256 GB storage, and a GTX 1050

And if you're curious (or a stalker) I have a Just Black Pixel 2 XL 64gb

 


Quote

EDIT: This comment is merely about a single-socket to single-socket comparison.... Everything gets tossed out if you have to go to higher socket counts to compete against the other option.

 

 

I mean look.... I have lots of other issues with Intel's actions here... But I personally feel the "glued together" statement is 100% accurate and perfectly acceptable as a criticism of the design philosophy. Cache latency already jumps quite a bit moving from one CCX to another, and going die to die over Infinity Fabric is slower and higher latency still.

 

Plus, one concern of mine is that the RAM isn't one uniform pool. They call Threadripper quad channel, but a core reaching memory hung off the other die takes the full cross-die latency hit.

 

That said.... It probably doesn't make MUCH of a difference in most cases. And doing that "glued together" philosophy is definitely cheaper and easier than monolithic dies. But just because it doesn't matter much (if at all) that they are glued together, BECAUSE AMD HAS TAKEN THAT INTO ACCOUNT, doesn't make Intel's statement wrong.
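As a back-of-the-envelope illustration of why it usually washes out, here's a tiny sketch of the average latency you'd see if only a fraction of accesses had to cross to the far die; the latency numbers are placeholder assumptions, not measurements:

```python
# Expected average memory latency when only some accesses cross to the far
# die's memory. The latency numbers are placeholder assumptions, not measured.
LOCAL_NS = 90     # assumed latency to the local die's DRAM
REMOTE_NS = 140   # assumed latency when the request hops across the package

for remote_fraction in (0.00, 0.10, 0.25, 0.50):
    avg = (1 - remote_fraction) * LOCAL_NS + remote_fraction * REMOTE_NS
    print(f"{remote_fraction:.0%} remote accesses -> ~{avg:.0f} ns average")

# With a NUMA-aware OS/allocator keeping the remote fraction small, the
# average barely moves, which is why the split usually doesn't hurt much.
```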

 

Just my opinion though.

LINK-> Kurald Galain:  The Night Eternal 

Top 5820k, 980ti SLI Build in the World*

CPU: i7-5820k // GPU: SLI MSI 980ti Gaming 6G // Cooling: Full Custom WC //  Mobo: ASUS X99 Sabertooth // Ram: 32GB Crucial Ballistic Sport // Boot SSD: Samsung 850 EVO 500GB

Mass SSD: Crucial M500 960GB  // PSU: EVGA Supernova 850G2 // Case: Fractal Design Define S Windowed // OS: Windows 10 // Mouse: Razer Naga Chroma // Keyboard: Corsair k70 Cherry MX Reds

Headset: Senn RS185 // Monitor: ASUS PG348Q // Devices: Note 10+ - Surface Book 2 15"

LINK-> Ainulindale: Music of the Ainur 

Prosumer DIY FreeNAS

CPU: Xeon E3-1231v3  // Cooling: Noctua L9x65 //  Mobo: AsRock E3C224D2I // Ram: 16GB Kingston ECC DDR3-1333

HDDs: 4x HGST Deskstar NAS 3TB  // PSU: EVGA 650GQ // Case: Fractal Design Node 304 // OS: FreeNAS

 

 

 


57 minutes ago, DocSwag said:

Of course a single die is superior from a performance standpoint, but when it comes to practicality it really isn't.

 

You're looking at manufacturing a huge die (the 28-core Skylake Xeon die is over 670 mm²). Since yield falls off faster than linearly with die size, a die that big is going to cost a ton more per good chip. Not to mention that with one small die reused across products you don't have to tape out a separate die for each segment (which saves tens of millions of dollars), and binning is easier because you can mix and match dies.

If we're also talking about the limits of die size and design, the performance advantage of a single die could be reversed: packing more into a single socket beats going dual socket or higher.

 

Then if we look at dual-socket servers, I'd guess half of them only have the second socket populated because some of the PCIe slots are wired to that socket and require it; there, a single-socket MCM with more PCIe lanes is superior.

 

Then if you start packing more dies into a package you could start looking at other things, like putting NVRAM on the package too, which almost every enterprise storage controller has; that frees up RAM slots for more system RAM, or for more NVRAM DIMMs dedicated to something else.

 

The more you can do on package the better, but the complexity of doing so is higher than with a single large die, dual sockets, or add-in cards.

 

A single large die isn't inherently superior; it's superior with the technology we have today.


Just now, leadeater said:

If we're also talking about the limits of die size and design, the performance advantage of a single die could be reversed: packing more into a single socket beats going dual socket or higher.

 

Then if we look at dual-socket servers, I'd guess half of them only have the second socket populated because some of the PCIe slots are wired to that socket and require it; there, a single-socket MCM with more PCIe lanes is superior.

 

Then if you start packing more dies into a package you could start looking at other things, like putting NVRAM on the package too, which almost every enterprise storage controller has; that frees up RAM slots for more system RAM, or for more NVRAM DIMMs dedicated to something else.

 

The more you can do on package the better, but the complexity of doing so is higher than with a single large die, dual sockets, or add-in cards.

 

A single large die isn't inherently superior; it's superior with the technology we have today.

Ehhhhhhh.... from a purely CPU performance perspective... a single large die is always going to be superior to the otherwise-identical system with external interconnects (adding layers of physical interconnect cannot reduce latency, for example, which is why monolithic > multi-die package > separate sockets, all else being equal). But the benefits may very well be worth the small drawbacks, especially if it lets you reach a regime a monolithic die can't (as you say, where a monolithic die would be at the maximum reticle size).

 

 



2 minutes ago, Curufinwe_wins said:

Plus, one concern of mine is that the RAM isn't one uniform pool. They call Threadripper quad channel, but a core reaching memory hung off the other die takes the full cross-die latency hit.

Intel Skylake-SP has the same problem with its IMCs: there are internal NUMA-like zones for cores, preferred paths, etc. The 14-core-and-up Skylake-X parts could have the same issues since they use the same die.

 

Intel's mesh design isn't perfect and neither is AMD's. Both have problems if you scale them out too far, with AMD's being a bit more problematic since every die needs coherent links to every other die.

 

Depending on Intel's future design goals, they might have a slightly better framework to build on than AMD does; a more costly one, but you do get something for it.

 

Interesting times atm :)


3 minutes ago, leadeater said:

Intel Skylake-SP has the same problem with its IMCs: there are internal NUMA-like zones for cores, preferred paths, etc. The 14-core-and-up Skylake-X parts could have the same issues since they use the same die.

 

Intel's mesh design isn't perfect and neither is AMD's. Both have problems if you scale them out too far, with AMD's being a bit more problematic since every die needs coherent links to every other die.

 

Depending on Intel's future design goals, they might have a slightly better framework to build on than AMD does; a more costly one, but you do get something for it.

 

Interesting times atm :)

Definitely true. But that's half as many IMCs for the same-ish core count (two for the 14-28 core die instead of four across EPYC's dies), and I think the preferred path should be much lower latency than going across Infinity Fabric (since it seems CCX-to-CCX is already worse latency than anything the previous XCC ring fabric had).

 

But yeah. It's just a different design philosophy. It might be a better one for now. It might actually be the better one long term. But I do think the critique in general is valid.



2 minutes ago, Curufinwe_wins said:

Ehhhhhhh.... from a purely CPU performance perspective... a single large die is always going to be superior to the otherwise-identical system with external interconnects (adding layers of physical interconnect cannot reduce latency, for example, which is why monolithic > multi-die package > separate sockets, all else being equal). But the benefits may very well be worth the small drawbacks, especially if it lets you reach a regime a monolithic die can't (as you say, where a monolithic die would be at the maximum reticle size).

 

 

Not for all workloads it won't be. Think of a CPU the size of Nvidia's Volta GV100, the mesh links it would need, and cores having to communicate across the full span of the die. Bigger isn't always better, though there could be another design change after Intel's current mesh that makes a monolithic die better again.

 

Also, on a dual-socket system, if CPU 1 needs to talk to a PCIe device hanging off CPU 2's PCIe controller, you run into QPI bandwidth and latency problems.


5 hours ago, Hunter259 said:

How is saying it's "Glued together" some childish statement. They are, in a way, glued together. Intel did it in the past when they couldn't figure out how to make a 4 core die and have it not be a money sink. They left that as soon as they could. It's an inherently technologically inferior way of making a CPU.

umm... cuz Infinity fabric actually matters

CPU: Intel i7 5820K @ 4.20 GHz | Motherboard: MSI X99S SLI PLUS | RAM: Corsair LPX 16GB DDR4 @ 2666MHz | GPU: Sapphire R9 Fury (x2 CrossFire)
Storage: Samsung 950Pro 512GB // OCZ Vector150 240GB // Seagate 1TB | PSU: Seasonic 1050 Snow Silent | Case: NZXT H440 | Cooling: Nepton 240M
FireStrike // Extreme // Ultra // 8K // 16K

 


7 minutes ago, Curufinwe_wins said:

Definitely true. But that's half as many IMCs for the same-ish core count (two for the 14-28 core die instead of four across EPYC's dies), and I think the preferred path should be much lower latency than going across Infinity Fabric (since it seems CCX-to-CCX is already worse latency than anything the previous XCC ring fabric had).

 

But yeah. It's just a different design philosophy. It might be a better one for now. It might actually be the better one long term. But I do think the critique in general is valid.

Yep, those early benchmarks we saw of the EPYC 7601 and Skylake-SP pretty much confirm that at current die sizes Intel's design is generally better.

 

There is one area where EPYC did significantly better, and that's memory bandwidth (depending on thread loading etc.). Where I think that's actually useful is an NVMe storage server, and it needs A LOT of NVMe SSDs to make the advantage mean anything.


5 hours ago, RadiatingLight said:

TBH that ecosystem slide looks very unimpressive.

Only if you exclude the fact that EMC, Supermicro, HPE, and H3C are industry colossi, and that a company can piggyback on revenue from just a couple of them and still be above the red line at the end of a quarter.

Remember kids, the only difference between screwing around and science is writing it down. - Adam Savage

 

PHOΞNIX Ryzen 5 1600 @ 3.75GHz | Corsair LPX 16Gb DDR4 @ 2933 | MSI B350 Tomahawk | Sapphire RX 480 Nitro+ 8Gb | Intel 535 120Gb | Western Digital WD5000AAKS x2 | Cooler Master HAF XB Evo | Corsair H80 + Corsair SP120 | Cooler Master 120mm AF | Corsair SP120 | Icy Box IB-172SK-B | OCZ CX500W | Acer GF246 24" + AOC <some model> 21.5" | Steelseries Apex 350 | Steelseries Diablo 3 | Steelseries Syberia RAW Prism | Corsair HS-1 | Akai AM-A1

D.VA coming soon™ xoxo

Sapphire Acer Aspire 1410 Celeron 743 | 3Gb DDR2-667 | 120Gb HDD | Windows 10 Home x32

Vault Tec Celeron 420 | 2Gb DDR2-667 | Storage pending | Open Media Vault

gh0st Asus K50IJ T3100 | 2Gb DDR2-667 | 40Gb HDD | Ubuntu 17.04

Diskord Apple MacBook A1181 Mid-2007 Core2Duo T7400 @2.16GHz | 4Gb DDR2-667 | 120Gb HDD | Windows 10 Pro x32

Firebird//Phoeniix FX-4320 | Gigabyte 990X-Gaming SLI | Asus GTS 450 | 16Gb DDR3-1600 | 2x Intel 535 250Gb | 4x 10Tb Western Digital Red | 600W Segotep custom refurb unit | Windows 10 Pro x64 // offisite backup and dad's PC

 

Saint Olms Apple iPhone 6 16Gb Gold

Archon Microsoft Lumia 640 LTE

Gulliver Nokia Lumia 1320

Werkfern Nokia Lumia 520

Hydromancer Acer Liquid Z220


4 minutes ago, leadeater said:

Not for all workloads it won't be. Think of a CPU the size of Nvidia's Volta GV100, the mesh links it would need, and cores having to communicate across the full span of the die. Bigger isn't always better, though there could be another design change after Intel's current mesh that makes a monolithic die better again.

 

Also, on a dual-socket system, if CPU 1 needs to talk to a PCIe device hanging off CPU 2's PCIe controller, you run into QPI bandwidth and latency problems.

Assuming everything else is the same... on-die interconnects will always be able to deliver as much or more bandwidth, and always lower latency, than off-die links (that's pretty much a duh moment). There is no way around that.

 

The real point, though, is that "everything else" isn't the same, so the multi-die approach might be better for some workloads and some cases.

 

And obviously, if a monolithic die adds other constraints, then as I said in the original comment it is probably worthwhile to go the way AMD did (the QPI issue being a big deal if you're forced into two separate sockets as a result).



15 minutes ago, Curufinwe_wins said:

I mean look.... I have lots of other issues with Intel's actions here... But I personally feel the "glued together" statement is 100% accurate and perfectly acceptable as a criticism of the design philosophy. Cache latency already jumps quite a bit moving from one CCX to another, and going die to die over Infinity Fabric is slower and higher latency still.

 

Plus, one concern of mine is that the RAM isn't one uniform pool. They call Threadripper quad channel, but a core reaching memory hung off the other die takes the full cross-die latency hit.

 

That said.... It probably doesn't make MUCH of a difference in most cases. And doing that "glued together" philosophy is definitely cheaper and easier than monolithic dies. But just because it doesn't matter much (if at all) that they are glued together, BECAUSE AMD HAS TAKEN THAT INTO ACCOUNT, doesn't make Intel's statement wrong.

 

Just my opinion though.

The thing is supposed to compete against Intel's quad-socket and octa-socket offerings; I don't think the latency will be any worse than those.

And unless you're gaming on that system... server-grade work is often inherently independent across threads, like a web server that runs every user request in a separate thread, where the threads don't interact at all.
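For what it's worth, here's a minimal sketch of that thread-per-request pattern using only Python's standard library (3.7+); every handler runs in its own thread and shares nothing with the others:

```python
# Minimal sketch of the "every request in its own thread" pattern.
# Each handler is independent, so cross-core (or cross-die) latency
# barely matters for this kind of workload.
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import threading

class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = f"handled by {threading.current_thread().name}\n".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # ThreadingHTTPServer spawns a new thread per connection.
    ThreadingHTTPServer(("0.0.0.0", 8080), EchoHandler).serve_forever()
```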



8 minutes ago, DXMember said:

The thing is supposed to compete against Intel's quad-socket and octa-socket offerings; I don't think the latency will be any worse than those.

And unless you're gaming on that system... server-grade work is often inherently independent across threads, like a web server that runs every user request in a separate thread, where the threads don't interact at all.

Totally. On package latency is going to be much better than socket to socket latency. Without a doubt.

 

But I think "in general" Intel's XCC is going to compete (obviously not on value) chip to chip against EPYC at the same socket counts. I don't see many organizations going.... "Intel's 28 core can't compete head to head with the AMD 32 core, so we should use twice as many Intel 16+ core chips against the one."

 

Outside of some spots where super insane PCIe bandwidth is most important.

 

 



5 hours ago, Hunter259 said:

It's better to have it be a single die with all cores together. Less latency that way. Obviously you can make a fast chip with dual dies/CCX's but it is a worse way.

AMD makes their chips the best way: affordable.


7 minutes ago, Curufinwe_wins said:

 

But I think "in general" Intel's XCC is going to compete chip to chip against EPYC at the same socket counts. I don't see many organizations going.... Intel's 28 core can't compete head to head with the AMD 32 core, so we should use twice as many Intel 16+ core chips against the one.

 

Outside of some spots where super insane PCIe bandwidth is most important.

It probably isn't an issue anyway; as we know, 80%-ish of the market is dual-socket systems, so businesses could just keep buying dual-socket Intel servers and not care what you can do with a single-socket AMD system.

 

One of the problems AMD has to compete with is people just adding +1 to their last order. Why think when you can reuse your last purchase order? You know it's a working platform and it fits in with your current management framework. We do it: we have a standard server configuration list (storage server, ESXi host, etc.) and keep ordering the same configurations until we decide it's time to update them, which is usually only when HPE brings out a new server generation or something else big happens.


5 hours ago, RadiatingLight said:

TBH that ecosystem slide looks very unimpressive.

At first glance I thought the same, but then, this is server stuff so I expect to see companies I've never heard of, and also expect not to see the typical gaming/consumer brands.

Solve your own audio issues  |  First Steps with RPi 3  |  Humidity & Condensation  |  Sleep & Hibernation  |  Overclocking RAM  |  Making Backups  |  Displays  |  4K / 8K / 16K / etc.  |  Do I need 80+ Platinum?

If you can read this you're using the wrong theme.  You can change it at the bottom.


The slide actually looks professionally put together. Unlike Intel cramming in every company it can, AMD just shows a few significant example companies and the important software EPYC is mainly going to be used with. It looks unimpressive to the regular user but impressive to anyone who works in datacenters and the like.

 

Basically Intel is trying to impress the uninformed, because they know they can't convince the informed that Intel's current Xeon lineup is better than EPYC once you consider price, power use, performance, etc. The only downside to EPYC is the restriction to 2 CPUs (64 cores total), whereas Intel has had Xeons that allow 8 CPUs in a system (I've seen 80 cores before), but those systems run something like $100k from Dell, HP, Oracle, etc.

 

Still, as far as CPU impressiveness goes, SPARC was way ahead of its time; it's too bad it faded away, as it was the go-to when people needed CPUs for compute and encryption. They had a CPU with 16 cores, each core capable of 8-way SMT (Intel and AMD only do 2-way), and they were among the first to release many-core CPUs. The Sun SPARC ecosystem was pretty impressive back then, able to do some things you can't on x86, like bringing up a console that lets you run code directly on the CPU if you're physically at the SPARC machine (hence why their keyboards are different).

 

I'd really like to see the pricing for the CPU and board, and how PCIe-to-PCIe communication will work, since each CPU consists of 4 dies that each have their own PCIe lanes. It's still impressive to have 60 or more lanes on the CPU itself.


3 hours ago, SpaceGhostC2C said:

That's a pretty empty claim to belong in engineering or computer science. I mean, "elegant"? leave that to Versace or whatever :P

From a theoretical design standpoint Ryzen is considerably less elegant than Skylake SP. From an engineering standpoint, where good enough and solving the problem trumps all else when all's said and done, Ryzen is significantly more elegant.


6 hours ago, Hunter259 said:

It's better to have it be a single die with all cores together. Less latency that way. Obviously you can make a fast chip with dual dies/CCX's but it is a worse way.

"Better" depends on perspective. You're looking through a technical lens, but the economic lens is what really matters for generating profits and providing a competitive advantage. More compute power per dollar (on a stable platform) is the most important metric in the industry.

 

A large single die may be technically faster, but the bigger the die, the exponentially higher the chance of a partial fabrication failure. So the best silicon comes at a price that is very difficult to justify. I believe that's partly how top-end Xeons end up binned into different SKUs today, but the larger the die, the more out of control the process becomes. Unless fabrication processes improve greatly, very large core count single-die CPUs just aren't economical. Might as well target small dies with low rates of fabrication failure and focus on improving the communication between the CCXs. That's exactly what Ryzen/EPYC does, so within a generation the price scales roughly linearly with core count instead of exponentially.

 

In summary, increased distance for signals to travel obviously increases latency, but if that latency is kept small, everything should run smoothly at a significantly lower cost. This will be the standard approach until fabrication can put absurd numbers of cores on one die with a very low rate of partial failure. Until then, a many-core multi-die CPU will cost less to manufacture than a single-die CPU of similar performance. Lower price for similar performance is a recipe for success.
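To illustrate that scaling argument with made-up numbers (the per-core area and defect density below are assumptions picked purely for the example), here's a small sketch comparing one monolithic die against a package built from fixed 8-core dies:

```python
# Illustrative only: how a "cost per package" proxy might scale with core
# count for a monolithic die versus an MCM built from fixed 8-core dies.
import math

AREA_PER_CORE_MM2 = 25.0   # assumed silicon area per core
DEFECTS_PER_CM2 = 0.1      # assumed defect density

def cost_index(total_area_mm2):
    """Silicon cost proxy: area divided by Poisson yield for that area."""
    yield_ = math.exp(-DEFECTS_PER_CM2 * total_area_mm2 / 100.0)
    return total_area_mm2 / yield_

for cores in (8, 16, 24, 32):
    mono = cost_index(cores * AREA_PER_CORE_MM2)
    mcm = (cores // 8) * cost_index(8 * AREA_PER_CORE_MM2)
    print(f"{cores:>2} cores: monolithic {mono:7.0f}  vs  MCM {mcm:7.0f}  "
          f"({mono / mcm:.2f}x)")

# The MCM column grows linearly with core count, while the monolithic column
# grows faster than linearly because yield keeps falling as the die grows.
```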

CPU: i7 4790k @ 4.7 GHz

GPU: XFX GTS RX580 4GB

Cooling: Corsair h100i

Mobo: Asus z97-A 

RAM: 4x8 GB 1600 MHz Corsair Vengence

PSU: Corsair HX850

Case: NZXT S340 Elite Tempered glass edition

Display: LG 29UM68-P

Keyboard: Roccat Ryos MK FX RGB

Mouse: Logitech g900 Chaos Spectrum

Headphones: Sennheiser HD6XX

OS: Windows 10 Home


7 hours ago, Hunter259 said:

"Glued together"

They used it in a derogatory way to damage the brand and put people off, and (like you) they were wrong. Yes, Intel did do the same thing with the Pentium D, BUT the Infinity Fabric interconnect vastly improves die-to-die communication, so it's nothing like the crap Pentium D.


To respond to the posts about single versus multiple dies and data transfer from one core to another:

 

It doesn't matter how many dies there are; what matters is the distance from one core to the furthest core, and how all the cores in a CPU are connected, be it via a mesh or a shared bus.

For example, Tilera uses a mesh architecture to get 72-core CPUs onto a single die, and I have brought that mesh to its knees before, because it is limited by the number of transfers it can handle rather than raw bandwidth.

 

The links can be massive, and the speed and latency of each link can easily be around or better than L3 cache. Just imagine a bunch of wires directly connecting the dies; even if the wire lengths differ, it doesn't matter much, because a signal propagates through a conductor far faster than the silicon at each end can switch over distances like these. It's not a kilometer of wire; the wires are conductors with silicon-based controllers at each end. So moving data from one die to another adds a couple more cycles, but it is not much slower than sending it core to core directly through the CPU pipeline. Transferring from one core to another takes at least as many cycles as the pipeline is deep, and if circuitry exists that lets data be injected into another core's pipeline mid-stream, that speeds things up but also adds cost to the CPU (increased complexity).
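To put a rough number on that (the trace length, propagation factor, and clock speed below are assumptions for illustration), the time of flight across a package-scale link is on the order of a single clock cycle:

```python
# Quick check of the "wire distance barely matters" claim: time-of-flight
# across a package-scale trace versus one clock cycle. The propagation speed
# (a fraction of c) and trace length are rough assumptions.
C = 3.0e8                  # speed of light, m/s
PROPAGATION_FACTOR = 0.5   # assumed signal speed as a fraction of c
TRACE_LENGTH_M = 0.05      # ~5 cm die-to-die path on the package (assumed)
CLOCK_HZ = 3.0e9

time_of_flight_ns = TRACE_LENGTH_M / (C * PROPAGATION_FACTOR) * 1e9
cycle_ns = 1.0 / CLOCK_HZ * 1e9

print(f"time of flight over {TRACE_LENGTH_M * 100:.0f} cm: {time_of_flight_ns:.2f} ns")
print(f"one cycle at {CLOCK_HZ / 1e9:.0f} GHz: {cycle_ns:.2f} ns")
# Roughly one extra cycle of flight time; the real cost of crossing dies comes
# from the fabric protocol (serialization, clock-domain crossings, coherence),
# not the raw distance the signal travels.
```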

 

So rather than a massive die like Intel's, where the already complicated x86 circuitry gets even more complicated, AMD opted for a simpler way. Every CPU has central control logic that moves data around, and that same logic, which already handles registers, cache, and RAM, can also handle Infinity Fabric and the inter-CPU interconnects (multi-socket).

 

The distance doesn't matter much; it's how the data moves around the cores that matters, be it over Infinity Fabric, through cache, or direct core-to-core communication. The fact that I can crush a many-core chip's mesh shows that the limit is not raw bandwidth or latency but how many transfers it can handle. Inside a CPU there are many kilometers of wire even though the chip is tiny, and signals can cover a huge amount of that wiring just to complete an instruction.


7 hours ago, Hunter259 said:

Uh. This is something that is not changing and is basic Computer Science/Engineering. Multiple dies is a less elegant and technologically advanced

You say that, but in preliminary figures it beats the Xeons in every way for half the cost, and Intel seems concerned about it. So... your point, Mr Expert?


6 hours ago, Hunter259 said:

The product performs excellently and I have never made shots on it's performance. I just said Intel's claim is correct in that it is a "worse" way of achieving it.

Cheaper and better. Tell us again how it's worse and inferior?

