does cache still matter?

I was checking benchmarks of the Ryzen 2700X vs the i9 9900K and saw that Intel wins there. After that I looked at the new Ryzen 3000 specifications and noticed that AMD always has more cache. The new 3000 series will have over 36 MB of cache and again it will stay under 5 GHz without overclocking, while the i7 9700K has a 4.9 GHz turbo boost and the i9 9900K has 5 GHz, but less cache. So in the end Intel always has more GHz, but AMD has more cache. Why does AMD focus on so much cache? For example, the i9 9900K has a 5 GHz maximum speed and the Ryzen 2700X boosts to about 4.3 GHz but has more cache. Does the cache compensate for the lower clock speed in that situation? I mean, I assume it's possible for AMD to create a 5 GHz processor, but they keep making ~4.1-4.5 GHz processors and increase the cache every time, while it feels like Intel just focuses on high GHz.

They're different architectures, so the caches are used differently.

 

Ryzen 2xxx (e.g. the 2700X) has 4 MB of L2 cache and 16 MB of L3 cache. The 9900K has 16 MB of L3 plus 2 MB of L2, so just a bit less in total.

On Ryzen 3xxx AMD uses chiplets. Basically, instead of making a single big chip with cores + cache + memory controller + SoC functions (SATA/USB), they split the big chip into chiplets.

So they have chiplets which contain only cores and their cache, and a separate chiplet which has the IO stuff (memory controller, SATA, USB, PCIe).

So when they use two chiplets, you get up to 2 x 8 cores (16 cores / 32 threads) and up to 2 x 32 MB L3 + 2 x 4 MB L2 cache.
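To make the "N chiplets times per-chiplet cache" arithmetic concrete, here is a minimal Python sketch. The function name and the per-chiplet numbers in the usage line are my own assumptions for illustration; plug in whatever figures apply to the part you are looking at.

```python
def cache_totals(chiplets, cores_per_chiplet, l2_per_core_mb, l3_per_chiplet_mb):
    """Add up cores and cache for a CPU built from identical chiplets.

    All inputs are per-chiplet (or per-core) figures supplied by the caller;
    nothing here is looked up from a real spec sheet.
    """
    cores = chiplets * cores_per_chiplet
    l2_mb = cores * l2_per_core_mb          # L2 is private to each core
    l3_mb = chiplets * l3_per_chiplet_mb    # L3 lives on the CPU chiplet
    return {
        "cores": cores,
        "threads": cores * 2,               # assumes 2 threads per core (SMT)
        "l2_mb": l2_mb,
        "l3_mb": l3_mb,
        "total_mb": l2_mb + l3_mb,
    }

# Assumed example: 2 chiplets, 8 cores each, 0.5 MB L2 per core, 32 MB L3 per chiplet.
print(cache_totals(2, 8, 0.5, 32))
```

The point is only that total cache scales with the number of chiplets, which is why the big headline cache numbers show up on the multi-chiplet parts.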

 

For AMD it makes sense because they can take the IO stuff out and make that chip on an older, cheaper process, and they can make the CPU chiplets smaller, so they get more chiplets out of each wafer on the newer, smaller process.
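A rough way to see why smaller chiplets mean more chips per wafer is the usual first-order "gross dies per wafer" estimate. This is only a sketch: the formula ignores defects, scribe lines and rectangular die shapes, and the die areas in the usage lines are made-up illustrative values, not AMD's real numbers.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """First-order estimate of gross dies per wafer.

    wafer area / die area, minus a correction for partial dies lost
    around the round edge of the wafer.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

# Hypothetical die sizes on a standard 300 mm wafer.
small_chiplet = dies_per_wafer(300, 75)    # small cores+cache chiplet (assumed size)
big_monolith = dies_per_wafer(300, 200)    # one big all-in-one chip (assumed size)
print(small_chiplet, big_monolith)
```

On top of the raw count, a defect is also more likely to land on a big die than on a small one, so the gap in usable chips tends to be larger than this estimate suggests.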

Also, they can reuse these chiplets to make Threadripper and EPYC (server) processors, which combine more of them in one package (up to 8 CPU chiplets on EPYC).

 

So they may lose a bit of money by putting so much cache on every chiplet, but they can reuse the same chiplet in multiple products instead of designing several different chips.
 

As for frequency... the smaller the process node (lower nm = smaller chips, more chips per wafer, lower voltages, better efficiency, less power, and more profit for the company as it gets more chips out of each wafer), the harder it tends to be to reach high frequencies.

Also, the higher the frequency, the more difficult it is to make big chips: the time it takes for electrical signals to travel from one side of a chip to the other can be long enough to make parts of the CPU wait at high frequencies, so you end up not gaining as much performance.
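To put a number on the signal travel point, this small Python sketch computes how far a signal could get in one clock cycle. The speed of light is the hard upper bound; the `signal_speed_fraction` parameter is my own assumption to stand in for real on-chip wires, which are a lot slower than that.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def distance_per_cycle_mm(clock_ghz, signal_speed_fraction=1.0):
    """Distance a signal covers in one clock period, in millimetres.

    signal_speed_fraction=1.0 means "at the speed of light", which is an
    upper bound only; real on-chip signals are far slower.
    """
    period_s = 1 / (clock_ghz * 1e9)
    return SPEED_OF_LIGHT_M_PER_S * signal_speed_fraction * period_s * 1000

for ghz in (4.0, 5.0):
    best_case = distance_per_cycle_mm(ghz)
    slowed = distance_per_cycle_mm(ghz, signal_speed_fraction=0.1)  # assumed fraction
    print(f"{ghz} GHz: {best_case:.1f} mm at light speed, {slowed:.1f} mm at 10% of it")
```

Even in the best case a 5 GHz cycle only allows about 6 cm of travel, and once you assume a realistic slowdown the budget shrinks to the same order as the size of a big die.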

That's why you see them going with lots of cores, each core with its L1, L2 and L3 caches very close together... all the data is brought as physically close as possible to the processing bits, so that signals travel as little as possible inside the CPU.

 

2 minutes ago, mariushm said:

They're different architectures, so the caches are used differently.

 


Why are you saying they are different architectures? What do you mean by that? Is it not worth comparing them, because it's like comparing a Toyota Prius with a Ferrari (the Ferrari is a sports car, the Prius is for mom and dad)?

No, it's more like comparing a V8 engine with an inline 8-cylinder: both can be equally good and both get you to your destination; they just work a bit differently internally.

 

The Intel processor thinks differently internally compared to the AMD processor, but the end result is the same.

They get a list of instructions (add this to that, multiply that, if this is some value then do this, otherwise do that), so the processor looks at that list of instructions, splits them across lots of small execution units, and does things in parallel wherever possible. Often the processor even looks at stuff like "if this value is equal to some number do this, otherwise do that" and starts working past the "if" before the actual value is known. Typically it predicts which way the branch will go and runs that path speculatively (the idea of computing both versions at once exists, but is not how mainstream cores usually do it), then throws away the work if the guess turns out wrong. The caches keep the instructions and data for all of this close to the execution units, so that speculative work does not have to wait on main memory.

The cache can also be used to transfer data between cores. One processor may be designed so that multiple cores share one level 2 cache, so two cores can exchange data directly, while another CPU may be designed so that data must be in the level 3 cache before it can move between cores. Design choices like these decide how much cache is actually needed: too little cache and the cores sit waiting for data to come in, too much cache that goes unused and the CPU costs more to manufacture (that 16 MB of cache probably takes up a large share of the die area, so the CPU would be noticeably cheaper to make without it, but it's needed).
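The "too little cache makes the cores wait, too much goes unused" trade-off can be shown with a toy simulation. This is a sketch under heavy assumptions: a single fully associative cache with LRU replacement and a uniformly random access pattern, which is much simpler than any real CPU cache. The sizes are in abstract "lines", not bytes.

```python
from collections import OrderedDict
import random

def hit_rate(cache_lines, accesses):
    """Fraction of accesses served from a toy LRU cache holding `cache_lines` entries."""
    cache = OrderedDict()
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(accesses)

random.seed(0)
working_set = 1000  # number of distinct lines the "program" touches (assumed)
trace = [random.randrange(working_set) for _ in range(50_000)]

for size in (100, 500, 1000, 2000):
    print(size, round(hit_rate(size, trace), 3))
```

With this access pattern the hit rate climbs as the cache grows toward the working set, and then stops improving: a cache of 2000 lines does no better than one of 1000, because the extra capacity is never touched. Which size is "enough" depends entirely on the workload.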

So the way they use the caches can be a bit different. For regular stuff and common applications there may be no difference, but for some very specialized applications one processor may be better than the other at certain calculations.

I believe it was Intel that plans to start stacking cache vertically to allow for more capacity.  Giving the CPU as much information as possible keeps it from having to wait for it.  

 

I'd compare it to a cook (CPU) who has all of their ingredients within reach on the counter (cache) and doesn't have to go searching for them in other places, like reaching into cabinets (RAM) or digging through the pantry or refrigerator (disk drive).

 

So yes.  Cache has always mattered and will probably matter increasingly more in the future.  

 

Edit:

I did some more reading and Intel plans to stack all of the things, using its Foveros 3D packaging technology.

6 hours ago, wtf man said:

Why are you saying they are different architectures? What do you mean by that? Is it not worth comparing them, because it's like comparing a Toyota Prius with a Ferrari (the Ferrari is a sports car, the Prius is for mom and dad)?

Zen and Coffee Lake are different microarchitectures, but both implement the x86 instruction set architecture. I'm not sure a car analogy works well here, because neither design is inherently superior to the other the way a Ferrari is faster than a Prius. Coffee Lake is an iteration on a long line of earlier Intel microarchitectures, while Zen was a wholly new design and the current Zen 2 is an iteration on Zen. If you're interested in microarchitecture-level analysis, AnandTech has good coverage of most recent desktop CPU microarchitectures from Intel and AMD.

  • 2 months later...
On 6/23/2019 at 5:09 AM, wtf man said:

Why does AMD focus on so much cache? [...] Does the cache compensate for the lower clock speed in that situation?

One thing to consider is memory latency. First- and second-gen Ryzen had a bit more memory latency than Intel. With the third gen, the memory controller moves off the CPU die onto the separate IO die, so latency to main memory goes up compared to before. The doubled L3 cache makes up for that by bringing down the average latency. Since the first- and second-gen CPUs were a single die with the memory controller on it, the extra cache wasn't needed as much. Without it, the 3000 series might not have gained anything over the 2000 series.
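The "bigger L3 brings down the average latency" argument is just a weighted average, sketched below in Python. Every number in the usage lines is hypothetical and picked only to show the shape of the trade-off; these are not measured Ryzen latencies or hit rates.

```python
def average_latency_ns(l3_hit_rate, l3_latency_ns, memory_latency_ns):
    """Average latency of an access that reaches L3: hits cost the L3 latency,
    misses cost the (much larger) trip to main memory."""
    return l3_hit_rate * l3_latency_ns + (1 - l3_hit_rate) * memory_latency_ns

# Hypothetical "older" design: faster memory access, smaller L3, so fewer hits.
older = average_latency_ns(l3_hit_rate=0.80, l3_latency_ns=10, memory_latency_ns=70)

# Hypothetical "newer" design: slower memory access, doubled L3, so more hits.
newer = average_latency_ns(l3_hit_rate=0.90, l3_latency_ns=10, memory_latency_ns=80)

print(f"older: {older:.1f} ns, newer: {newer:.1f} ns")
```

With these made-up inputs the design with the slower memory path still ends up with the lower average, because far fewer accesses have to go all the way out to RAM. How much extra hit rate a doubled L3 really buys depends on the workload.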

 

Plus, cache is VERY important for a lot of pro workloads, things like compiling, I believe. There is a reason that some Intel Xeons have 40+ MB of cache on something small like a quad core. They are made for the pro market, and L3 is very important for some of those workloads. AMD doesn't have separate dies for every market; they use one for everything.

Intel normally uses the same cores for everything, but with different cache configs and fabrics (mesh vs ring) and many, many different dies for different markets.
