Raptor Lake P-Core only SKUs

alex75871
4 hours ago, Crunchy Dragon said:

The boost clock is nice, but one thing a lot of people overlook with running a Xeon is that it tends to take more work for them to actually reach the maximum boost clock. I have a Xeon E5-2690 v4 (2.6GHz base, 3.5GHz max boost) and that topped out at 2.9GHz running Cinebench R23 for the full 10 minutes.

 

2 hours ago, Crunchy Dragon said:

Not a whole lot, actually. High performance in Windows, normal BIOS changes to let it run as far as it can. It was on a Gigabyte X99 board, so it's definitely not being held back by the C612 chipset on a server board.

That's odd; the official all-core spec for that CPU is 3.2GHz.

https://en.wikichip.org/wiki/intel/xeon_e5/e5-2690_v4

 

AVX2 offset maybe?

I still think most people on this forum who hate E-cores are ignorant and have read too much marketing.

 

 

For almost all tasks that people on this forum use their computers for, the E-cores are either a performance boost or they don't matter. Rarely if ever will you run into a situation where you'd be better off without them.

 

 

In before the 0.1% on this forum who do run some workload that is negatively impacted by E-cores mention a time they had issues, and the other 99.9% of people go "see? I was right to hate them!"...

31 minutes ago, LAwLz said:

I still think most people on this forum who hate E-cores are ignorant and have read too much marketing.

 

 

For almost all tasks that people on this forum use their computers for, the E-cores are either a performance boost or they don't matter. Rarely if ever will you run into a situation where you'd be better off without them.

 

 

In before the 0.1% on this forum who do run some workload that is negatively impacted by E-cores mention a time they had issues, and the other 99.9% of people go "see? I was right to hate them!"...

I don't even know who is marketing this info about E-cores. AMD has only made a sly comment after Intel said dumb shit.
I swear, people remember the Alder Lake launch and think that information is still valid today.

The scheduling has improved a LOT since the launch of Windows 11. Sure, you may find an issue here or there, but they are few and far between, and it's only getting better. The only valid reason I can see for wanting all P-cores in a new Intel workstation/professional desktop would be per-core software licensing. (These are not server parts.)
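
For the rare case where it does matter, you don't even need an all-P-core SKU; affinity can be pinned per process. A rough sketch with psutil, assuming (hypothetically) that logical CPUs 0-15 are the P-core threads on your particular chip; check your own topology first:

```python
# Pin a process to a chosen set of logical CPUs (e.g. P-core threads only).
# Assumes: `pip install psutil`. The CPU indices below are hypothetical and
# must be checked against your own topology (Task Manager, lscpu, etc.).
import psutil

P_CORE_THREADS = list(range(0, 16))  # hypothetical: 8 P-cores with HT = logical CPUs 0-15

def pin_to_p_cores(pid: int) -> None:
    proc = psutil.Process(pid)
    proc.cpu_affinity(P_CORE_THREADS)  # restrict scheduling to these CPUs (Linux/Windows)
    print(f"{proc.name()} ({pid}) now limited to CPUs {proc.cpu_affinity()}")

if __name__ == "__main__":
    pin_to_p_cores(psutil.Process().pid)  # demo: pin this script itself
```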

5 hours ago, Mark Kaine said:

just doesn't seem to make a lot of sense... i appreciate they dropped the dreaded "e-cores" tho...

 

ps: basically this is the "budget variant" right? maybe there's a market for that, still super unexciting lol.

The DIY NAS and home lab community is excited - clearly you're not the target audience 😀

5 hours ago, leadeater said:

Not that it matters; fetching from a remote L3 slice is not optimal, and the data will actually be migrated into the local L3 first anyway, meaning that 6MB is of limited usefulness even if it remains accessible.

Intel's ring bus seems pretty good in general, in that if data isn't on the local slice but on a farther one, it doesn't matter much. I believe AMD also uses some kind of ring bus within a CCX, so something similar could apply. They don't call it a ring bus, but I recall that during the Zen 3 era it was figured out to be essentially one, with a middle bypass to cut down travel even more. Both generally offer high enough bandwidth at low enough latency not to matter. If you want "remote", try talking to a different CCX. I think only with Intel's higher-core-count CPUs featuring a mesh cache does node-to-node bandwidth/latency start to take a hit.

13 minutes ago, porina said:

Intel's ring bus seems pretty good in general, in that if data isn't on the local slice but on a farther one, it doesn't matter much.

It's more that it requires a copy operation and then another read, i.e. more than 2 cycles to access it, meaning that it's just a really small remote L3 cache, if it even stays accessible, with an access penalty beyond just the ring bus latency. I doubt 6MB is going to offer much benefit; that's quite far off from the 64MB that V-Cache adds.

 

Mind you, it would actually be 12MB less L3 cache compared to the current desktop counterpart: 4x E-core slices with 3MB each.

6 minutes ago, leadeater said:

It's more that it requires a copy operation and then another read, i.e. more than 2 cycles to access it, meaning that it's just a really small remote L3 cache, if it even stays accessible, with an access penalty beyond just the ring bus latency. I doubt 6MB is going to offer much benefit; that's quite far off from the 64MB that V-Cache adds.

When you don't have as much cache, any extra you can get is more impactful. If data is present in that segment and access is still "better" than fetching from RAM, it could still have value. If it isn't present as claimed, then this is academic anyway.

 

BTW on older Intel CPUs, I have previously disabled cores for testing reduced configurations. The associated cache on the disabled core slice remains active in that scenario. I haven't had a hybrid-era CPU, so I don't know if E-core slices behave differently.

Just now, porina said:

BTW on older Intel CPUs, I have previously disabled cores for testing reduced configurations. The associated cache on the disabled core slice remains active in that scenario.

Yea I just managed to find confirmation of that.

 

[Image: intel-core-i9-12900k-extra-tests-04.png]

 

1 minute ago, porina said:

When you don't have as much cache, any extra you can get is more impactful. If data is present in that segment and access is still "better" than fetching from RAM, it could still have value.

We'd have to assess actual hit ratios to know how much it helps, like with the RDNA3 Infinity Cache size graph that often makes the rounds. But as I discovered, it's 12MB, not 6MB, which is a 50% increase, so much more significant in size than I was thinking based on the original comment (25%).
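
As a rough illustration of why hit ratio is the thing to measure: a toy LRU simulation over a skewed synthetic access stream (purely illustrative, not a model of any real L3) shows hit rate climbing with capacity but with diminishing returns:

```python
# Toy LRU cache simulator: hit ratio vs. capacity over a skewed access stream.
# Purely illustrative; real L3 behaviour depends on the workload and on
# associativity/replacement policy, which this ignores.
import random
from collections import OrderedDict

def hit_ratio(trace, capacity):
    cache, hits = OrderedDict(), 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            cache[line] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(trace)

random.seed(0)
# Skewed access pattern: a small hot set plus a long tail of cache lines.
trace = [int(random.paretovariate(1.2)) % 100_000 for _ in range(200_000)]
for capacity in (8_000, 12_000, 16_000):   # arbitrary capacities, in cache lines
    print(f"{capacity:>6} lines -> hit ratio {hit_ratio(trace, capacity):.3f}")
```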

9 hours ago, porina said:

 

Not every task needs more than 8 cores. Other products exist if you genuinely need more.

"640K ram ought to be enough"

 

The problem is simply that people don't know what they need. Like, does any game benefit from more cores? Rarely does a game use more than 1 or 2 cores. Only very recent stuff using DX12u or Vulkan does, and even then most of the CPU time on other cores is consumed by other OS services, which rarely use full CPU cores; hence "E-cores" for the OS did make sense.

 

But in a server environment, an all-E-core configuration would have made more sense in some cases, and all P-cores in others. There's not much room for a mixed hybrid approach in servers, since a server rarely benefits from a low-energy state, other than NAS systems; and even then, those are usually not invoking a low-energy state either, but would benefit more from an all-E-core chip, because all a NAS does is encrypt/decrypt, compress/decompress and move network data. It benefits from more threads, not typically faster ones, since it's just doing disk-to-network.

 

That said, most high-core-count workloads benefit from equally clocked cores; otherwise they stall to the slowest one if the task has to run perfectly in parallel. Other things, like rendering tiles of a 3D render, can benefit from having faster and more cores but might not benefit from having as much cache. Each configuration is somewhat exclusive of the others, but you don't necessarily need a CPU tuned to the task; it's just a waste of money if you buy too much CPU for a task that isn't going to grow over the lifespan of its use. Hence many dual-core/quad-core PowerEdge systems from over a decade ago are still sitting in server racks. It would make sense to replace them with newer ones for energy reasons, but there's nothing to gain from doing so: you're still paying for the same 15A circuit whether you use it or not.

2 hours ago, Kisai said:

since a server rarely benefits from a low-energy state, other than NAS systems; and even then, those are usually not invoking a low-energy state either, but would benefit more from an all-E-core chip, because all a NAS does is encrypt/decrypt, compress/decompress and move network data. It benefits from more threads, not typically faster ones, since it's just doing disk-to-network.

Depends on the throughput you're trying to push. If you don't have really high-end NICs/DPUs offloading more than just the standard basics, then high-performance threads help a lot when doing over 10Gbps, in particular for a single end-to-end session, as most of the time only a single thread can service it. Lots of connections can be spread across threads, but that doesn't help each individual one.
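
To illustrate the single-session point: RSS-style steering hashes each connection's 5-tuple to one queue/thread, so many small flows spread out nicely, but one big flow always lands on the same worker. A minimal sketch of that idea (not any particular NIC's actual algorithm):

```python
# RSS-style flow steering sketch: each flow's 5-tuple hashes to one worker,
# so a single high-throughput flow cannot use more than one thread this way.
# Illustrative only; real NICs use Toeplitz hashing and indirection tables.
import hashlib

NUM_WORKERS = 8

def worker_for_flow(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % NUM_WORKERS

# Many small flows spread across the workers...
flows = [("10.0.0.2", 40000 + i, "10.0.0.9", 445) for i in range(16)]
print([worker_for_flow(*f) for f in flows])
# ...but one big file-transfer session maps to exactly one worker every time.
print(worker_for_flow("10.0.0.2", 40001, "10.0.0.9", 445))
```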

 

There's quite a range when it comes to NAS systems, from your basic 2/4-bay unit up to hundreds of disks, NVMe cache devices, etc.

 

2 hours ago, Kisai said:

You're still paying for the same 15A circuit whether you use it or not.

You're paying a flat rate for power in a hosting agreement? That seems like a bad deal to me, since per-kWh metered charging models exist. Depends though; it could actually be a good deal.

 

2 hours ago, Kisai said:

The problem is simply that people don't know what they need.

I would say businesses buying Xeons/servers generally have a somewhat better understanding of their requirements, and even if they don't, their software vendor will tell them.

 

2 hours ago, Kisai said:

Rarely does a game use more than 1 or 2 cores. Only very recent stuff using DX12u or Vulkan does, and even then most of the CPU time on other cores is consumed by other OS services, which rarely use full CPU cores; hence "E-cores" for the OS did make sense.

That's simply not true. Games using more than 2 cores has been the norm for a long time now, and that includes DX11 games. The whole "doesn't use XYZ number of cores" claim has been well investigated by many reviewers; it stems from the faulty reasoning that if 6 threads only sit at 30%-40% utilization on average, then the game doesn't need 6 or isn't "using" 6. Background OS load is inconsequential: you can baseline your system doing nothing, and it's legitimately not worth graphing compared to the active game running on however many threads it uses. You can disable 2 cores/threads and force the game to use only 4 on the same CPU, and many times the average utilization of the threads does not change, but the average FPS and 1%/0.1% lows decrease, so the reduction in threads is obviously having a performance impact.

 

Threads do not have to be 100% or even 50% utilized on average to say they are being used. If you collect performance data and ensure repeatability and data accuracy, you can determine whether something is having an impact or not. If a game reacts positively and increases frame rate from having more threads available, then it is using those threads, regardless of how lightly on average.
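
If anyone wants to reproduce that kind of test, the data collection side is easy enough; a rough sketch using psutil to log per-CPU utilization to CSV while you run the same benchmark pass with different core/affinity configurations (pair it with FPS data from your usual capture tool):

```python
# Log per-CPU utilization at 1s intervals while you run the same benchmark
# pass with different core counts enabled / affinity masks. Pair the CSV with
# FPS data from your capture tool of choice to compare runs.
# Assumes: `pip install psutil`.
import csv
import time
import psutil

def log_cpu_usage(path="cpu_usage.csv", seconds=120, interval=1.0):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t"] + [f"cpu{i}" for i in range(psutil.cpu_count())])
        start = time.time()
        while time.time() - start < seconds:
            per_cpu = psutil.cpu_percent(interval=interval, percpu=True)
            writer.writerow([round(time.time() - start, 1)] + per_cpu)

if __name__ == "__main__":
    log_cpu_usage(seconds=30)  # short demo run
```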

For the folks talking about AVX-512, those new Xeons do not support it:

https://ark.intel.com/content/www/us/en/ark/products/236182/intel-xeon-e-2488-processor-24m-cache-3-20-ghz.html

 

I doubt those are going to work with consumer chipsets, so they'll only be a thing for your really expensive ThinkStation or other entry-level "workstation" devices bought in bulk.

2 hours ago, Kisai said:

"640K ram ought to be enough"

There will be cases where that's true today, although more in the microcontroller space than general computing.

 

2 hours ago, Kisai said:

The problem is simply that people don't know what they need.

So what is your solution to that "problem"? No one is allowed to buy 8 or fewer cores in case they might need more?

 

2 hours ago, Kisai said:

Like, does any game benefit from more cores? Rarely does a game use more than 1 or 2 cores. Only very recent stuff using DX12u or Vulkan does, and even then most of the CPU time on other cores is consumed by other OS services, which rarely use full CPU cores; hence "E-cores" for the OS did make sense.

Do you even game? A quad-core today can still run most games, but you're not going to have the best time if you go that route. Look at the Steam Deck as an example: no one can say games run at a high level on it; the achievement is that it runs usably at all. DX12 and Vulkan have been around for many years now. We're way past the point where 4 cores was high-end gaming. Now it is barely entry level, with some games starting to get rough even with 6 modern cores.

 

I do have a side project planned for when I have the time to dedicate to it. I have a 5775C (Broadwell, 4c/8t, 128MB L4 cache) that I want to try on modern games to see what the experience is like. I won't be the first to do this, but I'll experience it first-hand.

 

2 hours ago, Kisai said:

it's just a waste of money if you buy too much CPU for a task that isn't going to grow over the lifespan of its use.

So is 8 cores too much, enough or not enough now?

5 hours ago, alex75871 said:

The DIY NAS and home lab community is excited - clearly you're not the target audience 😀

yeah.... true.... all spanish villages to me....  🙃

31 minutes ago, igormp said:

For the folks talking about AVX-512, those new Xeons do not support it:

meh, only reason for me to get an intel would be avx512, for that sweet emulation boost.... but i guess my 5800x3D does a pretty good job with that (ie. emulation) too (it feels snappier than my 3600 at least) 

 

14 minutes ago, porina said:

So is 8 cores too much, enough or not enough now

not enough, always need more cores, give me 20 cores minimum or death 💀

 

 

this is still like my dream cpu... doesn't have avx512 tho rip.

 

https://www.intel.com/content/www/us/en/products/sku/199332/intel-core-i910900k-processor-20m-cache-up-to-5-30-ghz/specifications.html

 

 

(and its too frigging expensive lol)

27 minutes ago, Mark Kaine said:

meh, only reason for me to get an intel would be avx512, for that sweet emulation boost....

Zen 4 is a thing.

 

27 minutes ago, Mark Kaine said:

not enough, always need more cores, give me 20 cores minimum or death 💀

It was in response to what I was replying to, where implications were made that 8 cores were both not enough and too much.

 

27 minutes ago, Mark Kaine said:

this is still like my dream cpu... doesn't have avx512 tho rip.

https://www.intel.com/content/www/us/en/products/sku/199332/intel-core-i910900k-processor-20m-cache-up-to-5-30-ghz/specifications.html

(and its too frigging expensive lol)

Why the 10900K? If you need a 10+ core Intel CPU with strong AVX-512, then Skylake-X has always been there. If you don't mind weaker AVX-512 at 8 cores, then the 11700K/11900K? Zen 4 is probably better though; its AVX-512 implementation is somewhere between the two Intels above, but it makes up for it with clock speed.

14 minutes ago, porina said:

Why the 10900K

well because back then there wasn't anything better (although i think intel had another very similar chip also) 

 

I'm just saying maybe that many cores aren't needed, but it's nice to have and certainly better than these fake "e-cores" that you have to disable for better performance lol...

4 minutes ago, Mark Kaine said:

well because back then there wasn't anything better (although i think intel had another very similar chip also) 

 

I'm just saying maybe that many cores aren't needed, but it's nice to have and certainly better than these fake "e-cores" that you have to disable for better performance lol...

If all you do is play games, then yeah, the e-cores are almost useless. Otherwise, they do improve MT a lot.

 

Personally, I'd love a CPU with 4~6 big cores and 20+ small cores: an MT beast, and the big cores should be more than enough for specific stuff.

29 minutes ago, igormp said:

If all you do is play games, then yeah, the e-cores are almost useless. Otherwise, they do improve MT a lot.

 

Personally, I'd love a CPU with 4~6 big cores and 20+ small cores: an MT beast, and the big cores should be more than enough for specific stuff.

i mean sure, i just think there should be a flagship non-E-core variant too... say 12c/24t 5.4GHz (6GHz?), big cache, AVX-512, 500 bucks! (uwu) as the E-cores apparently really aren't all that great outside some specific stuff.

2 hours ago, leadeater said:

That's simply not true. Games using more than 2 cores has been the norm for a long time now, and that includes DX11 games. The whole "doesn't use XYZ number of cores" claim has been well investigated by many reviewers; it stems from the faulty reasoning that ...

No, no, the reasoning behind this has always been that a higher clock speed matters more, and that has always been the case. If you have a CPU with 2 cores at 4GHz (like most CPUs Intel produced until they finally dropped the 2-core i3s with the 8th gen) or 4 cores at 2GHz (like a Xeon), the better CPU is always the one with the higher clock. No DirectX 9/OpenGL or DX10 game can utilize more than 2 cores in the graphics pipeline, because the graphics pipeline doesn't do that; any multithreading done by a game in DX9 or DX10 is done at the driver level in software, or because the developer went out of their way to find parallelization opportunities elsewhere. I have never seen a single game gain any level of performance from having more cores that wasn't attributable to the higher clock speed/boost of the better CPU.

 

In fact, even in builds like the "7 gamers, 1 CPU" project, you clearly have to leave performance on the table to use it, and you would have been better off buying 7 separate, faster PCs by any metric that mattered, because all you saved was the cost of additional PSUs, not the power it uses. They needed 2 CPUs that cost $2,700 each, with 14 cores each. So that was giving 7 users 4 cores and 8 PCIe lanes each at a 3GHz boost, plus 32GB/36GB of RAM each, but only the memory bandwidth of one channel.

 

That's close enough, but that RAM bandwidth and the cut to GPU bandwidth would have been noticeable. Meanwhile, 7x i7-4700K systems would have performed better, cost half as much, and had a 20% higher attainable clock speed.

 

Look across Intel's and AMD's entire product stacks: you will not see a CPU with more cores clock higher than one with fewer cores. It can't, because the TDP hits a wall. Hence the best CPU for gaming will be the one with the highest clock speed, every time. Disabling cores on those higher-core-count CPUs allows the CPU to boost higher under the same TDP envelope; that's where that performance comes from.
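
The TDP argument can be made concrete with the usual dynamic-power relation P ≈ n·C·V²·f and the rough assumption that voltage scales with frequency, so package power goes roughly as n·f³. A toy model with made-up constants (no real SKU is being modelled):

```python
# Toy power-budget model: with P_dyn ~ n * C * V^2 * f and V roughly
# proportional to f, per-package power scales ~ n * f^3. Solving for f at a
# fixed budget shows why fewer active cores can boost higher.
# All constants are made up for illustration; no real SKU is modelled.
BUDGET_W = 125.0
K = 0.12  # arbitrary scaling constant (W per core per GHz^3)

def max_clock_ghz(active_cores: int) -> float:
    return (BUDGET_W / (K * active_cores)) ** (1 / 3)

for cores in (4, 8, 16, 24):
    print(f"{cores:>2} active cores -> ~{max_clock_ghz(cores):.2f} GHz at {BUDGET_W:.0f} W")
```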

 

 

2 hours ago, leadeater said:

 

Threads do not have to be 100% or even 50% utilized on average to state they are being used, if you collect performance data, ensure repeatability and data accuracy you can determine if something is have an impact or not. If a game positively reacts and increases frame rate from having more threads available then it is using those threads regardless of how little on average. 

 

Again, until DX12u/Vulkan, there was no ability to scale render threads without some proprietary extension that only worked on one GPU driver. Look at Intel's drivers: THAT is why they sucked for older games; they simply did not have the drivers written or optimized to do so, because a lot more of that work was done by the driver, on the CPU. Whereas DX12u and Vulkan cede that control to the game, allowing higher utilization of the CPU and the GPU. Now, if the game doesn't scale, that's the game developer's fault.

 

How long have DX12 and Vulkan been out? Since Skylake CPUs and Kepler GPUs; it's not that old. You can literally draw a line in the sand where a quad core actually becomes useful, because again, Intel didn't drop the 2-core i3s until the 8th gen. So nothing could demand a quad core until around 2017. That doesn't mean every game released then used it; it's likely most of the games in development started years before and were still using DX11/OpenGL proprietary extensions that were not vendor-neutral. This is again why an Intel GPU would be at a disadvantage: unless a game made in 2015 was designed to be playable on the iGPU, it likely didn't use any Intel vendor extensions.

 

2 hours ago, porina said:

There will be cases where that's true today, although more in the microcontroller space than general computing.

 

So what is your solution to that "problem"? No one is allowed to buy 8 or fewer cores in case they might need more?

No, the solution is for the CPU manufacturers not to sell or market processors in a way that precludes them from being useful for "ALL" software types in the first place. Intel screwed up here by nerfing AVX-512 off the 12th-gen-and-later chips, so now anything that was designed to be able to use AVX-512 (which lowers the clock speed on the CPU when used, due to thermal limits being hit) has to consider the possibility that the feature is absent. Games are unlikely to use AVX-512, but AI loads do; look at how many AI inference loads are being run on the CPU without the GPU.
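
Which is why software that cares has to probe for the feature at runtime instead of assuming it; a quick Linux-only sketch that checks the CPU flag list (on Windows you'd query cpuid or use a library instead):

```python
# Check whether the running CPU advertises AVX-512 Foundation support.
# Linux-only sketch: parses /proc/cpuinfo flags. Code paths should fall back
# to AVX2/SSE when the flag is absent (e.g. 12th-gen+ consumer parts).
def has_avx512f() -> bool:
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return "avx512f" in line.split()
    except OSError:
        pass
    return False

print("AVX-512F available:", has_avx512f())
```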

 

Like, if your product stack goes from 4 cores to 64 cores, software developers still have to assume they will be run on a 4-core system even if there are 64 cores, so what do you do with those other 60 cores? Are you really going to give up 2GHz of clock speed to have 60 cores that your primary task is never going to use? I grapple with this question every time I look at AMD's product stack. Yes, I do have a use case, but it's only one use case, and if I have to, I have to weigh it against the gaming use case:

https://www.cpubenchmark.net/singleThread.html

[Screenshot: PassMark CPU Mark single-thread rankings]

 

The top-end Ryzen part is still slower than Intel's 14th-gen i5 parts.

[Screenshot: PassMark single-thread rankings, continued further down the list]

Look how much further down the X3D version of that high end part is, and also where the Threadrippers sit.

2 hours ago, porina said:

Do you even game? A quad core today can still run most games, but you're not going to have the best time if you went that route.

A quad core can run most games because that is mid-tier Skylake performance, from when DX12 and Vulkan were released. No game made at the time Skylake was released was using the DX12/Vulkan pipeline at launch.

 

2 hours ago, porina said:

 

So is 8 cores too much, enough or not enough now?

 

4 cores will be enough for games and business/productivity until Intel drops it off its product stack. If you are doing video editing, 3D modeling/rendering, or streaming, then 8 cores becomes the "just enough". People are seriously trying to stream and play PC games on 4 cores, and they are quickly finding out that even when using the GPU encoder, the overhead of having OBS do "anything" cripples their ability to stream, because every overlay and widget on their stream uses a CPU core. A typical streamer has as many as 16 "browser tabs" worth of CEF layers, not to mention audio filters that also chain to each other on their own threads. A 4-core is unlikely to be a pleasant streaming experience for you, the streamer, but also for the viewer, who will see the stream get interrupted whenever something that needs multiple CPU cores to synchronize is activated.

 

If you are instead trying to min-max your CPU because you have 16 or more CPU cores, that still might not be enough to game and stream if you use a lot of the stuff streamers use. It all depends on what you're trying to accomplish. Unfortunately, streamers tend to be less software tech-savvy; many still ask very basic "head-to-desk" questions and blame the software, when really the user doesn't know how much performance is sucked away by having multiple overlays, or really ANY overlays.

 

A vtuber software stack looks like this:

VTS/VNyan/VSeeFace/etc. - 2 CPU cores. Puppeting software is usually Unity, sometimes Unreal, and can use anywhere from 8% (x90 parts) to 30% (x60 parts) of a GPU; the bigger the model, the heavier the GPU cost. If they are using a webcam and not an iPhone, that also pegs one additional core at 100%. So there are up to 3 cores being used by this. If the software also utilizes a CEF layer, then that's now 4 CPU cores.

OBS - 2 cores (audio and video pipeline) plus 1 additional core per CEF overlay. When on a collab with 3 other VTubers, you can be using up to 6 cores in OBS alone, and that's before you add filters. If you use x264 instead of NVENC or another hardware encoder, that can add MANY more CPU cores (it can use up to 8 cores before quality starts going down).

Game - 2 to 4 cores, plus all remaining GPU performance. If you play something that uses DX12u/Vulkan, the game will fight the VTuber puppeting software and OBS for GPU priority, and a GPU with a smaller amount of VRAM will suffer and swap-thrash.

 

So, what does that amount to? A minimum of 6 cores, assuming you own an iPhone, have no overlays at all, and your model is 2D and small. If you have a large model (2D or 3D) and are doing an 8-person collab, now you need 14 cores to do what you previously could get away with on 6. But you also need a much better GPU if you still want to game on that computer. An 8-person collab is going to have to be indie Unity games if you still want to use only one PC per streamer.
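
Tallying it with the same per-component estimates used above (rough estimates, not measurements):

```python
# Rough core-count tally for the streaming stack described above, using the
# same per-component estimates from this post (estimates, not measurements;
# real usage depends heavily on the specific setup).
def cores_needed(cef_overlays: int, webcam: bool, game_cores: int = 2, x264: bool = False) -> int:
    puppeting = 2 + (1 if webcam else 0)  # VTS/VNyan/VSeeFace; +1 core for webcam tracking
    obs = 2 + cef_overlays                # audio/video pipeline + one core per CEF layer
    obs += 8 if x264 else 0               # worst-case software encode, per the post
    return puppeting + obs + game_cores

print("solo, iPhone tracking, no overlays:", cores_needed(0, webcam=False))    # -> 6
print("8-person collab, webcam, large model:", cores_needed(7, webcam=True))   # -> 14
```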

 

This is a legitimate use case, where "8 cores is enough" only applies to one type of streamer, the one who doesn't have a webcam or PNG avatar on the screen. 

 

The thing I've discounted here is whether all those cores or threads matter. Sure, you need at least that many THREADS, but you might not need a core dedicated to each of those threads. Only software encoding or AI inference on a CPU core requires a full core dedicated to that feature, whereas you might quite literally get away with 8 cores as long as you have 14 threads available.

 

Which brings us back to "E-cores", or "how to screw up how many cores you actually need". Those CEF layers above? They likely could run on E-cores, because they usually just spin-wait on animations; in practice they don't do anything more complicated than being several animated GIFs, one GIF per CEF layer. But the OS has no way to tell that your 8-person collab plus 4 animated widgets don't need to render at 3840x2160 at the vsync rate. One way to tank performance in OBS is to rescale a layer: every layer in OBS comes with a CPU penalty if you rescale it by even 1 pixel. You have to give OBS a canvas at your native resolution and rescale the video as a whole to avoid those penalties.

 

Hence, we're back to the question of how many cores you really need/can use. If you're lazy, or not software tech-savvy, you might have to over-build your gaming rig to avoid hitting ceilings early. If you're very software tech-savvy, you will understand how to fold all your CEF layers into a single layer using a locally saved HTML page, saving many cores from being spin-waited in OBS. Avoiding rescaling in OBS also saves CPU performance per layer.

 

So I would say that if you are NOT streaming and not playing multiplayer, then 6 cores is still serviceable for games and productivity software. However, since so many apps are being developed as CEF apps (e.g. Electron, Cordova, NW.js, etc.), what in 2016 might have been a native application that used small amounts of CPU and no GPU is now a hodge-podge of HTML and JavaScript in a CEF layer; and since CEF apps don't use threads, and instead have to communicate via IPC over expensive web workers, we're back to the problem of lazy and sloppy software development dictating how many cores you can actually use.

 

If the trend continues, we're going to continue to see poorly performing CEF apps that can be chalked up entirely to trying to use the CEF layer as a GUI.

 

Instead of multithreading, we've been going backwards in every piece of client software that is not a game. Things are becoming less threaded and more dependent on full process forking, so looking at how many cores or threads a CPU has becomes irrelevant when only the highest clock speed matters. We've come full circle: from games not being threaded but applications being threaded, to games being threaded and applications not being threaded, because they've become single-threaded JavaScript.

 

Need I remind everyone again that loading the Epic Games Store takes literal minutes, never mind installing or playing something. When you have 6 different game stores idling, using CPU, GPU and hundreds of megabytes of RAM to do absolutely nothing, those "extra cores" become valuable to avoid impacting your game or video editing performance.

@Kisai

CPUMark is hardly a realistic reflection of performance, especially knowing that even the 5800X3D often goes up against the 13900K and 14900K in gaming with very little lag behind them, and it's an old Zen 3 chip with 3D cache bolted on top. X3D chips are amazing for gaming, but you can do other things just as well with them. I have the 5800X3D and I also do video encoding, image processing and AI image processing (which is mainly done on the GPU), as well as image optimization with FileOptimizer, which is almost entirely done on the CPU, and it does the job really well. It's not as fast as the 5800X because of the lower clocks, but the difference is so insignificant it just doesn't matter when it's so much better in games.

10 hours ago, LAwLz said:

I still think most people on this forum who hate E-cores are ignorant and have read too much marketing.

 

 

For almost all tasks that people on this forum use their computers for, the E-cores are either a performance boost or they don't matter. Rarely if ever will you run into a situation where you'd be better off without them.

 

 

In before the 0.1% on this forum who do run some workload that is negatively impacted by E-cores mention a time they had issues, and the other 99.9% of people go "see? I was right to hate them!"...

The issue is not the cores themselves; they are pretty capable. The issue is that the scheduler assigning them workloads isn't. Windows is pretty bad at it, and that applies to both Intel and AMD chips. While AMD doesn't have E/P-cores, they do have normal cores and X3D cores, and those CPUs also have constant scheduler problems. It's just not great. Now, with all-out X3D CPUs like the 5800X3D and 7800X3D, you know they'll run slightly worse for productivity, but you can be assured they'll NEVER have ANY issues with ANY game, because any thread will always have the entire full-fat L3 cache available. On top of that, the 5800X3D and 7800X3D have a single CCD, which means no CCD-to-CCD shenanigans either, unlike parts such as the regular 7950X. Though these newer generations don't have as many inter-CCD problems as the initial Ryzens...

4 hours ago, RejZoR said:

The issue is that the scheduler assigning them workloads isn't.

Source?

7 hours ago, RejZoR said:

@Kisai

CPUMark is hardly a realistic reflection of performance, especially knowing that even the 5800X3D often goes up against the 13900K and 14900K in gaming with very little lag behind them, and it's an old Zen 3 chip with 3D cache bolted on top. X3D chips are amazing for gaming, but you can do other things just as well with them. I have the 5800X3D and I also do video encoding, image processing and AI image processing (which is mainly done on the GPU), as well as image optimization with FileOptimizer, which is almost entirely done on the CPU, and it does the job really well. It's not as fast as the 5800X because of the lower clocks, but the difference is so insignificant it just doesn't matter when it's so much better in games.

Again, for a GAME, single-thread performance is the single most important performance metric, particularly for anything that isn't DirectX 12u/Vulkan.

 

For all other uses, if something can min-max the CPU cores (3D rendering or video editing), then equally performing cores are better than not. Asymmetric cache across cores, and cores that don't boost evenly, will cause such a task to run at the pace of the slowest core in the system. This is why the scheduler has to be intelligent about how it deals with those cores; otherwise your 16-core CPU with 3D cache will perform the same as one without it.
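
As a toy model of that stall, assuming the work is split into static, equal-sized chunks per core; dynamic chunking/work stealing largely avoids it, which is where the scheduler and runtime earn their keep:

```python
# Toy fork-join model: with static equal chunking, the job finishes when the
# slowest core does; with dynamic chunking, faster cores keep pulling work.
# Core "speeds" below are arbitrary illustrative numbers, not real silicon.
import heapq

speeds = [5.7, 5.7, 5.7, 5.7, 4.3, 4.3, 4.3, 4.3]   # work units per second
TOTAL_WORK = 800.0

# Static split: each core gets an equal slice up front.
static_makespan = max((TOTAL_WORK / len(speeds)) / s for s in speeds)

# Dynamic split: 1-unit chunks pulled greedily by whichever core is free first.
heap = [(0.0, s) for s in speeds]        # (time this core becomes free, speed)
heapq.heapify(heap)
for _ in range(int(TOTAL_WORK)):
    free_at, s = heapq.heappop(heap)
    heapq.heappush(heap, (free_at + 1.0 / s, s))
dynamic_makespan = max(t for t, _ in heap)

print(f"static chunking : {static_makespan:.1f} s (bound by the slowest cores)")
print(f"dynamic chunking: {dynamic_makespan:.1f} s")
```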

 

Certain loads also do not benefit from parallelization at all, such as compression. Compression is linear. Encryption is linear.

 

6 minutes ago, Kisai said:

Asymmetric cache across cores, and cores that don't boost evenly, will cause such a task to run at the pace of the slowest core in the system. This is why the scheduler has to be intelligent about how it deals with those cores; otherwise your 16-core CPU with 3D cache will perform the same as one without it.

 

I'm sorry, this doesn't track for me. Parallel threads are not dependent on each other unless they are racing. There is no way to say the task will run at the pace of the slowest core without assuming the fast core is waiting on a result from the slow core and creating a massive pipeline bubble.

29 minutes ago, Kisai said:

Certain loads also do not benefit from parallelization at all, such as compression. Compression is linear. Encryption is linear.

 

Modern compression and encryption algorithms do scale with more cores. All of your assumptions seem to be based on software from 10 years ago.
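
For illustration: tools like pigz and zstd's multithreaded mode split the input into independent blocks, compress them on separate cores and concatenate the results. A minimal sketch of that idea with Python's zlib and a process pool (simplified; the real tools handle framing, dictionaries and ordering more carefully):

```python
# Minimal block-parallel compression sketch, in the spirit of pigz / zstd -T:
# split the input into independent chunks, compress them on separate cores,
# then concatenate. Simplified: real tools manage framing and shared state.
import os
import zlib
from concurrent.futures import ProcessPoolExecutor

CHUNK = 4 * 1024 * 1024  # 4 MiB blocks

def compress_block(block: bytes) -> bytes:
    return zlib.compress(block, level=6)

def parallel_compress(data: bytes, workers: int = os.cpu_count()) -> list:
    blocks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_block, blocks))  # output order is preserved

if __name__ == "__main__":
    payload = os.urandom(CHUNK) * 8  # 32 MiB of demo data
    out = parallel_compress(payload)
    print(f"{len(out)} blocks, {sum(map(len, out))} compressed bytes")
```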
