
Intel Core i9-11900(K) + i7-11700(K/KF) 8c16t Rocket Lake Desktop CPU Benchmarks and Pricing Leaked (Update #8)

1 minute ago, Bombastinator said:

Should be a lot more than 3%. More like 33%. If it’s not it implies something is up. 

What is your reasoning for thinking that? Did some testing show as much?

 

Some compute use cases hit the bus much harder than gaming does, and there it makes a bigger difference. I believe folding is one such case, where a mining-style 1x riser will cripple performance. However, I'm not aware of games requiring that level of connectivity.

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


On 12/26/2020 at 6:32 PM, Vishera said:

Thank you for your reply, it's very informative!

What do you think about a wafer made of diamonds?

Seems like it's possible.

Possible, yes. Likely? Idk. I don't know how feasible it would be to synthesise wafers of the size required for cost-effective CPU production.

 

On 12/26/2020 at 6:32 PM, Vishera said:

I see, I read some interviews with people working on the R&D for this project.

They mentioned that IBM is involved in the project and that it's possible we will see a wafer fully made of carbon nanotubes in a few years,

and that it's not a question of if, but a question of when. They also talked about some of the obstacles they encountered and how they solved them.

Interesting. Would love to read them, as I can't see how that's viable (or useful) myself. A 2D film of nanotubes provides the benefits we want for a processor; I can't see how having multiple layers of them would help.

 

They also wouldn't make much sense for the power-routing parts of the circuitry. When nanotubes are deposited, you can't guarantee their orientation - it looks like a bird's nest. Connecting them to make this distribution layer would also have to be dynamic - a result more akin to a spider's web than a grid - because of their random orientations. Using a static grid-based power delivery design would, as far as I can see, lead to either zero-gain from using the nanotubes or to short circuits and chaos. But if it looks like a web then ensuring that power goes where you want it to go will be a huge challenge - there will likely be holes in your web - as would ensuring that it doesn't go where it shouldn't.

 

It makes more sense to me to make use of the other properties of the CNTs to enhance the more traditional methods, as they did in the paper:

Quote

Furthermore, RV16X-NANO has a three-dimensional (3D) physical architecture, as the metal interconnect layers are fabricated both above and below the layer of CNFETs; this is in contrast to silicon-based systems in which all metal routing can only be fabricated above the bottom layer of silicon FETs (see Methods). In RV16X-NANO, the metal layers below the CNFETs are primarily used for signal routing, while the metal layers above the CNFETs are primarily used for power distribution (Fig. 1c, d). The fabrication process implements five metal layers and includes more than 100 individual processing steps (see Methods and section ‘MMC’ for details). This 3D layout, with routing above and below the FETs promises improved routing congestion (a major challenge for today’s systems), and is uniquely enabled by CNTs (owing to their low-temperature fabrication; see Methods).

 

 

 

On 12/26/2020 at 6:32 PM, Vishera said:

Also I found out that silicon carbide wafers are a thing:

[Image: silicon carbide wafer]

It makes sense, the superior performance that carbon offers combined with the familiarity of silicon seems like a good in-between solution until full carbon wafers are a thing.

Silicon carbide is different. It can withstand higher voltages than regular silicon and can be more efficient at those voltages, so it's incredibly useful in high-voltage applications like electric vehicles. But because it has a far wider band gap, it also requires higher voltages to work in the first place; at the low voltages inside a CPU it is unlikely to prove worthwhile.

 

It's also difficult to work with (it's hard to produce high-quality SiO2 from SiC), has big problems with defect density, and is far harder to grow as economically viable wafers. SiC wafers are only ~6 inches in diameter, as opposed to the 12-inch silicon wafers used today, which massively drives up the cost of production vs regular silicon, hence it is only used in fields where its downsides can be used as advantages.

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


1 hour ago, porina said:

What is your reasoning for thinking that? Did some testing show as much?

 

Some compute use cases hit the bus much harder than gaming does, and there it makes a bigger difference. I believe folding is one such case, where a mining-style 1x riser will cripple performance. However, I'm not aware of games requiring that level of connectivity.

Because a 3080 is generally about a third faster than a 2080ti, and a 2080ti is supposed to be just barely bottlenecked by PCIe 3.0 x8. It stands to reason that PCIe 3.0 x8 should be a big problem for a 3080. Somehow 33% turns into 3%.

Not a pro, not even very good.  I’m just old and have time currently.  Assuming I know a lot about computers can be a mistake.

 

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


2 hours ago, Bombastinator said:

Because a 3080 is generally about a third faster than a 2080ti, and a 2080ti is supposed to be just barely bottlenecked by PCIe 3.0 x8. It stands to reason that PCIe 3.0 x8 should be a big problem for a 3080. Somehow 33% turns into 3%.

The 1080 saw ~1% lower performance using PCIe 3.0 x8 vs x16, so by your logic the 2080ti should therefore have been bottlenecked by 30+% when using PCIe 3.0 x8. But it wasn't - it saw a similar 2-3% reduction vs 3.0 x16. You're making the assumption that raw card performance and PCIe utilisation scale at the same rate, which they clearly do not, given the historical evidence.

 

Imo it's likely not to do with raw sustained bandwidth, but with the ability to send more transfers in any given instant (and with 4.0, those transfers taking half as long to complete). Over long periods of time, no, the card likely doesn't saturate the interface, even at 3.0 x8. But I imagine there will be small periods of time where it does - spikes in usage - and that's where I feel the faster interfaces are making their gains: raising the bandwidth ceiling to prevent those spikes from slowing things down. That's why the gains are only single-digit percent - using that much bandwidth just isn't that common an occurrence. (This is all just speculation, I'm willing to be proved wrong.)
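To make the "spikes" idea concrete, here's a toy Python sketch with entirely made-up per-frame transfer sizes (real numbers vary per game and engine, so treat this as the shape of the argument, not a measurement):

```python
# Toy model of the "spikes" argument above. All numbers are made up for
# illustration; real per-frame transfer sizes depend on the game and engine.
import random

random.seed(42)

PCIE3_X8  = 7.9e9    # bytes/s, roughly usable PCIe 3.0 x8 bandwidth
PCIE3_X16 = 15.8e9   # bytes/s, roughly usable PCIe 3.0 x16 bandwidth

DRAW_TIME    = 8e-3  # seconds the GPU spends rendering each frame
BASE_XFER    = 5e6   # bytes sent over PCIe on a typical frame
SPIKE_XFER   = 200e6 # bytes on an occasional heavy frame (asset streaming etc.)
SPIKE_CHANCE = 0.01  # 1% of frames are "spike" frames

def avg_fps(bandwidth, frames=100_000):
    total = 0.0
    for _ in range(frames):
        xfer = SPIKE_XFER if random.random() < SPIKE_CHANCE else BASE_XFER
        total += xfer / bandwidth + DRAW_TIME   # simplification: transfer, then draw
    return frames / total

print(f"PCIe 3.0 x8 : {avg_fps(PCIE3_X8):6.1f} fps")
print(f"PCIe 3.0 x16: {avg_fps(PCIE3_X16):6.1f} fps")
# The average barely moves when the bus is halved, because the heavy frames
# are rare - but each heavy frame individually takes twice as long to feed.
```

With these made-up numbers the gap comes out around 5%, i.e. single-digit percent on average even though the occasional heavy frame is hit hard.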

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


25 minutes ago, tim0901 said:

The 1080 saw ~1% lower performance using PCIe 3.0 x8 vs x16, so by your logic the 2080ti should therefore have been bottlenecked by 30+% when using PCIe 3.0 x8. But it wasn't - it saw a similar 2-3% reduction vs 3.0 x16. You're making the assumption that raw card performance and PCIe utilisation scale at the same rate, which they clearly do not, given the historical evidence.

 

Imo it's likely not to do with raw sustained bandwidth, but with the ability to send more transfers in any given instant (and with 4.0, those transfers taking half as long to complete). Over long periods of time, no, the card likely doesn't saturate the interface, even at 3.0 x8. But I imagine there will be small periods of time where it does - spikes in usage - and that's where I feel the faster interfaces are making their gains: raising the bandwidth ceiling to prevent those spikes from slowing things down. That's why the gains are only single-digit percent - using that much bandwidth just isn't that common an occurrence. (This is all just speculation, I'm willing to be proved wrong.)

It’s odd that they wouldn’t, because the whole point behind the cpu->gpu->monitor thing is for the cpu to build the frame, the gpu to draw the frame, and the monitor to display the frame. For there to be very little change would mean that the pcie connection to the cpu is not a limiting factor, and the limitation is WITHIN the gpu. I have heard that one of the big problems with eGPUs is not only the latency of the connection but also its bandwidth. There's supposed to not be much of a point in using a GPU more powerful than a 580 for gaming via eGPU, because the 4x PCIe-equivalent connection of Thunderbolt gets saturated and there aren't any gains. The implication is that someone is wrong. Either someone's numbers somewhere are messed up, or my understanding of the way CPUs and GPUs work together is fundamentally flawed. My money, I think, would be on either my lack of understanding or Nvidia lying about something. Means little though.


Not a pro, not even very good.  I’m just old and have time currently.  Assuming I know a lot about computers can be a mistake.

 

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


2 hours ago, Bombastinator said:

Because a 3080 is generally about a third faster than a 2080ti, and a 2080ti is supposed to be just barely bottlenecked by PCIe 3.0 x8. It stands to reason that PCIe 3.0 x8 should be a big problem for a 3080. Somehow 33% turns into 3%.

Just making up some numbers for illustration. With GPU1, say it takes 1 ms to transfer the data per frame over the PCIe bus. The GPU itself takes 10 ms to draw the frame. Total time: 11 ms. GPU2 is 30% faster at drawing, so it draws the frame in about 7.7 ms; add the same 1 ms transfer time and you get roughly 8.7 ms overall. So even though GPU2 has 30% more drawing speed, the PCIe transfer time means it is only about 27% faster overall. The PCIe transfer became a bigger proportion of the time, which I guess is what people call the "bottleneck".

 

Now, if the PCIe time was much bigger, then as the GPU gets faster, it will have more impact. That part is expected, but we're some way off that.

 

Note my very simplistic example won't be fully accurate. It assumes things have to be done sequentially and not at the same time, and it would get very complicated very fast to try to account for all the variables. You could apply similar logic to RAM speeds, for instance: running much faster RAM gives a smaller benefit than the raw difference would suggest, because it only affects one part of the whole chain.
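If anyone wants to play with those numbers, a few lines of Python reproduce the illustration (keeping the simplification above that the transfer and the draw happen one after the other):

```python
# Reproducing the illustration above: a fixed PCIe transfer time per frame,
# plus a draw time that shrinks as the GPU gets faster.
transfer = 1.0    # ms per frame spent on the PCIe transfer (same for both GPUs)
base_draw = 10.0  # ms for GPU1 to draw a frame

for speedup in (1.0, 1.3, 2.0, 4.0):
    draw = base_draw / speedup
    frame_time = transfer + draw
    overall = (transfer + base_draw) / frame_time
    print(f"GPU {speedup:.1f}x faster at drawing -> {frame_time:5.2f} ms/frame, "
          f"{(overall - 1) * 100:4.0f}% faster overall")
# 1.3x drawing speed comes out at ~27% overall; the fixed 1 ms transfer only
# starts to dominate once the GPU is far faster than the example GPU1.
```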

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


33 minutes ago, porina said:

Just making up some numbers for illustration. With GPU1, say it takes 1 ms to transfer the data per frame over the PCIe bus. The GPU itself takes 10 ms to draw the frame. Total time: 11 ms. GPU2 is 30% faster at drawing, so it draws the frame in about 7.7 ms; add the same 1 ms transfer time and you get roughly 8.7 ms overall. So even though GPU2 has 30% more drawing speed, the PCIe transfer time means it is only about 27% faster overall. The PCIe transfer became a bigger proportion of the time, which I guess is what people call the "bottleneck".

 

Now, if the PCIe time was much bigger, then as the GPU gets faster, it will have more impact. That part is expected, but we're some way off that.

 

Note my very simplistic example won't be fully accurate. It assumes things have to be done sequentially and not at the same time, and it would get very complicated very fast to try to account for all the variables. You could apply similar logic to RAM speeds, for instance: running much faster RAM gives a smaller benefit than the raw difference would suggest, because it only affects one part of the whole chain.

The sequentially-vs-at-the-same-time thing goes to the video buffer. My understanding of the way 3D games are done is that while the CPU and GPU work at the same time, they work at different speeds, and the way they are synced is with a buffer that acts a lot like the fluid clutch on an old-style automatic transmission. The CPU creates the frames very roughly and throws those rough numbers into the buffer, which holds a certain number of frames. The GPU picks the frames out of the buffer one by one, plays paint-by-numbers on them, and kicks the result to the monitor. This way the CPU and GPU can work at different speeds. With a GPU bottleneck the buffer is always full: the CPU creates a frame and then has to wait around for there to be enough space in the buffer to fit another one in. With a CPU bottleneck the GPU draws faster than the CPU can create, and the buffer is never full. The PCIe link is the conduit through which the frames are passed from the CPU to the GPU, and the CPU's RAM is where the buffer is.

This is, I understand, a gross oversimplification, though it is also the limit of my understanding of the concept.
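That fluid-clutch analogy is essentially a bounded producer-consumer queue. Below is a minimal toy sketch of just that queueing behaviour (not how a real driver or command queue is implemented):

```python
# A minimal sketch of the buffering described above: the CPU (producer)
# prepares frames, the GPU (consumer) draws them, and a small bounded queue
# between them lets the two sides run at different speeds.
# This is a toy model, not how a real driver/command queue works.
import queue
import threading
import time

frame_queue = queue.Queue(maxsize=3)   # the "buffer" holding prepared frames
CPU_TIME = 0.004                       # s to prepare a frame
GPU_TIME = 0.008                       # s to draw a frame (GPU-bound case)
FRAMES = 20

def cpu_thread():
    for i in range(FRAMES):
        time.sleep(CPU_TIME)           # simulate game/driver work
        frame_queue.put(i)             # blocks when the buffer is full
    frame_queue.put(None)              # signal "no more frames"

def gpu_thread():
    while True:
        frame = frame_queue.get()      # blocks when the buffer is empty
        if frame is None:
            break
        time.sleep(GPU_TIME)           # simulate rendering

start = time.time()
t1 = threading.Thread(target=cpu_thread)
t2 = threading.Thread(target=gpu_thread)
t1.start(); t2.start(); t1.join(); t2.join()
print(f"{FRAMES} frames in {time.time() - start:.3f} s "
      f"-> limited by the slower side ({max(CPU_TIME, GPU_TIME) * 1000:.0f} ms/frame)")
```

Swap CPU_TIME and GPU_TIME and the queue sits empty instead of full, which is the CPU-bound case described above.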


Not a pro, not even very good.  I’m just old and have time currently.  Assuming I know a lot about computers can be a mistake.

 

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


17 minutes ago, Bombastinator said:

It’s odd that they wouldn’t, because the whole point behind the cpu->gpu->monitor thing is for the cpu to build the frame, the gpu to draw the frame, and the monitor to display the frame.

Most of the communication that a GPU does isn't with the CPU though, it's with its own memory. A 3080 has a PCIe bandwidth of 31.5 GB/s when using 4.0 x16, but has a VRAM bandwidth of 760.3 GB/s - roughly 24x faster. Even a GT 1030 (GDDR5) has a faster memory bandwidth at 48 GB/s.
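As a quick sanity check on those numbers, here's the per-frame data budget each link allows at a couple of frame rates (simple arithmetic on the figures above, nothing more):

```python
# How much data can cross each link within the time budget of a single frame.
PCIE4_X16 = 31.5    # GB/s, 3080 on PCIe 4.0 x16 (figure quoted above)
VRAM_3080 = 760.3   # GB/s, 3080 GDDR6X (figure quoted above)

for fps in (60, 144):
    budget_s = 1 / fps
    print(f"{fps:3d} fps: PCIe {PCIE4_X16 * budget_s * 1000:6.0f} MB/frame, "
          f"VRAM {VRAM_3080 * budget_s * 1000:6.0f} MB/frame "
          f"(ratio ~{VRAM_3080 / PCIE4_X16:.0f}x)")
```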

 

Yes the CPU technically "builds the frame" - but all it really does is say "hey, this is what changed in the scene, draw me a frame" - the GPU does the rest independently. All the models and textures are stored in VRAM - what the GPU does is apply the transforms supplied by the CPU to these and use the result to render the frame.

 

Quote

For there to be very little change would mean that the pcie connection to the cpu is not a limiting factor, and the limitation is WITHIN the gpu.

This is very possible and happens regularly. It's why VRAM bandwidth is so important - a bad memory configuration can easily bottleneck a gpu, such as with the DDR4 GT 1030.

 

Quote

I have heard that one of the big problems with eGPUs is not only the latency of the connection but also its bandwidth. There's supposed to not be much of a point in using a GPU more powerful than a 580 for gaming via eGPU, because the 4x PCIe-equivalent connection of Thunderbolt gets saturated and there aren't any gains. The implication is that someone is wrong. Either someone's numbers somewhere are messed up, or my understanding of the way CPUs and GPUs work together is fundamentally flawed.

It's not that there aren't gains, but that they're severely limited. If memory serves me correctly, a 2080 performed similarly to a 2060 through an eGPU, but a 2070 would perform worse and a 2080ti would perform better. It wasn't that there was no benefit to it, but that it wasn't worth the massive price increase.

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


On 12/26/2020 at 6:16 PM, beerdrunkmonk said:

 

While it is 14nm, it's not Skylake architecture.

 

"Rocket Lake is a codename for Intel’s desktop x86 chip family which is to be released in the first quarter of 2021. It will be based on the new Cypress Cove microarchitecture, a variant of Sunny Cove (used by Intel's Ice Lake mobile processors) backported to the older 14nm process." Source: https://en.wikipedia.org/wiki/Rocket_Lake

Very interesting, I did not know this. This means Rocket Lake is this generation's Broadwell, where we have two microarchitectures on a single socket/chipset.

 

Someone can correct me if I am wrong, but if that Wikipedia article is true, these would be the first consumer-grade, non-Extreme Edition CPUs with AVX-512 support, right? If it is based on Sunny Cove, the core used in Ice Lake, does this mean we get the more feature-rich, fuller-flavored AVX-512 that Ice Lake promised?

Quote

AVX-512 F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, VAES

Depending on how the AVX support shakes out, I may have picked a weird time to buy a 5950X, lol.

My (incomplete) memory overclocking guide: 

 

Does memory speed impact gaming performance? Click here to find out!

On 1/2/2017 at 9:32 PM, MageTank said:

Sometimes, we all need a little inspiration.

 

 

 


5 minutes ago, MageTank said:

Someone can correct me if I am wrong, but if that Wikipedia article is true, these would be the first consumer-grade, non-Extreme Edition CPUs with AVX-512 support, right? If it is based on Sunny Cove, the core used in Ice Lake, does this mean we get the more feature-rich, fuller-flavored AVX-512 that Ice Lake promised?

On desktop, yes. AVX-512 so far has only been present in mobile (since Ice Lake?), HEDT (since Skylake-X), and server.

 

Depends on which specific parts of AVX-512 you're most interested in. As far as I can tell, if you're after FP64 goodness, this is unlikely to satisfy. The mobile parts had a single FMA unit, and thus ~half the per-clock throughput of HEDT parts in that area. If it's the other stuff under the AVX-512 umbrella, no idea.
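For the FP64 point specifically, the "single unit vs two units" difference is easiest to see as peak-throughput arithmetic. A rough sketch, assuming 512-bit FMA units (8 doubles, 2 FLOPs per FMA) and an example clock:

```python
# Back-of-envelope peak FP64 throughput per core for AVX-512 FMA.
# Assumes 512-bit vectors (8 doubles) and 2 FLOPs per fused multiply-add;
# the 4.0 GHz clock is just an example, not a specific SKU.
def peak_fp64_gflops(ghz, fma_units):
    lanes = 512 // 64          # 8 double-precision lanes per 512-bit register
    return ghz * fma_units * lanes * 2

print("1 x 512-bit FMA unit  @ 4.0 GHz:", peak_fp64_gflops(4.0, 1), "GFLOPS/core")
print("2 x 512-bit FMA units @ 4.0 GHz:", peak_fp64_gflops(4.0, 2), "GFLOPS/core")
```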

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


13 minutes ago, tim0901 said:

Most of the communication that a GPU does isn't with the CPU though, it's with its own memory. A 3080 has a PCIe bandwidth of 31.5 GB/s when using 4.0 x16, but has a VRAM bandwidth of 760.3 GB/s - roughly 24x faster. Even a GT 1030 (GDDR5) has a faster memory bandwidth at 48 GB/s.

 

Yes the CPU technically "builds the frame" - but all it really does is say "hey, this is what changed in the scene, draw me a frame" - the GPU does the rest independently. All the models and textures are stored in VRAM - what the GPU does is apply the transforms supplied by the CPU to these and use the result to render the frame.

 

This is very possible and happens regularly. It's why VRAM bandwidth is so important - a bad memory configuration can easily bottleneck a gpu, such as with the DDR4 GT 1030.

 

It's not that there aren't gains, but that they're severely limited. If memory serves me correctly, a 2080 performed similarly to a 2060 through an eGPU, but a 2070 would perform worse and a 2080ti would perform better. It wasn't that there was no benefit to it, but that it wasn't worth the massive price increase.

Re: vram

This goes to the definition of what is meant by GPU. You seem to be referring specifically to the chip, whereas when I think of a GPU I think of the entire object that goes into the slot, so a GPU and its memory are one unit, separated from the rest of the system by the PCIe bus.

 

re: limiting factors 

but none of that passes through the pcie bus.  That’s all card-side

 

re: card types

This implies that the main limitation of the eGPU is NOT the bandwidth, but the latency. The card can only ask for more frames so fast, and the CPU can only transmit them so fast - not because of the number of lanes available, but because of the time it takes to go through the rigamarole of converting the data, squirting it through the cable, and unpacking it into something useful again.
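That latency-vs-bandwidth distinction can be written as a simple transfer-time model: time per transfer is roughly a fixed overhead plus size divided by bandwidth. The figures below are illustrative guesses, not measured Thunderbolt or PCIe numbers:

```python
# Toy model: time to move a block of data = fixed overhead + size / bandwidth.
# All figures are illustrative, not measurements.
def transfer_ms(size_mb, bandwidth_gbps, overhead_us):
    return overhead_us / 1000 + size_mb / bandwidth_gbps   # 1 GB/s == 1 MB/ms

for size_mb in (0.01, 1.0, 50.0):
    internal = transfer_ms(size_mb, bandwidth_gbps=15.8, overhead_us=1)   # PCIe 3.0 x16-ish slot
    egpu     = transfer_ms(size_mb, bandwidth_gbps=3.9,  overhead_us=20)  # TB3 x4-ish link
    print(f"{size_mb:5.2f} MB: internal {internal:7.4f} ms, eGPU {egpu:7.4f} ms")
# Small, frequent transfers are dominated by the fixed overhead (latency),
# while only large transfers are dominated by raw bandwidth.
```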


Not a pro, not even very good.  I’m just old and have time currently.  Assuming I know a lot about computers can be a mistake.

 

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


7 minutes ago, Bombastinator said:

Re: vram

This goes to the definition of what is meant by GPU. You seem to be referring specifically to the chip, whereas when I think of a GPU I think of the entire object that goes into the slot, so a GPU and its memory are one unit, separated from the rest of the system by the PCIe bus.

Potato potato. The pcie connection goes directly from the connector to the GPU chip - they're the same thing when you're saying CPU->GPU->Monitor.

 

7 minutes ago, Bombastinator said:

re: limiting factors 

but none of that passes through the pcie bus.  That’s all card-side

And it doesn't change my point that, as you said:

Quote

For there to be very little change would mean that the pcie connection to the cpu is not a limiting factor, and the limitation is WITHIN the gpu.

The GPU is communicating far more internally than with the rest of the system. It makes perfect sense that bottlenecks would therefore come from within - it's the whole reason Nvidia wanted GDDR6X, and the whole reason AMD went with HBM for Vega. Bottlenecks within the GPU card, be they the chip itself or the VRAM, have always been an issue.

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


6 hours ago, Stahlmann said:

Since when does any ryzen CPU "require" liquid cooling?

A decent air cooler like a D15, Dark Rock Pro 4 or Freezer 50 will keep even a 5950X cool with ease.

Every OEM and BYO system I've seen selling the 3950X has a liquid cooling requirement for it.

https://www.tomshardware.com/reviews/alienware-aurora-r10-ryzen-edition

 

AMD recommends liquid cooling, which means the OEMs are not going to risk warranty claims by cheaping out on what is arguably one of the cheapest parts in the system. The air coolers that can handle a Ryzen 3900X/3950X/5900X/5950X are massive: the D15 that fits is 16.5 cm x 16.1 cm x 15.0 cm and weighs 1.32 kg.

 

https://ncc.noctua.at/cpus/model/AMD-Ryzen-9 5950X-1044

 

Those large coolers are usually too big to use on anything but full size ATX boards.

 

Anyway, the point I was trying to make here is that avoiding a liquid cooler comes at the cost of potentially more thermal throttling and fan noise. The R9 and i9 parts might score high on benchmarks, but they're largely not going to maintain those high numbers if they have to thermal throttle. AVX instructions on Intel parts, for example, are known to kick the CPU clocks down.

 


On 12/28/2020 at 8:32 AM, porina said:

Looking back at the techpowerup testing, they claimed a 1% difference on a 3080 between 3.0 and 4.0, at 1080p and 4k. It's irrelevant unless you're a competitive benchmarker. Many other things in the system could make as much difference.

 

https://www.techpowerup.com/review/nvidia-geforce-rtx-3080-pci-express-scaling/27.html

 

Maybe if RTX IO (or the generic equivalent for AMD) gets used in more future games, I might change my mind then, but I'll likely have a newer system by the time that is in any way relevant.

Pretty much every reviewer comes to the conclusion that PCIe 4 is a marketing exercise with no practical benefit, but even enthusiasts lap it up like it's an essential feature. It's been this way for PCIe 3 and PCIe 2 when those came out too. Many of these improvements really only help enterprise users, where they can stack 50 people on the same box and thrash the hardware at max utilization.

Workstation:  14700nonk || Asus Z790 ProArt Creator || MSI Gaming Trio 4090 Shunt || Crucial Pro Overclocking 32GB @ 5600 || Corsair AX1600i@240V || whole-house loop.

LANRig/GuestGamingBox: 9900nonK || Gigabyte Z390 Master || ASUS TUF 3090 650W shunt || Corsair SF600 || CPU+GPU watercooled 280 rad pull only || whole-house loop.

Server Router (Untangle): 13600k @ Stock || ASRock Z690 ITX || All 10Gbe || 2x8GB 3200 || PicoPSU 150W 24pin + AX1200i on CPU|| whole-house loop

Server Compute/Storage: 10850K @ 5.1Ghz || Gigabyte Z490 Ultra || EVGA FTW3 3090 1000W || LSI 9280i-24 port || 4TB Samsung 860 Evo, 5x10TB Seagate Enterprise Raid 6, 4x8TB Seagate Archive Backup ||  whole-house loop.

Laptop: HP Elitebook 840 G8 (Intel 1185G7) + 3080Ti Thunderbolt Dock, Razer Blade Stealth 13" 2017 (Intel 8550U)


16 hours ago, AnonymousGuy said:

Pretty much every reviewer comes to the conclusion that PCIe 4 is a marketing exercise with no practical benefit, but even enthusiasts lap it up like it's an essential feature. It's been this way for PCIe 3 and PCIe 2 when those came out too. Many of these improvements really only help enterprise users, where they can stack 50 people on the same box and thrash the hardware at max utilization.

With ReBAR on PCIe 4.x that might not be true anymore. And sure, while it's not a groundbreaking addition, when you're running a top-of-the-line system, getting ANY advantage is welcome. I mean, why not have up to 12% more performance thanks to the ReBAR feature? Everything is there. Plus wider adoption means it'll become the norm for fast data shuffling between the CPU and GPU, to the point where all boards and all GPUs will support it.


14 minutes ago, RejZoR said:

With ReBAR on PCIe 4.x that might not be true anymore. And sure, while it's not a groundbreaking addition, when you're running a top-of-the-line system, getting ANY advantage is welcome. I mean, why not have up to 12% more performance thanks to the ReBAR feature? Everything is there. Plus wider adoption means it'll become the norm for fast data shuffling between the CPU and GPU, to the point where all boards and all GPUs will support it.

Resizable BAR was in the 3.0 standard. It's just that only recently has it become a feature being pushed on consumer cards. Nvidia has stated that their implementation will work on standard PCIe 3.0 interfaces; whether this means it's coming to older cards I don't know.

 

I think the bigger one might be RTX IO - if you're speeding up the rate at which textures are loaded into VRAM, then you're most likely going to be increasing the amount of PCIe bandwidth you're using at the same time. But it could also be that it actually reduces PCIe usage by reducing the amount of data being transferred, seeing as it is essentially GPU-accelerated decompression. I guess it depends entirely on if/how the technology is used - do they push for smaller game sizes, or higher-res textures?

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


RTX IO is just DirectStorage, but with NVIDIA's usual naming BS attached to it.


9 hours ago, tim0901 said:

Resizable BAR was in the 3.0 standard.

Well it does have to be, all the GPUs are PCIe 3.0 so without magic it's in the 3.0 spec. Nothing like a bit of marketing and platform segmentation to get people confused.


11 hours ago, RejZoR said:

RTX IO is just DirectStorage, but with NVIDIA's usual naming BS attached to it.

I mean, Intel, Microsoft and AMD are just as guilty. Smart Access Memory? FreeSync? Hyper-Threading? All rebrands of other technologies.

 

And it’s not like Microsoft DirectStorage is the original name for this either - it’s just the Microsoft brand for the technology. The process that the DirectStorage API is facilitating is known in the pci spec as ‘Peer-to-Peer DMA’ (the same stuff that’s behind Nvidia’s GPUDirect Storage and GPUDirect RDMA on their Quadro cards).

 

Honestly, I generally don’t have a problem with all the marketing names - if we left it up to the people who actually invented the ideas then we’d be doomed. Engineers are after all generally terrible at naming things. (‘GPU-to-NVME Peer-to-Peer DMA’ sound catchy to anyone?) As long as they’re up front as to what technology is behind it (as Nvidia are with RTX IO) then I have no issue with it. If they claim to have come up with something new and proprietary when they haven’t (looking at you Smart Access Memory) then I have a problem.

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


8 hours ago, tim0901 said:

Engineers are after all generally terrible at naming things

Well, that depends: marketing names are almost always less useful, or outright useless, to me compared to the proper technical name given to the technology. Not that I disagree with the sentiment of your point, it just depends on the target audience. I'm going to find a lot more useful technical information searching for PCIe Resizable BAR than I will for Smart Access Memory or Clever Access Memory (🤦‍♂️).

 

Engineers tend to be more literal with naming, which can help but only so long as you actually know the terms being used.


On 12/31/2020 at 3:59 PM, leadeater said:

Well, that depends: marketing names are almost always less useful, or outright useless, to me compared to the proper technical name given to the technology. Not that I disagree with the sentiment of your point, it just depends on the target audience. I'm going to find a lot more useful technical information searching for PCIe Resizable BAR than I will for Smart Access Memory or Clever Access Memory (🤦‍♂️).

 

Engineers tend to be more literal with naming, which can help but only so long as you actually know the terms being used.

In all honesty, before AMD announced it with Big Navi, no one really talked or cared about ReBAR. And now, suddenly, EVERYONE talks and cares about it. Even though the tech has been around since PCIe 3.x...


2 hours ago, TOMPPIX said:

It's nice to see intel actually giving a f*ck again.

Even though it's not the first time I've seen it, I still think that new font they're using is really ugly.

 

Anyway, there is a difference between not giving a F and not being able to show that they're giving a F. Work on the processor they're about to release would likely have started a year or more ago.

 

Was wondering if the last screenshot might be enough to give an indication of IPC, but we don't know the running clocks. It looks like the bench was already over when it was taken, so the core temp reading is presumably near idle. If we assume the indicated 5.2 GHz is the max turbo and that is what was running when the ST result was obtained, that works out to a 19% IPC increase compared with my Coffee Lake. Rocket Lake is based on the same microarchitecture as Ice Lake, and Intel claimed an 18% average IPC increase for that, so the number is in the right ball park. Some variation is expected depending on the actual code. I'm hesitant to try to interpret the multi-thread result, as I don't know how that bench scales. However, with my Coffee Lake at a fixed clock regardless of the number of threads active, I get a ball-park similar MT-to-ST ratio relative to the core count. If the CPU-Z bench scales well, that would imply either that the HT benefit at the same clock is similar, or, if the clock is actually lower, that the HT benefit is higher.
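A rough way to do that per-clock comparison with the leaked CPU-Z numbers (using the i9-10900K figure from the round-up later in the thread, and assuming both scores were obtained at their peak single-core clocks, which is exactly the uncertainty noted above):

```python
# Rough per-clock ("IPC-ish") comparison from CPU-Z single-thread scores.
# Assumes each score was set at the listed clock; the 10900K score comes from
# the leak round-up below, and 5.3 GHz is its rated peak single-core boost.
def points_per_ghz(score, ghz):
    return score / ghz

rocket_lake = points_per_ghz(706.3, 5.2)   # leaked 11900K ES result
comet_lake  = points_per_ghz(584.0, 5.3)   # Core i9-10900K (Skylake-class core)

print(f"Rocket Lake ES: {rocket_lake:.1f} points/GHz")
print(f"Comet Lake    : {comet_lake:.1f} points/GHz")
print(f"Implied per-clock gain: {(rocket_lake / comet_lake - 1) * 100:.0f}%")
```

That lands in the low-20s percent, the same ball park as the ~18-19% figures above, with all the caveats about unknown actual clocks.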

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, RTX 4070, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, random 1080p + 720p displays.
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


A few smallish updates (Will update and add to OP) ~ 

 

Intel Core i9-11900K Rocket Lake-S Already Pushed To 5.2GHz All-Core Overclock:

 

Quote

An upcoming Core i9-11900K has been spotted racing along at 5.2GHz on all cores (overclocked). In the meantime, someone has taken an apparent engineering sample and cranked all 8 cores to 5.2GHz. Have a look...

 

[Image: Core i9-11900K all-core overclock screenshot]

 

However, we can at least see that they set the multiplier to 52x, resulting in a 5,204MHz (5.2GHz) all-core overclock. Suggesting it was a stable OC (or at least partially stable), they benchmarked the overclocked chip in CPU-Z, obtaining a single-core score of 706.3 and a multi-threaded score of 7,198.8. How do those scores compare? To find out, we headed over to CPU-Z's validator page, which lists scores for a whole bunch of processors. Oddly, it does not contain any Ryzen 5000 series scores, so we sprinkled in some results found on the web for those processors. Here is the single-core breakdown...

  • Leaked Core i9-11900K: 706
  • Ryzen 9 5950X: 677
  • Ryzen 9 5900X: 677
  • Ryzen 7 5800X: 663
  • Ryzen 5 5600X: 638
  • Core i9-10900K: 584
  • Core i9-9900KS: 582
  • Core i7-10700K: 558
  • Core i9-9900K: 545
  • Core i7-10700: 540
  • Ryzen 9 3950X: 524
  • Ryzen 9 3900X: 521

 

The Core i9-11900K running at 5.2GHz takes the top spot in single-threaded performance, and outpaces the current-generation Core i7-10700K by quite a bit. What about the multi-core score? Let's have a look...
 
  • Ryzen 9 5950X: 12,329
  • Ryzen 9 3950X: 10,867
  • Ryzen 9 5900X: 9,768
  • Ryzen 9 3900X: 8,177
  • Leaked Core i9-11900K: 7,198
  • Core i9-10900K: 7,159
  • Ryzen 7 5800X: 6,766
  • Core i9-9900KS: 5,997
  • Core i7-10700K: 5,660
  • Core i9-9900K: 5,512
  • Core i7-10700: 5,435

Here we see CPUs with more cores and threads outpace the overclocked Core i9-11900K, as we would expect. Against other processors with the same number of cores and threads, however, the overclocked Rocket Lake-S processor performs very well.

 

Source 1: https://hothardware.com/news/intel-core-i9-11900k-rocket-lake-s-52ghz-all-core-overclock

 

Rocket Lake Engineering Samples Benchmarked Against Zen 3:

 

Quote

In a recent post on Chip Hell, a user reportedly grabbed an early B560 motherboard and engineering samples of three of Intel's new Rocket Lake CPUs, including the Core i7-11700, Core i9-11900, and Core i9-11900K. The tester pitched each processor against AMD's Zen 3-powered Ryzen 7 5800X to see just how they compare to AMD's best eight-core chip. Since these are engineering samples, the Intel chips' clock speeds are significantly lower than we would likely see with retail models. The poster also threw in Intel's previous-gen Core i9-9900K and Core i7-10700K as well to compare gen-on-gen performance gains.

 

Here are the Rocket Lake engineering samples tested:  

  • QV1J, Core i7-11700 ES -- 1.8GHz base frequency, 4.4GHz boost frequency.
  • QVTE, Core i9-11900 ES -- 1.8GHz base frequency, 4.5GHz boost frequency.
  • QV1K, Core i9-11900K ES -- 3.4GHz base frequency, 4.8GHz boost frequency.

Please note that the engineering samples for Rocket Lake are clocked so low that any performance benchmarks from these samples are specific to these samples alone and will not represent actual Rocket Lake performance when the retail SKUs hit shelves this year.

 

[Benchmark result screenshots from the Chip Hell post]

 

Source 2: https://www.tomshardware.com/news/rocket-lake-engineering-samples-benchmarked

Source 3: https://www.chiphell.com/thread-2290061-1-1.html

 

