
Nvidia announces better ARM support and new ARM CPU - x86 not the main player anymore

igormp

Summary

Nvidia just announced during GTC their own ARM CPU (Grace) for DCs/HPC, with tons of throughput so it doesn't bottleneck their GPUs. They're also getting ARM-based GPU nodes available on AWS, plus workstations and servers in partnership with Ampere Computing (Altra) and Marvell. There will also be consumer devices based on MediaTek (MTK) CPUs.

 

Quotes

Quote

[Image: slide from the GTC presentation]

 

My thoughts

Nvidia building their own CPU with tons of NVLink buses was expected given the current limitations of x86 and POWER (the latter has NVLink, but the current designs aren't that flexible for Nvidia); however, I didn't expect them to launch it before their ARM acquisition was done.

I'm also surprised that they're even bringing ARM-based devices to end consumers; let's see if they'll try to get Windows running or just go for Linux.

 

This is a big hit to the x86 hegemony, especially for Intel, since they're losing their grasp on the DC market at an incredibly fast pace, now being attacked by both AMD and Nvidia.

 

Sources

 

More links for their Grace CPU:

https://nvidianews.nvidia.com/news/nvidia-announces-cpu-for-giant-ai-and-high-performance-computing-workloads

https://www.nvidia.com/en-us/data-center/grace-cpu/



just a logical step



So, basically, the end-game is SoC-like integration but far more scalable and powerful.

 

CPU and GPU with proprietary bussing between the two, and their own MB chipset. Don't think it can't happen... 


I'm wondering if Nvidia might try to make their own game console/PC at some point.




5 minutes ago, StDragon said:

So, basically, the end-game is SoC-like integration but far more scalable and powerful.

 

CPU and GPU with proprietary bussing between the two, and their own MB chipset. Don't think it can't happen... 

There are already POWER9 CPUs with NVLink built in. For hyperscalers and other customers that need the most GPU performance, it's a price they'll have to pay, since current open buses aren't fast enough, nor is there any CPU capable of delivering that kind of throughput.

 

4 minutes ago, Drama Lama said:

I'm wondering if Nvidia might try to make their own game console/PC at some point.

IMO, they care less and less about the end-consumer market. Any console/PC would come through partners, such as the Nintendo Switch or that MTK laptop pictured above.


12 minutes ago, igormp said:

There are already POWER9 CPUs with NVLink built in. For hyperscalers and other customers that need the most GPU performance, it's a price they'll have to pay, since current open buses aren't fast enough, nor is there any CPU capable of delivering that kind of throughput.

 

IMO, they care less and less about the end-consumer market. Any console/PC would come through partners, such as the Nintendo Switch or that MTK laptop pictured above.

*Looks for picture of laptop. Doesn't find it.* Creating a CPU and creating a commercially successful CPU are different things. The former happens a lot more often than the latter. Just because Nvidia announces something doesn't always mean it will get somewhere. Sounds like they're more worried about Apple than anything else.


41 minutes ago, Drama Lama said:

I'm wondering if Nvidia might try to make their own game console/PC at some point.

They already do. The Nvidia Shield and the Nintendo Switch are the same hardware platform.


40 minutes ago, Bombastinator said:

*Looks for picture of laptop. Doesn't find it.*

It's inside the quote since it came straight from the presentation.

 

41 minutes ago, Bombastinator said:

Creating a CPU and creating a commercially successful CPU are different things.

They don't want a commercially successful CPU. They want a dummy chip that can transfer data from/to their GPUs and send some commands; it's solely meant to drive their GPU business, not to be sold as a general-purpose CPU.

 

42 minutes ago, Bombastinator said:

Sounds like they’re more worried about apple than anything else.

How so? Apple is not a player in the DC/HPC market.


1 hour ago, igormp said:

It's inside the quote since it came straight from the presentation.

 

They don't want a commercially successful CPU. They want a dummy chip that can transfer data from/to their GPUs and send some commands; it's solely meant to drive their GPU business, not to be sold as a general-purpose CPU.

 

How so? Apple is not a player in the DC/HPC market.

Re paragraph 2 they’re a company.  They want to make money.  Of course they want a commercially successful product be it proprietary or not.   That doesn’t mean that success has to be found in the consumer space.  If one conflates consumer space success and commercial success then sure. If it isn’t sucessful people will make less software for it which makes it less succesful and if they’re not careful they’ll wind up with a winCE or widows RT thing that causes billions in losses.

 

Your acronyms can mean a bunch of things. I'm going to assume HPC is high performance computing (supercomputers), mostly because if the requirement is that it not be consumer oriented and not something Apple deals with, it's the only bit left. Nvidia has talked a big game around their A100 chip for supercomputing. Apparently you are saying this is part of that and will never see any consumer use to begin with. My earlier information about the A100 was that it worked fine with things like EPYC and didn't need its own custom-built processor to function. If that's not the case, I suspect not only is this new thing screwed but so is the A100. There are lots of products like EPYC that don't see consumer use; the entire industry of middleware is like that. Do I think Apple is or will get into supercomputing? No. But Nvidia sells stuff in several spaces, and this thing sounds a whole lot like the M1. Supercomputers are sold singly; there may never be more than a few machines that ever run this chip. That could still be considered commercial success, though. An A100 still isn't that different from the 102 chip in a 3080, though. This could be specifically for A100-based supercomputing, except they're apparently also putting it in laptops, which are sort of the opposite of big iron, so supercomputers can't be the only place they're going, and Apple DOES make laptops.


2 hours ago, igormp said:

How so? Apple is not a player in the DC/HPC market.

But...but... ARM = Apple right? :old-wink:

 

Interesting that you mention POWER limiting Nvidia. IBM decided to drop on-chip NVLink with POWER10, which probably helped cement their decision:

 

"With the advent of PCIe Gen5, both IBM and NVIDIA determined that PCIe is once again sufficient for eliminating performance bottlenecks in host-to-GPU attach. Therefore a proprietary solution such as NVLINK is no longer a strong differentiator for host-to-GPU attach, and the POWER10 processor will not exploit NVLINK for host-to-GPU attach."[1]

 

I'm also quite happy that they're starting from HPC/servers and scaling down to consumer hardware. This usually gives us enterprise-grade features like CCIX, memory encryption (though AMD's got that going), virtualization goodies, on-chip accelerators, etc.

 

I've been keeping an eye on Ampere for my next computer, which I'll buy in a couple of years, but it's very costly even compared to POWER9. Nvidia has the money to mass-produce, so who knows, maybe I'll end up buying an Nvidia CPU. However, given how much they hate open source, that probably won't happen.


[Image: NVLink bandwidth slide from Nvidia's Grace announcement]

https://www.anandtech.com/show/16610/nvidia-unveils-grace-a-highperformance-arm-server-cpu-for-use-in-ai-systems

 

Finally, some reasonable bandwidth. I really hope something like that trickles down to consumers, but we only have DDR5 to look forward to. No, I don't expect 2 TB/s between RAM and CPU. System RAM bandwidth has not grown anywhere near as fast as CPU core counts, and the imbalance for compute use cases is comical, especially in AMD's higher-end consumer offerings. For a Skylake core-GHz, I'd estimate a 4 GB/s peak rating would be practically unlimited, and Zen 3 or Rocket Lake would need much more than that.
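As a rough back-of-the-envelope sketch of that imbalance (the core count, clock and the ~4 GB/s per core-GHz figure below are assumed round numbers, not measurements):

# hypothetical bandwidth demand vs. what dual-channel DDR4 can actually supply
cores = 16                  # assumed high-end consumer CPU
clock_ghz = 4.0             # assumed all-core clock
gb_per_core_ghz = 4.0       # the ~4 GB/s per core-GHz estimate from above

demand = cores * clock_ghz * gb_per_core_ghz     # GB/s the cores could consume
supply = 3200e6 * 8 * 2 / 1e9                    # DDR4-3200, 2 channels: ~51.2 GB/s

print(f"demand ~{demand:.0f} GB/s, supply ~{supply:.1f} GB/s, shortfall ~{demand / supply:.0f}x")

Even with generous rounding, the cores could eat roughly 5x what dual-channel DDR4 delivers, which is the imbalance I'm complaining about.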


54 minutes ago, Bombastinator said:

Re paragraph 2 they’re a company.  They want to make money.  Of course they want a commercially successful product be it proprietary or not.   That doesn’t mean that success has to be found in the consumer space.  If one conflates consumer space success and commercial success then sure. If it isn’t sucessful people will make less software for it which makes it less succesful and if they’re not careful they’ll wind up with a winCE or widows RT thing that causes billions in losses.

Yeah, the point I made is that the CPU isn't the product, but a piece of their overall product, so its success is somewhat irrelevant; what needs to be successful is the whole server/machine, with the focus being on GPUs that aren't bottlenecked by bandwidth.

57 minutes ago, Bombastinator said:

I’m going to assume HPC is high performance computing (supercomputers)

Yup, that's it.

58 minutes ago, Bombastinator said:

My earlier information about the A100 was that it worked fine with things like EPYC and didn't need its own custom-built processor to function.

It does work, but when you have many of them, PCIe becomes a bottleneck and suddenly you can't just add more. With Nvidia's new product, you could have more than 8 GPUs in a single box without any bandwidth bottleneck.
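To put rough numbers on it (approximate link figures, ignoring protocol overhead and the per-direction vs. aggregate distinction):

# rough host-to-GPU bandwidth comparison per accelerator
pcie4_x16 = 32.0      # GB/s, roughly what a PCIe 4.0 x16 link gives each GPU
nvlink_a100 = 600.0   # GB/s, aggregate NVLink bandwidth an A100 exposes
gpus = 8

host_demand = gpus * pcie4_x16   # what 8 GPUs could pull over PCIe at once
print(f"per-GPU PCIe: ~{pcie4_x16:.0f} GB/s, per-GPU NVLink: ~{nvlink_a100:.0f} GB/s ({nvlink_a100 / pcie4_x16:.0f}x)")
print(f"8 GPUs over PCIe would already ask ~{host_demand:.0f} GB/s of the host's memory/IO subsystem")

The exact figures matter less than the shape of the problem: every extra GPU added over PCIe funnels through the same limited host bandwidth, while NVLink-attached GPUs get an order of magnitude more per device.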

 

59 minutes ago, Bombastinator said:

An A100 still isn't that different from the 102 chip in a 3080, though.

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

 

1 hour ago, Bombastinator said:

except they’re apparently also apparently putting it in laptops which are sort of the opposite of big iron so supercomputers can’t be the only place they’re going and Apple DOES make laptops.

There's no mobile A100. And those workstation laptops are in a segment that doesn't rival Apple, since they're meant as portable workstations for CAD and scientific computing on the go (think an engineer on an oil platform in the middle of the ocean), not lightweight products with huge battery life.


Quick question: Does this mean anything related to the legal battle going on with Nvidia's acquisition of ARM? If so, what?


1 hour ago, igormp said:

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

GA100 vs GA102 architecture is entirely different; it's a little odd to call them both Ampere beyond the fact that they share the same/similar CUDA feature support. I guess the cores themselves are the same but grouped, arranged and addressed entirely differently, which is why it's odd to give them the same architecture code name, Ampere.

 

GA100:

[Image: GA100 SM block diagram]

 

GA102:

[Image: GA102 SM block diagram]


3 hours ago, igormp said:

Yeah, the point I made is that the CPU isn't the product, but a piece of their overall product, so its success is somewhat irrelevant; what needs to be successful is the whole server/machine, with the focus being on GPUs that aren't bottlenecked by bandwidth.

Yup, that's it.

It does work, but when you have many of them, PCIe becomes a bottleneck and suddenly you can't just add more. With Nvidia's new product, you could have more than 8 GPUs in a single box without any bandwidth bottleneck.

 

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

 

There's no mobile A100. And those workstation laptops are in a segment that doesn't rival Apple, since they're meant as portable workstations for CAD and scientific computing on the go (think an engineer on an oil platform in the middle of the ocean), not lightweight products with huge battery life.

Re: 100 != 102.

Well yes, of course. They're not identical. An A100 chip is going to be a lot more similar to a 102 than to, say, Polaris, though. Apparently close enough for a laptop to be made.


 

Re: there's no mobile A100.

Exactly. There isn't. So if there's a laptop, what could it possibly be using except smaller Ampere stuff? Also, as a side note, CAD is a pretty common Mac thing. There was a period where the only decent CAD software was Mac only. As for science, watch any given video from NASA and see what they're using. It's certainly not everywhere; I suspect there are pockets, though. Whether they compete isn't the point:

it's a laptop, which obviously can't have an A100 in it.

Marketplace competition is to a degree irrelevant. I'm talking about similarity. They're doing a RISC SoC. They would have started it shortly after Apple started their "more than tablet" RISC SoC, and they're apparently putting it into a laptop.


5 hours ago, Bombastinator said:

Well yes, of course. They're not identical. An A100 chip is going to be a lot more similar to a 102 than to, say, Polaris, though. Apparently close enough for a laptop to be made.

The point is that the makeup inside the GA100 die is completely different to GA102 through GA106. Across the GA102 through GA106 dies the makeup is the same; the difference is the number of SMs, or the number of active SMs. GA100 has no RT cores, and the number of FP/INT units per SM is different.

 

GA100 SM != GA102 SM

GA102 SM = GA104 SM


25 minutes ago, leadeater said:

The point is that the makeup inside the GA100 die is completely different to GA102 through GA106. Across the GA102 through GA106 dies the makeup is the same; the difference is the number of SMs, or the number of active SMs. GA100 has no RT cores, and the number of FP/INT units per SM is different.

 

GA100 SM != GA102 SM

GA102 SM = GA104 SM

Different, yes. More different than 102 vs. 104, certainly. I'm not sure that qualifies as completely different, though. Completely different would be something that works in a fundamentally different way. There are GPUs like that, after all; I suspect even a TU102 would qualify for that better than an A100 would. This is word definitions again, though: defining the degree of difference meant by "completely". No RT cores on an A100 does make it more different than I was thinking, as it then wouldn't be useful for hardware RT. It could probably still do software RT, though.


Didn't they say that it's meant specifically for AI/Data crunching? I wouldn't expect anything to come of it for the consumer or gaming market at first.

 

A bit of a stretch to say "x86 not the main player anymore", wouldn't you say?



It's funny how, like 2-3 years ago, people were swearing that ARM PCs were never going to be a thing. CPUs have just become really interesting, and it's gotten even more spiced up after the M1, along with ARM's big entrance. Pretty excited to see what's coming up in the next 5 years in the CPU space.


5 hours ago, Bombastinator said:

Completely different would be something that works in a fundamentally different way.

Well honestly it really is that different.

 

GA100 has dedicated INT cores, FP32 cores, FP64 cores and Tensor cores; GA10x has a shared set of FP/INT cores plus dedicated FP32 cores, no FP64 cores, and Tensor cores.

 

Inside an SM you have half the number of CUDA cores in GA100 as in GA10x, and GA100 has a larger L1 instruction cache per SM. GA100 uses exclusively HBM memory controllers. GA100 has twice the number of load/store units per SM as GA10x, and GA100 supports processing data types that GA10x does not.

 

Save for the fact that GA100 and GA10x support the same CUDA version, they are different, as different as Turing is to Pascal, or Turing is to Ampere. CUDA is just an interface to the hardware of the GPU, so even with the same version of CUDA you cannot do things on a GA10x that you can on a GA100.

 

There are literal physical differences, architecture differences and software differences between GA100 and GA10x; it's not semantics, they are fundamentally different. If you can't consider, say, Turing vs Ampere as fundamentally different architectures, then we may as well throw out architecture naming altogether, as there is no difference and everything is the same. If your bar for fundamental difference is CPU vs GPU, then that is just silly in the context of a discussion of GPUs. GA100 is about as different to GA10x as it is to RDNA 2, to be really honest.


12 hours ago, Bombastinator said:

Also, as a side note, CAD is a pretty common Mac thing. There was a period where the only decent CAD software was Mac only. As for science, watch any given video from NASA and see what they're using.

You're rambling about something that was the case over 20 years ago. Nowadays, Macs barely have support for CAD software; just take a look at the small percentage of Autodesk's software that still supports Macs.

About NASA: those Macs are used to remote into bigger machines, much like a dumb terminal (a lightweight dumb terminal with great battery life), so the OS itself doesn't make any difference.

5 hours ago, Stahlmann said:

Didn't they say that it's meant specifically for AI/Data crunching? I wouldn't expect anything to come of it for the consumer or gaming market at first.

 

A bit of a stretch to say "x86 not the main player anymore", wouldn't you say?

Their Grace CPU? Yeah, it's not going into the hands of any consumer in the foreseeable future (and there's no reason for it to), but they did announce a partnership with MTK to have consumer ARM-based desktops/laptops with their 3000-series GPUs.


5 hours ago, RedRound2 said:

It's funny how, like 2-3 years ago, people were swearing that ARM PCs were never going to be a thing. CPUs have just become really interesting, and it's gotten even more spiced up after the M1, along with ARM's big entrance. Pretty excited to see what's coming up in the next 5 years in the CPU space.

If it wasn't for Apple, they still wouldn't be. And on the Windows side, they still won't be. Apple has a closed ecosystem with total control over it, so they can pull things like this off in a single year. Microsoft has been struggling with ARM for what, 10 years? And it's still an absolutely sad experience on hardware that's more expensive than x86.


4 hours ago, RejZoR said:

If it wasn't for Apple, they still wouldn't be. And on the Windows side, they still won't be. Apple has a closed ecosystem with total control over it, so they can pull things like this off in a single year. Microsoft has been struggling with ARM for what, 10 years? And it's still an absolutely sad experience on hardware that's more expensive than x86.

That's true, only Apple is capable of making such a massive change to the way things are in the industry, but these people were talking about performance and how an ARM CPU could never match up to x86.

 

Now that Apple has momentum, and if Nvidia actually comes out with consumer-grade ARM CPUs that are actually good, maybe Microsoft's 9th attempt at Windows on ARM won't be such a failure as it used to be. Apple has proved ARM's legitimacy against what was otherwise a very skeptical prediction, so I'm sure more people, including devs, believe in ARM now than before.


2 hours ago, RedRound2 said:

That's true, only Apple is capable of making such a massive change to the way things are in the industry, but these people were talking about performance and how an ARM CPU could never match up to x86.

 

Now that Apple has momentum, and if Nvidia actually comes out with consumer-grade ARM CPUs that are actually good, maybe Microsoft's 9th attempt at Windows on ARM won't be such a failure as it used to be. Apple has proved ARM's legitimacy against what was otherwise a very skeptical prediction, so I'm sure more people, including devs, believe in ARM now than before.

No one cares if ARM is as capable as x86 when the software is just pathetic. Something Apple also has sorted out; Microsoft, not even remotely. NVIDIA can release the best ARM CPU+GPU ever and it'll be trash, because the software for ARM desktop devices is trash. Except macOS.


12 minutes ago, RejZoR said:

NVIDIA can release the best ARM CPU+GPU ever and it'll be trash, because the software for ARM desktop devices is trash.

I wonder if they're not targeting the regular "gamer" market, but the scientific/prosumer one.

Having an ARM CPU on Linux with an Nvidia GPU is a non-issue for most ML/science tools; in fact, it's even better than Windows for that specific scenario.
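As a minimal sketch of why the host ISA doesn't matter there (assuming a CUDA-enabled PyTorch build exists for the platform), the exact same script works on an x86-64 or an aarch64 box:

import platform
import torch  # assumes a CUDA-enabled PyTorch build for this host architecture

print("host:", platform.machine())                 # 'x86_64' or 'aarch64', irrelevant below
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
print("matmul ran on:", (x @ x).device)            # the heavy lifting happens on the GPU either way

The CUDA toolkit and a good chunk of the Python scientific stack already ship aarch64 builds, so for that crowd the CPU architecture swap is mostly invisible.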

