
Nvidia announces better ARM support and new ARM CPU - x86 not the main player anymore

igormp

Summary

Nvidia just announced during GTC their own ARM CPU (Grace) for DCs/HPC, with tons of throughput so it doesn't bottleneck their GPUs. They're also getting ARM-based GPU nodes available on AWS, plus workstations and servers in partnership with Ampere Computing (Altra) and Marvell. There will also be consumer devices based on MediaTek (MTK) CPUs.

 

Quotes

Quote

[Image: slide from the GTC presentation]

 

My thoughts

Nvidia building their own CPU with tons of NVLink buses was expected given the current limitations of x86 and POWER (the latter has NVLink, but the current designs aren't that flexible for Nvidia); however, I didn't expect them to launch it before their ARM acquisition was done.

I'm also surprised that they're even bringing ARM-based devices to end consumers; let's see if they'll try to get Windows running or just go for Linux.

 

This is a big hit to the x86 hegemony, especially for Intel, since they're losing their grasp on the DC market at an incredibly fast pace, now being attacked by both AMD and Nvidia.

 

Sources

 

More links for their Grace CPU:

https://nvidianews.nvidia.com/news/nvidia-announces-cpu-for-giant-ai-and-high-performance-computing-workloads

https://www.nvidia.com/en-us/data-center/grace-cpu/



just a logical step



So, basically, the end-game is SoC-like integration but far more scalable and powerful.

 

CPU and GPU with proprietary bussing between the two, and their own MB chipset. Don't think it can't happen... 


I'm wondering if Nvidia might try to make their own game console/PC at some point.




5 minutes ago, StDragon said:

So, basically, the end-game is SoC-like integration but far more scalable and powerful.

 

CPU and GPU with proprietary bussing between the two, and their own MB chipset. Don't think it can't happen... 

There are already POWER9 CPUs with NVLink built in. For hyperscalers and other customers that need the most GPU performance, it's a price they'll have to pay, since current open buses aren't fast enough, nor is there any CPU capable of delivering that kind of throughput.

 

4 minutes ago, Drama Lama said:

I'm wondering if Nvidia might try to make their own game console/PC at some point.

IMO, they care less and less about the end-consumer market. Any console/PC would come through partners, such as the Nintendo Switch or that MTK laptop pictured above.


12 minutes ago, igormp said:

There are already POWER9 CPUs with NVLink built in. For hyperscalers and other customers that need the most GPU performance, it's a price they'll have to pay, since current open buses aren't fast enough, nor is there any CPU capable of delivering that kind of throughput.

 

IMO, they care less and less about the end-consumer market. Any console/PC would come through partners, such as the Nintendo Switch or that MTK laptop pictured above.

*Looks for picture of laptop. Doesn't find it.* Creating a CPU and creating a commercially successful CPU are different things. The former happens a lot more often than the latter. Just because Nvidia announces something doesn't always mean it will get somewhere. Sounds like they're more worried about Apple than anything else.


41 minutes ago, Drama Lama said:

I'm wondering if Nvidia might try to make their own game console/PC at some point.

They already do. The Nvidia Shield and the Nintendo Switch are the same hardware platform.


40 minutes ago, Bombastinator said:

*Looks for picture of laptop. Doesn't find it.*

It's inside the quote since it came straight from the presentation.

 

41 minutes ago, Bombastinator said:

Creating a CPU and creating a commercially successful CPU are different things.

They don't want a commercially successful CPU. They want a dummy chip that can transfer data from/to their GPUs and send some commands; it's solely meant to drive their GPU business, not to be sold as a general-purpose CPU.

 

42 minutes ago, Bombastinator said:

Sounds like they’re more worried about apple than anything else.

How so? Apple is not a player in the DC/HPC market.


1 hour ago, igormp said:

It's inside the quote since it came straight from the presentation.

 

They don't want a commercially successful CPU. They want a dummy chip that can transfer data from/to their GPUs and send some commands; it's solely meant to drive their GPU business, not to be sold as a general-purpose CPU.

 

How so? Apple is not a player in the DC/HPC market.

Re paragraph 2 they’re a company.  They want to make money.  Of course they want a commercially successful product be it proprietary or not.   That doesn’t mean that success has to be found in the consumer space.  If one conflates consumer space success and commercial success then sure. If it isn’t sucessful people will make less software for it which makes it less succesful and if they’re not careful they’ll wind up with a winCE or widows RT thing that causes billions in losses.

 

Your acronyms can mean a bunch of things. I'm going to assume HPC is high performance computing (supercomputers), mostly because if the requirement is that it not be consumer oriented and not something Apple deals with, it's the only bit left. Nvidia has talked a big game around their A100 chip for supercomputing. Apparently you are saying this is part of that and will never see any consumer use to begin with. My earlier information about the A100 was that it worked fine with things like EPYC and didn't need its own custom-built processor to function. If that's not the case, I suspect not only is this new thing screwed but so is the A100. There are lots of products like EPYC that don't see consumer use; the entire industry of middleware is like that. Do I think Apple is or will get into supercomputing? No. But Nvidia sells stuff in several spaces, and this thing sounds a whole lot like the M1. Supercomputers are sold singly; there may never be more than a few machines that ever run this chip. That could still be considered commercial success, though. An A100 still isn't that different from the 102 chip in a 3080, though. This could be specifically for A100-based supercomputing, except they're apparently also putting it in laptops, which are sort of the opposite of big iron, so supercomputers can't be the only place they're going, and Apple DOES make laptops.


2 hours ago, igormp said:

How so? Apple is not a player in the DC/HPC market.

But...but... ARM = Apple right? :old-wink:

 

Interesting that you mention POWER limiting Nvidia. IBM decided to drop on-chip NVLink with POWER10, which probably helped cement their decision:

 

"With the advent of PCIe Gen5, both IBM and NVIDIA determined that PCIe is once again sufficient for eliminating performance bottlenecks in host-to-GPU attach. Therefore a proprietary solution such as NVLINK is no longer a strong differentiator for host-to-GPU attach, and the POWER10 processor will not exploit NVLINK for host-to-GPU attach."[1]

 

I'm also quite happy that they're starting from HPC/servers and scaling down to consumer hardware. This usually gives us enterprise-grade features like CCIX, memory encryption (though AMD's got that going), virtualization goodies, on-chip accelerators, etc.

 

I've been keeping an eye on Ampere for my next computer, which I'll buy in a couple of years, but it's very costly even compared to POWER9. Nvidia has the money to mass-produce, so who knows, maybe I'll end up buying an Nvidia CPU. However, given how much they hate open source, that probably won't happen.


[Image: NVLink bandwidth slide from Nvidia's Grace announcement]

https://www.anandtech.com/show/16610/nvidia-unveils-grace-a-highperformance-arm-server-cpu-for-use-in-ai-systems

 

Finally, some reasonable bandwidth. I really hope something like that trickles down to consumers, but we only have DDR5 to look forward to. No, I don't expect 2 TB/s between RAM and CPU. System RAM bandwidth has not grown anywhere near as fast as CPU core counts, and the imbalance for compute use cases is comical, especially in AMD's higher-end consumer offerings. For a Skylake core-GHz, I'd estimate a 4 GB/s peak rating would be practically unlimited, and Zen 3 or Rocket Lake would need much more than that.
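As a rough back-of-the-envelope sketch of that imbalance (the core count, clock and the ~4 GB/s per core-GHz figure below are assumed round numbers, not measurements):

# hypothetical bandwidth demand vs. what dual-channel DDR4 can actually supply
cores = 16                  # assumed high-end consumer CPU
clock_ghz = 4.0             # assumed all-core clock
gb_per_core_ghz = 4.0       # the ~4 GB/s per core-GHz estimate from above

demand = cores * clock_ghz * gb_per_core_ghz     # GB/s the cores could consume
supply = 3200e6 * 8 * 2 / 1e9                    # DDR4-3200, 2 channels: ~51.2 GB/s

print(f"demand ~{demand:.0f} GB/s, supply ~{supply:.1f} GB/s, shortfall ~{demand / supply:.0f}x")

Even with generous rounding, the cores could eat roughly 5x what dual-channel DDR4 delivers, which is the imbalance I'm complaining about.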


54 minutes ago, Bombastinator said:

Re paragraph 2 they’re a company.  They want to make money.  Of course they want a commercially successful product be it proprietary or not.   That doesn’t mean that success has to be found in the consumer space.  If one conflates consumer space success and commercial success then sure. If it isn’t sucessful people will make less software for it which makes it less succesful and if they’re not careful they’ll wind up with a winCE or widows RT thing that causes billions in losses.

Yeah, the point I made is that the CPU isn't the product, but a piece of their overall product, so its success is somewhat irrelevant; what needs to be successful is the whole server/machine, with the focus being on GPUs that aren't bottlenecked by bandwidth.

57 minutes ago, Bombastinator said:

I’m going to assume HPC is high performance computing (supercomputers)

Yup, that's it.

58 minutes ago, Bombastinator said:

My earlier information about the A100 was that it worked fine with things like EPYC and didn't need its own custom-built processor to function.

It does work, but when you have many of them, PCIe becomes a bottleneck and suddenly you can't just add more. With Nvidia's new product, you could have more than 8 GPUs in a single box without any bandwidth bottleneck.
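To put rough numbers on it (approximate link figures, ignoring protocol overhead and the per-direction vs. aggregate distinction):

# rough host-to-GPU bandwidth comparison per accelerator
pcie4_x16 = 32.0      # GB/s, roughly what a PCIe 4.0 x16 link gives each GPU
nvlink_a100 = 600.0   # GB/s, aggregate NVLink bandwidth an A100 exposes
gpus = 8

host_demand = gpus * pcie4_x16   # what 8 GPUs could pull over PCIe at once
print(f"per-GPU PCIe: ~{pcie4_x16:.0f} GB/s, per-GPU NVLink: ~{nvlink_a100:.0f} GB/s ({nvlink_a100 / pcie4_x16:.0f}x)")
print(f"8 GPUs over PCIe would already ask ~{host_demand:.0f} GB/s of the host's memory/IO subsystem")

The exact figures matter less than the shape of the problem: every extra GPU added over PCIe funnels through the same limited host bandwidth, while NVLink-attached GPUs get an order of magnitude more per device.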

 

59 minutes ago, Bombastinator said:

An A100 still isn't that different from the 102 chip in a 3080, though.

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

 

1 hour ago, Bombastinator said:

except they’re apparently also apparently putting it in laptops which are sort of the opposite of big iron so supercomputers can’t be the only place they’re going and Apple DOES make laptops.

There's no mobile A100. And those workstation laptops are in a segment that doesn't rival Apple, since they're meant as portable workstations for CAD and scientific computing on the go (think an engineer on an oil platform in the middle of the ocean), not lightweight products with huge battery life.


Quick question: Does this mean anything related to the legal battle going on with Nvidia's acquisition of ARM? If so, what?


1 hour ago, igormp said:

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

GA100 vs GA102 architecture is entirely different; it's a little odd to call them both Ampere beyond the fact that they share the same/similar CUDA feature support. I guess the cores themselves are the same but grouped, arranged and addressed entirely differently, which is why it's odd to give them the same architecture code name, Ampere.

 

GA100:

[Image: GA100 SM block diagram]

 

GA102:

[Image: GA102 SM block diagram]


3 hours ago, igormp said:

Yeah, the point I made is that the CPU isn't the product, but a piece of their overall product, so its success is somewhat irrelevant; what needs to be successful is the whole server/machine, with the focus being on GPUs that aren't bottlenecked by bandwidth.

Yup, that's it.

It does work, but when you have many of them, PCIe becomes a bottleneck and suddenly you can't just add more. With Nvidia's new product, you could have more than 8 GPUs in a single box without any bandwidth bottleneck.

 

Oh boy, it actually is really different. Even though both are Ampere-based, how they're built differs a lot. For starters, the A100 is built on TSMC's 7nm, unlike the other products, which are built at Samsung's fabs.

 

There's no mobile A100. And those workstation laptops are in a segment that doesn't rival Apple, since they're meant as portable workstations for CAD and scientific computing on the go (think an engineer on an oil platform in the middle of the ocean), not lightweight products with huge battery life.

Re: 100 != 102.

Well yes, of course. They're not identical. An A100 chip is going to be a lot more similar to a 102 than to, say, Polaris, though. Apparently close enough for a laptop to be made.


 

Re: there's no mobile A100.

Exactly. There isn't. So if there's a laptop, what could it possibly be using except smaller Ampere stuff? Also, as a side note, CAD is a pretty common Mac thing. There was a period where the only decent CAD software was Mac only. As for science, watch any given video from NASA and see what they're using. It's certainly not everywhere; I suspect there are pockets, though. Whether they compete isn't the point:

it's a laptop, which obviously can't have an A100 in it.

Marketplace competition is to a degree irrelevant. I'm talking about similarity. They're doing a RISC SoC. They would have started it shortly after Apple started their "more than tablet" RISC SoC, and they're apparently putting it into a laptop.


5 hours ago, Bombastinator said:

Well yes, of course. They're not identical. An A100 chip is going to be a lot more similar to a 102 than to, say, Polaris, though. Apparently close enough for a laptop to be made.

The point is that the makeup inside the GA100 die is completely different to GA102 through GA106. Across the GA102 through GA106 dies the makeup is the same; the difference is the number of SMs, or the number of active SMs. GA100 has no RT cores, and the number of FP/INT units per SM is different.

 

GA100 SM != GA102 SM

GA102 SM = GA104 SM


25 minutes ago, leadeater said:

The point is that the makeup inside the GA100 die is completely different to GA102 through GA106. Across the GA102 through GA106 dies the makeup is the same; the difference is the number of SMs, or the number of active SMs. GA100 has no RT cores, and the number of FP/INT units per SM is different.

 

GA100 SM != GA102 SM

GA102 SM = GA104 SM

Different, yes. More different than 102 vs. 104, certainly. I'm not sure that qualifies as completely different, though. Completely different would be something that works in a fundamentally different way. There are GPUs like that, after all; I suspect even a TU102 would qualify for that better than an A100 would. This is word definitions again, though: defining the degree of difference meant by "completely". No RT cores on an A100 does make it more different than I was thinking, as it then wouldn't be useful for hardware RT. It could probably still do software RT, though.


Didn't they say that it's meant specifically for AI/Data crunching? I wouldn't expect anything to come of it for the consumer or gaming market at first.

 

A bit of a stretch to say "x86 not the main player anymore", wouldn't you say?



It's funny how, like 2-3 years ago, people were swearing that ARM PCs were never going to be a thing. CPUs have just become really interesting, and it's gotten even more spiced up after the M1, along with ARM's big entrance. Pretty excited to see what's coming up in the next 5 years in the CPU space.


5 hours ago, Bombastinator said:

Completely different would be something that works in a fundamentally different way.

Well honestly it really is that different.

 

GA100 has dedicated INT cores, FP32 cores, FP64 cores and Tensor cores; GA10x has a shared set of FP/INT cores plus dedicated FP32 cores, no FP64 cores, and Tensor cores.

 

Inside an SM you have half the number of CUDA cores in GA100 as in GA10x, and GA100 has a larger L1 instruction cache per SM. GA100 uses exclusively HBM memory controllers. GA100 has twice the number of load/store units per SM as GA10x, and GA100 supports processing data types that GA10x does not.

 

Save for the fact that GA100 and GA10x support the same CUDA version, they are different, as different as Turing is to Pascal, or Turing is to Ampere. CUDA is just an interface to the hardware of the GPU, so even with the same version of CUDA you cannot do things on a GA10x that you can on a GA100.

 

There are literal physical differences, architecture differences and software differences between GA100 and GA10x; it's not semantics, they are fundamentally different. If you can't consider, say, Turing vs Ampere as fundamentally different architectures, then we may as well throw out architecture naming altogether, as there is no difference and everything is the same. If your bar for fundamental difference is CPU vs GPU, then that is just silly in the context of a discussion of GPUs. GA100 is about as different to GA10x as it is to RDNA 2, to be really honest.


12 hours ago, Bombastinator said:

Also, as a side note, CAD is a pretty common Mac thing. There was a period where the only decent CAD software was Mac only. As for science, watch any given video from NASA and see what they're using.

You're rambling about something that was the case over 20 years ago. Nowadays, Macs barely have support for CAD software; just take a look at the small percentage of Autodesk's software that still supports Macs.

About NASA: those Macs are used to remote into bigger machines, much like a dumb terminal (a lightweight dumb terminal with great battery life), so the OS itself doesn't make any difference.

5 hours ago, Stahlmann said:

Didn't they say that it's meant specifically for AI/Data crunching? I wouldn't expect anything to come of it for the consumer or gaming market at first.

 

A bit of a stretch to say "x86 not the main player anymore", wouldn't you say?

Their Grace CPU? Yeah, it's not going into the hands of any consumer in the foreseeable future (and there's no reason for it to), but they did announce a partnership with MTK to have consumer ARM-based desktops/laptops with their 3000-series GPUs.


5 hours ago, RedRound2 said:

It's funny how, like 2-3 years ago, people were swearing that ARM PCs were never going to be a thing. CPUs have just become really interesting, and it's gotten even more spiced up after the M1, along with ARM's big entrance. Pretty excited to see what's coming up in the next 5 years in the CPU space.

If it wasn't for Apple, they still wouldn't be. And on the Windows side, they still won't be. Apple has a closed ecosystem with total control over it, so they can pull things like this off in a single year. Microsoft has been struggling with ARM for what, 10 years? And it's still an absolutely sad experience on hardware that's more expensive than x86.


4 hours ago, RejZoR said:

If it wasn't for Apple, they still wouldn't be. And on the Windows side, they still won't be. Apple has a closed ecosystem with total control over it, so they can pull things like this off in a single year. Microsoft has been struggling with ARM for what, 10 years? And it's still an absolutely sad experience on hardware that's more expensive than x86.

That's true, only Apple is capable of making such a massive change to the way things are in the industry, but these people were talking about performance and how an ARM CPU could never match up to x86.

 

Now that Apple has momentum, and if Nvidia actually comes out with consumer-grade ARM CPUs that are actually good, maybe Microsoft's 9th attempt at Windows on ARM won't be such a failure as it used to be. Apple has proved ARM's legitimacy against what was otherwise a very skeptical prediction, so I'm sure more people, including devs, believe in ARM now than before.


2 hours ago, RedRound2 said:

That's true, only Apple is capable of making such a massive change to the way things are in the industry, but these people were talking about performance and how an ARM CPU could never match up to x86.

 

Now that Apple has momentum, and if Nvidia actually comes out with consumer-grade ARM CPUs that are actually good, maybe Microsoft's 9th attempt at Windows on ARM won't be such a failure as it used to be. Apple has proved ARM's legitimacy against what was otherwise a very skeptical prediction, so I'm sure more people, including devs, believe in ARM now than before.

No one cares if ARM is as capable as x86 when the software is just pathetic. Something Apple also has sorted out; Microsoft, not even remotely. NVIDIA can release the best ARM CPU+GPU ever and it'll be trash, because the software for ARM desktop devices is trash. Except macOS.


12 minutes ago, RejZoR said:

NVIDIA can release the best ARM CPU+GPU ever and it'll be trash, because the software for ARM desktop devices is trash.

I wonder if they're not targeting the regular "gamer" market, but the scientific/prosumer one.

Having an ARM CPU on Linux with an Nvidia GPU is a non-issue for most ML/science tools; in fact, it's even better than Windows for that specific scenario.
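As a minimal sketch of why the host ISA doesn't matter there (assuming a CUDA-enabled PyTorch build exists for the platform), the exact same script works on an x86-64 or an aarch64 box:

import platform
import torch  # assumes a CUDA-enabled PyTorch build for this host architecture

print("host:", platform.machine())                 # 'x86_64' or 'aarch64', irrelevant below
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
print("matmul ran on:", (x @ x).device)            # the heavy lifting happens on the GPU either way

The CUDA toolkit and a good chunk of the Python scientific stack already ship aarch64 builds, so for that crowd the CPU architecture swap is mostly invisible.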

