Habana AI-processors challenging Nvidia

Furiku · December 7, 2020

Summary

Israeli chipmaker Habana Labs (now owned by Intel, which acquired it a year ago for about $2 billion) has received an order from Amazon AWS, which plans to offer Habana's Gaudi AI processors to its customers as an alternative to Nvidia's AI chips. Could this challenge Nvidia's position as the AI compute king?

Quotes

AWS says that Habana's Gaudi AI processors deliver up to 40% better price performance than current graphics processing chips (in other words Nvidia's). This is a dramatic improvement in the fast growing AI computer resources consumptions market in which every percentage improvement translates into a great deal of money.

My thoughts

Given that AWS controls 50% of the data center market, this is an interesting move and a possible sign that maybe, just maybe, Team green might not be the only option when it comes to the world of AI computing. If AWS does something well, it's cost efficiency.
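To put the quoted "up to 40% better price performance" in perspective, here is a rough sketch of the arithmetic. It assumes "price performance" means work done per dollar, which the quote does not define, and the 1000-unit bill is a made-up number for illustration only.

```python
def cost_for_same_work(baseline_cost: float, price_perf_gain: float) -> float:
    """Cost of the same workload on hardware with better price performance.

    price_perf_gain is the fractional improvement in work done per dollar,
    e.g. 0.40 for "40% better price performance" (assumed definition).
    """
    return baseline_cost / (1.0 + price_perf_gain)


# Hypothetical bill: 1000 currency units of GPU time for some training job.
baseline = 1000.0
new_cost = cost_for_same_work(baseline, 0.40)
savings = 1.0 - new_cost / baseline
print(f"new cost: {new_cost:.2f}, savings: {savings:.1%}")
# prints: new cost: 714.29, savings: 28.6%
```

Under that reading, 40% more work per dollar is roughly a 29% lower bill for the same job, not a 40% lower one.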

Sources

https://en.globes.co.il/en/article-aws-to-offer-israeli-made-habana-ai-processors-1001351972

igormp · December 7, 2020

40 minutes ago, Furiku said:

Team green might not be the only option when it comes to the world of AI computing.

There are other options available, such as TPUs on GCP, which also boast better cost efficiency. The problem comes down to ease of use: even though they are meant to work with the popular frameworks (TF, PyTorch, etc.), there are many underlying problems that are way harder to solve than on your regular, well-known GPU.

Kisai · December 7, 2020

2 hours ago, igormp said:

There are other options available, such as TPUs on GCP, which also boast better cost efficiency. The problem comes down to ease of use: even though they are meant to work with the popular frameworks (TF, PyTorch, etc.), there are many underlying problems that are way harder to solve than on your regular, well-known GPU.

The AI/NN stuff tends to not fit squarely into general purpose computing, so while TF and pytorch seem to be establishing themselves as middleware between the actual software and the hardware, a lot of the software is still designed to run on CUDA/Quadro systems with 24GB of video memory. Try running some of those NNs on an 8GB GeForce part and they will usually just roll over and die after the first sample. I've actually had very little success getting any of the training examples to work. Most of the inference examples will work, though the ones that need more than 6GB may run only once, and they won't work if anything else is using the GPU, like a web browser window.

And yeah, GPUs were never going to be the most efficient option; an ASIC would be. However we're nowhere near a stage where one can be implemented in hardware, and a ^8 increase in performance is still needed before many of these things can be portable and untethered from the cloud.

williamcll · December 7, 2020

Oh yeah, there have always been a lot of small manufacturers who want to compete with GPU manufacturers in the AI space.

The problem is that many large tech companies could just design their own NPU cards if they don't want to buy from someone else, which means demand is not that big in many cases.

Jet_ski · December 7, 2020

Right now Nvidia is far ahead because of their software. CUDA has great integration with popular platforms like PyTorch and TensorFlow. Until all these other hardware companies invest in their software and APIs, you can't really use their products unless you are Amazon, which can develop those tools internally.

igormp · December 7, 2020

1 hour ago, Kisai said:

The AI/NN stuff tends to not fit squarely into general purpose computing, so while TF and pytorch seem to be establishing themselves as middleware between the actual software and the hardware, a lot of the software is still designed to run on CUDA/Quadro systems with 24GB of video memory.

Huh, you do know that TF and PyTorch use CUDA underneath, right? That's why they don't work out of the box with AMD cards. In the end it's all about matrix FMAs (fused multiply-adds).
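For anyone unsure what "it's all about FMA" means, here is a minimal pure-Python sketch, not how any framework actually implements it. The function name `matmul_fma` is made up for illustration. It shows that a matrix multiply is nothing more than a long chain of multiply-accumulate steps, which is the operation GPUs, TPUs and similar accelerators are built to run in bulk.

```python
def matmul_fma(a, b):
    """Multiply two matrices (lists of rows) using only multiply-accumulate steps."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                # One multiply plus one add per step. Python does these as two
                # separate operations; hardware FMA units do both in a single
                # instruction with a single rounding.
                acc = acc + a[i][k] * b[k][j]
            out[i][j] = acc
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_fma(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

An n x n multiply needs n^3 of these inner steps, which is why hardware that can do huge numbers of them in parallel matters so much here.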

1 hour ago, Kisai said:

Try running some of those NNs on an 8GB GeForce part and they will usually just roll over and die after the first sample.

Most models will work just fine; you just need to properly tune your parameters. The only network that I know for sure won't run at all with less than 20GB or so is pix2pix, unless you use a batch size so ridiculously low that training would not be worth it.
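As a rough illustration of the batch size point, here is a back-of-the-envelope sketch. It assumes memory use grows linearly with batch size, and every number in it (3 GB fixed, 0.5 GB per sample) is hypothetical rather than measured from any real model. The helper name `largest_batch_that_fits` is made up too.

```python
def largest_batch_that_fits(budget_gb: float, fixed_gb: float, per_sample_gb: float) -> int:
    """Largest batch size whose estimated memory fits in budget_gb.

    Assumes memory ~= fixed_gb + batch_size * per_sample_gb, a deliberately
    simple linear model; real frameworks add allocator overhead on top.
    Returns 0 if not even a single sample fits.
    """
    if per_sample_gb <= 0:
        raise ValueError("per_sample_gb must be positive")
    free = budget_gb - fixed_gb
    if free < per_sample_gb:
        return 0
    return int(free // per_sample_gb)


# Hypothetical model: 3 GB for weights, gradients and optimizer state,
# plus 0.5 GB of activations per sample.
for card_gb in (8, 24):
    print(card_gb, largest_batch_that_fits(card_gb, fixed_gb=3.0, per_sample_gb=0.5))
```

With these made-up numbers the same model trains on both cards, just with a much smaller batch on the 8GB one. If the estimate comes back as 0 or 1, you are in the "not worth it" territory described above.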

1 hour ago, Kisai said:

And yeah, GPUs were never going to be the most efficient option; an ASIC would be. However we're nowhere near a stage where one can be implemented in hardware, and a ^8 increase in performance is still needed before many of these things can be portable and untethered from the cloud.

We already have TPUs and similar chips that are solely meant to accelerate ML tasks, and those can be classified as ASICs since they are not general-purpose chips. Amazon also has cloud FPGAs available.
