Nvidia announces Tesla T4 Turing Tensor core inference acceleration GPU

Master Disaster

Announced earlier, the T4 is a Turing-based Tensor Core accelerator designed to be small form factor and low power, yet still deliver acceleration for AI and deep learning tasks.

Quote

We’re racing toward the future where every customer interaction, every product, and every service offering will be touched and improved by AI. Realizing that the future requires a computing platform that can accelerate the full diversity of modern AI, enabling businesses to create new customer experiences, reimagine how they meet—and exceed—customer demands, and cost-effectively scale their AI-based products and services.

 

The NVIDIA® Tesla® T4 GPU is the world’s most advanced inference accelerator. Powered by NVIDIA Turing™ Tensor Cores, T4 brings revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. Packaged in an energy-efficient 75-watt, small PCIe form factor, T4 is optimized for scale-out servers and is purpose-built to deliver state-of-the-art inference in real time.


The card has impressive numbers behind it, too; it's not just a pretty face.

Quote

The specifications inside the Tesla T4 are very impressive given its single-slot PCI-e form factor. The graphics card packs the Turing TU104 GPU with 2560 CUDA cores and 320 Tensor Cores. It delivers 8.1 TFLOPs of FP32 performance, 65 TFLOPs of FP16 mixed-precision, 130 TOPs of INT8 and 260 TOPs of INT4 performance. All of this compute performance is achieved with a TDP of just 75W. It means that you don’t need any external power source as the graphics card will be pulling the juice from the PCIe slot and can be put inside a 1U, 4U or any rack since the small form factor design will allow for large-scale compatibility in many servers.

 

Additionally, the graphics card is coupled with 16 GB of GDDR6 memory delivering a bandwidth of more than 320 GB/s, which is just stunning. The NVIDIA TensorRT Hyperscale Platform includes a comprehensive set of hardware and software offerings optimized for powerful, highly efficient inference.
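If you want to sanity-check those headline numbers yourself, here's a rough Python sketch. Treat it as a back-of-the-envelope estimate: the boost clock and the 256-bit memory bus aren't stated in the quote above, they're inferred/assumed.

```python
# Rough sanity check of the quoted Tesla T4 figures.
# The boost clock and 256-bit bus width are assumptions, not from the quote.

CUDA_CORES = 2560
FP32_TFLOPS = 8.1

# FP32 throughput = 2 ops per core per clock (FMA) * cores * clock,
# so the implied boost clock is roughly:
boost_clock_ghz = FP32_TFLOPS * 1e12 / (2 * CUDA_CORES) / 1e9
print(f"Implied boost clock: ~{boost_clock_ghz:.2f} GHz")   # ~1.58 GHz

# Memory bandwidth = effective per-pin data rate * bus width / 8 bits per byte.
# Assuming a 256-bit bus, 10 Gbps GDDR6 lands on the quoted figure:
gddr6_gbps = 10          # assumed effective data rate
bus_width_bits = 256     # assumed bus width
bandwidth_gbs = gddr6_gbps * bus_width_bits / 8
print(f"Memory bandwidth: {bandwidth_gbs:.0f} GB/s")        # 320 GB/s
```

Running it gives an implied boost clock of roughly 1.58 GHz and exactly 320 GB/s, so the quoted figures hang together.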


The full specs of the card are as follows.

[Image: Tesla T4 specification table]

https://wccftech.com/nvidia-tesla-t4-turing-75w-gpu-announced/

 

I can't wait till Linus sticks 5 of em in a build just for the lulz.

Main Rig:-

Ryzen 7 3800X | Asus ROG Strix X570-F Gaming | 16GB Team Group Dark Pro 3600MHz | Corsair MP600 1TB PCIe Gen 4 | Sapphire 5700 XT Pulse | Corsair H115i Platinum | WD Black 1TB | WD Green 4TB | EVGA SuperNOVA G3 650W | Asus TUF GT501 | Samsung C27HG70 1440p 144Hz HDR FreeSync 2 | Ubuntu 20.04.2 LTS |

 

Server:-

Intel NUC running Server 2019 + Synology DSM218+ with 2 x 4TB Toshiba NAS Ready HDDs (RAID0)


Noice, but memory bandwidth of 320 GB/s "which is just stunning"? R9 290 from 2013 had as much, and can be easily overclocked to 400 GB/s +...

CPU: Intel i7 3970X @ 4.7 GHz  (custom loop)   RAM: Kingston 1866 MHz 32GB DDR3   GPU(s): 2x Gigabyte R9 290OC (custom loop)   Motherboard: Asus P9X79   

Case: Fractal Design R3    Cooling loop:  360 mm + 480 mm + 1080 mm,  triple D5 Vario pump   Storage: 500 GB + 240 GB + 120 GB SSD,  Seagate 4 TB HDD

PSU: Corsair AX860i   Display(s): Asus PB278Q,  Asus VE247H   Input: QPad 5K,  Logitech G710+    Sound: uDAC3 + Philips Fidelio x2

HWBot: http://hwbot.org/user/tame/


55 minutes ago, Tam3n said:

Noice, but memory bandwidth of 320 GB/s "which is just stunning"? R9 290 from 2013 had as much, and can be easily overclocked to 400 GB/s +...

Maybe you are thinking about a different product category than what is presented here.


Just to note, INT4 is not anything amazing; it is just efficient because it's small. It takes just 4 bits (the 4 in INT4) and it is an integer (the INT in INT4). That means that if you want your models to run that efficiently, you will have to sacrifice much of the accuracy in your machine learning models, as unsigned INT4 can only represent values from 0 to 15. And I doubt many use cases can adapt to such a limitation. I find it sad that Nvidia is trying to show off this amazing machine learning performance when it's really mostly the same, just with smaller numbers. They just optimized for small numbers, so those who need good accuracy will suffer.
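To put a rough number on that accuracy hit, here's a tiny Python sketch of symmetric INT4 quantization using made-up weights (signed INT4 covers -8 to 7, the same 16 levels as the unsigned 0 to 15 range above). It's illustrative only, not how TensorRT actually calibrates a model.

```python
# Minimal sketch of symmetric INT4 quantization with hypothetical weights.
# Not NVIDIA's actual pipeline; just shows how coarse 16 levels are.

def quantize_int4(values, scale):
    """Map floats to signed 4-bit integers in [-8, 7] by rounding value/scale."""
    return [max(-8, min(7, round(v / scale))) for v in values]

def dequantize(quantized, scale):
    """Map the 4-bit integers back to approximate float values."""
    return [q * scale for q in quantized]

weights = [0.91, -0.07, 0.33, -1.20, 0.02]       # made-up example weights
scale = max(abs(w) for w in weights) / 7          # one scale for the whole tensor

q = quantize_int4(weights, scale)
restored = dequantize(q, scale)

for w, r in zip(weights, restored):
    print(f"{w:+.2f} -> {r:+.3f} (error {abs(w - r):.3f})")
```

Even with only five weights you can see every value snap to one of 16 levels, which is where the accuracy loss comes from.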


3 hours ago, Tech Enthusiast said:

Maybe you are thinking about a different product category than what is presented here.

The FirePro W9100 was based on the 290X and was announced mid-2016; it has 320 GB/s of memory bandwidth.

https://www.amd.com/en-us/press-releases/Pages/amd-announces-world-2016apr14.aspx

if you want to annoy me, then join my teamspeak server ts.benja.cc


6 hours ago, Tam3n said:

Noice, but memory bandwidth of 320 GB/s "which is just stunning"? R9 290 from 2013 had as much, and can be easily overclocked to 400 GB/s +...

Which consumed 120W, while this whole card only consumes 75W.


Oooof. 2560 CUDA cores for 75W... Damn :D.

 

This must be an exceptionally well binned part.

Judge a product on its own merits AND the company that made it.

How to setup MSI Afterburner OSD | How to make your AMD Radeon GPU more efficient with Radeon Chill | (Probably) Why LMG Merch shipping to the EU is expensive

Oneplus 6 (Early 2023 to present) | HP Envy 15" x360 R7 5700U (Mid 2021 to present) | Steam Deck (Late 2022 to present)

 

Mid 2023 AlTech Desktop Refresh - AMD R7 5800X (Mid 2023), XFX Radeon RX 6700XT MBA (Mid 2021), MSI X370 Gaming Pro Carbon (Early 2018), 32GB DDR4-3200 (16GB x2) (Mid 2022),

Noctua NH-D15 (Early 2021), Corsair MP510 1.92TB NVMe SSD (Mid 2020), beQuiet Pure Wings 2 140mm x2 & 120mm x1 (Mid 2023),


1 hour ago, AluminiumTech said:

Oooof. 2560 CUDA cores for 75W... Damn :D.

 

This must be an exceptionally well binned part.

Well, about the same as the Pascal one (the Tesla P4). GDDR6 has probably seriously lowered VRAM power consumption, so they could give the core more resources. Moreover, Pascal cards actually undervolt quite nicely, and I don't think Turing will differ since the process is almost the same.

But why do people keep spelling TFLOPS with a lowercase s? It's FLOPS, not FLOPs.

On a mote of dust, suspended in a sunbeam


1 hour ago, Agost said:

Well, about the same as the Pascal one (the Tesla P4). GDDR6 has probably seriously lowered VRAM power consumption, so they could give the core more resources. Moreover, Pascal cards actually undervolt quite nicely, and I don't think Turing will differ since the process is almost the same.

But why do people keep spelling TFLOPS with a lowercase s? It's FLOPS, not FLOPs.

Is it though? It really should be FLOP/s for consistency, since OP is merely OPerations, just as FL is FLoating.

LINK-> Kurald Galain:  The Night Eternal 

Top 5820k, 980ti SLI Build in the World*

CPU: i7-5820k // GPU: SLI MSI 980ti Gaming 6G // Cooling: Full Custom WC //  Mobo: ASUS X99 Sabertooth // Ram: 32GB Crucial Ballistic Sport // Boot SSD: Samsung 850 EVO 500GB

Mass SSD: Crucial M500 960GB  // PSU: EVGA Supernova 850G2 // Case: Fractal Design Define S Windowed // OS: Windows 10 // Mouse: Razer Naga Chroma // Keyboard: Corsair k70 Cherry MX Reds

Headset: Senn RS185 // Monitor: ASUS PG348Q // Devices: Note 10+ - Surface Book 2 15"

LINK-> Ainulindale: Music of the Ainur 

Prosumer DIY FreeNAS

CPU: Xeon E3-1231v3  // Cooling: Noctua L9x65 //  Mobo: AsRock E3C224D2I // Ram: 16GB Kingston ECC DDR3-1333

HDDs: 4x HGST Deskstar NAS 3TB  // PSU: EVGA 650GQ // Case: Fractal Design Node 304 // OS: FreeNAS

 

 

 


20 hours ago, Curufinwe_wins said:

Is it though? It really should be FLOP/s for consistency, since OP is merely OPerations, just as FL is FLoating.

FLoating point Operations Per Second. So FLOPS. FLOP/s would read as FP operations per, per second, so no. Unless you decide to divide it into OPerations, but meh.

On a mote of dust, suspended in a sunbeam


17 minutes ago, Agost said:

FLoating point Operations Per Second. So FLOPS. FLOP/s would read as FP operations per, per second, so no. Unless you decide to divide it into OPerations, but meh.

Consistency is king. FLOP/s makes more sense. "Point" is a far more important word than "per", and "point" isn't even being kept in the acronym.

 

 

Also, with the notable exception of MIPS, everything else relating to OPs uses OP as the base for acronyms: MOP fusion, etc.

 

LINK-> Kurald Galain:  The Night Eternal 

Top 5820k, 980ti SLI Build in the World*

CPU: i7-5820k // GPU: SLI MSI 980ti Gaming 6G // Cooling: Full Custom WC //  Mobo: ASUS X99 Sabertooth // Ram: 32GB Crucial Ballistic Sport // Boot SSD: Samsung 850 EVO 500GB

Mass SSD: Crucial M500 960GB  // PSU: EVGA Supernova 850G2 // Case: Fractal Design Define S Windowed // OS: Windows 10 // Mouse: Razer Naga Chroma // Keyboard: Corsair k70 Cherry MX Reds

Headset: Senn RS185 // Monitor: ASUS PG348Q // Devices: Note 10+ - Surface Book 2 15"

LINK-> Ainulindale: Music of the Ainur 

Prosumer DIY FreeNAS

CPU: Xeon E3-1231v3  // Cooling: Noctua L9x65 //  Mobo: AsRock E3C224D2I // Ram: 16GB Kingston ECC DDR3-1333

HDDs: 4x HGST Deskstar NAS 3TB  // PSU: EVGA 650GQ // Case: Fractal Design Node 304 // OS: FreeNAS

 

 

 

