About AluminiumTech


Profile Information

  • Interests
    Technology and Gaming
  • Occupation
    Student at Uni


  • CPU
    AMD Ryzen 7 1700X
  • Motherboard
    MSI X370 Gaming Pro Carbon
  • RAM
    32GB DDR4-2666 (2x 16GB) Corsair Vengeance LPX
  • GPU
    Asus ROG Strix Radeon RX Vega 56
  • Case
    NZXT S340 Elite
  • Storage
    500GB Samsung 860 EVO, 6TB Seagate Barracuda 5400rpm (2020)
  • PSU
    Corsair RM850i
  • Display(s)
    2560x1440 27" Gigabyte G27QC 165Hz VA with FreeSync Premium + Asus VX 238H 23" secondary monitor
  • Cooling
    1x Corsair ML 140 Pro, 1x Corsair ML 120 Pro, 1x Corsair H100i v2
  • Keyboard
    Corsair K70 LUX RGB MX Brown
  • Mouse
    Logitech G903 Wireless
  • Sound
    Sennheiser HD 450BT Wireless Headphones
  • Operating System
    Windows 10 Pro 64 Bit
  • Phone
    iPhone 8 Plus 64GB Gold

Recent Profile Visitors

405,201 profile views

Single Status Update

See all updates by AluminiumTech

  1. Thanks to @Syn. for helping me fully understand the Nvidia RTX 30 series CUDA core count debacle.


    Nvidia pulled a Bulldozer, everyone!


    Each 30 series CUDA core is half of a regular CUDA core.


    That's why they get to claim such high CUDA core counts.

    1. Show previous comments  3 more
    2. .Apex.


      I think I've figured it out.


      Traditionally, the FP32 cores are what they count when they mention CUDA cores. The A100 has 6912 FP32 cores and 3456 FP64 cores, but since each of the A100's FP32 cores handles 2x FP16 and each FP64 core handles 4x FP16, that gives a total of 27648 FP16 cores. According to NVIDIA, the A100 achieves 78 TFLOPS of FP16 performance at 1410MHz. You can calculate the TFLOPS by multiplying the core count by the clock speed, then multiplying again by 2, which works out to 27648 * 2 * 1410MHz = 78 TFLOPS. Now, if I'm right that what they're counting in the RTX 3080 are the FP16 cores, then dividing the core counts and using that ratio to scale the TFLOPS should give the same result as calculating the FP16 performance directly from the cores and clock speed:

      27648 (FP16 Cores) / 8704 (FP16 Cores) = 3.17
      78 (TFlops) / 3.17 = 24.6 TFlops (FP16 Performance at 1410Mhz)


      Calculating again by multiplying:

      8704*2*1410 = 24.54 TFlops (FP16 Performance at 1410Mhz)

      *To note, though: the 30 TFLOPS figure they mentioned was calculated at 1710MHz, which is the boost clock.

      The calculations all match up, which means the RTX 3080 actually has 2176 CUDA cores in the traditional sense, counting only the FP32 cores. Its FP32 performance at 1710MHz works out to 2176 * 2 * 1710MHz = 7.44 TFLOPS. I believe that's why they refused to mention the actual CUDA core count: to most people it would seem like a downgrade. But since they were able to achieve a 2x FP32 IPC improvement, that puts the FP32 performance at around 15 TFLOPS when compared to the previous generation.


      In realistic scenarios, though, the card would most likely boost well above 1710MHz, and the IPC improvement is likely slightly higher than 2x, which would put it more in line with the performance figures we've seen relative to the RTX 2080.
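      The arithmetic above can be sketched in a few lines of Python. The core counts and clock speeds are the ones quoted in this thread; the factor of 2 assumes one fused multiply-add (2 FLOPs) per core per cycle:

```python
def tflops(cores: int, clock_mhz: float) -> float:
    """Peak TFLOPS: cores * 2 FLOPs per cycle (FMA) * clock, scaled to tera."""
    return cores * 2 * clock_mhz * 1e6 / 1e12

# A100: 27648 FP16 lanes at 1410 MHz
print(round(tflops(27648, 1410), 1))  # ~78.0, matching NVIDIA's quoted figure

# RTX 3080: 8704 lanes at the same 1410 MHz for comparison
print(round(tflops(8704, 1410), 2))   # ~24.55

# Hypothetical "traditional" FP32 count of 2176 at the 1710 MHz boost clock
print(round(tflops(2176, 1710), 2))   # ~7.44
```

      This is just a back-of-the-envelope check of the ratios used in the comment, not a claim about how NVIDIA actually derives its marketing numbers.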

    3. AluminiumTech


      It also makes comparisons more difficult if people don't know the true CUDA core count.

    4. .Apex.


      Yeah, I can see a ton of YouTubers wrongly assuming that what NVIDIA has shown is in any way comparable on paper to the previous generation. There's going to be a lot of confusion, and I don't know how to make more people aware of this.