Bandwidth of GPU, VRAM, PCIe confusion
There's minimal or no relation between the PCIe bandwidth, the VRAM bandwidth and the bandwidth to the screens etc.
All bandwidths are maximums for ideal scenarios.
Data is transferred in packets of fixed sizes, like let's say 512 bytes, 32 KB, 64 KB, 1 MB, etc.
You get the maximum throughput provided you use the maximum size packets all the time and you're streaming a large file without interruption.
If you constantly complete transfers and initiate new transfers of various sizes, there is some latency involved and the effective bandwidth decreases.
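The effect above can be sketched with a toy model: every transfer pays a fixed setup latency before the payload moves at the link's raw rate, so small packets waste most of their time on overhead. The raw rate and latency figures here are made-up assumptions just for illustration, not real PCIe numbers.

```python
# Toy model: each transfer pays a fixed setup latency, then the payload
# moves at the raw link rate. Small packets amortize the latency poorly.
# RAW_GBPS and SETUP_US are invented illustrative numbers.

RAW_GBPS = 16.0   # hypothetical raw link rate, gigabits per second
SETUP_US = 1.0    # hypothetical per-transfer setup latency, microseconds

def effective_gbps(packet_bytes: int) -> float:
    payload_us = packet_bytes * 8 / (RAW_GBPS * 1000)  # time to move payload
    return packet_bytes * 8 / ((SETUP_US + payload_us) * 1000)

for size in (512, 32 * 1024, 1024 * 1024):
    print(size, "bytes ->", round(effective_gbps(size), 2), "Gbps effective")
```

With these numbers, 512-byte packets only reach a few Gbps effective, while 1 MB packets get close to the full 16 Gbps, which is the "maximum size packets, streaming without interruption" case.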
Same with memory chips.
GDDR5, 6 and even HBM aren't simple things.
The gpu chip requests some data from a specific location in RAM, but it takes some amount of time (let's say 10 ms - hugely inflated, bogus numbers just to make it easier to understand) from the moment the gpu chip tells each memory chip that it wants to read data from a location in the ram chip, until the start of the data is available on the memory chip's pins.
Once the data is there, the memory chips are prepared to send a relatively large chunk of data to the gpu chip with much smaller delays, very fast.
So for example, let's say each memory chip is arranged in rows of 32 KB (let's say a 1 GB memory chip is arranged in 32768 rows x 32 KB per row of data) and you have 8 memory chips ( 8 GB in total on the video card) and the gpu chip wants to read a 200 KB chunk of data from the memory chips.
The gpu chip sends a command to the first chip to give it data starting from 0 KB, the second chip data starting from 32 KB, the third chip data starting from 64 KB and so on... but being only 200 KB, the first six chips supply 32 KB each (covering 0 KB to 192 KB), the seventh chip supplies just the last 8 KB (192 KB to 200 KB) and the eighth chip is basically not used because the chunk is only 200 KB in size.
So there's 10 ms of waiting until the memory chips come back and say data is ready to be transferred, and then on every clock tick (let's say 0.01 ms per transfer) each memory chip can place 32 bits (4 bytes) on its output pins, so there's 4 bytes x 8 = 32 bytes available to the video card, but only 28 of those bytes are useful because chip 8 has no meaningful data - and once chip 7 runs out of its 8 KB, only 24 bytes.
Since the chips stream in parallel, the whole transfer takes as long as the busiest chip needs: 32 KB per chip / 4 bytes per transfer = 8192 transfers, and 8192 x 0.01 ms = ~82 ms to read those 200 KB of data, on top of the initial 10 ms wait.
If you had a chunk of data that was at least 8 memory chips x 32 KB per row of memory chip = 256 KB, then the data moves at a higher effective rate, because all 8 memory chips contribute useful bytes on every transfer, instead of just six or seven.
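One toy way to model the example above: with the chunk striped in 32 KB rows across 8 chips, the transfer lasts as long as the busiest chip needs, so a 200 KB and a 256 KB chunk take the same time while the bigger one moves more data. All constants are the deliberately bogus figures from the example, not real GDDR timings.

```python
# Toy model of the example's (deliberately bogus) timing numbers:
# 8 chips, 32 KB row per chip, 4 bytes per chip per transfer,
# 10 ms initial latency, 0.01 ms per transfer.

CHIPS = 8
ROW_KB = 32
BYTES_PER_CHIP = 4       # bytes each chip puts on its pins per transfer
LATENCY_MS = 10.0
TRANSFER_MS = 0.01

def transfer_time_ms(chunk_kb: int) -> float:
    # The transfer lasts as long as the busiest chip needs; the busiest
    # chip holds at most one 32 KB row of this chunk.
    busiest_chip_kb = min(ROW_KB, chunk_kb)
    transfers = busiest_chip_kb * 1024 // BYTES_PER_CHIP
    return LATENCY_MS + transfers * TRANSFER_MS

def effective_mb_per_s(chunk_kb: int) -> float:
    return (chunk_kb / 1024) / (transfer_time_ms(chunk_kb) / 1000)

print(transfer_time_ms(200), "ms for 200 KB")   # 6 full chips + a partial one
print(transfer_time_ms(256), "ms for 256 KB")   # all 8 chips fully used
print(round(effective_mb_per_s(200), 2), "vs", round(effective_mb_per_s(256), 2), "MB/s")
```

Same elapsed time, more bytes moved - which is why big, well-aligned chunks get closer to the advertised maximum throughput.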
So you get that maximum throughput between ram and video card ONLY if you have very big chunks of data and other ideal cases.
HBM2 memory is even more special because unlike GDDRx, which is 128 bit or 256 bit (or some other small number) wide, it's 1024 bit wide per stack, so a card like Vega with 4 stacks is 4096 bit wide. If you deal with large textures or other resources, let's say 2-4 MB, then the wide 4096-bit interface can be useful, as you transfer 512 bytes in one shot instead of only 32 or 64 bytes or some small amount. It sucks if you deal with small resources, because you still have to deal with that long period of time between a request for data and the moment the data becomes available.
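The bus-width arithmetic is just width in bits divided by 8: each transfer beat moves that many bytes. The 4096-bit figure is the Vega-class HBM2 total; 256-bit is shown as a typical GDDR-class total for comparison.

```python
# Bytes moved per transfer beat for a given memory bus width.

def bytes_per_beat(bus_width_bits: int) -> int:
    return bus_width_bits // 8

print(bytes_per_beat(4096), "bytes/beat")  # HBM2, 4 stacks x 1024 bit
print(bytes_per_beat(256), "bytes/beat")   # typical 256-bit GDDR card

# For a hypothetical 4 MB texture, the beat counts differ accordingly:
texture = 4 * 1024 * 1024
print(texture // bytes_per_beat(4096), "beats vs", texture // bytes_per_beat(256), "beats")
```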
So even though in your picture you see a maximum of 616 GB/s or 1 TB/s, in reality video games deal with textures of various sizes, shaders and scripts of various sizes, and these scripts and shaders and other things reserve various amounts of memory to perform calculations, so not every memory transfer is ideal enough to get you that maximum throughput.
A game can also deal with 1-2 GB worth of textures held in the ram, and it applies those textures or parts of those textures over objects (terrain, buildings, signposts, characters, signs, walls, trees, grass) on every frame it outputs to the screen... then shaders and other things use lighting information and other things to change the look of the frame... and all this is repeated for each frame... if the game outputs 60 fps, then it's done 60 times a second, so during every ~16 ms the video card kinda has to start from scratch, bring textures and all the information from ram into the gpu chip, and build the picture you look at.
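The per-frame budget above is simple arithmetic: at 60 fps each frame gets 1000/60 ≈ 16.7 ms, and if the card re-reads some amount of texture data every frame, the bandwidth that alone requires is traffic × fps. The 1.5 GB/frame figure below is a hypothetical value picked inside the 1-2 GB range mentioned above.

```python
# Per-frame time budget and the memory bandwidth implied by re-reading
# textures every frame. traffic_gb_per_frame is a hypothetical figure.

fps = 60
frame_ms = 1000 / fps                # time budget per frame
traffic_gb_per_frame = 1.5           # assumed texture traffic per frame

needed_gb_per_s = traffic_gb_per_frame * fps
print(round(frame_ms, 1), "ms per frame")
print(needed_gb_per_s, "GB/s just for this texture traffic")
```

That's why the hundreds of GB/s on the spec sheet get eaten up quickly even before shaders and render targets are counted.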
During this, the pci-e slot is only used to send commands like change the viewport (where in the scene the player looks), maybe change parameters of shaders (increase blur, change brightness, make it rain etc), maybe upload some textures that will show up in the next level or as you turn a corner in the game... it's not only about how fast (raw MB/s) data is sent to the video card, it's also about latency - how fast the data arrives, how fast the video card acknowledges it, and so on.
As for data from the video card to the monitor... that's basically more or less a limitation of how fast bits can be pushed over a bunch of wires to the monitor with enough strength (intensity) so that at the end of a few meters of cable it's still easy to make the distinction between a digital 0 and 1. Right now, we're stuck at around 20-30 Gbps for a connection to a monitor - limitations of copper, length of cable, how well the signal carries across a bunch of individual copper wires packed close together, etc.
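For a sense of why 20-30 Gbps matters, the raw uncompressed signal rate is just width × height × fps × bits-per-pixel. This sketch ignores blanking intervals and line-coding overhead (which real DisplayPort/HDMI links add on top), so real cables need somewhat more than these figures.

```python
# Raw uncompressed video signal rate, ignoring blanking and encoding overhead.

def signal_gbps(width: int, height: int, fps: int, bpp: int = 24) -> float:
    return width * height * fps * bpp / 1e9

print(round(signal_gbps(1920, 1080, 60), 2), "Gbps - 1080p60")
print(round(signal_gbps(3840, 2160, 60), 2), "Gbps - 4K60")
print(round(signal_gbps(3840, 2160, 144), 2), "Gbps - 4K144")
```

4K at 144 Hz already lands near the top of that 20-30 Gbps window, which is why high refresh rates at high resolutions push cable standards to their limits.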