
Question about Intel Xe graphics

So I've been watching some videos about Xe graphics and am still confused about one thing. Will Xe graphics be exclusively integrated graphics? Or will there also be dedicated graphics cards for a PCIe slot?


AFAIK:

Xe will be an iGPU, and then Intel will also release a dedicated GPU (I don't know if it's under the same name).

But I could be wrong.

PC: Motherboard: ASUS B550M TUF-Plus, CPU: Ryzen 3 3100, CPU Cooler: Arctic Freezer 34, GPU: GIGABYTE WindForce GTX1650S, RAM: HyperX Fury RGB 2x8GB 3200 CL16, Case: CoolerMaster MB311L ARGB, Boot Drive: 250GB MX500, Game Drive: WD Blue 1TB 7200RPM HDD.

 

Peripherals: GK61 (Optical Gateron Red) with Mistel White/Orange keycaps, Logitech G102 (Purple), BitWit Ensemble Grey Deskpad. 

 

Audio: Logitech G432, Moondrop Starfield, Mic: Razer Siren Mini (White).

 

Phone: Pixel 3a (Purple-ish).

 

Build Log: 


6 minutes ago, BlakeHoward said:

Will Xe graphics be exclusively integrated graphics?

Xe isn't integrated at all. It's exclusively discrete graphics at the moment. It might also receive an integrated model in the future, but there is no such thing at the moment.

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


4 minutes ago, WereCatf said:

Xe isn't integrated at all. It's exclusively discrete graphics at the moment. It might also receive an integrated model in the future, but there is no such thing at the moment.

Isn't the low-power DG1 available as both a dedicated GPU and an iGPU?

CPU: i7-2600K 4751MHz 1.44V (software) --> 1.47V at the back of the socket Motherboard: Asrock Z77 Extreme4 (BCLK: 103.3MHz) CPU Cooler: Noctua NH-D15 RAM: Adata XPG 2x8GB DDR3 (XMP: 2133MHz 10-11-11-30 CR2, custom: 2203MHz 10-11-10-26 CR1 tRFC:230 tREFI:14000) GPU: Asus GTX 1070 Dual (Super Jetstream vbios, +70(2025-2088MHz)/+400(8.8Gbps)) SSD: Samsung 840 Pro 256GB (main boot drive), Transcend SSD370 128GB PSU: Seasonic X-660 80+ Gold Case: Antec P110 Silent, 5 intakes 1 exhaust Monitor: AOC G2460PF 1080p 144Hz (150Hz max w/ DP, 121Hz max w/ HDMI) TN panel Keyboard: Logitech G610 Orion (Cherry MX Blue) with SteelSeries Apex M260 keycaps Mouse: BenQ Zowie FK1

 

Model: HP Omen 17 17-an110ca CPU: i7-8750H (0.125V core & cache, 50mV SA undervolt) GPU: GTX 1060 6GB Mobile (+80/+450, 1650MHz~1750MHz 0.78V~0.85V) RAM: 8+8GB DDR4-2400 18-17-17-39 2T Storage: HP EX920 1TB PCIe x4 M.2 SSD + Crucial MX500 1TB 2.5" SATA SSD, 128GB Toshiba PCIe x2 M.2 SSD (KBG30ZMV128G) gone cooking externally, 1TB Seagate 7200RPM 2.5" HDD (ST1000LM049-2GH172) left outside Monitor: 1080p 126Hz IPS G-sync

 

Desktop benching:

Cinebench R15 Single thread:168 Multi-thread: 833 

SuperPi (v1.5 from Techpowerup, PI value output) 16K: 0.100s 1M: 8.255s 32M: 7m 45.93s


Just now, Jurrunio said:

Isn't the low-power DG1 available as both a dedicated GPU and an iGPU?

Considering that the moniker "DG" comes from the words "Discrete Graphics"...



9 minutes ago, WereCatf said:

Considering that the moniker "DG" comes from the words "Discrete Graphics"...

Nothing's stopping them from integrating the design into their CPU and giving it a different name. Just look at AMD's Vega cards: make one much smaller and cut away the memory controller, and you get what Ryzen APUs carry.



Just now, Jurrunio said:

Nothing's stopping them from integrating the design into their CPU and giving it a different name.

Yes, but they haven't done so yet. All the Xe graphics are discrete at the moment, and I am not aware of anyone at Intel having announced plans for an integrated version anytime soon.



For now, it has been designed as a scalable, tile-based architecture. If their plan succeeds, they can create anything from it, ranging from low-power integrated GPUs to high-power gaming and accelerator cards for servers.

But for now, they just exist as dedicated GPUs; at least that is all they have told us about. If they wanted to integrate them with a CPU, they might need to adapt the architecture a bit, but in the end they will probably want to merge their iGPU division with the Xe division. This only works, however, if the Xe architecture is truly scalable. The iGPU version of the Xe GPU would probably only need a portion of a single tile.

 

The fact that it is tile-based will probably also mean it won't be as good at gaming, since making something tile-based inherently introduces more latency, which is really bad for games. Unless they can write a really smart scheduler that limits the need for communication between the dies, the single-tile version will be the most interesting.


5 minutes ago, Jurrunio said:

Nothing's stopping them from integrating the design into their CPU and giving it a different name. Just look at AMD's Vega cards: make one much smaller and cut away the memory controller, and you get what Ryzen APUs carry.

They also really want a low-power, low-area GPU for their server market that is still functional. I think like 3 years ago they were really proud of the fact that some of their Xeon CPUs had iGPUs that can handle a video stream or two. For now, that is not the plan with Xe. It is supposed to excel at machine learning computation, so it is probably not even relevant for gaming. If it is just full of 'tensor cores', they will only be good for gaming when Nvidia's GameGAN takes off.

https://arxiv.org/pdf/2005.12126.pdf


16 hours ago, BlakeHoward said:

So I've been watching some videos about Xe graphics and am still confused about one thing. Will Xe graphics be exclusively integrated graphics? Or will there also be dedicated graphics cards for a PCIe slot?

Intel is trying to build an architecture that's easily scalable. There are rumors ranging from Gen12 iGPUs in Tiger Lake (Xe LP) up to big data center cards (Xe HP), so it's going to tackle both of the things you mentioned.

 

16 hours ago, adm0n said:

They also really want a low-power, low-area GPU for their server market that is still functional.

There are rumors about a wide range of products, from integrated graphics up to cards with around the same number of cores as a 2080 Ti.

Quote

I think like 3 years ago they were really proud of the fact that some of their Xeon CPUs had iGPUs that can handle a video stream or two.

That doesn't really have much to do with GPU performance, since it's fixed-function hardware. Pretty much like NVDEC, which is the same on a 1650 Super and on a Titan RTX.

Quote

For now, that is not the plan with Xe. It is supposed to excel at machine learning computation, so it is probably not even relevant for gaming. If it is just full of 'tensor cores', they will only be good for gaming when Nvidia's GameGAN takes off.

https://arxiv.org/pdf/2005.12126.pdf

Xe is meant to be a general-purpose GPU, be it for games or for HPC uses. In fact, it's believed that their main target is actually low-end gaming.

 

On another note, I haven't seen any indication of the card having any extra hardware such as tensor cores, only some rumors about ray tracing, and those are still quite small rumors. Also, GameGAN has nothing to do with actually running games. (I know that may be a joke, but many people won't understand it.)

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


46 minutes ago, igormp said:

That doesn't really have much to do with GPU performance, since it's fixed-function hardware. Pretty much like NVDEC, which is the same on a 1650 Super and on a Titan RTX.

I know, but fixed-function hardware takes up a lot of space on the silicon.

 

46 minutes ago, igormp said:

Xe is meant to be a general-purpose GPU, be it for games or for HPC uses. In fact, it's believed that their main target is actually low-end gaming.

If you follow the tweets from Raja Koduri, he says something about holding peta-ops in his hand. That is not possible without something like the tensor cores from Nvidia. And this is probably INT8 performance rather than FP32/64 or even TF32/FP16.
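As a back-of-the-envelope check (a sketch with assumed numbers, not anything Intel has confirmed), here is roughly what "a peta-op" would imply if it means packed INT8 FMA throughput:

```
#include <cstdio>

int main() {
    // All numbers below are assumptions for illustration only.
    const double target_ops   = 1e15;   // 1 peta-op per second
    const double clock_hz     = 1.5e9;  // assumed ~1.5 GHz clock
    const double ops_per_unit = 2 * 4;  // FMA (2 ops) x 4 packed INT8 lanes per 32-bit ALU

    // ~83,000 32-bit ALUs needed, vs ~4,352 on a 2080 Ti (~19x), which is
    // why denser INT8/INT4 paths (or even lower precision) seem necessary.
    printf("32-bit ALUs needed: %.0f\n", target_ops / (clock_hz * ops_per_unit));
    return 0;
}
```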

 

52 minutes ago, igormp said:

Also, GameGAN has nothing to do with actually running games.

It was a joke, but the writers of the paper actually say that porting games from one system to another would be a future use case for this. I highly doubt that what they have done is scalable enough for that, but who knows.


59 minutes ago, adm0n said:

I know, but fixed-function hardware takes up a lot of space on the silicon.

It really doesn't; that's the reason they exist. Fixed-function hardware takes less space and costs less than trying to use general-purpose units for the same function.

Quote

He says something about holding peta-ops in his hand. That is not possible without something like the tensor cores from Nvidia. And this is probably INT8 performance rather than FP32/64 or even TF32/FP16.

Agreed that this is probably INT8 or even INT4 performance, but you don't really need tensor cores for that. INT4 calculation was available way back in Kepler (4x faster than regular FP32 perf). Tensor cores added a faster way to perform FMA matrix operations with those values, and hence only provide 2 instructions to work with them (as seen here).
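For a concrete picture of the packed-integer path (as opposed to tensor cores), here is a minimal CUDA sketch using the __dp4a intrinsic, which does a 4-way INT8 dot product with 32-bit accumulation in a single instruction. It assumes a GPU where the intrinsic is exposed (compile with -arch=sm_61 or newer):

```
#include <cstdio>
#include <cuda_runtime.h>

// Each int carries four packed INT8 lanes; __dp4a multiplies the lanes
// pairwise and adds the sum to a 32-bit accumulator in one instruction.
__global__ void dot_int8(const int *a, const int *b, int n, int *out) {
    int acc = 0;
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        acc = __dp4a(a[i], b[i], acc);  // acc += a8[0]*b8[0] + ... + a8[3]*b8[3]
    atomicAdd(out, acc);
}

int main() {
    // 0x01010101 packs four INT8 ones, 0x02020202 four twos: dot = 4 * (1*2) = 8
    int ha = 0x01010101, hb = 0x02020202, hout = 0;
    int *da, *db, *dout;
    cudaMalloc(&da, sizeof(int)); cudaMalloc(&db, sizeof(int)); cudaMalloc(&dout, sizeof(int));
    cudaMemcpy(da, &ha, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(db, &hb, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dout, &hout, sizeof(int), cudaMemcpyHostToDevice);
    dot_int8<<<1, 32>>>(da, db, 1, dout);
    cudaMemcpy(&hout, dout, sizeof(int), cudaMemcpyDeviceToHost);
    printf("INT8 dot product: %d\n", hout);  // prints 8
    return 0;
}
```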



1 minute ago, igormp said:

It really doesn't; that's the reason they exist. Fixed-function hardware takes less space and costs less than trying to use general-purpose units for the same function.

Okay, I worded that very wrong. They definitely do not take up a lot of space on the silicon in comparison to doing it with GP units. But they do still take up space and are only useful for one task.

I tried to find information on exactly how much space the fixed-function block on a current Intel iGPU takes up, but sadly didn't find anything, and I don't have that much time right now to look further. So what I thought was correct might have been wrong. But having fixed-function hardware basically acts like 'negative' performance for all other tasks that don't require that hardware (of course, in the right use case it actually frees up resources), since it takes up die space. And the die space it does take up is not that insignificant.

 

But I realize that I've never seen anything about their actual dimensions, and have only read about them from people who also have no idea, describing the block as massive (probably because of its feature set and not its performance).

If I have said something wrong, I am very sorry!

 

28 minutes ago, igormp said:

Agreed that this is probably INT8 or even INT4 performance, but you don't really need tensor cores for that. INT4 calculation was available way back in Kepler (4x faster than regular FP32 perf)

Again, correct me if I'm wrong, but currently in the Turing architecture there are the same number of INT execution units as there are FP32 units, so the INT32 and FP32 performance is the same. For a 2080 Ti, that would be 13.4 TOPS (at base clock, I assume). The x4 comes from the fact that you can split an INT32 calculation into 4 INT8 calculations, which would get you to 53.6 TOPS for INT8 operations.
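A quick sanity check on that arithmetic, as a sketch using the 2080 Ti's public specs (the 13.4 figure actually falls out of the boost clock rather than the base clock):

```
#include <cstdio>

int main() {
    // RTX 2080 Ti: 4352 FP32 units (with matching INT32 units), FMA = 2 ops/clock
    const double units = 4352, ops_per_clock = 2;
    const double base_hz = 1.350e9, boost_hz = 1.545e9;

    printf("INT32/FP32 @ base:  %.2f TOPS\n", units * ops_per_clock * base_hz  / 1e12); // ~11.75
    printf("INT32/FP32 @ boost: %.2f TOPS\n", units * ops_per_clock * boost_hz / 1e12); // ~13.45
    // Splitting each 32-bit op into 4 packed INT8 lanes gives the x4:
    printf("INT8 @ boost: %.1f TOPS\n", 4 * units * ops_per_clock * boost_hz / 1e12);   // ~53.8
    return 0;
}
```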

I doubt that Intel would mismatch the int-to-float ratio in the GP units. They might even go back to what Nvidia did before Turing and have units that switch between FP and INT calculation and can't do both in a single cycle.

 

The only way they could get into POPS territory, in my opinion, is by having dedicated INT8/4 units, which I referred to as something like tensor cores, not because they also incorporate matrix multiplication, but simply because they are separate from the GP units. Call them INT cores if you so desire.

 

Still, correct me if I'm wrong. Also, I'm not really sure this thread is still the correct place to have this discussion .__.


1 hour ago, adm0n said:

Okay, I worded that very wrong. They definitely do not take up a lot of space on the silicon in comparison to doing it with GP units. But they do still take up space and are only useful for one task.

I tried to find information on exactly how much space the fixed-function block on a current Intel iGPU takes up, but sadly didn't find anything, and I don't have that much time right now to look further. So what I thought was correct might have been wrong. But having fixed-function hardware basically acts like 'negative' performance for all other tasks that don't require that hardware (of course, in the right use case it actually frees up resources), since it takes up die space. And the die space it does take up is not that insignificant.

 

But I realize that I've never seen anything about their actual dimensions, and have only read about them from people who also have no idea, describing the block as massive (probably because of its feature set and not its performance).

If I have said something wrong, I am very sorry!

Yeah, I get what you meant. A misused area ends up as wasted space when it comes to chip design, since that same space could be used for something more useful, or just not exist at all (making the chip cheaper due to the smaller area). Finding actual die shots that explain which part is left for the media engines is really hard; I couldn't find any from Nvidia or Intel, but did manage to find one from AMD (from WikiChip):

 

[Image: annotated AMD die shot from WikiChip showing the media engine block]

 

It's kinda big for that kind of chip, but represents a small area percentage when it comes to big GPUs and whatnot.

 

Anyway, I guess we kinda steered away from your main point. Intel is probably still going to include media engines in their discrete GPUs, at least for the consumer market, which they are going to attack. Not so likely in their HPC cards, though, for obvious reasons.

 

Quote

Again, correct me if I'm wrong, but currently in the Turing architecture there are the same number of INT execution units as there are FP32 units, so the INT32 and FP32 performance is the same. For a 2080 Ti, that would be 13.4 TOPS (at base clock, I assume). The x4 comes from the fact that you can split an INT32 calculation into 4 INT8 calculations, which would get you to 53.6 TOPS for INT8 operations.

Yes, indeed, that's how it works.

Quote

I doubt that Intel would mismatch the int-to-float ratio in the GP units. They might even go back to what Nvidia did before Turing and have units that switch between FP and INT calculation and can't do both in a single cycle.

 

The only way they could get into POPS territory, in my opinion, is by having dedicated INT8/4 units, which I referred to as something like tensor cores, not because they also incorporate matrix multiplication, but simply because they are separate from the GP units. Call them INT cores if you so desire.

I wouldn't be surprised if they had multi-purpose execution units with higher INT4/8 throughput. Keep in mind that Intel was pushing really hard for low-precision ML (going so far as trying single binary values). So maybe that's where Raja got his peta-op number from. It's his Twitter after all, not some official marketing campaign.

 

If they come up with special tensor-like hardware, it'd be pretty cool, but kinda useless for now IMO: most of the market is dominated by Nvidia with CUDA, and migrating to Intel's stack wouldn't be an easy task.

 

Quote

Also, I'm not really sure this thread is still the correct place to have this discussion .__.

It probably isn't, but it's hard to have this kind of discussion on this forum anyway, since it's mostly gaming/consumer focused, and I do enjoy going a bit more technical (thanks for that!).



15 hours ago, igormp said:

Finding actual die shots that explain which part is left for the media engines is really hard; I couldn't find any from Nvidia or Intel, but did manage to find one from AMD (from WikiChip)

I also just thought that the engine was huge, but this is just a ~200mm^2 chip, so I guess that's fine. Thank you for finding and posting it!

15 hours ago, igormp said:

Intel is probably still going to include media engines in their discrete GPUs

Definitely. My main point about this was also just that Intel was proud of being able to include this functionality in their server chips. They will in no way want to put anything powerful (i.e. big) onto them, so they would need Xe to be able to function at something like 1/8th of a tile, which only works if the architecture allows being cut down like that. Well, or they just spend a little effort adapting it; they are Intel after all, they have the engineers to take care of it.

 

15 hours ago, igormp said:

I wouldn't be surprised if they had multi-purpose execution units with higher INT4/8 throughput

So you are saying that the units in what GPU manufacturers usually like to call 'cores' would actually be able to have like 10x or 20x the INT8/4 performance compared to INT32? Or that they'll just include an INT8 unit as well?

Well, if they have some dedicated blocks, those would still be able to do any INT calculation, not just machine learning tasks. So they would still be multi-purpose.

 

15 hours ago, igormp said:

market is dominated by Nvidia with CUDA

There is a translation layer that even Nvidia promoted. It's called CU2CL, and in a paper from 2011 it already worked pretty well.

(Nvidia's slides from 2011: https://www.nvidia.com/content/PDF/GDC2011/Wu_Feng_SC11.pdf

Paper about it: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.221.7023&rep=rep1&type=pdf )

 

So it wouldn't be that hard to at least port things over. They would still need to invest time into optimizing things, but that is basically always true.
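To give a feel for how mechanical most of that translation is, here is a tiny CUDA example with comments sketching the kind of rough OpenCL mapping a source-to-source tool like CU2CL performs (illustrative only, not CU2CL's actual output):

```
#include <cstdio>
#include <cuda_runtime.h>

// Rough OpenCL equivalents of the CUDA constructs below (illustrative):
//   __global__ void saxpy(...)            -> __kernel void saxpy(...)
//   blockIdx.x * blockDim.x + threadIdx.x -> get_global_id(0)
//   cudaMalloc / cudaMemcpy               -> clCreateBuffer / clEnqueueWriteBuffer
//   saxpy<<<blocks, threads>>>(...)       -> clSetKernelArg + clEnqueueNDRangeKernel
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];  // the kernel body survives translation almost verbatim
}

int main() {
    const int n = 1024;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
    cudaDeviceSynchronize();
    printf("y[0] = %.1f\n", y[0]);  // 3*1 + 2 = 5.0
    return 0;
}
```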

 

16 hours ago, igormp said:

It probably isn't, but it's hard to have this kind of discussion on this forum anyway, since it's mostly gaming/consumer focused, and I do enjoy going a bit more technical (thanks for that!).

I also really enjoy conversations like this; thank you very much as well. If you want, we can continue this discussion via PM, but it is kinda over now anyway... If you ever find something interesting and want to discuss it, feel free to message me!

