
Question about Intel Xe graphics

So I've been watching some videos about Xe graphics and am still confused about one thing. Will Xe graphics be exclusively integrated graphics? Or will there also be dedicated graphics cards for a PCIe slot?


AFAIK:

Xe will be an iGPU, and then Intel will also release a dedicated GPU (I don't know if it's under the same name).

But I could be wrong.

PC: Motherboard: ASUS B550M TUF-Plus, CPU: Ryzen 3 3100, CPU Cooler: Arctic Freezer 34, GPU: GIGABYTE WindForce GTX1650S, RAM: HyperX Fury RGB 2x8GB 3200 CL16, Case: CoolerMaster MB311L ARGB, Boot Drive: 250GB MX500, Game Drive: WD Blue 1TB 7200RPM HDD.

 

Peripherals: GK61 (Optical Gateron Red) with Mistel White/Orange keycaps, Logitech G102 (Purple), BitWit Ensemble Grey Deskpad. 

 

Audio: Logitech G432, Moondrop Starfield, Mic: Razer Siren Mini (White).

 

Phone: Pixel 3a (Purple-ish).

 

Build Log: 


6 minutes ago, BlakeHoward said:

Will Xe graphics be exclusively integrated graphics?

Xe isn't integrated at all. It's exclusively discrete graphics at the moment. It might also receive an integrated model in the future, but there is no such thing at the moment.

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


4 minutes ago, WereCatf said:

Xe isn't integrated at all. It's exclusively discrete graphics at the moment. It might also receive an integrated model in the future, but there is no such thing at the moment.

Isn't the low-power DG1 available as both a dedicated GPU and an iGPU?

CPU: i7-2600K 4751MHz 1.44V (software) --> 1.47V at the back of the socket Motherboard: Asrock Z77 Extreme4 (BCLK: 103.3MHz) CPU Cooler: Noctua NH-D15 RAM: Adata XPG 2x8GB DDR3 (XMP: 2133MHz 10-11-11-30 CR2, custom: 2203MHz 10-11-10-26 CR1 tRFC:230 tREFI:14000) GPU: Asus GTX 1070 Dual (Super Jetstream vbios, +70(2025-2088MHz)/+400(8.8Gbps)) SSD: Samsung 840 Pro 256GB (main boot drive), Transcend SSD370 128GB PSU: Seasonic X-660 80+ Gold Case: Antec P110 Silent, 5 intakes 1 exhaust Monitor: AOC G2460PF 1080p 144Hz (150Hz max w/ DP, 121Hz max w/ HDMI) TN panel Keyboard: Logitech G610 Orion (Cherry MX Blue) with SteelSeries Apex M260 keycaps Mouse: BenQ Zowie FK1

 

Model: HP Omen 17 17-an110ca CPU: i7-8750H (0.125V core & cache, 50mV SA undervolt) GPU: GTX 1060 6GB Mobile (+80/+450, 1650MHz~1750MHz 0.78V~0.85V) RAM: 8+8GB DDR4-2400 18-17-17-39 2T Storage: HP EX920 1TB PCIe x4 M.2 SSD + Crucial MX500 1TB 2.5" SATA SSD, 128GB Toshiba PCIe x2 M.2 SSD (KBG30ZMV128G) gone cooking externally, 1TB Seagate 7200RPM 2.5" HDD (ST1000LM049-2GH172) left outside Monitor: 1080p 126Hz IPS G-sync

 

Desktop benching:

Cinebench R15 Single thread:168 Multi-thread: 833 

SuperPi (v1.5 from Techpowerup, PI value output) 16K: 0.100s 1M: 8.255s 32M: 7m 45.93s


Just now, Jurrunio said:

Isn't the low-power DG1 available as both a dedicated GPU and an iGPU?

Considering that the moniker "DG" comes from the words "Discrete Graphics"...



9 minutes ago, WereCatf said:

Considering that the moniker "DG" comes from the words "Discrete Graphics"...

Nothing's stopping them from integrating the design into their CPU and giving it a different name. Just look at AMD's Vega cards: make one much smaller and cut away the memory controller, and you get what Ryzen APUs carry.



Just now, Jurrunio said:

Nothing's stopping them from integrating the design into their CPU and giving it a different name.

Yes, but they haven't done so yet. All the Xe graphics are discrete at the moment, and I am not aware of anyone at Intel having announced plans for an integrated version anytime soon.



For now, it has been designed as a scalable, tile-based architecture. If their plan succeeds, they can create anything from it, ranging from low-power integrated GPUs to high-power gaming and accelerator cards for servers.

But for now, they just exist as dedicated GPUs; at least that is all they have told us about. If they wanted to integrate them with a CPU, they might need to adapt the architecture a bit, but in the end they will probably want to merge their iGPU division with the Xe division. This only works, however, if the Xe architecture is truly scalable. The iGPU version of the Xe GPU would probably only need a portion of a single tile.

 

The fact that it is tile-based will probably also mean it won't be as good at gaming, since making something tile-based inherently introduces more latency, which is really bad for games. Unless they can write a really smart scheduler that limits the need for communication between the dies, the single-tile version will be the most interesting.


5 minutes ago, Jurrunio said:

Nothing's stopping them from integrating the design into their CPU and giving it a different name. Just look at AMD's Vega cards: make one much smaller and cut away the memory controller, and you get what Ryzen APUs carry.

They also really want a low-power, low-area GPU for their server market that is still functional. I think like 3 years ago they were really proud of the fact that some of their Xeon CPUs had iGPUs that can handle a video stream or two. For now, that is not the plan with Xe. It is supposed to excel at machine learning computation, so it is probably not even relevant for gaming. If it is just full of 'tensor cores', they will only be good for gaming when Nvidia's GameGAN takes off.

https://arxiv.org/pdf/2005.12126.pdf


16 hours ago, BlakeHoward said:

So I've been watching some videos about Xe graphics and am still confused about one thing. Will Xe graphics be exclusively integrated graphics? Or will there also be dedicated graphics cards for a PCIe slot?

Intel is trying to build an architecture that's easily scalable. There are rumors ranging from Gen12 iGPUs in Tiger Lake (Xe LP) up to big data center cards (Xe HP), so it's going to tackle both of the things you mentioned.

 

16 hours ago, adm0n said:

They also really want a low-power, low-area GPU for their server market that is still functional.

There are rumors about a wide range of products, from integrated graphics up to cards with around the same number of cores as a 2080 Ti.

Quote

I think like 3 years ago they were really proud of the fact that some of their Xeon CPUs had iGPUs that can handle a video stream or two.

That doesn't really have much to do with GPU performance, since it's fixed-function hardware. Pretty much like NVDEC, which is the same on a 1650 Super and on a Titan RTX.

Quote

For now, that is not the plan with Xe. It is supposed to excel at machine learning computation, so it is probably not even relevant for gaming. If it is just full of 'tensor cores', they will only be good for gaming when Nvidia's GameGAN takes off.

https://arxiv.org/pdf/2005.12126.pdf

Xe is meant to be a general-purpose GPU, be it for games or for HPC uses. In fact, it's believed that their main target is actually low-end gaming.

 

On another note, I haven't seen any indication of the card having any extra hardware such as tensor cores, only some rumors about ray tracing, and those are still quite small rumors. Also, GameGAN has nothing to do with actually running games. (I know that may be a joke, but many people won't understand it.)

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


46 minutes ago, igormp said:

That doesn't really have much to do with GPU performance, since it's fixed-function hardware. Pretty much like NVDEC, which is the same on a 1650 Super and on a Titan RTX.

I know, but fixed-function hardware takes up a lot of space on the silicon.

 

46 minutes ago, igormp said:

Xe is meant to be a general-purpose GPU, be it for games or for HPC uses. In fact, it's believed that their main target is actually low-end gaming.

If you follow the tweets from Raja Koduri, he says something about holding peta-ops in his hand. That is not possible without something like the tensor cores from Nvidia. And this is probably INT8 performance rather than FP32/64 or even TF32/FP16.
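As a back-of-the-envelope check (a sketch with assumed numbers, not anything Intel has confirmed), here is roughly what "a peta-op" would imply if it means packed INT8 FMA throughput:

```
#include <cstdio>

int main() {
    // All numbers below are assumptions for illustration only.
    const double target_ops   = 1e15;   // 1 peta-op per second
    const double clock_hz     = 1.5e9;  // assumed ~1.5 GHz clock
    const double ops_per_unit = 2 * 4;  // FMA (2 ops) x 4 packed INT8 lanes per 32-bit ALU

    // ~83,000 32-bit ALUs needed, vs ~4,352 on a 2080 Ti (~19x), which is
    // why denser INT8/INT4 paths (or even lower precision) seem necessary.
    printf("32-bit ALUs needed: %.0f\n", target_ops / (clock_hz * ops_per_unit));
    return 0;
}
```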

 

52 minutes ago, igormp said:

Also, GameGAN has nothing to do with actually running games.

It was a joke, but the writers of the paper actually say that porting games from one system to another would be a future use case for this. I highly doubt that what they have done is scalable enough for that, but who knows.


59 minutes ago, adm0n said:

I know, but fixed-function hardware takes up a lot of space on the silicon.

It really doesn't; that's the reason they exist. Fixed-function hardware takes less space and costs less than trying to use general-purpose units for the same function.

Quote

He says something about holding peta-ops in his hand. That is not possible without something like the tensor cores from Nvidia. And this is probably INT8 performance rather than FP32/64 or even TF32/FP16.

Agreed that this is probably INT8 or even INT4 performance, but you don't really need tensor cores for that. INT4 calculation was available way back in Kepler (4x faster than regular FP32 perf). Tensor cores added a faster way to perform FMA matrix operations with those values, and hence only provide 2 instructions to work with them (as seen here).
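For a concrete picture of the packed-integer path (as opposed to tensor cores), here is a minimal CUDA sketch using the __dp4a intrinsic, which does a 4-way INT8 dot product with 32-bit accumulation in a single instruction. It assumes a GPU where the intrinsic is exposed (compile with -arch=sm_61 or newer):

```
#include <cstdio>
#include <cuda_runtime.h>

// Each int carries four packed INT8 lanes; __dp4a multiplies the lanes
// pairwise and adds the sum to a 32-bit accumulator in one instruction.
__global__ void dot_int8(const int *a, const int *b, int n, int *out) {
    int acc = 0;
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        acc = __dp4a(a[i], b[i], acc);  // acc += a8[0]*b8[0] + ... + a8[3]*b8[3]
    atomicAdd(out, acc);
}

int main() {
    // 0x01010101 packs four INT8 ones, 0x02020202 four twos: dot = 4 * (1*2) = 8
    int ha = 0x01010101, hb = 0x02020202, hout = 0;
    int *da, *db, *dout;
    cudaMalloc(&da, sizeof(int)); cudaMalloc(&db, sizeof(int)); cudaMalloc(&dout, sizeof(int));
    cudaMemcpy(da, &ha, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(db, &hb, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dout, &hout, sizeof(int), cudaMemcpyHostToDevice);
    dot_int8<<<1, 32>>>(da, db, 1, dout);
    cudaMemcpy(&hout, dout, sizeof(int), cudaMemcpyDeviceToHost);
    printf("INT8 dot product: %d\n", hout);  // prints 8
    return 0;
}
```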



1 minute ago, igormp said:

It really doesn't; that's the reason they exist. Fixed-function hardware takes less space and costs less than trying to use general-purpose units for the same function.

Okay, I worded that very wrong. They definitely do not take up a lot of space on the silicon in comparison to doing it with GP units. But they do still take up space and are only useful for one task.

I tried to find information on exactly how much space the fixed-function block on a current Intel iGPU takes up, but sadly didn't find anything, and I don't have that much time right now to look further. So what I thought was correct might have been wrong. But having fixed-function hardware basically acts like 'negative' performance for all other tasks that don't require that hardware (of course, in the right use case it actually frees up resources), since it takes up die space. And the die space it does take up is not that insignificant.

 

But I realize that I've never seen anything about their actual dimensions, and have only read about them from people who also have no idea, describing the block as massive (probably because of its feature set and not its performance).

If I have said something wrong, I am very sorry!

 

28 minutes ago, igormp said:

Agreed that this is probably INT8 or even INT4 performance, but you don't really need tensor cores for that. INT4 calculation was available way back in Kepler (4x faster than regular FP32 perf)

Again, correct me if I'm wrong, but currently in the Turing architecture there are the same number of INT execution units as there are FP32 units, so the INT32 and FP32 performance is the same. For a 2080 Ti, that would be 13.4 TOPS (at base clock, I assume). The x4 comes from the fact that you can split an INT32 calculation into 4 INT8 calculations, which would get you to 53.6 TOPS for INT8 operations.
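A quick sanity check on that arithmetic, as a sketch using the 2080 Ti's public specs (the 13.4 figure actually falls out of the boost clock rather than the base clock):

```
#include <cstdio>

int main() {
    // RTX 2080 Ti: 4352 FP32 units (with matching INT32 units), FMA = 2 ops/clock
    const double units = 4352, ops_per_clock = 2;
    const double base_hz = 1.350e9, boost_hz = 1.545e9;

    printf("INT32/FP32 @ base:  %.2f TOPS\n", units * ops_per_clock * base_hz  / 1e12); // ~11.75
    printf("INT32/FP32 @ boost: %.2f TOPS\n", units * ops_per_clock * boost_hz / 1e12); // ~13.45
    // Splitting each 32-bit op into 4 packed INT8 lanes gives the x4:
    printf("INT8 @ boost: %.1f TOPS\n", 4 * units * ops_per_clock * boost_hz / 1e12);   // ~53.8
    return 0;
}
```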

I doubt that Intel would mismatch the int-to-float ratio in the GP units. They might even go back to what Nvidia did before Turing and have units that switch between FP and INT calculation and can't do both in a single cycle.

 

The only way they could get into POPS territory, in my opinion, is by having dedicated INT8/4 units, which I referred to as something like tensor cores, not because they also incorporate matrix multiplication, but simply because they are separate from the GP units. Call them INT cores if you so desire.

 

Still, correct me if I'm wrong. Also, I'm not really sure this thread is still the correct place to have this discussion .__.


1 hour ago, adm0n said:

Okay, I worded that very wrong. They definitely do not take up a lot of space on the silicon in comparison to doing it with GP units. But they do still take up space and are only useful for one task.

I tried to find information on exactly how much space the fixed-function block on a current Intel iGPU takes up, but sadly didn't find anything, and I don't have that much time right now to look further. So what I thought was correct might have been wrong. But having fixed-function hardware basically acts like 'negative' performance for all other tasks that don't require that hardware (of course, in the right use case it actually frees up resources), since it takes up die space. And the die space it does take up is not that insignificant.

 

But I realize that I've never seen anything about their actual dimensions, and have only read about them from people who also have no idea, describing the block as massive (probably because of its feature set and not its performance).

If I have said something wrong, I am very sorry!

Yeah, I get what you meant. A misused area ends up as wasted space when it comes to chip design, since that same space could be used for something more useful, or just not exist at all (making the chip cheaper due to the smaller area). Finding actual die shots that explain which part is left for the media engines is really hard; I couldn't find any from Nvidia or Intel, but did manage to find one from AMD (from WikiChip):

 

[Image: annotated AMD die shot from WikiChip showing the media engine block]

 

It's kinda big for that kind of chip, but represents a small area percentage when it comes to big GPUs and whatnot.

 

Anyway, I guess we kinda steered away from your main point. Intel is probably still going to include media engines in their discrete GPUs, at least for the consumer market, which they are going to attack. Not so likely in their HPC cards, though, for obvious reasons.

 

Quote

Again, correct me if I'm wrong, but currently in the Turing architecture there are the same number of INT execution units as there are FP32 units, so the INT32 and FP32 performance is the same. For a 2080 Ti, that would be 13.4 TOPS (at base clock, I assume). The x4 comes from the fact that you can split an INT32 calculation into 4 INT8 calculations, which would get you to 53.6 TOPS for INT8 operations.

Yes, indeed, that's how it works.

Quote

I doubt that Intel would mismatch the int-to-float ratio in the GP units. They might even go back to what Nvidia did before Turing and have units that switch between FP and INT calculation and can't do both in a single cycle.

 

The only way they could get into POPS territory, in my opinion, is by having dedicated INT8/4 units, which I referred to as something like tensor cores, not because they also incorporate matrix multiplication, but simply because they are separate from the GP units. Call them INT cores if you so desire.

I wouldn't be surprised if they had multi-purpose execution units with higher INT4/8 throughput. Keep in mind that Intel was pushing really hard for low-precision ML (going so far as trying single binary values). So maybe that's where Raja got his peta-op number from. It's his Twitter after all, not some official marketing campaign.

 

If they come up with special tensor-like hardware, it'd be pretty cool, but kinda useless for now IMO: most of the market is dominated by Nvidia with CUDA, and migrating to Intel's stack wouldn't be an easy task.

 

Quote

Also, I'm not really sure this thread is still the correct place to have this discussion .__.

It probably isn't, but it's hard to have this kind of discussion on this forum anyway, since it's mostly gaming/consumer focused, and I do enjoy going a bit more technical (thanks for that!).



15 hours ago, igormp said:

Finding actual die shots that explain which part is left for the media engines is really hard; I couldn't find any from Nvidia or Intel, but did manage to find one from AMD (from WikiChip)

I also just thought that the engine was huge, but this is just a ~200mm^2 chip, so I guess that's fine. Thank you for finding and posting it!

15 hours ago, igormp said:

Intel is probably still going to include media engines in their discrete GPUs

Definitely. My main point about this was also just that Intel was proud of being able to include this functionality in their server chips. They will in no way want to put anything powerful (i.e. big) onto them, so they would need Xe to be able to function at something like 1/8th of a tile, which only works if the architecture allows being cut down like that. Well, or they just spend a little effort adapting it; they are Intel after all, they have the engineers to take care of it.

 

15 hours ago, igormp said:

I wouldn't be surprised if they had multi-purpose execution units with higher INT4/8 throughput

So you are saying that the units in what GPU manufacturers usually like to call 'cores' would actually be able to have like 10x or 20x the INT8/4 performance compared to INT32? Or that they'll just include an INT8 unit as well?

Well, if they have some dedicated blocks, those would still be able to do any INT calculation, not just machine learning tasks. So they would still be multi-purpose.

 

15 hours ago, igormp said:

market is dominated by Nvidia with CUDA

There is a translation layer that even Nvidia promoted. It's called CU2CL, and in a paper from 2011 it already worked pretty well.

(Nvidia's slides from 2011: https://www.nvidia.com/content/PDF/GDC2011/Wu_Feng_SC11.pdf

Paper about it: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.221.7023&rep=rep1&type=pdf )

 

So it wouldn't be that hard to at least port things over. They would still need to invest time into optimizing things, but that is basically always true.
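To give a feel for how mechanical most of that translation is, here is a tiny CUDA example with comments sketching the kind of rough OpenCL mapping a source-to-source tool like CU2CL performs (illustrative only, not CU2CL's actual output):

```
#include <cstdio>
#include <cuda_runtime.h>

// Rough OpenCL equivalents of the CUDA constructs below (illustrative):
//   __global__ void saxpy(...)            -> __kernel void saxpy(...)
//   blockIdx.x * blockDim.x + threadIdx.x -> get_global_id(0)
//   cudaMalloc / cudaMemcpy               -> clCreateBuffer / clEnqueueWriteBuffer
//   saxpy<<<blocks, threads>>>(...)       -> clSetKernelArg + clEnqueueNDRangeKernel
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];  // the kernel body survives translation almost verbatim
}

int main() {
    const int n = 1024;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
    cudaDeviceSynchronize();
    printf("y[0] = %.1f\n", y[0]);  // 3*1 + 2 = 5.0
    return 0;
}
```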

 

16 hours ago, igormp said:

It probably isn't, but it's hard to have this kind of discussion on this forum anyway, since it's mostly gaming/consumer focused, and I do enjoy going a bit more technical (thanks for that!).

I also really enjoy conversations like this; thank you very much as well. If you want, we can continue this discussion via PM, but it is kinda over now anyway... If you ever find something interesting and want to discuss it, feel free to message me!

