How AMD's Latest Kaveri APUs Perform With HSA.... Incredible.

It's not that I question the results (the reality is I don't know enough to do so) but I figured there would be at least one Intel fanboy pipe up or one person who really knows their shit say something. Kinda concerning that we have got to page 2 and everyone's comments are affirming.

yeah, there is a lot of that here. I guess those people are off playing titanfall lol

Motherboard - Gigabyte P67A-UD5 Processor - Intel Core i7-2600K RAM - G.Skill Ripjaws @1600 8GB Graphics Cards  - MSI and EVGA GeForce GTX 580 SLI PSU - Cooler Master Silent Pro 1,000w SSD - OCZ Vertex 3 120GB x2 HDD - WD Caviar Black 1TB Case - Corsair Obsidian 600D Audio - Asus Xonar DG


   Hail Sithis!


yeah, there is a lot of that here. I guess those people are off playing titanfall lol

Aha! So that's why: everyone with real CPUs is off playing Titanfall, and everyone else who can only afford AMD APUs is left to comment on these threads. :ph34r:

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


It's not that I question the results (the reality is I don't know enough to do so) but I figured there would be at least one Intel fanboy pipe up or one person who really knows their shit say something. Kinda concerning that we have got to page 2 and everyone's comments are affirming. 

These HSA results aren't out of the ordinary. The appropriate code is computed on the appropriate accelerator at the appropriate time, which results in both great performance and efficiency.

This is what AMD's Fusion initiative was all about: unifying the CPU & GPU architectures to improve performance.

We should start getting more mainstream applications to support HSA once OpenCL 2.0 is released since it has a direct pathway for unified memory architectures and inherently any OpenCL 2.0 program will show dramatic performance improvements with HSA compliant hardware.

There have also been numerous reports on how Microsoft is working on deep HSA integration into Windows 9, so basic OS functions will be accelerated.

And on low power devices power consumption will dramatically decline.


Now thats impressive ! 

AMD FX 8320@ Stock - Asus M5A99X Evo R2.0 - Kingston HyperX 8GB 1600Mhz - Corsair Carbide 200R - Powercolor Radeon HD 7950 PCS+OC@970Mhz core 1400Mhz memory - Corsair CS650W - Samsung 840 EVO 250GB 
LG 22EA53VQ 21.5" - CM Storm Xornet - CM Storm Quickfire TK - Creative Inspire T3130 2.1


These HSA results aren't out of the ordinary. The appropriate code is computed on the appropriate accelerator at the appropriate time, which results in both great performance and efficiency.

This is what AMD's Fusion initiative was all about: unifying the CPU & GPU architectures to improve performance.

We should start getting more mainstream applications to support HSA once OpenCL 2.0 is released since it has a direct pathway for unified memory architectures and inherently any OpenCL 2.0 program will show dramatic performance improvements with HSA compliant hardware.

There have also been numerous reports on how Microsoft is working on deep HSA integration into Windows 9, so basic OS functions will be accelerated.

And on low power devices power consumption will dramatically decline.

It's not that I question the results (the reality is I don't know enough to do so) but I figured there would be at least one Intel fanboy pipe up or one person who really knows their shit say something. Kinda concerning that we have got to page 2 and everyone's comments are affirming. 

As @TechFan@ic and I have been saying APUs are the future.

Console optimisations and how they will effect you | The difference between AMD cores and Intel cores | Memory Bus size and how it effects your VRAM usage |
How much vram do you actually need? | APUs and the future of processing | Projects: SO - here

Intel i7 5820l @ with Corsair H110 | 32GB DDR4 RAM @ 1600Mhz | XFX Radeon R9 290 @ 1.2Ghz | Corsair 600Q | Corsair TX650 | Probably too much corsair but meh should have had a Corsair SSD and RAM | 1.3TB HDD Space | Sennheiser HD598 | Beyerdynamic Custom One Pro | Blue Snowball


As @TechFan@ic and I have been saying APUs are the future.

yes of course, Intel have shown that with nearly every domestic CPU having a GPU onboard since when? 2550K? Except we still don't know exactly what AMD plans to do with the FX line.


wow, very impressive!

excited to see how this affects gameplay when games start taking advantage of HSA

I see you're using an APU as well, so how does it perform in games? Does it bottleneck?


yes of course, Intel have shown that with nearly every domestic CPU having a GPU onboard since when? 2550K? Except we still don't know exactly what AMD plans to do with the FX line.

It was long before the 2550k e.g. Intel Family Chipset?


I think it's worth noting that you would see even higher performance numbers with a dedicated CPU and a dedicated GPU.

This is great for low end builds, but don't throw out your 290s or 780s yet...


someone explain to me in simple english what HSA is?

how does it work?

what are the benefits of it?

thanks much appreciated


yes of course, Intel have shown that with nearly every domestic CPU having a GPU onboard since when? 2550K? Except we still don't know exactly what AMD plans to do with the FX line.

 

Rather, the 2550K was an anomaly - it was basically a 2500K SKU on which the onboard graphics didn't pass the binning process, so they just lasered it off.

 

someone explain to me in simple english what HSA is?

how does it work?

what are the benefits of it?

thanks much appreciated

 

Very short TLDR version: Both the GPU and the CPU can access the system memory. Additionally, programs can access the GPU and tell it to do stuff without having to ask the CPU to pass those instructions on. Means that processes that benefit hugely from parallel processing or require high amounts of floating point calculations (which is what a GPU is specifically designed to do) will see absolutely massive performance increases over non-HSA compatible hardware.
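To put rough numbers on the TLDR above, here is a toy timing model in Python. It is only a sketch: the function names and every figure in it (working-set size, bus speed, compute time) are made-up assumptions for illustration, not measurements of Kaveri or of any real GPU. It compares an offload that has to copy data to the GPU and back with one where the CPU and GPU read the same memory.

```python
def offload_time_with_copy(data_bytes, compute_s, bus_bytes_per_s):
    """Time for an offload when data is copied to the GPU and results copied back."""
    copy_s = 2 * data_bytes / bus_bytes_per_s  # one trip to the device, one trip back
    return copy_s + compute_s


def offload_time_shared(compute_s):
    """Time for an offload when CPU and GPU share the same memory (no copy step)."""
    return compute_s


# All values below are hypothetical round numbers chosen for the example.
data_bytes = 256 * 1024**2      # 256 MiB working set
compute_s = 0.010               # 10 ms of GPU compute
bus_bytes_per_s = 8 * 1024**3   # 8 GiB/s transfer rate

with_copy = offload_time_with_copy(data_bytes, compute_s, bus_bytes_per_s)
shared = offload_time_shared(compute_s)
print(f"with copies: {with_copy * 1000:.2f} ms, shared memory: {shared * 1000:.2f} ms")
```

In this model the compute time is identical in both cases; the only thing the shared-memory path removes is the copy term, which is why the gain depends so heavily on how much data a task touches relative to how much work it does.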

Intel i7 5820K (4.5 GHz) | MSI X99A MPower | 32 GB Kingston HyperX Fury 2666MHz | Asus RoG STRIX GTX 1080ti OC | Samsung 951 m.2 nVME 512GB | Crucial MX200 1000GB | Western Digital Caviar Black 2000GB | Noctua NH-D15 | Fractal Define R5 | Seasonic 860 Platinum | Logitech G910 | Sennheiser 599 | Blue Yeti | Logitech G502

 

Nikon D500 | Nikon 300mm f/4 PF  | Nikon 200-500 f/5.6 | Nikon 50mm f/1.8 | Tamron 70-210 f/4 VCII | Sigma 10-20 f/3.5 | Nikon 17-55 f/2.8 | Tamron 90mm F2.8 SP Di VC USD Macro | Neewer 750II


Can't wait til game engines start implementing HSA, perhaps through Mantle...

Totally, and Many gigs of Sysram and Vram being utilized all together.... giggity

Maximums - Asus Z97-K /w i5 4690 Bclk @106.9Mhz * x39 = 4.17Ghz, 8GB of 2600Mhz DDR3,.. Gigabyte GTX970 G1-Gaming @ 1550Mhz

 


I think it's worth noting that you would see even higher performance numbers with a dedicated CPU and a dedicated GPU.

This is great for low end builds, but don't throw out your 290s or 780s yet...

That's not true, to realize the performance gains with HSA you need a physical unified memory architecture.

You need a CPU & a GPU on the same die sharing memory, otherwise we're back where we started: copying data back and forth between the GPU & the CPU across PCIe lanes wastes energy & performance.

Virtual unified memory architectures like CUDA 6 & OpenCL don't improve performance, their purpose is to reduce development effort.

Nvidia's goal is likely to move towards a physical unified memory architecture on the Tegra line.


I see you're using an APU as well, so how does it perform in games? Does it bottleneck?

gets 60 FPS in minecraft ᕦ༼ຈل͜ຈ༽ᕤ

It's a 2.6 GHz FM1 APU, one of the first they made, so I don't think I'll be getting Mantle support ;)

AMD FX 8350 8-Core CPU (Stock 4GHz) | Gigabyte GA-990FXA-UD3 | (Rev. 4.0) | 8GB 1600MHz HyperX Fury | AMD R9 270 (MSI Gaming OC Edition) | NZXT H440 (White) | Kingston 120GB SSDnow V300 | Seagate Barracuda 1TB (7200 RPM) | Corsair RM750m 750w Fully Modular Power Supply

Dell 23.6-Inch 1080p 60hz IPS Display | Corsair H100i AIO Liquid Cooler | Logitech G710+ Mechanical Gaming Keyboard | Logitech G502 Proteus Core | iSymphony Wireless Stereo Speakers | Windows 8.1 (64-bit)


That's not true, to realize the performance gains with HSA you need a physical unified memory architecture.

As far as I know, there isn't really anything hindering a dedicated graphics card from accessing the same RAM as the CPU in the same way as HSA works with integrated GPUs (other than software support).

 

You need a CPU & a GPU on the same die sharing memory, otherwise we're back where we started: copying data back and forth between the GPU & the CPU across PCIe lanes wastes energy & performance.

We don't need to copy data back and forth between the CPU and GPU if they added support for dedicated GPUs as well. They have already said that they are planning on making HSA support things other than integrated GPUs. Quote from Ars Technica:

HSA isn't just for CPUs with integrated GPUs. In principle, the other processors that share access to system memory could be anything, such as cryptographic accelerators, or programmable hardware such as FPGAs. They might also be other CPUs, with a combined x86/ARM chip often conjectured. Kaveri will in fact embed a small ARM core for creation of secure execution environments on the CPU. Discrete GPUs could similarly use HSA to access system memory.

So let me ask you this, what kind of lost energy and performance do you think we would see from using HSA with dedicated GPUs? PCIe has more than enough bandwidth for it, and in systems with discrete graphics cards I doubt the tiny amount of extra power that would use would be a big deal.

 

Virtual unified memory architectures like CUDA 6 & OpenCL don't improve performance, their purpose is to reduce development effort.

Nvidia's goal is likely to move towards a physical unified memory architecture on the Tegra line.

No idea why you're bringing up CUDA and OpenCL. They are completely irrelevant for this conversation.

 

 

My point was that some people were bringing up dedicated GPUs as if this will compete with it. It won't. Even if they did, AMD could in theory just enable this for dedicated GPUs and they would lose that benefit. Hopefully they will do that in the future. Also, dedicated GPUs are still much much faster. This is good for budget computers, but it won't magically make an APU as powerful or more powerful than a 290X.


As far as I know, there isn't really anything hindering a dedicated graphics card from accessing the same RAM as the CPU in the same way as HSA works with integrated GPUs (other than software support).

Actually there is a fundamental reason why HSA can't work on dedicated GPUs. You can certainly make a unified memory architecture where a dedicated GPU shares memory with the CPU, but it's utterly pointless because any performance gains from such an architecture will be overshadowed by the performance degradation of having to pass pointers in and out of IO such as PCIe. Programmers have known this for a long time and have driven software development in ways that avoid it.

It's irrelevant whether the IO has enough bandwidth; the latency imposed by this sort of platform-level atomics between physically separated accelerators is extremely high.

You will get rid of one bottleneck (copying data back and forth) but you will create a new bottleneck that's just as bad: you still need to pass pointers between accelerators to maintain memory coherency, and the latency associated with the communication required between physically separated accelerators is very high.

 

We don't need to copy data back and forth between the CPU and GPU if they added support for dedicated GPUs as well. They have already said that they are planning on making HSA support things other than integrated GPUs. Quote from Ars Technica:

http://youtu.be/vxEyK32tc30?t=1h40m22s

HSA is all about optimizing for the SOC and ensuring that programs for that environment can be written much more easily and be power efficient and performant.

In doing that we noticed several things. First of all, there are parts of this architecture that are applicable to discrete GPUs and that discrete GPUs will benefit from.

For instance the discrete GPU being able to operate directly into pageable system memory with the same addresses and so forth ( similar to what the latest OpenCL & CUDA have implemented)

However there are some features where discrete GPUs will be challenged such as platform atomics and full memory coherency...

Discrete GPUs will benefit from the subset of the architecture (HSA architecture) that makes sense for those programs....

We don't expect discrete GPUs to be HSA compliant

 

No idea why you're bringing up CUDA and OpenCL. They are completely irrelevant for this conversation.

CUDA & OpenCL are definitely relevant in this discussion because they offer unified memory access between the CPU & dedicated GPUs but offer no performance gains, because again, these accelerators are physically separated.

 

The big news here – and the headlining feature for CUDA 6 – is that NVIDIA has implemented complete unified memory support within CUDA. The toolkit has possessed unified virtual addressing support since CUDA 4, allowing the disparate x86 and GPU memory pools to be addressed together in a single space. But unified virtual addressing only simplified memory management; it did not get rid of the required explicit memory copying and pinning operations necessary to bring over data to the GPU first before the GPU could work on it.
Now to be clear here, CUDA 6’s unified memory system doesn’t resolve the technical limitations that require memory copies – specifically, the limited bandwidth and latency of PCIe – rather it’s a change in who’s doing the memory management. Data still needs to be copied to the GPU to be operated upon, but whereas CUDA 5 required explicit memory operations (higher level toolkits built on top of CUDA notwithstanding) CUDA 6 offers the ability to have CUDA do it instead, freeing the programmer from the task.

http://www.anandtech.com/show/7515/nvidia-announces-cuda-6-unified-memory-for-cuda


Actually there is a fundamental reason why HSA can't work on dedicated GPUs. You can certainly make a unified memory architecture where a dedicated GPU shares memory with the CPU, but it's utterly pointless because any performance gains from such an architecture will be overshadowed by the performance degradation of having to pass pointers in and out of IO such as PCIe. Programmers have known this for a long time and have driven software development in ways that avoid it.

It's irrelevant whether the IO has enough bandwidth; the latency imposed by this sort of platform-level atomics between physically separated accelerators is extremely high.

You will get rid of one bottleneck (copying data back and forth) but you will create a new bottleneck that's just as bad: you still need to pass pointers between accelerators to maintain memory coherency, and the latency associated with the communication required between physically separated accelerators is very high.

For certain small tasks you are absolutely correct. The extra latency that the PCIe bus would add would outweigh the much faster calculation on a dedicated GPU. That is not true for all tasks though. The more complex the task is, the less impact the extra latency becomes.
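The trade-off between small and complex tasks can be sketched as a break-even calculation. This is a simplified model under stated assumptions, not a benchmark: the function names, the throughput figures and the latency value are all hypothetical, and `round_trips` stands in for how many times a task has to synchronise across the bus (the platform-atomics cost raised earlier in the thread).

```python
def integrated_time(work_units, igpu_rate):
    """Time on an on-die GPU: no bus latency, lower throughput."""
    return work_units / igpu_rate


def discrete_time(work_units, dgpu_rate, bus_latency_s, round_trips=1):
    """Time on a discrete GPU: higher throughput, plus a latency cost per round trip."""
    return round_trips * bus_latency_s + work_units / dgpu_rate


def break_even_work(igpu_rate, dgpu_rate, bus_latency_s, round_trips=1):
    """Work size above which the discrete GPU finishes first in this model."""
    if dgpu_rate <= igpu_rate:
        raise ValueError("the discrete GPU must be faster for a break-even point to exist")
    # Solve w / igpu_rate == round_trips * latency + w / dgpu_rate for w.
    overhead_s = round_trips * bus_latency_s
    return overhead_s * igpu_rate * dgpu_rate / (dgpu_rate - igpu_rate)


# Hypothetical figures: iGPU 1e9 units/s, dGPU 5e9 units/s, 10 microseconds per round trip.
threshold = break_even_work(igpu_rate=1e9, dgpu_rate=5e9, bus_latency_s=1e-5)
print(f"break-even at about {threshold:.0f} work units")
```

Both sides of the argument show up in the same formula: the threshold grows in proportion to `round_trips`, so a task that synchronises constantly needs far more work per round trip before the discrete card pays off, while a task that hands off one large job and waits crosses the threshold easily.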

I think you are too quick to dismiss using HSA for discrete graphics. AMD has already talked about extending HSA to cover more than just the embedded GPUs, so they are disagreeing with your implication that it will only be worth using for embedded GPUs. Personally I am really excited to see where AMD takes this from here.


Couldn't care less. We'll talk when all major games and apps support it; until then it's just another AMD bla bla.

And if some apps take advantage of it but it can't do sh4t for gaming, it can just die... "The future is fusion"... only in AMD's mind. There's no way we can have an FX-8350 and an R9 290X on the same die. APUs are pointless for the mid-to-high-end market, so HSA for the low-end APU market is a waste of time. Dedicated GPUs cannot be replaced for a long time.

The CPU in the APU is trash, and so is the GPU; it's just good enough for entry-level gaming and multimedia PCs. What AMD is doing now is like Intel: making CPU+GPU (APU) parts expensive for the high-end market, which is not required. At least Intel has the CPU part done right.


Couldn't care less. We'll talk when all major games and apps support it; until then it's just another AMD bla bla.

And if some apps take advantage of it but it can't do sh4t for gaming, it can just die... "The future is fusion"... only in AMD's mind. There's no way we can have an FX-8350 and an R9 290X on the same die. APUs are pointless for the mid-to-high-end market, so HSA for the low-end APU market is a waste of time. Dedicated GPUs cannot be replaced for a long time.

The CPU in the APU is trash, and so is the GPU; it's just good enough for entry-level gaming and multimedia PCs. What AMD is doing now is like Intel: making CPU+GPU (APU) parts expensive for the high-end market, which is not required. At least Intel has the CPU part done right.

 

Sorry, but HSA is not an AMD only idea.

1. No-one is saying they want to replace the dedicated GPU's.

2. Not everyone plays video games, most systems are used in the corporate world where HSA will actually be advantageous (i.e. you are just doing spread sheets anyways, the HSA improvements in Libreoffice are nice, but there is 0 reason for MS to not implement it in Office).

3. There is talk of Windows offloading some tasks to the GPU.

4. The reason the APU performs so poorly compared to the Intel equivalent is specifically that AMD is innovating and looking ahead. A GPU can do floating-point calculations an order of magnitude quicker than a CPU. When applications are updated to support this (or if Windows integration is deep enough that applications do not need to be specifically coded for it), the performance gap between an APU and its Intel counterpart will diminish (and the APU would actually be the better performer, assuming Intel does not have the hardware to support it at that time or just chooses not to).

5. Offloading things like physics and AI pathing to an on-die GPU would be the best thing to do, as CPU cores suck at this sort of thing. This means you do not need the CPU horsepower you previously did.

6. Please educate yourself, you need it.


For certain small tasks you are absolutely correct. The extra latency that the PCIe bus would add would outweigh the much faster calculation on a dedicated GPU. That is not true for all tasks though. The more complex the task is, the less impact the extra latency becomes.

I think you are too quick to dismiss using HSA for discrete graphics. AMD has already talked about extending HSA to cover more than just the embedded GPUs, so they are disagreeing with your implication that it will only be worth using for embedded GPUs. Personally I am really excited to see where AMD takes this from here.

The thing is though, if you have HSA and a dedicated GPU then the sky's the limit.


Wow, this is actually pretty damn impressive! I might have to get a Kaveri APU to mess with now, haha. I hope this lets AMD actually start competing with Intel again. The CPU market has been too stagnant the past few years.

Case: Phanteks Evolve X with ITX mount  cpu: Ryzen 3900X 4.35ghz all cores Motherboard: MSI X570 Unify gpu: EVGA 1070 SC  psu: Phanteks revolt x 1200W Memory: 64GB Kingston Hyper X oc'd to 3600mhz ssd: Sabrent Rocket 4.0 1TB ITX System CPU: 4670k  Motherboard: some cheap asus h87 Ram: 16gb corsair vengeance 1600mhz

                                                                                                                                                                                                                                                          

 

 


Couldn't care less. We'll talk when all major games and apps support it; until then it's just another AMD bla bla.

And if some apps take advantage of it but it can't do sh4t for gaming, it can just die... "The future is fusion"... only in AMD's mind. There's no way we can have an FX-8350 and an R9 290X on the same die. APUs are pointless for the mid-to-high-end market, so HSA for the low-end APU market is a waste of time. Dedicated GPUs cannot be replaced for a long time.

The CPU in the APU is trash, and so is the GPU; it's just good enough for entry-level gaming and multimedia PCs. What AMD is doing now is like Intel: making CPU+GPU (APU) parts expensive for the high-end market, which is not required. At least Intel has the CPU part done right.

 

You might not be aware of this, but not everybody buys a computer to play games on it. The "computer game" market is a pathetically small segment of the total computer hardware marketplace (albeit a very, very profitable one). And this is a technology for that comparatively huge section of the market which does nothing but crunch numbers all day and all night long which could see AMD hardware becoming the industry benchmark - and that is where the AMD processor division's main revenue is, and will remain.

 

Many computer gamers like to think that the market is designed to provide them with the best gaming experience, and that performance in industrial processing is a by-product of that R&D work, but it is actually the other way around.


gets 60 FPS in minecraft ᕦ༼ຈل͜ຈ༽ᕤ

It's a 2.6 GHz FM1 APU, one of the first they made, so I don't think I'll be getting Mantle support ;)

I'm not talking about Minecraft, I'm talking about intense games like BF4, BF3, Titanfall, Rust and all the new games coming out.


It's mind-blowing how far AMD has come in the past few years. They're the provider of chipsets for all three consoles, they have been destroying the graphics card market with options that offer exceptional price-to-performance, and now this.

Desert Storm PC | Corsair 600T | ASUS Sabertooth 990FX AM3+ | AMD FX-8350 | MSI 7950 TFIII | 16GB Corsair Vengeance 1600 | Seasonic X650W I Samsung 840 series 500GB SSD

Mobile Devices I ASUS Zenbook UX31E I Nexus 7 (2013) I Nexus 5 32GB (red)

 

