Carrizo-L For Christmas, Full Carrizo For The New Year

Opcode

Sauce

 


 

Kaveri, Beema and Mullins are on their way out, to be replaced by the Excavator-based Carrizo family towards the end of the year. We can hope products will appear in time for Christmas, when the low-power Carrizo-L, rumoured to be around 12-35W TDP, will arrive. In the new year the more powerful Carrizo, speculated at 45-65W TDP, will be available. It is unclear how long the delay will be between availability to system builders and the products appearing on the market. The chips will support DDR3 and contain a GPU based on GCN 3.0, plus stacked on-package memory which will be accessible by Through Silicon Via to act as a sort of L3 cache for HSA applications.

 

Looks like the next generation of AMD APUs will be quite impressive. I personally would like to see what delta color compression will do for slow DDR3 system memory, as it offers up to 40% better memory bandwidth on Tonga.


It should really offer up to 3,000MHz DDR3 RAM... like really, 2133 isn't anywhere near enough to stop the bottlenecks.

Computing enthusiast. 
I use to be able to input a cheat code now I've got to input a credit card - Total Biscuit
 


Oh boy, when APUs get their hands on DDR4. We're all done for. RUUUUUUN!

Someone told Luke and Linus at CES 2017 to "Unban the legend known as Jerakl" and that's about all I've got going for me. (It didn't work)

 


It should really offer up to 3,000MHz DDR3 RAM... like really, 2133 isn't anywhere near enough to stop the bottlenecks.

Considering that's the effective memory clock of Intel's eDRAM for Iris Pro, I'd say the problem with DDR3 is latency. You need a huge amount of cache to keep the iGPU fed and have a very snappy TLB to make sure said eDRAM cache stays full. What TSVs provide we'll have to wait and see.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Wait, these APUs are getting an updated GCN before full desktop GPUs do?

 

And I know everyone seems to be anticipating an AMD desktop CPU update, so hopefully the Excavator cores from Carrizo can make it to FX CPUs.


I don't care about the GPU, it's the Excavator cores that really excite me. I wonder what IPC improvements they bring.

CPU: i7 2600 @ 4.2GHz  COOLING: NZXT Kraken X31 RAM: 4x2GB Corsair XMS3 @ 1600MHz MOBO: Gigabyte Z68-UD3-XP GPU: XFX R9 280X Double Dissipation SSD #1: 120GB OCZ Vertex 2  SSD #2: 240GB Corsair Force 3 HDD #1: 1TB Seagate Barracuda 7200RPM PSU: Silverstone Strider Plus 600W CASE: NZXT H230
CPU: Intel Core 2 Quad Q9550 @ 2.83GHz COOLING: Cooler Master Eclipse RAM: 4x1GB Corsair XMS2 @ 800MHz MOBO: XFX nForce 780i 3-Way SLi GPU: 2x ASUS GTX 560 DirectCU in SLi HDD #1: 1TB Seagate Barracuda 7200RPM PSU: TBA CASE: Antec 300

Doesn't matter, Zen all da things. Also, we will never get another construction-based 8-core.

I'm not a fan of their modular cores personally. I would like to see them do away with the concept.

CPU: i7 2600 @ 4.2GHz  COOLING: NZXT Kraken X31 RAM: 4x2GB Corsair XMS3 @ 1600MHz MOBO: Gigabyte Z68-UD3-XP GPU: XFX R9 280X Double Dissipation SSD #1: 120GB OCZ Vertex 2  SSD #2: 240GB Corsair Force 3 HDD #1: 1TB Seagate Barracuda 7200RPM PSU: Silverstone Strider Plus 600W CASE: NZXT H230
CPU: Intel Core 2 Quad Q9550 @ 2.83GHz COOLING: Cooler Master Eclipse RAM: 4x1GB Corsair XMS2 @ 800MHz MOBO: XFX nForce 780i 3-Way SLi GPU: 2x ASUS GTX 560 DirectCU in SLi HDD #1: 1TB Seagate Barracuda 7200RPM PSU: TBA CASE: Antec 300

Considering that's the effective memory clock of Intel's eDRAM for Iris Pro, I'd say the problem with DDR3 is latency. You need a huge amount of cache to keep the iGPU fed and have a very snappy TLB to make sure said eDRAM cache stays full. What TSVs provide we'll have to wait and see.

Uuuuhm.... GPUs are not like CPUs. They are bandwidth-limited, not latency-limited (latency matters to a certain degree too, but that is irrelevant here). They don't do constant (every or every other cycle) pulls from RAM like the CPU. They will pull their resources from VRAM into L2 (textures, instructions, etc.) and keep them there for the 1/60th of a second they are rendering the frame (in a perfect scenario where it runs at 60 FPS).

Therefore, it's much more important that there is an immense amount of bandwidth available to it, so even if it has to wait 20 VRAM cycles for the signal to come, the data will be transferred immediately.

I suggest you check out the latencies on GDDR5 and also some benchmarks of the APUs with different DDR3 configs.
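
To put rough numbers on the frame-budget argument above, here is a minimal Python sketch. The 25.6 GB/s figure (dual-channel DDR3-1600 peak) and the 800 MHz memory clock are assumptions picked for illustration, not measurements, and the helper names are made up for this example.

```python
def frame_time_ms(fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


def mb_per_frame(bandwidth_gbs: float, fps: float) -> float:
    """Upper bound on data movable per frame at a given peak bandwidth (1 GB = 1000 MB)."""
    return bandwidth_gbs * 1000.0 / fps


def wait_ns(cycles: int, clock_mhz: float) -> float:
    """Duration of a stall of `cycles` memory clock cycles, in nanoseconds."""
    return cycles / clock_mhz * 1000.0


# Assumed values: 60 FPS target, 25.6 GB/s peak, 20-cycle wait at 800 MHz.
print(f"frame budget:   {frame_time_ms(60):.2f} ms")
print(f"data per frame: {mb_per_frame(25.6, 60):.1f} MB at peak")
print(f"one 20-cycle wait: {wait_ns(20, 800):.1f} ns")
```

A single wait of that size is tiny next to a 16.67 ms frame. Whether it stays tiny depends on how many such waits happen per frame and how well they are overlapped, which this sketch does not model.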

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


I'm not a fan of their modular cores personally. I would like to see them do away with the concept.

they will, with Zen.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


Uuuuhm.... GPUs are not like CPUs. They are bandwidth-limited, not latency-limited (latency matters to a certain degree too, but that is irrelevant here). They don't do constant (every or every other cycle) pulls from RAM like the CPU. They will pull their resources from VRAM into L2 (textures, instructions, etc.) and keep them there for the 1/60th of a second they are rendering the frame (in a perfect scenario where it runs at 60 FPS).

Therefore, it's much more important that there is an immense amount of bandwidth available to it, so even if it has to wait 20 VRAM cycles for the signal to come, the data will be transferred immediately.

I suggest you check out the latencies on GDDR5 and also some benchmarks of the APUs with different DDR3 configs.

You do know bandwidth and latency are intertwined, right? The lower your latency, the more EFFECTIVE (not theoretical) bandwidth you can have.

Latency is important for the 60+fps speeds.

 

Also, CPUs don't pull every other cycle. The TLB pulls out whole blocks at a time. You're thinking very old architectures at this point.
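
One way to make the latency/effective-bandwidth link concrete is Little's law: sustained throughput cannot exceed the bytes in flight divided by the latency. The sketch below is a simplified model, not a description of any particular memory controller; the request count, 64-byte request size, 80 ns latency and 25.6 GB/s peak are all hypothetical values.

```python
def sustained_bandwidth_gbs(outstanding_requests: int,
                            bytes_per_request: int,
                            latency_ns: float,
                            peak_gbs: float) -> float:
    """Little's law bound on sustained bandwidth, capped at the theoretical peak.

    bytes / ns is numerically equal to GB/s (with 1 GB = 1e9 bytes).
    """
    in_flight_limit = outstanding_requests * bytes_per_request / latency_ns
    return min(in_flight_limit, peak_gbs)


# Hypothetical: 10 requests of 64 bytes in flight, 25.6 GB/s theoretical peak.
for latency in (100.0, 80.0, 50.0):
    bw = sustained_bandwidth_gbs(10, 64, latency, 25.6)
    print(f"latency {latency:5.1f} ns -> at most {bw:.2f} GB/s sustained")
```

With few requests in flight, lower latency directly raises the achievable bandwidth. With enough requests in flight the peak becomes the limit instead, which is roughly the situation the GPU side of this argument assumes.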

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


they will, with Zen.

Yup, finally AMD will catch up on technique via SMT.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Yay.

  ﷲ   Muslim Member  ﷲ

KennyS and ScreaM are my role models in CSGO.

CPU: i3-4130 Motherboard: Gigabyte H81M-S2PH RAM: 8GB Kingston hyperx fury HDD: WD caviar black 1TB GPU: MSI 750TI twin frozr II Case: Aerocool Xpredator X3 PSU: Corsair RM650


You do know bandwidth and latency are intertwined, right? The lower your latency, the more EFFECTIVE (not theoretical) bandwidth you can have.

Latency is important for the 60+fps speeds.

 

Also, CPUs don't pull every other cycle. The TLB pulls out whole blocks at a time. You're thinking very old architectures at this point.

Usually, RAM latencies scale slower than bandwidth, so you can get the same access times with high-bandwidth RAM, but much more raw bandwidth.

And the pulling every cycle was more of an extreme. I do know they utilise cache much better now, but they still pull many more times than a GPU does, and they pull smaller chunks; that's why latency matters more with CPUs than pure bandwidth. But with super fast RAM at a normal latency (2666 @ CL13-14 or so) you get about the same access time as with 1600 @ CL8-9, with almost double the raw bandwidth available. And since GPUs don't do many pulls, the latency can be looser and it will still waste at most a few cycles per second, because that latency isn't repeated over and over again like it is with a CPU. (Again, I know it's not every cycle, but with a bad latency, e.g. 1333 @ CL10, it adds up in waiting cycles per second quite fast.)
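
The access-time comparison above can be checked with a little arithmetic. A minimal Python sketch, assuming a 64-bit (8-byte) channel in dual-channel mode and looking only at CAS latency (other timings such as tRCD and tRP are ignored); the function names are made up for this example.

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """CAS latency in nanoseconds.

    DDR transfers twice per clock, so the I/O clock in MHz is data_rate / 2.
    """
    clock_mhz = data_rate_mts / 2.0
    return cl / clock_mhz * 1000.0


def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 2,
                       bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return data_rate_mts * 1e6 * bus_bytes * channels / 1e9


# The speed/CL pairs mentioned in the post above.
for rate, cl in [(1333, 10), (1600, 8), (1600, 9), (2666, 13), (2666, 14)]:
    print(f"DDR3-{rate} CL{cl}: {cas_latency_ns(rate, cl):6.2f} ns, "
          f"{peak_bandwidth_gbs(rate):5.1f} GB/s peak")
```

Under these assumptions DDR3-1600 CL9 works out to 11.25 ns and DDR3-2666 CL14 to about 10.5 ns, so the access times are indeed in the same range, while peak bandwidth goes up by about 1.67x.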

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


I don't care about the GPU, it's the Excavator cores that really excite me. I wonder what IPC improvements they bring.

This will be the first time their mobile APU will pack 8 CUs. That's a jump from 384 stream processors to 512. On top of that it's GCN 3.0, which offers faster tessellation performance and up to 40% better memory bandwidth. So these APUs should be very interesting for budget gaming laptops. I am personally waiting to see how well they perform with delta color compression, since it will help alleviate a tiny bit of the memory bandwidth issue we face with DDR3. Laptops that can play heavy games like BF4 at the $600 price mark are what these should bring.
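
The numbers in this post follow from simple arithmetic, sketched below in Python. Each GCN compute unit has 64 stream processors; the dual-channel DDR3-2133 configuration is an assumption for illustration, and the 40% figure is the best-case claim quoted in this thread, applied here as an upper bound rather than an expected result.

```python
SHADERS_PER_CU = 64  # one GCN compute unit = 4 SIMD units x 16 lanes


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given number of GCN compute units."""
    return compute_units * SHADERS_PER_CU


def ddr3_peak_gbs(data_rate_mts: float, channels: int = 2) -> float:
    """Theoretical peak bandwidth for 64-bit DDR3 channels, in GB/s."""
    return data_rate_mts * 8 * channels / 1000.0


def best_case_effective_gbs(peak_gbs: float, gain: float = 0.40) -> float:
    """Best-case effective bandwidth if compression saves the full quoted 'up to' figure."""
    return peak_gbs * (1.0 + gain)


print(stream_processors(6), "->", stream_processors(8), "stream processors")
peak = ddr3_peak_gbs(2133)  # assumed dual-channel DDR3-2133
print(f"peak {peak:.1f} GB/s, best case with compression {best_case_effective_gbs(peak):.1f} GB/s")
```

Real gains from color compression depend heavily on the content being rendered, so the second figure should be read as a ceiling.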


Usually, RAM latencies scale slower than bandwidth, so you can get the same access times with high-bandwidth RAM, but much more raw bandwidth.

And the pulling every cycle was more of an extreme. I do know they utilise cache much better now, but they still pull many more times than a GPU does, and they pull smaller chunks; that's why latency matters more with CPUs than pure bandwidth. But with super fast RAM at a normal latency (2666 @ CL13-14 or so) you get about the same access time as with 1600 @ CL8-9, with almost double the raw bandwidth available. And since GPUs don't do many pulls, the latency can be looser and it will still waste at most a few cycles per second, because that latency isn't repeated over and over again like it is with a CPU. (Again, I know it's not every cycle, but with a bad latency, e.g. 1333 @ CL10, it adds up in waiting cycles per second quite fast.)

This is all going out the window. AMD and Intel are both designing heterogeneous chips which use system memory, and you can only keep the engines fed if you never run out of fuel, but you need to be able to do rapid-fire small chunks too (thank you caching).

 

And about 2666 RAM, this is why we need to fight for lower CAS latencies. If you increase CL by 2 you lose bandwidth and access speed.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


This will be the first time their mobile APU will pack 8 CUs. That's a jump from 384 stream processors to 512. On top of that it's GCN 3.0, which offers faster tessellation performance and up to 40% better memory bandwidth. So these APUs should be very interesting for budget gaming laptops. I am personally waiting to see how well they perform with delta color compression, since it will help alleviate a tiny bit of the memory bandwidth issue we face with DDR3. Laptops that can play heavy games like BF4 at the $600 price mark are what these should bring.

You seem to think AMD's stacked memory is extremely cheap compared to Intel's eDRAM... The price of mobile APUs will be driven up by that inclusion, and with that, the prices of laptops. I think $600 is a bit overly optimistic.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


You seem to think AMD's stacked memory is extremely cheap compared to Intel's eDRAM... The price of mobile APUs will be driven up by that inclusion, and with that, the prices of laptops. I think $600 is a bit overly optimistic.

Stacked DRAM is cheaper than adding L3, so I couldn't care less about a few extra dollars.


Wait, I'm confused. What node is this?

Zen is supposedly 16nm FinFET, but even that is unconfirmed.

In the meantime, Intel are shipping 14nm, with 10nm going into large-scale production by the time Zen rolls around.

CPU: AMD Ryzen 7 3700X - CPU Cooler: Deepcool Castle 240EX - Motherboard: MSI B450 GAMING PRO CARBON AC

RAM: 2 x 8GB Corsair Vengeance Pro RBG 3200MHz - GPU: MSI RTX 3080 GAMING X TRIO

 


Wait, I'm confused. What node is this?

 

 

I believe Carrizo is still on GloFo's 28nm SHP (bulk) node (same as Kaveri). Great process for the GPU portion, not so much for high CPU clock speeds.


Stacked DRAM is cheaper than adding L3, so I couldn't care less about a few extra dollars.

It's about 70% the cost.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Zen is supposedly 16nm FinFET, but even that is unconfirmed.

In the meantime, Intel are shipping 14nm, with 10nm going into large-scale production by the time Zen rolls around.

14nm

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Wait, these APUs are getting an updated GCN before full desktop GPUs do?

 

And I know everyone seems to be anticipating an AMD desktop CPU update, so hopefully the Excavator cores from Carrizo can make it to FX CPUs.

From the looks of it, Excavator will be another APU-exclusive architecture, just the same as Steamroller.

 

Wait, I'm confused. What node is this?

More than likely 28nm.


14nm

Oh yeah, they signed the deal with Samsung, didn't they?

CPU: AMD Ryzen 7 3700X - CPU Cooler: Deepcool Castle 240EX - Motherboard: MSI B450 GAMING PRO CARBON AC

RAM: 2 x 8GB Corsair Vengeance Pro RBG 3200MHz - GPU: MSI RTX 3080 GAMING X TRIO

 
