The End of CPU Advancement on our Doorstep (Moore's Law and the 7nm Barrier) Discussion

3 minutes ago, AncientNerd said:

The thing is, we tried something like this with "RISC" processors back in the late 1980s and early 1990s. RISC was supposed to take over computing and change everything. Well, it turns out that it is really, really (and I mean really) hard to optimize code for a reduced instruction set without slowing things down dramatically. Orders-of-magnitude dramatically, as in losing 1,000 or 10,000 times the speed. There is a reason microcode is a specialized field: it's d@mn hard to do, let alone do right. Trying to build a generalized compiler for microcode is even harder. I was on a couple of teams in the early 1990s that looked into this problem, and after much spending and thrashing we determined that it was not a good place to go (it also included the companies going under while we were burning $$$ trying to solve this type of problem).

The way I understand things, there were several other attempts at moving toward what was, on paper, a more efficient style of architecture, and they all ended in varying degrees of success (or failure, depending on your disposition). But with the way things are going, we may have no other choice: either we keep all this baggage from the '70s (which has resulted in at least one major security flaw that required a hardware fix), or we tell ourselves to nut up and dump it.

 

Of course, I wouldn't be asking critical systems to make the jump, but something has to give.


1 minute ago, M.Yurizaki said:

I think before we start looking into exotic methods of building CPUs, we should look at the current state of them and the software that runs on them.

 

For example, what if we got rid of x86 entirely and compiled directly to microcode? Yes, I know that sounds like a pipe dream, but there is some tax involved in decoding x86 instructions into micro-ops. And then what if we got rid of legacy features? That would eliminate a lot of conditional checks at the hardware level and perhaps even vastly simplify the processor architecture (for every conditional you have, you at least double the number of outcomes).

There are quite a few problems with that approach though.

For starters, microcode isn't meant to have any kind of syntax: it's basically just control signals.

You can't really feed a processor raw microcode from the outside; you need some form of instruction.

Backwards compatibility aside, microcode isn't just ISA-specific, it's µarch-specific, meaning something compiled for Skylake would ONLY work on Skylake. You'd need a version of every piece of software for every microarchitecture out there, forward and back. Software might not work simply because the dev forgot to update it for the latest gen.

And besides, the x86 decoder isn't that expensive. It's only one pipeline stage, and most instructions come out of the µop cache anyway. Its area is negligible too.

What I'm more concerned about is the performance cost of an abstraction layer designed to convert to said microcode. x86 compilers for languages like C and C++ are extremely advanced, but having to translate from source straight to microcode would likely incur a penalty far greater than the cost of decoding instructions.

A similar argument can be made for legacy instructions. They likely don't take up much area, and they likely share execution units with other instructions through microcode anyway.
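To make the µarch-specificity point concrete, here's a hedged toy model (the core names, the instruction, and the micro-op names are all invented): the same ISA-level instruction expands to different µop sequences on two hypothetical cores, so a program shipped as raw µops for one core simply means nothing on the other.

```python
# Toy model: the same ISA instruction maps to different, core-specific
# micro-op sequences. Core names, the instruction and the micro-ops are invented.
UOP_TABLES = {
    "core_gen1": {"ADD r, m": ["load_tmp", "alu_add", "writeback"]},
    "core_gen2": {"ADD r, m": ["load_alu_fused", "writeback"]},  # fused on the newer core
}

def run_isa(core, program):
    """Decode ISA instructions through the core's own table - this always works."""
    return [uop for insn in program for uop in UOP_TABLES[core][insn]]

def run_uops(core, uop_program, built_for):
    """Pretend software was shipped as raw micro-ops targeting one specific core."""
    if core != built_for:
        raise RuntimeError(f"µops built for {built_for} mean nothing on {core}")
    return uop_program

isa_prog = ["ADD r, m"]
print(run_isa("core_gen1", isa_prog))   # ['load_tmp', 'alu_add', 'writeback']
print(run_isa("core_gen2", isa_prog))   # ['load_alu_fused', 'writeback']

uops = run_isa("core_gen1", isa_prog)
print(run_uops("core_gen1", uops, built_for="core_gen1"))   # fine
try:
    run_uops("core_gen2", uops, built_for="core_gen1")      # wrong µarch
except RuntimeError as e:
    print("error:", e)
```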

 



1 minute ago, gabrielcarvfer said:

The dude was talking about x92 and x128 instruction sets; that's why bits matter. ._.

You can just add more instructions to the ISA while keeping the x86/AMD64 instruction sets. You basically change the frontend of the processor to include them and map them onto the current µops, or change the backend to include hardware support.

Extra registers might be worth it depending on the workload, the speculation depth, etc.

I don't remember exactly how many registers the POWER9 µarch has, but it's a beast, at least for servers.

Well, that would need to be in the form of an extension similar to x86-64, which would allow architectural registers to be added. But then again, the argument for a 96- or 128-bit ISA (or extension) is kind of dubious, especially considering the potential performance penalties and the existence of SSE/AVX/AVX2.
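As a side note on why wider general-purpose registers rarely pay off for plain integer work: 128-bit arithmetic already compiles down to pairs of 64-bit operations plus a carry (the ADD/ADC pattern on x86-64). A toy sketch of that, purely illustrative:

```python
MASK64 = (1 << 64) - 1

def add128(a_lo, a_hi, b_lo, b_hi):
    """Add two 128-bit values held as (low, high) 64-bit halves.

    Mirrors what an x86-64 compiler emits: ADD for the low half,
    ADC (add-with-carry) for the high half.
    """
    lo_full = a_lo + b_lo
    lo = lo_full & MASK64
    carry = lo_full >> 64                      # 0 or 1
    hi = (a_hi + b_hi + carry) & MASK64
    return lo, hi

# Example: (2^64 - 1) + 1 carries into the high half.
print(add128(MASK64, 0, 1, 0))                 # -> (0, 1)
```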



13 minutes ago, M.Yurizaki said:

The way I understand things, there were several other attempts at moving toward what was, on paper, a more efficient style of architecture, and they all ended in varying degrees of success (or failure, depending on your disposition). But with the way things are going, we may have no other choice: either we keep all this baggage from the '70s (which has resulted in at least one major security flaw that required a hardware fix), or we tell ourselves to nut up and dump it.

 

Of course, I wouldn't be asking critical systems to make the jump, but something has to give.

Yes, the thing is that the "degrees of success" were always in an academic environment; I was involved in a couple of the attempts to translate this into real products. What I can tell you is that it works great in theory. In practice, with real-world code bases, it is much harder to make things actually work the way they do in a research-paper environment. I contributed to some of the papers, then tried to turn them into real results, and the results were... underwhelming at best.

 

Now, compiler construction has come a long way in the last 30 years, but the direction it has been going is not the direction you are talking about... kind of exactly the opposite, actually. So there may be some advantage to dropping the complexity of the instruction set, and there may also be some advantage to additional compiler optimization - but frankly, the compiler optimization we have now is really d@mn good. I used to be able to write better low-level code than the compiler - I don't even bother to try anymore.

 

Yes, having developers write more efficient code would be good, having compilers produce more efficient code would be good, and having languages that promote more efficient coding would be good... but all of that is a diffuse effect. One or two companies can't make as much impact as, say, Intel, AMD, Broadcom, Qualcomm, Samsung, or Nvidia getting a 20% increase in performance out of a processor. It would take many companies all being willing to spend the $$$ to actually optimize their coding practices to fix the software side of the problem.


1 minute ago, AncientNerd said:

-snip-

One issue I see is that you're taking existing software, designed and built around the existing architecture of the time, rather than trying to build something natively for the new system. And every time I see someone create a processor that tries to do things more efficiently on paper (like Itanium or the Transmeta Crusoe), they tried to run existing software not built for said architecture, and everyone simply poo-poo'd them for it.

 

And while one could blame the compiler for not optimizing well enough to avoid those performance issues, I don't believe this approach is a complete dead end. NVIDIA's Project Denver was a good effort and showed promise, but nobody seemed interested enough for NVIDIA to continue work on it; I haven't heard anything about it since the Nexus 9's release. Then there are Russia's Elbrus processors, but being Russian, there doesn't appear to be a lot of analysis on them (in English, anyway).

 

Either way, yes, checking how a radically new processor runs existing software is important for seeing how painful the transition would be. But testing only existing software, and not also natively built software that provides the same functions, isn't productive.


1 hour ago, M.Yurizaki said:

For example, what if we got rid of x86 entirely and compiled directly to microcode? Yes, I know that sounds like a pipe dream, but there is some tax involved in decoding x86 instructions into micro-ops. And then what if we got rid of legacy features? That would eliminate a lot of conditional checks at the hardware level and perhaps even vastly simplify the processor architecture (for every conditional you have, you at least double the number of outcomes).

We don't allow programmers to write microcode because it makes the processor very vulnerable. You could easily write an instruction that physically damages the processor (say, turning on all the bus connections, both inputs and outputs, at the same time). Now, that example can be overcome easily enough, but consider that the ability to write any arbitrary instruction also voids privilege levels. Think about how privilege levels are enforced: blacklisting opcodes. This works because the opcode carries information about what the instruction does, and that instruction will only ever do that one thing. If you allow execution of arbitrary instructions, you cannot have this type of enforcement.

Another problem is that, at least for x86, control words can be very long.

I have been thinking about an ISA-agnostic processor lately, though. It has the simplest control scheme and control matrix ever: all the control matrix does is shift bytes out into control-word registers, and all an instruction does is tell you how long it is and how many operands it has, and then start encoding control words directly.
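A minimal sketch of how I picture that decode loop (the format and field sizes are invented for illustration): a tiny header gives the instruction length and operand count, and the remaining bytes are shifted straight into control-word registers.

```python
def decode(stream):
    """Toy decoder for a hypothetical length-prefixed, ISA-agnostic format.

    Byte 0: total instruction length in bytes.
    Byte 1: number of operands.
    Remaining bytes: raw control words, shifted out unchanged.
    """
    i = 0
    while i < len(stream):
        length = stream[i]
        operands = stream[i + 1]
        control_words = stream[i + 2 : i + length]   # passed straight to the control registers
        yield operands, control_words
        i += length

# Two fake instructions: 5 bytes / 2 operands, then 3 bytes / 0 operands.
program = bytes([5, 2, 0xA1, 0xB2, 0xC3,
                 3, 0, 0xFF])
for ops, cw in decode(program):
    print(ops, cw.hex())
```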



25 minutes ago, M.Yurizaki said:

One issue I see is that you're taking existing software, designed and built around the existing architecture of the time, rather than trying to build something natively for the new system. And every time I see someone create a processor that tries to do things more efficiently on paper (like Itanium or the Transmeta Crusoe), they tried to run existing software not built for said architecture, and everyone simply poo-poo'd them for it.

 

And while one could blame the compiler for not optimizing well enough to avoid those performance issues, I don't believe this approach is a complete dead end. NVIDIA's Project Denver was a good effort and showed promise, but nobody seemed interested enough for NVIDIA to continue work on it; I haven't heard anything about it since the Nexus 9's release. Then there are Russia's Elbrus processors, but being Russian, there doesn't appear to be a lot of analysis on them (in English, anyway).

 

Either way, yes, checking how a radically new processor runs existing software is important for seeing how painful the transition would be. But testing only existing software, and not also natively built software that provides the same functions, isn't productive.

The problem with looking at natively built "all new software" is that it is completely unrealistic to expect any change in the existing software baseline for multiple years to come. Basically, a whiz-bang new processor that can't run existing software at all is dead in the water from the start. The fact that COBOL, FORTRAN, and Ada compilers still exist for Linux and Windows Server proves that. If you can't provide at least compile-level compatibility with similar or better performance than the existing platform, there is a 0% chance that any business will change to the shiny new processor. There is a reason that x86 servers took until the early 2000s to really take off: they existed alongside minicomputers and mainframes through the '80s and '90s, making only minor inroads until the very tail end of the '90s, because they couldn't provide the performance and stability to run "real businesses." They were fine for small, 500-ish-person enterprises, but at a real Fortune 1000 or Fortune 100 company they didn't stand a chance above the engineer level.

 

Changing from x64 will run into the same pushback today unless you can prove that the new platform runs as fast and as stably, with at the very least compile-level compatibility - but in most cases they will want run-level compatibility, i.e., it had better run the x64 instruction set somehow.


3 minutes ago, straight_stewie said:

Now, that example can be overcome easily enough, but consider that the ability to write any arbitrary instruction also voids privilege levels. Think about how privilege levels are enforced: blacklisting opcodes. This works because the opcode carries information about what the instruction does, and that instruction will only ever do that one thing. If you allow execution of arbitrary instructions, you cannot have this type of enforcement.

But what do these instructions do that makes them more privileged than others?


4 minutes ago, AncientNerd said:

The problem with looking at natively built "all new software" is that it is completely unrealistic to expect any change in the existing software baseline for multiple years to come. Basically, a whiz-bang new processor that can't run existing software at all is dead in the water from the start. The fact that COBOL, FORTRAN, and Ada compilers still exist for Linux and Windows Server proves that. If you can't provide at least compile-level compatibility with similar or better performance than the existing platform, there is a 0% chance that any business will change to the shiny new processor. There is a reason that x86 servers took until the early 2000s to really take off: they existed alongside minicomputers and mainframes through the '80s and '90s, making only minor inroads until the very tail end of the '90s, because they couldn't provide the performance and stability to run "real businesses." They were fine for small, 500-ish-person enterprises, but at a real Fortune 1000 or Fortune 100 company they didn't stand a chance above the engineer level.

They will if those old processors are taken out of production and support for them is dropped. Eventually the old machines will fail, and those companies will be forced to buy new products. Obviously there is a cost to introducing new designs, but that's what migration strategies are for.

@M.Yurizaki They modify processor state that user-space programs are not allowed to modify.

Consider what happens if we allow user programs to address physical memory directly, instead of forcing them to go through their virtual address space.

For a more advanced example, consider what would happen if we let users have complete access to the interrupt vector table.
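A toy model of that enforcement, following the "blacklist privileged opcodes" description above (the opcode values are loosely based on x86 but simplified, and real hardware does the check with current-privilege-level logic rather than a lookup table):

```python
# Toy model of opcode-based privilege enforcement. Opcodes and mnemonics
# are simplified stand-ins, not a faithful x86 decoder.
PRIVILEGED = {
    0x0F01: "LIDT",    # load the interrupt descriptor/vector table base
    0x0F22: "MOV CR",  # write a control register (e.g. paging setup)
    0xF4:   "HLT",     # halt the core
}

class GeneralProtectionFault(Exception):
    pass

def execute(opcode, user_mode):
    """Refuse privileged opcodes in user mode. A fixed opcode->behaviour
    mapping is what makes this check possible at all."""
    if user_mode and opcode in PRIVILEGED:
        raise GeneralProtectionFault(f"{PRIVILEGED[opcode]} not allowed in ring 3")
    return f"executed {PRIVILEGED.get(opcode, hex(opcode))}"

print(execute(0x90, user_mode=True))         # ordinary instruction: fine
try:
    execute(0x0F01, user_mode=True)          # privileged: faults
except GeneralProtectionFault as e:
    print("fault:", e)
```

If software could ship arbitrary control words instead of opcodes, there would be no fixed opcode-to-behaviour mapping left for a check like this to hang off.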



3 minutes ago, straight_stewie said:

They will if those old processors are taken out of production and support for them is dropped. Eventually the old machines will fail, and those companies will be forced to buy new products. Obviously there is a cost to introducing new designs, but that's what migration strategies are for.

Yeah, no... the companies that buy multiple millions of dollars' worth of server equipment have a really big influence on how long chips stay in production. I worked for a Fortune 500 for about 3 years, and when my boss heard one of the chips we used was going end-of-life, he called his boss, who called his boss... It's now 13 years later and it is finally, really going end-of-life... surprise: the company I used to work for stopped producing the equipment that used that chip.

 

When you order enough of something, the supplier keeps the line open for you. If GE told Intel that they would switch to AMD if Intel stopped producing x64 Xeons, you can bet whatever you want that Intel would ask very nicely, "Which Xeon processors do you want us to keep producing, and how many do you want?" (FYI, GE uses processors in their medical equipment; the last time I talked to some of the people I know there, it was multiple hundreds of thousands per year of mid- to high-end Xeons for their processing, plus memory and disk storage - and these are not counted as "computers," they are "just" smart medical equipment.)


11 minutes ago, M.Yurizaki said:

But what do these instructions do that makes them more privileged than others?

They modify actual voltages and currents inside the processor and on the bus. Remember, microcode is not really computer instructions; it's machine logic at the hardware level.

 

If you have ever programmed in ladder logic, it's at that level - below assembly language. Each assembly-language instruction is between 3 and 10 microcode steps, or even more, doing things like "send a 1.2 V pulse down pin 4 now" - and if you send a 1.7 V pulse down pin 6, you burn things up, and there are control sequences that let you do exactly that.
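To make the "one assembly instruction, several microcode steps" point concrete, here's a hedged toy microprogram for a register-plus-memory add on an invented machine. The signal names, step count, and machine are made up; only the flavour is meant to be realistic.

```python
# Hypothetical microprogram: each step is the set of control signals asserted
# for one cycle. The signals and the machine itself are invented.
ADD_MEM_MICROPROGRAM = [
    {"mar_load": 1, "src": "reg_b"},                        # put the operand address on the memory address register
    {"mem_read": 1, "mdr_load": 1},                         # read memory into the memory data register
    {"alu_a": "reg_a", "alu_b": "mdr", "alu_op": "add"},    # drive both ALU inputs, select ADD
    {"reg_a_load": 1, "src": "alu_out", "flags_load": 1},   # latch the result and condition flags
]

def run(microprogram):
    for step, signals in enumerate(microprogram):
        # A real sequencer would drive these wires; here we just print them.
        print(f"cycle {step}: " + ", ".join(f"{k}={v}" for k, v in signals.items()))

run(ADD_MEM_MICROPROGRAM)   # one ISA-level ADD -> four control-word cycles
```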


1 minute ago, AncientNerd said:

Yea, not...the companies that buy multiple millions of dollars worth of server equipment have a really big influence on how long chips stay in production. I worked for an Fortune 500 for about 3 years and my boss heard one of the chips we used was going end of life, he called his boss, who called his boss...it's now 13 years later and it is finally really going end of life...surprise, the company I used to work for stopped producing the equipment that used that chip. 

So then we can never advance. x86 and x64 are the current state-of-the-art ISAs (that's laughable), and they're in use so widely that we can never replace them. Ergo, we cannot advance technology much further than it already has, and looking at new ISAs for general-purpose computing is an exercise in futility.

That's a mighty nihilistic view of the state of things, isn't it?

You never gave any thought to the idea that it is highly possible that some completely new architecture could be capable of computing so efficiently that it could run an x86 virtual machine faster than current processors can natively run that instruction set, did you? Again, this is what migration strategies are for.

Beyond that, the licensing agreement between AMD and Intel has to be re-upped every so often, and it actually places a lot of restrictions on what AMD can do (and on what can be done to AMD). It's not too hard to conceive of a situation in which Intel regains full control over that market.

ENCRYPTION IS NOT A CRIME


1 minute ago, straight_stewie said:

So then we can never advance. x86 and x64 are the current state-of-the-art ISAs (that's laughable), and they're in use so widely that we can never replace them. Ergo, we cannot advance technology much further than it already has, and looking at new ISAs for general-purpose computing is an exercise in futility.

That's a mighty nihilistic view of the state of things, isn't it?

You never gave any thought to the idea that it is highly possible that some completely new architecture could be capable of computing so efficiently that it could run an x86 virtual machine faster than current processors can natively run that instruction set, did you? Again, this is what migration strategies are for.

Beyond that, the licensing agreement between AMD and Intel has to be re-upped every so often, and it actually places a lot of restrictions on what AMD can do (and on what can be done to AMD). It's not too hard to conceive of a situation in which Intel regains full control over that market.

No, what I am saying is that whatever new architecture happens, it needs some way to support x64 - it could be a subsystem, it could be emulation, it could be a VM - but it has to exist for a period of time. Heck, there are still (as of 2015) new versions of OS/360's descendants coming out that support emulation of the hardware that existed when the OS first came out in 1964, and they run on current hardware. So no, I am not saying the underlying hardware can't change; I am saying it needs to keep supporting the large customers - and because everyone wants to make money, either it will, or they will keep producing the old hardware until it can. Just like IBM kept producing hardware that could run the OS/360 variants until the mid-1990s, when commodity hardware became capable of running emulation that would support their existing customer base.


inb4 we have PCIe slots that we slot CPUs into

"If a Lobster is a fish because it moves by jumping, then a kangaroo is a bird" - Admiral Paulo de Castro Moreira da Silva

"There is nothing more difficult than fixing something that isn't all the way broken yet." - Author Unknown

Spoiler

Intel Core i7-3960X @ 4.6 GHz - Asus P9X79WS/IPMI - 12GB DDR3-1600 quad-channel - EVGA GTX 1080ti SC - Fractal Design Define R5 - 500GB Crucial MX200 - NH-D15 - Logitech G710+ - Mionix Naos 7000 - Sennheiser PC350 w/Topping VX-1


3 hours ago, gabrielcarvfer said:

The dude was talking about x92 and x128 instruction sets; that's why bits matter. ._.

You can just add more instructions to the ISA while keeping the x86/AMD64 instruction sets. You basically change the frontend of the processor to include them and map them onto the current µops, or change the backend to include hardware support.

Extra registers might be worth it depending on the workload, the speculation depth, etc.

I don't remember exactly how many registers the POWER9 µarch has, but it's a beast, at least for servers.

AFAIK this still takes up space on the silicon, hence no real speed benefit. It changes which types of calculations you can do quicker or slower; it does not make all calculations quicker or slower. A bit like a lever: it gives either distance or strength to the movement.

inb4 we have PCIe slots that we slot CPUs into

A microcontroller is a mini computer... everything has them these days. Your PC isn't actually one "computer" any more; it's a collection of computers!

Besides: https://en.wikipedia.org/wiki/Transputer and http://www.commell.com.tw/Product/SBC/HS-771.HTM


23 hours ago, TechyBen said:

A microcontroller is a mini computer... everything has them these days. Your PC isn't actually one "computer" any more; it's a collection of computers!

-SNIP-

Ain't that the truth. Everything electronic is basically a computer these days. Even the old boomboxes with CD players from the '90s were computers.

 

Computers today basically run the entire planet.



21 minutes ago, WallacEngineering said:

Ain't that the truth. Everything electronic is basically a computer these days. Even the old boomboxes with CD players from the '90s were computers.

 

Computers today basically run the entire planet.

You aren't kidding. I spent a good part of the '90s programming controls for conveyors, and now there is more processing power in my dishwasher than what I used to run a factory with...


15 minutes ago, AncientNerd said:

You aren't kidding. I spent a good part of the '90s programming controls for conveyors, and now there is more processing power in my dishwasher than what I used to run a factory with...

Lol



Optical computing will likely be the future of general consumer computing. Surprised nobody mentioned optical computing yet.


1 hour ago, maleko48 said:

Optical computing will likely be the future of general consumer computing. Surprised nobody mentioned optical computing yet.

To be honest, this is the first time I'm even hearing the term.



On 3/16/2018 at 7:45 AM, AncientNerd said:

This... basically, before the last breakthrough that dropped the scale of individual components, there was a lot of interest in switching completely to GaAs as a substrate. As you state, there are advantages and disadvantages to both; however, if we do hit the limits of Si, there may be a switch to GaAs in general rather than just in the high-frequency parts it is used in now. That will raise costs, though, as GaAs is significantly more expensive to process, has traditionally had much lower yields than Si, and is a more expensive material to actually make dies out of... all of which adds up to increased chip cost if there is a general switch to GaAs, unless one or more of those factors change.

If you think the cost of silicon is really an important factor in the cost of a CPU, you are mistaken. There are hundreds of process steps, and the cost of the substrate doesn't really affect the price of the end product. The overall yield, well, that's another story.

 

GaAs would just be a first step.  InP actually has better electrical properties for really high frequency operation.

 

Keep in mind that both GaAs and InP are direct-bandgap semiconductors - they are capable of efficient light emission. They would allow some amount of optical communication between different portions of the CPU without needing to use copper conductors that follow a tortuous path. Even if they are thin, electrical conductors in a CPU take up room. Eliminate some portion and you get a die shrink at the same process node. Your signals can even pass through each other without interference - try that with copper!

 

Thin electrical conductors also have a relatively high resistance, which means you lose a lot of energy sending signals along them (that ends up as heat). Light generation in GaAs materials can be ~80% efficient (electrical power into optical power), and there is no difference in heat generation whether the signal travels 1 mm or 1 m.

 

You won't be able to convert all the signaling/communication to optical, but it would make a significant difference. The data buses on your mobo could also convert to optical signalling, which frees up board space - mobos get smaller at the same time. A single fiber is capable of >40 Gb/s, well beyond the ~16 Gb/s of a single PCIe 4.0 lane. 24 fiber channels give you the capability for ~1 Tb/s!
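For a rough sanity check on those numbers (the PCIe 4.0 line rate and 128b/130b encoding are published figures; the 40 Gb/s per fiber is the assumption above):

```python
# Back-of-the-envelope bandwidth comparison (all figures approximate).
PCIE4_GT_PER_LANE = 16            # GT/s raw line rate per PCIe 4.0 lane
PCIE4_EFFICIENCY = 128 / 130      # 128b/130b line-encoding overhead
FIBER_GBPS = 40                   # assumed per-fiber data rate from the post

pcie4_lane = PCIE4_GT_PER_LANE * PCIE4_EFFICIENCY   # ~15.75 Gb/s per lane
pcie4_x16 = 16 * pcie4_lane                         # ~252 Gb/s for an x16 slot
fiber_bundle = 24 * FIBER_GBPS                      # ~960 Gb/s, i.e. ~1 Tb/s

print(f"PCIe 4.0 x1 : {pcie4_lane:6.1f} Gb/s")
print(f"PCIe 4.0 x16: {pcie4_x16:6.1f} Gb/s")
print(f"24 fibers   : {fiber_bundle:6.1f} Gb/s")
```

So one fiber beats a single PCIe 4.0 lane, and a 24-fiber bundle is roughly four times an x16 slot.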

 

So GaAs and/or InP offer the potential for higher clock speeds, smaller dies, less heat generation, and higher data transfer rates. Really not a bad combination.


2 hours ago, maleko48 said:

Optical computing will likely be the future of general consumer computing. Surprised nobody mentioned optical computing yet.

 

47 minutes ago, Donut417 said:

To be honest, this is the first time I'm even hearing the term.

No idea here either. Optical could possibly mean fiber-optic cabling, or eyesight, so probably cloud computing? We did discuss this, if that's what you mean @maleko48



On 3/18/2018 at 6:05 PM, Sierra Fox said:

I don't get the whole nm aspect of CPUs, so I'll ask: why is the 7nm "barrier" such an issue? When we hit 7nm, why don't we just start making CPUs bigger, or use multiple dies like Threadripper, etc.? Or does it not work like that?

At 7nm, quantum effects start to creep in and your charge carriers (electrons and holes) start to act like waves. 7nm is REALLY small - about 20 atoms! Look up quantum tunneling. The little buggers won't stay where they are supposed to.

 

Your conductor resistance starts to go up dramatically as well. It becomes increasingly easy to burn out your conductor traces.

 

Your sensitivity to soft errors (bit flips due to background radiation) starts to go up as well. There is always some natural background radiation, both in the materials used to make the chip and from cosmic rays. Digital ran into problems with this when they were developing their Alpha processor back in the early 1990s. They characterized the natural radioactive impurities in lead deposits all over the world to identify the mines that produced the "oldest" lead (lead is the end product of radioactive decay for elements with an atomic number >82; as lead deposits age, more of the natural radioactive impurities have had a chance to decay into lead).
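As a rough check on the "about 20 atoms" figure, using the standard silicon lattice constant and Si-Si bond length (the geometry here is deliberately simplified):

```python
# Rough scale check: how many silicon atoms fit across a 7 nm feature?
FEATURE_NM = 7.0
SI_LATTICE_NM = 0.543          # silicon lattice constant
SI_BOND_NM = 0.235             # Si-Si nearest-neighbour bond length

unit_cells = FEATURE_NM / SI_LATTICE_NM   # ~13 unit cells across
bond_lengths = FEATURE_NM / SI_BOND_NM    # ~30 bond lengths across

print(f"~{unit_cells:.0f} unit cells, ~{bond_lengths:.0f} bond lengths across 7 nm")
# Either way the feature is only a couple of dozen atoms wide, which is why
# tunneling through channels and gate oxides becomes a real design problem.
```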


11 minutes ago, WallacEngineering said:

 

No idea here either. Optical could possibly mean fiber-optic cabling, or eyesight, so probably cloud computing? We did discuss this, if that's what you mean @maleko48

http://forum.notebookreview.com/threads/all-about-new-scientific-concept-and-futuristic-technologies-thread.813357/#post-10674941

 

Sorry, I'm kind of new here. I usually hang out over at www.notebookreview.com. Fiber-optic cabling is optical data transmission, but I mean optical (AKA photonic) computing. I think relying not on literal electrons flowing, but rather on the precise control of photons for processing, will yield a cooler-running CPU, which is a good place to start.


2 minutes ago, maleko48 said:

http://forum.notebookreview.com/threads/all-about-new-scientific-concept-and-futuristic-technologies-thread.813357/#post-10674941

 

Sorry, I'm kind of new here. I usually hang out over at www.notebookreview.com. Fiber-optic cabling is optical data transmission, but I mean optical (AKA photonic) computing. I think relying not on literal electrons flowing, but rather on the precise control of photons for processing, will yield a cooler-running CPU, which is a good place to start.

Optical computing requires materials with some very special properties. Instead of a transistor that operates based on the flow of electrons and applied voltages, you have the optical equivalent of a transistor. That takes a few different light sources to reproduce some of the effects that applied voltages give you in electrical transistors.

 

You're not running wires (conductive traces) any more - you have to precisely align your optical emitters, whatever you are using to act as a transistor, and your detectors. It was a great concept, but it lost its appeal 1) because it turns out to be really hard to do, and 2) because silicon technology advanced amazingly fast.

 

Who else remembers 16 MHz 16-bit processors? The first true 32-bit machines were considered amazing, but the clock speeds were still in the 20-30 MHz range.

