vm'N

Member
  • Posts: 837
  1. Nonsense? You might not want to read too far into an analogy, as that will always bring up differences. It was simply a matter of showing that some problems can be solved with different thinking. Don't take it too literally. There is no conflict in my sources; they all make the same statement: the pipeline is 14-19 stages! You are the only one so far saying otherwise. You might want to take a look in Intel's x86 manual again. Your way of thinking is nearing the max. We have been over this once before. The implementation has a theoretical limit. x86 is already translating CISC (more accurately x86) instructions to optimize for the underlying microarchitecture, so what is so new about it? That is the "traditional" RISC pipeline (except for the decode part); CISC and almost all RISC architectures have gone beyond that. You have to dig deeper into the executable to increase performance, or you will be stuck with the same limitations you have today. Great! A deeper pipeline is not necessary; I don't see why you suddenly have an obsession with it. "just and just". There is a whole lot of optimization that can be done for the backend. Don't confuse it with "just fetching and decoding". We have yet to see anyone broadly implement things like TLS (thread-level speculation), transactional memory (except Intel's TSX), and so on. You might want to reread what I'm saying. I never said anything about a longer pipeline, and I think we have a different understanding of the functionality of the frontend. I never said anything about ideal solutions either; I'm just saying that it is possible.
  2. I doubt AMD could take up all the HBM2 supply with the dedicated GPU and professional market. Pascal will already have both GDDR5 and HBM2 versions (low- and high-end).
  3. You can call it what you want, but it doesn't change anything. We humans can't fly either, but with the "illusion" of a plane, we can. I'm calling it a solution to your problem. ILP is not near any theoretical limit, whatever you mean by that. Yes, the frontend is good enough for the backends that are in place today. If you think Intel's frontend is good enough at this point for the future, you are wrong. A bigger frontend could also help increase single-threaded performance. Sources: http://www.anandtech.com/show/6355/intels-haswell-architecture/6 https://en.wikipedia.org/wiki/Haswell_%28microarchitecture%29#Features_carried_over_from_Ivy_Bridge http://www.lighterra.com/papers/modernmicroprocessors/ That is only to mention a few of them. I doubt Intel's x86 manual on Haswell disagrees with me. This is not really that surprising.
  4. That is not an illusion. Relative to the core clock, it is 0.5 cycles. Clearly these theoretical bounds can be stretched with some different thinking. You would think you could do some funny things with data hazards when some instructions only take 0.5 clock cycles. A bigger frontend, that is where we are heading: having to read further into the executable, and predict more accurately. But they needed to be fed as if they were 4. I never suggested you needed more? I never said it was the BIGGEST problem. I also doubt we could pinpoint a single thing as the biggest problem; there was a whole lot wrong with it. NetBurst had 20-31 pipeline stages. Sandy Bridge and onward is 14-19 stages. The Haswell pipeline is a solid 14-19 stages. What kind of implementation is Intel using that can have a fixed accuracy? I know of none. Intel must have sacrificed a baby or something, as that is clearly magic.
  5. Actually, you can have instructions executed in less than 1 core cycle. This was a big point of NetBurst: having the ALUs run at 2x the core clock, so a 1-cycle-latency instruction would have a ½-core-clock latency. Now, one of the big problems with NetBurst was the inability to keep the ALUs fed, and that certain instructions had far bigger latency in comparison and could slow down the instruction stream. Also, regarding the 98% claim, that does not mean 98% in every workload. The more complex the workload, the bigger the chance of a misprediction. Intel most certainly has the most advanced branch predictor, but it is not perfect. Intel does not have the deepest (longest?) pipeline. Longer pipelines usually run at higher frequencies, like Bulldozer and NetBurst. That Intel has the widest OoOE engine is another matter. (See the cycle-count sketch after this list.)
  6. I think we are starting to get into semantics. In my first reply to you I already addressed it in a somewhat proper way. My point is that you cannot pinpoint the instruction set based alone on whether it is RISC or CISC. That is what I mean when I say RISC/CISC does not determine the instruction set. What makes you think that? Have the executable run at a different abstraction level, and have some sort of interpreter translate it into something the underlying hardware can understand. Far from easy, of course, but theoretically (in the same fashion) that is how x86 runs today. You might have to use some sort of prefix or something to differentiate between the different ISAs.
  7. I think you are missing the point. Neither CISC nor RISC determines the set of instructions the architecture can process. It says something about the instruction set, but as I said, it is not an actual instruction set as you would think of x86 or ARM. You can think of CISC and RISC as branches/categories of instruction sets that follow a certain paradigm, not an actual instruction set. @patrickjp93 did cover it thoroughly. We can't assume that such an essential step in the pipeline won't have any drawbacks in regards to low power (especially ultra-low power), where everything matters in terms of power consumption. This is an advantage; however, other things can change the outcome. Obviously it has some effect, but to what degree can quickly change depending on everything else.
  8. I can't quite find the transistor count for the Haswell IGP or the PVR GX6450. But Haswell GT2 is 87 mm² and the PVR GX6450 is 19.1 mm², so if you have the counts at hand, you are more than welcome to share them. In my search for the transistor count, I did stumble upon this: http://www.anandtech.com/show/8716/apple-a8xs-gpu-gxa6850-even-better-than-i-thought
  9. I'm not so sure about your "Intel's 22nm is more dense than TSMC 20nm" claim. Not even Intel is making that claim!
  10. No, they are not. CISC, in its original sense, means that you have complex instructions which can be decoded into internal instructions (very similar to RISC instructions); see the micro-op sketch after this list. CISC != x86 and RISC != ARM. MIPS also goes under RISC, as do many other instruction sets. RISC has a theoretical advantage at low power. Is that better?
  11. CISC and RISC are not instruction sets as you would think of x86; they are design choices. ARM has the benefit of overall much smaller, more power-efficient, and less bottleneck-prone decoders. This is the only advantage; otherwise they could be identical. If we discuss RISC and CISC, we don't include anything else. Otherwise the comparison is off, and there are too many things that can change the outcome, so you will end up with a large error margin. The development could happen anywhere; they could be using a Linux server running on ARM hardware. But that is a good point, and I'm not quite sure how much it will/can affect things.
  12. Okay, crown jewel, by putting it all at stake? Again, if you think a takeover is the only way, you're simply delusional. Sure, there are many companies that have something that interests Intel; that doesn't mean they will take over the whole thing. Nvidia was looking at x86 in the mid 2000s, but again, companies start many projects, and some are doomed to fail. If they truly wanted it, they probably could have been more aggressive in their cross-license agreement. Again, how much is it really worth to the company? Not enough, clearly. What is the difference between CISC and RISC? A stage in the pipeline? CISC is RISC internally, and RISC is becoming more CISC-like (compared to the early stages of CISC). ARM does have the benefit in ultra-low power; scale it up, and the advantage gets smaller. ARM 64-bit is much newer and doesn't have the x86 "bloat"; a more minimalist design. The only thing Intel has proved is that a lot of paperwork and a shovel-ton of money still won't make your product any more competitive. There is the whole issue with Intel's TDP actually being SDP, and Intel's low-end processors are known to overdraw their TDP (SDP) by large amounts. How big of an advantage is their compiler optimization really in those market segments? Logic? Not detected. Pure, cold, evidence-based logic? Yet you have presented zero evidence.
  13. You could shrink the images down to, say, thumbnail size, then do the comparison using perceptual hashing (see the hashing sketch after this list). More info here: http://dsp.stackexchange.com/questions/5995/what-algorithm-does-google-use-for-its-search-by-image-site
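Cycle-count sketch (for items 4 and 5): a back-of-the-envelope Python calculation of the effective latency of a double-pumped ALU (NetBurst-style, running at 2x the core clock) and the average cost a branch predictor still adds per branch. The 98% accuracy, the ~15-stage flush penalty, and the branch mix below are illustrative assumptions, not measured figures.

    # Back-of-the-envelope numbers for the points above; all inputs are
    # illustrative assumptions, not measurements.

    # Double-pumped ALU: the ALU runs at 2x the core clock, so an operation
    # taking 1 ALU cycle costs only 0.5 core cycles.
    core_clock_ghz = 3.0
    alu_clock_ghz = 2 * core_clock_ghz
    alu_latency_core_cycles = 1 / (alu_clock_ghz / core_clock_ghz)
    print(f"Simple ALU op latency: {alu_latency_core_cycles} core cycles")

    # Branch prediction: a 98% hit rate still leaves 2% of branches paying
    # a full pipeline flush. Assume a ~15-stage flush penalty and that one
    # in five instructions is a branch.
    accuracy = 0.98
    flush_penalty_cycles = 15      # roughly a 14-19 stage pipeline
    branch_fraction = 0.20         # assumed: ~1 in 5 instructions is a branch

    extra_cycles_per_branch = (1 - accuracy) * flush_penalty_cycles
    extra_cycles_per_instr = branch_fraction * extra_cycles_per_branch
    print(f"Average penalty per branch: {extra_cycles_per_branch:.2f} cycles")
    print(f"Average penalty per instruction: {extra_cycles_per_instr:.3f} cycles")

Even at 98% accuracy the flush cost is noticeable, which is why "not perfect" matters more the deeper the pipeline gets.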
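Micro-op sketch (for items 6 and 10): a toy Python illustration of what a decoder conceptually does when it cracks a complex memory-operand instruction into simple, RISC-like internal operations. The instruction names and micro-op format are invented for illustration; a real x86 decoder works on machine code, not strings.

    # Toy illustration of cracking a "complex" instruction into simple,
    # RISC-like internal operations. Names and formats are invented.

    def decode(instr):
        """Translate one pseudo-CISC instruction into a list of micro-ops."""
        op, dst, src = instr
        if op == "add_mem":                     # add [dst_addr], src_reg
            return [
                ("load",  "tmp0", dst),         # tmp0 <- memory[dst]
                ("add",   "tmp0", src),         # tmp0 <- tmp0 + src
                ("store", dst,    "tmp0"),      # memory[dst] <- tmp0
            ]
        # "Simple" instructions map 1:1 onto a single internal op.
        return [(op, dst, src)]

    program = [
        ("mov",     "rax",    "rbx"),
        ("add_mem", "0x1000", "rax"),   # one CISC-style instruction...
    ]

    for instr in program:
        for uop in decode(instr):       # ...becomes several internal ops
            print(uop)

This is the sense in which "CISC is RISC internally": the programmer-visible instruction set and the operations the backend actually executes are two different layers.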
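Hashing sketch (for item 13): a minimal average-hash (aHash) example in Python using Pillow, as one simple form of perceptual hashing. The 8x8 thumbnail size and the Hamming-distance threshold are assumptions you would tune; the linked Stack Exchange thread covers stronger variants such as pHash and dHash.

    # Minimal average-hash (aHash) sketch using Pillow.
    # Shrink the image to a tiny grayscale thumbnail, then set one bit per
    # pixel depending on whether it is brighter than the mean.
    from PIL import Image

    def average_hash(path, hash_size=8):
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (1 if p > mean else 0)
        return bits                      # 64-bit fingerprint for an 8x8 thumbnail

    def hamming(a, b):
        return bin(a ^ b).count("1")

    # Usage (file names are placeholders): a small Hamming distance means the
    # pictures are likely the same, even after resizing or mild re-compression.
    # h1 = average_hash("photo_a.jpg")
    # h2 = average_hash("photo_b.jpg")
    # print("similar" if hamming(h1, h2) <= 5 else "different")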