for is way faster than goto in C?

Gat Pelsinger · February 4

So this

for (register int i = 0; i < UINT_MAX; i++);

is way faster than

register int i = 0;
a:
	if (i < UINT_MAX) {
    	i++;
    	goto a;
    }
    else return;

Technically I thought this is what a for loop (or even a while) is essentially in the background, but the for loop is quite a bit more faster than the goto if I benchmark. Then how is the for loop made?

Sauron · February 4

We went over this in your other thread - you don't know what this compiles to and, further, this is going to be scheduled by the operating system, meaning it will not always complete in the same time.

4 minutes ago, Gat Pelsinger said:

Technically I thought this is what a for loop (or even a while) is essentially in the background

I'm pretty sure the compiler will just delete your for loop if it doesn't do anything, as is the case here.

The explicit goto instruction will probably not be deleted, because the variable could be used elsewhere in the code and therefore it must be incremented; it's possible the compiler will just substitute the useless goto with a simple addition, but I'm not certain.

PDifolco · February 4

My guess is that while for just increment the variable each step until condition met, goto increment once then the code go to another address, then increment, then go again etc, which is slower

mariushm · February 4

ffs you ask a lot of basic and silly questions, in a forum that's not programming oriented..

Would be easy to just compile your program, both versions, and then load the binaries into ghidra or some other disassembler and see how they're different.

There's tons of optimizations a compiler could do, that you're not aware of. In the first version the compiler could even take out the whole thing and you'd have an application that just returns.

Kilrah · February 4

You really need to start looking at the assembly output of your compiler and learn to understand it if you want to go into such detail.

Eigenvektor · February 4

Comparing or optimizing performance at that level effectively makes zero sense. Especially if you're not specifying the compiler, compiler version and compiler flags you used.

For any real world code that does something useful inside the loop, there's most likely going to be zero performance difference if you enable compiler optimizations. In any case that is not a performance difference in C, but rather a performance difference in the machine code each variant compiles to.

I've used "gcc -S" (version 13.2.1 20230801) to get the compiled machine code, without any optimization. I'm only copying the actual loop here, since I figure the rest is irrelevant to the discussion.

The for-loop turns into this:

.L3:
	addl	$1, %ebx
.L2:
	cmpl	$-1, %ebx
	jne	.L3

-> Increment number, compare numbers, jump back to L3 if not equal.

The GOTO turns into this instead:

.L2:
	cmpl	$-1, %ebx
	je	.L3
	addl	$1, %ebx
	jmp	.L2
.L3:

-> Compare numbers, jump to L3 if equal, else increment number, jump back to L2.

In other words, one additional instruction that needs to be evaluated per iteration for goto.

PDifolco · February 4

24 minutes ago, Eigenvektor said:
Comparing or optimizing performance at that level effectively makes zero sense. Especially if you're not specifying the compiler, compiler version and compiler flags you used.

For any real world code that does something useful inside the loop, there's most likely going to be zero performance difference if you enable compiler optimizations. In any case that is not a performance difference in C, but rather a performance difference in the machine code each variant compiles to.

I've used "gcc -S" (version 13.2.1 20230801) to get the compiled machine code, without any optimization. I'm only copying the actual loop here, since I figure the rest is irrelevant to the discussion.

The for-loop turns into this:
.L3:
	addl	$1, %ebx
.L2:
	cmpl	$-1, %ebx
	jne	.L3
-> Increment number, compare numbers, jump back to L3 if not equal.

The GOTO turns into this instead:
.L2:
	cmpl	$-1, %ebx
	je	.L3
	addl	$1, %ebx
	jmp	.L2
.L3:
-> Compare numbers, jump to L3 if equal, increment number, jump back to L2.

In other words, one additional instruction that needs to be evaluated per iteration for goto.

This confirms my reasoning

Was considering the code as example snippets, obviously taken alone it doesn't have any effect

Eigenvektor · February 4

35 minutes ago, PDifolco said:

This confirms my reasoning

Was considering the code as example snippets, obviously taken alone it doesn't have any effect

It does, yes I just thought it might make sense to confirm.

On the one hand, it is interesting to see what the code compiles into. On the other hand, it makes zero sense to try and draw any definitive conclusion—for is way faster than goto—about performance from a single example.

As soon as you add actual content into the loop, the performance cost of a single conditional jump might turn into meaningless noise. Or the compiler output might look completely different flipping the result on its head. Of course, for any real world program you'd also enable compiler optimization, which might eliminate the performance difference altogether.

FlyingPotato_is_taken · February 4

In this particular example the for loop would probably be eliminated while the goto would be turned into machine code and be executed. In a nutshell you probably measured a empty program vs. the goto code.

Gat Pelsinger · February 4

@Eigenvektor

My dumb brain is not able to process from where exactly the new instruction is coming from, so all I want to ask is if there is a way to optimize the goto loop to make it behave the same as the for loop?

Eigenvektor · February 4

59 minutes ago, Gat Pelsinger said:

My dumb brain is not able to process from where exactly the new instruction is coming from, so all I want to ask is if there is a way to optimize the goto loop to make it behave the same as the for loop?

The instructions come from how the compiler interprets your code. You got multiple options:

- Easy mode: Enable compiler optimizations

- Hard mode: Write assembler

- I got too much time on my hand: Fiddle around with your code until the compiler output matches what you want

Of course, while the loops don't actually do anything, they'll simply be removed by the optimizer. But for "useful" code that is both the easiest and at the same time fastest option.

Without optimizations enabled, the assembler code is more or less a literal translation of your C code.

a:                    .L2
  if (i < UINT_MAX)     cmpl $-1, %ebx
                        je .L3
  {                 
      i++;              addl $1, %ebx
      goto a;           jmp  .L2
  }
  else
  {
      return;         .L3
  }

The "je .L3" is effectively the "else"

With -O3 enabled, GCC spits out a warning and compiles to an endless loop, because your variable is an int, but you're comparing against an unsigned int value, which it'll never be able reach. It'll overflow and therefore never stop looping. Changing the variable to an unsigned int fixes the warning and produces the expected outcome: Nothing. The loop is optimized away.

Gat Pelsinger · February 4

@Eigenvektor

Obviously this a completely non-practical experiment. My stupidly genius brain got this idea to compare a for loop with a manual goto loop, and turns out my goto loop is slower than the for loop, and I am look for why that is. I did rewrite the code and benchmarked it, but it is still slower. I re-wrote with following the steps you told that the for loop uses -

4 hours ago, Eigenvektor said:

-> Increment number, compare numbers, jump back to L3 if not equal.

The code -

register unsigned int i = 0;
    a:
        i++;
        if (i < UINT_MAX)
            goto a;

Isn't it doing exactly the same thing as the for loop?

Okay, if we look at the assembly output,

for loop-

.L5:
	addl	$1, %ebx
.L4:
	cmpl	$-1, %ebx
	jne	.L5

goto -

.L6:
	addl	$1, %ebx
	cmpl	$-1, %ebx
	je	.L7
	jmp	.L6
.L7:

Now I don't quite know how to read assembly, but it looks like the for loop code has 1 jump statement, whereas the goto code has 2. The for loop code jumps to .L5 if the comparing result is false in one go, whereas the goto code after comparing, checks to jump to .L7 if the comparing result is true, and then has to jump to .L6. So the logical difference I can see is that the for loop code uses a standalone if statement to keep the loop running and when the if statement results false, there is no special jump code. The code execution just goes on. But in my goto code, it uses an if/else like statement. It checks to jump to .L7 if comparing result was true, and then jump to .L6. Only if we could convert the "je" instruction to a "jne" instruction (or just flip the result of the comparison?). Is there still a way to optimize my for loop?

Eigenvektor · February 4

28 minutes ago, Gat Pelsinger said:

Isn't it doing exactly the same thing as the for loop?

Logically? Yes. But the compiler doesn't know that.

When you use a for-loop, the compiler knows you're writing a loop, so it can make certain assumptions/optimizations right away. If you use goto, all it knows is that you're jumping somewhere under certain conditions. It can't know that this is intended to be a loop. If you do enable optimization, it'll perform additional analysis and likely arrive at the same result. Without optimization it just performs a literal translation of C to assembler.

Sign In

for is way faster than goto in C?

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Featured Topics

Topics

Latest From Linus Tech Tips:

I Bought a Recording Jammer. It’s Legal.

Latest From Tech Quickie:

Why Are Gaming Laptops So Expensive?

Latest From TechLinked:

Not MORE Youtube Ads…

Latest From GameLinked:

Is Nintendo being FRAMED?

Latest From ShortCircuit:

I tried 20 influencer foods, here are the best… and the worst…

Latest From Mac Address:

Why did you buy an Apple Vision Pro?

Latest From Channel Super Fun:

I Swapped the CEO's Assistant For a Day!