Weird anomalies in instruction run time

Art Vandelay · April 8, 2015

So I was benchmarking some different instructions in assembly to see if certain ones actually took longer than others.

I noticed that this code ran in about 976 ms:

        MOV ECX, -1
    timing_loop:
        DEC ECX
        JNZ timing_loop

and this loop ran in about 978 ms:

        MOV ECX, -1
    timing_loop:
        ADD EDX, 1
        DEC ECX
        JNZ timing_loop

and this loop ran in about 1947 ms:

        MOV ECX, -1
    timing_loop:
        ADD EDX, 1
        ADD EDX, 1
        DEC ECX
        JNZ timing_loop

So why exactly is this happening? Is this due to a branch delay slot or something?
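To put the three timings on a common scale, here is a rough Python sketch that converts them to time per iteration and to a ratio against the first loop. The only assumption beyond the numbers above is the iteration count: MOV ECX, -1 leaves 0xFFFFFFFF in ECX, so the DEC/JNZ pair should run 2^32 - 1 times. The names used are made up for illustration.

```python
# Wall-clock times reported above, in milliseconds.
timings_ms = {
    "dec/jnz only": 976,
    "one ADD": 978,
    "two ADDs": 1947,
}

# MOV ECX, -1 sets ECX to 0xFFFFFFFF, so the loop body runs 2**32 - 1 times.
ITERATIONS = 2**32 - 1

baseline_ms = timings_ms["dec/jnz only"]


def ns_per_iteration(ms):
    """Average time spent in one pass through the loop, in nanoseconds."""
    return ms * 1e6 / ITERATIONS


def relative_cost(ms):
    """Run time as a multiple of the DEC/JNZ-only loop."""
    return ms / baseline_ms


for name, ms in timings_ms.items():
    print(f"{name:14s} {ns_per_iteration(ms):.3f} ns/iter  x{relative_cost(ms):.2f}")
```

The first two loops come out at essentially the same cost per iteration, and the third at very nearly double, which looks more like a whole extra step per iteration than a small per-instruction cost.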

The same thing happens with 4 NOPs instead of 2 additions, which would seem to indicate that this has something to do with the memory that the instructions take up. It doesn't happen with multiply, however, so I'm not really sure.

I also noticed that this:

    timing_loop:
        MOV EAX, 456464646
        MUL EAX
        DEC ECX
        JNZ timing_loop

runs twice as fast as this:

    timing_loop:
        MUL EAX
        DEC ECX
        JNZ timing_loop

Is that because of superscalar execution/pre-execution or something?

Art Vandelay · April 8, 2015

I'm not quite sure, but my guess is that some of the pipeline stages have to finish before the result of the first ADD can be forwarded to the second ADD instruction.

But why is a single add basically taking 0 time though?

If I add another add instruction it adds about an extra second.

NOPs and additions can take different numbers of cycles to finish, and that's probably why you're getting 4 NOPs = 2 adds.

Well, four NOPs took ~1.9 seconds, seven NOPs took ~1.9 seconds and eight NOPs took ~2.9 seconds.

One NOP is one byte, one addition instruction is 2 bytes.

Because of this, I suspect the additional time is due to x86 fetching instructions in 4-byte chunks.

Art Vandelay · April 8, 2015

Adds are pretty fast, but two of them can't use the same register at the same time; you need to think about the pipelining effects of those operations.

Oh, you're right. The processor probably tries to do the arithmetic operations in parallel and can't because of that.
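The same-register idea can be checked against the timings with a very small model. This is only a sketch under two assumptions that nobody in the thread measured: an ADD takes one cycle to produce its result, and the bare DEC/JNZ loop sustains one iteration per cycle. Under those assumptions, back-to-back ADDs on EDX form a chain, and the chain length sets a floor on the time per iteration.

```python
# Assumptions (not from the thread): ADD has a 1-cycle latency, and the
# DEC/JNZ loop overhead on its own sustains one iteration per cycle.
ADD_LATENCY_CYCLES = 1
LOOP_OVERHEAD_CYCLES = 1


def cycles_from_add_chain(dependent_adds):
    """Each ADD EDX, 1 has to wait for the previous one, so the chain on EDX
    puts a lower bound on the cycles needed per iteration."""
    chain = dependent_adds * ADD_LATENCY_CYCLES
    return max(LOOP_OVERHEAD_CYCLES, chain)


# Timings reported in the thread, keyed by number of ADDs in the loop (ms).
measured_ms = {0: 976, 1: 978, 2: 1947}

for adds, ms in measured_ms.items():
    predicted = cycles_from_add_chain(adds)
    observed = ms / measured_ms[0]
    print(f"{adds} ADD(s): predicted {predicted}x, observed {observed:.2f}x")
```

With these assumptions a single ADD hides behind the loop overhead, which would explain why it appears to cost nothing, while the second ADD pushes the chain to two cycles.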

That doesn't really explain the NOPs though.
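One model that does fit the NOP numbers is a limit on how many instructions the core can start per cycle, rather than a limit on bytes fetched. The sketch below assumes a core that issues up to 4 instructions per cycle and treats DEC ECX / JNZ as a single issue slot. Both are assumptions about the CPU, which the thread never identifies, so treat this as a hypothesis that happens to match the reported timings and not as a confirmed explanation.

```python
import math

# Assumptions (not from the thread): the core issues up to 4 instructions
# per cycle, and DEC ECX / JNZ together occupy a single issue slot.
ISSUE_WIDTH = 4
LOOP_OVERHEAD_SLOTS = 1


def cycles_from_issue_width(nops):
    """Cycles per iteration if the only limit is instructions issued per cycle."""
    return math.ceil((nops + LOOP_OVERHEAD_SLOTS) / ISSUE_WIDTH)


# Approximate timings reported in the thread, in seconds, keyed by NOP count.
# The 0-NOP entry is the 976 ms DEC/JNZ-only loop.
reported_s = {0: 0.976, 4: 1.9, 7: 1.9, 8: 2.9}

for nops, seconds in reported_s.items():
    predicted = cycles_from_issue_width(nops)
    observed = seconds / reported_s[0]
    print(f"{nops} NOP(s): predicted {predicted}x, observed {observed:.2f}x")
```

Under this model four and seven NOPs both land in the second cycle and the eighth NOP spills into a third, which is the same step pattern as the ~1.9 s, ~1.9 s and ~2.9 s measurements.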
