Assembly code between compilers is so different!

Solved by dcgreen2k.

So I was learning the MSVC compiler (btw, VS is trash. I didn't know a standalone CLI existed) and some Windows API. I had the idea of comparing the assembly output of GCC and MSVC for the same program, and the result was pretty weird.

 

The C code -

#include <stdio.h>

int main(void) {
    puts("Hello World!");
    getchar();
    return 0;
}

 


Assembly outputted by GCC - 

 

	.file	"main.c"
	.text
	.def	__main;	.scl	2;	.type	32;	.endef
	.section .rdata,"dr"
.LC0:
	.ascii "Hello World!\0"
	.text
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$32, %rsp
	.seh_stackalloc	32
	.seh_endprologue
	call	__main
	leaq	.LC0(%rip), %rax
	movq	%rax, %rcx
	call	puts
	call	getchar
	movl	$0, %eax
	addq	$32, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.ident	"GCC: (x86_64-win32-seh-rev0, Built by MinGW-Builds project) 13.2.0"
	.def	puts;	.scl	2;	.type	32;	.endef
	.def	getchar;	.scl	2;	.type	32;	.endef

 

Assembly outputted by MSVC (using pure x64 version)- 

 

; Listing generated by Microsoft (R) Optimizing Compiler Version 19.39.33521.0 

include listing.inc

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

PUBLIC	main
EXTRN	getchar:PROC
EXTRN	puts:PROC
pdata	SEGMENT
$pdata$main DD	imagerel $LN3
	DD	imagerel $LN3+28
	DD	imagerel $unwind$main
pdata	ENDS
_DATA	SEGMENT
$SG9507	DB	'Hello World!', 00H
_DATA	ENDS
xdata	SEGMENT
$unwind$main DD	010401H
	DD	04204H
xdata	ENDS
; Function compile flags: /Odtp
_TEXT	SEGMENT
main	PROC
; File C:\Users\------\main.c ; well thats annoying to have
; Line 3
$LN3:
	sub	rsp, 40					; 00000028H
; Line 4
	lea	rcx, OFFSET FLAT:$SG9507
	call	puts
; Line 5
	call	getchar
; Line 6
	xor	eax, eax
; Line 7
	add	rsp, 40					; 00000028H
	ret	0
main	ENDP
_TEXT	ENDS
END

 

Now, I know how to read some assembly, but this is something I've never seen. The GCC one looks fine: it clearly lays out the stack management, the register use and the function calls. But the MSVC one seems to be pulling in some libraries instead of doing everything by itself? Weirdly enough, the MSVC executable is 118 KB whereas the GCC one is only 45 KB. What is happening?

 

Microsoft owns my soul.

 

Also, Dell is evil, but HP kinda nice.


That GCC output is using AT&T syntax, while MSVC is using Intel syntax. You can change that in GCC; not sure about MSVC since I don't use it.

 

The rest of the stuff seems to be mostly due to linker differences.
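
Part of that extra material in the MSVC listing is data rather than code: the xdata segment holds unwind information describing the function prologue. Here is a rough sketch of decoding the two DD values by hand, assuming I'm reading the Windows x64 unwind format correctly. The struct and function names (UnwindSummary, decode_unwind, alloc_small_bytes) are made up for illustration; only the bit layout and the UWOP_ALLOC_SMALL value come from the format itself.

```c
#include <assert.h>
#include <stdint.h>

#define UWOP_ALLOC_SMALL 2u  /* unwind op for a stack allocation of 8..128 bytes */

/* Hypothetical summary of the first UNWIND_INFO dword and the first code slot. */
typedef struct {
    unsigned version;      /* byte 0, low 3 bits */
    unsigned flags;        /* byte 0, high 5 bits */
    unsigned prolog_size;  /* byte 1: length of the prologue in bytes */
    unsigned code_count;   /* byte 2: number of unwind code slots */
    unsigned code_offset;  /* slot byte 0: prologue offset just past the instruction */
    unsigned unwind_op;    /* slot byte 1, low 4 bits */
    unsigned op_info;      /* slot byte 1, high 4 bits */
} UnwindSummary;

/* Split the two little-endian dwords from the listing into their fields. */
static UnwindSummary decode_unwind(uint32_t header, uint32_t first_code)
{
    UnwindSummary u;
    u.version     = header & 0x7u;
    u.flags       = (header >> 3) & 0x1Fu;
    u.prolog_size = (header >> 8) & 0xFFu;
    u.code_count  = (header >> 16) & 0xFFu;
    u.code_offset = first_code & 0xFFu;
    u.unwind_op   = (first_code >> 8) & 0xFu;
    u.op_info     = (first_code >> 12) & 0xFu;
    return u;
}

/* For UWOP_ALLOC_SMALL the allocation size is op_info * 8 + 8 bytes. */
static unsigned alloc_small_bytes(unsigned op_info)
{
    assert(op_info <= 15u);  /* op_info is a 4-bit field */
    return op_info * 8u + 8u;
}
```

Feeding in 010401H and 04204H from the listing gives version 1, a 4-byte prologue, one unwind code, and a UWOP_ALLOC_SMALL of 40 bytes, which lines up with the `sub rsp, 40` at the top of main. GCC expresses the same kind of information through its .seh_* directives.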

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


7 minutes ago, igormp said:

You can change it in gcc

 

By using the "-masm=intel" option? That doesn't seem to alter the code that much.



4 minutes ago, Gat Pelsinger said:

 

By using the "-masm=intel" option? That doesn't seem to alter the code that much.

Yes, that'd be the option. Not sure if it works on Windows tho.



The actual code that gets generated appears to be mostly the same, aside from some differences in how they name things and manage the stack. The lines with INCLUDELIB in the MSVC version are directives for the linker, telling it which libraries to pull in. LIBCMT is MSVC's static C runtime library, so it likely accounts for at least part of the larger executable, and OLDNAMES seems to be a compatibility library that maps old function names onto their current ones: https://devblogs.microsoft.com/oldnewthing/20200730-00/?p=104021
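
To put numbers on the stack part, here is a minimal sketch, assuming the usual Windows x64 rules: the caller reserves 32 bytes of shadow space for the callee, and rsp must be 16-byte aligned at every call instruction. The helper names (frame_bytes, calls_are_aligned) are just made up for this post.

```c
#include <assert.h>

enum {
    RETURN_ADDRESS_BYTES = 8,   /* pushed by the call that entered the function */
    SHADOW_SPACE_BYTES   = 32,  /* home space for the four register arguments */
    STACK_ALIGNMENT      = 16   /* required alignment of rsp at a call site */
};

/* Bytes between the caller's aligned rsp and rsp at this function's call sites. */
static int frame_bytes(int pushed_bytes, int allocated_bytes)
{
    assert(pushed_bytes >= 0 && allocated_bytes >= SHADOW_SPACE_BYTES);
    return RETURN_ADDRESS_BYTES + pushed_bytes + allocated_bytes;
}

/* Nonzero when rsp ends up on a 16-byte boundary before puts/getchar are called. */
static int calls_are_aligned(int pushed_bytes, int allocated_bytes)
{
    return frame_bytes(pushed_bytes, allocated_bytes) % STACK_ALIGNMENT == 0;
}
```

GCC pushes rbp (8 bytes) and then subtracts 32, so frame_bytes(8, 32) is 48. MSVC skips the frame pointer and subtracts 40, so frame_bytes(0, 40) is also 48. Both land on a multiple of 16; the extra 8 bytes in MSVC's 40 are padding that does the job GCC's pushed rbp does.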

Computer engineering grad student, cybersecurity researcher, and hobbyist embedded systems developer

 

Daily Driver:

CPU: Ryzen 7 4800H | GPU: RTX 2060 | RAM: 16GB DDR4 3200MHz C16

 

Gaming PC:

CPU: Ryzen 5 5600X | GPU: EVGA RTX 2080Ti | RAM: 32GB DDR4 3200MHz C16


I have to mention that MinGW also builds against the Windows C runtime, so it uses many Windows interfaces under the hood. It is not as portable as Cygwin, which ships its own runtime library.

 

You know which one to pick if you want to write cross-platform POSIX apps.
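
If you ever need the code itself to know which of these it was built with, the compilers' predefined macros can tell you. A small sketch, where describe_toolchain is a made-up helper name and the strings are only my own labels:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write a short description of the toolchain into buf, based on predefined macros. */
static void describe_toolchain(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
#if defined(__CYGWIN__)
    snprintf(buf, size, "Cygwin GCC (Cygwin's own POSIX runtime)");
#elif defined(__MINGW32__)   /* defined by both 32-bit and 64-bit MinGW */
    snprintf(buf, size, "MinGW GCC (Windows C runtime)");
#elif defined(_MSC_VER)
    snprintf(buf, size, "MSVC %d", _MSC_VER);
#elif defined(__GNUC__)
    snprintf(buf, size, "GCC-compatible compiler");
#else
    snprintf(buf, size, "unknown compiler");
#endif
}
```

Cygwin is checked first on purpose, so a Cygwin build is not mistaken for a plain GCC one.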

Sudo make me a sandwich 

