
Assembly Language

Guest

Preface:
Don't tell me how to do my homework. I am hoping for clarification on how the language works.

 

TL;DR

I'm taking an accelerated computer architecture class & we're learning computer architecture & assembly.

From what I've gathered we're focusing on the x86 standard (Von Neumann focused) though the textbook also has sections on ARM.

 

Starting Question:

Does anyone have a good guide on Assembly? If not, would you like to share your experience or any thoughts on assembly (use or otherwise)?

From my observation & after watching 8-bit guy talk about Assembly for his Planet X game series, you have lots of power by reading/writing directly to memory addresses & that's 90% of what the language does.

 

 

To get a discussion rolling, (for anyone interested in learning/teaching assembly)

I have to create a basic Hello World program. The assignment gives me the following commands:


section .text             ; This must be here to tell the compiler
                          ; where Program’s instructions start
global _start             ; Must be declared for using gcc
_start:                   ; Tell linker entry point
mov edx, len              ; Message length
mov ecx, msg              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel
section .data             ; This must be here to tell the compiler
                          ; where Program’s data starts
msg db 'Hello, world!',0xa  ; Our dear string
len equ $ - msg             ; Length of our dear string

We are using this online compiler.

It seems that I have to declare a variable for each output?

If I try placing a second msg it says that you cannot redefine a variable like that.

I'm supposed to print at least 4 lines. However, the above assembly already uses 4 of the 6 general registers.


Because of this, I'm guessing I have to push variable msg to main memory (RAM), overwrite the registers for each line, then finally pop each line back into the registers to print to the screen?

If not, must I jump lines?

 

My solution was to just copy/paste the memory allocation over several times. Perhaps that is the correct method?


section .text             ; This must be here to tell the compiler
                          ; where Program’s instructions start
                          
global _start             ; Must be declared for using gcc

_start:                   ; Tell linker entry point
mov edx, len              ; Message length
mov ecx, msa              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, msb              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, msc              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, msd              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, mse              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel


section .data             ; This must be here to tell the compiler
                          ; where Program’s data starts
msa db 'Hello, world!',0xa  ; Our dear string
msb db 'Row row fight da powa',0xa
msc db 'I stole that book',0xa
msd db "What're you going to do about it?",0xa
mse db "5 lines just to spite you. :P",0xa

len equ $ - msa             ; Length of our dear string

@Mortis, plz add assembly syntax highlighting

If I could improve my Assembly coding conventions, could anyone recommend anything?

I'm used to writing like 12 lines of C# to do like 50 things through automation, loops & whatnot.

 

I'm going to keep researching assembly guides. Unfortunately the e-book provided for the class is very general with its concepts & does not elaborate much. Noting that it has a diagram of a Pentium 4 & says "now computers are starting to have more than 1 core" sets a bit of a date on the book. I'm sure much of the information is still relevant enough; it's not like we stopped using RAM & motherboards.


It's been almost two decades since I've done anything in assembly myself, but I could perhaps give you some pointers:

Instead of this:

mov edx, len              ; Message length
mov ecx, msa              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, msb              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

mov edx, len              ; Message length
mov ecx, msc              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

etc.

You could make it a bit clearer like this:

mov edx, len              ; Message length
mov ecx, msa              ; Message to write
call _print

mov edx, len              ; Message length
mov ecx, msb              ; Message to write
call _print

mov edx, len              ; Message length
mov ecx, msc              ; Message to write
call _print

etc.

mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel (exit once, after the last message)

_print:
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel
ret                       ; Return to the instruction after the call
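
For what it's worth, here is a minimal sketch of how the whole program could look with a print subroutine, assuming NASM and the 32-bit Linux int 0x80 interface from the original post. The names lena, lenb and lenc are hypothetical, not from the assignment. Each message gets its own length constant, because a single len equ $ - msa placed after the last string measures all the strings together, so one sys_write with that length prints every line at once.

```nasm
; Sketch only: assumes NASM syntax and the 32-bit Linux int 0x80 interface.
; lena, lenb and lenc are hypothetical names chosen for illustration.
section .text
global _start

_start:
    mov edx, lena         ; Length of the first message only
    mov ecx, msa          ; Address of the first message
    call _print

    mov edx, lenb
    mov ecx, msb
    call _print

    mov edx, lenc
    mov ecx, msc
    call _print

    mov ebx, 0            ; Exit status 0
    mov eax, 1            ; System call number (sys_exit)
    int 0x80              ; Exit once, after the last message

_print:                   ; Expects ecx = address, edx = length
    mov ebx, 1            ; File descriptor (stdout)
    mov eax, 4            ; System call number (sys_write)
    int 0x80              ; Call the kernel
    ret                   ; Pop the return address and continue after the call

section .data
msa  db 'Hello, world!', 0xa
lena equ $ - msa          ; Must come directly after msa
msb  db 'Row row fight da powa', 0xa
lenb equ $ - msb
msc  db 'I stole that book', 0xa
lenc equ $ - msc
```

On a Linux machine this would typically be built with nasm -f elf32 followed by ld -m elf_i386; the online compiler presumably does something equivalent.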

 

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


2 minutes ago, WereCatf said:

snip

Woah!

Thanks! I didn't know Assembly had functions. I thought it only had the jump/branch feature that sent the program to a specific line.


Any specific reasons as to why you don't use assembly anymore? Aside from C, Python, etc. being much easier to develop.


3 minutes ago, fpo said:

Any specific reasons as to why you don't use assembly anymore? Aside from C, Python, etc. being much easier to develop.

Modern C compilers generally do a better job of producing optimized code than you can do manually, even if you were an assembly-god, and C does allow similarly low-level access to memory, interrupts and all. Also, it takes so fricking much longer to write the code to do the same thing in assembly as one can do in C that it just doesn't make sense to use assembly for anything other than extremely rare special cases and/or extremely limited platforms, like e.g. some microcontrollers with only hundreds of bytes of RAM. (Though, even on those I tend to just use trimmed-down C/C++)

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


11 minutes ago, fpo said:

Thanks! I didn't know Assembly had functions. I thought it only had the jump/branch feature that sent the program to a specific line.

You can also do loops, just like you would in higher-level code:


section .text             ; This must be here to tell the compiler
                          ; where Program’s instructions start
                          
global _start             ; Must be declared for using gcc

_start:                   ; Tell linker entry point
mov edx, len              ; Message length
mov ecx, msg              ; Message to write
mov ebx, 1                ; File descriptor (stdout)
mov eax, 4                ; System call number (sys_write)
int 0x80                  ; Call the kernel

;These three lines here decrement numLoops by one, compare it to zero
;and if not zero, jump back to the beginning
dec byte [numLoops]       ; NASM needs the operand size for a memory operand
cmp byte [numLoops], 0
jne _start

;Exit only after the loop has finished; exiting earlier would end
;the program after the first message
mov eax, 1                ; System call number (sys_exit)
int 0x80                  ; Call the kernel

section .data             ; This must be here to tell the compiler
                          ; where Program’s data starts
numLoops db 5
msg db 'Hello, world!',0xa  ; Our dear string

len equ $ - msg             ; Length of our dear string
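
A loop can also walk a table, so each pass prints a different line instead of the same one. This is a rough sketch under the same assumptions as the code above (NASM, 32-bit Linux, int 0x80); table, count, print_next, lena and lenb are made-up names for illustration, and the strings are borrowed from the original post.

```nasm
; Sketch only: assumes NASM syntax and the 32-bit Linux int 0x80 interface.
; table, count, print_next, lena and lenb are hypothetical names.
section .text
global _start

_start:
    mov esi, table        ; esi points at the current table entry
    mov edi, count        ; edi counts the messages left to print

print_next:
    mov ecx, [esi]        ; First dword of the entry: message address
    mov edx, [esi + 4]    ; Second dword of the entry: message length
    mov ebx, 1            ; File descriptor (stdout)
    mov eax, 4            ; System call number (sys_write)
    int 0x80              ; Call the kernel
    add esi, 8            ; Each entry is two dwords (8 bytes)
    dec edi               ; dec sets the zero flag when edi reaches 0
    jnz print_next        ; Loop while messages remain

    mov ebx, 0            ; Exit status 0
    mov eax, 1            ; System call number (sys_exit)
    int 0x80

section .data
msa   db 'Hello, world!', 0xa
lena  equ $ - msa
msb   db 'Row row fight da powa', 0xa
lenb  equ $ - msb
table dd msa, lena        ; One (address, length) pair per message
      dd msb, lenb
count equ ($ - table) / 8 ; Number of entries in the table
```

Adding another line then only means adding another string and another table entry; the code in .text stays the same.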

 

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.


2 hours ago, fpo said:

Starting Question:

Does anyone have a good guide on Assembly? If not, would you like to share your experience or any thoughts on assembly (use or otherwise)?

While the most I've done with assembly was in school, from writing routines to even making an assembler, I did use it in my job where I had to modify the initialization routines of an ARM processor before it jumped to the C program. I don't think assembly is hard, as a lot of instructions are simple in that they do one thing and only one thing. But that's also its downside, as it tends to make the code verbose. It can make things difficult to understand when in a higher level language an operation can be represented in one line where the same operation in assembly may need a handful or more.

 

Quote

From my observation & after watching 8-bit guy talk about Assembly for his Planet X game series, you have lots of power by reading/writing directly to memory addresses & that's 90% of what the language does.

That's pretty much what it is. Either you're moving data around or you're performing an operation. A computer is basically that: a compute machine. At the end of the day all you're really doing is math. It's just that values mean different things to different parts of the machine. Or to put it another way, assembly language is telling these things what to do.

 

[Image: Zen block diagram]

 

(taken from https://en.wikichip.org/wiki/File:zen_block_diagram.svg)

 

1 hour ago, fpo said:

Woah!

Thanks! I didn't know Assembly had functions. I thought it only had the jump/branch feature that sent the program to a specific line.

In x86, the CALL instruction is shorthand for pushing the address of the instruction after the call onto the stack and making an unconditional jump to the subroutine. Some instructions in x86 are shorthand for a series of other instructions to reduce the code size.
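
To make that concrete, here is a small sketch that spells out a near call by hand with push and jmp, assuming NASM and the same 32-bit Linux int 0x80 setup as the code earlier in the thread. The label after_print is a hypothetical name. It is meant only to illustrate the idea; in real code you would simply write call _print.

```nasm
; Sketch only: assumes NASM syntax and the 32-bit Linux int 0x80 interface.
; after_print is a hypothetical label used to stand in for the return address.
section .text
global _start

_start:
    mov edx, len          ; Message length
    mov ecx, msg          ; Message to write
    push dword after_print ; What call does first: push the return address
    jmp _print            ; ...then jump to the subroutine
after_print:              ; ret resumes execution here
    mov ebx, 0            ; Exit status 0
    mov eax, 1            ; System call number (sys_exit)
    int 0x80

_print:
    mov ebx, 1            ; File descriptor (stdout)
    mov eax, 4            ; System call number (sys_write)
    int 0x80
    ret                   ; Pops the address of after_print and jumps to it

section .data
msg db 'Hello, world!', 0xa
len equ $ - msg
```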

 

The equivalent, more or less, in ARM is using the bl instruction, which jumps to a label but also stores the address of the next instruction in the link register so the subroutine that gets jumped to knows where to return.

 

Quote

Any specific reasons as to why you don't use assembly anymore? Aside from C, Python, etc. being much easier to develop.

The biggest reason is it's architecture specific. Unless I'm going to deeply learn about the architecture and how it works, I don't really see a need to use it. Areas where understanding the CPU and doing assembly would be useful are system programs and compilers. But even system programs likely have little need for assembly outside of finely tuning code (if you know what you're doing) or when you have no other choice.

 

And as I mentioned before, assembly can get verbose, which gets kind of old to read or write after a while. This can especially be the case with RISC architectures too, where the philosophy was that it was better to do more instructions that are simple than it is to do a single instruction that's complex.


On 1/5/2019 at 12:31 PM, fpo said:

Thanks! I didn't know Assembly had functions. I thought it only had the jump/branch feature that sent the program to a specific line. 

It does. Sort of. What's going on here is what's called a subroutine.

So we have this layout:

; do something
call _label

;do other things

_label:
; do something
ret

But before we can talk about calls, we have to talk about labels.

Labels are an assembler construct, not a CPU instruction. In a basic sense, what they do is tell the assembler that you have some group of something which needs to be placed somewhere in memory, and then the location of it needs to be remembered. Every time you reference something by that label, the assembler inserts the address of where it's placed the first thing in that label. Some assemblers (like MPASM, for PIC microcontrollers) even allow you to reference labels with an index of sorts, useful for managing arrays of data.

So what actually happens when you write something like:

call _label

Is that the assembler outputs the appropriate call instruction, and replaces "_label" with the address of wherever it's placed the things associated with it.
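
Since a label is just an address, you can do arithmetic on it. The following is a minimal sketch, assuming NASM and the 32-bit Linux int 0x80 interface from the original post, that starts printing 7 bytes into the message and so should print only "world!" followed by the newline.

```nasm
; Sketch only: assumes NASM syntax and the 32-bit Linux int 0x80 interface.
section .text
global _start

_start:
    mov ecx, msg + 7      ; Label plus offset is still an address: points at 'w'
    mov edx, len - 7      ; Remaining length: "world!" plus the newline
    mov ebx, 1            ; File descriptor (stdout)
    mov eax, 4            ; System call number (sys_write)
    int 0x80

    mov ebx, 0            ; Exit status 0
    mov eax, 1            ; System call number (sys_exit)
    int 0x80

section .data
msg db 'Hello, world!', 0xa
len equ $ - msg           ; $ is the current address, so this is the byte count
```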

The call instruction is used specifically to call subroutines. Basically, what happens in a call instruction is that the return address is pushed onto the stack, and then the processor jumps to the address of the subroutine. The return address is the address of the next instruction to run after the subroutine returns.

So, you've called your subroutine, and it's done all of its work, but now we need to leave the subroutine and get back to where we were. What happens in a "ret" instruction is that the return address is popped off of the stack, and then we jump to it.

NOTES: I keep saying "basically" because what the assembler actually does will depend on your specific target, the assembler itself, and your assembler settings. Additionally, x86 has 8 different call instructions, and what happens when you use each individual one is slightly different. But at this point, all we care about is the simple idea of what using labels, calling a subroutine, and returning from a subroutine does. This is enough knowledge to start using them in your programs, and assemblers will select the appropriate opcodes for you, freeing you to only care about one or two instructions of every type. For example: There are 14 ADD instructions, but you only need to care about one, ADD, and then the assembler will infer which one you meant based on your operands.
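
As a small illustration of that last point, here is a sketch, assuming NASM targeting 32-bit x86 Linux, where the same ADD mnemonic is written with different operands. The byte values in the comments are the encodings NASM typically picks with its default settings, and counter is a hypothetical variable.

```nasm
; Sketch only: assumes NASM syntax and the 32-bit Linux int 0x80 interface.
; counter is a hypothetical variable; encodings shown are the typical ones.
section .text
global _start

_start:
    mov eax, 0
    mov ebx, 2
    add eax, ebx          ; Register + register, typically encoded as 01 D8
    add eax, 1            ; Register + small constant, typically 83 C0 01
    add al, 1             ; 8-bit accumulator + constant, typically 04 01
    add byte [counter], 1 ; Memory + constant, yet another opcode form

    mov ebx, eax          ; Exit status = eax (0 + 2 + 1 + 1 = 4)
    mov eax, 1            ; System call number (sys_exit)
    int 0x80

section .data
counter db 0
```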

 

ENCRYPTION IS NOT A CRIME


On 1/5/2019 at 2:21 PM, Mira Yurizaki said:

I don't think assembly is hard, as a lot of instructions are simple in that they do one thing and only one thing. But that's also its downside, as it tends to make the code verbose. It can make things difficult to understand when in a higher level language an operation can be represented in one line where the same operation in assembly may need a handful or more. 

To me, that's kind of the allure of assembly. All of the constant referencing, and learning, and tracking what's where and when, while requiring you to know not only the ISA, but also the programmer's model and compiler and linker directives is just a challenge that can't be refused. And, in the end, after years and years and years of study and practice, you can emerge triumphant as an expert on that specific architecture, knowing that you know as much about that specific model as anyone out there.

EDIT:: Sorry about the double post. I meant to include this in my response above.

ENCRYPTION IS NOT A CRIME


@WereCatf @Mira Yurizaki @straight_stewie

This is all very insightful information.


I don't really know what else to say other than thank you all for the excellent explanations. Your replies definitely open some doors to higher understanding.


My school teaches assembly in MIPS, which is good to know if you wanna code games for the original PlayStation or Nintendo 64 (the Game Boy Advance uses an ARM CPU).

Sudo make me a sandwich 

