ASM Control
Assembly language is considered Turing complete because it can simulate a Turing machine, an abstract machine that can carry out any computation expressible as an algorithm. Conversely, any computer program can be executed by a Turing machine, given enough time and memory.
The key features of Assembly language that make it Turing complete are:
It has variables and memory: Assembly has registers and memory locations that can store data. This allows it to maintain state between instructions.
It has conditional branching: Assembly has conditional jump instructions that allow it to change the flow of execution based on conditions. This allows it to make decisions and loop.
It has effectively unbounded memory: Though assembly has a limited number of registers, it can reach a large amount of memory through memory addresses. Real machines have a finite address space, so the unbounded memory of a Turing machine is an idealization, but in practice a program can store as much data as a given computation needs.
It has basic arithmetic and logic operations: Assembly has instructions to perform basic operations like addition, subtraction, AND, OR, etc. This provides the computational primitives needed for any algorithm.
So in summary, since Assembly language has variables, conditional branching, effectively unbounded memory and basic operations, which together are enough to simulate a Turing machine, it is considered a Turing complete language. It can theoretically execute any algorithm given enough resources.
Control Flow
Control flow refers to the order in which instructions are executed in a program. There are three basic types of control flow in Assembly:
Sequential: Instructions are executed one after the other in the order they are written. This is the default control flow.
Jumps: Using jump instructions, the execution can jump to a different part of the code, skipping instructions in between.
Loops: Using a jump back to an earlier label (or a dedicated instruction such as LOOP), a block of code can be executed multiple times.
Assembly language is a low-level imperative language that is usually written in a procedural style, not a functional or object-oriented one. This means:
It focuses on procedures or functions, not objects or data. Functions manipulate data, they don't own data.
It lacks high-level concepts like objects, classes, inheritance, polymorphism, etc. that are found in object-oriented languages.
Data is manipulated through procedures that change the program state. There is no immutability like in functional programming.
Control flow constructs such as conditionals and loops, built from jumps, are needed to structure the procedures and manipulate data. Without them, Assembly code could only run in a straight line.
Since Assembly lacks high-level abstraction like objects, procedures are the basic unit of modularity. A procedure is a labeled block of code that takes input, processes it and produces output. It is entered with a CALL instruction and left with RET.
Sequential
Sequential execution is the default control flow in Assembly, meaning instructions are executed one after the other in the order they are written. Here are some points about sequential execution in Assembly:
It is the simplest form of control flow. The CPU executes each instruction in the program sequentially, one after the other.
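The program counter (PC), called the instruction pointer (IP, EIP or RIP) on x86, holds the address of the next instruction to execute. It advances past each instruction as that instruction is fetched; on x86 the step equals the length of the instruction, which varies.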
Sequential execution works well for simple programs with a linear flow of control.
For complex programs, sequential execution alone is not sufficient. Jump and loop instructions are needed to change the sequential flow.
Even when using jumps and loops, the instructions within a basic block (a sequence of instructions without any jumps) are executed sequentially.
Sequential execution provides a deterministic and predictable flow of control. For a given input, the program will always execute the same sequence of instructions and produce the same output.
Sequential execution aligns well with the procedural nature of Assembly language. Functions are composed of sequential instructions that manipulate data.
Sequential execution is the simplest to implement in the CPU hardware. The program counter can be implemented using a simple adder circuit.
So in summary, while sequential execution alone is not sufficient for complex programs, it forms the basis of control flow in Assembly. Instructions within basic blocks are always executed sequentially. Jumps and loops are used to change this default sequential execution as needed.
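The role of the program counter can be sketched in C, the language this article compares against later. This is only an illustration under stated assumptions: the three-instruction machine and the names `Op`, `Insn` and `run` are invented here and do not correspond to any real CPU. The point is that the counter advances by default and a jump simply overwrites it.

```c
/* Hypothetical three-instruction machine, invented for this sketch. */
typedef enum { OP_ADD, OP_JMP, OP_HALT } Op;

typedef struct {
    Op  op;
    int arg;   /* OP_ADD: value to add; OP_JMP: index to jump to */
} Insn;

/* Runs prog from index 0 and returns the accumulator when OP_HALT is reached.
   The program is assumed to be well formed: it ends in OP_HALT and every
   jump target is a valid index. */
int run(const Insn *prog)
{
    int acc = 0;
    int pc = 0;                 /* program counter: index of the next instruction */

    for (;;) {
        Insn cur = prog[pc];    /* fetch */
        pc++;                   /* default: move on to the following instruction */

        switch (cur.op) {
        case OP_ADD:  acc += cur.arg; break;
        case OP_JMP:  pc = cur.arg;   break;   /* a jump overrides the default */
        case OP_HALT: return acc;
        }
    }
}
```

With a program such as `{OP_ADD,1}, {OP_JMP,3}, {OP_ADD,100}, {OP_ADD,2}, {OP_HALT,0}`, the instruction at index 2 is never executed, because the jump at index 1 replaces the default "next instruction" value of the counter.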
Conditionals
Conditional branches in Assembly language allow the execution flow to proceed down different paths based on certain conditions. They are implemented using conditional jump instructions.
The common conditional jump instructions are:
JZ - Jump if Zero flag is set. Used to check if a value is zero.
JNZ - Jump if Zero flag is NOT set.
JE - Jump if Equal (the same instruction as JZ). Used after a comparison instruction like CMP.
JNE - Jump if Not Equal (the same instruction as JNZ).
JA/JNBE - Jump if Above/Jump if Not Below or Equal. Used after comparing two unsigned values.
JAE - Jump if Above or Equal (unsigned).
JB/JNAE - Jump if Below/Jump if Not Above or Equal (unsigned).
JBE - Jump if Below or Equal (unsigned).
They are used like this:
CMP ax, 10 ; Compare ax to 10
JE equal ; If equal (Z flag set), jump to 'equal' label
JNZ notEqual ; If not equal (Z flag clear), jump to 'notEqual' label
equal:
; Code for if equal case
JMP done
notEqual:
; Code for if not equal case
done:
; Code after conditional branch
Here we have a conditional branch based on the result of a comparison. If ax is equal to 10, it will jump to the equal label, otherwise it will go to the notEqual label.
Conditional branches allow us to:
Implement if/else logic
Create loops that exit based on conditions
Select different code paths
So in summary, conditional branches use conditional jump instructions to change the execution flow based on certain conditions, evaluated using flags like the Zero flag.
They allow the implementation of conditional logic, exiting loops based on conditions, and selecting different code paths.
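The compare-and-jump pattern can be mimicked in C with `goto`, which makes the role of the unconditional `JMP done` visible. This is a sketch, not a translation of any particular program: the function names are made up, and the two paths are laid out with the "not equal" path first. The second function leaves out the jump to show what happens when execution falls through into the next label.

```c
/* Mirrors: CMP ax, 10 / JE equal / ... / JMP done.
   Returns 1 when the "not equal" path ran and 2 when the "equal" path ran. */
int branch_with_jmp(int ax)
{
    int result = 0;
    if (ax == 10) goto equal;   /* CMP ax, 10 followed by JE equal */
    result += 1;                /* "not equal" path */
    goto done;                  /* JMP done: skip the other path */
equal:
    result += 2;                /* "equal" path */
done:
    return result;
}

/* Same code without the unconditional jump: after the "not equal" path,
   execution falls through the label and runs the "equal" path as well. */
int branch_without_jmp(int ax)
{
    int result = 0;
    if (ax == 10) goto equal;
    result += 1;                /* "not equal" path */
    /* no jump here */
equal:
    result += 2;                /* runs on both paths */
    return result;
}
```

For an input of 3, `branch_with_jmp` returns 1 while `branch_without_jmp` returns 3, because a label marks a position and does nothing to stop execution.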
Labels
Labels in Assembly language:
Labels are names assigned to locations in the Assembly code.
They are defined by writing the label name followed by a colon (:) character.
Examples:
start:
loop:
end:
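Here start, loop and end are labels. In real code, pick a name other than loop, because LOOP is also an x86 instruction mnemonic and an assembler may not accept it as a label.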
Labels provide "named locations" that jump instructions can target.
Jump instructions like JMP and JE can jump to a label.
Example:
again:
inc ax ; Loop body: count up
cmp ax, 10 ; Compare ax to 10 (sets the flags)
JE end ; If equal, jump to 'end' label
JMP again ; Otherwise repeat
end:
; Code here executes after loop
Here we jump to the end label if a condition is met.
The CPU doesn't actually understand labels. The assembler replaces labels with their corresponding memory addresses before generating the executable code.
Labels are used to:
Implement loops
Implement conditionals
Mark the start and end of functions
Provide targets for jumps in general
A label's name:
Has a maximum length that depends on the assembler
Cannot start with a number
Is case-sensitive in NASM and GAS (other assemblers may differ)
The assembler collects all labels before it finalizes jump targets (typically by making more than one pass over the source). So a jump can refer to a label that is defined later in the file.
So in summary, labels provide "named locations" in the Assembly code that:
Jump instructions can target
Mark the start and end of code blocks
Allow implementing loops, conditionals, functions, etc.
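How an assembler might turn label names into addresses can be sketched in C. Everything here is an assumption made for illustration: the `Line` structure, the function names, and the use of a line index as the "address" are invented, and real assemblers are considerably more involved. The sketch only shows why a jump to a label defined further down can still be resolved.

```c
#include <string.h>

/* One line of a hypothetical toy program: an optional label defined on
   the line and an optional label that the line jumps to. */
typedef struct {
    const char *label;    /* label defined here, or NULL */
    const char *target;   /* label jumped to, or NULL    */
} Line;

/* Finds the line index ("address") where name is defined.
   Returns -1 if the label is never defined. */
int label_address(const Line *prog, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (prog[i].label != NULL && strcmp(prog[i].label, name) == 0)
            return i;
    return -1;
}

/* Replaces every jump target name with an address.
   out[i] is the resolved address, or -1 for lines that do not jump.
   Returns 0 on success, -1 if some target is undefined. */
int resolve_jumps(const Line *prog, int n, int *out)
{
    for (int i = 0; i < n; i++) {
        out[i] = -1;
        if (prog[i].target == NULL)
            continue;
        out[i] = label_address(prog, n, prog[i].target);
        if (out[i] < 0)
            return -1;    /* jump to a label that does not exist */
    }
    return 0;
}
```

Because the lookup scans the whole program, a jump on line 1 to a label on line 3 resolves just as easily as a backward jump; the order of definition does not matter.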
Loops
Loops allow executing the same code multiple times. They are implemented using jump instructions that jump back to a label.
The basic components of a loop in Assembly are:
- The loop label: Marks the start of the loop.
Example: again:
- The loop body: The code that needs to be executed repeatedly. Any initialization, such as setting the counter, goes before the label so that it runs only once.
Example:
mov ax, 0 ; Initialize the counter once, before the loop
again:
add ax, 1
; Loop body
- The loop condition: Checks if the loop needs to continue. Evaluates a condition using flags like the Zero flag.
Example:
cmp ax, 10 ; Compare loop counter ax to 10
- The loop jump: Jumps back to the loop label while the loop should continue. Uses a conditional jump instruction like JNZ.
Example:
jnz again ; If ax is not equal to 10, jump back to 'again'
Putting it all together:
mov ax, 0
again:
add ax, 1
cmp ax, 10
jnz again
; Code after loop
Here the loop will execute 10 times, incrementing ax from 0 to 10. Note that the mov sits before the label: if it were inside the loop, ax would be reset on every pass and the loop would never end.
Loops are useful to:
Repeat a block of code a fixed number of times
Repeat while a condition is true
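The same four parts (label, body, condition, jump) can be written in C with `goto` to check how many passes a counter loop makes. This is a sketch with an invented function name; it assumes `start` is less than `limit`, since otherwise the counter would never meet the limit.

```c
/* Counts how many times the loop body runs, using the same shape as an
   assembly counter loop: body, increment, compare, jump back.
   Assumes start < limit. */
int count_iterations(int start, int limit)
{
    int ax = start;      /* initialization, done once before the label */
    int iterations = 0;
again:
    iterations++;        /* loop body */
    ax += 1;             /* add ax, 1 */
    if (ax != limit)     /* cmp ax, limit */
        goto again;      /* jnz again */
    return iterations;
}
```

Starting at 0 with a limit of 10 gives 10 passes, while starting at 1 gives only 9, which is an easy off-by-one mistake to make when the comparison sits at the bottom of the loop.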
Registers
Registers are used extensively in Assembly language loops. The common registers used are:
AX - Can be used as the loop counter. It is incremented or decremented on each iteration.
CX (or its halves CH:CL) - The conventional loop counter, used implicitly by the LOOP instruction. Since it is a 16-bit register, it can hold counts up to 65,535.
SI and DI - Used as index registers to iterate through arrays.
For example, a simple loop that iterates 10 times can be:
mov ax, 0 ; Initialize loop counter
again:
; Loop body
inc ax ; Increment loop counter
cmp ax, 10 ; Compare to 10
jnz again ; If not equal, loop again
Here we use the AX register as the loop counter. We initialize it to 0, increment it by 1 on each iteration using INC, and compare it to 10 so that the loop exits after the tenth pass.
The CX register is often preferred as a loop counter because the LOOP instruction uses it implicitly, which removes the need for a separate increment and compare.
The code would be:
mov cx, 10 ; Initialize CX to 10
repeat:
; Loop body
loop repeat ; Loop CX times
The LOOP instruction automatically decrements CX for us and jumps back to the label until CX reaches zero.
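The behavior of LOOP with a 16-bit counter can be modeled in C. This is a sketch with an invented function name, assuming the documented x86 behavior that LOOP decrements the counter first and then jumps if the result is not zero. It also shows an edge case: starting with a counter of 0 does not skip the loop.

```c
#include <stdint.h>

/* Models a loop closed by the x86 LOOP instruction with a 16-bit counter.
   Returns how many times the body runs for a given initial CX. */
uint32_t loop_iterations(uint16_t cx)
{
    uint32_t iterations = 0;
    do {
        iterations++;     /* loop body */
        cx--;             /* LOOP: decrement CX first ... */
    } while (cx != 0);    /* ... then jump back if CX is not zero */
    return iterations;
}
```

An initial value of 10 gives 10 passes. An initial value of 0 wraps around to 65,535 on the first decrement, so the body runs 65,536 times; code that may receive a zero count usually tests for it before entering the loop.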
The SI and DI registers are often used as array indexes when looping through arrays. For example:
mov si, 0 ; Initialize array index
mov di, LENGTH ; Length of array
loop1:
; Access array[SI]
inc si ; Increment index
cmp si, di ; Compare to length
jne loop1 ; Loop until end of array
Here SI acts as the array index, incrementing on each iteration until the end of the array.
Example
The main purpose is to copy elements from array1 to array2 using a loop. Comments explain the purpose of each section and instruction, making the code self-documenting. Assembly gives you very low-level control over the machine, demonstrating basic control flow constructs.
; Assembly program demonstrating data declaration,
; arrays, loops, if-else and labels
; (NASM syntax, 32-bit Linux)
section .data ; data section
array1 db 10,20,30,40,50 ; source array: five bytes
array2 times 5 db 0 ; destination array: five zeroed bytes
size equ 5 ; array size
section .text ; code section
global _start ; required for linker
_start: ; program entry point
mov ecx, 0 ; initialize loop counter
loop1:
cmp ecx, size ; compare with array size
je exit ; if equal, exit loop
mov al, [array1 + ecx] ; load array element into al
mov [array2 + ecx], al ; store in array2
inc ecx ; increment loop counter
jmp loop1 ; jump to loop
exit:
mov eax,1 ; exit syscall
mov ebx,0 ; exit status 0
int 0x80
This program demonstrates basic assembly language concepts like:
Data section to declare the arrays array1 and array2
equ directive to define the constant size as 5
loop1: label for the loop
cmp and je instructions for the conditional jump
Array indexing using ecx as an offset from the start of each array
Increment of the ecx loop counter using inc
Unconditional jump using jmp
Exit using the int 0x80 syscall
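For comparison, the copy loop can be written in C with the same jump structure. This is a sketch rather than what a compiler would emit: the function name `copy_bytes` is made up, and the variable is named `ecx` only to line up with the register used in the assembly.

```c
#include <stddef.h>

/* Copies size bytes from src to dst with the same control flow as the
   assembly loop: test at the top, body, increment, jump back. */
void copy_bytes(const unsigned char *src, unsigned char *dst, size_t size)
{
    size_t ecx = 0;               /* mov ecx, 0 */
loop1:
    if (ecx == size) goto done;   /* cmp ecx, size / je exit */
    dst[ecx] = src[ecx];          /* load one byte, then store it */
    ecx++;                        /* inc ecx */
    goto loop1;                   /* jmp loop1 */
done:
    return;
}
```

Because the test comes before the body, a size of 0 copies nothing. The destination must have room for `size` bytes, which is the caller's responsibility here just as it is in the assembly version.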
.data section
The .data section in assembly language is used to define initialized data - variables, arrays, constants, etc. It has the following purposes:
It defines the data segment of the program which contains initialized variables.
The variables defined in the .data section have fixed addresses and are allocated space when the program is loaded.
The .data section is conventionally written before the .text section which contains the actual instructions, although the assembler does not require this order.
The section name .data starts with a dot (.). In NASM, section is the directive and .data is the name of the section it selects; neither is a label. Directives give instructions to the assembler or linker, rather than generating machine code.
In the example code:
section .data
array1 db 10,20,30,40,50
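array2 times 5 db 0
size equ 5
The .data section contains:
The array1 and array2 arrays, defined using the db directive, which emits one byte per value. array1 lists five values; array2 uses times 5 db 0 to reserve five zeroed bytes (a plain db 5 would define a single byte holding the value 5).
The size constant, defined using the equ directive to set it equal to 5. equ names a value and does not reserve any storage.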
So in summary, the .data section:
Defines the data segment of the program
Contains initialized variables like arrays, constants
Variables have fixed addresses when the program loads
Is selected with the section .data directive; the section name starts with a dot
Is usually written before the .text section containing instructions
In short, section .data tells the assembler to place the following data in the data segment.
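The size of each data definition is easier to see next to its C counterpart. The declarations below are a sketch: the names `one_byte` and `five_bytes` are invented, and where a C compiler actually places these objects (often .data for initialized data and .bss for zeroed data) depends on the toolchain.

```c
/* C counterparts of NASM-style data definitions. */
unsigned char array1[]      = {10, 20, 30, 40, 50}; /* like: db 10,20,30,40,50   */
unsigned char one_byte[]    = {5};                  /* like: db 5 (one byte = 5) */
unsigned char five_bytes[5] = {0};                  /* like: times 5 db 0        */
enum { SIZE = 5 };                                  /* like: size equ 5          */
```

`sizeof(one_byte)` is 1 while `sizeof(five_bytes)` is 5, which is the difference between defining a byte with the value 5 and reserving five bytes. `SIZE`, like an equ constant, occupies no storage of its own.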
.text section
The .text section in assembly language is used to define the executable machine code instructions, that is, the actual program. It has the following purposes:
It defines the text (or code) segment of the program which contains the executable instructions.
The instructions defined in the .text section will be executed when the program runs.
The .text section conventionally comes after the .data section which contains initialized variables.
The .text section name also starts with a dot (.). As with .data, section is the assembler directive and .text names the section.
In the example code:
section .text
global _start
_start:
mov ecx, 0
cmp ecx, size
je exit
...
The .text section contains:
The global _start directive, making the _start label externally visible.
The actual instructions that will be executed: mov, cmp, je etc.
So in summary, the .text section:
Defines the text (code) segment containing executable instructions
Contains the instructions that will actually be run
Is selected with the section .text directive; the section name starts with a dot
Conventionally comes after the .data section containing initialized variables
In short, section .text tells the assembler to place the following instructions in the text (code) segment.
Best practice compared to C
Here are some best practices and tricks regarding control flow in assembly language compared to C:
Use labels instead of brackets - In assembly, you define labels to mark the start of blocks of code, instead of using curly brackets like in C. A label is a name followed by a colon.
Explicit jumps - In assembly, you have to explicitly use jump instructions (like jmp, je, jne) to change the control flow. Execution otherwise continues into the next instruction, even across a label, so the end of one branch needs a jmp to skip the other. There is no structured if/else to do this for you as in C.
Loops - For loops, you have to manually increment the loop counter. Assembly does not have for/while loops like C. You use conditional jumps and labels for loops.
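Functions - Assembly has no function syntax; a function is a label that you reach with call and leave with ret. Passing arguments and saving and restoring registers (for example with push before and pop after) is up to you and the calling convention you follow.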
Conditionals - You have to explicitly use conditional jump instructions (je, jne, jg, jl) to change the control flow based on conditions. There are no if/else statements.
Less abstraction - Assembly provides much less abstraction compared to C. You have to manually manage the stack, registers, memory, loops, conditionals, etc.
Focus on optimization - Since assembly is so close to the hardware, you can optimize the code much better by rearranging instructions, using specific registers, tail call optimization, etc.
Use as few instructions as possible - Fewer instructions often means faster code, but on modern CPUs instruction count is only a rough guide, since instructions differ in cost and memory access and branch prediction often matter more (within reason for readability).
Comment extensively - Since assembly lacks abstraction, you have to comment extensively to explain the purpose of labels, instructions and sections of code.
So in summary, assembly provides much less control flow abstraction compared to C, requiring you to manage loops, conditions, jumps, stacks, registers, etc. manually. In return, it gives you more room for hand-optimization, though not automatically better performance than the output of an optimizing C compiler.
Disclaimer: I had not read or followed any Assembly tutorial before this research. I use AI and ask the best questions I can to explain Assembly to myself. This is not an Assembly course; it is just a study paper.
Written by
Elucian Moise
Software engineer instructor, software developer and community leader. Computer enthusiast and experienced programmer. Born in Romania, living in US.