Assembling … done

Well, I finished the assembler project for school. It was fun to write, especially breaking the assembler into separate parts, one for each specific task. I also learned a lot about how an assembler works.

It is a basic assembler for the 8086 processor, with 2-byte words and a 1-byte addressable unit.

It doesn’t cover the full instruction set, only the basics. There are four segments: dat, txt, bss and stack. dat and txt, the data and code segments, are represented in the object file. The directive DW reserves a word, and END marks the end of the input. Seven instructions are available:

  • mov dst, src
  • add dst, src
  • push src
  • pop dst
  • int src
  • jz src
  • jmp src

Eight registers are accessible: AX, BX, CX, DX, SI, DI, BP, SP. There are four addressing modes: immediate, direct, register direct, and register indirect with displacement. An encoded instruction is one word long if both operands use register direct addressing, and two words long if either operand uses some other mode.
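To make the size rule concrete, here is a minimal Python sketch of how a location counter could be advanced over the txt segment of the sample program shown further down in this post. It is not the project code: the names `statement_size`, `layout` and `TXT` are made up for illustration, and the sketch assumes that DW always takes one word and that an operand counts as register direct only if it is a bare register name.

```python
# Hypothetical sketch of the instruction-size rule; not the actual project code.
REGISTERS = {"ax", "bx", "cx", "dx", "si", "di", "bp", "sp"}
WORD = 2  # bytes per word


def is_register_direct(operand: str) -> bool:
    """An operand is register direct only if it is a bare register name."""
    return operand.strip().lower() in REGISTERS


def statement_size(statement: str) -> int:
    """Size in bytes of one statement whose label has already been removed."""
    mnemonic, _, rest = statement.strip().partition(" ")
    if mnemonic.lower() == "dw":
        return WORD  # assumption: DW reserves exactly one word
    operands = [op for op in rest.split(",") if op.strip()]
    # One word if every operand is register direct, two words otherwise.
    if all(is_register_direct(op) for op in operands):
        return WORD
    return 2 * WORD


def layout(lines):
    """Return ({label: offset}, total size in bytes) for 'label: stmt' lines."""
    offsets, counter = {}, 0
    for line in lines:
        label, colon, statement = line.partition(":")
        if colon:
            offsets[label.strip().lower()] = counter
        else:
            statement = line
        counter += statement_size(statement)
    return offsets, counter


# The txt segment of the sample program from this post.
TXT = [
    "p1: mov cx, 13", "p2: add ax, 0x12", "p4: jmp [bp]*", "labela: jmp p1",
    "add cx, p3", "jz e3", "add [si]*, ax", "dw *", "dw p1", "add ax, [si]*",
    "jz labela", "jmp e4", "p3: int e1", "int p4", "jmp labela", "mov cx, bx",
]

offsets, total = layout(TXT)
```

With this rule, `p2`, `p4` and `p3` land at offsets 4, 8 and 44, and the segment comes out 58 bytes long, which agrees with the symbol and segment tables in the object file.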

So far the assembler is case-insensitive, but I think I’ll rework it so that it is case-sensitive for user-defined symbols. That would be nice :).
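One way this could work, as a rough Python sketch: lowercase a token only when it is a reserved word, and leave everything else untouched. The reserved-word sets are taken from this post, the function name `normalize_token` is made up, and whether segment names should be treated as reserved too is left open here.

```python
# Hypothetical sketch: case-insensitive reserved words, case-sensitive symbols.
MNEMONICS = {"mov", "add", "push", "pop", "int", "jz", "jmp"}
REGISTERS = {"ax", "bx", "cx", "dx", "si", "di", "bp", "sp"}
DIRECTIVES = {"dw", "end", "public", "extern", "segment"}
RESERVED = MNEMONICS | REGISTERS | DIRECTIVES


def normalize_token(token: str) -> str:
    """Lowercase reserved words; keep user-defined symbols exactly as written."""
    lowered = token.lower()
    return lowered if lowered in RESERVED else token
```

With this, `MOV` and `mov` are the same instruction, while `labela` and `Labela` stay two different symbols.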

Since I don’t think I should post the project code publicly for now (there are still other people writing it), I’m gonna post a simple assembly program and its output. The program doesn’t do anything meaningful. It just shows what the assembler can do.

Input:

PUBLIC p1,p2,p3,p4
EXTERN e1,e2,e3,e4
SEGMENT txt
p1: mov cx, 13
p2: add ax, 0x12
p4: jmp [bp]*
labela: jmp p1
add cx, p3
jz e3
add [si]*, ax
dw *
dw p1
add ax, [si]*
jz labela
jmp e4
p3: int e1
int p4
jmp labela
mov cx, bx
SEGMENT dat
dw ?
dw *
SEGMENT bss
dw ?
dw ?
SEGMENT stack
END

And here’s the object file:

LINK
4 8 14
# Segments
.txt	0	58	RWP
.dat	58	4	RWP
.bss	62	4	RW
.stack	66	32	RW
# Symbols
e3	0	0	U
e4	0	0	U
e1	0	0	U
e2	0	0	U
p4	8	1	D
p3	44	1	D
p2	4	1	D
p1	0	1	D
# Relocations
000A	1	1	A1
000E	1	1	A1
0012	1	1	A1
0016	1	3	RS1
001A	1	1	A1
001C	1	1	A1
001E	1	1	A1
0022	1	1	A1
0026	1	1	R1
002A	1	4	AS1
002E	1	1	AS1
0032	1	1	A1
0036	1	1	A1
003C	2	2	A1
# Data
19 00 00 0d 28 00 00 12 70 78 00 08 70 00 00 00 29 00 00 2c 60 00 00 00 2e 40 00 18
00 1c 00 00 28 70 00 20 60 00 00 0c 70 00 00 00 50 00 00 00 50 00 00 08 70 00 00 0c 19 44
00 00 00 00 00 3C
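For anyone who wants to poke at the object file, here is a small Python sketch that reads the segment table and checks that the segments follow each other without gaps. The meaning of the columns (name, start offset, size in bytes, access flags) is my reading of the listing above, not a documented format, and the names `Segment`, `parse_segments` and `is_contiguous` are made up for this example.

```python
# Hypothetical sketch: the column meanings are assumed from the listing above.
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    start: int   # assumed: start offset in bytes, decimal
    size: int    # assumed: size in bytes, decimal
    flags: str   # assumed: access flags such as "RWP" or "RW"


SEGMENT_TABLE = """\
.txt\t0\t58\tRWP
.dat\t58\t4\tRWP
.bss\t62\t4\tRW
.stack\t66\t32\tRW
"""


def parse_segments(text: str) -> list:
    """Parse whitespace-separated segment lines, skipping blanks and '#' headers."""
    segments = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        name, start, size, flags = line.split()
        segments.append(Segment(name, int(start), int(size), flags))
    return segments


def is_contiguous(segments) -> bool:
    """True if every segment starts exactly where the previous one ends."""
    return all(a.start + a.size == b.start for a, b in zip(segments, segments[1:]))


segments = parse_segments(SEGMENT_TABLE)
```

Under that reading, the four segments are laid out back to back: txt ends at 58 where dat begins, and so on up to the end of the stack segment.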

And here is the same program, with some errors…

labela1: SEGMENT txt
labela2: PUBLIC p1,p2,p3,p4
EXTERN e1,e2,e3,e4
p1: mov 0x13, 13
p2: add ax, 0x12
p2: jmp [ax]*
labela: jmp p1
add cx, p3
dw *
dw p1
add [ax]*, [bx]*
e1: int p2
SEGMENT dat
mov cx, bx
dw ?
SEGMENT dat
dw *
SEGMENT bss
dw *
dw ?
SEGMENT stack
dw *
SEGMENT nekiTamoSegment
END

…and the list of errors that is presented to the user…

ERROR line 1: SEGMENT directive can not be labeled
ERROR line 2: PUBLIC directive can not be labeled
ERROR line 2: PUBLIC not called at the beginning
ERROR line 3: EXTERN not called at the beginning
ERROR line 4: Destination can not be addressed with immediate addresssing.
ERROR line 4: Uncompatible address modes - dst: immediate, src: immediate
ERROR line 6: Symbol p2 has multiple definition
WARNING line 9: DW found in TXT segment, make sure you wrapped it with "JMP yourLabelX" and "yourLabelX: ...", so it doesn't get executed.
WARNING line 10: DW found in TXT segment, make sure you wrapped it with "JMP yourLabelX" and "yourLabelX: ...", so it doesn't get executed.
ERROR line 11: Uncompatible address modes - dst: register indirect with displacement, src: register indirect with displacement
ERROR line 12: Extern symbol e1 can not be defined
ERROR line 14: MOV must be called in TXT segment
ERROR line 16: Segment dat exists
ERROR line 19: Data can not be initialized in BSS segment
ERROR line 22: DW must be called in TXT, BSS or DAT segment
ERROR line 23: Bad segment - nekiTamoSegment
Errors were encountered. Assembling is not completed.
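The list above shows that the assembler keeps going after the first error and reports everything it finds. As an illustration of that idea, here is a minimal Python sketch covering just two of the checks, duplicate labels and labels that redefine an EXTERN symbol. It is a simplified stand-in, not the project code: the name `check_symbols`, the label regex and the lowercasing of symbols are all assumptions.

```python
# Hypothetical sketch: collects symbol errors instead of stopping at the first.
import re

LABEL = re.compile(r"^\s*([A-Za-z_]\w*)\s*:")


def check_symbols(lines):
    """Return error messages for duplicate labels and redefined extern symbols."""
    externs, defined, errors = set(), {}, []
    for number, line in enumerate(lines, start=1):
        words = line.split(None, 1)
        if len(words) == 2 and words[0].upper() == "EXTERN":
            externs.update(name.strip().lower() for name in words[1].split(","))
            continue
        match = LABEL.match(line)
        if not match:
            continue  # line has no label, nothing to record
        name = match.group(1).lower()
        if name in externs:
            errors.append(f"ERROR line {number}: Extern symbol {name} can not be defined")
        elif name in defined:
            errors.append(f"ERROR line {number}: Symbol {name} has multiple definition")
        else:
            defined[name] = number
    return errors


# The first twelve lines of the faulty program above.
FAULTY = [
    "labela1: SEGMENT txt", "labela2: PUBLIC p1,p2,p3,p4", "EXTERN e1,e2,e3,e4",
    "p1: mov 0x13, 13", "p2: add ax, 0x12", "p2: jmp [ax]*", "labela: jmp p1",
    "add cx, p3", "dw *", "dw p1", "add [ax]*, [bx]*", "e1: int p2",
]

errors = check_symbols(FAULTY)
```

Run on those lines, the sketch reports the duplicate `p2` on line 6 and the redefined extern `e1` on line 12, the same two symbol errors that appear in the list.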

Well, I suppose that is it for now. Maybe we’re gonna have to make a linker for the second homework. That would be nice.
