About jAsm


This is the documentation for the 6502 and Z80 assembler jAsm. It was written in 2015 by me, Jonas Hultén, because I used DAsm before and over the years it started to feel outdated. I missed namespaces and I was annoyed by some bugs, like its inability to use local variables as macro arguments. It felt a bit slow on my slow laptop computer as well, although that wasn't a strong argument. Also, I hadn't written a language before and it felt like an interesting challenge.

jAsm was written with two main goals. It should be fast enough to assemble a large program in under a second and it should support everything DAsm can do and more. To reach the required speed it tries to use memory as linearly as possible and it proved to be faster than DAsm in the end.

The assembler was made for Commodore 64 programming and some features were specifically made to help programming for that computer. However, it shouldn't stop anyone from making programs for other computers with it.

jAsm looks a lot like C. It wasn't meant to do that but over the course of development it moved closer and closer because it was easier to solve parsing problems that way.

It took 7 months to complete this first version of the assembler. It is still a bit rough around the edges but has some power under the hood. Let's start!

About This Document

This documentation covers the language and syntax provided by the assembler but not any details about specific supported processors. It was written when only 6502 was supported so the document is heavily geared towards that processor.

Processor Support

6502

jAsm supports all regular instructions of the 6502. Instructions are written in lower case.

lda #0
sta $d020

Z80

jAsm supports all regular instructions of the Z80. Instructions are written in lower case.

ld a, 0
ld (hl), a

Because a lot of existing Z80 source code uses upper case instruction keywords, a Python script is provided that converts upper case keywords in all .asm files in a directory. Run it like this.

python3 jasm-z80/convert_z80_keyword_case.py <my_source_directory>

Starter Guide


We'll start by creating a small program in a text file.

section code, "main", $8000
{
    inc $d020
    rts
}

Save this to a file named main.jasm. Use utf-8 format, because this is what jAsm expects. 7-bit ASCII is also ok since that is compatible with the utf-8 format. Now we'll assemble it into a binary. Open a command line window and change the current directory to where the main.jasm file is. Type this on the command line.

jasm-6502 main.jasm main.prg

Now you have a program that changes the border color on a Commodore 64. Load it into an emulator or onto a real machine.

LOAD"MAIN.PRG",8,1

Now start it.

SYS32768

The border color changes.

Basic Start

If you want to start it with a BASIC line, you need to add the necessary data to produce a SYS line at the BASIC start. This example shows how to do that in jAsm.

section code, "main", $0801
{
    define word = .next_basic_line // next BASIC line
    define word = 2016 // line number
    define byte = $9e // SYS token
    define byte[] = { string(.start) }
    define byte = 0 // end of line
.next_basic_line:
    define word = 0 // zero next BASIC line to mark end of program

.start:
    inc $d020
    rts
}

Text written after // is a comment and is completely ignored by the assembler.

.next_basic_line and .start are labels that represent the addresses in memory where they are placed. The dot before the name means it is local to the space between the closest surrounding curly braces. define places variable data into the program. A word is two bytes long. The SYS token is written in hexadecimal form, which is what the dollar sign indicates.

string(.start) means "call the built in function string with the argument .start". The function will return a string representation of .start.
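
To make the byte layout concrete, here is a minimal Python sketch that builds the same data by hand. It is not part of jAsm. The helper name basic_sys_stub is made up for illustration, and the sketch assumes that string(.start) yields the address as decimal digits, which is what the SYS command expects.

```python
import struct

SYS_TOKEN = 0x9E  # BASIC token for SYS


def basic_sys_stub(load_address, line_number):
    """Return (stub_bytes, start_address) for a one-line BASIC program.

    Hypothetical helper; it mirrors the define statements in the
    jAsm example, one struct.pack or bytes value per define.
    """
    digits = 4  # initial guess for the number of decimal digits
    for _ in range(6):
        # pointer + line number + token + digits + terminator + end marker
        start = load_address + 2 + 2 + 1 + digits + 1 + 2
        text = str(start).encode("ascii")
        if len(text) == digits:
            break
        digits = len(text)
    next_line = start - 2  # address of the final zero word
    stub = (
        struct.pack("<H", next_line)      # define word = .next_basic_line
        + struct.pack("<H", line_number)  # define word = 2016
        + bytes([SYS_TOKEN])              # define byte = $9e
        + text                            # define byte[] = { string(.start) }
        + b"\x00"                         # define byte = 0
        + struct.pack("<H", 0)            # define word = 0
    )
    return stub, start


stub, start = basic_sys_stub(0x0801, 2016)
print(stub.hex(" "), hex(start))
```

With a load address of $0801 the stub is twelve bytes long, so the code after it starts at $080d, which is 2061 in decimal.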

Basic Macro

This BASIC line thing will be used a lot in programs since almost all programs loaded from disk will need it. Let's break out this code into a handy macro that we can reuse. The macro will need two arguments, one is the line number and one is the address to start the program from.

macro basic_sys_line(.line_number, .sys_address)
{
    define word = .next_basic_line // next BASIC line
    define word = .line_number
    define byte = $9e // SYS token
    define byte[] = { string(.sys_address) }
    define byte = 0 // end of line
.next_basic_line:
    define word = 0 // zero next BASIC line to mark end of program
}

section code, "main", $0801
{
    basic_sys_line(2016, .start)

.start:
    inc $d020
    rts
}

The start of the main section invokes the macro and this inserts the code in the macro at the place of invocation.

Using Files

The main section of our example looks a lot cleaner now. We can now move the macro to its own file. We can build a small library of handy macros to help us be productive and avoid solving the same problem several times.

Move the macro code into a file called macros.jasm and place it where main.jasm lies. We can now include the macros in main.jasm.

include "macros.jasm"

section code, "main", $0801
{
    basic_sys_line(2016, .start)

.start:
    inc $d020
    rts
}

Defining Constants

The address used to change the border color isn't exactly self-explanatory, and neither is the naked constant for the BASIC start address. Let's make this a bit better.

include "macros.jasm"

const BASIC_START = $0801
const BORDER_COLOR = $d020

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    inc BORDER_COLOR
    rts
}

I use uppercase characters for fixed address constants (basically any naked constant) to make it easy to identify them. BASIC_START and BORDER_COLOR can now be used instead of the naked constants. Let's move the constants out into their own file as well. Call this c64.jasm since they describe constants specific to Commodore 64. We'll include this as well in the program.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    inc BORDER_COLOR
    rts
}

Conditional Assembly

Now, what if we wanted to port this to VIC20? We would only need to create a vic20.jasm file with different BASIC_START and BORDER_COLOR addresses and then include that instead of the c64.jasm file. We can also support both at the same time. Let's put this in the vic20.jasm file.

const BASIC_START = $1001
const BORDER_COLOR = $900f // this address controls both background and border colors

Now, what we need is a way to include either the c64.jasm or vic20.jasm file based on an option somewhere. Let's add the selection first.

include "macros.jasm"
if (C64_BUILD) {
    include "c64.jasm"
} else {
    include "vic20.jasm"
}

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    inc BORDER_COLOR
    rts
}

Command Line Constants

The if statement wants a boolean expression within the parentheses and if true the first block of code is used, otherwise the second block is used. We can feed constants from the command line to solve this. The command line option is -d and it needs to be followed by an assignment. In this case we want to assign C64_BUILD to true or false.

jasm-6502 -d C64_BUILD=true main.jasm main.prg
jasm-6502 -d C64_BUILD=false main.jasm main.prg

Defining Data

Let's try a hello world example. We'll drop the VIC20 support to make the code shorter. We will define the string "hello world!" and print it, character by character. We have already seen how to define a string in memory in the BASIC line. Printing is done with a jump to $ffd2, which prints a single character. Let's add the following naked constant to the c64.jasm file.

const CHROUT = $ffd2

Now we'll add the loop to print the text.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #0
.loop:
    lda hello_world_text,x
    jsr CHROUT
    inx
    cpx #sizeof(hello_world_text)
    bne .loop
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }
}

The define now has a name before the equal sign. This becomes a special kind of label. It can be used as a normal label but it also contains information about the defined data. sizeof is a function that returns the size in bytes of such a labeled object or array.

Coding For Readability

This works but is hard to read. It isn't obvious where the loop starts and ends unless we read the instructions. Let's improve it using indentation.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #0
.loop:
        lda hello_world_text,x
        jsr CHROUT
        inx
        cpx #sizeof(hello_world_text)
    bne .loop
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }
}

Automatic Labels

This is better but can be improved further. jAsm supports an automatic @loop label at the beginning of a scope defined by curly braces.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #0
    {
        lda hello_world_text,x
        jsr CHROUT
        inx
        cpx #sizeof(hello_world_text)
        bne @loop
    }
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }
}

It's now much easier to read the loop and we got rid of the explicitly defined label .loop.

Subroutines

If we want to print more text we need to move the loop into a subroutine which can be called with a jsr instruction and some parameters in registers.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text // high byte from address
    lda #<hello_world_text // low byte from address
    ldy #sizeof(hello_world_text)
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }


    // -> xa: address to text
    // -> y: size of text
    subroutine print_text
    {
        // self modifying code
        sta .addr
        stx .addr + 1
        sty .size

        ldx #0
        {
            const .addr = * + 1
            lda $ffff,x // just a dummy address, it will be overwritten
            jsr CHROUT
            inx
            const .size = * + 1
            cpx #0 // just a dummy value, it will be overwritten
            bne @loop
        }
        rts
    }
}

* in the subroutine represents the current program counter. * + 1 points one byte into the next instruction, which is where the instruction argument is. All is well, except that it doesn't assemble!

main.jasm(23,7) : Error 3004 : Reference to undefined symbol .addr
main.jasm(24,7) : Error 3004 : Reference to undefined symbol .addr
main.jasm(24,13) : Error 3000 : Operator + is not defined for left hand side unknown type.
main.jasm(25,7) : Error 3004 : Reference to undefined symbol .size

Declaring Symbols

There is something wrong with .addr and .size. The reason is that local constants are not accessible outside the scope they are defined in. Local constants are always accessible inside the scope they are defined in, even in inner scopes. The scope is defined by the closest enclosing curly braces. So .addr and .size are accessible inside the loop but not outside.

To solve this we can declare the symbol names in the subroutine scope but define the constants inside the loop. This is the working subroutine.

// -> xa: address to text
// -> y: size of text
subroutine print_text
{
    // declaring constants
    declare .addr
    declare .size

    // self modifying code
    sta .addr
    stx .addr + 1
    sty .size

    ldx #0
    {
        const .addr = * + 1
        lda $ffff,x // just a dummy address, it will be overwritten
        jsr CHROUT
        inx
        const .size = * + 1
        cpx #0 // just a dummy value, it will be overwritten
        bne @loop
    }
    rts
}

There is a more intuitive way to declare the .addr and .size addresses. Instruction data labels can point directly to the instruction argument by placing a label definition between the instruction and the argument.

// -> xa: address to text
// -> y: size of text
subroutine print_text
{
    // declaring constants
    declare .addr
    declare .size

    // self modifying code
    sta .addr
    stx .addr + 1
    sty .size

    ldx #0
    {
        lda .addr: $ffff,x // just a dummy address, it will be overwritten
        jsr CHROUT
        inx
        cpx .size: #0 // just a dummy value, it will be overwritten
        bne @loop
    }
    rts
}

This subroutine can be reused so let's move it to its own file. Name a new file screen_io.jasm and paste the subroutine into it. Now we'll modify the main file to include this new file. Note that we now must include the file inside the section because otherwise generated code or data would lie outside any section and that isn't allowed. Only code sections can contain code or data. The other include files only contain constant definitions and macros and they don't directly produce any code or data themselves. That's why they can be outside a section.

include "macros.jasm"
include "c64.jasm"

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text
    lda #<hello_world_text
    ldy #sizeof(hello_world_text)
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }

    include "screen_io.jasm"
}
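
The < and > prefixes used when loading the text address pick the low and high byte of a 16-bit address. A small Python sketch of the same arithmetic follows. The helper names are made up for illustration, and $0815 is just an example address.

```python
def low_byte(address):
    """Equivalent of the < prefix: bits 0-7 of the address."""
    return address & 0xFF


def high_byte(address):
    """Equivalent of the > prefix: bits 8-15 of the address."""
    return (address >> 8) & 0xFF


address = 0x0815  # example address of a text string
lo, hi = low_byte(address), high_byte(address)

# Putting the two halves back together gives the original address.
assert (hi << 8) | lo == address
print(f"lda #${lo:02x}   ldx #${hi:02x}")
```

The two bytes travel in separate 8-bit registers because the 6502 has no 16-bit register to hold the whole address.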

Bss Sections

Self modifying code is handy and can improve efficiency but it doesn't work if the code is in a cartridge ROM, because it can't be modified. Let's try modifying the code to use the zero page instead. To do this we need to reserve some space for variables in the zero page area. This is done with a bss section. BSS stands for "Block Started by Symbol" and means a static memory block that is part of the program, but without its content stored in the executable file. The bss section doesn't generate any code or data, it just reserves uninitialized space. I reserved the last 5 bytes in the zero page area from $fb to, but not including, $100.

include "macros.jasm"
include "c64.jasm"

section bss, "zero page", $fb, $100
{
    reserve word addr
}

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text
    lda #<hello_world_text
    ldy #sizeof(hello_world_text)
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!" }

    include "screen_io.jasm"
}

The reserve statement can reserve one type or an array of them, just like the define statement. The difference is that a reserve statement can't put actual values into anything. Also, you must specify array sizes with a number between the brackets.

The addr constant has no leading dot. This means that it is a global constant. It is accessible from anywhere in the program. Making it global is necessary since it doesn't exist in the same scope as the code that uses it.

Note that the bss section header has an extra value added after the start address. This is the end of the section. If the section grows beyond this value, an error is generated. This is an effective way to keep the variables under control.
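
The bookkeeping behind a bounded bss section can be pictured with a toy model in Python. This is only a sketch of the idea, not how jAsm is implemented. The class and method names are invented for the example.

```python
class BssSection:
    """Toy model of a bss section with a start and an exclusive end."""

    def __init__(self, name, start, end):
        self.name = name
        self.next_free = start
        self.end = end  # first address that is NOT part of the section
        self.symbols = {}

    def reserve(self, symbol, size):
        """Hand out 'size' bytes, or fail if the section would overflow."""
        if self.next_free + size > self.end:
            raise OverflowError(
                f"section '{self.name}' grows beyond ${self.end:x}"
            )
        self.symbols[symbol] = self.next_free
        self.next_free += size
        return self.symbols[symbol]


zero_page = BssSection("zero page", 0xFB, 0x100)
addr = zero_page.reserve("addr", 2)  # like: reserve word addr
print(f"addr = ${addr:02x}, next free = ${zero_page.next_free:02x}")

try:
    zero_page.reserve("too_big", 4)  # only 3 bytes are left
except OverflowError as error:
    print("error:", error)
```

Nothing is written to the output file for these reservations. Only the symbol addresses are produced.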

Now we need to modify the print subroutine to not modify itself and instead use the allocated pointer.

// -> xa: address to text
// -> y: size of text
subroutine print_text
{
    sta addr
    stx addr + 1

    tya
    tax // size left in x
    ldy #0 // pointer offset
    {
        lda (addr),y
        jsr CHROUT
        iny
        dex
        bne @loop
    }
    rts
}

The code became a bit kludgy when swapping registers. It would also be nice to avoid having to specify the length of the string when printing it. We could swap the argument registers, but if we zero terminate the string instead, we can get rid of the size argument altogether.

include "macros.jasm"
include "c64.jasm"

section bss, "zero page", $fb, $100
{
    reserve word addr
}

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text
    lda #<hello_world_text
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!", 0 }

    include "screen_io.jasm"
}

Now the zero is added. Let's update the subroutine.

// -> xa: address to text
subroutine print_text
{
    sta addr
    stx addr + 1

    ldy #0 // pointer offset
    {
        lda (addr),y
        beq @continue
        jsr CHROUT
        iny
        bne @loop
    }
    rts
}

Now that looks better. @continue is another automatic label. It is defined at the closing curly brace of the closest surrounding scope.
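
To follow what the loop does step by step, here is a rough Python simulation of the subroutine. It is a sketch under simplifying assumptions: memory is a plain 64 KB byte array, the zero page pointer sits at $fb, and CHROUT is replaced by collecting characters in a list.

```python
def print_text(memory, a, x):
    """Simulate the zero terminated print loop (a = low byte, x = high byte)."""
    ADDR = 0xFB                     # assumed zero page location of addr
    memory[ADDR] = a                # sta addr
    memory[ADDR + 1] = x            # stx addr + 1
    out = []
    y = 0                           # ldy #0
    while True:
        base = memory[ADDR] | (memory[ADDR + 1] << 8)
        value = memory[(base + y) & 0xFFFF]  # lda (addr),y
        if value == 0:              # beq @continue
            break
        out.append(chr(value))      # jsr CHROUT (stand-in)
        y = (y + 1) & 0xFF          # iny, the register is 8 bits wide
        if y == 0:                  # bne @loop is not taken when y wraps
            break
    return "".join(out)


memory = bytearray(0x10000)
text = b"HELLO WORLD!\x00"
memory[0x0815:0x0815 + len(text)] = text
print(print_text(memory, 0x15, 0x08))
```

The simulation also shows a limit of the loop: since y is an 8-bit register, a string without a terminating zero stops after 256 characters.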

Section Parts

One thing that isn't really great is that it isn't obvious what addr is used for. It would be nice if it was connected to the print subroutine somehow. We can make that connection by creating partial sections in the screen_io.jasm file that add to the sections in the main file. We do that by moving the reserve into the screen_io.jasm file. We also move the include outside the main section, because we can't define a partial section within a section.

This is what main.jasm looks like after the change.

include "macros.jasm"
include "c64.jasm"

section bss, "zero page", $fb, $100
{
}

section code, "main", BASIC_START
{
    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text
    lda #<hello_world_text
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!", 0 }
}

include "screen_io.jasm"

The screen_io.jasm file now needs to define two section parts, one for the zero page reservation and one for the code.

section part, "zero page"
{
    // temporary text address when printing
    reserve word addr
}

section part, "main"
{
    // -> xa: address to text
    subroutine print_text
    {
        sta addr
        stx addr + 1

        ldy #0 // pointer offset
        {
            lda (addr),y
            beq @continue
            jsr CHROUT
            iny
            bne @loop
        }
        rts
    }
}

We now have a kind of module with the print subroutine and its zero page variable. It can sit beside other potential modules in a larger program without overlapping. We don't have to specify a single address and it will be optimally packed together.

Namespaces

What if some other module also wants to have a temporary address called addr? That could be a problem. One solution for this is to put the print related names in a namespace.

We'll enclose the contents of the screen_io.jasm file in a screen namespace.

namespace screen
{
    section part, "zero page"
    {
        // temporary text address when printing
        reserve word addr
    }

    section part, "main"
    {
        // -> xa: address to text
        subroutine print_text
        {
            sta addr
            stx addr + 1

            ldy #0 // pointer offset
            {
                lda (addr),y
                beq @continue
                jsr CHROUT
                iny
                bne @loop
            }
            rts
        }
    }
}

The reference to the print subroutine must now specify the namespace in one way or another. One way would be to explicitly type it in front of the print name like this:

jsr screen::print_text

If print_text is used a lot in one place it is also possible to specify that a namespace should be used in a scope. As long as other names don't start to collide, this is just as good.

include "macros.jasm"
include "c64.jasm"

section bss, "zero page", $fb, $100
{
}

section code, "main", BASIC_START
{
    using namespace screen

    basic_sys_line(2016, .start)

.start:
    ldx #>hello_world_text
    lda #<hello_world_text
    jsr print_text
    rts

    define byte[] hello_world_text = { "HELLO WORLD!", 0 }
}

include "screen_io.jasm"

Modules

A namespace exposes everything to the outside world. Sometimes that's what you want but it could also be nice to control the module's interface. This can be done using modules instead of namespaces. In a module, all global variables are local to the module unless they are marked for export.

In our example, addr doesn't need to be exposed outside, but the print subroutine must be.

module screen
{
    section part, "zero page"
    {
        // temporary text address when printing
        reserve word addr
    }

    section part, "main"
    {
        // -> xa: address to text
        export subroutine print_text
        {
            sta addr
            stx addr + 1

            ldy #0 // pointer offset
            {
                lda (addr),y
                beq @continue
                jsr CHROUT
                iny
                bne @loop
            }
            rts
        }
    }
}

Accessing the print_text subroutine in the module is done exactly the same way it was accessed in the namespace so the rest of the program can be left unchanged.

Debugging in VICE

jAsm can assist debugging in the VICE emulator by exporting the names of addresses for use in the emulator. Add -dv and a filename to the command line arguments to export this information.

jasm-6502 -dv main.vs main.jasm main.prg

Now, a symbol file will be created called main.vs. Let's start the emulator (install it first if you don't have it) and use the file.

x64sc -moncommands main.vs -autostart main.prg

Hello world should be printed on the screen.

Start the monitor (alt-h in Linux) and type d 080d.

(C:$e5d1) d 080d
.C:080d   .sys_address:
.C:080d  A2 08       LDX #$08
.C:080f  A9 15       LDA #$15
.C:0811  20 22 08    JSR .print_text
.C:0814  60          RTS
.C:0815   .hello_world_text:
.C:0815  48          PHA
.C:0816  45 4C       EOR $4C
.C:0818  4C 4F 20    JMP $204F
.C:081b  57 4F       SRE $4F,X
.C:081d  52          JAM
.C:081e  4C 44 21    JMP $2144
.C:0821  00          BRK
.C:0822   .print_text:
.C:0822  85 FB       STA $FB
.C:0824  86 FC       STX $FC
.C:0826  A0 00       LDY #$00
.C:0828  B1 FB       LDA (.addr),Y
.C:082a  F0 06       BEQ $0832
.C:082c  20 D2 FF    JSR $FFD2
.C:082f  C8          INY
.C:0830  D0 F6       BNE $0828
.C:0832  60          RTS
.C:0833  00          BRK
.C:0834  00          BRK
.C:0835  00          BRK
(C:$0836)

You'll get a disassembled listing of the program and some of the labels are visible in the listing! The zero page addresses didn't get a name. That's a limitation in VICE so we can't help that. The CHROUT address didn't get a name either. How come? Well, the constant is only a number and not all numbers should be exported to VICE because those would act as addresses and it would get very confusing. There is a work-around for this. You can explicitly set a value to be an address like this.

const address BASIC_START = $0801
const address BORDER_COLOR = $d020
const address CHROUT = $ffd2

Change the c64.jasm file to this, assemble and restart the emulator.

.C:0822   .print_text:
.C:0822  85 FB       STA $FB
.C:0824  86 FC       STX $FC
.C:0826  A0 00       LDY #$00
.C:0828  B1 FB       LDA (.addr),Y
.C:082a  F0 06       BEQ $0832
.C:082c  20 D2 FF    JSR .CHROUT
.C:082f  C8          INY
.C:0830  D0 F6       BNE $0828
.C:0832  60          RTS

Problem solved!

VICE Breakpoints

To aid debugging you can set breakpoints in your program. This makes it easy to stop the program in a specific subroutine and single step through it. You do this by creating a label with a name that begins with breakpoint. Let's try this. Add a label somewhere in the print_text subroutine, like this.

// -> xa: address to text
subroutine print_text
{
.breakpoint:
    sta addr
    stx addr + 1

    ldy #0 // pointer offset
    {
        lda (addr),y
        beq @continue
        jsr CHROUT
        iny
        bne @loop
    }
    rts
}

The emulator stops almost immediately.

BREAK: 1  C:$0822  (Stop on exec)
#1 (Stop on  exec 0822)  141 016
.C:0822  85 FB       STA $FB        - A:15 X:08 Y:00 SP:f4 ..-.....    3114547
(C:$0822)

You can step through the instructions with the z command in the monitor.

There are two more types of breakpoints. A label beginning with read_breakpoint will stop execution when that memory address is accessed to read. A label beginning with write_breakpoint will stop the execution when that memory address is accessed to write.

Now you know the basics of jAsm and should be able to start experimenting yourself. The language has more to offer and the complete syntax is described in the reference section. Good luck!

Compiling jAsm


Fetching Source Code

You need to fetch the source code from BitBucket to get started. If you have a command line git client you can clone the repository like this.

git clone https://bitbucket.org/bjonte/jasm.git

jAsm compiles using CMake and Clang or using Code::Blocks or Visual Studio.

Compiling Using CMake

You need CMake, Clang and Python3 to build jAsm. Under Ubuntu you can fetch the dependencies like this.

sudo apt-get install cmake clang git python3

Run the following commands from the root of the project.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
sudo make install

Compiling Using IDE

jAsm compiles using Clang under Linux in Code::Blocks. Open the jasm.workspace file, select the release configuration for a desired processor and build.

On Windows, you need Visual Studio 2015 to build. Open jasm.sln, select the release configuration for the desired processor and build.

Starting jAsm


jAsm is a command line tool. It prints a list of its arguments if started without any. At a minimum, it needs an input file and an output file.

jasm-6502 input.jasm output.bin

If you are assembling for Z80, use that version of the assembler instead.

jasm-z80 input.jasm output.bin

There are some flags to tweak how the assembler behaves.

Predefined Constants

You can instruct the assembler to create some initial constants that can be accessed in the source code with the -d flag.

jasm-6502 -d INFINITE_LIVES=true -d STARTING_LIVES=3 -d DEFAULT_NAME=bobo input.jasm output.bin

You can feed it with integers, booleans and strings, like in the example above.

Symbol Dumps

The constants and variables can be written to a file in two formats: a native jAsm format or a format suitable for the VICE emulator. The VICE symbol output can be fed into VICE to get symbol names in the disassembler when debugging your program there.

Dump symbols like this.

jasm-6502 -ds symbols.txt input.jasm output.bin

Dump VICE symbols like this.

jasm-6502 -dv symbols.vs input.jasm output.bin

Binary Header

By default, jAsm outputs only the binary data without any header. To generate a program file for Commodore 64 that can be loaded from BASIC, a two byte header must be added containing the load address in little endian format. You can add this header using -hla (header little endian address).

jasm-6502 -hla input.jasm output.prg
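
The header itself is tiny. As an illustration, this Python sketch prepends a little endian load address to a raw binary. The function name is made up for the example, and the four code bytes are the inc $d020 and rts instructions from the starter guide.

```python
import struct


def add_load_address_header(binary, load_address):
    """Prepend the two byte little endian load address to raw data."""
    return struct.pack("<H", load_address) + binary


code = bytes([0xEE, 0x20, 0xD0, 0x60])  # inc $d020 / rts
prg = add_load_address_header(code, 0x8000)
print(prg.hex(" "))
```

The low byte comes first, so a load address of $8000 is stored as the bytes $00 and $80.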

Include Paths

You can add include paths using the -i flag. jAsm will look in these for included files.

jasm-6502 -i some/dir -i other/dir input.jasm output.bin

Max Errors

With the -me flag, you can specify the number of errors that will be printed before jAsm stops assembling.

jasm-6502 -me 4 input.jasm output.bin

Output Files and Sections

The default output mode merges all code sections into one big binary and pads the space between them with zeros. With the -om flag, this can be changed to store one file per section instead.

jasm-6502 -om input.jasm output.bin
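
The default merging behavior can be pictured with a short Python sketch. It is an illustration of the idea only. The function name and the section contents are invented, and the sketch does not check for overlapping sections.

```python
def merge_sections(sections):
    """Merge (start_address, data) pairs into one zero padded image.

    Returns (start_address_of_image, image_bytes).
    """
    ordered = sorted(sections, key=lambda section: section[0])
    start = ordered[0][0]
    end = max(address + len(data) for address, data in ordered)
    image = bytearray(end - start)  # bytearray starts out zero filled
    for address, data in ordered:
        offset = address - start
        image[offset:offset + len(data)] = data
    return start, bytes(image)


sections = [
    (0x0808, b"\xdd\xee"),      # second section, 2 bytes
    (0x0801, b"\xaa\xbb\xcc"),  # first section, 3 bytes
]
start, image = merge_sections(sections)
print(hex(start), image.hex(" "))
```

The four bytes between the end of the first section and the start of the second end up as zeros in the merged image.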

Verboseness

jAsm supports several levels of output during assembly. This is controlled by the -v0, -v1, -v2 and -v3 flags.

jasm-6502 -v2 input.jasm output.bin

Flag  Meaning
-v0   Show errors
-v1   Show errors and warnings
-v2   Show errors, warnings and general information
-v3   Show errors, warnings, general information and debugging information

Return Codes

jAsm returns with return code 0 for success and non-zero if an error occurred.

Language Reference


This section documents the entire syntax. Have a look at the starter guide first to get a grasp of the basics before digging into this.

Input Format

jAsm uses Unicode utf-8 encoded text files only. If you provide something that can't be interpreted as utf-8, an error will be returned.

Comments

jAsm supports C style single line comments. They span the rest of the line.

lda #0 // this is a comment
rts

Multiline comments are also supported with the same syntax as C. They are started with /* and ended with */.

lda #0 /* this
is a
comment */

A notable difference from C is that jAsm supports nested comments.

/* this is
  /* also */
a comment */

Assembler Instruction Syntax

This documentation doesn't cover the actual instructions, their meaning and so on. You will have to find that elsewhere.

The assembler instructions are entered using lowercase letters. All standard opcodes and addressing modes are supported. Examples:

lda #0
tay
sta $d020
ldx #5
lda ($fb),y
sta $9000,x
jmp $8000

Some assemblers use brackets as expression parentheses to avoid colliding with the indirect addressing modes but jAsm uses normal parentheses for both.

lda ($fb + 1) + 1,y // not indirect addressing
lda ($fb + 1 + 1),y // indirect addressing
jmp ($1fff + 1)*2 // not indirect addressing
jmp (($1fff + 1)*2) // indirect addressing
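
One way to read the four examples is that an operand counts as indirect only when a single pair of parentheses encloses the whole expression. The Python sketch below encodes that reading. It is an interpretation of the examples, not a description of jAsm's parser, and the function name is invented.

```python
def is_indirect(operand):
    """True if one pair of parentheses wraps the whole operand expression."""
    # Cut off an index suffix such as ",y" that sits outside all parentheses.
    depth = 0
    expr = operand
    for i, ch in enumerate(operand):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            expr = operand[:i]
            break
    expr = expr.strip()
    if not expr.startswith("("):
        return False
    # The opening parenthesis must be closed by the very last character.
    depth = 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i == len(expr) - 1
    return False


for operand in ["($fb + 1) + 1,y", "($fb + 1 + 1),y",
                "($1fff + 1)*2", "(($1fff + 1)*2)"]:
    print(operand, "->", "indirect" if is_indirect(operand) else "not indirect")
```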

Instructions end with a new line or a semicolon. You can stack together instructions on one line like this.

inx; inx; inx;

Constants

Named constants can be defined in the source code to replace naked constants. This is encouraged as much as possible since it makes the source code much more readable. Two types exist: labels, which are set to the current program counter, and constant declarations. Labels are names followed by a colon.

setup:

The label will contain the address to the next byte in memory. A label can also be placed between an instruction and its argument to be assigned the address to the instruction argument.

lda value: #0
inc value // increment what is loaded by the previous instruction next time it is executed
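
What the label ends up pointing at can be shown with a few bytes of simulated memory in Python. This is a sketch with an assumed code address of $8000. The opcodes are the standard 6502 encodings for lda immediate ($a9) and inc absolute ($ee).

```python
memory = bytearray(0x10000)
base = 0x8000  # assumed address of the lda instruction

# lda value: #0  assembles to  a9 00
# inc value      assembles to  ee lo hi
value = base + 1  # the label is the address of the operand, not the opcode
program = bytes([0xA9, 0x00, 0xEE, value & 0xFF, value >> 8])
memory[base:base + len(program)] = program

# Effect of executing "inc value" once: the operand byte is incremented.
memory[value] = (memory[value] + 1) & 0xFF

print(memory[base:base + 2].hex(" "))
```

After the increment, the first instruction reads lda #1 the next time it runs.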

On Z80, an instruction has at most two arguments, so in some cases two labels can be defined, one for each instruction argument.

ld hl, index
ld bc, data
ld index:(ix+0), data:0
inc (hl) // increase index in instruction above

Constant declarations look like this.

const NUM_LIVES = 5 // constant declaration

With these defined, we can write more readable assembler code.

lda #NUM_LIVES
jsr setup
rts

Sometimes it is convenient to define constants that are exported as addresses. This is done using the address keyword.

const address SCREEN = $0400

Address constants can be exported to the VICE emulator via a command line option. Labels are automatically marked as addresses since they point into the source code address space but constants do not by default.

Constants and labels can be defined and used in any order, unlike C where everything needs to be declared before use.

Variables

A constant cannot change its value throughout the source.

const A = 4
A = 3 // Error 3017 : Cannot reassign constant.

To do that you need a variable.

var A = 4
A = 3 // it works!

Since the value is dependent on the parsing order (top to bottom), variables must be declared before they are used.

lda #A // Error 3004 : Reference to undefined symbol A
var A = 3

Local vs Global

There are local and global symbols (constants, labels and variables). Local symbols always start with a period character.

const A = 3 // global constant
const .A = 4 // local constant

Symbols cannot be used outside their scope. Global symbols have the entire source code as their scope. Local symbols can only be used inside the closest outer scope, which is delimited by curly braces.

{
    const .A = 1 // a local .A
    {
        lda #.A // this will be 2

        const .A = 2 // another local .A

        lda #.A // this will be 2
    }
    lda #.A // this will be 1
}
lda #.A // Error 3004 : Reference to undefined symbol .A
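
The lookup rule can be modeled as a chain of scopes that is searched from the innermost one outwards. The Python sketch below is a toy model and not jAsm's implementation. The class name is invented. All constants are registered before any lookup, which mirrors the fact that constants can be used before the line that defines them.

```python
class Scope:
    """Toy model of a curly brace scope holding local constants."""

    def __init__(self, parent=None):
        self.parent = parent
        self.constants = {}

    def define(self, name, value):
        self.constants[name] = value

    def lookup(self, name):
        """Search this scope first, then the enclosing scopes."""
        scope = self
        while scope is not None:
            if name in scope.constants:
                return scope.constants[name]
            scope = scope.parent
        raise NameError(f"Reference to undefined symbol {name}")


outside = Scope()             # the code outside the outer braces
outer = Scope(parent=outside)
outer.define(".A", 1)
inner = Scope(parent=outer)
inner.define(".A", 2)

print(inner.lookup(".A"))     # the inner definition shadows the outer one
print(outer.lookup(".A"))
try:
    outside.lookup(".A")
except NameError as error:
    print("error:", error)
```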

Program Counter

The program counter is always represented by an asterisk (*). This can be used to refer to things relative to the current address.

lda #0
inc *-1 // increment the value 0 in the previous instruction
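
Since the example above relies on * being the address of the instruction it is used in, the asterisk can presumably also serve as a jump target. A minimal sketch of an endless loop written without a label:

```jasm
inc $d020
jmp * // jumps to itself forever
```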

Automatic Scope Variables

Every scope has automatic local variables generated to simplify loop constructs in the code without the need to make up names for the loop labels. At the start of the scope @loop is created and at the end @continue is created. This makes it possible to loop and exit the loop without explicitly creating labels.

{
    lda text,x
    beq @continue // break out if zero is found
    jsr CHROUT
    inx
    bne @loop // loop back to the scope start
}
rts

Dynamically Created Symbol Names

Constant and variable names can be constructed by an expression if the dynamic keyword is used when defining the symbols.

const dynamic "begin" + "end" = 100
lda #beginend // this will be 100

This can be used together with the symbol function to create and access dynamically generated symbols.

repeat 3
{
    const .addr = symbol("data" + string(@i))
    lda #<.addr
    ldx #>.addr
    jsr print_text
}
rts

define byte[] data0 = { "one", 0 }
define byte[] data1 = { "next", 0 }
define byte[] data2 = { "last", 0 }

Namespaces

Global symbols have a tendency to collide name-wise with each other in large programs, especially if there are several people working on the same code. jAsm supports namespaces to reduce these problems. For example, setup is a common name and different systems may provide their own setup. If the symbols are placed in their own namespaces they can coexist.

namespace random
{

setup:
    // initialize randomizer
    rts

}

namespace raster
{

setup:
    // initialize raster system
    rts

}

Outside the namespaces you need to specify which setup you are referring to. This is done like this.

jsr random::setup
jsr raster::setup

Global symbols are fetched relative to the current namespaces. Sometimes you need to use absolute namespace references to resolve ambiguity. This starts with ::.

const A = 1
namespace a
{
    const A = 2
}
namespace b
{
    const A = 3

    lda #::A // this is 1
}

Namespaces can be nested to form deeper namespaces.

namespace a
{
    namespace b
    {

    }
}
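
Symbols in a nested namespace are presumably reached by chaining the namespace names with ::, in the same way a single namespace is specified above. The following is a sketch under that assumption. The names video, sprites and init are made up for this example.

```jasm
namespace video
{
    namespace sprites
    {
    init:
        // initialize sprites
        rts
    }
}

jsr video::sprites::init
```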

You can declare that you will be using a namespace in a scope to avoid having to specify it every time you reference it.

namespace system_a
{
    const A = 0
}

namespace system_b
{
    using namespace ::system_a
    const B = A // this is ok because we are now using namespace system_a as well
}

Note that this doesn't resolve ambiguity so, in some cases, you may still need to specify the absolute namespace for symbols.

Modules

Namespaces encapsulate global symbols in their own space but expose all of them outside the namespace. Modules exist to solve the problem of exposing only selected symbols.

All global symbols are by default private to the module. The keyword export is used to make symbols accessible from outside the module.

module system
{
    const A = 0
    export const B = 1
}

const C = system::A // Error 3004: Reference to undefined symbol A
const D = system::B // this is ok because B has been exported

Place the export keyword before a statement that creates a symbol to export it.

export subroutine func
{
    export lbl:

    export define byte number = 3

    export macro exit()
    {
        rts
    }
}

A module can also declare that it needs values from the user of the module. This is done with the import keyword at the beginning of the module definition.

module system
    import color
{
    subroutine init
    {
        lda #color
        sta $d020
        rts
    }
}

The value system::color must be assigned exactly once, somewhere in the source.

const BLACK = 0
system::color = BLACK

Import several variables like this.

module system
    import color1, color2
{
}
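
Put together, a module with two imports and its user might look like the sketch below. It assumes each imported value is assigned in the same way as the single import shown earlier. The color values are arbitrary.

```jasm
module system
    import color1, color2
{
    subroutine init
    {
        lda #color1
        sta $d020 // border color
        lda #color2
        sta $d021 // background color
        rts
    }
}

system::color1 = 0 // black border
system::color2 = 6 // blue background
```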

Declaring Symbols

Local symbols cannot be accessed outside their scope and sometimes they need to be defined in an inner scope. This is common when using self modifying code. In the following example, it isn't possible to access .char outside the loop.

const address SCREEN = $0400

lda #0
sta .char // Error 3004 : Reference to undefined symbol .char
ldx #NUM_ELEMENTS - 1
{
    lda .char: #0 // address to the value to be loaded into the accumulator
    sta SCREEN,x
    inc .char
    dex
    bpl @loop // loops back to enclosing scope beginning
}

This problem can be solved by declaring the symbol .char in the outermost scope where it needs to be accessed.

const address SCREEN = $0400

declare .char // declaring .char to be used in this scope
lda #0
sta .char // works!
ldx #NUM_ELEMENTS - 1
{
    lda .char: #0 // define the value of the declared symbol
    sta SCREEN,x
    inc .char
    dex
    bpl @loop
}

This technique can be used with any type of local symbol to move its scope.

Expressions

Expressions in jAsm are similar to expressions in C. They can contain assignments and assignments return the value assigned. This has the side effect that you can do multiple assignments.

var a = 0
var b = 1
a = b = 2

Normal parentheses are always used in expressions, not brackets.

Operators

jAsm supports a number of operators, similar to C but not exactly the same. The operators are, in the order of precedence:

Operator  Type                                Example
()        call operator                       mac(a,b)
[]        array indexing operator             a[3]
.         property operator                   a.length
++        postfix increment                   a++
--        postfix decrement                   a--
++        prefix increment                    ++a
--        prefix decrement                    --a
!         boolean not                         !a
~         bitwise not                         ~a
+         unary addition                      +3
-         unary subtraction                   -3
<         unary low byte                      <a
>         unary high byte                     >a
*         multiplication                      a * b
/         division                            a / b
+         addition                            a + b
-         subtraction                         a - b
<<        left shift                          a << 1
>>        right shift                         a >> 1
<         less than comparison                a < b
>         greater than comparison             a > b
<=        less or equal comparison            a <= b
>=        greater or equal comparison         a >= b
==        equal comparison                    a == b
!=        not equal comparison                a != b
&         bitwise and                         a & b
^         bitwise exclusive or                a ^ b
|         bitwise or                          a | b
&&        boolean and                         a && b
||        boolean or                          a || b
=         assignment                          a = 1
+=        add and assign                      a += 1
-=        subtract and assign                 a -= 1
*=        multiply and assign                 a *= 2
/=        divide and assign                   a /= 2
&&=       boolean and, and assign             a &&= b
||=       boolean or, and assign              a ||= b
&=        bitwise and, and assign             a &= b
|=        bitwise or, and assign              a |= b
^=        bitwise exclusive or, and assign    a ^= b
<<=       left shift, and assign              a <<= 1
>>=       right shift, and assign             a >>= 1
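
The precedence order matters most when mixing shifts with arithmetic or bitwise operators. The sketch below reads the list above as a strict precedence order, so addition binds tighter than shifting and shifting binds tighter than bitwise or. The constant names are made up for this example.

```jasm
const A = 1 << 2 + 1 // 8, because + binds tighter than <<
const B = (1 << 2) + 1 // 5
const MASK = 1 << 3 | 1 << 5 // 40, because << binds tighter than |
```

When in doubt, parentheses make the intended grouping explicit.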

Statements

Statements are blocks of code that control the generation of instructions or change the assembler state. Statements can optionally be separated by a semicolon, just like in C. Newline characters only matter after instructions where the assembler cannot otherwise tell whether an argument follows (rol, for example, can be written with or without an address). In all other cases newlines are completely ignored. The following is valid in jAsm.

const
a
=
1
;

const b = 2 const c = 3

In some cases you may be confused by the greedy parser which tries to include as much as possible in the current statement. Look at this.

var a = 0
var b = 1 + a
++a // Error 3044 : Expression must have side effect.

The parser tries to include as much as possible in the variable declaration for b. The ++ operator is applied to the a in the second line which leaves a single a in the third line. The assembler tries to be helpful by pointing out that the result is meaningless. This case needs to be resolved by a semicolon to separate the statements.

var a = 0
var b = 1 + a;
++a // ok!

Data Types

jAsm has a couple of built in data types.

Boolean Type

Booleans can only be either true or false. Comparison operators return boolean values. They are well suited for conditional assembly.

const USE_DEBUG_OUTPUT = true
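
Since comparison operators return booleans, the result of a comparison can be stored in a constant and used to select code. A small sketch using the if statement described under Conditional Assembly. The names SCREEN_COLUMNS and WIDE_SCREEN are made up for this example.

```jasm
const SCREEN_COLUMNS = 40
const WIDE_SCREEN = SCREEN_COLUMNS > 40 // false

if (WIDE_SCREEN)
{
    inc $d020 // only assembled when WIDE_SCREEN is true
}
```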

Integer Type

Integer numbers are 32 bit signed numbers in the range [-2147483648, 2147483647].

const NUMBER = 123

Floating Point Type

Floating point numbers are 64 bit signed numbers with decimal points. They can represent large numbers and precision increases closer to zero. The largest number is roughly 10^308. The smallest positive number is roughly 10^-308.

const NUMBER1 = 123.0
const NUMBER2 = 0.0
const NUMBER3 = -1e-50

String Type

Strings are quoted text. The characters are stored as wide characters (32 bits on Linux and 16 bits on Windows), giving you a large selection of characters to choose from. This is why UTF-8 is used as the file format for source code.

const STRING = "Hello"

There are a number of special characters that can be encoded in strings using a special backslash syntax. Whenever a backslash is encountered in a string the next character is checked to see if a special character should be used.

Code Result
\\ The backslash character itself
\t The horizontal tab character (9)
\n The newline character (10)
\r The carriage return character (13)
\0 The null character (0)
\' The single quote character
\" The double quote character

The 6502 version of jAsm supports conversion of the ASCII characters that can be mapped to PETASCII and screen codes for easy use on Commodore computers. Strings can be prefixed with p, P, s or S to specify the conversion. Without conversion the string is in Unicode format.

const STRING1 = p"Hello" // a string using the lowercase PETASCII character set
const STRING2 = P"HELLO" // a string using the uppercase PETASCII character set
const STRING3 = s"Hello" // a string using the lowercase PETASCII screen codes
const STRING4 = S"HELLO" // a string using the uppercase PETASCII screen codes

There is no character type in jAsm. Each character in a string is considered an integer Unicode character. Specifying a character is done like this.

const CHAR = 'A'

A character can be converted just like strings using the conversion prefix on the 6502 version of jAsm.

const CHAR1 = p'A' // a character using the lowercase PETASCII character set
const CHAR2 = P'A' // a character using the uppercase PETASCII character set
const CHAR3 = s'A' // a character using the lowercase PETASCII screen codes
const CHAR4 = S'A' // a character using the uppercase PETASCII screen codes

The backslash has the same special meaning when specifying a single character as it has in strings. This can be used to specify the single quote character itself for example.

const QUOTE_CHAR = '\''

The following operators and methods are supported by the string type.

+
    Argument types: string
    Description: Returns left and right side strings concatenated.
    Example:
        const a = "Commodore" + "64" // "Commodore64"

[index]
    Argument types: integer
    Description: Returns the character at (zero based) index.
    Example:
        const a = "Commodore64"
        lda #a[1] // 'o'

length
    Description: Returns the length of the string.
    Example:
        const a = "Commodore64"
        lda #a.length // 11

substring(start, length)
    Argument types: integer
    Description: Returns the part of the string starting at (zero based) start and spanning length characters. The range can be partly outside the string and the result will be the intersection of the string and the range.
    Example:
        const a = "Commodore64"
        const b = a.substring(3, 4) // "modo"

List Type

The list type can hold a collection of values with different types. A list is created using the list function. This constructs a list containing the arguments.

const PRIMES = list(2, 3, 5, 7, 11)

The following operators and methods are supported by the list type.

+
    Argument types: list
    Description: Concatenates two lists.
    Example:
        const a = list(1, 2)
        const b = list(3, 4)
        a + b // [1, 2, 3, 4]

+=
    Argument types: list
    Description: Concatenates two lists.
    Example:
        var a = list(1, 2)
        a += list(3, 4)
        a // [1, 2, 3, 4]

[index]
    Argument types: integer
    Description: Returns the item at (zero based) index in the list.
    Example:
        const a = list(5, 6, 7)
        a[1] // 6

push(x)
    Argument types: any
    Description: Adds x to the end of the list and returns the list.
    Example:
        var a = list(1, 2, 3)
        a.push(4) // [1, 2, 3, 4]

pop()
    Description: Removes the last element in the list and returns the list.
    Example:
        var a = list(1, 2, 3)
        a.pop() // [1, 2]

insert(position, value)
    Argument types: integer, any
    Description: Inserts value at zero based index position and returns the list.
    Example:
        var a = list(1, 2, 3)
        a.insert(1, 99) // [1, 99, 2, 3]

erase(position)
erase(position, length)
    Argument types: integer, integer
    Description: Erases the part of the list defined by position and the optional length argument. Specifying only the position will erase one element. The list is returned. The range can be partly outside the list.
    Example:
        var a = list(1, 2, 3, 4)
        a.erase(1, 2) // [1, 4]
        a.erase(0) // [4]

keep(position)
keep(position, length)
    Argument types: integer, integer
    Description: Erases everything except the part of the list defined by position and the optional length argument. Specifying only the position will keep one element. The list is returned. The range can be partly outside the list.
    Example:
        var a = list(1, 2, 3, 4)
        a.keep(1, 2) // [2, 3]
        a.keep(0) // [2]

clear()
    Description: Clears the list and returns it.
    Example:
        var a = list(1, 2, 3)
        a.clear() // []

empty
    Description: Returns true if the list is empty, otherwise false.
    Example:
        list(2, 4, 8).empty // false
        list().empty // true

length
    Description: Returns the number of elements in the list.
    Example:
        const a = list(2, 4, 8)
        lda #a.length // 3

Passing Values

Values are always passed by value, never by reference. Every time you assign a value to some other variable, a copy is made. This makes it possible to assign constants to variables and variables to constants without problems. Values passed as macro arguments will be copied before executing the macro body as well.

const aa = list(1, 2, 3)
var bb = aa
bb.pop()
print("{} {}\n", aa, bb) // [1, 2, 3] [1, 2]

There are no references or pointers that can modify the data unexpectedly.

Type Conversions

There are a number of functions in the root namespace dedicated to converting between the built-in types.

Function Accepted input types Description Examples
int(value) numeric Strips decimal part from a value and converts it to an integer. int(5) // 5
int(5.8) // 5
float(value) numeric Converts value into a floating point value. float(5) // 5.0
float(5.5) // 5.5
string(value) numeric Converts value into a readable string. string(5) // "5"
string(5.5) // "5.500000"
hexstring(value) int Converts value into a readable hexadecimal string. hexstring(100) // "64"

Memory Storage Types

There are specific byte, word and long types for memory storage. They are used when reserving or defining data to include in the assembler program.

define byte = 5

The memory storage data types store negative values as signed values and positive values as unsigned.
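
A short sketch of what this means in practice, assuming negative numbers are stored in two's complement form (which matches the byte(-1) result shown under Storage Conversions):

```jasm
define byte = 255 // stored as $ff
define byte = -1 // also stored as $ff
define word = -2 // stored as $fffe
```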

Storage Conversions

There are a number of functions in the root namespace dedicated to converting between the memory storage type ranges.

Function Accepted input types Description Examples
byte(value) numeric Returns value truncated to integer and with the number of bits reduced to 8. byte(257) // 1
byte(-1) // 255
word(value) numeric Returns value without decimal part and with the number of bits reduced to 16. word(128.3) // 128
long(value) numeric Returns value without decimal part and with the number of bits reduced to 32. long(1000000) // 1000000
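
One use for these functions is to split a 16 bit value into bytes by hand. The following sketch combines byte with the right shift operator. The constant ADDRESS is made up for this example.

```jasm
const ADDRESS = $1234

lda #byte(ADDRESS) // $34, the low 8 bits
ldx #byte(ADDRESS >> 8) // $12, the high 8 bits
```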

Math Functions

jAsm provides a number of mathematical functions in the root namespace.

Function Accepted input types Description Examples
abs(x) numeric Returns the absolute value of x. abs(-10) // 10
acos(x) numeric Returns arc cosine of x in radians. acos(-1) // 3.1415926536
asin(x) numeric Returns arc sine of x in radians. asin(-1) // -1.5707963268
atan(x) numeric Returns arc tangent of x in radians. atan(1) // 0.7853981634
atan2(y, x) numeric, numeric Returns arc tangent of y/x in radians. atan2(1, 1) // 0.7853981634
ceil(x) numeric Returns x after rounding it up to the closest integer. ceil(0.1) // 1.0
ceil(-0.1) // 0.0
clamp(t, a, b) numeric, numeric, numeric Returns t clamped to the range [a..b]. clamp(0.1, 1.0, 2.0) // 1.0
clamp(5, 0, 10) // 5
cos(x) numeric Returns cosine of x radians. cos(PI) // -1.0
cosh(x) numeric Returns hyperbolic cosine of x radians. cosh(PI) // 11.591953276
degrees(x) numeric Returns radian angle x in degrees. degrees(PI) // 180.0
exp(x) numeric Returns e^x. exp(1) // 2.7182818285
floor(x) numeric Returns x after rounding it down to the closest integer. floor(0.9) // 0.0
floor(-0.9) // -1.0
lerp(t, a, b) numeric, numeric, numeric Linearly interpolate a value between [a..b] using t [0..1] where 0 returns a and 1 returns b. t can also be outside the [0..1] range. lerp(0.5, 0.0, 10.0) // 5.0
lerp(-1, 0, 10) // -10
log(x) numeric Returns the natural logarithm of x. log(10) // 2.302585093
log10(x) numeric Returns the base-10 logarithm of x. log10(100) // 2.0
max(a, ...) numeric Returns the largest of the arguments. max(2, 4) // 4
max(2, 4.0) // 4.0
min(a, ...) numeric Returns the smallest of the arguments. min(2, 4) // 2
min(2, 4.0) // 2
pow(a, b) numeric Returns a^b. pow(2, 4) // 16.0
radians(x) numeric Returns angle x in radians. radians(90) // 1.570796327
round(x) numeric Returns x after rounding it to the closest integer. round(0.9) // 1.0
round(0.1) // 0.0
sin(x) numeric Returns sine of x radians. sin(PI) // 0.0
sinh(x) numeric Returns hyperbolic sine of x radians. sinh(PI) // 11.548739357
sqrt(x) numeric Returns the square root of x. sqrt(16) // 4.0
tan(x) numeric Returns the tangent of an angle of x radians. tan(1.0) // 1.5574077247
tanh(x) numeric Returns the hyperbolic tangent of an angle of x radians. tanh(1) // 0.761594156

Math Constants

jAsm has a couple of predefined constants in the root namespace.

Constant Value
E 2.718281828459045
PI 3.141592653589793
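
The math functions combine well with loops to generate lookup tables at assembly time. The following is a sketch of a sine table, assuming that integers and floating point numbers can be mixed freely in an expression. It uses the repeat loop and its @i counter, described under Loops. The names STEPS and sine_table are made up for this example.

```jasm
const STEPS = 16

sine_table:
repeat STEPS
{
    // one full period scaled to [-127, 127], stored as bytes
    define byte = byte(round(127 * sin(@i * 2 * PI / STEPS)))
}
```

Negative results are reduced to 8 bits by byte, in the same way as the byte(-1) example under Storage Conversions.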

Print and Formatting

There are two functions dedicated to formatting and printing, format and print. Both use the same arguments but format returns the result and print outputs it. Let's take format as the example.

const WIDTH = 40
format("width: {}", WIDTH) // returns "width: 40"

The first argument is the format string that describes the output format. Each pair of curly brackets in the format string inserts the next argument to the function, as a string, where the brackets are.

const WIDTH = 40
const HEIGHT = 25
format("width: {}, height: {}", WIDTH, HEIGHT) // returns "width: 40, height: 25"

It is possible to control the alignment of the injected text using a format specifier inside the curly brackets.

const WIDTH = 40
format("width: {L4}", WIDTH) // returns "width: 40  "
format("width: {R4}", WIDTH) // returns "width:   40"

When formatting integers you can control the minimum number of digits used.

const WIDTH = 40
format("width: {D4}", WIDTH) // returns "width: 0040"

Integers can also be formatted as hexadecimal numbers.

const WIDTH = 40
format("width: {X4}", WIDTH) // returns "width: 0028"

Floating point numbers will by default be displayed as a short representation of either fixed-point or scientific notation.

format("{}", 0.001) // returns "0.001"
format("{}", 0.0000001) // returns "1e-07"

It is possible to force fixed-point to be used with a specific number of decimal digits.

format("{F4}", 0.0000001) // returns "0.0000"
format("{F4}", 1.23) // returns "1.2300"
format("{F4}", 10) // returns "10.0000"

Alignment and number formatting specifiers can be combined.

format("{R8F4}", 1.23) // returns "  1.2300"

To print an opening curly bracket, prefix it with a backslash.

format("\{{}}", 1.23) // returns "{1.23}"

format
    Argument types: string, ...
    Description: Returns a string with the additional arguments injected into the format string argument.
    Example:
        format("Commodore{}", 64) // "Commodore64"

print
    Argument types: string, ...
    Description: Prints a string with the additional arguments injected into the format string argument.
    Example:
        print("Commodore{}", 64) // Commodore64
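
A typical use of print is to report values while assembling. The sketch below combines a hexadecimal and a default specifier in one format string. The constant names are made up for this example, and the dollar sign is ordinary text in the format string.

```jasm
const address SCREEN = $0400
const COLUMNS = 40

print("screen at ${X4}, {} columns\n", SCREEN, COLUMNS) // screen at $0400, 40 columns
```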

Symbol Functions

Function Accepted input types Description Examples
symbol(s) string Returns the value of the symbol whose name is the string s. const .a = 5; symbol(".a") // 5

Asserts

jAsm supports static asserts to help improve the robustness of your programs. Use those to verify limitations in your program. The following example shows a common use case.

subroutine object_offset
{
    lda object_index
    static_assert(OBJECT_SIZE == 8, "This code only supports object sizes of 8")
    asl
    asl
    asl
    tax
    rts
}

The first argument is a boolean expression. If this evaluates to false, the assembler will generate an error and print the string in the second argument.
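
Asserts can also guard data sizes. The following sketch checks that a table can be indexed with an 8 bit register, using the sizeof function described under Defining Data. The names table and lookup are made up for this example.

```jasm
define byte[] table = { 10, 20, 30, 40 }

subroutine lookup
{
    static_assert(sizeof(table) <= 256, "table must be small enough to index with x")
    lda table,x
    rts
}
```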

Sections

Code Sections

To output anything, a jAsm source file needs to contain a code section. Here is a simple example program that changes the border color on a Commodore 64 and returns.

section code, "main", $8000
{
    inc $d020
    rts
}

A section has a unique name, a start address and an optional end address. The name is used to name the output files when using the command line option to write one file per section. The filenames will consist of the output name specified on the command line, concatenated with an underscore and the section name. This way, each filename will be unique.

An end address can be specified after the start address. This will enforce that the code within the section actually fits within it. If it overflows, the assembler exits with an error.

section code, "main", $8000, $9000
{
    inc $d020
    rts
}

Sections in Sections

Sections can be placed within sections. This is useful in two cases: 1) to store relocated code and 2) to output the size and placement of code.

In the following example the inner section is stored within the outer section (which starts at $8000) but is assembled as if it were located at address $9000. Copying the code from where it is stored to $9000 therefore makes it run correctly. This will only create one single tight code section in the output, even if jAsm is configured to output one file per section, because that option only affects the outermost sections.

section code, "main", $8000
{
    // move the code to the proper location
    ldx #end - start - 1
    {
        lda start,x
        sta target,x
        dex
        bpl @loop
    }
    jmp target

start:
    section code, "reloc", $9000
    {
    target:
        inc $d020
        jmp target
    }
end:
}

If jAsm is started with the -v2 flag, the output will print the sections like this.

$8000 - $8014 ($0014) code: main
  $9000 - $9006 ($0006) code: reloc

The following example measures the size of a piece of code.

const address CHROUT = $ffd2

section code, "main", $8000
{
    ldx #0
    {
        lda str,x
        jsr CHROUT
        inx
        cpx #sizeof(str)
        bne @loop
    }
    rts

    // measure the size of string data
    section code, "string", *
    {
        define byte[] str = {
            P"LONG STRING DATA STORED HERE... ",
            P"NO ONE KNOWS WHERE IT ENDS..."
        }
    }
}

The asterisk represents the current program counter value and this relocates the section to the address it is already at, thus it only affects the assembler information output. If jAsm is started with the -v2 flag, the output will print the sections like this.

$8000 - $804a ($004a) code: main
  $800e - $804a ($003c) code: string

BSS Sections

BSS sections are used to reserve memory for variables in your assembler program. This section type doesn't output anything, it just keeps track of a program counter to measure the size of reserved space. It isn't possible to place instructions or other data generating statements in a bss section. Reservation of space is done with the reserve statement.

section bss, "variables", $9000
{
    reserve byte num_lives
    reserve byte num_boosts
}
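
The reserved names act as labels that code in other sections can refer to. A sketch of a complete program using the reservations above, with initial values chosen arbitrarily:

```jasm
section code, "main", $8000
{
    lda #3
    sta num_lives // label created by the reserve statement below
    lda #0
    sta num_boosts
    rts
}

section bss, "variables", $9000
{
    reserve byte num_lives
    reserve byte num_boosts
}
```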

Section Parts

It is possible to add to an existing section later in the source code using section parts.

section code, "main", $8000
{
    nop
}

section part, "main"
{
    rts // some more code
}

A section part refers to the name of a previously defined section and adds its contents to that section. This can be used to create single file modules with code and variable reservations for specific systems. A main file can create empty sections for zero page variables, code and variables and then include a number of modules. The modules add to these sections to form a complete program.

Section Part Mapping

It is possible to name a module's sections using generic names like "code", "variables" and "zero page" and still have the power to map these to more specific section names in a main program.

Let's say that the main program defines two locations for variable storage.

section bss, "low variables", $1000, $1100
{
}

section bss, "high variables", $2000, $2200
{
}

A generic module can reserve variable storage like this.

section part, "variables"
{
    reserve byte lives
}

The main program can then include the generic module inside a section remap like this to get the module's variables stored in the low variable section.

section mapping
    "variables" = "low variables"
{
    include "module.jasm"
}

Building ROM Images

Sections can be used to build large cartridge images with banks. Do that by creating an outer section for all the banks and one inner section per bank.

section code, "main", 0 // start address will not be used
{
    section code, "image_1", $e000, $10000
    {
        // code here
    }
    section code, "image_2", $e000, $10000
    {
        // code here
    }
    // more sections
}

Building Overlayed Code Sectors

Sections can also be used when building a game that streams code from disk at runtime. Each streaming code sector gets its own section and the command line option -om is used to output one file per section. If the same code files are used in several streaming code sectors, you use namespaces to keep them apart.

const address PROGRAM_START = $1000

section code, "main", PROGRAM_START
{
    // code here
}
section bss, "streaming_buffer", *
{
    // Reserve space for the streaming buffer. The size corresponds to the largest of the sectors.
    reserve byte[max(sector_1::end - sector_1::start, sector_2::end - sector_2::start)] buffer
}

section code, "sector_1", buffer
{
    namespace sector_1
    {
    start:
        // code here
    end:
    }
}

section code, "sector_2", buffer
{
    namespace sector_2
    {
    start:
        // code here
    end:
    }
}

Conditional Assembly

Code blocks can be selected or rejected with the if statement.

if (USE_FEATURE)
{
    jsr feature_update
}

The parentheses must contain a boolean expression to evaluate whether the code will be used or not. Two different code blocks can be selected in a mutually exclusive fashion, using the if-else statement.

if (USE_FEATURE)
{
    jsr feature_update
}
else
{
    jsr featureless_update
}

You can choose to use or reject large blocks of code, even entire sections if needed.

Sometimes you need to select between more than two options. This is what the if-elif-else statement does.

if (USE_FEATURE_1)
{
    jsr feature1_update
}
elif (USE_FEATURE_2)
{
    jsr feature2_update
}
else
{
    jsr feature3_update
}

Include Source

Large programs may need to be separated into several files. You can include other source files in a source file using the include statement.

include "some_dir/some_file.jasm"

This will act as if all the text in some_file.jasm was pasted over the include statement. Files will be searched for in the current directory first, and then all additional include directories specified by command line options.

Include Data

Data like pictures, sprites and character sets can be included in a code section, to be accessible from code. Use the incbin statement for that.

incbin "some_dir/some_file.bin"

The assembler will look in the current directory first and then all additional include directories specified by command line options.

You can add an optional byte offset into the file where to start reading.

incbin "some_dir/some_file.bin", 2

This will skip the first two bytes of the file. It is also possible to set a max size to read.

incbin "some_dir/some_file.bin", 2, 4

This will read at most 4 bytes from offset 2 in the specified file.
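
A common use of the offset is to skip the two byte load address at the start of a Commodore .prg file. In the sketch below, the file name sprites.prg and the label sprite_data are made up for this example.

```jasm
section code, "main", $2000
{
sprite_data:
    incbin "sprites.prg", 2 // skip the two byte load address
}
```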

Defining Data

You can define data to be included in a code section using the define statement.

define byte max_lives = 3

This adds a single byte with the value 3 and creates a label max_lives pointing to it. All storage types can be used in the define statement. You can also create arrays of data.

define word[] pointers = {
    ptr1, ptr2, ptr3
}

You always need to provide curly braces when defining arrays. This is also true if you are defining strings.

define byte[] str = { "HELLO" }

It is possible to specify the size of the array, to verify that the number of elements match.

define word[NUM_POINTERS] pointers = {
    ptr1, ptr2, ptr3
}

If NUM_POINTERS doesn't match the number of pointers defined, an error will be returned.

pointers acts like a label but you can also index into the array using the array operator [].

lda pointers[1]
sta low_byte
lda pointers[1] + 1
sta high_byte

In the code example above, the array index starts at zero so the second pointer was fetched and stored.

Another way to index is to use the offsetof function. It will return the offset in bytes from the beginning of the array.

lda pointers + offsetof(pointers[1])
sta low_byte
lda pointers + offsetof(pointers[1]) + 1
sta high_byte

This isn't more convenient in this case but there are cases when determining the offset is useful.

Another handy function operating on defined data is sizeof. It returns the size in bytes of the consumed space.

ldx #sizeof(pointers) // 6

offsetof(x)
    Accepted input types: offset type
    Description: Returns the offset in bytes from the beginning of defined or reserved data to x.
    Example:
        define byte[] ints = { 1, 2, 3 }
        lda #offsetof(ints[2])

sizeof(x)
    Accepted input types: offset type
    Description: Returns the size in bytes of x.
    Example:
        define byte[] ints = { 1, 2, 3 }
        lda #sizeof(ints)

It is also possible to define data without specifying a name.

define byte = 3
define byte[] = { 1, 2, 3 }

The define statement can also be used to fill a larger memory block with values without specifying each value if they follow a pattern. This will generate 100 bytes of zeroes.

define byte[100] = { 0, ... }

This can also fill using a more complex pattern like this:

define byte[100] = { "HELLO WORLD!", ... }

Reserving Space

In bss sections you can allocate space for variables in your program. You use the reserve statement to do that.

reserve byte lives

This will reserve one byte for lives and create a label to the memory address.

You can reserve an array of a storage data type as well.

reserve long[16] coordinates

It is also possible to reserve space without specifying a name. In this case you will need to provide a semicolon to signal that there will be no name following the type.

reserve byte;
reserve byte[3];

The sizeof and offsetof functions can also be used on reserved memory labels, just like for defined data.
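
A sketch of both functions applied to a reserved array, following the same pattern as for defined data. The name positions is made up for this example.

```jasm
section bss, "variables", $9000
{
    reserve word[4] positions
}

section code, "main", $8000
{
    lda positions + offsetof(positions[2]) // first byte of the third word
    ldx #sizeof(positions) // 8
    rts
}
```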

Subroutines

To create a subroutine you really only need to place a label somewhere and jump to it. jAsm allows you to express it a bit more explicitly using the subroutine keyword.

// -> a: the value to multiply
// <- a: the result
// <> x: preserved
// <> y: preserved
subroutine multiply_by_8
{
    asl
    asl
    asl
    rts
}

The label multiply_by_8 is automatically created.
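
The subroutine is then called like any other label:

```jasm
lda #5
jsr multiply_by_8 // a is now 40
```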

Enumerations

jAsm supports enumerated constants. It is a simplified way of assigning a series of numbers without specifying each number. It makes it easier to insert a value in the middle without re-enumerating all following numbers.

enum pause_menu
{
    continue,
    options,
    exit
}

In this example, continue will contain 0, options 1 and exit 2. You access the values using the pause_menu enum like this.

ldx #pause_menu.continue
jsr draw_menu_option

The first enum value is by default 0, but any of the values can be explicitly specified like this.

enum pause_menu
{
    continue = 1, // 1
    options, // 2
    exit = 10 // 10
}

Enum values can be specified relative to other values as well.

enum device
{
    joy1,
    joy2,
    paddle1 = device.joy1,
    paddle2 = device.joy1,
    paddle3 = device.joy2,
    paddle4 = device.joy2
}
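
Because enumeration starts at 0 by default, a trailing entry can double as an element count. The sketch below assumes that an enum value can be used as an array size in a define statement. The enum color and the brightness values are made up for this example.

```jasm
enum color
{
    black,
    white,
    red,
    count // 3, the number of colors above
}

// the size check fails if a color is added without a brightness value
define byte[color.count] brightness = { 0, 15, 5 }
```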

Loops

Sometimes it is necessary to write a lot of repetitive code. jAsm supports loops for this purpose.

For Loop

The for loop is a general form of loop that can be used in a large variety of situations. It is very similar to the C for-loop with a tighter set of options.

for(var .i = 0; .i < 5; ++.i)
{
    nop
}

This creates five nop instructions. The for loop starts with an optional variable declaration, followed by a required ending condition expression and ends with an optional variable modification expression. The loop takes another pass as long as the ending condition expression evaluates to true.
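
The loop variable can be used in expressions inside the loop body. The following sketch, meant to be placed in a code section, generates a table with the start address of each of the 25 text rows on the Commodore 64 screen. The variable name .row is made up for this example.

```jasm
const address SCREEN = $0400

for(var .row = 0; .row < 25; ++.row)
{
    define word = SCREEN + .row * 40
}
```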

Range-based For Loop

A more specific form of for loop exists which conveniently iterates over lists.

const .bits = list(0, 1, 2, 4, 5, 7)

for(var .b in .bits)
{
    define byte = 1 << .b
}

This creates six mask bytes according to the bits in the list. Inside the loop the special variable @i is set to the zero-based index of the current value in the list.

const .names = list("PICTURE", "GAME", "LEVEL1", "LEVEL2")

for(var .name in .names)
{
    define byte[] = { string(@i), " ", .name, 0 }
}

This will generate data for filenames where each name begins with a number and a space character before the descriptive name.

When the loop starts, a copy of the list will be made. The iteration is done over the copy to avoid problems where the list is accidentally modified inside the loop.

Repeat Loop

The repeat loop is a simplified version of the for loop. It can only repeat itself a fixed number of times and doesn't use a complex exit condition expression. It generates an automatic local label @i as a zero-based loop iteration counter.

repeat 5
{
    define byte = @i
}

This defines numbers 0, 1, 2, 3 and 4 in memory.
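
The counter can also take part in larger expressions. A small sketch that defines the bytes 0, 10, 20 and 30:

repeat 4
{
    define byte = 10 * @i
}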

Macros

Macros are a way to generate adaptable and reusable code blocks. A macro is a function type object which generates its contents where it is invoked. This is much like an inline function or a template function in C++.

macro memset(.addr, .size)
{
    ldx #.size - 1
    {
        sta .addr,x
        dex
        bpl @loop
    }
}

This is a simple macro to generate a loop to clear a block of memory. The arguments are put into the local constants .addr and .size when the macro is invoked.

lda #0
memset(data, sizeof(data))

// ...

reserve byte[55] data

A powerful feature is that the macro can change its behavior based on its arguments. What if the size to clear is 2? A loop wouldn't be very efficient in that case. The macro can be changed to solve this more efficiently.

macro memset(.addr, .size)
{
    if (.size < 4)
    {
        repeat .size
        {
            sta .addr + @i
        }
    }
    elif (.size < 129)
    {
        ldx #.size - 1
        {
            sta .addr,x
            dex
            bpl @loop
        }
    }
    else
    {
        static_assert(false, "memset doesn't support larger sizes... yet.")
    }
}

The loop is unrolled for sizes less than 4. Sizes up to 128 generate a loop instead. The limit comes from the bpl instruction, which only keeps looping while x stays in the range 0 to 127. Larger sizes trigger the assert. This can be extended to support all sizes optimally and then you will never again need to write a memory clear loop!
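
To illustrate, this is how a few invocations would be handled by the macro above. The labels ptr, data and buffer are hypothetical and only serve as examples.

lda #0
memset(ptr, 2) // unrolled: sta ptr followed by sta ptr + 1
memset(data, 55) // loop: ldx #54 followed by the sta, dex, bpl block
// memset(buffer, 200) would trigger the static_assert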

Macros can also be locally defined.

{
    macro .write2(.addr)
    {
        sta .addr
        sta .addr + 1
    }
    lda #0
    .write2(ptr1)
    .write2(ptr2)
    .write2(ptr3)
}

Macros are first-class objects and therefore they can be stored as constants or variables. They can also be sent as arguments to functions or macros. This enables code injection in macros, using other macros as arguments.

macro print_char()
{
    jsr CHROUT
}

macro print_text(.text, .printer)
{
    ldx #0
    {
        lda .text,x
        beq @continue
        .printer()
        inx
        bne @loop
    }
}

print_text(text1, print_char)
print_text(text2, print_char)
rts

define byte[] text1 = { "wow", 0 }
define byte[] text2 = { "cool", 0 }

Now this print_text macro is generic enough to be reused even with other types of output devices.
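
As a sketch of that reuse, a second printer macro could write straight to screen memory instead of calling CHROUT. The macro name store_char is made up for this example and $0400 is assumed to be the screen address, which is the Commodore 64 default. Since print_text keeps the character index in x, the same register positions each character on the screen.

macro store_char()
{
    sta $0400,x
}

print_text(text1, store_char)

Note that screen memory expects screen codes rather than the PETSCII codes that CHROUT takes, so the text data may need to be converted first.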

Macros can return values if desired. That makes it possible for macros to also act as pure functions if no instructions are generated within them. Values are returned with the return statement that takes an optional expression to return as argument. This macro calculates the screen address based on a screen base address and screen coordinates.

macro screen_pos(.start, .x, .y)
{
    return .start + .x + 40*.y
}
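
The returned value should be usable wherever an expression is expected. A sketch, assuming the Commodore 64 default screen address $0400:

lda #1
sta screen_pos($0400, 10, 5) // $0400 + 10 + 40*5 = $04d2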

Macros can be called recursively. This example calculates the Fibonacci sequence using a recursive macro.

macro fibonacci(.value)
{
    if (.value == 0) {
        return 0
    }
    if (.value == 1) {
        return 1
    }
    return fibonacci(.value - 1) + fibonacci(.value - 2)
}

repeat 10
{
    define byte = fibonacci(@i)
}

This defines the numbers 0, 1, 1, 2, 3, 5, 8, 13, 21 and 34 in memory.

A macro can be ended early without returning any value like this.

macro send_large(.size)
{
    if (.size < 4) {
        return;
    }
    jsr send_it
}

Note that the return statement must end with a semicolon if no value is to be returned, otherwise an expression is expected.

Alignment

Sometimes code or data needs to be aligned to avoid extra cycles spent on traversing memory block boundaries. This is done with the align statement.

align 256

This will align the program counter so that it ends up where the address modulo 256 is 0. In code sections, the alignment pads with zeros by default. If you need to pad with something else in code sections you can supply an additional fill byte argument.

align 256, 55

This will fill up the gap to the next page boundary with the number 55.
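
A typical use is a lookup table that is read with indexed addressing. A sketch with a made-up table name and values:

align 256
define byte[] bit_masks = { 1, 2, 4, 8, 16, 32, 64, 128 }

Because the table starts on a page boundary, an instruction like lda bit_masks,x never crosses a page for x from 0 to 7.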