NASM contains a powerful macro processor, which supports conditional
assembly, multi-level file inclusion, two forms of macro (single-line and
multi-line), and a `context stack' mechanism for extra macro power.
Preprocessor directives all begin with a % sign.
The preprocessor collapses all lines which end with a backslash (\) character into a single line. Thus:
%define THIS_VERY_LONG_MACRO_NAME_IS_DEFINED_TO \
THIS_VALUE
will work like a single-line macro without the backslash-newline sequence.
do 1.e+4000 ; IEEE 754r quad precision
The 8-bit "quarter-precision" floating-point format is sign:exponent:mantissa = 1:4:3 with an exponent bias of 7. This appears to be the most frequently used 8-bit floating-point format, although it is not covered by any formal standard. This is sometimes called a "minifloat."
The special operators __float8__, __float16__, __float32__, __float64__,
__float80m__, __float80e__, __float128l__ and __float128h__ are used to
produce floating-point numbers in other contexts. They produce the binary
representation of a specific floating-point number as an integer, and can
be used anywhere integer constants are used in an expression.
__float80m__ and __float80e__ produce the 64-bit mantissa and
16-bit exponent of an 80-bit floating-point number, and
__float128l__ and __float128h__ produce the lower and upper 64-bit
halves of a 128-bit floating-point number, respectively.
For example:
mov rax,__float64__(3.141592653589793238462)
... would assign the binary representation of pi as a 64-bit floating
point number into RAX. This is exactly equivalent
to:
mov rax,0x400921fb54442d18
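As a further sketch of the same idea, the lines below apply several of these operators to the constant 1.0, whose IEEE encodings are easy to check by hand. The choice of 1.0 and the surrounding instructions are illustrative assumptions, not taken from the manual; the 8-bit line assumes the 1:4:3, bias-7 layout described earlier.

```nasm
        bits 64
        mov     eax, __float32__(1.0)   ; same as mov eax,0x3f800000
        mov     rax, __float64__(1.0)   ; same as mov rax,0x3ff0000000000000
        db      __float8__(1.0)         ; 0x38: sign 0, exponent 0111 (bias 7), mantissa 000
        dq      __float80m__(1.0)       ; 0x8000000000000000 (mantissa, explicit integer bit set)
        dw      __float80e__(1.0)       ; 0x3fff (sign bit and biased exponent)
```

Each operator yields an ordinary integer, so the results can be combined with any other integer expression.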
NASM cannot do compile-time arithmetic on floating-point constants. This is because NASM is designed to be portable - although it always generates code to run on x86 processors, the assembler itself can run on any system with an ANSI C compiler. Therefore, the assembler cannot guarantee the presence of a floating-point unit capable of handling the Intel number formats, and so for NASM to be able to do floating arithmetic it would have to include its own complete set of floating-point routines, which would significantly increase the size of the assembler for very little benefit.
The special tokens __Infinity__, __QNaN__ (or __NaN__)
and __SNaN__ can be used to generate infinities,
quiet NaNs, and signalling NaNs, respectively. These are normally used as
macros:
%define Inf __Infinity__
%define NaN __QNaN__
dq +1.5, -Inf, NaN ; Double-precision constants
x87-style packed BCD constants can be used in the same contexts as
80-bit floating-point numbers. They are suffixed with p
or prefixed with 0p,
and can include up to 18 decimal digits.
As with other numeric constants, underscores can be used to separate digits.
For example:
dt 12_345_678_901_245_678p
dt -12_345_678_901_245_678p
dt +0p33
dt 33p
Expressions in NASM are similar in syntax to those in C. Expressions are evaluated as 64-bit integers which are then adjusted to the appropriate size.
NASM supports two special tokens in expressions, allowing calculations
to involve the current assembly position: the $
and $$ tokens.
$ evaluates to the assembly position at the beginning of the line containing
the expression; so you can code an infinite loop using
JMP $. $$ evaluates to
the beginning of the current section; so you can tell how far into the
section you are by using ($-$$).
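A minimal sketch of both tokens in use, assuming flat binary output (-f bin) and 16-bit code; the labels start, msg and msglen are hypothetical names chosen for illustration:

```nasm
        bits 16
start:  jmp     $                   ; jumps to itself: an infinite loop
msg:    db      'hello, world'
msglen  equ     $ - msg             ; bytes emitted since msg: 12
        times 510-($-$$) db 0       ; zero-fill up to offset 510 of the section
        dw      0xAA55              ; occupies offsets 510 and 511
```

Here $ - msg measures a distance from a label, while $-$$ measures the distance from the start of the section, which is what lets the padding line bring the output to a fixed size.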
The arithmetic operators provided by NASM are listed here, in increasing order of precedence.
| : Bitwise OR Operator
The | operator gives a bitwise OR, exactly as
performed by the OR machine instruction. Bitwise
OR is the lowest-priority arithmetic operator supported by NASM.
^ : Bitwise XOR Operator
^ provides the bitwise XOR operation.
& : Bitwise AND Operator
& provides the bitwise AND operation.
<< and >> : Bit Shift Operators
<< gives a bit-shift to the left, just
as it does in C. So 5<<3 evaluates to 5
times 8, or 40. >> gives a bit-shift to the
right; in NASM, such a shift is always unsigned, so that the bits
shifted in from the left-hand end are filled with zero rather than a
sign-extension of the previous highest bit.
+ and - : Addition and Subtraction Operators
The + and -
operators do perfectly ordinary addition and subtraction.
* , / , // , % and %% : Multiplication and Division
* is the multiplication operator.
/ and // are both
division operators: / is unsigned division and
// is signed division. Similarly,
% and %% provide
unsigned and signed modulo operators respectively.
NASM, like ANSI C, provides no guarantees about the sensible operation of the signed modulo operator.
Since the % character is used extensively by
the macro preprocessor, you should ensure that both the signed and unsigned
modulo operators are followed by white space wherever they appear.
+ , - , ~ , ! and SEG : Unary Operators
The highest-priority operators in NASM's expression grammar are those
which only apply to one argument. - negates its
operand, + does nothing (it's provided for
symmetry with -), ~
computes the one's complement of its operand,
! is the logical negation operator, and SEG
provides the segment address of its operand (explained in more detail in
section 3.6).
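The precedence rules above can be checked with a few data lines. This is an illustrative sketch rather than an excerpt from the manual; each comment gives the value the expression works out to under the rules just described.

```nasm
        dd      1 | 2 ^ 3 & 4       ; parsed as 1 | (2 ^ (3 & 4)) = 3
        dd      5 << 3              ; 40
        dd      2 + 3 * 4           ; 14, since * binds tighter than +
        dd      17 / 5              ; unsigned division: 3
        dd      17 % 5              ; unsigned modulo: 2 (note the space after %)
        dd      ~0 >> 60            ; unary ~ first, then an unsigned shift: 15
```

The last line relies on two points from the text: expressions are evaluated as 64-bit integers, and >> fills with zeros rather than sign bits.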
SEG and WRT
When writing large 16-bit programs, which must be split into multiple
segments, it is often necessary to be able to refer to the segment part of
the address of a symbol. NASM supports the SEG
operator to perform this function.
The SEG operator returns the
preferred segment base of a symbol, defined as the segment base
relative to which the offset of the symbol makes sense. So the code
mov ax,seg symbol
mov es,ax
mov bx,symbol
will load ES:BX with a valid pointer to the
symbol symbol.
Things can be more complex than this: since 16-bit segments and groups
may overlap, you might occasionally want to refer to some symbol using a
different segment base from the preferred one. NASM lets you do this, by
the use of the WRT (With Reference To) keyword.
So you can do things like
mov ax,weird_seg ; weird_seg is a segment base
mov es,ax
mov bx,symbol wrt weird_seg
to load ES:BX with a different, but
functionally equivalent, pointer to the symbol
symbol.
NASM supports far (inter-segment) calls and jumps by means of the syntax
call segment:offset, where
segment and offset both
represent immediate values. So to call a far procedure, you could code
either of
call (seg procedure):procedure
call weird_seg:(procedure wrt weird_seg)
(The parentheses are included for clarity, to show the intended parsing of the above instructions. They are not necessary in practice.)
NASM supports the syntax call far procedure as
a synonym for the first of the above usages.
JMP works identically to CALL in these examples.
To declare a far pointer to a data item in a data segment, you must code
dw symbol, seg symbol
NASM supports no convenient synonym for this, though you can always invent one using the macro processor.
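One way such a synonym might look is sketched below. The macro name farptr is hypothetical (NASM does not define it), and the fragment assumes symbol is declared elsewhere and that the output format supports segment references.

```nasm
; farptr is an illustrative name, not a NASM built-in.
%macro  farptr 1
        dw      %1, seg %1      ; offset word first, then segment word
%endmacro

        farptr  symbol          ; expands to: dw symbol, seg symbol
```

The macro simply repeats its argument, so the offset and the segment always refer to the same symbol.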
STRICT : Inhibiting Optimization
When assembling with the optimizer set to level 2 or higher (see
section 2.1.22), NASM will use
size specifiers (BYTE,
WORD, DWORD,
QWORD, TWORD,
OWORD or YWORD), but
will give them the smallest possible size. The keyword
STRICT can be used to inhibit optimization and
force a particular operand to be emitted in the specified size. For
example, with the optimizer on, and in BITS 16
mode,
push dword 33
is encoded in three bytes 66 6A 21, whereas
push strict dword 33
is encoded in six bytes, with a full dword immediate operand
66 68 21 00 00 00.
With the optimizer off, the same code (six bytes) is generated whether
the STRICT keyword was used or not.
Although NASM has an optional multi-pass optimizer, there are some expressions which must be resolvable on the first pass. These are called Critical Expressions.
The first pass is used to determine the size of all the assembled code and data, so that the second pass, when generating all the code, knows all the symbol addresses the code refers to. So one thing NASM can't handle is code whose size depends on the value of a symbol declared after the code in question. For example,
times (label-$) db 0
label: db 'Where am I?'
The argument to TIMES in this case could
equally legally evaluate to anything at all; NASM will reject this example
because it cannot tell the size of the TIMES line
when it first sees it. It will just as firmly reject the slightly
paradoxical code
times (label-$+1) db 0
label: db 'NOW where am I?'
in which any value for the TIMES
argument is by definition wrong!
NASM rejects these examples by means of a concept called a critical
expression, which is defined to be an expression whose value is
required to be computable in the first pass, and which must therefore
depend only on symbols defined before it. The argument to the
TIMES prefix is a critical expression.
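For contrast, here is a sketch of a TIMES argument that NASM can accept, because every symbol it uses is defined before the line that uses it. The label buffer and the 64-byte target are illustrative assumptions.

```nasm
buffer: db      'some data'             ; 9 bytes
        times 64-($-buffer) db 0        ; count is 55, computable in the first pass
```

Because buffer precedes the TIMES line, the distance $-buffer is already known when the line is first seen, and the block ends exactly 64 bytes after buffer.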
NASM gives special treatment to symbols beginning with a period. A label beginning with a single period is treated as a local label, which means that it is associated with the previous non-local label. So, for example:
label1 ; some code
.loop
; some more code
jne .loop
ret
label2 ; some code
.loop
; some more code
jne .loop
ret
In the above code fragment, each JNE
instruction jumps to the line immediately before it, because the two
definitions of .loop are kept separate by virtue
of each being associated with the previous non-local label.
This form of local label handling is borrowed from the old Amiga
assembler DevPac; however, NASM goes one step further, in allowing access
to local labels from other parts of the code. This is achieved by means of
defining a local label in terms of the previous non-local label:
the first definition of .loop above is really
defining a symbol called label1.loop, and the
second defines a symbol called label2.loop. So,
if you really needed to, you could write
label3 ; some more code
; and some more
jmp label1.loop
Sometimes it is useful - in a macro, for instance - to be able to define
a label which can be referenced from anywhere but which doesn't interfere
with the normal local-label mechanism. Such a label can't be non-local
because it would interfere with subsequent definitions of, and references
to, local labels; and it can't be local because the macro that defined it
wouldn't know the label's full name. NASM therefore introduces a third type
of label, which is probably only useful in macro definitions: if a label
begins with the special prefix ..@, then it does
nothing to the local label mechanism. So you could code
label1: ; a non-local label
.local: ; this is really label1.local
..@foo: ; this is a special symbol
label2: ; another non-local label
.local: ; this is really label2.local
jmp ..@foo ; this will jump three lines up
NASM has the capacity to define other special symbols beginning with a
double period: for example, ..start is used to
specify the entry point in the obj output format
(see section 7.4.6).