@ -1,57 +1,99 @@
**********************
# Cranelift IR Reference
Cranelift IR Reference
**********************
.. default-domain:: clif
## Foreword
.. highlight:: clif
.. todo:: Update the IR reference
This document is likely to be outdated and missing some important
information. It is recommended to look at the list of instructions as
documented in [the `InstBuilder` documentation].
This document is likely to be outdated and missing some important
[the `InstBuilder` documentation]: https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/trait.InstBuilder.html
information. It is recommended to look at the list of instructions as
documented in the `InstBuilder` documentation:
https://docs.rs/cranelift-codegen/latest/cranelift_codegen/ir/trait.InstBuilder.html
The Cranelift intermediate representation (:term:`IR`) has two primary forms:
## Intro
The Cranelift intermediate representation ([IR]) has two primary forms:
an *in-memory data structure* that the code generator library is using, and a
an *in-memory data structure* that the code generator library is using, and a
*text format* which is used for test cases and debug output.
*text format* which is used for test cases and debug output.
Files containing Cranelift textual IR have the ``.clif`` filename extension.
Files containing Cranelift textual IR have the `.clif` filename extension.
This reference uses the text format to describe IR semantics but glosses over
This reference uses the text format to describe IR semantics but glosses over
the finer details of the lexical and syntactic structure of the format.
the finer details of the lexical and syntactic structure of the format.
## Overall structure
Overall structure
Cranelift compiles functions independently. A `.clif` IR file may contain
=================
Cranelift compiles functions independently. A ``.clif`` IR file may contain
multiple functions, and the programmatic API can create multiple function
multiple functions, and the programmatic API can create multiple function
handles at the same time, but the functions don't share any data or reference
handles at the same time, but the functions don't share any data or reference
each other directly.
each other directly.
This is a simple C function that computes the average of an array of floats:
This is a simple C function that computes the average of an array of floats:
.. literalinclude:: example.c
```c
:language: c
#include <stddef.h>  /* size_t */

float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}
```
Here is the same function compiled into Cranelift IR:
Here is the same function compiled into Cranelift IR:
.. literalinclude:: example.clif
```
:language: clif
test verifier
:lines: 2-
function %average(i32, i32) -> f32 system_v {
ss0 = explicit_slot 8 ; Stack slot for `sum`.
block1(v0: i32, v1: i32):
v2 = f64const 0x0.0
stack_store v2, ss0
brz v1, block5 ; Handle count == 0.
jump block2
block2:
v3 = iconst.i32 0
jump block3(v3)
block3(v4: i32):
v5 = imul_imm v4, 4
v6 = iadd v0, v5
v7 = load.f32 v6 ; array[i]
v8 = fpromote.f64 v7
v9 = stack_load.f64 ss0
v10 = fadd v8, v9
stack_store v10, ss0
v11 = iadd_imm v4, 1
v12 = icmp ult v11, v1
brnz v12, block3(v11) ; Loop backedge.
jump block4
block4:
v13 = stack_load.f64 ss0
v14 = fcvt_from_uint.f64 v1
v15 = fdiv v13, v14
v16 = fdemote.f32 v15
return v16
block5:
v100 = f32const +NaN
return v100
}
```
The first line of a function definition provides the function *name* and
The first line of a function definition provides the function *name* and
the :term:`function signature` which declares the parameter and return types.
the [function signature] which declares the parameter and return types.
Then follows the :term:`function preamble` which declares a number of entities
Then follows the [function preamble] which declares a number of entities
that can be referenced inside the function. In the example above, the preamble
that can be referenced inside the function. In the example above, the preamble
declares a single explicit stack slot, ``ss0``.
declares a single explicit stack slot, `ss0`.
After the preamble follows the :term:`function body` which consists of
After the preamble follows the [function body] which consists of
:term:`extended basic block`\s (EBBs), the first of which is the
[extended basic block]s (EBBs), the first of which is the
:term:`entry block`. Every EBB ends with a :term:`terminator instruction`, so
[entry block]. Every EBB ends with a [terminator instruction], so
execution can never fall through to the next EBB without an explicit branch.
execution can never fall through to the next EBB without an explicit branch.
A ``.clif`` file consists of a sequence of independent function definitions:
A `.clif` file consists of a sequence of independent function definitions:
.. productionlist::
.. productionlist::
function_list : { function }
function_list : { function }
@ -59,14 +101,13 @@ A ``.clif`` file consists of a sequence of independent function definitions:
preamble : { preamble_decl }
preamble : { preamble_decl }
function_body : { extended_basic_block }
function_body : { extended_basic_block }
Static single assignment form
### Static single assignment form
-----------------------------
The instructions in the function body use and produce *values* in SSA form. This
The instructions in the function body use and produce *values* in SSA form. This
means that every value is defined exactly once, and every use of a value must be
means that every value is defined exactly once, and every use of a value must be
dominated by the definition.
dominated by the definition.
Cranelift does not have phi instructions but uses :term:`EBB parameter`\ s
Cranelift does not have phi instructions but uses [EBB parameter]s
instead. An EBB can be defined with a list of typed parameters. Whenever control
instead. An EBB can be defined with a list of typed parameters. Whenever control
is transferred to the EBB, argument values for the parameters must be provided.
is transferred to the EBB, argument values for the parameters must be provided.
When entering a function, the incoming function parameters are passed as
When entering a function, the incoming function parameters are passed as
@ -75,32 +116,28 @@ arguments to the entry EBB's parameters.
Instructions define zero, one, or more result values. All SSA values are either
Instructions define zero, one, or more result values. All SSA values are either
EBB parameters or instruction results.
EBB parameters or instruction results.
In the example above, the loop induction variable ``i`` is represented as three
In the example above, the loop induction variable `i` is represented as three
SSA values: In ``block2``, ``v3`` is the initial value. In the loop block
SSA values: In `block2`, `v3` is the initial value. In the loop block
``block3``, the EBB parameter ``v4`` represents the value of the induction
`block3`, the EBB parameter `v4` represents the value of the induction
variable during each iteration. Finally, ``v11`` is computed as the induction
variable during each iteration. Finally, `v11` is computed as the induction
variable value for the next iteration.
variable value for the next iteration.
The `cranelift_frontend` crate contains utilities for translating from programs
The `cranelift_frontend` crate contains utilities for translating from programs
containing multiple assignments to the same variables into SSA form for
containing multiple assignments to the same variables into SSA form for
Cranelift :term:`IR`.
Cranelift [IR].
Such variables can also be presented to Cranelift as :term:`stack slot`\ s.
Such variables can also be presented to Cranelift as [stack slot]s.
Stack slots are accessed with the `stack_store` and `stack_load` instructions,
Stack slots are accessed with the `stack_store` and `stack_load` instructions,
and can have their address taken with `stack_addr`, which supports C-like
and can have their address taken with `stack_addr`, which supports C-like
programming languages where local variables can have their address taken.
programming languages where local variables can have their address taken.
.. _value-types:
## Value types
Value types
===========
All SSA values have a type which determines the size and shape (for SIMD
All SSA values have a type which determines the size and shape (for SIMD
vectors) of the value. Many instructions are polymorphic -- they can operate on
vectors) of the value. Many instructions are polymorphic -- they can operate on
different types.
different types.
Boolean types
### Boolean types
-------------
Boolean values are either true or false.
Boolean values are either true or false.
@ -119,8 +156,7 @@ zero bits or all one bits.
- b32
- b32
- b64
- b64
Integer types
### Integer types
-------------
Integer values have a fixed size and can be interpreted as either signed or
Integer values have a fixed size and can be interpreted as either signed or
unsigned. Some instructions will interpret an operand as a signed or unsigned
unsigned. Some instructions will interpret an operand as a signed or unsigned
@ -133,8 +169,7 @@ The support for i8 and i16 arithmetic is incomplete and use could lead to bugs.
- i32
- i32
- i64
- i64
Floating point types
### Floating point types
--------------------
The floating point types have the IEEE 754 semantics that are supported by most
The floating point types have the IEEE 754 semantics that are supported by most
hardware, except that non-default rounding modes, unmasked exceptions, and
hardware, except that non-default rounding modes, unmasked exceptions, and
@ -162,8 +197,7 @@ instructions are encoded as follows:
- f32
- f32
- f64
- f64
CPU flags types
### CPU flags types
---------------
Some target ISAs use CPU flags to represent the result of a comparison. These
Some target ISAs use CPU flags to represent the result of a comparison. These
CPU flags are represented as two value types depending on the type of values
CPU flags are represented as two value types depending on the type of values
@ -181,8 +215,7 @@ instructions either. The verifier enforces these rules.
- iflags
- iflags
- fflags
- fflags
SIMD vector types
### SIMD vector types
-----------------
A SIMD vector type represents a vector of values from one of the scalar types
A SIMD vector type represents a vector of values from one of the scalar types
(boolean, integer, and floating point). Each scalar value in a SIMD type is
(boolean, integer, and floating point). Each scalar value in a SIMD type is
@ -221,8 +254,7 @@ b1x%N
Like the `b1` type, a boolean vector cannot be stored in memory.
Like the `b1` type, a boolean vector cannot be stored in memory.
Pseudo-types and type classes
### Pseudo-types and type classes
-----------------------------
These are not concrete types, but convenient names used to refer to real types
These are not concrete types, but convenient names used to refer to real types
in this reference.
in this reference.
@ -254,8 +286,7 @@ Mem
Testable
Testable
Either `b1` or `iN`.
Either `b1` or `iN`.
Immediate operand types
### Immediate operand types
-----------------------
These types are not part of the normal SSA type system. They are used to
These types are not part of the normal SSA type system. They are used to
indicate the different kinds of immediate operands on an instruction.
indicate the different kinds of immediate operands on an instruction.
@ -295,43 +326,42 @@ floatcc
A floating point condition code. See the `fcmp` instruction for details.
A floating point condition code. See the `fcmp` instruction for details.
The two IEEE floating point immediate types `ieee32` and `ieee64`
The two IEEE floating point immediate types `ieee32` and `ieee64`
are displayed as hexadecimal floating point literals in the textual :term:`IR`
are displayed as hexadecimal floating point literals in the textual [IR]
format. Decimal floating point literals are not allowed because some computer
format. Decimal floating point literals are not allowed because some computer
systems can round differently when converting to binary. The hexadecimal
systems can round differently when converting to binary. The hexadecimal
floating point format is mostly the same as the one used by C99, but extended
floating point format is mostly the same as the one used by C99, but extended
to represent all NaN bit patterns:
to represent all NaN bit patterns:
Normal numbers
Normal numbers
Compatible with C99: ``-0x1.Tpe`` where ``T`` are the trailing
Compatible with C99: `-0x1.Tpe` where `T` are the trailing
significand bits encoded as hexadecimal, and ``e`` is the unbiased exponent
significand bits encoded as hexadecimal, and `e` is the unbiased exponent
as a decimal number. `ieee32` has 23 trailing significand bits. They
as a decimal number. `ieee32` has 23 trailing significand bits. They
are padded with an extra LSB to produce 6 hexadecimal digits. This is not
are padded with an extra LSB to produce 6 hexadecimal digits. This is not
necessary for `ieee64` which has 52 trailing significand bits
necessary for `ieee64` which has 52 trailing significand bits
forming 13 hexadecimal digits with no padding.
forming 13 hexadecimal digits with no padding.
Zeros
Zeros
Positive and negative zero are displayed as ``0.0`` and ``-0.0`` respectively.
Positive and negative zero are displayed as `0.0` and `-0.0` respectively.
Subnormal numbers
Subnormal numbers
Compatible with C99: ``-0x0.Tpemin`` where ``T`` are the trailing
Compatible with C99: `-0x0.Tpemin` where `T` are the trailing
significand bits encoded as hexadecimal, and ``emin`` is the minimum exponent
significand bits encoded as hexadecimal, and `emin` is the minimum exponent
as a decimal number.
as a decimal number.
Infinities
Infinities
Either ``-Inf`` or ``Inf``.
Either `-Inf` or `Inf`.
Quiet NaNs
Quiet NaNs
Quiet NaNs have the MSB of the trailing significand set. If the remaining
Quiet NaNs have the MSB of the trailing significand set. If the remaining
bits of the trailing significand are all zero, the value is displayed as
bits of the trailing significand are all zero, the value is displayed as
``-NaN`` or ``NaN``. Otherwise, ``-NaN:0xT`` where ``T`` are the trailing
`-NaN` or `NaN`. Otherwise, `-NaN:0xT` where `T` are the trailing
significand bits encoded as hexadecimal.
significand bits encoded as hexadecimal.
Signaling NaNs
Signaling NaNs
Displayed as ``-sNaN:0xT``.
Displayed as `-sNaN:0xT`.
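
The "Normal numbers" rule above can be illustrated with a small C sketch. It is not Cranelift's own formatter: the function name `format_ieee32_normal` is hypothetical, and the code assumes that the host `float` is an IEEE 754 binary32 value. It only follows the description in this section, where the 23 trailing significand bits are padded with one extra LSB to give 6 hexadecimal digits and the unbiased exponent is printed in decimal.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a *normal* f32 as described for `ieee32`
 * immediates. Returns 0 on success, -1 if the value is not a normal number
 * (zero, subnormal, infinity, NaN) or the buffer is too small.
 * Assumes `float` is IEEE 754 binary32. */
int format_ieee32_normal(float value, char *buf, size_t len)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);      /* reinterpret the bit pattern */

    uint32_t sign = bits >> 31;
    uint32_t biased = (bits >> 23) & 0xff;   /* 8 exponent bits */
    uint32_t trailing = bits & 0x7fffff;     /* 23 trailing significand bits */

    if (biased == 0 || biased == 0xff)
        return -1;                           /* not a normal number */

    int exponent = (int)biased - 127;        /* unbiased exponent */
    unsigned digits = (unsigned)(trailing << 1); /* pad with one LSB: 24 bits */

    int n = snprintf(buf, len, "%s0x1.%06xp%d",
                     sign ? "-" : "", digits, exponent);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

Under these assumptions, `1.5f` has trailing bits `0x400000`, which become `800000` after padding, so the helper writes `0x1.800000p0`.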
Control flow
## Control flow
============
Branches transfer control to a new EBB and provide values for the target EBB's
Branches transfer control to a new EBB and provide values for the target EBB's
arguments, if it has any. Conditional branches only take the branch if their
arguments, if it has any. Conditional branches only take the branch if their
@ -339,7 +369,7 @@ condition is satisfied, otherwise execution continues at the following
instruction in the EBB.
instruction in the EBB.
JT = jump_table [EBB0, EBB1, ..., EBBn]
JT = jump_table [EBB0, EBB1, ..., EBBn]
Declare a jump table in the :term:`function preamble`.
Declare a jump table in the [function preamble].
This declares a jump table for use by the `br_table` indirect branch
This declares a jump table for use by the `br_table` indirect branch
instruction. Entries in the table are EBB names.
instruction. Entries in the table are EBB names.
@ -347,9 +377,9 @@ JT = jump_table [EBB0, EBB1, ..., EBBn]
The EBBs listed must belong to the current function, and they can't have
The EBBs listed must belong to the current function, and they can't have
any arguments.
any arguments.
:arg EBB0: Target EBB when ``x = 0``.
:arg EBB0: Target EBB when `x = 0`.
:arg EBB1: Target EBB when ``x = 1``.
:arg EBB1: Target EBB when `x = 1`.
:arg EBBn: Target EBB when ``x = n``.
:arg EBBn: Target EBB when `x = n`.
:result: A jump table identifier. (Not an SSA value).
:result: A jump table identifier. (Not an SSA value).
Traps stop the program because something went wrong. The exact behavior depends
Traps stop the program because something went wrong. The exact behavior depends
@ -359,10 +389,9 @@ traps for certain input value. For example, `udiv` traps when the divisor
is zero.
is zero.
Function calls
## Function calls
==============
A function call needs a target function and a :term:`function signature`. The
A function call needs a target function and a [function signature]. The
target function may be determined dynamically at runtime, but the signature must
target function may be determined dynamically at runtime, but the signature must
be known when the function call is compiled. The function signature describes
be known when the function call is compiled. The function signature describes
how to call the function, including parameters, return values, and the calling
how to call the function, including parameters, return values, and the calling
@ -383,28 +412,24 @@ depend on both the instruction set /// architecture and possibly the operating
system, a function's calling convention is only fully determined by a
system, a function's calling convention is only fully determined by a
`(TargetIsa, CallConv)` tuple.
`(TargetIsa, CallConv)` tuple.
=========== ===========================================
| Name | Description |
Name Description
| ----------| ---------- |
=========== ===========================================
| sret | pointer to a return value in memory |
sret pointer to a return value in memory
| link | return address |
link return address
| fp | the initial value of the frame pointer |
fp the initial value of the frame pointer
| csr | callee-saved register |
csr callee-saved register
| vmctx | VM context pointer, which may contain pointers to heaps etc. |
vmctx VM context pointer, which may contain pointers to heaps etc.
| sigid | signature id, for checking caller/callee signature compatibility |
sigid signature id, for checking caller/callee signature compatibility
| stack_limit | limit value for the size of the stack |
stack_limit limit value for the size of the stack
=========== ===========================================
| Name | Description |
| --------- | ----------- |
========== ===========================================
| fast | not-ABI-stable convention for best performance |
Name Description
| cold | not-ABI-stable convention for infrequently executed code |
========== ===========================================
| system_v | System V-style convention used on many platforms |
fast not-ABI-stable convention for best performance
| fastcall | Windows "fastcall" convention, also used for x64 and ARM |
cold not-ABI-stable convention for infrequently executed code
| baldrdash_system_v | SpiderMonkey WebAssembly convention on platforms natively using SystemV. |
system_v System V-style convention used on many platforms
| baldrdash_windows | SpiderMonkey WebAssembly convention on platforms natively using Windows. |
fastcall Windows "fastcall" convention, also used for x64 and ARM
baldrdash_system_v SpiderMonkey WebAssembly convention on platforms natively using SystemV.
baldrdash_windows SpiderMonkey WebAssembly convention on platforms natively using Windows.
========== ===========================================
The "not-ABI-stable" conventions do not follow an external specification and
The "not-ABI-stable" conventions do not follow an external specification and
may change between versions of Cranelift.
may change between versions of Cranelift.
@ -415,8 +440,7 @@ Parameters and return values have flags whose meaning is mostly target
dependent. These flags support interfacing with code produced by other
dependent. These flags support interfacing with code produced by other
compilers.
compilers.
Functions that are called directly must be declared in the :term:`function
Functions that are called directly must be declared in the [function preamble]:
preamble`:
FN = [colocated] NAME signature
FN = [colocated] NAME signature
Declare a function so it can be called directly.
Declare a function so it can be called directly.
@ -431,61 +455,66 @@ FN = [colocated] NAME signature
This simple example illustrates direct function calls and signatures:
This simple example illustrates direct function calls and signatures:
.. literalinclude:: callex.clif
```
:language: clif
test verifier
:lines: 3-
Indirect function calls use a signature declared in the preamble.
function %gcd(i32 uext, i32 uext) -> i32 uext system_v {
fn0 = %divmod(i32 uext, i32 uext) -> i32 uext, i32 uext
.. _memory:
block1(v0: i32, v1: i32):
brz v1, block3
jump block2
block2:
v2, v3 = call fn0(v0, v1)
return v2
block3:
return v0
}
```
Indirect function calls use a signature declared in the preamble.
Memory
## Memory
======
Cranelift provides fully general `load` and `store` instructions for accessing
Cranelift provides fully general `load` and `store` instructions for accessing
memory, as well as :ref:`extending loads and truncating stores
memory, as well as [extending loads and truncating stores](#extending-loads-and-truncating-stores).
<extload-truncstore>`.
If the memory at the given address is not :term:`addressable`, the behavior of
If the memory at the given address is not [addressable], the behavior of
these instructions is undefined. If it is addressable but not
these instructions is undefined. If it is addressable but not
:term:`accessible`, they :term:`trap`.
[accessible], they [trap].
There are also more restricted operations for accessing specific types of memory
There are also more restricted operations for accessing specific types of memory
objects.
objects.
Additionally, instructions are provided for handling multi-register addressing.
Additionally, instructions are provided for handling multi-register addressing.
Memory operation flags
### Memory operation flags
----------------------
Loads and stores can have flags that loosen their semantics in order to enable
Loads and stores can have flags that loosen their semantics in order to enable
optimizations.
optimizations.
======== ===========================================
| Flag | Description |
Flag Description
| -------- | ----------- |
======== ===========================================
| notrap | Memory is assumed to be [accessible]. |
notrap Memory is assumed to be :term:`accessible`.
| aligned | Trapping allowed for misaligned accesses. |
aligned Trapping allowed for misaligned accesses.
| readonly | The data at the specified address will not be modified between when this function is called and exited. |
readonly The data at the specified address will not be
modified between when this function is
called and exited.
======== ===========================================
When the ``notrap`` flag is set, the behavior is undefined if the memory
When the `notrap` flag is set, the behavior is undefined if the memory
is not :term:`accessible`.
is not [accessible].
Loads and stores are *misaligned* if the resultant address is not a multiple of
Loads and stores are *misaligned* if the resultant address is not a multiple of
the expected alignment. By default, misaligned loads and stores are allowed,
the expected alignment. By default, misaligned loads and stores are allowed,
but when the ``aligned`` flag is set, a misaligned memory access is allowed to
but when the `aligned` flag is set, a misaligned memory access is allowed to
:term:`trap`.
[trap].
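
The alignment rule above can be written down as a short C sketch. The function names are hypothetical and the code only models the rule as stated here: an access is misaligned when its address is not a multiple of the expected alignment, and it is only allowed to trap for that reason when the `aligned` flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `addr` is not a multiple of `align` (which must be nonzero). */
bool is_misaligned(uint64_t addr, uint64_t align)
{
    return addr % align != 0;
}

/* Hypothetical model of the `aligned` flag: a misaligned access is only
 * permitted to trap when the flag is set; without the flag it is allowed. */
bool may_trap_on_alignment(uint64_t addr, uint64_t align, bool aligned_flag)
{
    return aligned_flag && is_misaligned(addr, align);
}
```

For example, a 4-byte access at address `0x1002` is misaligned, but in this model it may only trap if the instruction carries the `aligned` flag.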
Explicit Stack Slots
### Explicit Stack Slots
--------------------
One set of restricted memory operations access the current function's stack
One set of restricted memory operations access the current function's stack
frame. The stack frame is divided into fixed-size stack slots that are
frame. The stack frame is divided into fixed-size stack slots that are
allocated in the :term:`function preamble`. Stack slots are not typed, they
allocated in the [function preamble]. Stack slots are not typed, they
simply represent a contiguous sequence of :term:`accessible` bytes in the stack
simply represent a contiguous sequence of [accessible] bytes in the stack
frame.
frame.
SS = explicit_slot Bytes, Flags...
SS = explicit_slot Bytes, Flags...
@ -504,10 +533,10 @@ the alignment of these stack memory accesses can be inferred from the offsets
and stack slot alignments.
and stack slot alignments.
It's also possible to obtain the address of a stack slot, which can be used
It's also possible to obtain the address of a stack slot, which can be used
in :ref:`unrestricted loads and stores <memory>`.
in [unrestricted loads and stores](#memory).
The `stack_addr` instruction can be used to macro-expand the stack access
The `stack_addr` instruction can be used to macro-expand the stack access
instructions before instruction selection::
instructions before instruction selection:
v0 = stack_load.f64 ss3, 16
v0 = stack_load.f64 ss3, 16
; Expands to:
; Expands to:
@ -517,8 +546,7 @@ instructions before instruction selection::
When Cranelift code is running in a sandbox, it can also be necessary to include
When Cranelift code is running in a sandbox, it can also be necessary to include
stack overflow checks in the prologue.
stack overflow checks in the prologue.
Global values
### Global values
-------------
A *global value* is an object whose value is not known at compile time. The
A *global value* is an object whose value is not known at compile time. The
value is computed at runtime by `global_value`, possibly using
value is computed at runtime by `global_value`, possibly using
@ -579,8 +607,7 @@ GV = [colocated] symbol Name
:arg Name: External name.
:arg Name: External name.
:result GV: Global value.
:result GV: Global value.
Heaps
### Heaps
-----
Code compiled from WebAssembly or asm.js runs in a sandbox where it can't access
Code compiled from WebAssembly or asm.js runs in a sandbox where it can't access
all process memory. Instead, it is given a small set of memory areas to work
all process memory. Instead, it is given a small set of memory areas to work
@ -588,7 +615,7 @@ in, and all accesses are bounds checked. Cranelift models this through the
concept of *heaps*.
concept of *heaps*.
A heap is declared in the function preamble and can be accessed with the
A heap is declared in the function preamble and can be accessed with the
`heap_addr` instruction that :term:`traps` on out-of-bounds accesses or
`heap_addr` instruction that [traps] on out-of-bounds accesses or
returns a pointer that is guaranteed to trap. Heap addresses can be smaller than
returns a pointer that is guaranteed to trap. Heap addresses can be smaller than
the native pointer size, for example unsigned `i32` offsets on a 64-bit
the native pointer size, for example unsigned `i32` offsets on a 64-bit
architecture.
architecture.
@ -606,27 +633,26 @@ architecture.
A heap appears as three consecutive ranges of address space:
A heap appears as three consecutive ranges of address space:
1. The *mapped pages* are the :term:`accessible` memory range in the heap. A
1. The *mapped pages* are the [accessible] memory range in the heap. A
heap may have a minimum guaranteed size which means that some mapped pages
heap may have a minimum guaranteed size which means that some mapped pages
are always present.
are always present.
2. The *unmapped pages* is a possibly empty range of address space that may be
2. The *unmapped pages* is a possibly empty range of address space that may be
mapped in the future when the heap is grown. They are :term:`addressable` but
mapped in the future when the heap is grown. They are [addressable] but
not :term:`accessible`.
not [accessible].
3. The *offset-guard pages* is a range of address space that is guaranteed to
3. The *offset-guard pages* is a range of address space that is guaranteed to
always cause a trap when accessed. It is used to optimize bounds checking for
always cause a trap when accessed. It is used to optimize bounds checking for
heap accesses with a shared base pointer. They are :term:`addressable` but
heap accesses with a shared base pointer. They are [addressable] but
not :term:`accessible`.
not [accessible].
The *heap bound* is the total size of the mapped and unmapped pages. This is
The *heap bound* is the total size of the mapped and unmapped pages. This is
the bound that `heap_addr` checks against. Memory accesses inside the
the bound that `heap_addr` checks against. Memory accesses inside the
heap bounds can trap if they hit an unmapped page (which is not
heap bounds can trap if they hit an unmapped page (which is not
:term:`accessible`).
[accessible]).
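
As a rough illustration of the check against the heap bound, the following C sketch tests whether an access of `size` bytes starting at `index` lies entirely below `bound`. The function name is hypothetical and this is only a model of the comparison, not how `heap_addr` is actually lowered. An access that passes this check can still trap if it touches an unmapped page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the bound check: true if the byte range
 * [index, index + size) lies entirely within [0, bound).
 * The comparison is arranged so that index + size cannot overflow. */
bool heap_access_in_bounds(uint64_t bound, uint64_t index, uint64_t size)
{
    return size <= bound && index <= bound - size;
}
```

With a bound of `0x1000` bytes, a 4-byte access at index `0xffc` is in bounds, while the same access at `0xffd` is not.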
Two styles of heaps are supported, *static* and *dynamic*. They behave
Two styles of heaps are supported, *static* and *dynamic*. They behave
differently when resized.
differently when resized.
Static heaps
#### Static heaps
~~~~~~~~~~~~
A *static heap* starts out with all the address space it will ever need, so it
A *static heap* starts out with all the address space it will ever need, so it
never moves to a different address. At the base address is a number of mapped
never moves to a different address. At the base address is a number of mapped
@ -646,8 +672,7 @@ H = static Base, min MinBytes, bound BoundBytes, offset_guard OffsetGuardBytes
pages.
pages.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
Dynamic heaps
#### Dynamic heaps
~~~~~~~~~~~~~
A *dynamic heap* can be relocated to a different base address when it is
A *dynamic heap* can be relocated to a different base address when it is
resized, and its bound can move dynamically. The offset-guard pages move when
resized, and its bound can move dynamically. The offset-guard pages move when
@ -662,53 +687,91 @@ H = dynamic Base, min MinBytes, bound BoundGV, offset_guard OffsetGuardBytes
:arg BoundGV: Global value containing the current heap bound in bytes.
:arg BoundGV: Global value containing the current heap bound in bytes.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
:arg OffsetGuardBytes: Size of the offset-guard pages in bytes.
Heap examples
#### Heap examples
~~~~~~~~~~~~~
The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of
The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of
offset-guard pages when running WebAssembly code on 64-bit CPUs. The combination
offset-guard pages when running WebAssembly code on 64-bit CPUs. The combination
of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be
of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be
generated for bounds checks at all:
generated for bounds checks at all:
.. literalinclude:: heapex-sm64.clif
```
:language: clif
test verifier
:lines: 2-
function %add_members(i32, i64 vmctx) -> f32 baldrdash_system_v {
gv0 = vmctx
gv1 = load.i64 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x1_0000_0000, offset_guard 0x8000_0000
block0(v0: i32, v5: i64):
v1 = heap_addr.i64 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}
```
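
The arithmetic behind the claim that no bounds-check code is needed can be sketched in C. The constants mirror the `bound` and `offset_guard` values in the example above; the function names are hypothetical, and the code is a model of the reasoning rather than anything Cranelift emits. Every 32-bit index passes a 1-byte check against a 4 GB bound, and the small constant load offsets keep the access inside the heap or its offset-guard pages, where an out-of-bounds access is guaranteed to trap.

```c
#include <stdbool.h>
#include <stdint.h>

#define HEAP_BOUND   0x100000000ULL /* bound 0x1_0000_0000 (4 GB) */
#define OFFSET_GUARD 0x80000000ULL  /* offset_guard 0x8000_0000 (2 GB) */

/* A 1-byte access at any 32-bit index ends at or below the 4 GB bound,
 * so this check can never fail and needs no generated code. */
bool one_byte_check_passes(uint32_t index)
{
    return (uint64_t)index + 1 <= HEAP_BOUND;
}

/* The real access at index + offset, `width` bytes wide, stays within the
 * heap plus its offset-guard pages as long as offset + width is small. */
bool access_stays_in_reserved_range(uint32_t index, uint32_t offset,
                                    uint32_t width)
{
    uint64_t end = (uint64_t)index + offset + width;
    return end <= HEAP_BOUND + OFFSET_GUARD;
}
```

For the loads in the example (`v1+16` and `v1+20`, each 4 bytes wide), even the largest index `0xffff_ffff` keeps the access inside the reserved range.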
A static heap can also be used for 32-bit code when the WebAssembly module
A static heap can also be used for 32-bit code when the WebAssembly module
declares a small upper bound on its memory. A 1 MB static bound with a single 4
declares a small upper bound on its memory. A 1 MB static bound with a single 4
KB offset-guard page still has opportunities for sharing bounds checking code:
KB offset-guard page still has opportunities for sharing bounds checking code:
.. literalinclude:: heapex-sm32.clif
```
:language: clif
test verifier
:lines: 2-
function %add_members(i32, i32 vmctx) -> f32 baldrdash_system_v {
gv0 = vmctx
gv1 = load.i32 notrap aligned gv0+64
heap0 = static gv1, min 0x1000, bound 0x10_0000, offset_guard 0x1000
block0(v0: i32, v5: i32):
v1 = heap_addr.i32 heap0, v0, 1
v2 = load.f32 v1+16
v3 = load.f32 v1+20
v4 = fadd v2, v3
return v4
}
```
If the upper bound on the heap size is too large, a dynamic heap is required
instead.

Finally, a runtime environment that simply allocates a heap with
`malloc()` may not have any offset-guard pages at all. In that case,
full bounds checking is required for each access:
```
test verifier

function %add_members(i32, i64 vmctx) -> f32 baldrdash_system_v {
    gv0 = vmctx
    gv1 = load.i64 notrap aligned gv0+64
    gv2 = load.i32 notrap aligned gv0+72
    heap0 = dynamic gv1, min 0x1000, bound gv2, offset_guard 0

block0(v0: i32, v6: i64):
    v1 = heap_addr.i64 heap0, v0, 20
    v2 = load.f32 v1+16
    v3 = heap_addr.i64 heap0, v0, 24
    v4 = load.f32 v3+20
    v5 = fadd v2, v4
    return v5
}
```
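
As a rough model of what each `heap_addr` has to verify here, the following Rust sketch treats a trap as `None`. The function and its parameters are hypothetical; it assumes the check is that the whole range of `access_size` bytes starting at `index` lies below the current bound, which is why the sizes 20 and 24 in the example cover the load offset plus the 4-byte access width:

```rust
/// Hypothetical model of a dynamic-heap `heap_addr` with no offset-guard
/// pages. Returns the native address for `index`, or `None` (standing in
/// for a trap) when the `access_size` bytes starting at `index` would
/// extend past `bound`.
fn heap_addr(base: u64, bound: u32, index: u32, access_size: u32) -> Option<u64> {
    // Widen to 64 bits so that the addition cannot wrap around.
    let end = u64::from(index) + u64::from(access_size);
    if end <= u64::from(bound) {
        Some(base + u64::from(index))
    } else {
        None
    }
}
```

Because the bound is only known at run time and there are no guard pages to fall back on, a check of this kind cannot be elided the way it is for the static heaps.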
### Tables
Code compiled from WebAssembly often needs access to objects outside of its
linear memory. WebAssembly uses *tables* to allow programs to refer to opaque
values through integer indices.

A table is declared in the function preamble and can be accessed with the
`table_addr` instruction that [traps] on out-of-bounds accesses.
Table addresses can be smaller than the native pointer size, for example
unsigned `i32` offsets on a 64-bit architecture.

A table appears as a consecutive range of address space, conceptually
divided into elements of fixed sizes, which are identified by their index.
The memory is [accessible].

The *table bound* is the number of elements currently in the table. This is
the bound that `table_addr` checks against.
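
A minimal Rust sketch of that check follows, with a trap modeled as `None`. The function signature is hypothetical and only illustrates the description above: the index is compared against the number of elements, and the element address is the base plus the index scaled by the element size.

```rust
/// Hypothetical model of `table_addr`: `bound` is the number of elements
/// currently in the table and `element_size` is the fixed size of each
/// element in bytes. `None` stands in for a trap.
fn table_addr(base: u64, bound: u32, element_size: u64, index: u32) -> Option<u64> {
    if index < bound {
        Some(base + u64::from(index) * element_size)
    } else {
        None
    }
}
```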
@ -725,15 +788,13 @@ T = dynamic Base, min MinElements, bound BoundGV, element_size ElementSize
:arg BoundGV: Global value containing the current table bound in elements.
:arg ElementSize: Size of each element.
### Constant materialization

A few instructions have variants that take immediate operands, but in general
an instruction is required to load a constant into an SSA value: `iconst`,
`f32const`, `f64const` and `bconst` serve this purpose.
### Bitwise operations

The bitwise operations operate on any value type: integers, floating point
numbers, and booleans. When operating on integer or floating point types, the
@ -750,20 +811,17 @@ to the number of bits in a *lane*, not the full size of the vector type.
The bit-counting instructions are scalar only.

### Floating point operations

These operations generally follow IEEE 754-2008 semantics.

#### Sign bit manipulations

The sign manipulating instructions work as bitwise operations, so they don't
have special behavior for signaling NaN operands. The exponent and trailing
significand bits are always preserved.
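
The Rust sketch below illustrates this on raw 32-bit float encodings. It assumes the sign-manipulating instructions cover negation, absolute value, and copy-sign; the function names are hypothetical and operate on `u32` bit patterns so that NaN payloads pass through untouched.

```rust
/// Mask selecting the sign bit of a 32-bit float encoding.
const SIGN_BIT: u32 = 0x8000_0000;

/// Negation: flip the sign bit, leave exponent and significand alone.
fn fneg_bits(x: u32) -> u32 {
    x ^ SIGN_BIT
}

/// Absolute value: clear the sign bit.
fn fabs_bits(x: u32) -> u32 {
    x & !SIGN_BIT
}

/// Copy-sign: magnitude bits of `x` combined with the sign bit of `y`.
fn fcopysign_bits(x: u32, y: u32) -> u32 {
    (x & !SIGN_BIT) | (y & SIGN_BIT)
}
```

A signaling NaN encoding such as `0x7fa0_0001` comes out of `fneg_bits` as `0xffa0_0001`: only the sign changes, and the value is not quieted.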
#### Minimum and maximum

These instructions return the larger or smaller of their operands. Note that
unlike the IEEE 754-2008 `minNum` and `maxNum` operations, these instructions
return NaN when either input is NaN.

When comparing zeroes, these instructions behave as if $-0.0 < 0.0$.
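
A minimal Rust sketch of these two rules is shown below. The names `fmin` and `fmax` are used here only for illustration, and the sketch returns the canonical `f32::NAN` without modeling which NaN payload a real implementation would produce.

```rust
/// Minimum that propagates NaN and orders -0.0 below +0.0.
fn fmin(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() {
        return f32::NAN;
    }
    if a == b {
        // Equal values include zeroes of opposite sign: prefer the negative one.
        return if a.is_sign_negative() { a } else { b };
    }
    if a < b { a } else { b }
}

/// Maximum that propagates NaN and orders +0.0 above -0.0.
fn fmax(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() {
        return f32::NAN;
    }
    if a == b {
        // Equal values include zeroes of opposite sign: prefer the positive one.
        return if a.is_sign_positive() { a } else { b };
    }
    if a > b { a } else { b }
}
```

By contrast, Rust's own `f32::min` and `f32::max` return the non-NaN operand when exactly one input is NaN, which is the `minNum`/`maxNum` style of behavior.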
#### Rounding

These instructions round their argument to a nearby integral value, still
represented as a floating point number.
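
As an illustration only, the Rust standard library has operations of the same kind. The helper below is hypothetical and does not claim a one-to-one mapping to Cranelift's rounding instructions; it shows that the result stays a float, so no integer range limit applies.

```rust
/// Rounds `x` toward negative infinity, toward positive infinity, and
/// toward zero, returning all three results as floats.
fn round_all(x: f32) -> (f32, f32, f32) {
    (x.floor(), x.ceil(), x.trunc())
}
```

A value such as `1.0e30`, far outside the range of any 64-bit integer, is already integral and is returned unchanged by all three.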
### Extending loads and truncating stores

Most ISAs provide instructions that load an integer value smaller than a register
and extend it to the width of the register. Similarly, store instructions that
@ -794,45 +845,38 @@ provides extending loads and truncation stores for 8, 16, and 32-bit memory
accesses.

These instructions succeed, trap, or have undefined behavior, under the same
conditions as [normal loads and stores](#memory).
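
The value semantics of the 8-bit case can be sketched in Rust over a plain byte slice. The function names are modeled on the kind of instruction described here and should be read as hypothetical; an out-of-range `addr` panics in this sketch rather than modeling a trap.

```rust
/// Zero-extending 8-bit load: the upper 24 bits of the result are zero.
fn uload8(mem: &[u8], addr: usize) -> u32 {
    u32::from(mem[addr])
}

/// Sign-extending 8-bit load: the upper 24 bits copy bit 7 of the byte.
fn sload8(mem: &[u8], addr: usize) -> i32 {
    i32::from(mem[addr] as i8)
}

/// Truncating 8-bit store: only the low 8 bits of `value` are written.
fn istore8(mem: &mut [u8], addr: usize, value: u32) {
    mem[addr] = value as u8;
}
```

Storing `0x1234_56f0` with `istore8` writes the byte `0xf0`; reading it back gives `0xf0` (240) through `uload8` and `-16` through `sload8`.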
## ISA-specific instructions

Target ISAs can define supplemental instructions that do not make sense to
support generally.

### x86

Instructions that can only be used by the x86 target ISA.

## Codegen implementation instructions

Frontends don't need to emit the instructions in this section themselves;
Cranelift will generate them automatically as needed.
### Legalization operations

These instructions are used as helpers when legalizing types and operations for
the target ISA.

### Special register operations

The prologue and epilogue of a function need to manipulate special registers like the stack
pointer and the frame pointer. These instructions should not be used in regular code.
### CPU flag operations

These operations are for working with the "flags" registers of some CPU
architectures.

### Live range splitting

Cranelift's register allocator assigns each SSA value to a register or a spill
slot on the stack for its entire live range. Since the live range of an SSA
@ -851,16 +895,14 @@ Register values can be temporarily diverted to other registers by the
`regmove` instruction, and to and from stack slots by `regspill`
and `regfill`.

## Instruction groups

All of the shared instructions are part of the `base` instruction
group.

Target ISAs may define further instructions in their own instruction groups.
## Implementation limits

Cranelift's intermediate representation imposes some limits on the size of
functions and the number of entities allowed. If these limits are exceeded, the
@ -904,19 +946,16 @@ Size of function call arguments on the stack
This is probably not possible to achieve given the limit on the number of
arguments, except by requiring extremely large offsets for stack arguments.
## Glossary

addressable
Memory in which loads and stores have defined behavior. They either
succeed or [trap], depending on whether the memory is
[accessible].

accessible
[Addressable] memory in which loads and stores always succeed
without [trapping], except where specified otherwise (e.g., with the
`aligned` flag). Heaps, globals, tables, and the stack may contain
accessible, merely addressable, and outright unaddressable regions.
There may also be additional regions of addressable and/or accessible
@ -928,7 +967,7 @@ Glossary
the last instruction.

entry block
The [EBB] that is executed first in a function. Currently, a
Cranelift function must have exactly one entry block which must be the
first block in the function. The types of the entry block arguments must
match the types of arguments in the function signature.
@ -936,12 +975,12 @@ Glossary
extended basic block
EBB
A maximal sequence of instructions that can only be entered from the
top, and that contains no [terminator instruction]s except for
the last one. An EBB can contain conditional branches that can fall
through to the following instructions in the block, but only the first
instruction in the EBB can be a branch target.

The last instruction in an EBB must be a [terminator instruction],
so execution cannot flow through to the next EBB in the function. (But
there may be a branch to the next EBB.)
@ -971,7 +1010,7 @@ Glossary
- Type and flags of each return value.

Not all function attributes are part of the signature. For example, a
function that never returns could be marked as `noreturn`, but that
is not necessary to know when calling it, so it is just an attribute,
and not part of the signature.
@ -996,17 +1035,17 @@ Glossary
stack slot
A fixed size memory allocation in the current function's activation
frame. These include [explicit stack slot]s and
[spill stack slot]s.

explicit stack slot
A fixed size memory allocation in the current function's activation
frame. These differ from [spill stack slot]s in that they can
be created by frontends and they may have their addresses taken.

spill stack slot
A fixed size memory allocation in the current function's activation
frame. These differ from [explicit stack slot]s in that they are
only created during register allocation, and they may not have their
address taken.