This makes sure that the LLVM target features match the one generated by
Clang:
- This fixes a bug introduced when setting the target CPU for all
targets: Cortex-M4 would now start using floating point operations
while they were disabled in C.
- This will make it possible in the future to inline C functions in Go
and vice versa. This will need some more work though.
There is a code size impact. Cortex-M4 targets are increased slightly in
binary size while Cortex-M0 targets tend to be reduced a little bit.
Other than that, there is little impact.
Instead of doing everything in the interrupt lowering pass, generate
some more code in gen-device to declare interrupt handler functions and
do some work in the compiler so that interrupt lowering becomes a lot
simpler.
This has several benefits:
- Overall code is smaller, in particular the interrupt lowering pass.
- The code should be a bit less "magical" and instead a bit easier to
read. In particular, instead of having a magic
runtime.callInterruptHandler (that is fully written by the interrupt
lowering pass), the runtime calls a generated function like
device/sifive.InterruptHandler where this switch already exists in
code.
- Debug information is improved. This can be helpful during actual
debugging but is also useful for other uses of DWARF debug
information.
For an example on debug information improvement, this is what a
backtrace might look like before this commit:
Breakpoint 1, 0x00000b46 in UART0_IRQHandler ()
(gdb) bt
#0 0x00000b46 in UART0_IRQHandler ()
#1 <signal handler called>
[..etc]
Notice that the debugger doesn't see the source code location where it
has stopped.
After this commit, breaking at the same line might look like this:
Breakpoint 1, (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
200 uart.Receive(byte(nrf.UART0.RXD.Get()))
(gdb) bt
#0 (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
#1 UART0_IRQHandler () at /home/ayke/src/github.com/tinygo-org/tinygo/src/device/nrf/nrf51.go:176
#2 <signal handler called>
[..etc]
By now, the debugger sees an actual source location for UART0_IRQHandler
(in the generated file) and an inlined function.
The target triples have to match mostly to be able to link LLVM modules.
Linking LLVM modules is already possible (the triples already match),
but testing becomes much easier when they match exactly.
For macOS, I picked "macosx10.12.0". That's an old and unsupported
version, but I had to pick _something_. Clang by default uses
"macos10.4.0", which is much older.
This matches the behavior of Clang, which uses optsize for -Os and adds
minsize for -Oz.
The code size change is all over the map, but using a hacked together
size comparison tool I've found that there is a slight reduction in
binary size overall (-1.6% with the tinygo smoke tests and -0.8% for the
drivers smoke test).
This commit has a few related changes:
* It sets the optsize attribute immediately in the compiler instead of
adding it to each function afterwards in a loop. This seems to me
like the more appropriate way to do it.
* It centralizes setting the optsize attribute in the transform
package, to make later changes easier.
* It sets the optsize in a few more places: to runtime.initAll and to
WebAssembly i64 wrappers.
This commit does not affect the binary size of any of the smoke tests,
so should be risk-free.
This commit adds object layout information to new heap allocations. It
is not yet used anywhere: the next commit will make use of it.
Object layout information will eventually be used for a (mostly) precise
garbage collector. This is what the data is made for. However, it is
also useful in the interp package which can work better if it knows the
memory layout and thus the approximate LLVM type of heap-allocated
objects.
This layout parameter is currently always nil and ignored, but will
eventually contain a pointer to a memory layout.
This commit also adds module verification to the transform tests, as I
found out that it didn't (and therefore didn't initially catch all
bugs).
This commit simplifies the IR a little bit: instead of calling
pseudo-functions runtime.interfaceImplements and
runtime.interfaceMethod, real declared functions are being called that
are then defined in the interface lowering pass. This should simplify
the interaction between various transformation passes. It also reduces
the number of lines of code, which is generally a good thing.
This adds support for a construct like this:
type foo func(fn foo)
Unfortunately, LLVM cannot create function pointers that look like this.
LLVM only supports named types for structs (not for pointers) and thus
can't add a pointer to a function type of the same type to a parameter
of that function type.
The fix is simple: cast all function pointers to a void function, in
LLVM IR:
void ()*
Raw function pointers are cast to this type before storing, and cast
back to the regular function type before calling. This means that
function parameters will never refer to its own type because raw
function types are fixed at that one type.
Somehow, this does have an effect on binary size in some cases. The
effect is small and goes both ways. On top of that, there is work
underway in LLVM which would make all pointer types opaque (without a
pointee type). This would make this whole commit useless and therefore
should fix any size increases that might happen.
https://llvm.org/docs/OpaquePointers.html
The division and remainder operations were lowered directly to LLVM IR.
This is wrong however because the Go specification defines exactly what
happens on a divide by zero or signed integer overflow and LLVM IR
itself treats those cases as undefined behavior. Therefore, this commit
implements divide by zero and signed integer overflow according to the
Go specification.
This does have an impact on the generated code, but it is surprisingly
small. I've used the drivers repo to test the code before and after, and
to my surprise most driver smoke tests are not changed at all. Those
that are, have only a small increase in code size. At the same time,
this change makes TinyGo more compliant to the Go specification.
This attribute is also set by Clang when it compiles C source files
(unless -fexceptions is set). The advantage is that no unwind tables are
emitted on Linux (and perhaps other systems). It also avoids
__aeabi_unwind_cpp_pr0 on ARM when using the musl libc.
... instead of setting a special -target= value. This is more robust and
makes sure that the test actually tests different arcitectures as they
would be compiled by TinyGo. As an example, the bug of the bugfix in the
previous commit ("arm: use armv7 instead of thumbv7") would have been
caught if this change was applied earlier.
I've decided to put GOOS/GOARCH in compileopts.Options, as it makes
sense to me to treat them the same way as command line parameters.
For example, the following did not work before but does work with this
change:
// int add(int a, int b) {
// return a + b;
// }
import "C"
func main() {
println("add:", C.add(3, 5))
}
Even better, the functions in the header are compiled together with the
rest of the Go code and so they can be optimized together! Currently,
inlining is not yet allowed but const-propagation across functions
works. This should be improved in the future.
This commit changes a target triple like "armv6m-none-eabi" to
"armv6m-unknown-unknow-eabi". The reason is that while the former is
correctly parsed in Clang (due to normalization), it wasn't parsed
correctly in LLVM meaning that the environment wasn't set to EABI.
This change normalizes all target triples and uses the EABI environment
(-eabi in the triple) for Cortex-M targets.
This change also drops the `--target=` flag in the target JSON files,
the flag is now added implicitly in `(*compileopts.Config).CFlags()`.
This removes some duplication in target JSON files.
Unfortunately, this change also increases code size for Cortex-M
targets. It looks like LLVM now emits calls like __aeabi_memmove instead
of memmove, which pull in slightly more code (they basically just call
the regular C functions) and the calls themself don't seem to be as
efficient as they could be. Perhaps this is a LLVM bug that will be
fixed in the future, as this is a very common occurrence.
This is a loose collection of small fixes flagged by staticcheck:
- dead code
- regexp expressions not using backticks (`foobar` / "foobar")
- redundant types of slice and map initializers
- misc other fixes
Not all of these seem very useful to me, but in particular dead code is
nice to fix. I've fixed them all just so that if there are problems,
they aren't hidden in the noise of less useful issues.
For example, in this code:
type kv struct {
v float32
}
func foo(a *kv) {
type kv struct {
v byte
}
}
Both 'kv' types would be given the same LLVM type, even though they are
different types! This is fixed by only creating a LLVM type once per Go
type (types.Type).
As an added bonus, this change gives a performance improvement of about
0.4%. Not that much, but certainly not nothing for such a small change.
This commit improves make([]T, len) to be closer to upstream Go. The
difference is unlikely to have much real-world effect, but previously
certain make([]T, len) expressions would not result in a slice out of
bounds error in TinyGo while they would have done such a thing in Go
proper. In practice, available RAM is likely to be a bigger limiting
factor.
This commit adds support for the following packages:
- crypto/md5
- crypto/sha1
- crypto/sha256
- crypto/sha512
They would normally need assembly implementations, but with these
aliases they already work everywhere.
This makes them more flexible, especially with Go 1.17 making the
situation more complicated (see
1d20a362d0).
It also makes it possible to do the same for many other functions, such
as assembly implementations of cryptographich functions which are
similarly dependent on the architecture.
This patch adds a new pragma for functions and globals to set the
section name. This can be useful to place a function or global in a
special device specific section, for example:
* Functions may be placed in RAM to make them run faster, or in flash
(if RAM is the default) to not let them take up RAM.
* DMA memory may only be placed in a special memory area.
* Some RAM may be faster than other RAM, and some globals may be
performance critical thus placing them in this special RAM area can
help.
* Some (large) global variables may need to be placed in external RAM,
which can be done by placing them in a special section.
To use it, you have to place a function or global in a special section,
for example:
//go:section .externalram
var externalRAMBuffer [1024]byte
This can then be placed in a special section of the linker script, for
example something like this:
.bss.extram (NOLOAD) : {
*(.externalram)
} > ERAM
These pragmas weren't really tested anywhere, except that some code
might break if they are not properly applied.
These tests make it easy to see they work correctly and also provide a
logical place to add new pragma tests.
I've also made a slight change to how functions and globals are created:
with the change they're also created in the IR even if they're not
referenced. This makes testing easier.
This commit includes two changes:
* It makes unexported interface methods package-private, so that it's
not possible to type-assert on an unexported method in a different
package.
* It makes the globals used to identify interface methods defined
globals, so that they can (eventually) be left in the program for an
eventual non-LTO build mode.
Do not store the context parameter (which is used for closures and
function pointers) in the goroutine start parameter bundle for direct
functions that don't need a context parameter. This avoids storing the
(undef) context parameter and thus makes the IR to start a new goroutine
simpler in most cases.
This reduces code size in the channel.go and goroutines.go tests.
Surprisingly, all test cases (when compiled with -target=microbit) have
a changed binary, I haven't investigated why but I suppose the codegen
is slightly different for the runtime.run function (which starts the
main goroutine).
Closure variables are allocated in a parent function and are thus never
nil. Don't do a nil check before reading or modifying the value.
This commit results in a slight reduction in code size in some test
cases: calls.go, channel.go, goroutines.go, json.go, sort.go -
presumably wherever closures are used.
Not sure why you would ever do this, but it appears to be allowed by the
Go specification and previously TinyGo would crash with an unhelpful
error message when you would do this. I don't see any practical use of
it.
The implementation simply runs the builtin directly.
This commit adds a test for both WebAssembly and Cortex-M targets (which
use a different way of goroutine lowering) to show how they lower
goroutines. It makes it easier to show how the output changes in future
commits.
Move the code from the compiler.go file to the goroutine.go file, which
is a more appropriate place. This keeps all the goroutine related code
in one file, to make it easier to find.
This results in smaller and likely more efficient code. It does require
some architecture specific code for each architecture, but I've kept the
amount of code as small as possible.
The next commit will change the implementation of func values on Linux
as a result of switching to a task-based scheduler. To keep the
compiler/testdata/func.go test working as expected, switch to
WebAssembly tests.
With this is possible to enable e.g., SIMD in WASM using -llvm-features
+simd128. Multiple features can be specified separated by comma,
e.g., -llvm-features +simd128,+tail-call
With help from @deadprogram and @aykevl.
In many cases, position information is not stored in Go SSA instructions
because they don't exit directly in the source code. This includes
implicit type conversions, implicit returns at the end of a function,
the creation of a (hidden) slice when calling a variadic function, and
many other cases. I'm not sure where this information is supposed to
come from, but this patch takes the value (usually) from the value the
instruction refers to. This seems to work well for these implicit
conversions.
I've also added a few extra tests to the heap-to-stack transform pass,
of which one requires this improved position information.
This allows better escape analysis even without being able to see the
entire program. This makes the stack allocation test case more complete
but probably won't have much of an effect outside of that (as the
compiler is able to infer these attributes in the whole-program
functionattrs pass).
There is no good reason for func values to refer to interface type
codes. The only thing they need is a stable identifier for function
signatures, which is easily created as a new kind of globals. Decoupling
makes it easier to change interface related code.
This is basically just a golden test for the "switch" style of func
lowering. The next commit will make changes to this lowering, which will
be visible in the test output.
Some errors were generated but never returned or never checked in the
test function. That's a problem. Therefore this commit fixes this
oversight (by me).
This commit optimizes string literals and globals by setting the
appropriate alignment and using a nil pointer in zero-length strings.
- Setting the alignment for string values has a surprisingly large
effect, up to around 2% in binary size. I suspect that LLVM will
pick some default alignment for larger byte arrays if no alignment
has been specified and forcing an alignment of 1 will pack all
strings closer together.
- Using nil for zero-length strings also has a positive effect, but
I'm not sure why. Perhaps it makes some optimizations more trivial.
- Always setting the alignment on globals improves code size slightly,
probably for the same reasons setting the alignment of string
literals improves code size. The effect is much smaller, however.
This commit might have an effect on performance, but if it does this
should be tested separately and such a large win in binary size should
definitely not be ignored for small embedded systems.
Sometimes, LLVM may rename named structs when merging modules.
Therefore, we can't rely on typecodeID structs to retain their struct
names.
This commit changes the interface lowering pass to not rely on these
names. The interp package does however still rely on this name, but I
hope to fix that in the future.