This gives a small improvement now, and is needed to be able to use the
Heap2Stack transform that's available in the Attributor pass. This
Heap2Stack transform could replace our custom OptimizeAllocs pass.
Most of the changes are just IR that changed, the actual change is
relatively small.
To give an example of why this is useful, here is the code size before
this change:
$ tinygo build -o test -size=short ./testdata/stdlib.go
code data bss | flash ram
95620 1812 968 | 97432 2780
$ tinygo build -o test -size=short ./testdata/stdlib.go
code data bss | flash ram
95380 1812 968 | 97192 2780
That's a 0.25% reduction. Not a whole lot, but nice for such a small
patch.
This has two benefits:
1. It attributes these bytes to the internal/task package (in
-size=full), instead of (unknown).
2. It makes it possible to print the stack sizes variable in GDB.
This is what it might look like in GDB:
(gdb) p 'internal/task.stackSizes'
$13 = {344, 120, 80, 2048, 360, 112, 80, 120, 2048, 2048}
This is a big commit that changes the way runtime type information is stored in
the binary. Instead of compressing it and storing it in a number of sidetables,
it is stored similar to how the Go compiler toolchain stores it (but still more
compactly).
This has a number of advantages:
* It is much easier to add new features to reflect support. They can simply
be added to these structs without requiring massive changes (especially in
the reflect lowering pass).
* It removes the reflect lowering pass, which was a large amount of hard to
understand and debug code.
* The reflect lowering pass also required merging all LLVM IR into one
module, which is terrible for performance especially when compiling large
amounts of code. See issue 2870 for details.
* It is (probably!) easier to reason about for the compiler.
The downside is that it increases code size a bit, especially when reflect is
involved. I hope to fix some of that in later patches.
ThinLTO results in a small code size reduction, which is nice
(especially on these very small chips). It also brings us one step
closer to using ThinLTO everywhere.
This flag controls whether to convert external i64 parameters for use in
a browser-like environment.
This flag was needed in the past because back then we only supported
wasm on browsers but no WASI. Now, I can't think of a reason why anybody
would want to change the default. For `-target=wasm` (used for
browser-like environments), the wasm_exec.js file expects this
i64-via-stack ABI. For WASI, there is no limitation on i64 values and
`-wasm-abi=generic` is the default.
There was a bug in the wasm ABI lowering pass (found using
AddressSanitizer on LLVM 15) that resulted in a rather subtle memory
corruption. This commit fixes this issues.
A number of llvm.Const* functions (in particular extractvalue and
insertvalue) were removed in LLVM 15, so we have to use a builder
instead. This builder will create the same constant values, it simply
uses a different API.
Scanning of allocas was entirely broken on WebAssembly. The code
intended to do this was never run. There were also no tests.
Looking into this further, I found that it is actually not really
necessary to do that: the C stack can be scanned conservatively and in
fact this was already done for goroutine stacks (because they live on
the heap and are always referenced). It wasn't done for the system stack
however.
With these fixes, I believe code should be both faster *and* more
correct.
I found this in my work to get opaque pointers supported in LLVM 15,
because the code that was never reached now finally got run and was
actually quite buggy.
Go 1.19 started reformatting code in a way that makes it more obvious
how it will be rendered on pkg.go.dev. It gets it almost right, but not
entirely. Therefore, I had to modify some of the comments so that they
are formatted correctly.
Previously, the MakeGCStackSlots pass would attempt to pop the stack chain before a tail call.
This resulted in use-after-free bugs when the tail call allocated memory and used a value allocated by its caller.
Instead of trying to move the stack chain pop, remove the tail flag from the call.
Precise globals require a whole program optimization pass that is hard
to support when building packages separately. This patch removes support
for these globals by converting the last use (Linux) to use
linker-defined symbols instead.
For details, see: https://github.com/tinygo-org/tinygo/issues/2870
This shrinks transform.Optimize() a little bit, working towards the goal
of https://github.com/tinygo-org/tinygo/issues/2870. I ran the smoke
tests and there is no practical downside: one test got smaller (??) and
one had a different .hex hash, but other than that there was no
difference.
This should also make TinyGo a liiitle bit faster but it's probably not
even measurable.
The transform package is the more appropriate location for package-level
optimizations, to match `transform.Optimize` for whole-program
optimizations.
This is just a refactor, to make later changes easier to read.
In https://github.com/tinygo-org/tinygo/issues/2777, a poison value
ended up in `runtime.alloc`. This shouldn't happen, especially not for
well written code. So I'm not sure why it happens. But here is a fix
anyway.
There used to be a difference between `byte` and `uint8` in interface
methods. These are aliases, so they should be treated the same.
This patch introduces a custom serialization format for types,
circumventing the `Type.String()` method that is slightly wrong for our
purposes.
This also fixes an issue with the `any` keyword in Go 1.18, which
suffers from the same problem (but this time actually leads to a crash).
ThinLTO optimizes across LLVM modules at link time. This means that
optimizations (such as inlining and const-propagation) are possible
between C and Go. This makes this change especially useful for CGo, but
not just for CGo. By doing some optimizations at link time, the linker
can discard some unused functions and this leads to a size reduction on
average. It does increase code size in some cases, but that's true for
most optimizations.
I've excluded a number of targets for now (wasm, avr, xtensa, windows,
macos). They can probably be supported with some more work, but that
should be done in separate PRs.
Overall, this change results in an average 3.24% size reduction over all
the tinygo.org/x/drivers smoke tests.
TODO: this commit runs part of the pass pipeline twice. We should set
the PrepareForThinLTO flag in the PassManagerBuilder for even further
reduced code size (0.7%) and improved compilation speed.
This removes the parentHandle argument from the internal calling convention.
It was formerly used to implment coroutines.
Now that coroutines have been removed, it is no longer necessary.
This adds support for building with `-tags=llvm13` and switches to LLVM
13 for tinygo binaries that are statically linked against LLVM.
Some notes on this commit:
* Added `-mfloat-abi=soft` to all Cortex-M targets because otherwise
nrfx would complain that floating point was enabled on Cortex-M0.
That's not the case, but with `-mfloat-abi=soft` the `__SOFTFP__`
macro is defined which silences this warning.
See: https://reviews.llvm.org/D100372
* Changed from `--sysroot=<root>` to `-nostdlib -isystem <root>` for
musl because with Clang 13, even with `--sysroot` some system
libraries are used which we don't want.
* Changed all `-Xclang -internal-isystem -Xclang` to simply
`-isystem`, for consistency with the above change. It appears to
have the same effect.
* Moved WebAssembly function declarations to the top of the file in
task_asyncify_wasm.S because (apparently) the assembler has become
more strict.
When I wrote the code originally, I didn't know about SetAlignment so I
hacked a way around it by allocating [...]uintptr types. However, this
allocates a few too many bytes in some cases.
This commit changes this to only allocate the space that we actually
need.
The code size effect is mixed, but generally positive. The combined
average is reduced by 0.27% with more programs being reduced in size
than are increasing in size.
This change implements a new "scheduler" for WebAssembly using binaryen's asyncify transform.
This is more reliable than the current "coroutines" transform, and works with non-Go code in the call stack.
runtime (js/wasm): handle scheduler nesting
If WASM calls into JS which calls back into WASM, it is possible for the scheduler to nest.
The event from the callback must be handled immediately, so the task cannot simply be deferred to the outer scheduler.
This creates a minimal scheduler loop which is used to handle such nesting.