This commit improves make([]T, len) to be closer to upstream Go. The
difference is unlikely to have much real-world effect, but previously
certain make([]T, len) expressions would not result in a slice out of
bounds error in TinyGo while they would have done such a thing in Go
proper. In practice, available RAM is likely to be a bigger limiting
factor.
This commit adds support for the following packages:
- crypto/md5
- crypto/sha1
- crypto/sha256
- crypto/sha512
They would normally need assembly implementations, but with these
aliases they already work everywhere.
This makes them more flexible, especially with Go 1.17 making the
situation more complicated (see
1d20a362d0).
It also makes it possible to do the same for many other functions, such
as assembly implementations of cryptographich functions which are
similarly dependent on the architecture.
This patch adds a new pragma for functions and globals to set the
section name. This can be useful to place a function or global in a
special device specific section, for example:
* Functions may be placed in RAM to make them run faster, or in flash
(if RAM is the default) to not let them take up RAM.
* DMA memory may only be placed in a special memory area.
* Some RAM may be faster than other RAM, and some globals may be
performance critical thus placing them in this special RAM area can
help.
* Some (large) global variables may need to be placed in external RAM,
which can be done by placing them in a special section.
To use it, you have to place a function or global in a special section,
for example:
//go:section .externalram
var externalRAMBuffer [1024]byte
This can then be placed in a special section of the linker script, for
example something like this:
.bss.extram (NOLOAD) : {
*(.externalram)
} > ERAM
These pragmas weren't really tested anywhere, except that some code
might break if they are not properly applied.
These tests make it easy to see they work correctly and also provide a
logical place to add new pragma tests.
I've also made a slight change to how functions and globals are created:
with the change they're also created in the IR even if they're not
referenced. This makes testing easier.
This commit includes two changes:
* It makes unexported interface methods package-private, so that it's
not possible to type-assert on an unexported method in a different
package.
* It makes the globals used to identify interface methods defined
globals, so that they can (eventually) be left in the program for an
eventual non-LTO build mode.
Do not store the context parameter (which is used for closures and
function pointers) in the goroutine start parameter bundle for direct
functions that don't need a context parameter. This avoids storing the
(undef) context parameter and thus makes the IR to start a new goroutine
simpler in most cases.
This reduces code size in the channel.go and goroutines.go tests.
Surprisingly, all test cases (when compiled with -target=microbit) have
a changed binary, I haven't investigated why but I suppose the codegen
is slightly different for the runtime.run function (which starts the
main goroutine).
Closure variables are allocated in a parent function and are thus never
nil. Don't do a nil check before reading or modifying the value.
This commit results in a slight reduction in code size in some test
cases: calls.go, channel.go, goroutines.go, json.go, sort.go -
presumably wherever closures are used.
Not sure why you would ever do this, but it appears to be allowed by the
Go specification and previously TinyGo would crash with an unhelpful
error message when you would do this. I don't see any practical use of
it.
The implementation simply runs the builtin directly.
This commit adds a test for both WebAssembly and Cortex-M targets (which
use a different way of goroutine lowering) to show how they lower
goroutines. It makes it easier to show how the output changes in future
commits.
Move the code from the compiler.go file to the goroutine.go file, which
is a more appropriate place. This keeps all the goroutine related code
in one file, to make it easier to find.
This results in smaller and likely more efficient code. It does require
some architecture specific code for each architecture, but I've kept the
amount of code as small as possible.
The next commit will change the implementation of func values on Linux
as a result of switching to a task-based scheduler. To keep the
compiler/testdata/func.go test working as expected, switch to
WebAssembly tests.
With this is possible to enable e.g., SIMD in WASM using -llvm-features
+simd128. Multiple features can be specified separated by comma,
e.g., -llvm-features +simd128,+tail-call
With help from @deadprogram and @aykevl.
In many cases, position information is not stored in Go SSA instructions
because they don't exit directly in the source code. This includes
implicit type conversions, implicit returns at the end of a function,
the creation of a (hidden) slice when calling a variadic function, and
many other cases. I'm not sure where this information is supposed to
come from, but this patch takes the value (usually) from the value the
instruction refers to. This seems to work well for these implicit
conversions.
I've also added a few extra tests to the heap-to-stack transform pass,
of which one requires this improved position information.
This allows better escape analysis even without being able to see the
entire program. This makes the stack allocation test case more complete
but probably won't have much of an effect outside of that (as the
compiler is able to infer these attributes in the whole-program
functionattrs pass).
There is no good reason for func values to refer to interface type
codes. The only thing they need is a stable identifier for function
signatures, which is easily created as a new kind of globals. Decoupling
makes it easier to change interface related code.
This is basically just a golden test for the "switch" style of func
lowering. The next commit will make changes to this lowering, which will
be visible in the test output.
Some errors were generated but never returned or never checked in the
test function. That's a problem. Therefore this commit fixes this
oversight (by me).
This commit optimizes string literals and globals by setting the
appropriate alignment and using a nil pointer in zero-length strings.
- Setting the alignment for string values has a surprisingly large
effect, up to around 2% in binary size. I suspect that LLVM will
pick some default alignment for larger byte arrays if no alignment
has been specified and forcing an alignment of 1 will pack all
strings closer together.
- Using nil for zero-length strings also has a positive effect, but
I'm not sure why. Perhaps it makes some optimizations more trivial.
- Always setting the alignment on globals improves code size slightly,
probably for the same reasons setting the alignment of string
literals improves code size. The effect is much smaller, however.
This commit might have an effect on performance, but if it does this
should be tested separately and such a large win in binary size should
definitely not be ignored for small embedded systems.
Sometimes, LLVM may rename named structs when merging modules.
Therefore, we can't rely on typecodeID structs to retain their struct
names.
This commit changes the interface lowering pass to not rely on these
names. The interp package does however still rely on this name, but I
hope to fix that in the future.
This commit adds a new transform that converts reflect Implements()
calls to runtime.interfaceImplements. At the moment, the Implements()
method is not yet implemented (how ironic) but if the value passed to
Implements is known at compile time the method call can be optimized to
runtime.interfaceImplements to make it a regular interface assert.
This commit is the last change necessary to add basic support for the
encoding/json package. The json package is certainly not yet fully
supported, but some trivial objects can be converted to JSON.
This is important as golden test output and to verify that the output is
correct. Later improvements and bug fixes are clearly visible in the IR,
and unintentional changes will also be immediately spotted.
This patch fixes a use of the global context. I've seen a few instances
of crashes in the llvm.ConstInt function when called from
makeStructTypeFields, which I believe are caused by this bug.
Previously there was code to avoid impossible type asserts but it wasn't
great and in fact was too aggressive when combined with reflection.
This commit improves this by checking all types that exist in the
program that may appear in an interface (even struct fields and the
like) but without creating runtime.typecodeID objects with the type
assert. This has two advantages:
* As mentioned, it optimizes impossible type asserts away.
* It allows methods on types that were only asserted on (in
runtime.typeAssert) but never used in an interface to be optimized
away using GlobalDCE. This may have a cascading effect so that other
parts of the code can be further optimized.
This sometimes massively improves code size and mostly negates the code
size regression of the previous commit.
This distinction was useful before when reflect wasn't properly
supported. Back then it made sense to only include method sets that were
actually used in an interface. But now that it is possible to get to
other values (for example, by extracting fields from structs) and it is
possible to turn them back into interfaces, it is necessary to preserve
all method sets that can possibly be used in the program in a type
assert, interface assert or interface method call.
In the future, this logic will need to be revisited again when
reflect.New or reflect.Zero gets implemented.
Code size increases a bit in some cases, but usually in a very limited
way (except for one outlier in the drivers smoke tests). The next commit
will improve the situation significantly.
Previously we used the --export-all linker flag to export most
functions. However, this is not needed and possibly increases binary
size. Instead, we should be exporting the specific functions to be
exported.
An allocated object is never nil, so there is no need for a nil check.
This probably does not result in any better optimization (the nil check
is easily optimized away by LLVM because the result of runtime.alloc is
marked nonnull) but it makes the slice tests a bit cleaner.
It's difficult to create clean test cases while remaining compatible
with multiple LLVM versions. Most test outputs are much more readable
after an instcombine pass but instcombine rules change between LLVM
versions, leading to different (but semantically equivalent) test
outputs.
This reduces the test coverage a little bit (because old LLVM versions
aren't tested as well), but it als makes it easier to add more complex
tests.
In the future it might be a good idea to make the compiler output a bit
less messy so these workarounds are not needed.
This commit switches from the previous behavior of compiling the whole
program at once, to compiling every package in parallel and linking the
LLVM bitcode files together for further whole-program optimization.
This is a small performance win, but it has several advantages in the
future:
- There are many more things that can be done per package in parallel,
avoiding the bottleneck at the end of the compiler phase. This
should speed up the compiler futher.
- This change is a necessary step towards a non-LTO build mode for
fast incremental builds that only rebuild the changed package, when
compiler speed is more important than binary size.
- This change refactors the compiler in such a way that it will be
easier to inspect the IR for one package only. Inspecting this IR
will be very helpful for compiler developers.
The SimpleDCE pass was previously used to only compile the parts of the
program that were in use. However, lately the only real purpose has been
to speed up the compiler a bit by only compiling the necessary
functions.
This pass however is a problem for compiling (and caching) packages in
parallel. Therefore, this commit removes it as a preparatory step
towards that goal.
This is a leftover from a long time ago, when everything was still in
the global context. The fact that this uses the global context is most
certainly a bug.
I have seen occasional crashes in the build-packages-indepedently branch
(and PRs based on it) which I suspect are caused by this bug. I think
this is a long-dormant bug that only surfaced when doing the compilation
steps in parallel.
This doesn't yet add support for actually making use of variadic
functions, but at least allows (unintended) variadic functions like the
following to work:
void foo();
Because of a bug in the ARM backend of LLVM, the cmpxchg instruction is
lowered using ldrexd/strexd instructions which don't exist on Cortex-M
cores. This leads to an "undefined instruction" exception at runtime.
Therefore, this patch works around this by lowering directly to a call
to the __sync_val_compare_and_swap_8 function, which is what the backend
should be doing.
For details, see: https://reviews.llvm.org/D95891
To test this patch, you can run the code on a Cortex-M3 or higher
microcontroller, for example:
tinygo flash -target=pca10040 ./testdata/atomic.go
Before this patch, this would trigger an error. With this patch, the
behavior is correct. The error (without this patch) could look like
this:
fatal error: undefined instruction with sp=0x200007cc pc=nil
Moving settings to a separate config struct has two benefits:
- It decouples the compiler a bit from other packages, most
importantly the compileopts package. Decoupling is generally a good
thing.
- Perhaps more importantly, it precisely specifies which settings are
used while compiling and affect the resulting LLVM module. This will
be necessary for caching the LLVM module.
While it would have been possible to cache without this refactor, it
would have been very easy to miss a setting and thus let the
compiler work with invalid/stale data.