This commit parallelizes almost everything that can currently be
parallelized. With that, it also introduces a framework for easily
parallelizing other parts of the compiler.
Code for baremetal targets already compiles slightly faster because it
can parallelize the compilation of supporting assembly files. However,
the speedup is especially noticeable when libraries (compiler-rt,
picolibc) also need to be compiled: they will be compiled in parallel
next to the Go files using all available cores. On my dual core laptop
(4 cores if you count hyperthreading) this cuts compilation time roughly
in half when compiling something for a Cortex-M board after running
`tinygo clean`.
This commit finally introduces unit tests for the compiler, to check
whether input Go code is converted to the expected output IR.
To make this necessary, a few refactors were needed. Hopefully these
refactors (to compile a program package by package instead of all at
once) will eventually become standard, so that packages can all be
compiled separate from each other and be cached between compiles.
This commit switches to LLVM 11 for builds with LLVM linked statically
(e.g. `make`). It does not yet switch the default for builds dynamically
linked to LLVM, that should be done in a later change.
This commit also changes to use the default host toolchain (probably
GCC) instead of Clang as the default compiler in CI. There were some
issues with Clang 3.8 in CI and hopefully this will fix it.
Additionally it updates the way LLVM is built on Windows, with
-DLLVM_ENABLE_PIC=OFF (which should have been used all along). This
change makes it possible to revert a hack to build libclang manually and
instead uses the libclang static library like on all other operating
systems, simplifying the Makefile.
On Fedora 33+, there is a buggy package that installs to
`/usr/lib64/clang/{version}/lib`, even on 32-bit systems. The original
code sees the `/usr/lib64/clang/{version}` directory, checks for an
`include` subdirectory, and then gives up because it doesn't exist.
To be more robust, check both `/usr/lib64/clang/{version}/include` and
`/usr/lib/clang/{version}/include`, and only allow versions that match
the LLVM major version used to build tinygo.
Unfortunately, the .rodata section can't be stored in flash. Instead, an
explicit .progmem section should be used, which is supported in LLVM as
address space 1 but not exposed to normal programs.
Eventually a pass should be written that converts trivial const globals
of which all loads are visible to be in addrspace 1, to get the benefits
of storing those globals directly in ROM.
This can be useful to test improvements in LLVM master and to make it
possible to support LLVM 11 for the most part already before the next
release. That also allows catching LLVM bugs early to fix them upstream.
Note that tests do not yet pass for this LLVM version, but the TinyGo
compiler can be built with the binaries from apt.llvm.org (at the time
of making this commit).
On 64-bit Fedora, `lib64` is where the clang headers are, not `lib`. For
multiarch systems, both will exist, but it's likely you want 64-bit, so
check that first.
Test binaries must be run in the source directory of the package to be
tested. This wasn't done, leading to a few "file not found" errors.
This commit implements this. Unfortunately, it does not allow more
packages to be tested as both affected packages (debug/macho and
debug/plan9obj) will still fail with this patch even though the "file
not found" errors are gone.
Right now this requires setting the -port parameter, but other than that
it totally works (if esptool.py is installed). It works by converting
the ELF file to the custom ESP32 image format and flashing that using
esptool.py.
Interrupts store 32 bytes on the current stack, which may be a goroutine
stack. After that the interrupt switches to the main stack pointer so
nothing more is pushed to the current stack. However, these 32 bytes
were not included in the stack size calculation.
This commit adds those 32 bytes. The code is rather verbose, but that is
intentional to make sure it is readable. This is tricky code that's hard
to get right, so I'd rather keep it well documented.
This is a big change that will determine the stack size for many
goroutines automatically. Functions that aren't recursive and don't call
function pointers can in many cases have an automatically determined
worst case stack size. This is useful, as the stack size is usually much
lower than the previous hardcoded default of 1024 bytes: somewhere
around 200-500 bytes is common.
A side effect of this change is that the default stack sizes (including
the stack size for other architectures such as AVR) can now be changed
in the config JSON file, making it tunable per application.
Currently, turning optimizations off causes compile failures.
We rely on the optimizer removing some dead symbols.
Avoid providing an option that does not work right now.
In the future once everything has been fixed we can re-enable this.
For now, this is just an extra flag that can be used to print stack
frame information, but this is intended to provide a way to determine
stack sizes for goroutines at compile time in many cases.
Stack sizes are often somewhere around 350 bytes so are in fact not all
that big usually. Once this can be determined at compile time in many
cases, it is possible to use this information when available and as a
result increase the fallback stack size if the size cannot be determined
at compile time. This should reduce stack overflows while at the same
time reducing RAM consumption in many cases.
Interesting output for testdata/channel.go:
function stack usage (in bytes)
Reset_Handler 332
.Lcommand-line-arguments.fastreceiver 220
.Lcommand-line-arguments.fastsender 192
.Lcommand-line-arguments.iterator 192
.Lcommand-line-arguments.main$1 184
.Lcommand-line-arguments.main$2 200
.Lcommand-line-arguments.main$3 200
.Lcommand-line-arguments.main$4 328
.Lcommand-line-arguments.receive 176
.Lcommand-line-arguments.selectDeadlock 72
.Lcommand-line-arguments.selectNoOp 72
.Lcommand-line-arguments.send 184
.Lcommand-line-arguments.sendComplex 192
.Lcommand-line-arguments.sender 192
.Lruntime.run$1 548
This shows that the stack size (if these numbers are correct) can in
fact be determined automatically in many cases, especially for small
goroutines. One of the great things about Go is lightweight goroutines,
and reducing stack sizes is very important to make goroutines
lightweight on microcontrollers.
This is necessary to avoid a circular dependency in the loader (which
soon will need to read the Go version) and because it seems like a
better place anyway.
Eventually we might want to start using the FPU, but the easy option
right now is to simply disable it everywhere. Previously, it depended on
whether Clang was built as part of TinyGo or it was an external binary.
By setting the floating point mode explicitly, such inconsistencies are
avoided.
This commit creates a new cortex-m4 target which can be the central
place for setting FPU-related settings across all Cortex-M4 chips.
Previously we used --sysroot to set the sysroot explicitly.
Unfortunately, this flag is not used directly by Clang to set the
include path (<sysroot>/include) but is instead interpreted by the
toolchain code. This means that even when the toolchain is explicitly
set (using the --sysroot parameter), it may still decide to use a
different include path such as <sysroot>/usr/include (such as on
baremetal aarch64).
This commit uses the Clang-internal -internal-isystem flag which sets
the include directory directly (as a system include path). This should
be more robust.
The reason the --sysroot parameter has so far worked is that all
existing targets happened to add <sysroot>/include as an include path.
The relevant Clang code is here:
https://github.com/llvm/llvm-project/blob/release/9.x/clang/lib/Driver/Driver.cpp#L4693-L4739
So far, RISC-V is handled by RISCVToolchain, Cortex-M targets by
BareMetal (which seems to be specific to ARM unlike what the name says)
and aarch64 fell back to Generic_ELF.
See comment in the commit for details. It works around a bug that's been
reported here: https://bugs.llvm.org/show_bug.cgi?id=45336
This is a separate commit so it can easily be reverted if/when this
patch is backported to the LLVM 10 stable branch.
This commit also adds a bit of version independence, in particular for
external commands. It also adds the LLVM version to the `tinygo version`
command, which might help while debugging.
The frame pointer was already omitted in the object files that TinyGo
emits, but wasn't yet omitted in the C files it compiles. Omitting the
frame pointer is good for code size (and perhaps performance).
The frame pointer was originally used for printing stack traces in a
debugger. However, advances in DWARF debug info have made it largely
unnecessary (debug info contains enough information now to recover the
frame pointer even without an explicit frame pointer register). In fact,
GDB has been able to produce backtraces in TinyGo compiled code for a
while now while it didn't include a frame pointer.
The main change is in building the libraries, where -fshort-enums was
passed on RISC-V while other C files weren't compiled with this setting.
Note: the test already passed before this change, but it seems like a
good idea to explicitly test for enum size consistency.
There is also not a particular reason not to pass -fshort-enums on
RISC-V. Perhaps it's better to do it there too (on baremetal targets
that don't have to worry about binary compatibility).
This is necessary because LLVM defines many options in global variables
that are modified when invoking Clang. In particular, LLVM 10 seems to
have a bug in which it always sets the -pgo-warn-misexpect flag. Setting
it multiple times (over various cc1 invocations) results in an error:
clang (LLVM option parsing): for the --pgo-warn-misexpect option: may only occur zero or one times!
This is fixed by running the Clang invocation in a new `tinygo`
invocation.
Because we've had issues with lld in the past, also run lld in a
separate process so similar issues won't happen with lld in the future.
This commit merges NewCompiler and Compile into one simplifying the
external interface. More importantly, it does away with the entire
Compiler object so the public API becomes a lot smaller.
The refactor is not complete: eventually, the compiler should just
compile a single package without trying to load it first (that should be
done by the builder package).
This is necessary for better CGo support on bare metal. Existing
libraries expect to be able to include parts of libc and expect to be
able to link to those symbols.
Because with this all targets have a working libc, it is now possible to
add tests to check that a libc in fact works basically.
Not all parts of picolibc are included, such as the math or stdio parts.
These should be added later, when needed.
This commit also avoids the need for the custom memcpy/memset/memcmp
symbols that are sometimes emitted by LLVM. The C library will take care
of that.
This refactor makes adding a new library (such as a libc) much easier in
the future as it avoids a lot of duplicate code. Additionally, CI should
become a little bit faster (~15s) as build-builtins now uses the build
cache.
Add a target for the Adafruit Circuit Playground Bluefruit, which is
based on the nRF52840. Adds the necessary code for the machine
package and the json and linker script files in the targets directory.
The machine package code is based on board_circuitplay_express.go,
with modifications made by consulting the wiring diagram on the
adafruit website here:
https://learn.adafruit.com/adafruit-circuit-playground-bluefruit/downloads
Also adds support to the uf2 conversion packacge to set the familyID
field. The Circuit Playground Bluefruit firmware rejects uf2 files
without the family id set to 0xADA52840 (and without the flag specifying
that the family id is present).
We have long since moved towards a different location for these headers
in the git checkout, so update where getClangHeaderPath looks for these
headers.
Also add an extra check to make sure a path has been detected.
We don't need the separate submodule: compiler-rt is already included in
the llvm-project repository.
This should hopefully make CI slightly faster too.