This makes sure that the LLVM target features match the one generated by
Clang:
- This fixes a bug introduced when setting the target CPU for all
targets: Cortex-M4 would now start using floating point operations
while they were disabled in C.
- This will make it possible in the future to inline C functions in Go
and vice versa. This will need some more work though.
There is a code size impact. Cortex-M4 targets are increased slightly in
binary size while Cortex-M0 targets tend to be reduced a little bit.
Other than that, there is little impact.
With this fix, `cflags` in the target JSON files is correctly ordered.
Previously, the cflags of a parent JSON file would come after the ones
in the child JSON file, which makes it hard to override properties in
the child JSON file.
Specifically, this fixes the case where targets/riscv32.json sets
`-march=rv32imac` and targets/esp32c3.json wants to override this using
`-march=rv32imc` but can't do this because its `-march` comes before the
riscv32.json one.
This is fake debug info. It doesn't point to a source location because
there is no source location. However, it helps to correctly attribute
code size usage to particular packages.
I've also updated builder/sizes.go with some debugging helpers.
Instead of doing everything in the interrupt lowering pass, generate
some more code in gen-device to declare interrupt handler functions and
do some work in the compiler so that interrupt lowering becomes a lot
simpler.
This has several benefits:
- Overall code is smaller, in particular the interrupt lowering pass.
- The code should be a bit less "magical" and instead a bit easier to
read. In particular, instead of having a magic
runtime.callInterruptHandler (that is fully written by the interrupt
lowering pass), the runtime calls a generated function like
device/sifive.InterruptHandler where this switch already exists in
code.
- Debug information is improved. This can be helpful during actual
debugging but is also useful for other uses of DWARF debug
information.
For an example on debug information improvement, this is what a
backtrace might look like before this commit:
Breakpoint 1, 0x00000b46 in UART0_IRQHandler ()
(gdb) bt
#0 0x00000b46 in UART0_IRQHandler ()
#1 <signal handler called>
[..etc]
Notice that the debugger doesn't see the source code location where it
has stopped.
After this commit, breaking at the same line might look like this:
Breakpoint 1, (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
200 uart.Receive(byte(nrf.UART0.RXD.Get()))
(gdb) bt
#0 (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
#1 UART0_IRQHandler () at /home/ayke/src/github.com/tinygo-org/tinygo/src/device/nrf/nrf51.go:176
#2 <signal handler called>
[..etc]
By now, the debugger sees an actual source location for UART0_IRQHandler
(in the generated file) and an inlined function.
The target triples have to match mostly to be able to link LLVM modules.
Linking LLVM modules is already possible (the triples already match),
but testing becomes much easier when they match exactly.
For macOS, I picked "macosx10.12.0". That's an old and unsupported
version, but I had to pick _something_. Clang by default uses
"macos10.4.0", which is much older.
This generally means that code size is reduced, especially when the os
package is not imported.
Specifically:
- On Linux (which currently statically links musl), it avoids calling
malloc, which avoids including the musl C heap for small programs
saving around 1.6kB.
- On WASI, it avoids initializing the args slice when the os package
is not used. This reduces binary size by around 1kB.
Previously, libclang was run on each fragment (import "C") separately.
However, in regular Go it's possible for later fragments to refer to
types in earlier fragments so they must have been parsed as one.
This commit changes the behavior to run only one C parser invocation for
each Go file.
WriteString just does the simple and and converts the passed string
to a byte-slice. This can be made zero-copy later with unsafe, if needed.
WriteAt returns ErrNotImplemented, to match Seek() and ReadAt().
Fixes#2157
This commit adds support for musl-libc and uses it by default on Linux.
The main benefit of it is that binaries are always statically linked
instead of depending on the host libc, even when using CGo.
Advantages:
- The resulting binaries are always statically linked.
- No need for any tools on the host OS, like a compiler, linker, or
libc in a release build of TinyGo.
- This also simplifies cross compilation as no cross compiler is
needed (it's all built into the TinyGo release build).
Disadvantages:
- Binary size increases by 5-6 kilobytes if -no-debug is used. Binary
size increases by a much larger margin when debugging symbols are
included (the default behavior) because musl is built with debugging
symbols enabled.
- Musl does things a bit differently than glibc, and some CGo code
might rely on the glibc behavior.
- The first build takes a bit longer because musl needs to be built.
As an additional bonus, time is now obtained from the system in a way
that fixes the Y2038 problem because musl has been a bit more agressive
in switching to 64-bit time_t.
MacOS X 10.14 has a soft limit of 256 open files by default, at least on
CircleCI. So don't keep object files open while writing the ar file to
reduce the number of open files at once.
Context: the musl libc has more than 256 object files in the .a file.
This resulted in the error "too many open files" on MacOS X 10.14 when
running in CircleCI.
This is really just a preparatory commit for musl support. The idea is
to store not just the archive file (.a) but also an include directory.
This is optional for picolibc but required for musl, so the main purpose
of this commit is the refactor needed for this change.
This brings a bit more consistency to libc configuration. It seems
better to me to set the header flags all in the same place, instead of
some in Go code and some in JSON target files (depending on the target).
GitHub Actions is faster and much better integrated into GitHub than
Azure Pipelines, and is in general easier to use. Therefore, switch to
GitHub Actions for our Windows builds and tests.
This is for consistency with Clang, which always adds a CPU flag even if
it's not specified in CFLAGS.
This commit also adds some tests to make sure the Clang target-cpu
matches the CPU property in the JSON files.
This does have an effect on the generated binaries. The effect is very
small though: on average just 0.2% increase in binary size, apparently
because Cortex-M3 and Cortex-M4 are compiled a bit differently. However,
when rebased on top of https://github.com/tinygo-org/tinygo/pull/2218
(minsize), the difference drops to -0.1% (a slight decrease on average).
These functions are defined in compiler-rt in assembly and therefore
don't have stack size information. However, they're often called so
these missing functions often inhibit stack size calculation.
Example, before:
$ tinygo build -o test.elf -target=cortex-m-qemu -print-stacks ./testdata/float.go
function stack usage (in bytes)
Reset_Handler unknown, __aeabi_memclr does not have stack frame information
runtime.run$1 unknown, __aeabi_dcmpgt does not have stack frame information
After:
$ tinygo build -o test.elf -target=cortex-m-qemu -print-stacks ./testdata/float.go
function stack usage (in bytes)
Reset_Handler 260
runtime.run$1 224
This commit improves accuracy of the -size=full flag in a big way.
Instead of relying on symbol names to figure out by which package
symbols belong, it will instead mostly use DWARF debug information
(specifically, debug line tables and debug information for global
variables) relying on symbols only for some specific things. This is
much more accurate: it also accounts for inlined functions.
For example, here is how it looked previously when compiling a personal
project:
code rodata data bss | flash ram | package
1902 333 0 0 | 2235 0 | (bootstrap)
46 256 0 0 | 302 0 | github
0 454 0 0 | 454 0 | handleHardFault$string
154 24 4 4 | 182 8 | internal/task
2498 83 5 2054 | 2586 2059 | machine
0 16 24 130 | 40 154 | machine$alloc
1664 32 12 8 | 1708 20 | main
0 0 0 200 | 0 200 | main$alloc
2476 79 0 36 | 2555 36 | runtime
576 0 0 0 | 576 0 | tinygo
9316 1277 45 2432 | 10638 2477 | (sum)
11208 - 48 6548 | 11256 6596 | (all)
And here is how it looks now:
code rodata data bss | flash ram | package
------------------------------- | --------------- | -------
1509 0 12 23 | 1521 35 | (unknown)
660 0 0 0 | 660 0 | C compiler-rt
58 0 0 0 | 58 0 | C picolibc
0 0 0 4096 | 0 4096 | C stack
174 0 0 0 | 174 0 | device/arm
6 0 0 0 | 6 0 | device/sam
598 256 0 0 | 854 0 | github.com/aykevl/ledsgo
320 24 0 4 | 344 4 | internal/task
1414 99 24 2181 | 1537 2205 | machine
726 352 12 208 | 1090 220 | main
3002 542 0 36 | 3544 36 | runtime
848 0 0 0 | 848 0 | runtime/volatile
70 0 0 0 | 70 0 | time
550 0 0 0 | 550 0 | tinygo.org/x/drivers/ws2812
------------------------------- | --------------- | -------
9935 1273 48 6548 | 11256 6596 | total
There are some notable differences:
* Odd packages like main$alloc and handleHardFault$string are gone,
instead their code is put in the correct package.
* C libraries and the stack are now included in the list, they were
previously part of the (bootstrap) pseudo-package.
* Unknown bytes are slightly reduced. It should be possible to reduce
it significantly more in the future: most of it is now caused by
interface invoke wrappers.
* Inlined functions are now correctly attributed. For example, the
runtime/volatile package is normally entirely inlined.
* There is no difference between (sum) and (all) anymore. A better
code size algorithm now counts the code/data sizes correctly.
* And last (but not least) there is a stylistic change: the table now
looks more like a table. Especially the summary should be clearer
now.
Future goals:
* Improve debug information so that the (unknown) pseudo-package is
reduced in size or even eliminated altogether.
* Add support for other file formats, most importantly WebAssembly.
* Perhaps provide a way to expand this report per file, or in a
machine-readable format like JSON or CSV.
This matches the behavior of Clang, which uses optsize for -Os and adds
minsize for -Oz.
The code size change is all over the map, but using a hacked together
size comparison tool I've found that there is a slight reduction in
binary size overall (-1.6% with the tinygo smoke tests and -0.8% for the
drivers smoke test).
This commit has a few related changes:
* It sets the optsize attribute immediately in the compiler instead of
adding it to each function afterwards in a loop. This seems to me
like the more appropriate way to do it.
* It centralizes setting the optsize attribute in the transform
package, to make later changes easier.
* It sets the optsize in a few more places: to runtime.initAll and to
WebAssembly i64 wrappers.
This commit does not affect the binary size of any of the smoke tests,
so should be risk-free.
This commit will use the memory layout information for heap allocations
added in the previous commit to determine LLVM types, instead of
guessing their types based on the content. This fixes a bug in which
recursive data structures (such as doubly linked lists) would result in
a compiler stack overflow due to infinite recursion.
Not all heap allocations have a memory layout yet, but this can be
incrementally fixed in the future. So far, this commit should fix
(almost?) all cases of this stack overflow issue.
Instead of doing lots of complicated calculations to get the shortest
GEP, I'll just cast it to i8*, do the GEP, and optionally cast to the
requested type.
This currently produces ugly constant expressions, but once LLVM
switches to opaque pointer types all of this shouldn't matter anymore.
This is uncommon, but it does happen if the source pointer is a bitcast
of a global. For example, if a struct is cast to an i8*, it's possible
to index beyond what would appear to be the size of the pointer (i8*).
This commit adds object layout information to new heap allocations. It
is not yet used anywhere: the next commit will make use of it.
Object layout information will eventually be used for a (mostly) precise
garbage collector. This is what the data is made for. However, it is
also useful in the interp package which can work better if it knows the
memory layout and thus the approximate LLVM type of heap-allocated
objects.
This layout parameter is currently always nil and ignored, but will
eventually contain a pointer to a memory layout.
This commit also adds module verification to the transform tests, as I
found out that it didn't (and therefore didn't initially catch all
bugs).
This is necessary to display error messages on Windows. For example,
this command invocation is not correct (esp32 doesn't define
machine.LED, you need esp32-coreboard-v2 for example):
tinygo run -target=esp32 examples/blinky1
It results in the following hard-to-read error message:
# examples/blinky1
..\..\..\..\..\AppData\Local\tinygo\goroot-go1.16-24cb853b66a5367bf6d65bc08b2cb665c75bd9971f0be8f8b73f69d1a33e04a1-syscall\src\examples\blinky1\blinky1.go:11:17: LED not declared by package machine
With this commit, this error message becomes much easier to read:
# examples/blinky1
C:\Users\Ayke\go\src\github.com\tinygo-org\tinygo\src\examples\blinky1\blinky1.go:11:17: LED not declared by package machine
This commit simplifies the IR a little bit: instead of calling
pseudo-functions runtime.interfaceImplements and
runtime.interfaceMethod, real declared functions are being called that
are then defined in the interface lowering pass. This should simplify
the interaction between various transformation passes. It also reduces
the number of lines of code, which is generally a good thing.
This adds support for a construct like this:
type foo func(fn foo)
Unfortunately, LLVM cannot create function pointers that look like this.
LLVM only supports named types for structs (not for pointers) and thus
can't add a pointer to a function type of the same type to a parameter
of that function type.
The fix is simple: cast all function pointers to a void function, in
LLVM IR:
void ()*
Raw function pointers are cast to this type before storing, and cast
back to the regular function type before calling. This means that
function parameters will never refer to its own type because raw
function types are fixed at that one type.
Somehow, this does have an effect on binary size in some cases. The
effect is small and goes both ways. On top of that, there is work
underway in LLVM which would make all pointer types opaque (without a
pointee type). This would make this whole commit useless and therefore
should fix any size increases that might happen.
https://llvm.org/docs/OpaquePointers.html
The division and remainder operations were lowered directly to LLVM IR.
This is wrong however because the Go specification defines exactly what
happens on a divide by zero or signed integer overflow and LLVM IR
itself treats those cases as undefined behavior. Therefore, this commit
implements divide by zero and signed integer overflow according to the
Go specification.
This does have an impact on the generated code, but it is surprisingly
small. I've used the drivers repo to test the code before and after, and
to my surprise most driver smoke tests are not changed at all. Those
that are, have only a small increase in code size. At the same time,
this change makes TinyGo more compliant to the Go specification.
This adds support for stdio in picolibc and fixes wasm_exec.js so that
it can also support C puts. With this, C stdout works on all supported
platforms.
There is no need to put these in the board files as the I2S is the same
on all Microchip SAM D21 chips. This simplifies the code and avoids some
special *_baremetal.go files.
This change does not change the resulting binaries.
This has practically no effect on the resulting binaries, the only
difference I could find was for the flash/console/spi driver example.
I'm not sure how to test that one, but I think it's very unlikely that
code will have changed in any meaningful way (apart from reordering some
globals).
This commit changes the I2C declarations so that the objects are
instantiated in each chip file (e.g. machine_atsamd21e18.go) and used to
define I2C0 (and similar) in the board file (e.g. board_qtpy.go). This
should make it easier to define new board files, and reduces the need
for separate *_baremetal.go files.
I have tested this the following way:
- With the LIS3DH driver example on the Circuit Playground Express and
the PyBadge.
- With the LSM6DS3 driver example on the Arduino Nano 33 IoT.
They both still work fine.