The Cortex-M target isn't much changed, but much of the logic for the
AVR stack switcher that was previously in assembly has now been moved to
Go to make it more maintainable and in fact smaller in code size. Three
functions (tinygo_getCurrentStackPointer, tinygo_switchToTask,
tinygo_switchToScheduler) have been changed to one: tinygo_swapTask.
This reduction in assembly code should make the code more maintainable
and should make it easier to port stack switching to other
architectures.
I've also moved the assembly files to src/internal/task, which seems
like a more appropriate location to me.
Instead of putting tinygo_scanCurrentStack in scheduler_*.S files, put
them in dedicated files. The function tinygo_scanCurrentStack has
nothing to do with scheduling and so doesn't belong there. Additionally,
while scheduling code is made specific for the Cortex-M, the
tinygo_scanCurrentStack is generic to all ARM targets so this move
removes some duplication there.
Specifically:
* tinygo_scanCurrentStack is moved out of scheduler_cortexm.S as it
isn't really part of the scheduler. It is now gc_arm.S.
* Same for the AVR target.
* Same for the RISCV target.
* scheduler_gba.S is removed, using gc_arm.S instead as it only
contains tinygo_scanCurrentStack.
This is a big change that will determine the stack size for many
goroutines automatically. Functions that aren't recursive and don't call
function pointers can in many cases have an automatically determined
worst case stack size. This is useful, as the stack size is usually much
lower than the previous hardcoded default of 1024 bytes: somewhere
around 200-500 bytes is common.
A side effect of this change is that the default stack sizes (including
the stack size for other architectures such as AVR) can now be changed
in the config JSON file, making it tunable per application.
For now, this is just an extra flag that can be used to print stack
frame information, but this is intended to provide a way to determine
stack sizes for goroutines at compile time in many cases.
Stack sizes are often somewhere around 350 bytes so are in fact not all
that big usually. Once this can be determined at compile time in many
cases, it is possible to use this information when available and as a
result increase the fallback stack size if the size cannot be determined
at compile time. This should reduce stack overflows while at the same
time reducing RAM consumption in many cases.
Interesting output for testdata/channel.go:
function stack usage (in bytes)
Reset_Handler 332
.Lcommand-line-arguments.fastreceiver 220
.Lcommand-line-arguments.fastsender 192
.Lcommand-line-arguments.iterator 192
.Lcommand-line-arguments.main$1 184
.Lcommand-line-arguments.main$2 200
.Lcommand-line-arguments.main$3 200
.Lcommand-line-arguments.main$4 328
.Lcommand-line-arguments.receive 176
.Lcommand-line-arguments.selectDeadlock 72
.Lcommand-line-arguments.selectNoOp 72
.Lcommand-line-arguments.send 184
.Lcommand-line-arguments.sendComplex 192
.Lcommand-line-arguments.sender 192
.Lruntime.run$1 548
This shows that the stack size (if these numbers are correct) can in
fact be determined automatically in many cases, especially for small
goroutines. One of the great things about Go is lightweight goroutines,
and reducing stack sizes is very important to make goroutines
lightweight on microcontrollers.
The frame pointer was already omitted in the object files that TinyGo
emits, but wasn't yet omitted in the C files it compiles. Omitting the
frame pointer is good for code size (and perhaps performance).
The frame pointer was originally used for printing stack traces in a
debugger. However, advances in DWARF debug info have made it largely
unnecessary (debug info contains enough information now to recover the
frame pointer even without an explicit frame pointer register). In fact,
GDB has been able to produce backtraces in TinyGo compiled code for a
while now while it didn't include a frame pointer.
This is necessary for better CGo support on bare metal. Existing
libraries expect to be able to include parts of libc and expect to be
able to link to those symbols.
Because with this all targets have a working libc, it is now possible to
add tests to check that a libc in fact works basically.
Not all parts of picolibc are included, such as the math or stdio parts.
These should be added later, when needed.
This commit also avoids the need for the custom memcpy/memset/memcmp
symbols that are sometimes emitted by LLVM. The C library will take care
of that.
This scheduler is intended to live along the (stackless) coroutine based
scheduler which is needed for WebAssembly and unsupported platforms. The
stack based scheduler is somewhat simpler in implementation as it does
not require full program transform passes and supports things like
function pointers and interface methods out of the box with no changes.
Code size is reduced in most cases, even in the case where no scheduler
scheduler is used at all. I'm not exactly sure why but these changes
likely allowed some further optimizations somewhere. Even RAM is
slightly reduced, perhaps some global was elminated in the process as
well.
dumb -> leaking:
make it more clear what this "GC" does: leak everything.
marksweep -> conservative:
"marksweep" is too generic, use "conservative" to differentiate
between future garbage collectors: precise marksweep / mark-compact /
refcounting.
This is very useful for debugging. It differentiates between a stack
overflow and other errors (because it's easy to see when a stack
overflow occurs) and prints the old stack pointer and program counter if
available.
LLD version 8 has added support for armv6m:
https://reviews.llvm.org/D55555
This means we can use LLD instead of arm-none-eabi-ld, eliminating our
dependency on GNU binutils.
There are small differences in code size, but never more than a few
bytes.
So far, we've pretended to be js/wasm in baremetal targets to make the
stdlib happy. Unfortunately, this has various problems because
syscall/js (a dependency of many stdlib packages) thinks it can do JS
calls, and emulating them gets quite hard with all changes to the
syscall/js packages in Go 1.12.
This commit does a few things:
* It lets baremetal targets pretend to be linux/arm instead of
js/wasm.
* It lets the loader only select particular packages from the src
overlay, instead of inserting them just before GOROOT. This makes it
possible to pick which packages to overlay for a given target.
* It adds a baremetal-only syscall package that stubs out almost all
syscalls.
This commit does two things:
* It adds support for the GOOS and GOARCH environment variables. They
fall back to runtime.GO* only when not available.
* It adds support for 3 new architectures: 386, arm, and arm64. For
now, this is Linux-only.
This avoids a ton of duplication and makes it easier to change a generic
target (for example, the "cortex-m" target) for all boards that use it.
Also, by making it possible to inherit properties from a parent target
specification, it is easier to support out-of-tree boards that don't
have to be updated so often. A target specification for a
special-purpose board can simply inherit the specification of a
supported chip and override the properites it needs to override (like
the programming interface).