These functions are defined in compiler-rt in assembly and therefore
don't have stack size information. However, they're often called so
these missing functions often inhibit stack size calculation.
Example, before:
$ tinygo build -o test.elf -target=cortex-m-qemu -print-stacks ./testdata/float.go
function stack usage (in bytes)
Reset_Handler unknown, __aeabi_memclr does not have stack frame information
runtime.run$1 unknown, __aeabi_dcmpgt does not have stack frame information
After:
$ tinygo build -o test.elf -target=cortex-m-qemu -print-stacks ./testdata/float.go
function stack usage (in bytes)
Reset_Handler 260
runtime.run$1 224
This matches the behavior of Clang, which uses optsize for -Os and adds
minsize for -Oz.
The code size change is all over the map, but using a hacked together
size comparison tool I've found that there is a slight reduction in
binary size overall (-1.6% with the tinygo smoke tests and -0.8% for the
drivers smoke test).
This is a big change that will determine the stack size for many
goroutines automatically. Functions that aren't recursive and don't call
function pointers can in many cases have an automatically determined
worst case stack size. This is useful, as the stack size is usually much
lower than the previous hardcoded default of 1024 bytes: somewhere
around 200-500 bytes is common.
A side effect of this change is that the default stack sizes (including
the stack size for other architectures such as AVR) can now be changed
in the config JSON file, making it tunable per application.
Call Frame Information is stored in the .debug_frame section and is used
by debuggers for unwinding. For assembly, this information is not known.
Debuggers will normally use heuristics to figure out the parent function
in the absence of call frame information.
This usually works fine, but is not enough for determining stack sizes.
Instead, I hardcoded the stack size information in
stacksize/stacksize.go, which is somewhat fragile. This change uses CFI
assembly directives to store this information instead of hardcoding it.
This change also fixes the following error message that would appear in
GDB:
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
More information on CFI:
* https://sourceware.org/binutils/docs/as/CFI-directives.html
* https://www.imperialviolet.org/2017/01/18/cfi.html
For now, this is just an extra flag that can be used to print stack
frame information, but this is intended to provide a way to determine
stack sizes for goroutines at compile time in many cases.
Stack sizes are often somewhere around 350 bytes so are in fact not all
that big usually. Once this can be determined at compile time in many
cases, it is possible to use this information when available and as a
result increase the fallback stack size if the size cannot be determined
at compile time. This should reduce stack overflows while at the same
time reducing RAM consumption in many cases.
Interesting output for testdata/channel.go:
function stack usage (in bytes)
Reset_Handler 332
.Lcommand-line-arguments.fastreceiver 220
.Lcommand-line-arguments.fastsender 192
.Lcommand-line-arguments.iterator 192
.Lcommand-line-arguments.main$1 184
.Lcommand-line-arguments.main$2 200
.Lcommand-line-arguments.main$3 200
.Lcommand-line-arguments.main$4 328
.Lcommand-line-arguments.receive 176
.Lcommand-line-arguments.selectDeadlock 72
.Lcommand-line-arguments.selectNoOp 72
.Lcommand-line-arguments.send 184
.Lcommand-line-arguments.sendComplex 192
.Lcommand-line-arguments.sender 192
.Lruntime.run$1 548
This shows that the stack size (if these numbers are correct) can in
fact be determined automatically in many cases, especially for small
goroutines. One of the great things about Go is lightweight goroutines,
and reducing stack sizes is very important to make goroutines
lightweight on microcontrollers.