Because of a bug in the ARM backend of LLVM, the 64-bit cmpxchg
instruction is lowered to ldrexd/strexd instructions, which don't exist
on Cortex-M cores. This leads to an "undefined instruction" exception at
runtime. This patch works around the bug by lowering directly to a call
to the __sync_val_compare_and_swap_8 function, which is what the backend
should be doing.
For details, see: https://reviews.llvm.org/D95891
To test this patch, you can run the code on a Cortex-M3 or higher
microcontroller, for example:
tinygo flash -target=pca10040 ./testdata/atomic.go
Before this patch, this would trigger an error at runtime; with this
patch, the behavior is correct. Without the patch, the error could look
like this:
fatal error: undefined instruction with sp=0x200007cc pc=nil
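The kind of Go code that exercises this path is a 64-bit atomic
operation. A minimal example (not the actual contents of
testdata/atomic.go) might look like this:

```go
package main

import "sync/atomic"

var counter uint64

func main() {
	// A 64-bit compare-and-swap compiles to an i64 cmpxchg, which on
	// Cortex-M has to become a __sync_val_compare_and_swap_8 libcall
	// rather than ldrexd/strexd instructions.
	atomic.CompareAndSwapUint64(&counter, 0, 1)
	println(atomic.LoadUint64(&counter))
}
```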
Moving settings to a separate config struct has two benefits:
- It decouples the compiler a bit from other packages, most
importantly the compileopts package. Decoupling is generally a good
thing.
- Perhaps more importantly, it precisely specifies which settings are
used while compiling and affect the resulting LLVM module. This will
be necessary for caching the LLVM module.
While it would have been possible to cache without this refactor, it
would have been very easy to miss a setting and thus let the
compiler work with invalid/stale data.
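As an illustration only, such a config struct could look roughly like
the following; the field names here are hypothetical and not necessarily
the ones used in the compiler or compileopts packages:

```go
// Config collects every setting that influences the generated LLVM module.
// All fields shown here are illustrative.
type Config struct {
	Triple    string   // LLVM target triple, e.g. "thumbv7em-unknown-unknown-eabi"
	CPU       string   // target CPU, e.g. "cortex-m4"
	Features  string   // LLVM feature string
	GOOS      string
	GOARCH    string
	BuildTags []string // build tags that decide which files get compiled
	Scheduler string   // scheduler implementation to use
	Debug     bool     // whether to emit debug information
}
```

Anything not captured in such a struct must not influence code
generation, otherwise a cached module could later be reused incorrectly.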
This fixes a longstanding TODO comment and, similar to
https://github.com/tinygo-org/tinygo/pull/1593, it removes some code
from the compiler.CompileProgram function that doesn't need to be there.
This is a small refactor to move code away from compiler.CompileProgram,
with the goal that compiler.CompileProgram will eventually be removed
entirely in favor of compiler.CompilePackage.
Since https://github.com/tinygo-org/tinygo/pull/1571 (in particular, the first
commit that sets the main package path), the main package is always named
"main". This makes the callMain() workaround in the runtime unnecessary and
allows directly calling the main.main function with a //go:linkname pragma.
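Conceptually, that looks like the following sketch; this is the pattern,
not necessarily the exact declaration used in the runtime package:

```go
package runtime

import _ "unsafe" // for go:linkname

// The program's main function. Because the main package is now always
// named "main", its main.main symbol can be linked to directly.
//
//go:linkname mainFunc main.main
func mainFunc()

func run() {
	// ...runtime and package initialization would happen first...
	mainFunc()
}
```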
This package has long made the design of the compiler more complicated
than it needs to be. Previously this package implemented several
optimization passes, but those passes have since moved to work directly
with LLVM IR instead of Go SSA. The only remaining pass is the SimpleDCE
pass.
This commit removes the *ir.Function type that permeated the whole
compiler and instead switches to use *ssa.Function directly. The
SimpleDCE pass is kept but is far less tightly coupled to the rest of
the compiler so that it can easily be removed once the switch to
building and caching packages individually happens.
This change extends defer support to all supported builtin functions.
Not all of them make sense to defer (such as len, cap, real, imag, etc.),
but this change for example adds support for `defer delete(m, key)`,
which is used in the Go 1.15 encoding/json package.
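A minimal, self-contained illustration of that pattern:

```go
package main

func main() {
	seen := map[string]bool{"a": true}

	func() {
		// The builtin delete can now be used in a defer statement.
		defer delete(seen, "a")
		println(len(seen)) // 1: the deferred delete has not run yet
	}()

	println(len(seen)) // 0: the delete ran when the closure returned
}
```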
This commit parallelizes almost everything that can currently be
parallelized. With that, it also introduces a framework for easily
parallelizing other parts of the compiler.
Code for baremetal targets already compiles slightly faster because it
can parallelize the compilation of supporting assembly files. However,
the speedup is especially noticeable when libraries (compiler-rt,
picolibc) also need to be compiled: they will be compiled in parallel
next to the Go files using all available cores. On my dual-core laptop
(four hardware threads with hyperthreading) this cuts compilation time
roughly in half when compiling something for a Cortex-M board after running
`tinygo clean`.
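A hypothetical sketch of such a framework, using a bounded number of
goroutines; the names and structure are illustrative only, and the real
framework also has to order jobs that depend on each other, which this
sketch omits:

```go
package main

import (
	"runtime"
	"sync"
)

// job is a unit of work, e.g. compiling one assembly file or one C library.
type job struct {
	name string
	run  func() error
}

// runJobs runs all jobs concurrently, limited to one worker per CPU.
func runJobs(jobs []job) error {
	sem := make(chan struct{}, runtime.NumCPU())
	errs := make([]error, len(jobs))
	var wg sync.WaitGroup
	for i, j := range jobs {
		wg.Add(1)
		go func(i int, j job) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a worker slot
			defer func() { <-sem }() // release it when done
			errs[i] = j.run()
		}(i, j)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err // report the first failure
		}
	}
	return nil
}

func main() {
	err := runJobs([]job{
		{name: "compile assembly", run: func() error { return nil }},
		{name: "build picolibc", run: func() error { return nil }},
	})
	println(err == nil)
}
```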
This works around undefined behavior in LLVM's fptosi/fptoui
instructions, where an out-of-range float-to-int conversion produces a
poison value. The selected behavior is saturating, except that NaN is
mapped to the minimum value.
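Under the saturating behavior described above, a runtime conversion
behaves like this (the values in the comments assume exactly that
behavior):

```go
package main

import "math"

func main() {
	big := 1e12 // does not fit in an int32
	small := -1e12
	nan := math.NaN()
	println(int32(big))   // saturates to 2147483647 instead of becoming poison
	println(int32(small)) // saturates to -2147483648
	println(int32(nan))   // NaN is mapped to the minimum value: -2147483648
}
```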
This commit finally introduces unit tests for the compiler, to check
whether input Go code is converted to the expected output IR.
To make this possible, a few refactors were needed. Hopefully these
refactors (to compile a program package by package instead of all at
once) will eventually become standard, so that packages can all be
compiled separate from each other and be cached between compiles.
This should make exported names a bit more consistent.
I believe there was a bug report for this issue, but I can't easily find
it. In any case, I think it's an important improvement to match the
behavior of the Go toolchain.
On WebAssembly it is possible to grow the heap with the memory.grow
instruction. This commit implements this feature and with that also
removes the -heap-size flag that was reportedly broken (I haven't
verified that). This should make it easier to use TinyGo for
WebAssembly, where there was no good reason to use a fixed heap size.
This commit has no effect on baremetal targets with optimizations
enabled.
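A rough sketch of what growing the heap can look like on WebAssembly;
memorySize and memoryGrow are hypothetical stand-ins for the
memory.size and memory.grow instructions (stubbed out here so the sketch
is self-contained), not the runtime's actual helpers:

```go
package main

// Hypothetical stand-ins for the memory.size and memory.grow instructions.
// memory.grow returns the previous size in 64 KiB pages, or -1 on failure.
func memorySize() int32            { return 2 }
func memoryGrow(delta int32) int32 { return memorySize() }

// growHeap tries to double the heap by doubling the size of linear memory.
func growHeap() bool {
	if memoryGrow(memorySize()) == -1 {
		return false // the embedder refused to give us more memory
	}
	// Linear memory has grown; the collector's view of the heap end and its
	// metadata must now be updated (cheap, as explained in the next change).
	return true
}

func main() {
	println("heap grown:", growHeap())
}
```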
This commit swaps the layout of the heap. Previously, the metadata was
at the start and the data blocks (the actual heap memory) followed
after. This commit swaps those, so that the heap area starts with the
data blocks followed by the heap metadata.
This arrangement is not very relevant for baremetal targets that always
have all RAM allocated, but it is an important improvement for other
targets such as WebAssembly where growing the heap is possible but
starting with a small heap is a good idea. Because the metadata lives at
the end, and because the metadata does not contain pointers, it can
easily be moved. The data itself cannot be moved as the conservative GC
does not know all the pointer locations, plus moving the data could be
very expensive.
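As a rough illustration of why this layout matters, the split between
data blocks and metadata can be recomputed from just the heap bounds;
the block and metadata sizes below are made-up numbers, not the
allocator's real constants:

```go
package main

const (
	bytesPerBlock    = 16 // bytes of heap data per block (illustrative)
	metadataPerBlock = 1  // bytes of metadata per block (illustrative)
)

// calculateHeapLayout splits [heapStart, heapEnd) into data blocks at the
// start and metadata at the end. When the heap grows, only this calculation
// needs to be redone and the metadata moved; the data blocks stay in place.
func calculateHeapLayout(heapStart, heapEnd uintptr) (numBlocks, metadataStart uintptr) {
	totalSize := heapEnd - heapStart
	numBlocks = totalSize / (bytesPerBlock + metadataPerBlock)
	metadataStart = heapStart + numBlocks*bytesPerBlock
	return numBlocks, metadataStart
}

func main() {
	n, meta := calculateHeapLayout(0x20000000, 0x20001000) // hypothetical 4 KiB heap
	println("blocks:", n, "metadata starts at:", meta)
}
```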
This commit changes the number of wait states for the stm32f103 chip to
2 instead of 4. This gets it back in line with the datasheet, but it
also has the side effect of breaking I2C. Therefore, another (seemingly
unrelated) change is needed: the i2cTimeout constant must be increased
to compensate for the lower number of flash wait states - presumably
because fewer wait states let the chip run code faster, so the same
timeout value now elapses in less wall-clock time.
Before this change, the compiler could panic with the following message:
panic: 20 not an Int
That of course doesn't make much sense. But it apparently is expected
behavior, see https://github.com/golang/go/issues/43165 for details.
This commit fixes this issue by converting the constant to an integer if
needed.
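The conversion is along these lines; this is a sketch using the
go/constant package, not necessarily the exact code in the compiler:

```go
package main

import "go/constant"

func main() {
	// A constant that is mathematically an integer can still have kind Float
	// (see https://github.com/golang/go/issues/43165). Calling Int64Val on it
	// panics with a message like "20 not an Int".
	v := constant.MakeFloat64(20)
	if v.Kind() != constant.Int {
		v = constant.ToInt(v) // convert to an integer representation first
	}
	n, exact := constant.Int64Val(v)
	println(n, exact) // 20 true
}
```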
During a run of interp, some memory (for example, memory allocated
through runtime.alloc) may not have a known LLVM type. This memory is
allocated by creating an i8 array.
This does not necessarily work, as i8 has no alignment requirements
while the allocated object may have alignment requirements. Therefore,
the resulting global may have an alignment that is too loose.
This works on some microcontrollers but notably does not work on a
Cortex-M0 or Cortex-M0+, as all load/store operations must be aligned.
This commit fixes this by setting the alignment of untyped memory to the
maximum alignment. The determination of "maximum alignment" is not
great but should get the job done on most architectures.
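In terms of the Go LLVM bindings, the fix is roughly the following
sketch (assuming the tinygo.org/x/go-llvm bindings; how the maximum
alignment is actually determined in the compiler may differ):

```go
package main

import "tinygo.org/x/go-llvm"

// createUntypedGlobal creates a global for memory without a known type as an
// array of i8, and forces a conservatively large alignment on it so that any
// object stored inside is sufficiently aligned, even on strict targets like
// Cortex-M0. The value 16 is an illustrative upper bound, not the compiler's
// actual choice.
func createUntypedGlobal(mod llvm.Module, ctx llvm.Context, name string, size int) llvm.Value {
	globalType := llvm.ArrayType(ctx.Int8Type(), size)
	global := llvm.AddGlobal(mod, globalType, name)
	global.SetInitializer(llvm.ConstNull(globalType))
	global.SetAlignment(16)
	return global
}

func main() {
	ctx := llvm.NewContext()
	mod := ctx.NewModule("example")
	createUntypedGlobal(mod, ctx, "untyped.mem", 24)
	println(mod.String())
}
```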
For a full explanation, see interp/README.md. In short, this rewrite is
a redesign of the partial evaluator that improves on the previous
implementation. The main functional difference is that when
interpreting a function, the interpretation can be rolled back when an
unsupported instruction is encountered (for example, an actual unknown
instruction or a branch on a value that's only known at runtime). This
also means that it is no longer necessary to scan functions to see
whether they can be interpreted: instead, this package now just tries to
interpret them and reverts when it can't go any further.
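Conceptually the rollback works like the sketch below; every type and
name here is hypothetical, since the real package operates on LLVM IR
with a much richer memory model:

```go
package interpsketch

// instruction is a stand-in for an LLVM IR instruction.
type instruction interface {
	execute(state *memoryState) error // returns an error if unsupported
}

// memoryState is a simplified stand-in for the interpreter's view of memory.
type memoryState struct {
	objects map[string][]byte
}

func (s *memoryState) clone() *memoryState {
	copied := &memoryState{objects: make(map[string][]byte, len(s.objects))}
	for name, data := range s.objects {
		copied.objects[name] = append([]byte(nil), data...)
	}
	return copied
}

// tryInterpret interprets a function at compile time. When it hits something
// it cannot handle (an unknown instruction, a branch on a runtime-only
// value), it rolls back to the snapshot and reports that the function has to
// run at runtime instead.
func tryInterpret(insts []instruction, state *memoryState) bool {
	snapshot := state.clone()
	for _, inst := range insts {
		if err := inst.execute(state); err != nil {
			*state = *snapshot // revert all side effects of this function
			return false
		}
	}
	return true
}
```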
This new design has several benefits:
* Most errors coming from the interp package are avoided, as it can
simply skip the code it can't handle. This has long been an issue.
* The memory model has been improved, which means some packages now
pass all tests that previously didn't pass them.
* Because of a better design, it is in fact a bit faster than the
previous version.
This means the following packages now pass tests with `tinygo test`:
* hash/adler32: previously it would hang in an infinite loop
* math/cmplx: previously it resulted in errors
This also means that the math/big package can be imported. It would
previously fail with an "interp: branch on a non-constant" error.
Because the parentHandle parameter wasn't always set to the right value,
the coroutine lowering pass would sometimes panic with "trying to make
exported function async" even though there was no exported function
involved. Therefore, it should unconditionally be set to avoid this.
The parent function doesn't always have the parentHandle function
parameter set because it can only be set after defining a function, not
when it is only declared.
Previously, EmitPointerPack would generate an out-of-bounds read from an
alloca. This commit fixes that by creating an alloca of the appropriate
size instead of using the size of the to-be-packed data (which might be
smaller than a pointer).
I discovered this error while working on a rewrite of the interp
package, which checks for out-of-bounds reads and writes; it flagged
this issue when the image package was compiled.
This newer peripheral supports DMA (through EasyDMA) and should
generally be faster. Importantly, for transfers of up to 255 bytes,
interrupts will not interfere with the SPI transfer.
The nrf52 series is all very similar and copying the code only makes it
harder to maintain the code or to add more chips in the nrf52 series
(for example, the nrf52833 as used in the micro:bit v2).
This commit also has a small improvement regarding pins: it now adds
chip-level pin names (P0.00, P0.01, etc.) to the machine package.