This distinction was useful before when reflect wasn't properly
supported. Back then it made sense to only include method sets that were
actually used in an interface. But now that it is possible to get to
other values (for example, by extracting fields from structs) and it is
possible to turn them back into interfaces, it is necessary to preserve
all method sets that can possibly be used in the program in a type
assert, interface assert or interface method call.
In the future, this logic will need to be revisited again when
reflect.New or reflect.Zero gets implemented.
Code size increases a bit in some cases, but usually in a very limited
way (except for one outlier in the drivers smoke tests). The next commit
will improve the situation significantly.
This commit switches from the previous behavior of compiling the whole
program at once, to compiling every package in parallel and linking the
LLVM bitcode files together for further whole-program optimization.
This is a small performance win, but it has several advantages in the
future:
- There are many more things that can be done per package in parallel,
avoiding the bottleneck at the end of the compiler phase. This
should speed up the compiler futher.
- This change is a necessary step towards a non-LTO build mode for
fast incremental builds that only rebuild the changed package, when
compiler speed is more important than binary size.
- This change refactors the compiler in such a way that it will be
easier to inspect the IR for one package only. Inspecting this IR
will be very helpful for compiler developers.
Moving settings to a separate config struct has two benefits:
- It decouples the compiler a bit from other packages, most
importantly the compileopts package. Decoupling is generally a good
thing.
- Perhaps more importantly, it precisely specifies which settings are
used while compiling and affect the resulting LLVM module. This will
be necessary for caching the LLVM module.
While it would have been possible to cache without this refactor, it
would have been very easy to miss a setting and thus let the
compiler work with invalid/stale data.
This package was long making the design of the compiler more complicated
than it needs to be. Previously this package implemented several
optimization passes, but those passes have since moved to work directly
with LLVM IR instead of Go SSA. The only remaining pass is the SimpleDCE
pass.
This commit removes the *ir.Function type that permeated the whole
compiler and instead switches to use *ssa.Function directly. The
SimpleDCE pass is kept but is far less tightly coupled to the rest of
the compiler so that it can easily be removed once the switch to
building and caching packages individually happens.
Because the parentHandle parameter wasn't always set to the right value,
the coroutine lowering pass would sometimes panic with "trying to make
exported function async" even though there was no exported function
involved. Therefore, it should unconditionally be set to avoid this.
The parent function doesn't always have the parentHandle function
parameter set because it can only be set after defining a function, not
when it is only declared.
This makes viewing the IR easier because parameters have readable names.
This also makes it easier to write compiler tests (still a work in
progress), that work in LLVM 9 and LLVM 10, as LLVM 10 started printing
value names for unnamed parameters.
Previously, the typecode was passed via a direct reference, which results in invalid IR when the defer is not reached in all return paths.
It also results in incorrect behavior if the defer is in a loop, causing all defers to use the typecode of the last iteration.
This is used for example by the errors package, which contains:
if x, ok := err.(interface{ As(interface{}) bool }); ok && x.As(target) {
return true
}
The interface here is not a named type.
This gives a hint to the compiler that such parameters are either NULL
or point to a valid object that can be dereferenced. This is not
directly very useful, but is very useful when combined with
https://reviews.llvm.org/D60047 to remove the runtime.isnil hack without
regressing escape analysis.
Now that most of the utility compiler methods are ported over to the
builder or compilerContext, it is possible to avoid having to do the
wrapper creation in two steps. A new builder is created just to create
the wrapper.
This is a small reduction in line count (and a significant reduction in
complexity!), even though more documentation was added.
This is a fairly big commit, but it actually changes very little.
getValue should really be a property of the builder (or frame), where
the previously created instructions are kept.
This is the first commit in a series to refactor the compiler. The
intention is to make sure every function to be compiled eventually has
its own IR builder. This will make it much easier to do other
refactorings in the future:
* Most code won't depend (directly) on the central Compiler object,
perhaps making it possible to eliminate it in the future. Right now
it's embedded in the `builder` struct but individual fields from the
`Compiler` can easily be moved into the `builder` object.
* Some functions are not directly exposed in Go SSA, they are wrapper
functions for something. At the moment they are included in the list
of functions to be compiled with the reachability analysis
(SimpleDCE) in the ir package, but eventually this reachability
analys will be removed. At that point, it would be very convenient
to be able to simply build a function with a new IR builder.
The `compilerContext` struct makes sure that it is not possible for
`builder` methods to accidentally use global state such as the global IR
builder. It is a transitional mechanism and may be removed when
finished.
Instead of putting the magic in the AST, generate regular accessor
methods. This avoids a number of special cases in the compiler, and
avoids missing any of them.
The resulting union accesses are somewhat clunkier to use, but the
compiler implementation has far less coupling between the CGo
implementation and the IR generator.
Move most of the logic of determining which compiler configuration to
use (such as GOOS/GOARCH, build tags, whether to include debug symbols,
panic strategy, etc.) into the compileopts package. This makes it a
single source of truth for anything related to compiler configuration.
It has a few advantages:
* The compile configuration is independent of the compiler package.
This makes it possible to move optimization passes out of the
compiler, as they don't rely on compiler.Config anymore.
* There is only one place to look if an incorrect compile option is
used.
* The compileopts provides some resistance against unintentionally
picking the wrong option, such as with c.selectGC() vs c.GC() in the
compiler.
* It is now a lot easier to change compile options, as most options
are getters now.
Linked lists are usually implemented as follows:
type linkedList struct {
next *linkedList
data int // whatever
}
This caused a stack overflow while writing out the reflect run-time type
information. This has now been fixed by splitting the allocation of a
named type number from setting the underlying type in the sidetable.
There are a lot more fields that are important when comparing structs
with each other. Take them into account when building the unique ID per
struct type.
Example code that differs between the compilers:
https://play.golang.org/p/nDX4tSHOf_T
This commit fixes the following issue:
https://github.com/tinygo-org/tinygo/issues/309
Also, it prepares for some other reflect-related changes that should
make it easier to add support for named types (etc.) in the future.
No error is produced, so no error needs to be returned. It was missed in
https://github.com/tinygo-org/tinygo/pull/294.
Also, it fixes this smelly code:
if err != nil {
return <something>, nil
}
There could never be an error, so the code was already dead.
This commit adds getValue which gets a const, global, or result of a
local SSA expression and replaces (almost) all uses of parseExpr with
getValue. The only remaining use is in parseInstr, which makes sure an
instruction is only evaluated once.
This commit replaces "unknown type" errors in getLLVMType with panics.
The main reason this is done is that it simplifies the code *a lot*.
Many `if err != nil` lines were there just because of type information.
Additionally, simply panicking is probably a better approach as the only
way this error can be produced is either with big new language features
or a serious compiler bug. Panicking is probably a better way to handle
this error anyway.
The LLVM library we use does not (yet) provide a llvm.Zero (like it
provides a llvm.Undef) so we have implemented our own. However, in
theory it might return an error in some cases.
No real-world errors have been seen in a while and errors would likely
indicate a serious compiler bug anyway (not an external error), so make
it panic instead of returning an error.
This has several advantages, among them:
- Many passes (heap-to-stack, dead arg elimination, inlining) do not
work with function pointer calls. Making them normal function calls
improves their effectiveness.
- Goroutine lowering to LLVM coroutines does not currently support
function pointers. By eliminating function pointers, coroutine
lowering gets support for them for free.
This is especially useful for WebAssembly.
Because of the second point, this work is currently only enabled for the
WebAssembly target.
Unions are somewhat hard to implement in Go because they are not a
native type. But it is actually possible with some compiler magic.
This commit inserts a special "C union" field at the start of a struct
to indicate that it is a union. As such a field cannot be written
directly in Go, this is a useful to distinguish structs and unions.
In LLVM 8, the AVR backend has moved all function pointers to address
space 1 by default. Much of the code still assumes function pointers
live in address space 0, leading to assertion failures.
This commit fixes this problem by autodetecting function pointers and
avoiding them in interface pseudo-calls.
The interp package does a much better job at interpretation, and is
implemented as a pass on the IR which makes it much easier to compose.
Also, the implementation works much better as it is based on LLVM IR
instead of Go SSA.
Match data layout of complex numbers to that of Clang, for better
interoperability. This makes alignment of complex numbes the same as the
individual elements (real and imaginary), as is required by the C spec
and implemented in Clang, but unlike the gc compler. The Go language
specification is silent on this matter.
> Each complex type has the same object representation and alignment
> requirements as an array of two elements of the corresponding real
> type (float for float complex, double for double complex, long double
> for long double complex). The first element of the array holds the
> real part, and the second element of the array holds the imaginary
> component.
Source: https://en.cppreference.com/w/c/language/arithmetic_types
This commit makes sure all Go types can be encoded in the interface type
code, so that Type.Kind() always returns a proper type kind for any
non-nil interface.
Before this commit, goroutine support was spread through the compiler.
This commit changes this support, so that the compiler itself only
generates simple intrinsics and leaves the real support to a compiler
pass that runs as one of the TinyGo-specific optimization passes.
The biggest change, that was done together with the rewrite, was support
for goroutines in WebAssembly for JavaScript. The challenge in
JavaScript is that in general no blocking operations are allowed, which
means that programs that call time.Sleep() but do not start goroutines
also have to be scheduled by the scheduler.
This reduces complexity in the compiler without affecting binary sizes
too much.
Cortex-M0: no changes
Linux x64: no changes
WebAssembly: some testcases (calls, coroutines, map) are slightly bigger
This commit changes many things:
* Most interface-related operations are moved into an optimization
pass for more modularity. IR construction creates pseudo-calls which
are lowered in this pass.
* Type codes are assigned in this interface lowering pass, after DCE.
* Type codes are sorted by usage: types more often used in type
asserts are assigned lower numbers to ease jump table construction
during machine code generation.
* Interface assertions are optimized: they are replaced by constant
false, comparison against a constant, or a typeswitch with only
concrete types in the general case.
* Interface calls are replaced with unreachable, direct calls, or a
concrete type switch with direct calls depending on the number of
implementing types. This hopefully makes some interface patterns
zero-cost.
These changes lead to a ~0.5K reduction in code size on Cortex-M for
testdata/interface.go. It appears that a major cause for this is the
replacement of function pointers with direct calls, which are far more
susceptible to optimization. Also, not having a fixed global array of
function pointers greatly helps dead code elimination.
This change also makes future optimizations easier, like optimizations
on interface value comparisons.