This is implemented as follows:
* The parent coroutine allocates space for the return value in its
frame and stores a pointer to this frame in the parent coroutine
handle.
* The child coroutine obtains the alloca from its parent using the
parent coroutine handle. It then stores the result value there.
* The parent value reads the data from the alloca on resumption.
A bitcast was inserted when the receiver of the call wasn't a *i8. This
is a pretty common case, and did not play well with goroutines.
Avoid this bitcast by changing each call to a direct call, after
unpacking the receiver type from the *i8 parameter. This might also fix
some undefined behavior in the resulting program, as it is technically
not allowed to call a function with a different signature (even if the
signature is compatible).
Before this commit, goroutine support was spread through the compiler.
This commit changes this support, so that the compiler itself only
generates simple intrinsics and leaves the real support to a compiler
pass that runs as one of the TinyGo-specific optimization passes.
The biggest change, that was done together with the rewrite, was support
for goroutines in WebAssembly for JavaScript. The challenge in
JavaScript is that in general no blocking operations are allowed, which
means that programs that call time.Sleep() but do not start goroutines
also have to be scheduled by the scheduler.