-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stack switching: Infrastructure and runtime support #10388
base: main
Are you sure you want to change the base?
Stack switching: Infrastructure and runtime support #10388
Conversation
Subscribe to Label Actioncc @fitzgen
This issue or pull request has been labeled: "fuzzing", "wasmtime:api", "wasmtime:c-api", "wasmtime:config", "wasmtime:ref-types"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Label Messager: wasmtime:configIt looks like you are changing Wasmtime's configuration options. Make sure to
To modify this label's message, edit the To add new label messages or remove existing label messages, edit the |
Haven't dug into the details of the PR yet, but I have some initial questions after reading the OP.
Is there a particular reason this does not use the
Is this because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(review is still in progress, but I need to context switch for a little bit and might as well share what I have so far)
WASM_API_EXTERN void | ||
wasmtime_linker_allow_unknown_exports(wasmtime_linker_t *linker, | ||
bool allow_unknown_exports); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a corresponding function definition in crates/c-api
, I think it is missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a stray declaration from some of our downstream experiments. It has been needed to capture the C shadow stack pointer, that clang generates. It must be kept in sync with the stack switching done by the engine. To be clear, this is just due to the fact that clang doesn't know about stack switching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, exposing these things the C API would be good in general, so if you want to do it here or in a new PR, you're more than welcome to!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sure! I am leaning towards adding this in a follow-up PR focused on the C API. What do you think @frank-emrich ?
/** | ||
* \brief Configures whether the WebAssembly function references | ||
* proposal is enabled. | ||
* | ||
* This setting is `false` by default. | ||
*/ | ||
WASMTIME_CONFIG_PROP(void, wasm_function_references, bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I don't see a function definition for this one either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same reason as above. Should just be deleted for now, I think.
/// Size of stacks created with cont.new instructions | ||
pub stack_switching_stack_size: Option<usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least for now, I'm not sure it is worth having separate config knobs for stacks created via cont.new
versus API-level fibers. This effectively creates two different kinds of stacks in the system, and I'd prefer to try and keep things simple for as long as we can by having a single kind of stack that we config and allocate and handle in the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting question, which Frank and I have discussed in the past. If my memory serves me right, then our layout is slightly different. However perhaps we can unify the layouts (or we can perhaps simply adopt the fibers' layout). I am not too sure about the implications, I think @frank-emrich has the key knowledge to best answer this question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The automatic pooling allocator integration would also get you fast allocation of new stacks from a pool, which would look nice on your benchmarks and what have you ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed! I totally see the appeal!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a technical perspective, there is no need for us to have our own config setting for the stack size.
I guess the reason we didn't just re-use the existing stack_size
option is that the latter always felt like a soft limit to me: AFAIK you can still occupy stack space beyond that limit, for example with host functions.
But I have no strong feelings about re-using the existing stack_size
config option, or merging our stack_switching_stack_size
with async_stack_size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK you can still occupy stack space beyond that limit, for example with host functions.
We are generally a little loosey-goosey with our nomenclature here, but to make things precise for a moment:
- "Host functions" = functions defined in native code by the Wasmtime embedder (e.g. via
wasmtime::Func::new
) and imported by Wasm - "libcalls" = various bits of Wasm functionality (e.g.
memory.grow
) that are not implemented inline in JIT code, but instead in native functions inside of Wasmtime's runtime.
libcalls are part of Wasmtime's trusted compute base, host functions are not (modulo some hand waving around their usage of unsafe
).
Given all that: we will execute libcalls on async fibers, but will never execute host functions on async fibers. When Wasm calls an imported host function, we first switch from the async fiber to the native stack, then run the host function, and finally switch back to the Wasm's async fiber stack to pass the function results back to Wasm JIT code.
A libcall should never access stack memory beyond the async_stack_size
limit (and we rely on guard pages to help enforce this). The Wasm JIT code is given max_wasm_stack
bytes for its own use, it must always be true that async_stack_size > max_wasm_stack
, and therefore that leaves async_stack_size - max_wasm_stack
bytes of stack space for libcalls in the worst case.
This is all a little bit of an aside and I'm mostly just explaining all this to make sure we are on the same page here. I don't think any of this needs to change for stack-switching, and I would also expect that we would call libcalls, but never host functions, on stack-switching stacks, same way we do things on our async fibers.
// TODO(#10248) GC integration for stack switching | ||
return Err(wasmtime_environ::WasmError::Unsupported( | ||
"Stack switching feature not compatbile with GC, yet".to_string(), | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not something that needs to be handled in this PR, but there is no reason we can't enable GC and also maintain the existing memory management for continuations, where they have the same lifetime as the Store
. We don't have to implement full GC of continuations before we allow enabling both GC and stack switching at the same time.
pub const TRAP_DELETE_ME_DEBUG_ASSERTION: TrapCode = | ||
TrapCode::unwrap_user(Trap::DeleteMeDebugAssertion as u8 + TRAP_OFFSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, we already have TRAP_INTERNAL_ASSERT
that we use for debug-only assertions inside JIT code.
// FIXME(frank-emrich) The next three builtins are used by the debug printing mechanism. | ||
// They are not supposed to be part of the final upstreamed code. | ||
// | ||
// Prints a 'static str, represented as a | ||
// pointer and a length. | ||
delete_me_print_str(vmctx: vmctx, s: pointer, len : u64); | ||
// Prints integer | ||
delete_me_print_int(vmctx: vmctx, arg : u64); | ||
// Prints pointer, formatted as hex. | ||
delete_me_print_pointer(vmctx: vmctx, arg : pointer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were these going to be removed in the next PR, or should they be removed now? Maybe throw an item into the meta task list if not now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My take is that they shouldn't make it into the main
branch -- probably a stray include?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! These only exist for some hacky debug printing code that should not end up in main
. Initially, when we thought that there would only be a single, big PR, the plan was to keep them during reviewing, and then remove them at the last moment before merging. Now that we will land this in smaller pieces, we could either remove them now, or once all PRs from the series have landed.
The reason why we don't want to land this is explained here: On the codegen side, we just emit the addresses of Rust string literals as literals into generated code, with no relocation information. So this crashes and burns if you ever try to re-use the same generated code from a different invocation of wamtime
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why we don't want to land this is explained here: On the codegen side, we just emit the addresses of Rust string literals as literals into generated code, with no relocation information. So this crashes and burns if you ever try to re-use the same generated code from a different invocation of
wamtime
.
Ah okay! Yes, definitely should not land these bits in that case.
/// FIXME(frank-emrich) Will remove in the final upstreamed version | ||
#[allow(dead_code, reason = "Only accessed in debug builds")] | ||
pub const ENABLE_DEBUG_PRINTING: bool = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto to the debug printing stuff in this file.
/// FIXME(frank-emrich) Will remove in the final upstreamed version | ||
#[macro_export] | ||
macro_rules! debug_println { | ||
($( $args:expr ),+ ) => { | ||
#[cfg(debug_assertions)] | ||
if ENABLE_DEBUG_PRINTING { | ||
#[cfg(feature = "std")] | ||
println!($($args),*); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fyi, I tend to do log::trace!(...)
for debug prints (both temporary ones that I will remove, and for ones that are useful enough to keep in tree).
when running the CLI, you can see them by setting the WASMTIME_LOG=trace
env var, or only a specific crate/module via WASMTIME_LOG=wasmtime::runtime::vm=trace
, or multiple crates/modules at different levels via WASMTIME_LOG=debug,regalloc2=info,cranelift_codegen::machinst=trace
.
When running tests instead of the CLI, WASMTIME_LOG
becomes RUST_LOG
. Most tests initialize the env-logger, but if you aren't seeing the logs, then you can put let _ = env_logger::try_init();
at the start of the test (or ideally in a shared test helper function that a bunch of related tests already call at the beginning of their execution).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(again, review still in progress, leaving intermediate notes)
/// Runtime configuration options for stack switching that can be set | ||
/// via the command line. | ||
/// | ||
/// Part of wasmtime::config::Config type (which is not in scope in this crate). | ||
#[derive(Debug, Clone)] | ||
pub struct StackSwitchingConfig { | ||
/// The (fixed) size of a continuation stack. | ||
pub stack_size: usize, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These sorts of config knobs that need to exist at the wasmtime_environ
level usually go inside wasmtime_environ::Tunables
. Unless there is a reason not to do that here, we should follow the existing convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I discussed with @alexcrichton (https://github.com/bytecodealliance/wasmtime/pull/10177/files#r1965733715), I'd be OK with no bespoke stack switching configuration for now.
/// Encodes the life cycle of a `VMContRef`. | ||
#[derive(Debug, Clone, Copy, PartialEq)] | ||
#[repr(u32)] | ||
pub enum VMStackState { | ||
/// The `VMContRef` has been created, but neither `resume` or `switch` has ever been | ||
/// called on it. During this stage, we may add arguments using `cont.bind`. | ||
Fresh = wasmtime_environ::stack_switching::STACK_STATE_FRESH_DISCRIMINANT, | ||
/// The continuation is running, meaning that it is the one currently | ||
/// executing code. | ||
Running = wasmtime_environ::stack_switching::STACK_STATE_RUNNING_DISCRIMINANT, | ||
/// The continuation is suspended because it executed a resume instruction | ||
/// that has not finished yet. In other words, it became the parent of | ||
/// another continuation (which may itself be `Running`, a `Parent`, or | ||
/// `Suspended`). | ||
Parent = wasmtime_environ::stack_switching::STACK_STATE_PARENT_DISCRIMINANT, | ||
/// The continuation was suspended by a `suspend` or `switch` instruction. | ||
Suspended = wasmtime_environ::stack_switching::STACK_STATE_SUSPENDED_DISCRIMINANT, | ||
/// The function originally passed to `cont.new` has returned normally. | ||
/// Note that there is no guarantee that a VMContRef will ever | ||
/// reach this status, as it may stay suspended until being dropped. | ||
Returned = wasmtime_environ::stack_switching::STACK_STATE_RETURNED_DISCRIMINANT, | ||
} | ||
|
||
/// Universal control effect. This structure encodes return signal, resume | ||
/// signal, suspension signal, and the handler to suspend to in a single variant | ||
/// type. This instance is used at runtime. There is a codegen counterpart in | ||
/// `cranelift/src/stack-switching/control_effect.rs`. | ||
#[derive(Clone, Copy, Debug, Eq, PartialEq)] | ||
#[repr(u32)] | ||
#[allow(dead_code)] | ||
pub enum ControlEffect { | ||
/// Used to signal that a continuation has returned and control switches | ||
/// back to the parent. | ||
Return = wasmtime_environ::stack_switching::CONTROL_EFFECT_RETURN_DISCRIMINANT, | ||
/// Used to signal to a continuation that it is being resumed. | ||
Resume = wasmtime_environ::stack_switching::CONTROL_EFFECT_RESUME_DISCRIMINANT, | ||
/// Used to signal that a continuation has invoked a `suspend` instruction. | ||
Suspend { | ||
/// The index of the handler to be used in the parent continuation to | ||
/// switch back to. | ||
handler_index: u32, | ||
} = wasmtime_environ::stack_switching::CONTROL_EFFECT_SUSPEND_DISCRIMINANT, | ||
/// Used to signal that a continuation has invoked a `suspend` instruction. | ||
Switch = wasmtime_environ::stack_switching::CONTROL_EFFECT_SWITCH_DISCRIMINANT, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because (AFAICT) these don't contain any usize
s or pointers or anything else that changes with the target's word size, you could move them into the wasmtime_environ
crate as
#[repr(u32)]
pub enum VMControlEffect {
Return = 0,
// ...
}
and avoid the wordy and kind of annoying split between the definitions of the enum
s and their discriminants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, these were in wasmtime_environ
originally, but then @alexcrichton suggested moving them into the runtime. I don't mind it either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, lets keep it as-is for now then. I don't want to make you bounce back and forth.
/// TODO | ||
pub fn unwrap_ref_type(&self) -> WasmRefType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to fill this in.
/// Returns the (module interned) index to the underlying function type. | ||
pub fn unwrap_interned_type_index(self) -> ModuleInternedTypeIndex { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: both the Engine
and Module
variants are interned, so perhaps we can rename this method to unwrap_module_type_index
?
/// This is the size of the largest value type (i.e. a V128). | ||
#[inline] | ||
fn maximum_value_size(&self) -> u8 { | ||
self.size_of_vmglobal_definition() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unused. Does it get used in follow up PRs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it gets used during the code translation. It should arguably be added along with it in the next patch.
stack-switching = [ | ||
"std", | ||
"runtime" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
-
Why does stack-switching need
std
? -
We should be able to enable stack-switching for a compiler-without-runtime build to be able to compile stack-switching Wasm to be run elsewhere, and similarly should be able to enable stack-switching in a runtime-without-compiler build to run elsewhere-compiled stack-switching Wasm modules. As written, enabling the stack-switching feature will always enable the runtime, making the compiler-without-runtime version impossible. That is, unless compilation of stack-switching isn't gated on any cargo feature, but that is kind of funky given that this cargo feature exists and I expect users would think "I want to compile Wasm programs that use stack-switching, therefore I will enable this feature". This is a long way of basically saying, I think this should not unconditionally enable the runtime and, once compilation is added in the next PR, we ultimately end up with something like this:
stack-switching = [ "wasmtime-cranelift?/stack-switching", ]
Neither of these things necessarily needs to be addressed now, before this PR merges, but it may end up being easier doing it now before more stuff gets entangled, making it harder to pull the cfg
s apart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll also want to add some CI checks for these features being enabled/disabled in various combinations like this:
https://github.com/bytecodealliance/wasmtime/blob/main/.github/workflows/main.yml#L363-L365
// TODO(#10248) Required to support stack switching in the embedder API. | ||
unimplemented!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an item to the meta task list for the embedder API, if you haven't already?
// The `enter_wasm` call below will reset the store's `stack_chain` to | ||
// a new `InitialStack`, pointing to the stack-allocated | ||
// `initial_stack_csi`. | ||
let mut initial_stack_csi = VMCommonStackInformation::running_default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you run the function call microbenchmarks on main
and on this branch? I want to make sure that we aren't regressing performance here, as this path is very hot, and if we are we may need to figure out how to cfg(...)
some of this stuff so that it only happens when the stack-switching
feature is enabled.
This will run the microbenchmarks and will check the results for statistical significance:
$ git checkout main
$ cargo bench --bench call -- --save-baseline main
$ git checkout stack-switching-infra
$ cargo bench --bench call -- --baseline main
I can help figure things out if we need to do something here.
let result = crate::runtime::vm::catch_traps(store, &previous_runtime_state, closure); | ||
core::mem::drop(previous_runtime_state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mem::drop
dance is kind of funky -- can we instead pass ownership into catch_traps
and let it handle the dropping, if necessary? Or are there callers that need to do different things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that drop
call is definitely not pretty. We could get away with not explicitly calling drop
, but managing the scopes here slightly differently. But that seems equally funky.
The main reason for all this new EntryStoreContext
business is the following: Previously, there were two places that do some updates to the runtime on entering and exiting Wasm: In catch_traps
(together with CallThreadState
's drop
) and here in func.rs
(using enter_wasm
/exit_wasm
). I believe @alexcrichton mentioned that this split is mostly for legacy reasons due to how to things were separated into different crates until recently.
For stack switching, we now require an additional update to the runtime on entering and exiting. Intuitively, it felt to me like these changes should be made in enter_wasm
/exit_wasm
, but that doesn't quite work: We need to undo these changes if you panic
out of a Wasm execution, so I would indeed need to do all of this in catch_traps
instead, so that CallThreadState
's drop
can do the steps needed on exiting the runtime.
But updating the stack chain in catch_traps
to signal that we are entering Wasm seemed awkward to me, so I decided to just consolidate all the logic for changes that need to happen on entering and exiting Wasm into one place, which resulted in the birth of EntryStoreContext
.
TL;DR Technically, all that's needed for stack switching could all be done in catch_traps
, but that felt like it's morally the wrong spot to me.
Any suggestions? Should I just move it to catch_traps
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason for all this new
EntryStoreContext
business is the following: Previously, there were two places that do some updates to the runtime on entering and exiting Wasm: Incatch_traps
(together withCallThreadState
'sdrop
) and here infunc.rs
(usingenter_wasm
/exit_wasm
)....
Any suggestions? Should I just move it to
catch_traps
?
Do you mind splitting out just the EntryStoreContext
bits (without even the stack chain preservation) into its own PR? I think that would help me get a better grasp of these refactorings and provide better review and feedback, especially since the wasm entry/exit paths are both very fiddly and also perf sensitive.
I do like the idea of unifying these code paths a lot, but I just want to be very careful about changes to them in general. Honestly, it is fairly likely that we won't change the mechanisms any further than what you have already done in this PR here, but it would help a lot with alleviating my nibbling paranoia that we are introducing subtle bugs or perf regressions.
And I apologize for asking you to keep splitting these PRs even finer-grained. I know it can be a pain (again, see #10503) but it is super super super helpful for reviewers and making sure that we are following along with all the changes. Thanks!
// FIXME(frank-emrich) Do the fields in here need to be (Unsafe)Cells? | ||
pub struct EntryStoreContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are written to directly from Wasm JIT code, then probably yes. This struct should also be #[repr(C)]
in that case.
If it is not written to directly from Wasm, then it would only need to be unsafe cells if you are ever violating Rust's mutable/exclusive xor immutable/shared semantics.
I haven't read enough of these changes to say any more than that yet.
(Sorry for the long delay, I've moved to London and started a new job, where I needed to settle in first)
The reason we do not use these in verbatim is that our stacks are not the same as
We actually extend |
No worries. Congrats on the move and the new gig!
I'd like to dive more into this because I would (perhaps naively?) assume that there is no fundamental reason they couldn't be the same. This is also one of the biggest concerns I have about the the stack-switching implementation: the complexity of having multiple kinds of stacks that the runtime must manage; when there are different kinds of resources, we will eventually need different config knobs for tuning them; and etc. Basically, this feels like a bunch of cognitive overhead for both maintainers and users. Keeping things uniform keeps things simpler. For example, #9350 (which is admittedly a beast of a PR; I have been splitting it into smaller pieces and landing those incrementally) will ultimately reduce complexity in Wasmtime by unifying our handling of Wasm linear memories and GC heaps. I just point this out so that you know I'm not trying to pick on you, and that I am attempting to hold myself to these standards as well. But perhaps I am missing some key point that makes it so that these stack-switching stacks and fiber stacks really do have different requirements that do not overlap enough that sharing an implementation makes sense? You say it would be quite difficult to unify them, can you expand on what difficulties your foresee?
Totally, that's exactly what I'd expect: a free list pop should be much faster than
Ah okay, I get it now. You're right that we definitely don't want to cut the capacity of all existing tables in half in order to support a proposal that 99% of users won't be leveraging yet. Two things:
In the meantime, halving the capacity just for continuation tables (rather than all other existing tables) is fine. And we can do the above suggestions in follow up PRs. |
This PR is part of a series that adds support for the Wasm stack switching proposal. The explainer document for the proposal is here. There's a tracking issue describing the overall progress and limitations here: #10248
The draft PR #10177 contains the whole (initial) implementation of the proposal.
This PR contains all of the necessary infrastructure and runtime support. In other words, this contains the entire implementation, except for codegen. Tests are added at the end, after codegen has also been added in a follow-up PR.
This was developed together with @dhil.
General implementation notes
In Wasm, continuations are represented by values of type
(ref $ct)
, where$ct
is a new composite type/heap type for continuations.In the implementation, these are represented by values of type
VMContObj
. These are fat pointers, consisting of a sequence counter, and a pointer to aVMContRef
. The latter type is used for the actual representation of continuations.The sequence counter part of
VMContObj
s is used to check that every continuation value can only be used once.The
VMStoreContext
is extended to contain a "stack chain": It indicates what stack we are currently executing on. Logically, this is a linked list of stacks, since each continuation has a parent field. The chain stored in theVMStoreContext
always ends with a value representing the initial stack. This is the stack we were on when entering Wasm, which will usually be the main stack.Memory Management
Currently, memory management is very basic: The
StoreOpaque
provides a new method for allocation a newVMContRef
, and keeps track of all continuations created this way. Continuations are never deallocated at the moment, meaning that they live until the store itself is deallocated.The stack memory used by each allocation (represented by type
ContinuationStack
) is alwaysmmap
-ed, with a guard page at the end. There is currently no support for growing stacks, or segmented stacks.Backtrace generation
The existing backtrace generation mechanism is extended to be fully aware of continuations: When generating backtraces, the entire chain of continuation is traversed, not just the stack frames of the currently active stack/continuation.
Entering/Exiting Wasm
Prior to this PR, there were two separate mechanisms that save and restore some state of the runtime when entering/exiting Wasm:
enter_wasm
andexit_wasm
infunc.rs
CallThreadState
saves and restores (ondrop
) parts of theVMStoreContext
This PR consolidates these two mechanism, because it requires some additional state to be updated and restored on enter/exit:
the type
wasmtime::runtime::func::EntryStoreContext
now stores all of the required runtime state and ensures that it's restored when exiting Wasm.Tables
VMContObj
is 2 pointers wide, the maximum size of an entry in a table doubles. As a result, this PR doubles the amount of virtual memory space occupied by the table pool (but the amount of pages actually used stays the same).