Circuit Design
Last updated
Last updated
WASM (or WebAssembly) is an open standard binary code format close to assembly. Its initialobjective is to provide an alternative to java-script with better performance in the current webecosystems. Benefiting from its platform independence, front-end flexibility (can be compiled fromthe majority of languages including C, C++, assembly script, rust, etc.), good isolated runtimeand speed that is close to native binary, its usage starts to arise in the distributed cloud and edgecomputing. Recently it has become a popular binary format for users to run customized functionson AWS Lambda, Open Yurt, AZURE, etc.
The Problem. To implement a ZKSNARK-backed WASM virtual machine, we need to connect the implementation of WASM runtime with the proof system of ZKSNARK. In general, a ZKSNARK system is represented in arithmetic circuits with polynomial constraints. Therefore we need to abstract the full imperative logic of a WASM virtual machine systematically and rewrite it into arithmetic circuits with constraints. Given two outputs, one is generated by emulating the WASM bytecode in WASM runtime that enforces the semantics of WASM specification, and the other satisfies the constraints imposed on the arithmetic circuits. If the circuits we write preserve the semantics, these two outputs must be the same. Hence the proof of the ZKSNARK derived from the circuits also shows that the output is valid as a result of emulating the bytecode in WASM runtime.
Organization of the document. After a brief introduction to the basic ideas about how to connecta stateful virtual machine with ZKSNARK in Section 2, we describe the basic building block andingredients used to construct ZKWASM circuits in Section 3 and then present the circuits architecture in Section 4. After the architecture is settled, we discuss the circuits of every category of WASM instructions in Section 5. In addition, in Section 5.4 we discuss foreign instruction expansion which provides a way to extend the virtual machine for better performance and integration. In Section 6, we present the partition and proof batching technique to solve the long execution trace problem.
Throughout the document, we use the notation to specify a variable of type , to specify the number field, and to specify a multi-dimensional vector with dimension . We denote by the function type from to and use for function composition. Moreover, we use to specify the value of the cell of matrix at the th row and th column.
We consider the WASM virtual machine as a gigantic program, with the input as a tuple ,where is a WASM executable image that contains a code image and an initial memory , is its entry point, and represents the (stdin, stdout) firmware. In the serverless setup, the WASM run-time starts with an initial state based on the loaded image , then jumps to the entry point and starts executing the bytecode based on the WASM specification.
Internally the WASM run-time maintains a state denoted by a tuple (, , , , , , ) where is the current instruction address, is the calling frame with a depth
field, is the memory state, is the stack and is the set of global variables. The run-time simulates the semantic of each instruction start at until it reaches the exit. The instructions it simulates form an execution trace and each transition is a function between states that takes an input and outputs a new state .
For simplicity, we will use the notation of record field to specify a field in state . For example, denotes the current instruction address of state , denotes the input of state , etc. We also use to denote the opcode (operation code that specifies the operation to be performed) at address in the code section of image .
Based on the above definition, we define the criteria for a list of state transitions to be validunder , as follows.
-- Definition 2.1 (Valid Execution Trace). Given a WASM machine with input , and is the initial state with . A valid execution trace is a list of transition functions suchthat the following holds: (1) For all , , enforces the semantics of . (2) If $s_e$ is the last state, then the depth of the calling frame is zero: $se.F.depth = 0$.
The outermost function of ZKWASM is of type zkmain(void): (void)
. To access the external IO, we use two builtin host functions: zkwasm_input and zkwasm_output. Suppose that we have an array of instances [] and a internal cursor , zkwasm_input(1) binds the th instance to the top of the stack and similarly zkwasm_output() binds the current top of the stack with . Both zkwasm_input and zkwasm_output increase the cursor . We can also access external witness by zkwasm_input(0), when doing so we just fill the top of the stack with a witness which does not bind with any public instances.
Compared with a standard WASM run-time, ZKWASM aims to provide a proof to prove that the output is valid so that it can be used in scenarios which require trustless and privacy computation. Moreover, the verifying algorithm needs to be simple in the sense of complexity to be useful inpractical. In zkWasm we have three different categories of circuits.
Guest circuits: enforces the semantics of slices of execution traces.
Host circuits: enforces the semantics of host API calls
Aggregation circuits: batch the guest circuits and host circuits and connect proofs of slices of execution.