Circuit Design
1. INTRODUCTION
WASM (or WebAssembly) is an open standard binary code format close to assembly. Its initial objective is to provide an alternative to JavaScript with better performance in current web ecosystems. Benefiting from its platform independence, front-end flexibility (it can be compiled from the majority of languages, including C, C++, AssemblyScript, and Rust), well-isolated runtime, and speed close to that of native binaries, it is increasingly used in distributed cloud and edge computing. Recently it has become a popular binary format for users to run customized functions on AWS Lambda, OpenYurt, Azure, etc.
The Problem. To implement a ZKSNARK-backed WASM virtual machine, we need to connect the implementation of the WASM runtime with the ZKSNARK proof system. In general, a ZKSNARK system expresses the computation to be proved as arithmetic circuits with polynomial constraints. Therefore we need to abstract the full imperative logic of a WASM virtual machine systematically and rewrite it into arithmetic circuits with constraints. Consider two outputs: one is generated by emulating the WASM bytecode in a WASM runtime that enforces the semantics of the WASM specification, and the other satisfies the constraints imposed on the arithmetic circuits. If the circuits we write preserve the semantics, these two outputs must be the same. Hence a ZKSNARK proof derived from the circuits also shows that the output is a valid result of emulating the bytecode in the WASM runtime.
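To make the idea of rewriting imperative logic into polynomial constraints concrete, the following is a minimal sketch for a single operation, 32-bit wrap-around addition. It is an illustration only and not the circuit used by ZKWASM: the modulus `P`, the function names, and the carry-based encoding are all assumptions chosen for the example.

```python
# Illustrative sketch only: a toy prime field and constraints for 32-bit addition.
# P, the function names, and the encoding are assumptions, not taken from ZKWASM.
P = 2**61 - 1  # hypothetical field modulus


def i32_add_witness(a: int, b: int):
    """Imperative semantics: wrap-around 32-bit addition, plus the carry bit."""
    total = a + b
    return total % 2**32, total >> 32  # (result c, carry)


def i32_add_constraints(a: int, b: int, c: int, carry: int):
    """Polynomial constraints; every entry must be 0 in the field.

    Range checks (a, b, c < 2^32) are omitted here; a real circuit needs them.
    """
    return [
        (a + b - c - carry * 2**32) % P,  # a + b = c + carry * 2^32
        (carry * (carry - 1)) % P,        # carry is a bit
    ]


def satisfied(a: int, b: int, c: int, carry: int) -> bool:
    """True when the assignment satisfies all constraints."""
    return all(v == 0 for v in i32_add_constraints(a, b, c, carry))
```

The output computed by the imperative function and any output accepted by the constraints coincide for this operation, which is the property the circuits must preserve for every instruction.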


We consider the WASM virtual machine as a gigantic program whose input is a tuple (I(C,H),E,IO), where I is a WASM executable image that contains a code image C and an initial memory H, E is its entry point, and IO represents the (stdin, stdout) firmware. In the serverless setup, the WASM runtime starts with an initial state based on the loaded image I, then jumps to the entry point E and starts executing the bytecode based on the WASM specification.
Internally, the WASM runtime maintains a state S denoted by a tuple (iaddr, F, M, G, Sp, I, IO), where iaddr is the current instruction address, F is the calling frame with a depth field, M is the memory state, G is the set of global variables, and Sp is the stack; I and IO are the image and the IO of the input tuple. The runtime simulates the semantics of each instruction, starting at E, until it reaches the exit. The instructions it simulates form an execution trace [t0,t1,t2,t3,⋯], where each transition ti is a function between states that takes an input s:S and outputs a new state s′:S.
For simplicity, we use record field notation to refer to a field of a state s:S. For example, s.iaddr denotes the current instruction address of state s, s.IO.stdin denotes the input of state s, etc. We also use s.iaddr.op to denote the opcode (the operation code that specifies the operation to be performed) at address s.iaddr in the code section C of image I.
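The state tuple and the record field notation can be mirrored with plain records. The sketch below is only one possible reading: the class names, the tuple-based encodings of C, H, M, G, and Sp, the helper `op`, and the choice that the entry frame has depth 1 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:        # I(C, H)
    C: tuple        # code image: (opcode, operand) pairs indexed by address
    H: tuple        # initial memory


@dataclass(frozen=True)
class IOChannels:   # IO = (stdin, stdout)
    stdin: tuple = ()
    stdout: tuple = ()


@dataclass(frozen=True)
class Frame:        # F, reduced here to its depth field
    depth: int


@dataclass(frozen=True)
class State:        # S = (iaddr, F, M, G, Sp, I, IO)
    iaddr: int
    F: Frame
    M: tuple
    G: tuple
    Sp: tuple
    I: Image
    IO: IOChannels


def initial_state(image: Image, entry: int, io: IOChannels) -> State:
    # Assumptions: the entry frame counts as depth 1; globals and stack start empty.
    return State(iaddr=entry, F=Frame(depth=1), M=image.H, G=(), Sp=(), I=image, IO=io)


def op(s: State) -> str:
    """Counterpart of s.iaddr.op: the opcode at address s.iaddr in I.C."""
    return s.I.C[s.iaddr][0]
```

With these records, `s.iaddr` and `s.IO.stdin` read exactly as in the text, while `s.iaddr.op` becomes `op(s)` because an integer address cannot carry a field in Python.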
Based on the above definitions, we define the criteria for a list of state transitions to be valid under (I(C,H),E,IO) as follows.
Definition 2.1 (Valid Execution Trace). Given a WASM machine with input (I(C,H),E,IO), let s0 be the initial state with s0.iaddr=E. A valid execution trace is a list of transition functions ti such that the following holds: (1) For all k, with sk=tk−1∘⋯∘t1∘t0(s0), the transition tk enforces the semantics of sk.iaddr.op. (2) If $s_e$ is the last state, then the depth of the calling frame is zero: $s_e.F.depth = 0$.
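The two conditions of the definition can be checked mechanically on a toy machine. The sketch below is not a WASM interpreter: the four opcodes, the reduced state (with `depth` standing in for F.depth), the reference semantics `step`, and the convention that the entry frame starts at depth 1 are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class State:
    iaddr: int
    depth: int    # stands in for F.depth
    ret: tuple    # return addresses of active calls (toy part of F)
    stack: tuple  # Sp


# Hypothetical code image C: (opcode, operand) pairs, entry point E = 0.
CODE = (("const", 2), ("const", 3), ("call", 4), ("return", None),
        ("add", None), ("return", None))


def step(code, s):
    """Reference semantics of the opcode at s.iaddr (toy instruction set)."""
    opcode, arg = code[s.iaddr]
    if opcode == "const":
        return replace(s, iaddr=s.iaddr + 1, stack=s.stack + (arg,))
    if opcode == "add":
        *rest, a, b = s.stack
        return replace(s, iaddr=s.iaddr + 1, stack=tuple(rest) + ((a + b) % 2**32,))
    if opcode == "call":
        return replace(s, iaddr=arg, depth=s.depth + 1, ret=s.ret + (s.iaddr + 1,))
    if opcode == "return":
        if s.ret:  # return to the caller
            return replace(s, iaddr=s.ret[-1], depth=s.depth - 1, ret=s.ret[:-1])
        return replace(s, depth=s.depth - 1)  # leave the entry frame
    raise ValueError(opcode)


def run(s0, trace):
    """s_k = t_{k-1} . ... . t_1 . t_0 (s0), computed as a left fold."""
    return reduce(lambda s, t: t(s), trace, s0)


def is_valid_trace(code, s0, trace):
    s = s0
    for t in trace:
        if s.depth == 0:           # assumption: no transitions after the exit
            return False
        nxt = t(s)
        if nxt != step(code, s):   # condition (1): t_k enforces the opcode semantics
            return False
        s = nxt
    return s.depth == 0            # condition (2): final frame depth is zero


S0 = State(iaddr=0, depth=1, ret=(), stack=())
HONEST = [lambda s: step(CODE, s)] * 6
```

Here `is_valid_trace(CODE, S0, HONEST)` holds, while truncating the trace or replacing one transition with a function that writes a wrong stack value makes the check fail.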
Organization of the document. After a brief introduction to the basic ideas about how to connect a stateful virtual machine with ZKSNARK in Section 1, we describe the basic building blocks and ingredients used to construct ZKWASM circuits in Section 2 and then present the circuit architecture in Section 3. With the architecture settled, we discuss the circuits for every category of WASM instructions in Section 4. Finally, we present the partition and proof batching technique that addresses the long execution trace problem.
Throughout the document, we use the notation a:A to specify a variable a of type A, F to specify the number field, and F^n to specify the vectors of dimension n over F. We denote by A→B the function type from A to B and use ∘ for function composition. Moreover, we use G[i][j] to specify the value of the cell of matrix G at the i-th row and j-th column.