Circuit Design

1 INTRODUCTION

WASM (or WebAssembly) is an open standard binary code format close to assembly. Its initialobjective is to provide an alternative to java-script with better performance in the current webecosystems. Benefiting from its platform independence, front-end flexibility (can be compiled fromthe majority of languages including C, C++, assembly script, rust, etc.), good isolated runtimeand speed that is close to native binary, its usage starts to arise in the distributed cloud and edgecomputing. Recently it has become a popular binary format for users to run customized functionson AWS Lambda, Open Yurt, AZURE, etc.

The Problem. To implement a ZKSNARK-backed WASM virtual machine, we need to connect the implementation of WASM runtime with the proof system of ZKSNARK. In general, a ZKSNARK system is represented in arithmetic circuits with polynomial constraints. Therefore we need to abstract the full imperative logic of a WASM virtual machine systematically and rewrite it into arithmetic circuits with constraints. Given two outputs, one is generated by emulating the WASM bytecode in WASM runtime that enforces the semantics of WASM specification, and the other satisfies the constraints imposed on the arithmetic circuits. If the circuits we write preserve the semantics, these two outputs must be the same. Hence the proof of the ZKSNARK derived from the circuits also shows that the output is valid as a result of emulating the bytecode in WASM runtime.

Organization of the document. After a brief introduction to the basic ideas about how to connecta stateful virtual machine with ZKSNARK in Section 2, we describe the basic building block andingredients used to construct ZKWASM circuits in Section 3 and then present the circuits architecture in Section 4. After the architecture is settled, we discuss the circuits of every category of WASM instructions in Section 5. In addition, in Section 5.4 we discuss foreign instruction expansion which provides a way to extend the virtual machine for better performance and integration. In Section 6, we present the partition and proof batching technique to solve the long execution trace problem.

Throughout the document, we use the notation a:Aa:A to specify a variable of type AA, FF to specify the number field, and FnF_n to specify a multi-dimensional vector with dimension nn. We denote by ABA \rightarrow B the function type from AA to BB and use \circ for function composition. Moreover, we use G[i][j]G[i][j] to specify the value of the cell of matrix GG at the ii th row and jj th column.

2 WASM Run-Time as a State Machine

We consider the WASM virtual machine as a gigantic program, with the input as a tuple (I(C,H),E,IO)(I(C, H), E, IO) ,where II is a WASM executable image that contains a code image CC and an initial memory HH, EE is its entry point, and IOIO represents the (stdin, stdout) firmware. In the serverless setup, the WASM run-time starts with an initial state based on the loaded image II, then jumps to the entry point EE and starts executing the bytecode based on the WASM specification.

Internally the WASM run-time maintains a state SS denoted by a tuple (iaddriaddr, FF, MM, GG, SpSp, II, IOIO) where iaddriaddr is the current instruction address, FF is the calling frame with a depth field, MM is the memory state, SpSp is the stack and GG is the set of global variables. The run-time simulates the semantic of each instruction start at EE until it reaches the exit. The instructions it simulates form an execution trace [t0,t1,t2,t3,][t_0, t_1, t_2, t_3, \cdots] and each transition tit_i is a function between states that takes an input s:Ss: S and outputs a new state s:Ss': S.

For simplicity, we will use the notation of record field to specify a field in state s:Ss:S. For example, s.iaddrs.iaddr denotes the current instruction address of state ss, s.IO.stdins.IO.stdin denotes the input of state ss, etc. We also use s.iaddr.ops.iaddr.op to denote the opcode (operation code that specifies the operation to be performed) at address s.iaddrs.iaddr in the code section CC of image II.

Based on the above definition, we define the criteria for a list of state transitions to be validunder (I(C,H),E,IO)(I(C, H), E, IO), as follows.

-- Definition 2.1 (Valid Execution Trace). Given a WASM machine with input (I(C,H),E,IO)(I(C, H), E, IO), and s0s_0 is the initial state with s0.iaddr=Es_0.iaddr = E. A valid execution trace is a list of transition functions tit_i suchthat the following holds: (1) For all kk, sk=tk1t1t0(s0)s_k = t_{k-1} \circ \cdots \circ t_1 \circ t_0 (s_0), tkt_k enforces the semantics of sk.iaddr.ops_k.iaddr.op. (2) If $s_e$ is the last state, then the depth of the calling frame is zero: $se.F.depth = 0$.

3 IO of ZKWASM

The outermost function of ZKWASM is of type zkmain(void): (void). To access the external IO, we use two builtin host functions: zkwasm_input and zkwasm_output. Suppose that we have an array of instances [IiI_i] and a internal cursor kk, zkwasm_input(1) binds the kkth instance to the top of the stack and similarly zkwasm_output() binds the current top of the stack with IkI_k. Both zkwasm_input and zkwasm_output increase the cursor kk. We can also access external witness by zkwasm_input(0), when doing so we just fill the top of the stack with a witness which does not bind with any public instances.

4 Succinct Proof of the circuits

Compared with a standard WASM run-time, ZKWASM aims to provide a proof to prove that the output is valid so that it can be used in scenarios which require trustless and privacy computation. Moreover, the verifying algorithm needs to be simple in the sense of complexity to be useful inpractical. In zkWasm we have three different categories of circuits.

  1. Guest circuits: enforces the semantics of slices of execution traces.

  2. Host circuits: enforces the semantics of host API calls

  3. Aggregation circuits: batch the guest circuits and host circuits and connect proofs of slices of execution.

Last updated