To better understand how WebAssembly works, it helps to reverse engineer an application and see how WebAssembly opcodes are used in a real-life scenario. The code used for this example is the C function created in the previous post. Once the C function is compiled to emailchecker.wasm (as shown in the previous post), the command below disassembles the bytecode and reveals the opcodes:

wabt/bin/wasm-objdump -d --enable-all emailchecker.wasm

The following is an excerpt of the output generated by the command above. The comments use these acronyms:

  • GV is global variable
  • GVV is global variables vector
  • LV is local variable
  • LVV is local variables vector
  • EA is effective address
  • AV is alignment value

Code Disassembly:

000190 func[0] <__wasm_call_ctors>:
000191: 10 08                      | call 8 <emscripten_stack_init>
000193: 0b                         | end
000196 func[1] <emailchecker>:
;; Make space for 186 LVs, all integers
000197: ba 01 7f                   | local[0..185] type=i32
;; Value of stack pointer is read from index 0 of GVV and pushed onto stack
;; As GVs can be accessed by all functions, they are well suited to hold the stack pointer
00019a: 23 00                      | global.get 0
;; Stack pointer is popped off the stack and stored into the element 1 of the LVV
00019c: 21 01                      | local.set 1
;; The integer 32 is pushed onto the stack
00019e: 41 20                      | i32.const 32
;; The integer 32 is popped from the stack and stored into the element 2 of the LVV
0001a0: 21 02                      | local.set 2                 
0001a2: 20 01                      | local.get 1
0001a4: 20 02                      | local.get 2
;; Calculate (local.get 1) - (local.get 2)
0001a6: 6b                         | i32.sub
;; Store the result into the element 3 of the LVV
0001a7: 21 03                      | local.set 3    
;; Value stored into element 3 of LVV is pushed onto the stack
0001a9: 20 03                      | local.get 3
;; Pop the value and write it to index 0 of GVV: the stack pointer now sits 32 bytes lower
0001ab: 24 00                      | global.set 0
;; Base address (in bytes) for i32.store
0001ad: 20 03                      | local.get 3
;; Value for i32.store to store
0001af: 20 00                      | local.get 0
;; Store with alignment 2 (2^2 = 4 bytes, i.e. 32-bit), offset 24 and EA = 24 + (local.get 3)
;; The alignment immediate is a power-of-two exponent: 0 is 8-bit, 1 is 16-bit, 2 is 32-bit and so on
;; Alignment is a hint to the VM that the EA of the access will be a multiple of the AV
;; If the EA of an access is a multiple of the AV, then the access is aligned
;; A misaligned access still works, but it may be slower
;; The AV must never exceed the size of the access (4 bytes for i32.store), or validation fails
0001b1: 36 02 18                   | i32.store 2 24              
0001b4: 41 01                      | i32.const 1
0001b6: 21 04                      | local.set 4
;; Base address (in bytes) for i32.store
0001b8: 20 03                      | local.get 3
;; Value to store for i32.store
0001ba: 20 04                      | local.get 4
;; Store with alignment 2 (32-bit), offset 20 and EA = 20 + (local.get 3)
0001bc: 36 02 14                   | i32.store 2 20              
0001bf: 20 03                      | local.get 3
;; Load value with alignment 2 (32-bit), offset 24 and EA = 24 + (local.get 3)
0001c1: 28 02 18                   | i32.load 2 24
;; Store this value into element 5 of LVV
0001c4: 21 05                      | local.set 5                 
0001c6: 20 03                      | local.get 3
0001c8: 20 05                      | local.get 5
0001ca: 36 02 10                   | i32.store 2 16
;; Outer block that encloses the conditional instructions
0001cd: 02 40                      | block                       
0001cf: 02 40                      |   block
;; Beginning of the loop
0001d1: 03 40                      |     loop
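
To make the excerpt above easier to follow, here is a minimal C sketch of the same steps: reserving a 32-byte frame below the stack pointer and storing and loading values at EA = base + offset. The names memory, g0, is_aligned, i32_store, i32_load and emailchecker_prologue are hypothetical and only serve this illustration, and the starting value of the stack pointer (65536) is an assumption rather than something shown in the listing.

```c
#include <stdint.h>

/* Hypothetical model of the module state: one 64 KiB page of linear memory
   and global 0 acting as the stack pointer (starting value is an assumption). */
uint8_t  memory[65536];
uint32_t g0 = 65536u;

/* An access is aligned when the EA is a multiple of 2^align_log2 bytes. */
int is_aligned(uint32_t ea, uint32_t align_log2) {
    return (ea & ((1u << align_log2) - 1u)) == 0;
}

/* i32.store: EA = base + offset, value written in little-endian byte order. */
void i32_store(uint32_t base, uint32_t offset, uint32_t value) {
    uint32_t ea = base + offset;
    for (int i = 0; i < 4; i++)
        memory[ea + i] = (uint8_t)(value >> (8 * i));
}

/* i32.load: EA = base + offset, value read in little-endian byte order. */
uint32_t i32_load(uint32_t base, uint32_t offset) {
    uint32_t ea = base + offset;
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)memory[ea + i] << (8 * i);
    return value;
}

/* The instructions of emailchecker up to the first block.
   Returns the frame address, which the listing keeps in local 3. */
uint32_t emailchecker_prologue(uint32_t param0) {
    uint32_t frame = g0 - 32u;                   /* global.get 0, i32.const 32, i32.sub */
    g0 = frame;                                  /* global.set 0 */
    i32_store(frame, 24, param0);                /* i32.store 2 24 */
    i32_store(frame, 20, 1u);                    /* i32.const 1, i32.store 2 20 */
    i32_store(frame, 16, i32_load(frame, 24));   /* i32.load 2 24, i32.store 2 16 */
    return frame;
}
```

Under these assumptions, calling emailchecker_prologue with any argument leaves a copy of that argument at frame + 24 and frame + 16, and the constant 1 at frame + 20. All three addresses are multiples of 4, which is why the compiler can declare alignment 2 for every access.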

Eventually, every block and loop opened above is closed by its own end instruction, and one final end closes the function body, at which point the function returns. Once the bytecode that defines the emailchecker function ends, the output continues with the service functions that the Emscripten compiler implicitly includes and exports, as they are required to run the code:

0008d5 func[8] <emscripten_stack_init>:
 ;; 65536 bytes (64 KB, the size of one WebAssembly page) becomes the stack base (global 3)
 0008d6: 41 80 80 04                | i32.const 65536
 0008da: 24 03                      | global.set 3
 0008dc: 41 00                      | i32.const 0
 0008de: 41 0f                      | i32.const 15
 0008e0: 6a                         | i32.add
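 ;; Alignment mask 0xFFFFFFF0 (-16 as a signed i32), not a memory limit
 0008e1: 41 70                      | i32.const 4294967280
 ;; AND the sum with the mask: (0 + 15) & 0xFFFFFFF0 rounds 0 up to a multiple of 16
 0008e3: 71                         | i32.and
 ;; Set the stack end (global 2); the stack pointer itself lives in global 0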
 0008e4: 24 02                      | global.set 2
 ;; Exit function
 0008e6: 0b                         | end
;; Get size of free space left on the stack
0008e8 func[9] <emscripten_stack_get_free>:
 0008e9: 23 00                      | global.get 0
 0008eb: 23 02                      | global.get 2
 0008ed: 6b                         | i32.sub
 0008ee: 0b                         | end
;; Get address of beginning of the stack (when empty)
0008f0 func[10] <emscripten_stack_get_base>:
 0008f1: 23 03                      | global.get 3
 0008f3: 0b                         | end
;; Get stack address when full
0008f5 func[11] <emscripten_stack_get_end>:
 0008f6: 23 02                      | global.get 2
 0008f8: 0b                         | end

[... skipped...]

000a54 func[19] <stackSave>:
 000a55: 23 00                      | global.get 0
 000a57: 0b                         | end
000a59 func[20] <stackRestore>:
 000a5a: 20 00                      | local.get 0
 000a5c: 24 00                      | global.set 0
 000a5e: 0b                         | end
000a60 func[21] <stackAlloc>:
 000a61: 02 7f                      | local[0..1] type=i32
 000a63: 23 00                      | global.get 0
 000a65: 20 00                      | local.get 0
 000a67: 6b                         | i32.sub
 000a68: 41 70                      | i32.const 4294967280
 000a6a: 71                         | i32.and
 000a6b: 22 01                      | local.tee 1
 000a6d: 24 00                      | global.set 0
 000a6f: 20 01                      | local.get 1
 000a71: 0b                         | end
000a73 func[22] <emscripten_stack_get_current>:
 000a74: 23 00                      | global.get 0
 000a76: 0b                         | end

Next, let's have a look at the output of wasm2c, a tool from the WebAssembly Binary Toolkit (WABT) that takes a WASM file as input and produces equivalent C code with header files. As the complete code listing is extremely verbose, we will only look at the emscripten_stack_init function, whose bytecode was shown above:

void w2c__emscripten_stack_init_0(w2c_* instance) {
  FUNC_PROLOGUE;
  u32 var_i0, var_i1;
  var_i0 = 65536u;
  instance->w2c_g3 = var_i0;
  var_i0 = 0u;
  var_i1 = 15u;
  var_i0 += var_i1;
  var_i1 = 4294967280u;
  var_i0 &= var_i1;
  instance->w2c_g2 = var_i0;
  FUNC_EPILOGUE;
}
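
The constant 4294967280u in the C code corresponds to the single byte 70 that follows opcode 41 in the disassembly. WebAssembly encodes integer immediates as LEB128: i32.const takes a signed value, while counts such as the number of locals are unsigned. The following is a minimal decoder sketch that assumes well-formed input of at most five bytes; the function names decode_uleb128 and decode_sleb128 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value (used for counts and indices).
   Assumes well-formed input of at most 5 bytes. */
uint32_t decode_uleb128(const uint8_t *bytes, size_t *consumed) {
    uint32_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;
    do {
        byte = bytes[i++];
        result |= (uint32_t)(byte & 0x7f) << shift;  /* low 7 bits carry data */
        shift += 7;
    } while (byte & 0x80);                           /* high bit: more bytes follow */
    *consumed = i;
    return result;
}

/* Decode a signed LEB128 value (used by i32.const).
   Assumes well-formed input of at most 5 bytes. */
int32_t decode_sleb128(const uint8_t *bytes, size_t *consumed) {
    uint32_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;
    do {
        byte = bytes[i++];
        result |= (uint32_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    /* Sign-extend when bit 6 of the last byte is set. */
    if (shift < 32 && (byte & 0x40))
        result |= ~0u << shift;
    *consumed = i;
    return (int32_t)result;
}
```

Decoding the byte 70 as signed gives -16, which is 4294967280 when printed as an unsigned 32-bit number. The bytes 80 80 04 decode to 65536, and the bytes ba 01 from the locals declaration decode (unsigned) to 186.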
