Re: [Question] Can't wrap my head around interpreter vs. compiler vs. machine code

Original poster: snaketsai (さいでんし)   2018-05-16 14:21:27
Update:
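A quick little update before bed.
I threw together the code cache hashtable I mentioned in my reply.
The first call emits the code; on the second call the hashtable lookup succeeds, so the cached code is called directly.
The other change: the memory requested from mmap used to be readable, writable and executable; it is now readable and writable only,
and is switched to readable and executable afterwards via mprotect.
This keeps anyone from tampering with the JIT-emitted code in the code cache later on.
The patch is here: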
https://paste.plurk.com/show/2636452/
P.S. I'm not hosting this on GitHub because I don't want to expose my main ID.
Please bear with me.
= = = = = = = = = = = =
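Feeling a bit bloated after lunch.
I saw someone in the comments on another reply asking to see a JIT example,
so I wrote up a very rough version.
It took less than 15 minutes from start to finish,
so if the coding style looks awful and inefficient, that's to be expected; don't hit me QQ
(And it really did get reposted elsewhere, Orz
I'll come back and fix the code once I'm done with other things.
Real assembly code gen will be added when I find the time.)
The general idea: when you consume the VM's bytecode,
you translate it into native code for the host platform,
then stuff it into an executable memory region that you requested dynamically from the OS,
and then happily start executing it.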
The parts highlighted in yellow use APIs that differ from one operating system to another:
Linux, OSX, FreeBSD ... (most *NIX OSes):
mprotect(), mmap() with proper permissions.
Windows:
VirtualAlloc()
= = = = = = = = = = =
/**
*
* This piece of code is to demo simple VM/JIT knowledge
* and is released under 2-clause BSD license.
*
* Copyright 2018/05/16 snaketsai
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* U2FsdGVkX19w8Ikk1T7xBlbh4vDhIEvZzshUhXft6XMFugC9M27uV9LDszf7/8gP
* OtF2AZwYaUQqzLLY5vXhCQ==
*
**/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#ifdef _NOJIT
static inline int IAdd(int inp1, int inp2) {
    puts("Non-JIT version iadd.");
    return inp1 + inp2;
}
static inline int ISub(int inp1, int inp2) {
    puts("Non-JIT version isub.");
    return inp1 - inp2;
}
#else
// Nope, I'm not gonna write an assembler.
// Just pretend that we _magically_ get assembly pieces we need.
const unsigned char _add[] = \
"\x55\x48\x89\xe5\x89\x7d\xfc\x89\x75\xf8\x8b\x55\xfc\x8b\x45\xf8\x01\xd0\x5d\xc3\x00";
const unsigned char _sub[] = \
"\x55\x48\x89\xe5\x89\x7d\xfc\x89\x75\xf8\x8b\x45\xfc\x2b\x45\xf8\x5d\xc3\x00";
#define MAXCODECAHE_SIZE 4096
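// Request a readable, writable and executable anonymous mapping for the code cache.
void* AllocExeMem(size_t size) {
    void* memPtr = mmap(NULL, size,
                        PROT_READ | PROT_WRITE | PROT_EXEC,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // mmap signals failure with MAP_FAILED rather than NULL.
    if (memPtr == MAP_FAILED) {
        perror("mmap() failed");
        return NULL;
    }
    return memPtr;
}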
void *CodeCachePool;
int (*IAdd)(int, int);
int (*ISub)(int, int);
#endif
/**
Very Simple VM spec - -
Every insn. is 3 bytes long.
opcodes are listed below:
0x00 HCF, halt and catch fire.
0x01 IADD, humbly add two signed integers and print the result.
0x02 ISUB, humbly subtract two signed integers and print the result.
The bytecode stream should end with a 0xff byte mark.
**/
const unsigned char BytecodeStream[] = \
"\x01\x02\x03\x02\x05\x04\x00\x00\x00\xff";
int runVM(const unsigned char* bstream) {
    unsigned char insn[3];
    while (bstream[0] != 0xff) {
        // Fetch one 3-byte instruction: opcode, operand 1, operand 2.
        memcpy(insn, bstream, 3);
        switch (insn[0]) {
        case 0x00:
            puts("Dave, stop, will you ?");
            return 0;
        case 0x01:
#ifndef _NOJIT
            // emit code to code cache.
            memcpy(CodeCachePool, _add, sizeof(_add));
#endif
            printf("iadd: %d\n", IAdd((int)insn[1], (int)insn[2]));
            break;
        case 0x02:
#ifndef _NOJIT
            // emit code to code cache.
            memcpy(CodeCachePool, _sub, sizeof(_sub));
#endif
            printf("isub: %d\n", ISub((int)insn[1], (int)insn[2]));
            break;
        default:
            // Unrecognized insn. errno is not set here, so use fputs instead of perror.
            fputs("Sorry Dave, I can't do that.\n", stderr);
            return -1;
        }
        bstream += 3;
    }
    // Reached the 0xff end mark without hitting HCF.
    return 0;
}
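int main(int ac, char* av[]) {
    int ret = -1;
#ifndef _NOJIT
    CodeCachePool = AllocExeMem(MAXCODECAHE_SIZE);
    if (CodeCachePool == NULL) { return 1; }
    // Both pointers alias the start of the pool; runVM re-emits before every call.
    IAdd = CodeCachePool;
    ISub = CodeCachePool;
#endif
    ret = runVM(BytecodeStream);
    // These are plain status messages, not errno reports, so perror is not used.
    if (ret == 0) { puts("VM exited successfully."); }
    else { fputs("Unexpected error occurred.\n", stderr); }
    return ret == 0 ? 0 : 1;
}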
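Author: cklppt (依旧如此创新未来)   2018-05-16 18:51:00
All I saw was HAL (covers face) lol
Author: LinuxKernel (Linus Torvalds)   2018-05-16 19:56:00
113 is just awesome, but nobody is commenting QQ
Author: soheadsome (师大狗鼻哥)   2018-05-16 19:58:00
Awesome
Author: weiyucsie (选择那刻 才算开始)   2018-05-16 20:23:00
So IAdd and ISub have a puts in them, no wonder the machine code is so long
https://nullprogram.com/blog/2015/03/19/
I was thinking, add and sub each have their own instruction, so how can it be that long XD
This post does point out the executable-memory part,
but without codegen I'm not sure it can really be called just-in-time compilation.
Oh, I just thought of it: the x86 calling convention pushes things onto the stack,
so maybe that is what makes it more complicated.
Author: pptsodog (天桥下说书)   2018-05-17 05:09:00
Author: weiyucsie (选择那刻 才算开始)   2018-05-17 17:11:00
I looked at what -O0 produces, and it really does seem to have a front part and a back part.
But with optimization flags on, only a few lines are left.
Those are probably the prologue and epilogue you mentioned.
Original poster: snaketsai (さいでんし)   2018-05-17 18:58:00
For add, gcc -O3 optimizes it down to just a lea followed by retq XD
Actually I'm currently looking at how to model this on tcg's codegen.
In the backend, tcg is only responsible for generating the middle part.
The actual vcpu, in cpu-exec, uses a macro to deliberately replace the entry with a platform-dependent prologue, then jumps into the code cache, and calls the epilogue on the way out.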
