
digitalmars.D - Tiny D suitable for embedded JIT

reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
Now that D has a -betterC option I was wondering if it is 
possible to create a small subset of D that can be used as an 
embedded JIT library. I would like to trim the language to a 
small subset of D/C - only primitive types and pointers - and 
remove everything else. The idea is to have a high-level assembly 
language that is suitable for use as a JIT backend by other 
projects. I wanted to know if this is a feasible project - using 
DMD as the starting point. Should I even think about trying to do 
this?

The ultimate goal is to have a JIT library that is small, compiles 
fast, and generates reasonable code (i.e. uses some form of 
global register allocation). The options I am looking at are a) 
start from scratch, b) hack LLVM, or c) hack DMD.

Regards
Dibyendu
May 23 2018
next sibling parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Wednesday, 23 May 2018 at 18:49:05 UTC, Dibyendu Majumdar 
wrote:
 Now that D has a better C option I was wondering if it is 
 possible to create a small subset of D that can be used as 
 embedded JIT library. I would like to trim the language to a 
 small subset of D/C - only primitive types and pointers - and 
 remove everything else. The idea is to have a high level 
 assembly language that is suitable for use as JIT backend by 
 other projects. I wanted to know if this is a feasible project 
 - using DMD as the starting point. Should I even think about 
 trying to do this?

 The ultimate goal is to have JIT library that is small, has 
 fast compilation, and generates reasonable code (i.e. some form 
 of global register allocation). The options I am looking at are 
 a) start from scratch, b) hack LLVM, or c) hack DMD.

 Regards
 Dibyendu
I've recently been looking into how QEMU works, and it uses something called TCG (Tiny Code Generator). QEMU works by taking code from another platform/CPU and translating it to TCG, which then gets "jitted" to the instructions for the host.

From what I understand, TCG is fairly small. I think it aims to be simple rather than highly optimized, unlike LLVM, which allows more complexity for the sake of performance.

TCG: https://git.qemu.org/?p=qemu.git;a=blob_plain;f=tcg/README;hb=HEAD
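
For readers who have not seen this kind of pipeline, here is a minimal sketch of the two-stage shape described above: guest instruction to intermediate ops, then intermediate ops to host code. It is only an illustration. The instruction names, the tuple format and the functions `guest_to_ir`, `ir_to_host` and `translate` are all made up for this example and are not QEMU's API or TCG's real op set.

```python
# Hypothetical two-stage translator: guest instruction -> IR ops -> host text.
# None of these names come from QEMU; they only mirror the shape of the pipeline.

def guest_to_ir(guest_insn):
    """Lower one toy guest instruction (a tuple) to a list of IR ops."""
    op, *args = guest_insn
    if op == "ADDI":                      # ADDI rd, rs, imm
        rd, rs, imm = args
        return [("movi", "t0", imm), ("add", rd, rs, "t0")]
    if op == "MOV":                       # MOV rd, rs
        rd, rs = args
        return [("mov", rd, rs)]
    raise ValueError(f"unknown guest instruction: {op}")


def ir_to_host(ir_op):
    """Emit one line of pseudo host assembly for a single IR op."""
    name, *a = ir_op
    if name in ("movi", "mov"):
        return f"mov {a[0]}, {a[1]}"
    if name == "add":
        # Pretend the host has two-operand adds: dst = src1, then dst += src2.
        return f"mov {a[0]}, {a[1]}; add {a[0]}, {a[2]}"
    raise ValueError(f"unknown IR op: {name}")


def translate(block):
    """Translate a list of guest instructions into pseudo host assembly lines."""
    return [ir_to_host(op) for insn in block for op in guest_to_ir(insn)]
```

The point of the middle layer is that each guest front-end only has to know the IR, and each host back-end only has to know the IR, which is one reason such a generator can stay small.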
May 23 2018
parent Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Wednesday, 23 May 2018 at 20:08:53 UTC, Jonathan Marler wrote:
 I've recently been looking into how QEMU works and it uses 
 something called TCG (Tiny Code Generator).  QEMU works by 
 taking code from another platform/cpu and translates it to TCG, 
 which then gets "jitted" to the instructions for the host.

 From what I understand, TCG is fairly small.  I think it aims 
 to be simple rather than highly optimized, unlike LLVM which 
 allows more complexity for the sake of performance.

 TCG: 
 https://git.qemu.org/?p=qemu.git;a=blob_plain;f=tcg/README;hb=HEAD
Thank you for pointing me to this - I wasn't aware of it. I already use something similar, NanoJIT, a slightly more complex product that supports floating point too. However, to my knowledge most of these products do register allocation locally within a basic block, and spill registers when jumping across blocks. This basically results in unacceptable performance in any code that has branching or loops. I could enhance NanoJIT, but it's written in a way that makes changes difficult (i.e. too many low-level optimizations in the code).

It seems there is a lack of something in between LLVM and these implementations - either you get all the powerful optimizations or you get very little ... my intention is to create something that is small but also has at least some form of global (actually per-function) register allocator.

I thought of hacking DMD as it favours speed of compilation and simplicity - but what I am not sure about is how easy / difficult it would be to modify DMD (mostly remove stuff).

Regards
Dibyendu
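
As an illustration of what "per-function" allocation means in practice, below is a minimal sketch of linear-scan register allocation over live intervals. Linear scan is swapped in here only because it is one of the simplest allocators that works across basic blocks; the post does not name an algorithm, and this is not taken from NanoJIT, DMD or any other project mentioned in the thread. The function name `linear_scan` and the interval format are assumptions made for the example.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to values whose live ranges span a whole function.

    intervals: dict mapping a value name to (start, end) positions, numbered
               over the entire function rather than per basic block.
    Returns a dict mapping each name to a register index or the string "spill".
    """
    free = set(range(num_regs))
    active = []        # (end, name) pairs currently holding a register
    assignment = {}

    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers of intervals that ended before this one starts.
        for old_end, old_name in active:
            if old_end < start:
                free.add(assignment[old_name])
        active = [(e, n) for e, n in active if e >= start]

        if free:
            reg = min(free)
            free.remove(reg)
            assignment[name] = reg
            active.append((end, name))
        elif not active or max(active)[0] <= end:
            # No register holder lives longer than this value: spill it.
            assignment[name] = "spill"
        else:
            # Spill the holder that lives longest and take over its register.
            victim_end, victim = max(active)
            assignment[name] = assignment[victim]
            assignment[victim] = "spill"
            active.remove((victim_end, victim))
            active.append((end, name))
    return assignment
```

With three registers, a loop counter and an accumulator that are live for the whole function keep their registers while short-lived temporaries share the third one. With a block-local allocator, the same two values would be written back to memory at every branch.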
May 23 2018
prev sibling next sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Wednesday, 23 May 2018 at 18:49:05 UTC, Dibyendu Majumdar 
wrote:
 Now that D has a better C option I was wondering if it is 
 possible to create a small subset of D that can be used as 
 embedded JIT library. I would like to trim the language to a 
 small subset of D/C - only primitive types and pointers - and 
 remove everything else. The idea is to have a high level 
 assembly language that is suitable for use as JIT backend by 
 other projects. I wanted to know if this is a feasible project 
 - using DMD as the starting point. Should I even think about 
 trying to do this?

 The ultimate goal is to have JIT library that is small, has 
 fast compilation, and generates reasonable code (i.e. some form 
 of global register allocation). The options I am looking at are 
 a) start from scratch, b) hack LLVM, or c) hack DMD.

 Regards
 Dibyendu
I don't know if this does exactly what you want, but have you seen it?

https://forum.dlang.org/thread/bskpxhrqyfkvaqzoospx@forum.dlang.org
May 23 2018
parent Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Thursday, 24 May 2018 at 02:39:18 UTC, Joakim wrote:
 I don't know if this does exactly what you want, but have you 
 seen it?

 https://forum.dlang.org/thread/bskpxhrqyfkvaqzoospx@forum.dlang.org
Hi - thanks, I hadn't seen it. It is based on LLVM - I already use LLVM and it isn't a small or fast-compiling JIT engine.

Regards
May 24 2018
prev sibling next sibling parent reply Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Wednesday, 23 May 2018 at 18:49:05 UTC, Dibyendu Majumdar 
wrote:
 The ultimate goal is to have JIT library that is small, has 
 fast compilation, and generates reasonable code (i.e. some form 
 of global register allocation). The options I am looking at are 
 a) start from scratch, b) hack LLVM, or c) hack DMD.
I have been looking at DMD code (mainly the backend stuff) for this ... I think it will be too difficult for me to try to modify it :-(

Regards
Dibyendu
May 24 2018
parent reply Jonathan Marler <johnnymarler gmail.com> writes:
On Thursday, 24 May 2018 at 20:22:15 UTC, Dibyendu Majumdar wrote:
 On Wednesday, 23 May 2018 at 18:49:05 UTC, Dibyendu Majumdar 
 wrote:
 The ultimate goal is to have JIT library that is small, has 
 fast compilation, and generates reasonable code (i.e. some 
 form of global register allocation). The options I am looking 
 at are a) start from scratch, b) hack LLVM, or c) hack DMD.
 I have been looking at DMD code (mainly the backend stuff) for 
 this ... I think it will be too difficult for me to try to 
 modify it :-(

 Regards
 Dibyendu
Sad to hear. I was interested to see if this was feasible. I don't have much experience with the backend, but if you're still up for the task, take a look at `dmd/glue.d`. I don't know how much of the glue layer this includes, but it would be a good start. DMD does have a common "glue layer" shared by DMD, LDC and GDC, so you'd basically need to find the API to build this glue layer, and that's what you would use.

https://github.com/dlang/dmd/blob/master/src/dmd/glue.d
May 24 2018
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Thursday, 24 May 2018 at 22:14:50 UTC, Jonathan Marler wrote:
 Sad to hear. Was interested to see if this was feasible.  I 
 don't have much experience with the backend but if you're still 
 up for the task, take a look at `dmd/glue.d`.  I don't know how 
 much of the glue layer this includes but it would be a good 
 start.  DMD does have a common "glue layer" shared by DMD, LDC 
 and GDC, so you'd basically need to find the API to build this 
 glue layer and that's what you would use.

 https://github.com/dlang/dmd/blob/master/src/dmd/glue.d
Hi - not really as I don't know what this does. In any case my understanding is the interface between the front-end and GDC/LDC is at the level of ASTs. Regards
May 29 2018
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 30/05/2018 11:59 AM, Dibyendu Majumdar wrote:
 On Thursday, 24 May 2018 at 22:14:50 UTC, Jonathan Marler wrote:
 Sad to hear. Was interested to see if this was feasible.  I don't have 
 much experience with the backend but if you're still up for the task, 
 take a look at `dmd/glue.d`.  I don't know how much of the glue layer 
 this includes but it would be a good start.  DMD does have a common 
 "glue layer" shared by DMD, LDC and GDC, so you'd basically need to 
 find the API to build this glue layer and that's what you would use.

 https://github.com/dlang/dmd/blob/master/src/dmd/glue.d
 Hi - not really as I don't know what this does. In any case my 
 understanding is the interface between the front-end and GDC/LDC 
 is at the level of ASTs.

 Regards
The input is the AST, the output to the backend is some form of IR in essence.

It just maps one understanding of the code to another form, that's all.
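
To make "maps one understanding of the code to another form" concrete, here is a minimal sketch of lowering a small expression AST into a flat three-address IR. It does not reflect DMD's actual data structures or the contents of `glue.d`; the nested-tuple AST, the IR tuple layout and the function name `lower` are assumptions chosen for the example.

```python
import itertools


def lower(ast):
    """Lower a nested-tuple expression AST to three-address IR.

    An AST node is either a leaf (int literal or variable name) or a tuple
    (op, lhs, rhs). Each IR instruction is (op, dest, src1, src2).
    Returns (ir_list, name_of_result).
    """
    ir = []
    counter = itertools.count()

    def visit(node):
        if isinstance(node, (int, str)):   # leaf: already usable as an operand
            return node
        op, lhs, rhs = node
        a = visit(lhs)                     # operands are lowered first ...
        b = visit(rhs)
        tmp = f"t{next(counter)}"          # ... then the result gets a fresh temp
        ir.append((op, tmp, a, b))
        return tmp

    result = visit(ast)
    return ir, result
```

For the expression `x + y * 2`, written as `("+", "x", ("*", "y", 2))`, this yields a multiply into `t0` followed by an add into `t1`. The tree structure is gone and only a linear instruction list remains for a backend to consume.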
May 29 2018
parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Wednesday, 30 May 2018 at 00:05:52 UTC, rikki cattermole wrote:
 https://github.com/dlang/dmd/blob/master/src/dmd/glue.d
 Hi - not really as I don't know what this does. In any case my 
 understanding is the interface between the front-end and GDC/LDC 
 is at the level of ASTs.
 The input is the AST, the output to the backend is some form of 
 IR in essence.

 It just maps one understanding of the code to another form, 
 that's all.
Okay - I was trying to understand if there was some sort of IR that is an intermediate stage before codegen, but I couldn't see this. Also the code is quite hard to follow without enough documentation of what's going on. Plus there is lots of global state, I think, which is fine for a command-line tool but not for a JIT engine. But really I looked only for a short while, so please correct me if I am wrong.

I decided to use a cut-down version of Eclipse OMR - the backend is much smaller than LLVM, although not as small as I would like. But I hope to create a more trimmed version in due course (https://github.com/dibyendumajumdar/nj).

Regards
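
The concern about global state can be shown with a small sketch. A command-line compiler runs once and exits, so module-level counters and tables are harmless. A JIT embedded in a long-running host may compile many functions, possibly interleaved, so each compilation needs its own state. The class below is purely hypothetical (`CompilerContext`, `new_temp` and `emit_add` are invented names) and says nothing about how DMD or OMR are actually organised.

```python
class CompilerContext:
    """Holds all state for one compilation, instead of module-level globals."""

    def __init__(self):
        self._next_temp = 0   # per-context counter, not shared between compilations
        self.code = []        # instructions emitted by this compilation only

    def new_temp(self):
        """Return a fresh temporary name that is unique within this context."""
        name = f"t{self._next_temp}"
        self._next_temp += 1
        return name

    def emit_add(self, a, b):
        """Emit a pseudo 'add' instruction and return the name of its result."""
        dst = self.new_temp()
        self.code.append(f"{dst} = {a} + {b}")
        return dst
```

Two contexts created side by side each start numbering at `t0` and never see each other's instructions. A design built on global counters cannot guarantee this without resetting them between compilations.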
May 31 2018
parent reply a11e99z <black80 bk.ru> writes:
On Thursday, 31 May 2018 at 19:16:28 UTC, Dibyendu Majumdar wrote:
 On Wednesday, 30 May 2018 at 00:05:52 UTC, rikki cattermole 
 wrote:
 https://github.com/dlang/dmd/blob/master/src/dmd/glue.d
 Hi - not really as I don't know what this does. In any case my 
 understanding is the interface between the front-end and GDC/LDC 
 is at the level of ASTs.
 The input is the AST, the output to the backend is some form 
 of IR in essence.

 It just maps one understanding of the code to another form, 
 that's all.
 Okay - I was trying to understand if there was some sort of IR 
 that is intermediate stage before codegen - but I couldn't see 
 this. Also the code is quite hard to follow without enough 
 documentation of what's going on. Plus lots of global state I 
 think which is fine for a command line tool but not a JIT engine. 
 But really I looked only for a short while so please correct me 
 if I am wrong.

 I decided to use a cut-down version of Eclipse OMR - the backend 
 is much smaller than LLVM, although not as small as I would 
 like. But I hope to create a more trimmed version in due course. 
 (https://github.com/dibyendumajumdar/nj)

 Regards
Just saw your post and I have some questions:
- Does OMR support value types, or only reference types as used by the JVM (all classes) and Lua (all tables)?
- (There is too little information about OMR.) Does OMR implement different types of GC itself, or just support them?
- Which library do you feel more comfortable working with: LLVM or OMR?
- Which library optimizes best? (Probably LLVM.)

Also see the Terra project for Lua: http://terralang.org/
It is probably more useful than Ravi, because Ravi tries to optimize Lua and Ravi at the same time, whereas Terra optimizes only the Terra parts, with performance comparable to the best BLAS/ATLAS libraries (see the PDFs about it). For now Terra looks abandoned, but probably because there is nothing left to add to it.
Jul 24 2019
parent Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Wednesday, 24 July 2019 at 10:11:37 UTC, a11e99z wrote:
 Just saw your post and I have some questions:
 - Does OMR support value types, or only reference types as used 
 by the JVM (all classes) and Lua (all tables)?
The OMR JIT engine only knows about primitive types and an array type. Classes etc are done by the Java front-end.
 - (There is too little information about OMR.) Does OMR 
 implement different types of GC itself, or just support them?
I have no experience with the GC part unfortunately.
 - Which library do you feel more comfortable working with: LLVM or OMR?
LLVM has a mature API. OMR's API is still being defined. OMR's JIT is also not very well tested with C-like languages, where stack values can be aliased. Java doesn't allow that, so OMR by default assumes that this type of aliasing doesn't happen. I have some pending pull requests to enable such aliasing to be detected.
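
To show why the no-aliasing assumption matters, here is a small sketch of a store-to-load forwarding pass over a toy IR. It is not OMR's optimizer or IR. The op names (`store`, `addr`, `store_ptr`, `load`, `const`) and the function `forward_stores` are invented for illustration. When the pass assumes locals are never aliased, it forwards a stale constant past a write made through a pointer.

```python
def forward_stores(ir, assume_no_alias):
    """Replace loads of a local with the constant last stored to it.

    Toy IR ops:
      ("store", var, value)      var = value
      ("addr", ptr, var)         ptr = &var
      ("store_ptr", ptr, value)  *ptr = value
      ("load", dst, var)         dst = var
    """
    known = {}       # local -> constant most recently stored directly
    escaped = set()  # locals whose address has been taken
    out = []
    for ins in ir:
        kind = ins[0]
        if kind == "store":
            known[ins[1]] = ins[2]
            out.append(ins)
        elif kind == "addr":
            escaped.add(ins[2])
            out.append(ins)
        elif kind == "store_ptr":
            if not assume_no_alias:
                # Conservative: the pointer may target any address-taken local.
                for var in escaped:
                    known.pop(var, None)
            out.append(ins)
        elif kind == "load" and ins[2] in known:
            out.append(("const", ins[1], known[ins[2]]))
        else:
            out.append(ins)
    return out
```

For the sequence `x = 1; p = &x; *p = 2; r = x`, the no-alias variant rewrites the final load to the constant 1, which is wrong for C-like code. The alias-aware variant leaves the load in place. In Java a local's address cannot be taken, so the first variant would be safe there.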
 - Which library optimizes best? (Probably LLVM.)
LLVM was better in my tests. LLVM is obviously being used to optimize all sorts of code, whereas OMR's optimizer is used in Java primarily. The challenges are somewhat different I think. For example, OMR is not very sophisticated when it comes to optimizing floating point operations as this is not so important in the Java world. Having said that OMR is part of a bigger compiler framework at IBM so maybe it has/had features that are not being used.
 Also see the Terra project for Lua: http://terralang.org/
 It is probably more useful than Ravi, because Ravi tries to 
 optimize Lua and Ravi at the same time, whereas Terra optimizes 
 only the Terra parts, with performance comparable to the best 
 BLAS/ATLAS libraries (see the PDFs about it). For now Terra looks 
 abandoned, but probably because there is nothing left to add.
Well, my opinion is that Terra is a redundant language. It tries to implement a C-like language with Lua-like syntax. But LuaJIT can already generate machine code and interfaces with C easily. And if you need C then use C or Rust or even D ;-)

Sorry, this may be OT now.
Jul 26 2019
prev sibling parent reply MrSmith <mrsmith33 yandex.ru> writes:
On Wednesday, 23 May 2018 at 18:49:05 UTC, Dibyendu Majumdar 
wrote:
 Now that D has a better C option I was wondering if it is 
 possible to create a small subset of D that can be used as 
 embedded JIT library. I would like to trim the language to a 
 small subset of D/C - only primitive types and pointers - and 
 remove everything else. The idea is to have a high level 
 assembly language that is suitable for use as JIT backend by 
 other projects. I wanted to know if this is a feasible project 
 - using DMD as the starting point. Should I even think about 
 trying to do this?

 The ultimate goal is to have JIT library that is small, has 
 fast compilation, and generates reasonable code (i.e. some form 
 of global register allocation). The options I am looking at are 
 a) start from scratch, b) hack LLVM, or c) hack DMD.

 Regards
 Dibyendu
You may like the compiler project I am working on:
https://github.com/MrSmith33/tiny_jit
TL;DR: written fully in D. No dependencies. Currently targets amd64 with the Win64 calling convention.

P.S. Sorry for the late response.
May 28 2018
prev sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Monday, 28 May 2018 at 19:24:58 UTC, MrSmith wrote:

 You may like the project of a compiler I am doing 
 https://github.com/MrSmith33/tiny_jit
 TLDR: fully in D. No dependencies. Currently for amd64 + Win64 
 calling convention.
Cool - I will keep an eye on it. Regards
May 29 2018