www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Labels as values and threaded-code interpretation

reply =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
Hi,

I'm sure this has been brought up before, but I feel I need to bring it 
up again (because I'm going to be writing a threaded-code interpreter): 
http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

This is an incredibly important extension. The final switch statement is 
not a replacement because it doesn't allow the programmer to store a 
label address directly into a code stream, which is what's essential to 
write a threaded-code interpreter.

The Erlang folks went through hell just to use this feature; see the 5th 
Q at: 
http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions

The idea is to be able to write code like this:

----

import std.algorithm;

enum Op : ubyte
{
     imm,
     add,
     sub,
     // ...
     ret,
}

final class Insn
{
     Op op;
     size_t[] args;
     void* lbl;
     Insn next;
}

final class State
{
     Insn pc;
     size_t[64] regs;
}

size_t interp(Insn[] code)
{
     // Set up the instruction stream with label addresses
     // the first time that it is executed. Label addresses
     // are stable, so we only do this once.

     foreach (insn; code.filter!(x => !x.lbl)())
     {
         void* lbl;

         with (Op)
         {
             final switch (insn.op)
             {
                 case imm: lbl = &&handle_imm; break;
                 case add: lbl = &&handle_add; break;
                 case sub: lbl = &&handle_sub; break;
                 // ...
                 case ret: lbl = &&handle_ret; break;
             }
         }

         insn.lbl = lbl;
     }

     // Start interpreting the entry instruction.

     auto state = new State;
     state.pc = code[0];
     goto *state.pc.lbl;

     // Interpreter logic follows...

     // The whole point is to avoid unnecessary function
     // calls, so we use a mixin template for this stuff.
     enum advance = "state.pc = state.pc.next;" ~
                    "goto *state.pc.lbl;";

     handle_imm:
     {
         state.regs[state.pc.args[0]] = state.pc.args[1];
         mixin(advance);
     }

     handle_add:
     {
         state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + 
state.regs[state.pc.args[2]];
         mixin(advance);
     }

     handle_sub:
     {
         state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - 
state.regs[state.pc.args[2]];
         mixin(advance);
     }

     // ...

     handle_ret:
     {
         return state.regs[state.pc.args[0]];
     }

     assert(false);
}

----

Notice that there are no function calls going on. Just plain jumps all 
the way through. This is a technique that many real world interpreters 
use to get significant speedups, and I for one think we desperately need it.

Implementing it should be trivial as both LLVM and GCC support taking 
the address of a block. I'm sure the DMD back end could be extended to 
support it too.

-- 
Alex Rnne Petersen
alex alexrp.com / alex lycus.org
http://alexrp.com / http://lycus.org
May 31 2013
next sibling parent reply Manu <turkeyman gmail.com> writes:
--047d7b471f6af5989104de112e1d
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

GCC has a non-standard extension to do this, I've used it to get great
performance wins writing emulators. I love this feature, love to see it in
D!
On 1 Jun 2013 15:30, "Alex R=C3=B8nne Petersen" <alex lycus.org> wrote:

 Hi,

 I'm sure this has been brought up before, but I feel I need to bring it u=

 again (because I'm going to be writing a threaded-code interpreter):
 http://gcc.gnu.org/onlinedocs/**gcc/Labels-as-Values.html<http://gcc.gnu.=

 This is an incredibly important extension. The final switch statement is
 not a replacement because it doesn't allow the programmer to store a labe=

 address directly into a code stream, which is what's essential to write a
 threaded-code interpreter.

 The Erlang folks went through hell just to use this feature; see the 5th =

 at: http://www.erlang.org/doc/**installation_guide/INSTALL-**
 WIN32.html#Frequently-Asked-**Questions<http://www.erlang.org/doc/install=

 The idea is to be able to write code like this:

 ----

 import std.algorithm;

 enum Op : ubyte
 {
     imm,
     add,
     sub,
     // ...
     ret,
 }

 final class Insn
 {
     Op op;
     size_t[] args;
     void* lbl;
     Insn next;
 }

 final class State
 {
     Insn pc;
     size_t[64] regs;
 }

 size_t interp(Insn[] code)
 {
     // Set up the instruction stream with label addresses
     // the first time that it is executed. Label addresses
     // are stable, so we only do this once.

     foreach (insn; code.filter!(x =3D> !x.lbl)())
     {
         void* lbl;

         with (Op)
         {
             final switch (insn.op)
             {
                 case imm: lbl =3D &&handle_imm; break;
                 case add: lbl =3D &&handle_add; break;
                 case sub: lbl =3D &&handle_sub; break;
                 // ...
                 case ret: lbl =3D &&handle_ret; break;
             }
         }

         insn.lbl =3D lbl;
     }

     // Start interpreting the entry instruction.

     auto state =3D new State;
     state.pc =3D code[0];
     goto *state.pc.lbl;

     // Interpreter logic follows...

     // The whole point is to avoid unnecessary function
     // calls, so we use a mixin template for this stuff.
     enum advance =3D "state.pc =3D state.pc.next;" ~
                    "goto *state.pc.lbl;";

     handle_imm:
     {
         state.regs[state.pc.args[0]] =3D state.pc.args[1];
         mixin(advance);
     }

     handle_add:
     {
         state.regs[state.pc.args[0]] =3D state.regs[state.pc.args[1]] +
 state.regs[state.pc.args[2]];
         mixin(advance);
     }

     handle_sub:
     {
         state.regs[state.pc.args[0]] =3D state.regs[state.pc.args[1]] -
 state.regs[state.pc.args[2]];
         mixin(advance);
     }

     // ...

     handle_ret:
     {
         return state.regs[state.pc.args[0]];
     }

     assert(false);
 }

 ----

 Notice that there are no function calls going on. Just plain jumps all th=

 way through. This is a technique that many real world interpreters use to
 get significant speedups, and I for one think we desperately need it.

 Implementing it should be trivial as both LLVM and GCC support taking the
 address of a block. I'm sure the DMD back end could be extended to suppor=

 it too.

 --
 Alex R=C3=B8nne Petersen
 alex alexrp.com / alex lycus.org
 http://alexrp.com / http://lycus.org

--047d7b471f6af5989104de112e1d Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable <p dir=3D"ltr">GCC has a non-standard extension to do this, I&#39;ve used i= t to get great performance wins writing emulators. I love this feature, lov= e to see it in D!</p> <div class=3D"gmail_quote">On 1 Jun 2013 15:30, &quot;Alex R=C3=B8nne Peter= sen&quot; &lt;<a href=3D"mailto:alex lycus.org">alex lycus.org</a>&gt; wrot= e:<br type=3D"attribution"><blockquote class=3D"gmail_quote" style=3D"margi= n:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Hi,<br> <br> I&#39;m sure this has been brought up before, but I feel I need to bring it= up again (because I&#39;m going to be writing a threaded-code interpreter)= : <a href=3D"http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html" targe= t=3D"_blank">http://gcc.gnu.org/onlinedocs/<u></u>gcc/Labels-as-Values.html= </a><br> <br> This is an incredibly important extension. The final switch statement is no= t a replacement because it doesn&#39;t allow the programmer to store a labe= l address directly into a code stream, which is what&#39;s essential to wri= te a threaded-code interpreter.<br> <br> The Erlang folks went through hell just to use this feature; see the 5th Q = at: <a href=3D"http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.h= tml#Frequently-Asked-Questions" target=3D"_blank">http://www.erlang.org/doc= /<u></u>installation_guide/INSTALL-<u></u>WIN32.html#Frequently-Asked-<u></= u>Questions</a><br> <br> The idea is to be able to write code like this:<br> <br> ----<br> <br> import std.algorithm;<br> <br> enum Op : ubyte<br> {<br> =C2=A0 =C2=A0 imm,<br> =C2=A0 =C2=A0 add,<br> =C2=A0 =C2=A0 sub,<br> =C2=A0 =C2=A0 // ...<br> =C2=A0 =C2=A0 ret,<br> }<br> <br> final class Insn<br> {<br> =C2=A0 =C2=A0 Op op;<br> =C2=A0 =C2=A0 size_t[] args;<br> =C2=A0 =C2=A0 void* lbl;<br> =C2=A0 =C2=A0 Insn next;<br> }<br> <br> final class State<br> {<br> =C2=A0 =C2=A0 Insn pc;<br> =C2=A0 =C2=A0 size_t[64] regs;<br> }<br> <br> size_t interp(Insn[] code)<br> {<br> =C2=A0 =C2=A0 // Set up the instruction stream with label addresses<br> =C2=A0 =C2=A0 // the first time that it is executed. Label addresses<br> =C2=A0 =C2=A0 // are stable, so we only do this once.<br> <br> =C2=A0 =C2=A0 foreach (insn; code.filter!(x =3D&gt; !x.lbl)())<br> =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 void* lbl;<br> <br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 with (Op)<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 final switch (insn.op)<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case imm: lbl =3D &= amp;&amp;handle_imm; break;<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case add: lbl =3D &= amp;&amp;handle_add; break;<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case sub: lbl =3D &= amp;&amp;handle_sub; break;<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 // ...<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case ret: lbl =3D &= amp;&amp;handle_ret; break;<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 insn.lbl =3D lbl;<br> =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 // Start interpreting the entry instruction.<br> <br> =C2=A0 =C2=A0 auto state =3D new State;<br> =C2=A0 =C2=A0 state.pc =3D code[0];<br> =C2=A0 =C2=A0 goto *state.pc.lbl;<br> <br> =C2=A0 =C2=A0 // Interpreter logic follows...<br> <br> =C2=A0 =C2=A0 // The whole point is to avoid unnecessary function<br> =C2=A0 =C2=A0 // calls, so we use a mixin template for this stuff.<br> =C2=A0 =C2=A0 enum advance =3D &quot;state.pc =3D state.pc.next;&quot; ~<br=

goto *state.pc.lbl;&quot;;<br> <br> =C2=A0 =C2=A0 handle_imm:<br> =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 state.regs[state.pc.args[0]] =3D state.pc.args[= 1];<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 mixin(advance);<br> =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 handle_add:<br> =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 state.regs[state.pc.args[0]] =3D state.regs[sta= te.pc.args[1]] + state.regs[state.pc.args[2]];<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 mixin(advance);<br> =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 handle_sub:<br> =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 state.regs[state.pc.args[0]] =3D state.regs[sta= te.pc.args[1]] - state.regs[state.pc.args[2]];<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 mixin(advance);<br> =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 // ...<br> <br> =C2=A0 =C2=A0 handle_ret:<br> =C2=A0 =C2=A0 {<br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 return state.regs[state.pc.args[0]];<br> =C2=A0 =C2=A0 }<br> <br> =C2=A0 =C2=A0 assert(false);<br> }<br> <br> ----<br> <br> Notice that there are no function calls going on. Just plain jumps all the = way through. This is a technique that many real world interpreters use to g= et significant speedups, and I for one think we desperately need it.<br> <br> Implementing it should be trivial as both LLVM and GCC support taking the a= ddress of a block. I&#39;m sure the DMD back end could be extended to suppo= rt it too.<br> <br> -- <br> Alex R=C3=B8nne Petersen<br> <a href=3D"mailto:alex alexrp.com" target=3D"_blank">alex alexrp.com</a> / = <a href=3D"mailto:alex lycus.org" target=3D"_blank">alex lycus.org</a><br> <a href=3D"http://alexrp.com" target=3D"_blank">http://alexrp.com</a> / <a = href=3D"http://lycus.org" target=3D"_blank">http://lycus.org</a><br> </blockquote></div> --047d7b471f6af5989104de112e1d--
May 31 2013
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 01-06-2013 07:41, Manu wrote:
 GCC has a non-standard extension to do this, I've used it to get great
 performance wins writing emulators. I love this feature, love to see it
 in D!

Yes, it's basically essential for high-perf interpreters. It's a feature that a systems language like D must have.
 On 1 Jun 2013 15:30, "Alex Rønne Petersen" <alex lycus.org
 <mailto:alex lycus.org>> wrote:

     Hi,

     I'm sure this has been brought up before, but I feel I need to bring
     it up again (because I'm going to be writing a threaded-code
     interpreter):
     http://gcc.gnu.org/onlinedocs/__gcc/Labels-as-Values.html
     <http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html>

     This is an incredibly important extension. The final switch
     statement is not a replacement because it doesn't allow the
     programmer to store a label address directly into a code stream,
     which is what's essential to write a threaded-code interpreter.

     The Erlang folks went through hell just to use this feature; see the
     5th Q at:
     http://www.erlang.org/doc/__installation_guide/INSTALL-__WIN32.html#Frequently-Asked-__Questions
     <http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions>

     The idea is to be able to write code like this:

     ----

     import std.algorithm;

     enum Op : ubyte
     {
          imm,
          add,
          sub,
          // ...
          ret,
     }

     final class Insn
     {
          Op op;
          size_t[] args;
          void* lbl;
          Insn next;
     }

     final class State
     {
          Insn pc;
          size_t[64] regs;
     }

     size_t interp(Insn[] code)
     {
          // Set up the instruction stream with label addresses
          // the first time that it is executed. Label addresses
          // are stable, so we only do this once.

          foreach (insn; code.filter!(x => !x.lbl)())
          {
              void* lbl;

              with (Op)
              {
                  final switch (insn.op)
                  {
                      case imm: lbl = &&handle_imm; break;
                      case add: lbl = &&handle_add; break;
                      case sub: lbl = &&handle_sub; break;
                      // ...
                      case ret: lbl = &&handle_ret; break;
                  }
              }

              insn.lbl = lbl;
          }

          // Start interpreting the entry instruction.

          auto state = new State;
          state.pc = code[0];
          goto *state.pc.lbl;

          // Interpreter logic follows...

          // The whole point is to avoid unnecessary function
          // calls, so we use a mixin template for this stuff.
          enum advance = "state.pc = state.pc.next;" ~
                         "goto *state.pc.lbl;";

          handle_imm:
          {
              state.regs[state.pc.args[0]] = state.pc.args[1];
              mixin(advance);
          }

          handle_add:
          {
              state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]]
     + state.regs[state.pc.args[2]];
              mixin(advance);
          }

          handle_sub:
          {
              state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]]
     - state.regs[state.pc.args[2]];
              mixin(advance);
          }

          // ...

          handle_ret:
          {
              return state.regs[state.pc.args[0]];
          }

          assert(false);
     }

     ----

     Notice that there are no function calls going on. Just plain jumps
     all the way through. This is a technique that many real world
     interpreters use to get significant speedups, and I for one think we
     desperately need it.

     Implementing it should be trivial as both LLVM and GCC support
     taking the address of a block. I'm sure the DMD back end could be
     extended to support it too.

     --
     Alex Rønne Petersen
     alex alexrp.com <mailto:alex alexrp.com> / alex lycus.org
     <mailto:alex lycus.org>
     http://alexrp.com / http://lycus.org

-- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Alex Rønne Petersen:

 I'm sure this has been brought up before, but I feel I need to 
 bring it up again (because I'm going to be writing a 
 threaded-code interpreter): 
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension.

This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't write them efficient in scripting languages, and even Haskell/F# are not good at all to write them. Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. "Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.
 Implementing it should be trivial as both LLVM and GCC support 
 taking the address of a block. I'm sure the DMD back end could 
 be extended to support it too.

Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos. Bye, bearophile
Jun 01 2013
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 01-06-2013 09:59, bearophile wrote:
 Alex Rønne Petersen:

 I'm sure this has been brought up before, but I feel I need to bring
 it up again (because I'm going to be writing a threaded-code
 interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension.

This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't write them efficient in scripting languages, and even Haskell/F# are not good at all to write them. Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. "Recently" the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.
 Implementing it should be trivial as both LLVM and GCC support taking
 the address of a block. I'm sure the DMD back end could be extended to
 support it too.

Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos.

Yes, good points.
 Bye,
 bearophile

-- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.
Jun 01 2013
next sibling parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 06:49, Walter Bright wrote:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension.

I know, I just don't always remember to type out "GNU C" instead of "C".
 Also, such a construct could not be made  safe. The trouble is you could
 pass those addresses anywhere, and goto them from anywhere.

I don't particularly care about its safety (its insanely unsafe). It's all about performance with a feature like this. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 02.06.2013 06:49, schrieb Walter Bright:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension.

I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- Paulo
Jun 01 2013
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 08:44, Paulo Pinto wrote:
 Am 02.06.2013 06:49, schrieb Walter Bright:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension.

I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- Paulo

I am perfectly aware that it's a GNU C extension. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 02.06.2013 09:25, schrieb Brad Roberts:
 On 6/1/13 9:49 PM, Walter Bright wrote:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.

While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.

If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all.
Jun 02 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2013 12:49 AM, Brad Roberts wrote:
 Many if not most of those non-unix platforms use gcc, which is included in the
 set of compilers that does support it.  The point is, which you didn't change,
 is that's it's a defacto standard even if not technically in the c or c++
 language specs.

The curious question is why it never gets into the newer C and C++ Standards.
Jun 02 2013
next sibling parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 19:44, Walter Bright wrote:
 On 6/2/2013 12:49 AM, Brad Roberts wrote:
 Many if not most of those non-unix platforms use gcc, which is
 included in the
 set of compilers that does support it.  The point is, which you didn't
 change,
 is that's it's a defacto standard even if not technically in the c or c++
 language specs.

The curious question is why it never gets into the newer C and C++ Standards.

That one's easy: It makes 'too many' assumptions about the capabilities of the target machine. Sort of like assuming signed integers overflow. In practice, though, any relevant target can support computed gotos. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-06-02 19:44, Walter Bright wrote:

 The curious question is why it never gets into the newer C and C++
 Standards.

It's like inline assembly. I think basically every compiler supports it but it's not in the standard. At least there's a compiler for each platform supporting it. -- /Jacob Carlborg
Jun 03 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Alex Rønne Petersen:

             final switch (insn.op)
             {
                 case imm: lbl = &&handle_imm; break;
                 case add: lbl = &&handle_add; break;
                 case sub: lbl = &&handle_sub; break;
                 // ...
                 case ret: lbl = &&handle_ret; break;

Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile
Jun 01 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/01/2013 11:43 AM, bearophile wrote:
 Alex Rønne Petersen:

             final switch (insn.op)
             {
                 case imm: lbl = &&handle_imm; break;
                 case add: lbl = &&handle_add; break;
                 case sub: lbl = &&handle_sub; break;
                 // ...
                 case ret: lbl = &&handle_ret; break;

Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile

D (like C) uses a different namespace for labels and symbols that are not labels. This compiles today: void main(){ int foo; foo: auto b = &foo; }
Jun 01 2013
parent =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
On 02-06-2013 05:18, Andrej Mitrovic wrote:
 On 6/1/13, Timon Gehr <timon.gehr gmx.ch> wrote:
 D (like C) uses a different namespace for labels and symbols that are
 not labels.

Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.:
 void main(){
       int foo;
       foo: auto b = __traits(gotoAddr, foo);
 }

And if it's successful, we add language support.

You need a way to jump to an arbitrary address too. -- Alex Rnne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
prev sibling parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 01-06-2013 11:43, bearophile wrote:
 Alex Rønne Petersen:

             final switch (insn.op)
             {
                 case imm: lbl = &&handle_imm; break;
                 case add: lbl = &&handle_add; break;
                 case sub: lbl = &&handle_sub; break;
                 // ...
                 case ret: lbl = &&handle_ret; break;

Regarding the syntax, why do you use "&&"? Isn't a single "&" enough? case imm: lbl = &handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile

I just used the GNU C syntax because I was familiar with it. I don't particularly care how it ends up looking in D. But Timon makes a good point about namespaces. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
As example the very small but fast virtual machine written in GNU 
C++ from the International Contest on Functional Programming 2006:

http://codepad.org/iibBeWKw

It's faster than the same code without computed gotos.

Bye,
bearophile
Jun 01 2013
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 01-06-2013 15:17, Diggory wrote:
 On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote:
 As example the very small but fast virtual machine written in GNU C++
 from the International Contest on Functional Programming 2006:

 http://codepad.org/iibBeWKw

 It's faster than the same code without computed gotos.

 Bye,
 bearophile

Would be cool if there was a platform independent way in D to construct a sequence of "call" instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example.

At that point, you just want a simple in-memory assembler. Think asmjit.
 It would be relatively easy to add to phobos although you'd have to be
 careful about DEP. Hmm, perhaps I will try writing something like this...

Probably a bit too domain-specific (not to say it isn't useful). -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 01 2013
prev sibling next sibling parent "Diggory" <diggsey googlemail.com> writes:
On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote:
 As example the very small but fast virtual machine written in 
 GNU C++ from the International Contest on Functional 
 Programming 2006:

 http://codepad.org/iibBeWKw

 It's faster than the same code without computed gotos.

 Bye,
 bearophile

Would be cool if there was a platform independent way in D to construct a sequence of "call" instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example. It would be relatively easy to add to phobos although you'd have to be careful about DEP. Hmm, perhaps I will try writing something like this...
Jun 01 2013
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/01/2013 07:29 AM, Alex Rnne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to bring it
 up again (because I'm going to be writing a threaded-code interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's essential to
 write a threaded-code interpreter.

I'd also like to see this.
 The Erlang folks went through hell just to use this feature; see the 5th
 Q at:
 http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions


 The idea is to be able to write code like this:

 ----

 import std.algorithm;

 enum Op : ubyte
 {
      imm,
      add,
      sub,
      // ...
      ret,
 }

 final class Insn
 {
      Op op;
      size_t[] args;
      void* lbl;
      Insn next;
 }

 final class State
 {
      Insn pc;
      size_t[64] regs;
 }

 size_t interp(Insn[] code)
 {
      // Set up the instruction stream with label addresses
      // the first time that it is executed. Label addresses
      // are stable, so we only do this once.

      foreach (insn; code.filter!(x => !x.lbl)())
      {
          void* lbl;

          with (Op)
          {
              final switch (insn.op)
              {
                  case imm: lbl = &&handle_imm; break;
                  case add: lbl = &&handle_add; break;
                  case sub: lbl = &&handle_sub; break;
                  // ...
                  case ret: lbl = &&handle_ret; break;
              }
          }

          insn.lbl = lbl;
      }
 ...

(This must be supported at compile time!)
Jun 01 2013
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote:
 I'd also like to see this.

Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature.
Jun 01 2013
prev sibling next sibling parent "Rob T" <alanb ucora.com> writes:
On Saturday, 1 June 2013 at 16:23:11 UTC, Adam D. Ruppe wrote:
 On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote:
 I'd also like to see this.

Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature.

I wonder if this sort of thing would help solve another implementation problem, which is implementing very fast coroutines without the overhead seen with similar features such as fibers. Some discussion here: http://forum.dlang.org/thread/pcexsqbwefjueyylyyoz forum.dlang.org --rt
Jun 01 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to bring it
 up again (because I'm going to be writing a threaded-code interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; -- Dmitry Olshansky
Jun 02 2013
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 10:52, Dmitry Olshansky wrote:
 01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to bring it
 up again (because I'm going to be writing a threaded-code interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;

I'm not sure that can support threaded-code interpretation. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2013 20:48, Alex Rønne Petersen пишет:
 On 02-06-2013 10:52, Dmitry Olshansky wrote:
 01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to bring it
 up again (because I'm going to be writing a threaded-code interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch
 statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;

I'm not sure that can support threaded-code interpretation.

Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. -- Dmitry Olshansky
Jun 02 2013
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 19:43, Dmitry Olshansky wrote:
 02-Jun-2013 20:48, Alex Rønne Petersen пишет:
 On 02-06-2013 10:52, Dmitry Olshansky wrote:
 01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to
 bring it
 up again (because I'm going to be writing a threaded-code
 interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch
 statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's
 essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;

I'm not sure that can support threaded-code interpretation.

Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.

The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2013 21:49, Alex Rønne Petersen пишет:
 On 02-06-2013 19:43, Dmitry Olshansky wrote:
 02-Jun-2013 20:48, Alex Rønne Petersen пишет:
 On 02-06-2013 10:52, Dmitry Olshansky wrote:
 01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to
 bring it
 up again (because I'm going to be writing a threaded-code
 interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch
 statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's
 essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;

I'm not sure that can support threaded-code interpretation.

Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.

The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call.

One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another. BTW Globals are actually quite expansive to access anyway, this was a mock example. -- Dmitry Olshansky
Jun 02 2013
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 02-06-2013 19:58, Dmitry Olshansky wrote:
 02-Jun-2013 21:49, Alex Rønne Petersen пишет:
 On 02-06-2013 19:43, Dmitry Olshansky wrote:
 02-Jun-2013 20:48, Alex Rønne Petersen пишет:
 On 02-06-2013 10:52, Dmitry Olshansky wrote:
 01-Jun-2013 20:13, Timon Gehr пишет:
 On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to
 bring it
 up again (because I'm going to be writing a threaded-code
 interpreter):
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch
 statement is
 not a replacement because it doesn't allow the programmer to store a
 label address directly into a code stream, which is what's
 essential to
 write a threaded-code interpreter.

I'd also like to see this.

Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression;

I'm not sure that can support threaded-code interpretation.

Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer.

The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call.

One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :)

That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.
 Anyhow cramming the whole interpreter in one huge function too has
 downsides (and hopelessly inefficent register allocation is one).

I'd frankly be surprised if that was the case. But without any investigation, it's just word against word.
 And you still access most locals via stack pointer. And then there got
 to be something beyond locals - 'context', that is still passed from one
 bytecode execution to another.

It depends on what you mean by context. Can you elaborate?
 BTW Globals are actually quite expansive to access anyway, this was a
 mock example.

Sure. Either way, evidence suggests that this style of interpreter has performance advantages in practice. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://alexrp.com / http://lycus.org
Jun 02 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2013 22:25, Alex Rønne Petersen пишет:
 On 02-06-2013 19:58, Dmitry Olshansky wrote:
 The problem here is assuming the interpreter state can be global. Once
 you make it non-global (and thus have to pass it in the goto start(...)
 call) you get all the overhead of a regular function call.

One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :)

That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.
 Anyhow cramming the whole interpreter in one huge function too has
 downsides (and hopelessly inefficent register allocation is one).

I'd frankly be surprised if that was the case. But without any investigation, it's just word against word.

See my reply to Walter. Again I personally have not done any measurements but what that post by LuaJIt author makes a lot of sense to me.
 And you still access most locals via stack pointer. And then there got
 to be something beyond locals - 'context', that is still passed from one
 bytecode execution to another.

It depends on what you mean by context. Can you elaborate?

Context is what a virtual thread represents in the global world. Any outside hooks into it like argv in C's main are there, function tables (=extensions) and whatnot. -- Dmitry Olshansky
Jun 02 2013
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2013 22:25, Alex Rønne Petersen пишет:
 On 02-06-2013 19:58, Dmitry Olshansky wrote:
 One pointer to a struct MyInterpreterState. Think of it as 'this'
 pointer :)

That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running.

If the instruction set is RISC-y then you are in trouble anyway (writing interpreter not in ASM). One way out is fusing multiple simple instructions into macro-instructions and interpreting those.
 Either way, evidence suggests that this style of interpreter has
 performance advantages in practice.

Agreed. Though this day I'd go for what I call "bastardized JIT". In essense just produce a chunk of "call XYZ" instructions, comnditionals would have to have a test + jmp. And then there is 'ret' at the end of this chunk ~4 instructions to have to deal with. What you get is branch prediction done on the hardware level and no instruction counter to worry about. -- Dmitry Olshansky
Jun 02 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2013 10:58 AM, Dmitry Olshansky wrote:
 Anyhow cramming the whole interpreter in one huge function too has downsides
 (and hopelessly inefficent register allocation is one).

This should not be the case with register allocation using live ranges. You can also help with this by organizing declarations so the variables have the shortest scope.
Jun 02 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
02-Jun-2013 22:54, Walter Bright пишет:
 On 6/2/2013 10:58 AM, Dmitry Olshansky wrote:
 Anyhow cramming the whole interpreter in one huge function too has
 downsides
 (and hopelessly inefficent register allocation is one).

This should not be the case with register allocation using live ranges.

The problem is that if every goto *get_me_new_opcode compile may have no idea of where it will through it with computed labels. Hence some conservativeness of what registers contain and reloads and in effect all labels end up as isolated "functions". Plus a big stack frame is liability. And then there is a fair share of obstacles in making the kind of threaded interpreter code optimized properly, this post I find interesting: http://article.gmane.org/gmane.comp.lang.lua.general/75426
 You can also help with this by organizing declarations so the variables
 have the shortest scope.

Then splitting opcodes as functions separates their timespan even more. -- Dmitry Olshansky
Jun 02 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/2/2013 1:10 PM, Dmitry Olshansky wrote:
 The problem is that if every goto *get_me_new_opcode compile may have no idea
of
 where it will through it with computed labels. Hence some conservativeness of
 what registers contain and reloads and in effect all labels end up as isolated
 "functions". Plus a big stack frame is liability.

 And then there is a fair share of obstacles in making the kind of threaded
 interpreter code optimized properly, this post I find interesting:
 http://article.gmane.org/gmane.comp.lang.lua.general/75426

Thanks for the interesting article.
Jun 02 2013
prev sibling parent "Carl Sturtivant" <sturtivant gmail.com> writes:
On Saturday, 1 June 2013 at 17:44:02 UTC, Rob T wrote:
 I wonder if this sort of thing would help solve another 
 implementation problem, which is implementing very fast 
 coroutines without the overhead seen with similar features such 
 as fibers.

Here's some evidence that the answer to your question is "yes". http://www.cs.arizona.edu/icon/jcon/gde97.pdf
Jun 02 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 D (like C) uses a different namespace for labels and symbols 
 that are not labels.

 This compiles today:

 void main(){
     int foo;
     foo: auto b = &foo;
 }

On a related topic I wrote this: http://d.puremagic.com/issues/show_bug.cgi?id=4902 Bye, bearophile
Jun 01 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 6/1/13, Timon Gehr <timon.gehr gmx.ch> wrote:
 D (like C) uses a different namespace for labels and symbols that are
 not labels.

Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.:
 void main(){
      int foo;
      foo: auto b = __traits(gotoAddr, foo);
 }

And if it's successful, we add language support.
Jun 01 2013
prev sibling next sibling parent Brad Roberts <braddr puremagic.com> writes:
On 6/1/13 9:49 PM, Walter Bright wrote:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.

While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.
Jun 02 2013
prev sibling next sibling parent Brad Roberts <braddr puremagic.com> writes:
On 6/2/13 12:33 AM, Paulo Pinto wrote:
 Am 02.06.2013 09:25, schrieb Brad Roberts:
 On 6/1/13 9:49 PM, Walter Bright wrote:
 On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote:
 On 01-06-2013 09:59, bearophile wrote:
 "Recently" the Python C interpreter was modified and speed up thanks to
 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.

While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.

If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all.

I agree. I said unix primarily to mean most except for msvc. Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.
Jun 02 2013
prev sibling next sibling parent "nazriel" <spam dzfl.pl> writes:
On Saturday, 1 June 2013 at 05:29:28 UTC, Alex Rønne Petersen 
wrote:
 Hi,

 I'm sure this has been brought up before, but I feel I need to 
 bring it up again (because I'm going to be writing a 
 threaded-code interpreter): 
 http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

 This is an incredibly important extension. The final switch 
 statement is not a replacement because it doesn't allow the 
 programmer to store a label address directly into a code 
 stream, which is what's essential to write a threaded-code 
 interpreter.

 The Erlang folks went through hell just to use this feature; 
 see the 5th Q at: 
 http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions

It would also solve the problem with IASM and being unable to address the labels in iasm I noticed the problem when I tried to address the db'ed bytes. http://dpaste.dzfl.pl/36bbd7d3 Commented out //mov dword ptr RAX, meh; ??? illustrates the problem. Jumping to labels in IASM blocks seems to work ( http://dpaste.dzfl.pl/d9e47f70 ). I guess we could have two chickens backed on the same fire ;)
Jun 02 2013
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
--e89a8ff1bfca82662804de28940e
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 2 June 2013 17:25, Brad Roberts <braddr puremagic.com> wrote:

 On 6/1/13 9:49 PM, Walter Bright wrote:

 On 6/1/2013 7:35 PM, Alex R=C3=B8nne Petersen wrote:

 On 01-06-2013 09:59, bearophile wrote:

 "Recently" the Python C interpreter was modified and speed up thanks t=




 this non-standard feature. CPython source code has two versions, one
 with computed gotos and one without, to compile it even if your C
 compiler doesn't support them or their GNU-C syntax.

I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this.

To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.

While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard stat=


MSVC doesn't support it, which is really annoying. --e89a8ff1bfca82662804de28940e Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr">On 2 June 2013 17:25, Brad Roberts <span dir=3D"ltr">&lt;<= a href=3D"mailto:braddr puremagic.com" target=3D"_blank">braddr puremagic.c= om</a>&gt;</span> wrote:<br><div class=3D"gmail_extra"><div class=3D"gmail_= quote"><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-= left:1px #ccc solid;padding-left:1ex"> <div class=3D"HOEnZb"><div class=3D"h5">On 6/1/13 9:49 PM, Walter Bright wr= ote:<br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> On 6/1/2013 7:35 PM, Alex R=C3=B8nne Petersen wrote:<br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> On 01-06-2013 09:59, bearophile wrote:<br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> &quot;Recently&quot; the Python C interpreter was modified and speed up tha= nks to<br> this non-standard feature. CPython source code has two versions, one<br> with computed gotos and one without, to compile it even if your C<br> compiler doesn&#39;t support them or their GNU-C syntax.<br> </blockquote> <br> I don&#39;t think there&#39;s any question as to the usefulness (and essent= ialness) of<br> this feature. I&#39;m very close to just writing most of the interpreter in= C over a<br> triviality like this.<br> </blockquote> <br> To be pedantic, C and C++ don&#39;t have that feature. Some compilers add i= t as an extension.<br> <br> Also, such a construct could not be made safe. The trouble is you could pa= ss those addresses<br> anywhere, and goto them from anywhere.<br> </blockquote> <br></div></div> While you&#39;re technically correct, every major compiler in the unix worl= d support it with the same syntax. =C2=A0It&#39;s entered into defacto stan= dard status.<br> </blockquote></div><br></div><div class=3D"gmail_extra" style>MSVC doesn&#3= 9;t support it, which is really annoying.</div></div> --e89a8ff1bfca82662804de28940e--
Jun 02 2013
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
On 6/2/13 10:44 AM, Walter Bright wrote:
 On 6/2/2013 12:49 AM, Brad Roberts wrote:
 Many if not most of those non-unix platforms use gcc, which is included in the
 set of compilers that does support it.  The point is, which you didn't change,
 is that's it's a defacto standard even if not technically in the c or c++
 language specs.

The curious question is why it never gets into the newer C and C++ Standards.

I can only surmise since I've never been a part of those processes, but: 1) since the support is already there for the people that want it, why go through the hassle? 2) going through the hassle just to get it more broadly supported is a lot of hassle. 3) people are generally fundamentally allergic to hassle. Which also lines up with why people are asking for it to be added to d, because it's not already available, so it's worth the hassle.
Jun 02 2013