digitalmars.D.learn - Translating Modula2 into D: variant records, pointer

reply BLS <Killing_Zoe web.de> writes:
Translating Modula2 into D: Subject: variant records,pointer

Hi, I have some Modula-2 code which I would like to translate into D.
I often have to use something like this:

MODULE snippet;

   FROM SYSTEM IMPORT TSIZE;
   FROM Heap IMPORT ALLOCATE;

   TYPE  NodePtr = POINTER TO Node;
         HeadPtr = POINTER TO Header;

         (* variant record *)
         Node = RECORD suc, alt: NodePtr;
          CASE terminal: BOOLEAN OF
            TRUE:  tsym: CHAR |
            FALSE: nSym: HeadPtr
          END
        END;

        Header = RECORD sym: CHAR;
                   entry: NodePtr;
                   suc: HeadPtr
                 END;


   VAR list, sentinel: HeadPtr;

   (* JUST A PROCEDURE FRAGMENT *)
   PROCEDURE Find(s: CHAR; VAR h: HeadPtr);
     VAR h1: HeadPtr;
   BEGIN
     h1 := list;
     sentinel^.sym := s;

     ALLOCATE(sentinel, TSIZE(Header));

     h := h1;
     (* etc. *)

   END Find;
So: how would I implement this snippet in D?
Many thanks in advance; Bjoern
Jan 09 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"BLS" <Killing_Zoe web.de> wrote in message 
news:eo04h3$1us4$1 digitaldaemon.com...
 Translating Modula2 into D: Subject: variant records,pointer

 Hi I have some Modula2 code which I would like to translate in D.
 I often have to use something like that :
 ...
 So: How do I have to implement this snippet in D ?
 Many thanks !!!! in advance; Bjoern
Something like:

module snippet;

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;

    union
    {
        char tsym;
        Header* nSym;
    }
}

struct Header
{
    char sym;
    Node* entry;
    Header* suc;
}

Header* list;
Header* sentinel;

void Find(char s, inout Header* h)
{
    Header* h1 = list;
    sentinel.sym = s;

    sentinel = new Header;

    h = h1;
    // etc.
}

One thing I'm not real sure on is the variant record. The closest thing I think in D is a union, but nothing prevents you from accessing either tsym or nSym in Node at any time. I guess in Modula2, if 'terminal' is true, you can only access tsym, and otherwise you can only access nSym?
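Since the plain union is not checked, code using it has to look at 'terminal' itself before touching either member. A minimal sketch of that discipline, written for a current D compiler (so not the 2007 dialect used in this thread); the helper name describe and the output strings are made up for illustration:

```d
import std.format : format;
import std.stdio : writeln;

struct Header
{
    char sym;
    Node* entry;
    Header* suc;
}

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;      // tag: true means tsym is the live member

    union
    {
        char tsym;
        Header* nSym;
    }
}

// Hypothetical helper: reads only the union member that
// 'terminal' says is valid. Nothing enforces this; it is a convention.
string describe(ref const Node n)
{
    if (n.terminal)
        return format("terminal '%s'", n.tsym);
    if (n.nSym is null)
        return "nonterminal (no header)";
    return format("nonterminal '%s'", n.nSym.sym);
}

void main()
{
    Header expr;
    expr.sym = 'E';

    Node a;
    a.terminal = true;
    a.tsym = 'x';

    Node b;
    b.terminal = false;
    b.nSym = &expr;

    writeln(describe(a));
    writeln(describe(b));
}
```

Reading the member that does not match the tag still compiles, which is exactly the gap discussed in the rest of the thread.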
Jan 09 2007
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Jarrett Billingsley wrote:
 One thing I'm not real sure on is the variant record.  The closest thing I 
 think in D is a union, but nothing prevents you from accessing either tsym 
 or nSym in Node at any time.  I guess in Modula2, if 'terminal' is true, you 
 can only access tsym, and otherwise you can only access nSym? 
I think you can mostly fake this in D using private (differently-named) union members and property methods that assert if 'terminal' has the wrong value.
You can even make it so that the original names directly alias the private names in a non-debug build, and only use the checking version in a debug build[1].

Limitations:
* You can still access the private members (under the different name) from within the same module.
* You can't use operators with side-effects on the members (except for assignment of course). So no ++, +=, --, -=, ~=, etc.

[1]: Just make sure the code compiles in debug builds if you do this, because of the second limitation above (since it doesn't apply to the suggested non-debug version).
Jan 09 2007
parent reply BLS <Killing_Zoe web.de> writes:
Hi Frits,
 I think you can mostly fake this in D using private (differently-named) 
 union members and property methods that assert if 'terminal' has the 
 wrong value.
 You can even make it so that the original names directly alias the 
 private names in a non-debug build, and only use the checking version in 
 a debug build[1].
Can you please offer an example based on ...

Modula
TYPE  NodePtr = POINTER TO Node;
      Node = RECORD suc, alt: NodePtr;
        CASE terminal: BOOLEAN OF
          TRUE:  tsym: CHAR |
          FALSE: nSym: HeadPtr
        END
      END;
----------------------------------
Do you mean something similar .... ?

Pascal
TYPE r2_rec = RECORD
  CASE INTEGER OF
    1: (e1: INTEGER; e2: INTEGER);
    2: (e3: REAL);
END;

C/C++
typedef union {
  struct {
    int e1;
    int e2;
  } v1;
  float e3;
} R2Rec;

... or am I completely wrong?
Bjoern
Jan 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
BLS wrote:
 Hi Frits,
 I think you can mostly fake this in D using private 
 (differently-named) union members and property methods that assert if 
 'terminal' has the wrong value.
 You can even make it so that the original names directly alias the 
 private names in a non-debug build, and only use the checking version 
 in a debug build[1].
Can you please offer an example based on ...

Modula
TYPE  NodePtr = POINTER TO Node;
      Node = RECORD suc, alt: NodePtr;
        CASE terminal: BOOLEAN OF
          TRUE:  tsym: CHAR |
          FALSE: nSym: HeadPtr
        END
      END;
-----
struct Header{}; // to make it compile as-is

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;

    union
    {
        private char tsym_;
        private Header* nSym_;
    }

    debug
    {
        // Property versions, with full error checking
        // (This block is used if -debug is passed to the compiler)
        // As in the Modula-2 record: terminal == true selects tsym,
        // terminal == false selects nSym.
        char tsym()
        in { assert(terminal == true); }
        body { return tsym_; }

        char tsym(char newval)
        out { assert(terminal == true); }
        body
        {
            terminal = true;
            return tsym_ = newval;
        }

        Header* nSym()
        in { assert(terminal == false); }
        body { return nSym_; }

        Header* nSym(Header* newval)
        out { assert(terminal == false); }
        body
        {
            terminal = false;
            return nSym_ = newval;
        }
    }
    else
    {
        // Hmm.. No way to automatically set 'terminal' in this
        // implementation. Damn.
        // (This block is used if -debug is NOT passed to the compiler)
        alias tsym_ tsym;
        alias nSym_ nSym;
    }
}
-----
It's unfortunately a bit wordy. It's also completely untested beyond the fact that it compiles ;).

I noticed a limitation of the optimization opportunity I mentioned: In the debug version (with property setters) you can automatically adjust the value of 'terminal' depending on the last-set property. This isn't possible with direct aliasing as far as I can see. If this is a problem, you might want to always use the code in the first block. I see no reason the compiler can't inline the functions if passed the appropriate optimization flags, so it shouldn't really matter much.

Some notes:
* The first block of code (right after 'debug') is compiled in if -debug is passed to the compiler. Otherwise, the second (short) block is compiled.
* The 'in' and 'out' blocks are removed by the compiler if -release is present on the compiler command line.
* As mentioned in my previous post, tsym_ and nSym_ are accessible from code in the same module even though they are private.
* Another implementation option is to also rename 'terminal' and make it private, and then only provide a property getter function so the code using the Node type can't set it without also setting the appropriate union member.
* If you only ever set the union members right after constructing the Node instance, you may also want to think about classes and inheritance if you're OOP-inclined.
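The option of making 'terminal' private and exposing only a getter might look roughly like the sketch below. It is written for a current D compiler, so it uses @property and plain asserts instead of the in/out/body contracts above. The names terminal_, tsym_ and nSym_ and the assert messages are made up for illustration, and Header is a stand-in with the fields from the original question.

```d
import std.stdio : writeln;

struct Header
{
    char sym;
    Node* entry;
    Header* suc;
}

struct Node
{
    Node* suc;
    Node* alt;

    // Read-only view of the tag; it can only change through the setters.
    @property bool terminal() const { return terminal_; }

    @property char tsym() const
    {
        assert(terminal_, "tsym read on a nonterminal node");
        return tsym_;
    }

    // Setting tsym also flips the tag, so the two cannot get out of sync.
    @property char tsym(char value)
    {
        terminal_ = true;
        return tsym_ = value;
    }

    @property Header* nSym()
    {
        assert(!terminal_, "nSym read on a terminal node");
        return nSym_;
    }

    @property Header* nSym(Header* value)
    {
        terminal_ = false;
        return nSym_ = value;
    }

private:
    bool terminal_;

    // nSym_ comes first so a default Node holds a null pointer,
    // which matches the default tag value (false).
    union
    {
        Header* nSym_;
        char tsym_;
    }
}

void main()
{
    Header expr;
    Node n;

    n.tsym = 'a';                 // tag becomes true
    writeln(n.terminal, " ", n.tsym);

    n.nSym = &expr;               // tag becomes false
    writeln(n.terminal, " ", n.nSym is &expr);
}
```

The same-module caveat still applies: code in this module can reach terminal_, tsym_ and nSym_ directly, and the asserts vanish with -release.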
Jan 09 2007
parent reply BLS <Killing_Zoe web.de> writes:
Thanks Frits,
a lot of interesting stuff for a D newbie like me.
Frits van Bommel schrieb:

 * If you only ever set the union members right after constructing the 
 Node instance, you may also want to think about classes and inheritance 
 if you're OOP-inclined.
Indeed, the snippet is part of a table-driven (or better: data-structure-driven) general parser. So I always check 'terminal' first and then I set the union values.
It was my next question anyway: how to implement this the OOP way? And ... can opCall() help somehow?
Thanks for being so patient with me.
Bjoern
Jan 09 2007
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
BLS wrote:
 Thanks Frits,
 a lot of interesting stuff for a D newbie like me.
 Frits van Bommel schrieb:
 
 * If you only ever set the union members right after constructing the 
 Node instance, you may also want to think about classes and 
 inheritance if you're OOP-inclined.
Indeed, the snippet is part of a table-driven (or better: data-structure-driven) general parser. So I always check 'terminal' first and then I set the union values.
It was my next question anyway: how to implement this the OOP way? And ... can opCall() help somehow?
OOP looks a lot cleaner:
-----
abstract class Node
{
    Node suc;
    Node alt;
}

class TerminalNode : Node
{
    char tsym;
}

class NonTerminalNode : Node
{
    Header nSym;
}
-----
That's just the skeleton though. For starters, you'll want to either declare those members public or provide accessors ;). Constructors would also be nice.

Since you asked though, pretty much the same effect can be achieved with static opCall (one in each child class and/or two overloaded versions in Node itself) that creates a new object of the appropriate type and fills in the fields.

To check whether a node is a terminal or not, there are several options:
* Try to cast a node to (Non)TerminalNode. If it's not of the appropriate type, the cast returns null. This has the benefit that it also returns a usable reference if it _is_ of the appropriate type, so that you can access its special members.
* Add 'abstract bool isTerminal()' to Node and override it in the subclasses.
* Add a 'const bool isTerminal' field to Node and initialize it in the constructor from a parameter.
* Add 'TerminalNode asTerminal() { return null; }' to Node and override it in TerminalNode to read 'return this;'. (Do something similar with NonTerminalNode)
* {{ Probably some others I can't think of right now }}
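The skeleton plus the first option (casting) could be fleshed out along these lines. This is only a sketch for a current D compiler: it uses ordinary constructors rather than static opCall, it assumes Header is a class as well, and the helper name describe and its output strings are made up for illustration.

```d
import std.format : format;
import std.stdio : writeln;

// Assumption: Header becomes a class too, so 'Header nSym' is a reference.
class Header
{
    char sym;
    Node entry;
    Header suc;

    this(char sym) { this.sym = sym; }
}

abstract class Node
{
    Node suc;
    Node alt;
}

class TerminalNode : Node
{
    char tsym;

    this(char tsym) { this.tsym = tsym; }
}

class NonTerminalNode : Node
{
    Header nSym;

    this(Header nSym) { this.nSym = nSym; }
}

// Hypothetical helper: a failed class cast yields null, so each 'if'
// both tests the node kind and gives a typed reference to work with.
string describe(Node n)
{
    if (auto t = cast(TerminalNode) n)
        return format("terminal '%s'", t.tsym);
    if (auto nt = cast(NonTerminalNode) n)
        return nt.nSym is null ? "nonterminal (no header)"
                               : format("nonterminal '%s'", nt.nSym.sym);
    return "unknown node";
}

void main()
{
    Node a = new TerminalNode('x');
    Node b = new NonTerminalNode(new Header('E'));

    writeln(describe(a));
    writeln(describe(b));
}
```

Because each subclass only has the member that makes sense for it, the wrong-member access that the union allows cannot be written here at all.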
Jan 09 2007
prev sibling parent reply BLS <Killing_Zoe web.de> writes:
Many thanks, this info will give me a Go!

  I guess in Modula2, if 'terminal' is true, you
 can only access tsym, and otherwise you can only access nSym? 
No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern
Jan 09 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"BLS" <Killing_Zoe web.de> wrote in message 
news:eo0ih9$2odp$1 digitaldaemon.com...
 Many thanks, this info will give me a Go!

  I guess in Modula2, if 'terminal' is true, you
 can only access tsym, and otherwise you can only access nSym?
No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern
Hmm.. what's that 'terminal' member for then, anyway?
Jan 09 2007
parent BLS <Killing_Zoe web.de> writes:
Jarrett Billingsley schrieb:
 "BLS" <Killing_Zoe web.de> wrote in message 
 news:eo0ih9$2odp$1 digitaldaemon.com...
 
Many thanks, this info will give me a Go!

 I guess in Modula2, if 'terminal' is true, you

can only access tsym, and otherwise you can only access nSym?
No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern
Hmm.. what's that 'terminal' member for then, anyway?
The official language definition (quoting N. Wirth, in my interpretation) says that Mr. Compiler has to take care of it. The real-world implementation is another story.
Bjoern, and thanks again, man!
Jan 10 2007