digitalmars.D.learn - Translating Modula2 into D: variant records, pointer

reply BLS <Killing_Zoe web.de> writes:
Translating Modula2 into D: Subject: variant records,pointer

Hi, I have some Modula-2 code which I would like to translate into D.
I often have to use something like this:

MODULE snippet;

   FROM SYSTEM IMPORT TSIZE;
   FROM Heap IMPORT ALLOCATE;

   TYPE  NodePtr = POINTER TO Node;
         HeadPtr = POINTER TO Header;

         (* variant record *)
         Node = RECORD suc, alt: NodePtr;
          CASE terminal: BOOLEAN OF
            TRUE:  tsym: CHAR |
            FALSE: nSym: HeadPtr
          END
        END;

        Header = RECORD sym: CHAR;
                   entry: NodePtr;
                   suc: HeadPtr
                 END;


   VAR list, sentinel: HeadPtr;

   (* JUST A PROCEDURE FRAGMENT *)
   PROCEDURE Find(s: CHAR; VAR h: HeadPtr);
     VAR h1: HeadPtr;
   BEGIN
     h1 := list;
     sentinel^.sym := s;

     ALLOCATE(sentinel, TSIZE(Header));

     h := h1;
     (* etc. *)

   END Find;
So: how do I implement this snippet in D?
Many thanks in advance! Bjoern
Jan 09 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"BLS" <Killing_Zoe web.de> wrote in message 
news:eo04h3$1us4$1 digitaldaemon.com...
 Translating Modula2 into D: Subject: variant records,pointer

 Hi, I have some Modula-2 code which I would like to translate into D.
 I often have to use something like this:
 ...
 So: how do I implement this snippet in D?
 Many thanks in advance! Bjoern

Something like:

module snippet;

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;

    union
    {
        char tsym;
        Header* nSym;
    }
}

struct Header
{
    char sym;
    Node* entry;
    Header* suc;
}

Header* list;
Header* sentinel;

void Find(char s, inout Header* h)
{
    Header* h1 = list;
    sentinel.sym = s;

    sentinel = new Header;

    h = h1;
    // etc.
}

One thing I'm not real sure on is the variant record. The closest thing I think in D is a union, but nothing prevents you from accessing either tsym or nSym in Node at any time. I guess in Modula2, if 'terminal' is true, you can only access tsym, and otherwise you can only access nSym?
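
For readers on a current compiler: in D2 the by-reference parameter is spelled 'ref' ('inout' was the D1 spelling and now means something else). Below is a minimal sketch of the same translation in D2. The body of Find after "etc." is NOT in the original post; the sentinel search shown here, and the helper initList, are assumptions added only to make the fragment runnable.

```d
struct Header
{
    char sym;
    Node* entry;
    Header* suc;
}

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;      // tag: says which union member is meaningful
    union
    {
        char tsym;      // meaningful when terminal == true
        Header* nSym;   // meaningful when terminal == false
    }
}

Header* list;
Header* sentinel;

// Hypothetical setup helper: the list starts out as just the sentinel.
void initList()
{
    sentinel = new Header;
    list = sentinel;
}

// ASSUMED completion of the fragment: a classic sentinel search.
void Find(char s, ref Header* h)
{
    Header* h1 = list;
    sentinel.sym = s;           // guarantees that the loop terminates
    while (h1.sym != s)
        h1 = h1.suc;
    h = h1;
    if (h1 is sentinel)         // not found: the old sentinel becomes the entry
    {
        sentinel = new Header;
        h.suc = sentinel;
        h.entry = null;
    }
}

unittest
{
    initList();
    Header* a, b, c;            // in D, all three are Header*
    Find('x', a);
    Find('y', b);
    Find('x', c);
    assert(a is c);             // second lookup of 'x' finds the same header
    assert(a !is b);
    assert(a.sym == 'x' && b.sym == 'y');
}
```

The unittest block can be run with "dmd -unittest -main -run" on the file. Note that the union itself is still unchecked here; the 'terminal' flag is only a convention.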
Jan 09 2007
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Jarrett Billingsley wrote:
 One thing I'm not real sure on is the variant record.  The closest thing I 
 think in D is a union, but nothing prevents you from accessing either tsym 
 or nSym in Node at any time.  I guess in Modula2, if 'terminal' is true, you 
 can only access tsym, and otherwise you can only access nSym? 

I think you can mostly fake this in D using private (differently-named) union members and property methods that assert if 'terminal' has the wrong value.
You can even make it so that the original names directly alias the private names in a non-debug build, and only use the checking version in a debug build[1].

Limitations:
* You can still access the private members (under the different name) from within the same module.
* You can't use operators with side-effects on the members (except for assignment of course). So no ++, +=, --, -=, ~=, etc.

[1]: Just make sure the code compiles in debug builds if you do this, because of the second limitation above (since it doesn't apply to the suggested non-debug version).
Jan 09 2007
parent reply BLS <Killing_Zoe web.de> writes:
Hi Frits,
 I think you can mostly fake this in D using private (differently-named) 
 union members and property methods that assert if 'terminal' has the 
 wrong value.
 You can even make it so that the original names directly alias the 
 private names in a non-debug build, and only use the checking version in 
 a debug build[1].

Modula

TYPE NodePtr = POINTER TO Node;
     Node = RECORD suc, alt: NodePtr;
       CASE terminal: BOOLEAN OF
         TRUE:  tsym: CHAR |
         FALSE: nSym: HeadPtr
       END
     END;

----------------------------------
Do you mean something similar .... ?

Pascal

TYPE r2_rec = RECORD
       CASE INTEGER OF
         1: (e1: INTEGER; e2: INTEGER);
         2: (e3: REAL);
     END;

C/C++

typedef union {
    struct {
        int e1;
        int e2;
    } v1;
    float e3;
} R2Rec;

... or am I completely wrong?
Bjoern
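
For comparison, here is a rough D counterpart of the C/C++ union above. This is only a sketch: it uses an anonymous struct inside an anonymous union, so e1 and e2 are reached directly instead of through v1, and like the C version it has no tag and no checking at all.

```d
struct R2Rec
{
    union
    {
        struct          // anonymous struct: e1 and e2 sit side by side
        {
            int e1;
            int e2;
        }
        float e3;       // shares its storage with e1
    }
}

unittest
{
    R2Rec r;
    r.e1 = 1;
    r.e2 = 2;
    assert(r.e1 == 1 && r.e2 == 2);
    static assert(R2Rec.sizeof == 8);

    r.e3 = 1.0f;                    // overwrites the storage of e1
    assert(r.e1 == 0x3F80_0000);    // bit pattern of 1.0f read back as int
    assert(r.e2 == 2);              // e2 is untouched
}
```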
Jan 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
BLS wrote:
 Hi Frits,
 I think you can mostly fake this in D using private 
 (differently-named) union members and property methods that assert if 
 'terminal' has the wrong value.
 You can even make it so that the original names directly alias the 
 private names in a non-debug build, and only use the checking version 
 in a debug build[1].

Modula

TYPE NodePtr = POINTER TO Node;
     Node = RECORD suc, alt: NodePtr;
       CASE terminal: BOOLEAN OF
         TRUE:  tsym: CHAR |
         FALSE: nSym: HeadPtr
       END
     END;

-----
struct Header {} // to make it compile as-is

struct Node
{
    Node* suc;
    Node* alt;
    bool terminal;

    union
    {
        private char tsym_;     // the TRUE variant of the Modula-2 record
        private Header* nSym_;  // the FALSE variant of the Modula-2 record
    }

    debug
    {
        // Property versions, with full error checking
        // (This block is used if -debug is passed to the compiler)
        char tsym()
        in { assert(terminal == true); }
        body { return tsym_; }

        char tsym(char newval)
        out { assert(terminal == true); }
        body { terminal = true; return tsym_ = newval; }

        Header* nSym()
        in { assert(terminal == false); }
        body { return nSym_; }

        Header* nSym(Header* newval)
        out { assert(terminal == false); }
        body { terminal = false; return nSym_ = newval; }
    }
    else
    {
        // Hmm.. No way to automatically set 'terminal' in this
        // implementation. Damn.
        // (This block is used if -debug is NOT passed to the compiler)
        alias tsym_ tsym;
        alias nSym_ nSym;
    }
}
-----
It's unfortunately a bit wordy. It's also completely untested beyond the fact that it compiles ;).

I noticed a limitation of the optimization opportunity I mentioned: In the debug version (with property setters) you can automatically adjust the value of 'terminal' depending on the last-set property. This isn't possible with direct aliasing as far as I can see. If this is a problem, you might want to always use the code in the first block. I see no reason the compiler can't inline the functions if passed the appropriate optimization flags, so it shouldn't really matter much.

Some notes:
* The first block of code (right after 'debug') is compiled in if -debug is passed to the compiler. Otherwise, the second (short) block is compiled.
* The 'in' and 'out' blocks are removed by the compiler if -release is present on the compiler command line.
* As mentioned in my previous post, tsym_ and nSym_ are accessible from code in the same module even though they are private.
* Another implementation option is to also rename 'terminal' and make it private, and then only provide a property getter function so the code using the Node type can't set it without also setting the appropriate union member.
* If you only ever set the union members right after constructing the Node instance, you may also want to think about classes and inheritance if you're OOP-inclined.
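
The "rename 'terminal' and make it private" option from the notes could look roughly like the sketch below. It is written in D2 syntax, and the names terminal_, tsym_ and nSym_ are just illustrative choices. The tag can only change as a side effect of setting one of the union members, and the getters check it with a plain assert.

```d
struct Header {}   // placeholder so the sketch is self-contained

struct Node
{
    Node* suc;
    Node* alt;

    private bool terminal_;     // tag, writable only through the setters below
    union
    {
        private char tsym_;     // valid when terminal_ is true
        private Header* nSym_;  // valid when terminal_ is false
    }

    // Read-only view of the tag.
    @property bool terminal() { return terminal_; }

    @property char tsym()
    {
        assert(terminal_, "tsym read from a non-terminal node");
        return tsym_;
    }

    // Setting tsym also marks the node as terminal.
    @property void tsym(char newval)
    {
        terminal_ = true;
        tsym_ = newval;
    }

    @property Header* nSym()
    {
        assert(!terminal_, "nSym read from a terminal node");
        return nSym_;
    }

    // Setting nSym also marks the node as non-terminal.
    @property void nSym(Header* newval)
    {
        terminal_ = false;
        nSym_ = newval;
    }
}

unittest
{
    Node n;
    n.tsym = 'a';
    assert(n.terminal());
    assert(n.tsym == 'a');

    auto h = new Header;
    n.nSym = h;
    assert(!n.terminal());
    assert(n.nSym is h);
}
```

As with the version above, private only protects against code in other modules, and the asserts disappear in -release builds.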
Jan 09 2007
parent reply BLS <Killing_Zoe web.de> writes:
Thanks Frits,
a lot of interesting stuff for a D newbie like me.
Frits van Bommel wrote:

 * If you only ever set the union members right after constructing the 
 Node instance, you may also want to think about classes and inheritance 
 if you're OOP-inclined.

Indeed, the snippet is part of a table-driven (better: data-structure-driven) general parser, so I always check 'terminal' first and then set the union values.
It was going to be my next question anyway: how do I implement this the OOP way? And can opCall() help somehow?
Thanks for being so patient with me.
Bjoern
Jan 09 2007
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
BLS wrote:
 Thanks Frits,
 a lot of interesting stuff for a D newbie like me.
 Frits van Bommel wrote:
 
 * If you only ever set the union members right after constructing the 
 Node instance, you may also want to think about classes and 
 inheritance if you're OOP-inclined.

Indeed, the snippet is part of a table-driven (better: data-structure-driven) general parser, so I always check 'terminal' first and then set the union values.
It was going to be my next question anyway: how do I implement this the OOP way? And can opCall() help somehow?

OOP looks a lot cleaner:
-----
abstract class Node
{
    Node suc;
    Node alt;
}

class TerminalNode : Node
{
    char tsym;
}

class NonTerminalNode : Node
{
    Header nSym;
}
-----
That's just the skeleton though. For starters, you'll want to either declare those members public or provide accessors ;). Constructors would also be nice.
Since you asked though, pretty much the same effect can be achieved with static opCall (one in each child class and/or two overloaded versions in Node itself) that creates a new object of the appropriate type and fills in the fields.

To check whether a node is a terminal or not, there are several options:
* Try to cast a node to (Non)TerminalNode. If it's not of the appropriate type, the cast returns null. This has the benefit that it also returns a usable reference if it _is_ of the appropriate type, so that you can access its special members.
* Add 'abstract bool isTerminal()' to Node and override it in the subclasses.
* Add a 'const bool isTerminal' field to Node and initialize it in the constructor from a parameter.
* Add 'TerminalNode asTerminal() { return null; }' to Node and override it in TerminalNode to read 'return this;'. (Do something similar with NonTerminalNode)
* {{ Probably some others I can't think of right now }}
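
To make the skeleton and two of the options above concrete, here is a minimal sketch in D2 with constructors, the cast-based check and the asTerminal() variant. Treating Header as a class with the fields of the original record, and the helper function describe, are assumptions made purely for illustration.

```d
class Header
{
    char sym;
    Node entry;
    Header suc;
}

abstract class Node
{
    Node suc;
    Node alt;

    // Overridden in TerminalNode; every other node type returns null.
    TerminalNode asTerminal() { return null; }
}

class TerminalNode : Node
{
    char tsym;
    this(char tsym) { this.tsym = tsym; }
    override TerminalNode asTerminal() { return this; }
}

class NonTerminalNode : Node
{
    Header nSym;
    this(Header nSym) { this.nSym = nSym; }
}

// Hypothetical helper showing the cast-based check: a failed class
// cast yields null, a successful one yields a usable typed reference.
string describe(Node n)
{
    if (auto t = cast(TerminalNode) n)
        return "terminal " ~ t.tsym;
    if (auto nt = cast(NonTerminalNode) n)
        return "nonterminal " ~ nt.nSym.sym;
    return "unknown";
}

unittest
{
    auto h = new Header;
    h.sym = 'E';
    Node a = new TerminalNode('+');
    Node b = new NonTerminalNode(h);

    assert(describe(a) == "terminal +");
    assert(describe(b) == "nonterminal E");
    assert(a.asTerminal() !is null);
    assert(b.asTerminal() is null);
}
```

With this layout there is no tag field to get out of sync: the dynamic type of the object plays the role of 'terminal'.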
Jan 09 2007
prev sibling parent reply BLS <Killing_Zoe web.de> writes:
Many thanks, this info will give me a Go!

  I guess in Modula2, if 'terminal' is true, you
 can only access tsym, and otherwise you can only access nSym? 

No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern
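
On "doing this better": newer versions of the D standard library (not available when this thread was written) include std.sumtype, a tagged union whose active member can only be reached through match, so reading the wrong variant is not expressible. A minimal sketch, assuming a recent compiler that ships std.sumtype; Symbol and describe are made-up names for illustration:

```d
import std.sumtype : SumType, match;

struct Header
{
    char sym;
}

// Either a terminal symbol (char) or a reference to a header.
alias Symbol = SumType!(char, Header*);

// match requires a handler for every member type, so neither
// variant can be forgotten or read by mistake.
string describe(Symbol s)
{
    return s.match!(
        (char c)    => "terminal " ~ c,
        (Header* h) => "nonterminal " ~ h.sym
    );
}

unittest
{
    auto h = new Header('E');
    assert(describe(Symbol('+')) == "terminal +");
    assert(describe(Symbol(h)) == "nonterminal E");
}
```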
Jan 09 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"BLS" <Killing_Zoe web.de> wrote in message 
news:eo0ih9$2odp$1 digitaldaemon.com...
 Many thanks, this info will give me a Go!

  I guess in Modula2, if 'terminal' is true, you
 can only access tsym, and otherwise you can only access nSym?

No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern

Hmm.. what's that 'terminal' member for then, anyway?
Jan 09 2007
parent BLS <Killing_Zoe web.de> writes:
Jarrett Billingsley wrote:
 "BLS" <Killing_Zoe web.de> wrote in message 
 news:eo0ih9$2odp$1 digitaldaemon.com...
 
Many thanks, this info will give me a Go!

 I guess in Modula2, if 'terminal' is true, you

can only access tsym, and otherwise you can only access nSym?

No. Nothing prevents you from making a mistake. (In fact, variant records are a pretty nice source of bugs.) It would be nice if D could do this better!
Bjoern

Hmm.. what's that 'terminal' member for then, anyway?

It is there so that Mr. Compiler has to take care of it. The real-world implementation is just another story.
Bjoern, and thanks again, man!
Jan 10 2007