
digitalmars.D.bugs - [Issue 8594] New: Enum string validator in Phobos?

http://d.puremagic.com/issues/show_bug.cgi?id=8594

           Summary: Enum string validator in Phobos?
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



The Ada language has a handy feature: you can define an enumeration of chars
and then write array literals of that enumeration as strings, and the compiler
enforces the use of only the allowed chars:


procedure Test is
   type Hexa is ('A', 'B', 'C', 'D', 'E', 'F');
   type Hex_Array is array (0 .. 5) of Hexa;
   data : Hex_Array;
begin
  data := "BACEDC";
end;


Such literals are very useful. Many kinds of problems use data defined on a
subset of the chars, and the chars are a compact representation. Such strings
can represent sequences of commands, start configurations of problems, game
boards, and many kinds of discrete problems.

(Note: in Ada stack-allocated arrays like Hex_Array are used quite often, more
than heap-allocated arrays.)


If you try to define a literal that contains a wrong char:

procedure Test is
   type Hexa is ('A', 'B', 'C', 'D', 'E', 'F');
   type Hex_Array is array (0 .. 5) of Hexa;
   data : Hex_Array;
begin
  data := "BACgDC";
end;


The Ada compiler gives you a compile-time error:

prog.adb:6:15: character not defined for type "Hexa" defined at line 2


Such compile-time validation is very useful to avoid bugs in the program. In
D, working with enum values instead of raw chars is also useful because it
allows you to process the data with a safer "final switch", which must cover
every enum member, instead of a regular "switch" on string chars.
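
To make the switch point concrete, here is a minimal sketch of a "final
switch" over the Hexa enum. The function name hexValue and the numeric values
it returns are only illustrative assumptions, not part of the proposal:

```d
enum Hexa : char { A = 'A', B = 'B', C = 'C', D = 'D', E = 'E', F = 'F' }

// Hypothetical helper, for illustration only: maps each member to a number.
// A final switch must list every enum member and takes no default case, so
// deleting one of the cases below turns into a compile-time error.
int hexValue(Hexa h) pure nothrow {
    final switch (h) {
        case Hexa.A: return 10;
        case Hexa.B: return 11;
        case Hexa.C: return 12;
        case Hexa.D: return 13;
        case Hexa.E: return 14;
        case Hexa.F: return 15;
    }
}

void main() {
    assert(hexValue(Hexa.B) == 11);
    assert(hexValue(Hexa.F) == 15);
}
```

A regular switch on a char needs a default case and gives no such guarantee,
so a missing or mistyped case is only noticed at run time, if at all.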


This is one possible D translation, but even with with() the array literal
requires commas (and strings are often handier literals):

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }
alias Hexa[6] HexArray; // this is not a true type as in Ada
void main() {
    HexArray data;
    with (Hexa)
        data = [B,A,C,E,D,C];
}



So I have created a small compile-time function + template that validates a
string at compile time:

// - - - - - - - - - - - - - - - -
import std.traits: isSomeChar, EnumMembers;

private E[] _validateEnumString(E)(in string txt)
pure nothrow if (is(E TC == enum) && isSomeChar!TC) {
    auto result = new typeof(return)(txt.length);

    OUTER:
    foreach (i, c; txt) {
        /*static*/ foreach (e; EnumMembers!E)
            if (c == e) {
                result[i] = e;
                continue OUTER;
            }
        assert(false, "Not valid enum char: " ~ c);
    }

    return result;
}

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }

template Hexas(string path) {
    enum Hexas = _validateEnumString!Hexa(path);
}

alias Hexa[6] HexArray;
void main() {
    HexArray data = Hexas!"BACEDC";
}
// - - - - - - - - - - - - - - - -




This alternative design uses a cast to avoid duplicating the input, and may
reduce the compilation time, but it produces only arrays of immutable enums:

// - - - - - - - - - - - - - - - -
import std.traits: isSomeChar, EnumMembers;

private immutable(E)[] _validateEnumString(E)(in string txt)
pure nothrow if (is(E TC == enum) && isSomeChar!TC) {

    OUTER:
    foreach (i, c; txt) {
        /*static*/ foreach (e; EnumMembers!E)
            if (c == e)
                continue OUTER;
        assert(false, "Not valid enum char: " ~ c);
    }

    return cast(typeof(return))txt;
}

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }

template Hexas(string path) {
    enum Hexas = _validateEnumString!Hexa(path);
}

alias Hexa[6] HexArray;
void main() {
    HexArray data = Hexas!"BACEDC";
    auto data2 = Hexas!"BACEDC";
    static assert(is(typeof(data2) == immutable(Hexa)[]));
}
// - - - - - - - - - - - - - - - -


Defining enum array literals this way is a built-in feature of Ada because it
is commonly useful. This is quoted from Wikipedia:
http://en.wikipedia.org/wiki/Enumerated_type#Ada

Like Modula-3 Ada treats Boolean and Character as special pre-defined (in
package "Standard") enumerated types. Unlike Modula-3 one can also define own
character types:

type Cards is ('7', '8', '9', 'J', 'Q', 'K', 'A');


So maybe a template similar to the ones I have shown here is useful enough to
be added to Phobos.
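
As a rough idea of what a generic version might look like, here is a sketch
where the enum type is a template argument instead of being fixed to Hexa.
The names enumString, validateEnumString and Dir are hypothetical, and the
constraint uses OriginalType from std.traits in place of the is() form used
in the examples of this report. It is only a sketch under those assumptions,
not a proposed final API:

```d
import std.traits : isSomeChar, EnumMembers, OriginalType;

// Same validation idea as in this report: copy the chars into a new array
// of E, failing on the first char that is not a member of E.
private E[] validateEnumString(E)(in string txt) pure nothrow
if (is(E == enum) && isSomeChar!(OriginalType!E)) {
    auto result = new E[](txt.length);

    OUTER:
    foreach (i, c; txt) {
        foreach (e; EnumMembers!E)  // unrolled over the enum members
            if (c == e) {
                result[i] = e;
                continue OUTER;
            }
        assert(false, "Not valid enum char: " ~ c);
    }

    return result;
}

// Hypothetical generic front end: the call runs in CTFE because its result
// initializes an enum, so a bad char stops the compilation.
template enumString(E, string txt) {
    enum enumString = validateEnumString!E(txt);
}

enum Hexa : char { A = 'A', B = 'B', C = 'C', D = 'D', E = 'E', F = 'F' }
enum Dir : char { N = 'N', S = 'S', E = 'E', W = 'W' }  // illustrative only

void main() {
    Hexa[6] data = enumString!(Hexa, "BACEDC");
    assert(data[0] == Hexa.B && data[5] == Hexa.C);

    auto path = enumString!(Dir, "NNEESW");
    assert(path.length == 6 && path[2] == Dir.E);

    // 'g' is not a member of Hexa, so this literal does not compile.
    static assert(!__traits(compiles,
        { enum bad = enumString!(Hexa, "BACgDC"); }));
}
```

The last static assert is the D counterpart of the Ada error shown earlier:
the invalid literal is rejected at compile time, not at run time.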

Aug 27 2012