digitalmars.D.bugs - [Issue 6176] New: [TDPL] Cannot use string variables in case expressions

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176

           Summary: [TDPL] Cannot use string variables in case expressions
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: jmdavisProg gmx.com



This program fails to compile:

import std.stdio;

void main()
{
   string foo = "foo";
   string bar = "bar";

   string mrX;

   switch(mrX)
   {
      case foo:
         writeln(foo);
         break;
      case bar:
         writeln(bar);
         break;
      default:
         writeln("who knows");
   }
}


giving this error:

prog.d(12): Error: case must be a string or an integral constant, not foo

If you change the variables to int, then it works, but it doesn't work with
strings.

According to TDPL, p. 72, "Usually case expressions are compile-time constants,
but D allows variables too and guarantees lexical-order evaluation up to the
first match." So, according to TDPL, this code should be valid, but it
currently fails.
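
For comparison, here is a minimal sketch of the variant that the compiler does accept, assuming the case values can be fixed at compile time. Declaring them with enum makes them manifest constants, which satisfies the "string or integral constant" rule quoted in the error message. The function name describe is hypothetical and exists only to make the behaviour easy to check.

```d
import std.stdio;

// Hypothetical helper; mirrors the switch from the report above.
string describe(string mrX)
{
    // enum declares manifest constants, so both values are known at compile time.
    enum foo = "foo";
    enum bar = "bar";

    switch (mrX)
    {
        case foo:
            return foo;
        case bar:
            return bar;
        default:
            return "who knows";
    }
}

void main()
{
    writeln(describe("bar"));   // prints "bar"
    writeln(describe("baz"));   // prints "who knows"
}
```

This only sidesteps the report; it does not cover the case where the strings are genuinely known only at run time.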

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 19 2011
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Stewart Gordon <smjg iname.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|rejects-valid               |spec
                 CC|                            |smjg iname.com
           Severity|normal                      |enhancement



http://www.digitalmars.com/d/1.0/statement.html#CaseStatement
"The case expressions must all evaluate to a constant value or array. They must
be implicitly convertible to the type of the switch Expression."

http://www.digitalmars.com/d/2.0/statement.html#CaseStatement
"The case expressions must all evaluate to a constant value or array, or a
runtime initialized const or immutable variable of integral type. They must be
implicitly convertible to the type of the switch Expression."

So the code is illegal.  Clearly this is a mistake in TDPL.

Though I am made to wonder why this restriction is there.  Changing to
enhancement for the meantime.

Jun 19 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc





 Though I am made to wonder why this restriction is there.
The purposes of a switch are to give an ordered syntax to manage several value
cases (final switches are able to catch some bugs too), and to compile to
efficient code, sometimes a complex mix of dispatch tables and hard-coded
search trees (and more, if the compiler is smart, doing automatically one of
the purposes of computed gotos).

I think currently DMD doesn't optimize string switches a lot, but it's not hard
to think about it using a hard-coded trie, some kind of digital tree, perfect
hashing, etc. Compile-time constants allow to create such optimized code.

See also bug 5862
Jun 19 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|spec                        |rejects-valid
           Severity|enhancement                 |normal



Not to be rude, but I'm changing it back to a bug. If TDPL says something, and
the compiler or online documentation doesn't agree, then it's a bug until
Walter decides that we're not following what TDPL says. The rule is essentially
that TDPL is always right unless Walter decides otherwise. It doesn't matter
what the online docs say except that if they contradict TDPL, then they're also
wrong until Walter decides that what TDPL says shouldn't be correct. So if
anything, at the moment, it's a bug in the spec in addition to the bug in the
compiler, not an enhancement. And remember that this _does_ currently work with
variables of type int. So, the spec doesn't match what the compiler is
currently doing anyway.

Jun 19 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Stewart Gordon <smjg iname.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |spec




 I think currently DMD doesn't optimize string
 switches a lot, but it's not hard to think about it using a hard-coded trie,
 some kind of digital tree, perfect hashing, etc. Compile-time constants allow
 to create such optimized code.
But that doesn't mean it would have to _always_ use a hard-coded tree. It goes
without saying that an optimisation can happen only if the criteria for it to
make sense are satisfied.

Let F be the overall feature being considered, and S be the subset of this
feature that can be optimised in a certain way. Why contrive F to equal S,
when you can implement a non-empty F \ S just without the optimisation?

If the case values are all constant, create this tree. Otherwise, just compare
the switched value with the cases individually.

Later on, we could improve it to use a mixture of the two approaches where some
but not all cases are CTCs.
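
As a rough illustration of the unoptimised fallback described above, comparing the switched value with each case individually amounts to an if-else chain evaluated in lexical order. This is only a sketch of the idea, not what DMD generates; the function pick and its parameters are hypothetical.

```d
import std.stdio;

// Hypothetical helper: the case values are ordinary run-time strings,
// so they are compared one by one, stopping at the first match.
string pick(string value, string foo, string bar)
{
    if (value == foo)
        return foo;
    else if (value == bar)
        return bar;
    else
        return "who knows";
}

void main()
{
    string foo = "foo";
    string bar = "bar";

    writeln(pick("bar", foo, bar));   // prints "bar"
    writeln(pick("qux", foo, bar));   // prints "who knows"
}
```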
Jun 19 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 If the case values are all constant, create this tree.  Otherwise, just compare
 the switched value with the cases individually.
This is possible, of course, it just requires a bit more complex compiler.

A problem: if one of your strings is not compile-time const, because of a
mistake of the programmer, there is a silent and invisible loss of performance.
Jun 19 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 
 If the case values are all constant, create this tree.  
 Otherwise, just compare the switched value with the cases 
 individually.
 This is possible, of course, it just requires a bit more complex compiler.
But the extra complexity is nothing compared to implementing the tree
optimisation in the first place. Moreover, ISTM for switches with only a few
values, comparing the cases individually might actually be more efficient. So
this extra complexity might actually be needed in order not to pessimise these
simpler cases.

 A problem: if one of your strings is not compile-time const, because of a
 mistake of the programmer, there is a silent and invisible loss of performance.
Which is to be expected. After all, compiler optimisation is a privilege, not a
right. Though adopting the aforementioned mixture of the two approaches would
mean that any loss of performance would be small.
Jun 20 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug yahoo.com.au



This must be a mistake in TDPL, or just poor wording. It's true that DMD does
relax the rule on compile-time strings, to include global variables which are
initialized in static this(). But it doesn't include _all_ variables.

It's pretty clear that the unqualified description in TDPL ("D allows variables
too") cannot be correct. What if it's a shared variable, for example?

Note that defining it as requiring a compile-time constant allows CTFE to be
used.
If variables are permitted, then the rules become more complicated, not
simpler.

Allowing variables would be an appallingly bad feature. It would mean that to
understand control flow in a function which contains a switch, you need to
check every 'case' statement to see if it's a variable, and then you need to
check if that variable can change from inside the function. 
That's a huge change from the existing language, where you know that the
control expression is the only thing that affects control flow.

This is far worse for code maintenance and readability than 'goto'.

Effectively, this would remove the 'switch' statement from the language. Switch
would become nothing more than syntax salt for a sequence of 'if' statements.

Jul 08 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176




Well, it works with variables which are of type int. TDPL claims that it works
with variables, and switch statements work with integral types and strings. So,
per TDPL, switch statements should work with string variables.

Now, personally, I'm very surprised that _any_ type of variable would be
permitted as the value of a case statement under any circumstances, and it
wouldn't hurt my feelings one whit if it were removed from the language
completely. I see no value in such a feature. However, TDPL is pretty clear
about allowing variables, so it certainly didn't get in there by accident (and
using integer variables _does_ currently work).

So, it's fine with me if this is declared an erratum for TDPL, but I don't
think that it's a case of TDPL being unclear.

Jul 08 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176





 Well, it works with variables which are of type int. 
Wow. I just checked, and you're right. That's disturbing.
Here's the code in statement.c:

/* This is where variables are allowed as case expressions.
 */
if (exp->op == TOKvar)
{
    VarExp *ve = (VarExp *)exp;
    VarDeclaration *v = ve->var->isVarDeclaration();
    Type *t = exp->type->toBasetype();
    if (v && (t->isintegral() || t->ty == Tclass))
    {
        /* Flag that we need to do special code generation
         * for this, i.e. generate a sequence of if-then-else
         */
        sw->hasVars = 1;
        if (sw->isFinal)
            error("case variables not allowed in final switch statements");
        goto L1;
    }
}

Note that it only allows integers AND CLASSES! I think this is a *major*
misfeature.

But you're right, and I was wrong -- this behaviour is clearly intentional for
the integer case, and it makes absolutely no sense to allow it for integers but
not for strings.

So TDPL, the spec, and DMD are all different from each other.
Jul 08 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


dawg dawgfoto.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dawg dawgfoto.de



There is definitely some value to allow this for strings, e.g.
when you want use strings from translation files or allow
user definable commands.

Jan 15 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Alex Rønne Petersen <xtzgzorex gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xtzgzorex gmail.com



I'm just going to interject here.

I don't understand why anyone sees the need to limit the switch construct in
any way. Why force it to use compile-time values? Why force it to support
primitives only?

A full-blown, generalized switch would greatly improve D's expressiveness, and
would cater to functional programmers. Functional languages have shown that
pattern matching (which is essentially just a generalized switch, or -- as I
like to call it -- switch done *right*) is extremely useful to write short and


Jan 15 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Peter Alexander <peter.alexander.au gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.alexander.au gmail.com



I added a pull request to update the documentation to align with TDPL.

https://github.com/D-Programming-Language/d-programming-language.org/pull/60

Jan 15 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 I don't understand why anyone sees the need to limit the switch construct in
 any way.
There are two very different use cases in D here. If you want to implement a
C-style finite state machine switching on an enum integer, you want the
compiler to squeeze the very last bit of performance out of the code. If you
are writing functional-style code in a part of the program that is not
performance-critical, you prefer a very flexible switch. In theory a well
implemented switch is able to work for both use cases, but compiler practice is
often different from theory, and what is good for single-instruction-conscious
code is often not the best for the other use case.

Requiring all switch cases to be compile-time constants gives some guarantees.
Sometimes you only think a value is a compile-time constant, while it is not,
and if the compiler doesn't warn you, you risk lower performance.

Strings known at compile time in theory allow the compiler to use smarter and
faster strategies to find the various cases.
Jan 15 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176





 I'm just going to interject here.
 
 I don't understand why anyone sees the need to limit the switch construct in
 any way. Why force it to use compile-time values? Why force it to support
 primitives only?
Switch statements are easy to reason about, because they are controlled by a
single expression. If the values of the cases are allowed to vary, they are no
easier to understand than a sequence of if() statements, *but* that's not what
it looks like -- it's really deceptive.

int a = 2;
for (;;)
{
    switch (7)
    {
        case a: return;
        case 7: a = 7; break;
    }
}

 A full-blown, generalized switch would greatly improve D's expressiveness, and
 would cater to functional programmers. Functional languages have shown that
 pattern matching (which is essentially just a generalized switch, or -- as I
 like to call it -- switch done *right*) is extremely useful to write short and

But functional languages don't have variables in their case statements!

In this case, it's not more expressive - it's less expressive. It's simply
syntax sugar for a sequence of if() statements. A switch statement says more:
only one side of the comparison is varying. The restriction is useful because
it allows you to think at a higher level.
Jan 16 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176




Some argumentation in favor of a dynamic switch
----
switch (receive(ch0, ch1, ch2))
{
case ch0:
    writeln(ch0.get());
    break;

case ch1:
    writeln(ch1.get());
    break;

case ch2:
    writeln(ch2.get());
    break;

default:
    // error
    break;
}
----
auto token = nextToken();
switch (token)
{
case lastToken:
    break;

case A: .. case B:

}
lastToken = token;
----
switch (str)
{
case re("[f|b]oo"):

case re("[f|b]ar"):

case re("[f|b]az"):
}
----

This can definitely become very confusing, e.g. when the
comparison has the side-effect of changing another case label.

To be complete, a dynamic case statement should be a boolean expression,
probably involving the expression being switched on,
i.e. the perfect dynamic switch is an "if-else" chain.

Jan 17 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176





 There are two very different use cases in D here.
An option is to add another kind of switch attribute:

enum switch (foo) {
    case c1: break; // all of c1, c2 must be compile-time constants
    case c2: break;
    default: break;
}
Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 There are two very different use cases in D here.
 An option is to add another kind of switch attribute:
 
 enum switch (foo) {
     case c1: break; // all of c1, c2 must be compile-time constants
     case c2: break;
     default: break;
 }
We already have an enum switch - it's called final switch. Inventing something
new and calling it enum switch will be confusing.

What would it be anyway - just an optional check for the programmer similar to
the override attribute?
Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 We already have an enum switch - it's called final switch.
The purpose of this idea is different. A final switch requires all
possibilities to be listed, and it forbids the default case.
 Inventing something new and calling it enum switch will be confusing.
I see.
 What would it be anyway - just an optional check for the programmer
 similar to the override attribute?
override will stop being optional, see issue 3836. Likewise this was not meant
to be optional, eventually.
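
To make the distinction concrete, here is a small sketch of what final switch already enforces: every member of the enum must have a case, and a default is not permitted. The enum Color and the function name are hypothetical and purely illustrative. Note that this checks exhaustiveness, which is a different property from the "all cases are compile-time constants" check proposed above.

```d
import std.stdio;

// Hypothetical enum used only for illustration.
enum Color { red, green, blue }

string name(Color c)
{
    // Every member of Color must be listed; leaving one out or adding
    // a default case is rejected at compile time.
    final switch (c)
    {
        case Color.red:
            return "red";
        case Color.green:
            return "green";
        case Color.blue:
            return "blue";
    }
}

void main()
{
    writeln(name(Color.green));   // prints "green"
}
```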
Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176




Personally, I think that the simplest and best solution is to just restrict
case statements to compile-time constants like every language does. I agree with Don
that this feature is a misfeature. We already have if-else-if chains for the
general case.

Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176





 Personally, I think that simplest and best solution is to just restrict case
 statements to compile-time constants like every language does.
I agree :-)
Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrovich gmail.com



Switch allows more funky code:

import std.stdio;
string a() { return "a"; }
void main()
{
    switch ("a") {
        case a():
            writeln("true");
        default:
    }
}

and:

import std.stdio;
int foo() { return 1; }
void main()
{
    int x;
    switch (x = foo())
    {
        default:
    }
}

`if (x = foo())` can't work, so I don't know why `switch (x = foo())` can.

Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176




 `if (x = foo())` can't work, so I don't know why `switch (x = foo())` can.
That's easy. With if, there's a strong possibility that the programmer really
meant to use ==. So, by disallowing = by itself, you avoid those bugs (though
it would certainly be nice to be able to do if(x = foo()) - gcc allows it
without complaining if you add extra parens (though IIRC Visual Studio doesn't
like it) - if((x = foo())) - but I don't think that D has anything of the
sort).

However, the switch statement requires a value rather than a condition, so the
risk of = being used instead of == is pretty much zero. So, disallowing it for
switch doesn't really benefit anyone.
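
For reference, a short sketch of two forms that D does accept when assigning and testing in one step: an explicit comparison around the assignment, and a variable declared inside the condition. The function check is hypothetical; foo is the function from the example being discussed.

```d
import std.stdio;

int foo() { return 1; }

// Hypothetical helper showing both accepted forms.
string check()
{
    int x;

    // if (x = foo()) { }   // rejected: a bare assignment is not allowed as a condition

    // An explicit comparison makes the intent unambiguous.
    if ((x = foo()) != 0)
    {
        // Declaring the variable in the condition is accepted as well.
        if (auto y = foo())
            return "nonzero";
    }
    return "zero";
}

void main()
{
    writeln(check());   // prints "nonzero"
}
```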
Jan 18 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176





 What would it be anyway - just an optional check for the programmer
 similar to the override attribute?
 override will stop being optional, see issue 3836. Likewise this was not meant
 to be optional, eventually.
This wouldn't make sense - why should I be forced to add something just to show
I know that all the case values are compile-time constants?
Jan 18 2012
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6176






 This wouldn't make sense - why should I be forced to add something just to show
 I know that all the case values are compile-time constants?
Let's assume in one case you want the compiler to produce a very efficient
switch, maybe because you are writing the main loop of a little interpreter. In
this case you don't want one of the cases to be on a runtime value _by your
mistake_, because this may break this compiler optimization, forcing a less
efficient compilation of the switch. So to be sure you are not making such
mistakes, you add an annotation to the switch, and the compiler catches your
mistakes.

In the end I agree this is probably not necessary, and it's better for switch
cases to be required to always be compile-time constants, losing a bit of
switch flexibility.
Jan 18 2012