D - Switch Statement Optimisation

J Anderson (49/49) Dec 11 2003 What type of optimisations does the compiler do for switch statements?

Benji Smith (7/12) Dec 11 2003 I always sort my switch cases by most-likely-first (to the best of my

J Anderson (47/64) Dec 11 2003 If you knew it was likely to be binary you could put the numbers for the...
Ilya Minkov (18/20) Dec 12 2003 I think this is wrong. As far as my experience with Delphi goes, this

Sean L. Palmer (6/26) Dec 12 2003 The compiler should be able to sort your cases for you. Although this i...

J Anderson (5/44) Dec 12 2003 There was this software product I saw advertised for C++ (I've forgotten...
Walter (6/9) Dec 12 2003 Yes. There is no reason for the user to do an ordinal sort, that kind of

Walter (9/17) Dec 12 2003 With Digital Mars compilers, you're better off putting sorting by most u...

Ilya Minkov (12/13) Dec 12 2003 Yup, first look is quite interesting. I'll have to take time some other

J Anderson (4/17) Dec 12 2003 I guess, case-statement optimisation is a well-threshed-out field, and

Walter (5/7) Dec 12 2003 I could do better code generation for some special cases, but the 3

J Anderson <REMOVEanderson badmama.com.au> writes:

What type of optimisations does the compiler do for switch statements?

For example if I had something like:
switch (val)
{
case 1:
�
break;
case 2:
�
break;
case 3:
�
break;
case 4:
�
break;
case 5:
�
break;
}

Could the compiler change this to something like.

If (val >= 1 && val <= 5)
(caseposfunc + val - 1)(); //Ie val is the offset to the function
else
default;


Also if I had:

switch (val)
{
case �Hello�:
�
break;
case �Goodbye�:
...
break;
case �Something else�
...
break;
case �Another thing�;
...
break;
ect�
}


Could the compiler sort the definitions and use a binary lookup (ie 
log(n)) to find a matching entry (if any).

Obviously if it did there would be some kinda threshold where a load of 
comparisons would be more efficient. Also, I guess that in some rare 
cases, this would be less efficient when users sort their cases 
statements by-most-likely first.

-Anderson

Dec 11 2003

Benji Smith <dlanguage xxagg.com> writes:

On Fri, 12 Dec 2003 06:45:28 +0800, J Anderson
<REMOVEanderson badmama.com.au> wrote:

Obviously if it did there would be some kinda threshold where a load of 
comparisons would be more efficient. Also, I guess that in some rare 
cases, this would be less efficient when users sort their cases 
statements by-most-likely first.

-Anderson

I always sort my switch cases by most-likely-first (to the best of my
design-time knowledge). If I think the switch cases will be a
bottleneck and I'm not entirely sure of which case is the most likely,
I'll benchmark a few different case orderings.

--Benji

Dec 11 2003

J Anderson <REMOVEanderson badmama.com.au> writes:

Benji Smith wrote:

On Fri, 12 Dec 2003 06:45:28 +0800, J Anderson
<REMOVEanderson badmama.com.au> wrote:

  

Obviously if it did there would be some kinda threshold where a load of 
comparisons would be more efficient. Also, I guess that in some rare 
cases, this would be less efficient when users sort their cases 
statements by-most-likely first.

-Anderson
    

I always sort my switch cases by most-likely-first (to the best of my
design-time knowledge). If I think the switch cases will be a
bottleneck and I'm not entirely sure of which case is the most likely,
I'll benchmark a few different case orderings.

--Benji
  

If you knew it was likely to be binary you could put the numbers for the 
most-likely first at the middle points.

Also, if the compiler had something like:

switch (val)
{
case "Something X":
...
case "Something Y":
...
case "Something Z":
case "Other X":
case "Other Y":
case "Other Z":
}

A dictionary-binary tree could be used, so that each letter (4 letters) 
only needs to be looked at once. ie

if (val[0..4] == "Some")
{
    if (val[4..10] == "thing ")
    {
       if (val[10] == "X")
       ...
       else if (val[10] == "Y")
       ...
       else if (val[10] == "Z")
       ...
       else
          default;
    }
    else
       default;
}
else if (val[0..6] == "Other ")
{
       if (val[6] == "X")
       ...
       else if (val[6] == "Y")
       ...
       else if (val[6] == "Z")
       ...
       else
          default;
}
else
    default;

Of course the asm version would be even more efficient.

Dec 11 2003

Ilya Minkov <minkov cs.tum.edu> writes:

Benji Smith wrote:
 I always sort my switch cases by most-likely-first (to the best of my
 design-time knowledge).

I think this is wrong. As far as my experience with Delphi goes, this
leads to the worst performance. You should generally sort your labels
so, that the values are in their ordinal sorting order. This allows the
compiler to write no more than 2 IFs and one jump adress table for each
nearly consecutive region of values.

If you have something like 1, 2, 3, 4, 5, then 1000, 1001, 1002, 1003,
you have 2 distinct groups of values. Within the group, values must be
sorted in their order for the best performance, but you can decide which
group you want to process first, since this might actually done with
IFs. Though this probably doesn't matter for 2 regions, it might
starting with 3... but as you can guess the impact is minimal.

Of course, with enums you can not always have the order in your memory,
so i would expect the compiler to reorder the cases automatically.

-eye

EDIT. If there is an optimal bit-matcher, order of the groups should not 
mater as well.

-eye

Dec 12 2003

"Sean L. Palmer" <palmer.sean verizon.net> writes:

The compiler should be able to sort your cases for you.  Although this is
made a bit more difficult by fallthrough.  But the compiler probably won't
be able to tell which cases will be hit most frequently.

Sean

"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:brcrdv$2t9g$2 digitaldaemon.com...
 Benji Smith wrote:
 I always sort my switch cases by most-likely-first (to the best of my
 design-time knowledge).

 I think this is wrong. As far as my experience with Delphi goes, this
 leads to the worst performance. You should generally sort your labels
 so, that the values are in their ordinal sorting order. This allows the
 compiler to write no more than 2 IFs and one jump adress table for each
 nearly consecutive region of values.

 If you have something like 1, 2, 3, 4, 5, then 1000, 1001, 1002, 1003,
 you have 2 distinct groups of values. Within the group, values must be
 sorted in their order for the best performance, but you can decide which
 group you want to process first, since this might actually done with
 IFs. Though this probably doesn't matter for 2 regions, it might
 starting with 3... but as you can guess the impact is minimal.

 Of course, with enums you can not always have the order in your memory,
 so i would expect the compiler to reorder the cases automatically.

 -eye

 EDIT. If there is an optimal bit-matcher, order of the groups should not
 mater as well.

 -eye

Dec 12 2003

J Anderson <REMOVEanderson badmama.com.au> writes:

Sean L. Palmer wrote:

The compiler should be able to sort your cases for you.  Although this is
made a bit more difficult by fallthrough.  But the compiler probably won't
be able to tell which cases will be hit most frequently.

Sean
  

There was this software product I saw advertised for C++ (I've forgotten 
the name) that could benchmark your code and then optimise functions, 
but most-used-first.   I guess the same technique could apply to switch 
statements.  Although you can't predict every run, it could help a bit.

"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:brcrdv$2t9g$2 digitaldaemon.com...
  

Benji Smith wrote:
    

I always sort my switch cases by most-likely-first (to the best of my
design-time knowledge).
      

I think this is wrong. As far as my experience with Delphi goes, this
leads to the worst performance. You should generally sort your labels
so, that the values are in their ordinal sorting order. This allows the
compiler to write no more than 2 IFs and one jump adress table for each
nearly consecutive region of values.

If you have something like 1, 2, 3, 4, 5, then 1000, 1001, 1002, 1003,
you have 2 distinct groups of values. Within the group, values must be
sorted in their order for the best performance, but you can decide which
group you want to process first, since this might actually done with
IFs. Though this probably doesn't matter for 2 regions, it might
starting with 3... but as you can guess the impact is minimal.

Of course, with enums you can not always have the order in your memory,
so i would expect the compiler to reorder the cases automatically.

-eye

EDIT. If there is an optimal bit-matcher, order of the groups should not
mater as well.

-eye

Dec 12 2003

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:brd07g$3l1$1 digitaldaemon.com...
 The compiler should be able to sort your cases for you.  Although this is
 made a bit more difficult by fallthrough.  But the compiler probably won't
 be able to tell which cases will be hit most frequently.

Yes. There is no reason for the user to do an ordinal sort, that kind of
thing compilers are good at. And since the compiler cannot know which are
the most frequent case, having the user sort them by
most-used-appears-first, works pretty good in practice.

Dec 12 2003

"Walter" <walter digitalmars.com> writes:

"Ilya Minkov" <minkov cs.tum.edu> wrote in message
news:brcqlo$2s89$1 digitaldaemon.com...
 Benji Smith wrote:
 I always sort my switch cases by most-likely-first (to the best of my
 design-time knowledge).

 I think this is wrong. As far as my experience with Delphi goes, this
 leads to the worst performance. You should generally sort your labels
 so, that the values are in their ordinal sorting order. This allows the
 compiler to write no more than 2 IFs and one jump adress table for each
 nearly consecutive region of values.

With Digital Mars compilers, you're better off putting sorting by most used
first. 3 kinds of code are generated for switches, depending on the
population of case values:

1) sequence of cmp-je
2) index case value into a jump table
3) search for case value in table, use resulting index into jump table

As always, you can verify what is being done by using obj2asm.

Dec 12 2003

Ilya Minkov <minkov cs.tum.edu> writes:

Walter wrote:

 As always, you can verify what is being done by using obj2asm.

Yup, first look is quite interesting. I'll have to take time some other 
day, i've got an importand math test tomorrow. I want to propose yet 
another strategy. When matching on string hashes or pointers, or other 
nearly-random, loosely populated data, you can almost always find and 
cut out a few bits which are always different, and use them directly as 
indexes into a jump table. The table would only have to be somewhat 
larget than minimal, up to 50% (worst case), generally about the half of 
that. The next power of 2 in size, that is.

If it doesn't quite work in some certain case, you can stack up 2 cases 
or fallback to the old strategies.

-eye

Dec 12 2003

J Anderson <REMOVEanderson badmama.com.au> writes:

Ilya Minkov wrote:

 Walter wrote:

 As always, you can verify what is being done by using obj2asm.


 Yup, first look is quite interesting. I'll have to take time some 
 other day, i've got an importand math test tomorrow. I want to propose 
 yet another strategy. When matching on string hashes or pointers, or 
 other nearly-random, loosely populated data, you can almost always 
 find and cut out a few bits which are always different, and use them 
 directly as indexes into a jump table. The table would only have to be 
 somewhat larget than minimal, up to 50% (worst case), generally about 
 the half of that. The next power of 2 in size, that is.

 If it doesn't quite work in some certain case, you can stack up 2 
 cases or fallback to the old strategies.

 -eye

I guess, case-statement optimisation is a well-threshed-out field, and 
Walter would be using the best technique known.

-Anderson

Dec 12 2003

"Walter" <walter digitalmars.com> writes:

"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:brdp7l$18gf$2 digitaldaemon.com...
 I guess, case-statement optimisation is a well-threshed-out field, and
 Walter would be using the best technique known.

I could do better code generation for some special cases, but the 3
strategies used cover the territory effectively enough that it just never
has been a performance issue.

Dec 12 2003

D Programming

C/C++ Programming

Other

D - Switch Statement Optimisation