digitalmars.D.bugs - [Issue 4115] New: Reading few CPU flags from D code

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4115

           Summary: Reading few CPU flags from D code
           Product: D
           Version: future
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



Delphi has ranged integral types, which increase program safety by restricting
a variable to a subrange. In D, a struct template can be created to implement a
ranged integral value:

Ranged!(1, 1001, int) foo;
alias Ranged!('a', 'z'+1, char) Lowercase;

(The underlying type used by the struct can be omitted, so for a range in
[1, 1000] it can choose an int.)
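
A minimal sketch of what such a struct could look like, to make the idea
concrete. Everything here is an assumption for illustration: the parameter
order (lower bound, exclusive upper bound, optional type) follows the two lines
above, while the member names, the default type of int and the use of assert
for the checks are hypothetical. It only handles underlying types no wider
than int, since it widens to long to detect overflow.

```d
// Hypothetical sketch: a value of type T restricted to the range [lo, hi).
struct Ranged(long lo, long hi, T = int)
{
    static assert(lo < hi, "Ranged: empty range");
    static assert(lo >= T.min && hi - 1 <= T.max, "Ranged: range does not fit in T");

    private T value = cast(T) lo;

    this(T v) { opAssign(v); }

    // Assignment checks the range; assert is compiled out in release builds.
    void opAssign(T v)
    {
        assert(v >= lo && v < hi, "Ranged: value out of range");
        value = v;
    }

    // + and - are evaluated in long, so the intermediate result cannot wrap
    // for T up to int; then both overflow of T and the range are checked at once.
    Ranged opBinary(string op)(T rhs) const if (op == "+" || op == "-")
    {
        long r = mixin("cast(long) value " ~ op ~ " rhs");
        assert(r >= lo && r < hi, "Ranged: result out of range");
        Ranged res;
        res.value = cast(T) r;
        return res;
    }

    T get() const { return value; }
    alias get this;
}

unittest
{
    Ranged!(1, 1001) foo = 10;   // underlying type defaults to int
    auto bar = foo + 5;
    assert(bar.get == 15);
    foo = 1000;                  // largest allowed value
    assert(foo.get == 1000);
}
```

Widening to long is the easy way out; it is exactly what stops working once
the underlying type is already the widest one available.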

See a similar idea in C++:
http://www.richherrick.com/software/herrick_library.html

Multiplications are much less common on ranged variables; +, -, == and
assignment are the most common operations performed on them.

The preconditions of the methods of that struct can test for the out-of-range
conditions. In release mode they get removed (or I can use a debug statement).
But it's better to keep those tests when possible, so I'd like Ranged to
be efficient.

Delphi ranges are also fast because the compiler can remove some unnecessary
checks. I can't do this in a simple way (template expressions are overkill
here). The struct has to test both for out-of-range values and for true
overflows of the int/ubyte/etc. it is implemented on.

But there is no good solution in D because:
- Checking for overflow in D with no inline assembly can be a little slow.
- Modern programmers know assembly less well than in the past.
- Asm is more error-prone.
- Asm is less portable than D code.
- dmd (or LDC with no LDC extensions) doesn't inline functions and struct
methods that contain asm code.
- And the prologue and epilogue of the asm code may cancel any performance
improvement gained by reading the overflow bit from asm.
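
For reference, this is roughly what checking for overflow with no inline
assembly looks like. It is only a sketch: the function names are hypothetical,
and it covers signed addition only. The first version widens to long; the
second has no wider type to use, so it compares sign bits instead. Both cost
extra instructions compared with a single jump on the overflow flag.

```d
// Hypothetical helper: 32-bit signed add, overflow detected by widening.
bool addOverflowsInt(int a, int b, out int sum)
{
    long wide = cast(long) a + b;   // exact, cannot overflow in 64 bits
    sum = cast(int) wide;           // wrapped 32-bit result
    return wide < int.min || wide > int.max;
}

// Hypothetical helper: 64-bit signed add, overflow detected from sign bits.
bool addOverflowsLong(long a, long b, out long sum)
{
    sum = cast(long)(cast(ulong) a + cast(ulong) b);   // wrapping add
    // Overflow happened if and only if both operands have the same sign
    // and the sign of the result differs from it.
    return ((a ^ sum) & (b ^ sum)) < 0;
}

unittest
{
    int s;
    assert(!addOverflowsInt(1, 2, s) && s == 3);
    assert(addOverflowsInt(int.max, 1, s));

    long t;
    assert(!addOverflowsLong(-5, 3, t) && t == -2);
    assert(addOverflowsLong(long.max, 1, t));
}
```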

A solution is to make the backend smarter, so it recognizes patterns in the
code and compiles them into good asm, but LLVM doesn't perform well here yet:

http://llvm.org/bugs/show_bug.cgi?id=4916
http://llvm.org/bugs/show_bug.cgi?id=4917
http://llvm.org/bugs/show_bug.cgi?id=4918

Even if/when LLVM implements those tiny optimizations, that's not a full
solution, because the bad thing about compiler optimizations is that you can't
rely on them.

A solution that I think is better, is portable across many CPU types, and
gives good performance is to add ways to read the contents of the Overflow,
Zero and Carry flags to std.intrinsic. (CPUs aren't forced to have all those
flags, but they are common, and the compiler can map the requested semantics
to the correct asm instructions for each CPU.)

A simple way to implement it is to turn them into boolean functions that the
compiler manages in a special way, like the other intrinsics:

bool over = overflow_flag();
if (carry_flag()) {...} else {...}

Then the compiler has to manage them efficiently (for example, here using a
single JNO or JO instruction) and inline functions that contain such
intrinsics.

Unlike the other intrinsics I have given them a semantic name, instead of the
name of the asm instruction, so the D compiler can use the right instruction
from different CPUs, increasing their portability.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 21 2010
parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4115


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |WONTFIX



09:18:18 PDT ---
The trouble with the idea of, say:

   a = b + 2;
   c = carry();

where carry() is a compiler intrinsic that reads the CPU carry flag, is that
there's nothing that says the previous statement even sets the carry flag in
any consistent manner. For example:

   a = b + 1;

may be implemented with an INC instruction, which does not set the carry flag
on overflow. Many optimizations may transform the code to not set the carry
flag, or to have the carry flag set by some other operation.

While this idea looks portable, it would be completely non-portable in
practice. Even its behavior with one compiler can arbitrarily depend on the
code mix surrounding it, and be highly sensitive to any changes in it in ways
that would be impractical for the user to track.

In its proposed form, the idea is not workable.
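
One shape that sidesteps the ordering problem described above is to have the
arithmetic operation itself report overflow, so the result and the flag come
from the same call and the compiler cannot separate them. A minimal sketch,
assuming druntime's core.checkedint module, which is not mentioned in this
thread; the wrapper name addTwo is hypothetical.

```d
import core.checkedint : adds;

// Computes b + 2 and reports signed overflow through the same call.
int addTwo(int b, out bool overflowed)
{
    // `out` resets overflowed to false on entry; adds only ever sets the
    // flag to true on overflow and never clears it.
    return adds(b, 2, overflowed);
}

unittest
{
    bool o;
    assert(addTwo(5, o) == 7 && !o);
    addTwo(int.max, o);
    assert(o);
}
```

Whether this compiles down to a single add followed by a jump on overflow is
still up to the compiler, but the semantics no longer depend on which
instruction it picks for the addition.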

Apr 28 2010