digitalmars.D.bugs - [Issue 3760] New: Allow std.math pure function to be used in array operations.


d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=3760

           Summary: Allow std.math pure function to be used in array
                    operations.
           Product: D
           Version: 2.041
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: baryluk smp.if.uj.edu.pl



10:22:37 PST ---
It would be good to be able to write something like:
  a[] = sin(b[]);

to apply sin to each element of b.

Or more complicated formulas, like:
  a[] += sin(a[] * b[] + 0.1*x) - x*a[];

I propose that such expressions be supported for all relevant functions in
std.math (cos, sin, tan, exp, log, ...).

I also propose an "@arrayoperation" property for any custom pure function
T f(T x) pure, which would be equivalent to implicitly implementing:

T[] f(T[] x) pure nothrow {
 T[] r = new T[x.length];
 foreach (i, ref y; r) { y = f(x[i]); }
 return r;
}

which would also be picked up and called automatically by the compiler in
array operation expressions.

Functions of two or more arguments in std.math, such as pow, also need some
thought. For such functions (also pure) I think the implicit implementation
should be:

T[] f(T[] a, T[] b) pure nothrow {
 // a and b are assumed to have the same length
 T[] r = new T[a.length];
 foreach (i, ref y; r) { y = f(a[i],b[i]); }
 return r;
}


Of course, the temporary array r would not be created when f() is part of a
larger array operation.
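
As a rough illustration of the "no temporary" case, here is a minimal sketch
of what the compiler could lower a[] = sin(b[]) to. The helper name applyInto
is hypothetical (it is not part of Phobos or of this proposal); it simply
writes f(src[i]) straight into a destination supplied by the caller.

```d
import std.math : abs, sin;

// Hypothetical helper, not part of Phobos: applies the scalar function f
// element by element and stores the results directly in dst, so no
// temporary array is allocated.
void applyInto(alias f, T)(T[] dst, const(T)[] src)
{
    assert(dst.length == src.length, "length mismatch");
    foreach (i, ref y; dst)
        y = f(src[i]);
}

void main()
{
    double[] b = [0.0, 0.5, 1.0];
    auto a = new double[b.length];

    // Roughly what a[] = sin(b[]) would mean under the proposal.
    applyInto!sin(a, b);

    foreach (i; 0 .. b.length)
        assert(abs(a[i] - sin(b[i])) < 1e-12);
}
```

This is still a plain scalar loop, so it only shows the semantics, not any
speedup.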

The rationale is that modern processors have SSE instructions which could
perform up to 4 mathematical operations in parallel (like sin, cos, exp, log,
pow). One of the reasons for array operations is the possibility of
implementing them in this (efficient) way.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------

Jan 31 2010

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=3760




 Rationale for this is that modern processors have SSE instructions which
 could perform up to 4 mathematical operations in parallel (like sin, cos, exp,
 log, pow).

Not really. The x87 has built-in functions for trig and exponential functions, 
but SSE doesn't.
It's pretty hard to make them more efficient than calling a loop on each 
element individually. If you only need approximate values, it's possible to get 
a modest speedup, but if you need full accuracy, it's tough.
Essentially because you can't have any branch instructions in the calculation, 
and working around this quickly chews up the 4-at-a-time benefit.

You'd do this for syntax sugar, not for performance.


Feb 04 2010

d-bugmail puremagic.com writes:

http://d.puremagic.com/issues/show_bug.cgi?id=3760




15:54:00 PST ---

 Rationale for this is that modern processors have SSE instructions which
 could perform up to 4 mathematical operations in parallel (like sin, cos, exp,
 log, pow).

 Not really. The x87 has built-in functions for trig and exponential functions,
 but SSE doesn't.
 It's pretty hard to make them more efficient than calling a loop on each
 element individually. If you only need approximate values, it's possible to get
 a modest speedup, but if you need full accuracy, it's tough.
 Essentially because you can't have any branch instructions in the calculation,
 and working around this quickly chews up the 4-at-a-time benefit.

OK, you are right. But there are CPUs with transcendental functions,
like AltiVec and Cell. Larrabee was also supposed to have them.

About approximate values, you are right. But such approximate versions of,
e.g., sin cannot be used in general, because they will either not be precise
enough, be too slow, or not fully conform to IEEE 754.

It would be better to be able to write custom functions, as in my example.
But my example did not give any performance benefit.

One can write approximated_sin(float x).
An interesting question is how to provide a vectorized version of it
for array operations.

An implicit approximated_sin(float[] x) is not automatically vectorizable.

Evaluating approximated_sin 4 ways in parallel, with x automatically changed
to float[4], would not work well either: even though the function is pure
nothrow, it can still, e.g., execute conditional instructions and
variable-length loops.

Normally such problems are solved using masking in SSE registers, but that
has to be done manually by the programmer.

Maybe approximated_sin(size_t N)(float[N] x)?

with N = 4 in most cases.

Problems (and possible solutions) that remain:
 - portability - approximated_sin!(4) would use platform-specific ways of using
SSE (preferably via intrinsics, so that registers need not be allocated by
hand).
 - alignment - the compiler would call non-vectorized functions
(approximated_sin!(1)? or just approximated_sin) at the boundaries of the
arrays, so that the rest of the array operations and function calls are
properly aligned.
 - conditionals - allowed, but should not be used.
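
To make the idea concrete, here is a minimal sketch of the N-wide form plus
the boundary handling. Everything in it is an assumption for illustration:
the names approximatedSin and applyApproxSin are hypothetical, the kernel is
a short Taylor polynomial that is only reasonable for small |x| (it is not
IEEE 754 accurate), and the "vector" is a plain float[N] rather than real
SSE intrinsics. It ignores memory alignment and only shows the split into
full blocks and a 1-wide tail.

```d
import std.math : abs, sin;

// Hypothetical N-wide kernel: x - x^3/6 + x^5/120 - x^7/5040 in Horner form.
// No conditionals, so every lane does exactly the same work.
float[N] approximatedSin(size_t N)(float[N] x)
{
    float[N] r;
    foreach (i; 0 .. N)
    {
        immutable x2 = x[i] * x[i];
        r[i] = x[i] * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f * (1.0f - x2 / 42.0f)));
    }
    return r;
}

// Hypothetical driver: full blocks go through the 4-wide kernel, the
// leftover elements at the end of the array go through the 1-wide kernel.
void applyApproxSin(float[] dst, const(float)[] src)
{
    assert(dst.length == src.length, "length mismatch");
    enum size_t N = 4;
    size_t i = 0;

    for (; i + N <= src.length; i += N)
    {
        float[N] block;
        block[] = src[i .. i + N];
        float[N] res = approximatedSin!N(block);
        dst[i .. i + N] = res[];
    }

    for (; i < src.length; ++i)
    {
        float[1] one = [src[i]];
        float[1] res = approximatedSin!1(one);
        dst[i] = res[0];
    }
}

void main()
{
    float[] b = [0.0f, 0.1f, 0.2f, 0.3f, 0.4f, 0.5f]; // one block of 4 + tail of 2
    auto a = new float[b.length];
    applyApproxSin(a, b);

    foreach (i; 0 .. b.length)
        assert(abs(a[i] - sin(b[i])) < 1e-5f);
}
```

Whether the 4-wide loop actually becomes SSE code is up to the compiler and
backend; the sketch only fixes the calling convention.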




 
 You'd do this for syntax sugar, not for performance.

Yes. Array operations are a nice syntax, and leave room for speeding up
computations.
Even just as syntactic sugar, extending these expressions is already
worthwhile (they currently don't use SSE anyway).


Seeing these problems now, I see that the SSE argument isn't so simple. But it
is still useful to extend array operations. The SSE issues need to be addressed
and, maybe after discussion, solved in the future.

It is also possible to parse the expression at compile time (using a real
parser, or with the help of the compiler/templates/types) and emit a mixin
with the proper code. But this raises other questions:
 - why should library code need to do the same thing as the compiler?
 - why would we then need array operations at all?
 - mixins aren't as transparent to the user as "macros" could be (which we
don't have yet).
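
A very reduced sketch of the mixin route, to show what the user would have to
write. It skips the parser entirely: the hypothetical helper elementwiseLoop
just pastes a per-element expression (already written in terms of the index
i) into a loop, and the result is compiled with a string mixin.

```d
import std.math : abs, sin;

// Hypothetical helper, evaluated at compile time: returns the source text
// of a loop that assigns expr to every element of the array named dst.
string elementwiseLoop(string dst, string expr)
{
    return "foreach (i; 0 .. " ~ dst ~ ".length) { "
         ~ dst ~ "[i] = " ~ expr ~ "; }";
}

void main()
{
    double x = 2.0;
    double[] a = [0.1, 0.2, 0.3];
    double[] b = [1.0, 2.0, 3.0];

    // Reference result computed with an ordinary loop.
    auto expected = a.dup;
    foreach (k; 0 .. a.length)
        expected[k] += sin(a[k] * b[k] + 0.1 * x) - x * a[k];

    // Stands in for: a[] += sin(a[] * b[] + 0.1*x) - x*a[];
    mixin(elementwiseLoop("a", "a[i] + sin(a[i] * b[i] + 0.1 * x) - x * a[i]"));

    foreach (k; 0 .. a.length)
        assert(abs(a[k] - expected[k]) < 1e-12);
}
```

The expression has to be passed as a string and spelled out with explicit
indices, which is the lack of transparency mentioned in the last point.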


Feb 04 2010
