
digitalmars.D.bugs - [Issue 3760] New: Allow std.math pure function to be used in array operations.

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3760

           Summary: Allow std.math pure function to be used in array
                    operations.
           Product: D
           Version: 2.041
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: baryluk smp.if.uj.edu.pl


--- Comment #0 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-01-31
10:22:37 PST ---
It would be good to be able to write something like:
  a[] = sin(b[]);

to apply the sin function to each element of b.

Or more complicated formulas, like:
  a[] += sin(a[] * b[] + 0.1*x) - x*a[];

I propose that such expressions be supported for all relevant functions in
std.math (cos, sin, tan, exp, log, ...).

I also propose a property "@arrayoperation" for any custom pure function of
the form T f(T x) pure, which would be equivalent to implicitly implementing:

T[] f(T[] x) pure nothrow {
 T[] r = new T[x.length];
 foreach (i, ref y; r) { y = f(x[i]); }
 return r;
}

The compiler would then also use this overload automatically in array
operation expressions.
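
A minimal sketch of how the proposed lifting can be emulated in library code
today. The helper name arrayOp and the example function square are
hypothetical; nothing here is part of Phobos or of the proposed compiler
feature, and unlike the proposal this version always allocates the result.

```d
import std.math : sin;

/// Hypothetical helper (not part of Phobos): applies the scalar function `f`
/// to every element of `x` and returns a newly allocated result array.
T[] arrayOp(alias f, T)(const(T)[] x)
{
    auto r = new T[x.length];
    foreach (i, ref y; r)
        y = f(x[i]);
    return r;
}

/// Example of a user-defined pure scalar function (an assumption for the demo).
double square(double v) pure nothrow
{
    return v * v;
}

unittest
{
    double[] b = [0.0, 0.5, 1.0];
    double[] a = arrayOp!sin(b);     // roughly what `a[] = sin(b[]);` would mean
    double[] c = arrayOp!square(b);  // the same helper works for custom functions
    assert(a.length == b.length);
    assert(c == [0.0, 0.25, 1.0]);
}
```

Passing the function as an alias parameter keeps the helper generic, at the
cost of an explicit arrayOp!f(...) call instead of the proposed a[] = f(b[])
syntax.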

Functions in std.math that take two or more arguments, like pow, also need
some thought. For such functions (also pure), I think they should be
implemented as:


T[] f(T[] a, T[] b) pure nothrow {
 T[] r = new T[a.length];  // a and b are assumed to have equal length
 foreach (i, ref y; r) { y = f(a[i],b[i]); }
 return r;
}


Of course, the temporary array r would not be created when f() is part of a
larger array operation.
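
To illustrate what "no temporary array" means, here is a hand-written sketch
of the loop that the second example expression from this report could lower
to. The function name fusedUpdate is hypothetical, and this is only one
plausible lowering, not what the compiler actually does.

```d
import std.math : sin;

/// Hand-written equivalent of the proposed expression
///   a[] += sin(a[] * b[] + 0.1*x) - x*a[];
/// evaluated element by element, without allocating any temporary array.
void fusedUpdate(double[] a, const(double)[] b, double x)
{
    assert(a.length == b.length, "operands must have equal length");
    foreach (i, ref ai; a)
    {
        // The right-hand side reads the old value of a[i] before the update.
        ai += sin(ai * b[i] + 0.1 * x) - x * ai;
    }
}

unittest
{
    import std.math : abs;
    double[] a = [1.0];
    fusedUpdate(a, [2.0], 0.5);
    // 1 + sin(1*2 + 0.05) - 0.5*1
    assert(abs(a[0] - (0.5 + sin(2.05))) < 1e-12);
}
```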

The rationale for this is that modern processors have SSE instructions which
can perform up to 4 mathematical operations in parallel (like sin, cos, exp,
log, pow). One of the reasons for having array operations is the possibility
of implementing them in this (efficient) way.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 31 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3760



--- Comment #1 from Don <clugdbug yahoo.com.au> 2010-02-04 05:16:00 PST ---
> Rationale for this is that modern processors have SSE instructions which
> could perform up to 4 mathematical operations in parallel (like sin, cos,
> exp, log, pow).

Not really. The x87 has built-in functions for trig and exponential
functions, but SSE doesn't. It's pretty hard to make them more efficient than
calling a loop on each element individually.

If you only need approximate values, it's possible to get a modest speedup,
but if you need full accuracy, it's tough. Essentially because you can't have
any branch instructions in the calculation, and working around this quickly
chews up the 4-at-a-time benefit.

You'd do this for syntax sugar, not for performance.
Feb 04 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=3760



--- Comment #2 from Witold Baryluk <baryluk smp.if.uj.edu.pl> 2010-02-04
15:54:00 PST ---
(In reply to comment #1)
> > Rationale for this is that modern processors have SSE instructions which
> > could perform up to 4 mathematical operations in parallel (like sin, cos,
> > exp, log, pow).
>
> Not really. The x87 has built-in functions for trig and exponential
> functions, but SSE doesn't. It's pretty hard to make them more efficient
> than calling a loop on each element individually.
>
> If you only need approximate values, it's possible to get a modest speedup,
> but if you need full accuracy, it's tough. Essentially because you can't
> have any branch instructions in the calculation, and working around this
> quickly chews up the 4-at-a-time benefit.

OK, you are right. But there are CPUs with transcendental functions, like
AltiVec and Cell. Larrabee was also supposed to have them.

You are right about approximate values. But such approximate versions of,
e.g., sin cannot be used, because they will either not be precise enough, be
too slow, or not fully conform to IEEE 754. It would be better to be able to
write custom functions, as in my example. But my example did not give any
performance benefit.

One can write approximated_sin(float x). An interesting question is how to
provide a vectorized version of it for array operations. An implicit
approximated_sin(float[] x) is not automatically vectorizable. Performing a
4-way parallel evaluation of approximated_sin, with x automatically changed to
float[4], would not work well either: even though the function is pure
nothrow, it can still, e.g., execute conditional instructions and
variable-length loops. Normally such problems are resolved using masking in
SSE registers, but that has to be done manually by the programmer.

Maybe approximated_sin(N)(float[N] x), with N=4 in most cases?

Problems (and possible solutions) that remain:
- portability: approximated_sin!(4) will use platform-specific ways of using
  SSE (preferably via intrinsics, to avoid allocating registers by hand).
- alignment: the compiler will call the non-vectorized function
  (approximated_sin!(1)? or just approximated_sin) at the boundaries of the
  arrays, so that the rest of the array operations and function calls are
  aligned properly.
- conditionals: allowed, but should not be used.
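
A rough sketch of the approximated_sin(N)(float[N] x) idea, written in
current D naming style. Everything here is an assumption made for
illustration: the names approximatedSin and applyApproximatedSin, the choice
of a truncated Taylor polynomial (only reasonable for small |x|, with no
range reduction), and the chunking driver. It uses no SSE intrinsics and does
not handle alignment; it only shows a branch-free kernel over float[N] plus a
scalar (N=1) fallback for the leftover elements at the end of the array.

```d
/// Hypothetical branch-free kernel: truncated Taylor series
/// x - x^3/6 + x^5/120. No conditionals, so each lane does the same work.
float[N] approximatedSin(size_t N)(float[N] x) pure nothrow
{
    float[N] r;
    foreach (i; 0 .. N)
    {
        immutable x2 = x[i] * x[i];
        r[i] = x[i] * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f));
    }
    return r;
}

/// Hypothetical driver: full chunks of 4 go through approximatedSin!4,
/// the remaining tail elements through approximatedSin!1.
void applyApproximatedSin(float[] dst, const(float)[] src) pure nothrow
{
    assert(dst.length == src.length, "dst and src must have equal length");
    enum size_t N = 4;
    size_t i = 0;
    for (; i + N <= src.length; i += N)
    {
        float[N] chunk;
        chunk[] = src[i .. i + N];
        float[N] res = approximatedSin!N(chunk);
        dst[i .. i + N] = res[];
    }
    for (; i < src.length; ++i)
    {
        float[1] one;
        one[0] = src[i];
        float[1] res = approximatedSin!1(one);
        dst[i] = res[0];
    }
}

unittest
{
    import std.math : abs, sin;
    const float[] src = [0.0f, 0.1f, 0.2f, 0.3f, 0.4f, 0.5f];
    auto dst = new float[src.length];
    applyApproximatedSin(dst, src);   // one chunk of 4, then 2 tail elements
    foreach (i, v; src)
        assert(abs(dst[i] - sin(v)) < 1e-4f);
}
```

Whether a compiler turns the N=4 kernel into actual SIMD instructions is not
guaranteed by this code; that is exactly the portability question raised in
the list above.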
 
> You'd do this for syntax sugar, not for performance.

Yes. Array operations are a nice syntax, and they leave room to speed up
computations. Even as pure syntax sugar, extending these expressions is
already worthwhile (currently they don't use SSE anyway).

Seeing these problems now, I see that the SSE argument isn't so simple. But
it is still useful to extend array ops. The SSE issues need to be addressed,
and maybe solved in the future after discussion.

It is also possible to parse the expression at compile time (using a real
parser, or with the help of the compiler/templates/types) and emit a mixin
with the proper code. But this raises other questions:
- why does library code need to do the same thing as the compiler?
- why then do we need array operations at all?
- mixins aren't as transparent to the user as "macros" (which we don't have
  yet) could be.
Feb 04 2010