digitalmars.D - A little challenge...

Norbert Nemec <Norbert Nemec-online.de> writes:
Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

where sum would have to be some kind of template that takes i as a name
parameter and then defines it as variable inside the scope of the second
expression.

(Of course, the scalar product is trivial enough to do it in a zillion
of other ways, but for more complex sum expressions with potentially
more than one variable, this would really be neat to have in an array
library.)

Does anyone have an idea how this could be realized in D? Allowing a
template to define additional symbols only for the scope of its
arguments? Is it possible at all with the current language definion?
Does anyone have an idea for an elegant language extension to allows
this? It would mean taking lazy evaluation even one step further, all
the way to a lazy symbol binding. Seems straighforward enough to me to
implement in a compiler. Just the question how exactly to define the
semantics.

Greetings,
Norbert

Feb 25 2010
Jason House <jason.james.house gmail.com> writes:
Norbert Nemec Wrote:

Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

Would sum!( "i", "a[i]*b[i]" ) be acceptable? That should be achievable with a
template mixin that does string mixins under the hood.

Feb 25 2010
Norbert Nemec <Norbert Nemec-online.de> writes:
Jason House wrote:
Would sum!( "i", "a[i]*b[i]" ) be acceptable? That should be achievable with a
template mixin that does string mixins under the hood.

It is indeed a solution for the problem, but I still don't like it too
much. For one, writing expressions as strings always feels awkward. Even
though D can handle these strings at compile time, it just doesn't feel
like writing native D code.

Beyond this "gut feeling" I also see two technical problems:

* code editors do not recognize the string content as code, so they
cannot offer syntax highlighting or more advanced language tools

* the syntax does not allow nesting:

sum(i)( a[i] * sum(j)(b[i,j]*c[j]) )

Greetings,
Norbert

Feb 26 2010
Don <nospam nospam.com> writes:
Norbert Nemec wrote:
Jason House wrote:
Would sum!( "i", "a[i]*b[i]" ) be acceptable? That should be
achievable with a template mixin that does string mixins under the hood.

It is indeed a solution for the problem, but I still don't like it too
much. For one, writing expressions as strings always feels awkward. Even
though D can handle these strings at compile time, it just doesn't feel
like writing native D code.

Beyond this "gut feeling" I also see two technical problems:

* code editors do not recognize the string content as code, so they
cannot offer syntax highlighting or more advanced language tools

You can use the q{ } string syntax.

* the syntax does not allow nesting:

sum(i)( a[i] * sum(j)(b[i,j]*c[j]) )

This is a much bigger problem. It's not too difficult if you allow only
a set of built-in operations, but if you allow user-defined operations,
it's tough.

Template alias parameters have got much more powerful since I wrote
BLADE, so maybe it's more feasible now.

Feb 26 2010
"Igor Lesik" <curoles yahoo.com> writes:
"Jason House" <jason.james.house gmail.com> wrote in message
news:hm75nn$1a2q$1 digitalmars.com...
Norbert Nemec Wrote:

Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

Would sum!( "i", "a[i]*b[i]" ) be acceptable? That should be achievable
with a template mixin that does string mixins under the hood.

I think "i" could be implicit. Here is how I was able to implement it:

void main() {
int[] a = [1,2,3];
int[] b = [4,5,6];
int[] c = [5,4,3];

int IamLazy(){ writeln("am I?"); return 1; }

writeln( sum!("a[i]*b[i]")(a,b) ); // 4 + 10 + 18 = 32
writeln( sum!("b[i]/a[i]")(a,b) ); // 4 + 2 + 2 = 8
writeln( sum!("(a[i]+b[i])%2 + c[i]")(a,b,c) ); // 6 5 4 = 15
writeln( sum!("a[i]*b[i]")(a,b,IamLazy()) );

// fun
writeln( mixin( Msum!("a[i]*b[i]") ) );
writeln( mixin( Msum!("(a[i]+b[i])%2 + c[i]") ) );
}

R _sum(string OP, R : R[], T...)(lazy T op) {
mixin Rename!(T.length);
R ret = 0;
foreach (i; 0 .. op[0].length) {
ret += mixin(OP);
}
return ret;
}

auto sum(string OP, T...)(lazy T op) if (T.length > 0)
{
return _sum!(OP, T[0], T)(op);
}

Feb 26 2010
Philippe Sigaud <philippe.sigaud gmail.com> writes:
--001517477f86d7bb17048086951d
Content-Type: text/plain; charset=ISO-8859-1

On Fri, Feb 26, 2010 at 10:07, Norbert Nemec <Norbert nemec-online.de>wrote:

Jason House wrote:

Would sum!( "i", "a[i]*b[i]" ) be acceptable? That should be achievable
with a template mixin that does string mixins under the hood.

It is indeed a solution for the problem, but I still don't like it too
much. For one, writing expressions as strings always feels awkward. Even
though D can handle these strings at compile time, it just doesn't feel like
writing native D code.

Beyond this "gut feeling" I also see two technical problems:

* code editors do not recognize the string content as code, so they cannot
offer syntax highlighting or more advanced language tools

* the syntax does not allow nesting:

sum(i)( a[i] * sum(j)(b[i,j]*c[j]) )

You could use anonymous functions, like this:

sum!( (i,a,b) { return a[i]+b[i];})(array1, array2);

Here I consider a function of an index and some arrays, but you could use
functions on elements:

sum!( (a,b) { return a+b*a;})(array1, array2);

Editors can recognize these functions and they nest (though it's a bit
heavy):

sum!( (a,b) { return a + sum!((b) { return b*b;})(array2) * a)(array1);

A possible implementation for one array can be :

T sum(alias op, T)(T[] array) /* also, RandomAccessRange*/
{
T result;
foreach(i, elem; array) result += op(i, array);
return result;
}

It's quite simplified: I consider op returns a T...

Now, what's interesting is generalizing it somewhat : any number of inputs,
and inputs can be input ranges and not only arrays:

auto sum(alias op, R...)(R ranges) if(allSatisfy!(isInputRange,R))
{
alias staticMap!(ElementType,R) FrontType;
FrontType f;
typeof(op(f)) theSum;
while(!ranges[0].empty)
{
foreach(i, Type; FrontType) { // extracting the fronts
f[i] = ranges[i].front;
ranges[i].popFront;
}
theSum += op(f);
}
return theSum;
}

usage:

int[] ar1 = [0,1,2,3];
int[] ar2 = [4,5,6,7];
auto s = sum!((a,b,c) {return a+b*c;})(ar1,ar2,ar1);
writeln(s); // 44

Note that we could sum ranges with any element type, as long as the .init
value can be summed. float/double being initialiazed to NaN, it's a bit
tricky.

We could adopt std.algorithm 'string functions' trick, to get this syntax,
which I quite like:

auto s = sum!"a+b*c"(ar1, ar2, ar3);

Maybe you could be interested by looking at some modules with these ideas
at:
http://www.dsource.org/projects/dranges

(addendum: another generalization is of course to have two operations: the
'mapping' operation and the 'reducing' operation
like this:
auto sum = mapReduce!("a*b*c+1", "a+b")(range1, range2, range3);
The first function is applied on the elements, the second on the successive
results of the first fun.

Philippe

--001517477f86d7bb17048086951d
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

<br><br><div class=3D"gmail_quote">On Fri, Feb 26, 2010 at 10:07, Norbert N=
emec <span dir=3D"ltr">&lt;<a href=3D"mailto:Norbert nemec-online.de">Norbe=
rt nemec-online.de</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quot=
e" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204,=
<div class=3D"im">Jason House wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde=
r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Would sum!( &quot;i&quot;, &quot;a[i]*b[i]&quot; ) be acceptable? That shou=
ld be achievable with a template mixin that does string mixins under the ho=
od.<br>
</blockquote>
<br></div>
It is indeed a solution for the problem, but I still don&#39;t like it too =
much. For one, writing expressions as strings always feels awkward. Even th=
ough D can handle these strings at compile time, it just doesn&#39;t feel l=
ike writing native D code.<br>

<br>
Beyond this &quot;gut feeling&quot; I also see two technical problems:<br>
<br>
* code editors do not recognize the string content as code, so they cannot =
offer syntax highlighting or more advanced language tools<br>
<br>
* the syntax does not allow nesting:<br>
<br>
=A0 =A0 =A0 =A0sum(i)( a[i] * sum(j)(b[i,j]*c[j]) )<br></blockquote><div><=
br>You could use anonymous functions, like this:<br><br>sum!( (i,a,b) { ret=
urn a[i]+b[i];})(array1, array2);<br><br>Here I consider a function of an i=
ndex and some arrays, but you could use functions on elements:<br>
=A0<br>
sum!( (a,b) { return a+b*a;})(array1, array2);<br>
<br>Editors can recognize these functions and they nest (though it&#39;s a =
bit heavy):<br><br>sum!( (a,b) { return a + sum!((b) { return b*b;})(array2=
) * a)(array1);<br><br>A possible implementation for one array can be :<br>
<br>T sum(alias op, T)(T[] array) /* also, RandomAccessRange*/<br>{<br>=A0=
=A0=A0 T result;<br>=A0=A0=A0 foreach(i, elem; array) result +=3D op(i, arr=
ay);<br>=A0=A0=A0 return result;<br>}<br><br>It&#39;s quite simplified: I c=
onsider op returns a T...<br>
<br>Now, what&#39;s interesting is generalizing it somewhat : any number of=
inputs, and inputs can be input ranges and not only arrays:<br><br>auto su=
m(alias op, R...)(R ranges) if(allSatisfy!(isInputRange,R))<br>{<br>=A0=A0=
=A0 alias staticMap!(ElementType,R) FrontType;<br>
=A0=A0=A0 FrontType f;<br>=A0=A0=A0 typeof(op(f)) theSum;<br>=A0=A0=A0 whil=
e(!ranges[0].empty)<br>=A0=A0=A0 {<br>=A0=A0=A0=A0=A0=A0=A0 foreach(i, Type=
; FrontType) { // extracting the fronts<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0 f[i] =3D ranges[i].front;<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 ranges[i=
].popFront;<br>
=A0=A0=A0=A0=A0=A0=A0 }<br>=A0=A0=A0=A0=A0=A0=A0 theSum +=3D op(f);<br>=A0=
=A0=A0 }<br>=A0=A0=A0 return theSum;<br>}<br><br>usage:<br><br>int[] ar1 =
=3D [0,1,2,3];<br>int[] ar2 =3D [4,5,6,7];<br>auto s =3D sum!((a,b,c) {retu=
rn a+b*c;})(ar1,ar2,ar1);<br>writeln(s); // 44<br>
<br>Note that we could sum ranges with any element type, as long as the .in=
it value can be summed. float/double being initialiazed to NaN, it&#39;s a =
bit tricky.<br><br>We could adopt std.algorithm &#39;string functions&#39; =
trick, to get this syntax, which I quite like:<br>
<br>auto s =3D sum!&quot;a+b*c&quot;(ar1, ar2, ar3);<br><br>Maybe you could=
be interested by looking at some modules with these ideas at:<br><a href=
=3D"http://www.dsource.org/projects/dranges">http://www.dsource.org/project=
s/dranges</a><br>
<br>(addendum: another generalization is of course to have two operations: =
the &#39;mapping&#39; operation and the &#39;reducing&#39; operation<br>lik=
e this:<br>auto sum =3D mapReduce!(&quot;a*b*c+1&quot;, &quot;a+b&quot;)(ra=
nge1, range2, range3);<br>
The first function is applied on the elements, the second on the successive=
results of the first fun.<br><br>Philippe<br>=A0<br><br><br><br></div></di=
v>

--001517477f86d7bb17048086951d--

Feb 26 2010
Daniel Keep <daniel.keep.lists gmail.com> writes:
BLADE may also be of some interest to you.


Feb 25 2010
Norbert Nemec <Norbert Nemec-online.de> writes:
Daniel Keep wrote:
BLADE may also be of some interest to you.

Thanks for the hint! I had seen BLADE sometimes before, but I realize
that I should really study it in more detail.

Feb 26 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Thu, 25 Feb 2010 18:57:32 -0500, Norbert Nemec
<Norbert nemec-online.de> wrote:
Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

where sum would have to be some kind of template that takes i as a name
parameter and then defines it as variable inside the scope of the second
expression.

(Of course, the scalar product is trivial enough to do it in a zillion
of other ways, but for more complex sum expressions with potentially
more than one variable, this would really be neat to have in an array
library.)

Does anyone have an idea how this could be realized in D? Allowing a
template to define additional symbols only for the scope of its
arguments? Is it possible at all with the current language definion?
Does anyone have an idea for an elegant language extension to allows
this? It would mean taking lazy evaluation even one step further, all
the way to a lazy symbol binding. Seems straighforward enough to me to
implement in a compiler. Just the question how exactly to define the
semantics.

Greetings,
Norbert

Well there is std.algorithm's map and reduce. Somehow, I feel slicing,
ranges/generators and array expressions should be able to handle this. For
example: \sum_i i*a_i*b_i-1 => sum(iota * a * b[1..$]) But then again I'm having trouble thinking of real examples off the top of my head.  Feb 25 2010 Norbert Nemec <Norbert Nemec-online.de> writes: Robert Jacques wrote: Well there is std.algorithm's map and reduce. Somehow, I feel slicing, ranges/generators and array expressions should be able to handle this. For example: \sum_i i*a_i*b_i-1 => sum(iota * a * b[1..$]) But then
again I'm having trouble thinking of real examples off the top of my head.

In fact, that is the way you would do it in e.g. Python/NumPy. It works
fine for many common cases but does not scale up to more complex
situations. The mathematical sum notation scales up arbitrarily and
remains clear.

I would want to offer both options and leave it to the user to choose
the more appropriate notation.

Feb 26 2010
Norbert Nemec <Norbert Nemec-online.de> writes:
Robert Jacques wrote:
That sounds sensible. However, extensive experience in Matlab has taught
me that resorting to custom for-loop indicates you've failed to
sufficiently think in arrays. :)

Indeed, most use cases are simple enough to be handled in array
notation. I have worked with Matlab and Python and managed to come up
with array notations in many non-trivial cases as well. However, once in
a while, it just cannot be done. Typically, this happens when you have
to handle non-linear terms or high order tensorial objects.

Of course, my examples were simple enough to permit alternative
expressions, but I have encountered quite a number of cases where
I could not avoid a loop in Python. I is hard to spontaneously construct
something useful that I can describe in a few lines.

Imagine a charge density in one dimension:
rho[r]
and then compute the coulomb energy
sum(r1,r2)(rho[r1]*rho[r2]/(r1-r2))

Or an expression containing function calls

sum(i,j)(f(i)*g(j)*A[i,j]))

Ultimately, 'sum' and other reduction would actually be just one use
case. One could even use the same mechanism to construct arrays from
expressions.

auto A = array(a=0:10,b=0:20)(2*a + b%3)

(Disregard the exact syntax here...)

use cases.

Greetings,
Norbert

Feb 27 2010
Ary Borenszweig <ary esperanto.org.ar> writes:
Norbert Nemec wrote:
Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

where sum would have to be some kind of template that takes i as a name
parameter and then defines it as variable inside the scope of the second
expression.

You are missing i's initial and ending values.

I think it should be something like:

sum!("i", 0, n)(a[i]*b[i])

Feb 26 2010
Norbert Nemec <Norbert Nemec-online.de> writes:
Ary Borenszweig wrote:
Norbert Nemec wrote:
Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

where sum would have to be some kind of template that takes i as a
name parameter and then defines it as variable inside the scope of the
second expression.

You are missing i's initial and ending values.

I think it should be something like:

sum!("i", 0, n)(a[i]*b[i])

I assumed them to default to the array boundaries, but that does not
really matter at this point.

Feb 26 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Fri, 26 Feb 2010 10:49:30 -0500, Norbert Nemec
<Norbert nemec-online.de> wrote:

Ary Borenszweig wrote:
Norbert Nemec wrote:
Hi everybody,

thinking about array expressions, I have stumbled over an interesting
challenge for which I still have no idea:

Consider the mathematical sum notation:

\sum_i a_i*b_i

here, the variable i is defined only at the scope inside the
expression.

A analogous D syntax could be something like

sum!(i)(a[i]*b[i])

where sum would have to be some kind of template that takes i as a
name parameter and then defines it as variable inside the scope of the
second expression.

I think it should be something like:
sum!("i", 0, n)(a[i]*b[i])

I assumed them to default to the array boundaries, but that does not
really matter at this point.

Detecting those array boundaries is not an easy task. For example, if you
take the expression in as a delegate you can't detect which ranges are
used in it.

Feb 26 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Fri, 26 Feb 2010 03:58:59 -0500, Norbert Nemec
<Norbert nemec-online.de> wrote:

Robert Jacques wrote:
Well there is std.algorithm's map and reduce. Somehow, I feel slicing,
ranges/generators and array expressions should be able to handle this.
For example: \sum_i i*a_i*b_i-1 => sum(iota * a * b[1..\$]) But then
again I'm having trouble thinking of real examples off the top of my

In fact, that is the way you would do it in e.g. Python/NumPy. It works
fine for many common cases but does not scale up to more complex
situations. The mathematical sum notation scales up arbitrarily and
remains clear.

I would want to offer both options and leave it to the user to choose
the more appropriate notation.

That sounds sensible. However, extensive experience in Matlab has taught
me that resorting to custom for-loop indicates you've failed to
sufficiently think in arrays. :)
sum(i)( a[i] * sum(j)(b[i,j]*c[j]) ) => sum(a.*(b'*c)) or
a.*sum(b.*(c*ones(1,length(c)))) ,1)
or something like that. (As an aside, having a more efficient syntax for
broadcasts, instead of having to use the outer product all the time, would
be nice)

Is there some example of a complex case you can post? I think we'll all
think of better solutions with a goal in sight.

Feb 26 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Sat, 27 Feb 2010 04:56:00 -0500, Norbert Nemec
<Norbert nemec-online.de> wrote:

Robert Jacques wrote:
That sounds sensible. However, extensive experience in Matlab has
taught me that resorting to custom for-loop indicates you've failed to
sufficiently think in arrays. :)

Indeed, most use cases are simple enough to be handled in array
notation. I have worked with Matlab and Python and managed to come up
with array notations in many non-trivial cases as well. However, once in
a while, it just cannot be done. Typically, this happens when you have
to handle non-linear terms or high order tensorial objects.

Of course, my examples were simple enough to permit alternative
expressions, but I have encountered quite a number of cases where
I could not avoid a loop in Python. I is hard to spontaneously construct
something useful that I can describe in a few lines.

Imagine a charge density in one dimension:
rho[r]
and then compute the coulomb energy
sum(r1,r2)(rho[r1]*rho[r2]/(r1-r2))

Or an expression containing function calls

sum(i,j)(f(i)*g(j)*A[i,j]))

Ultimately, 'sum' and other reduction would actually be just one use
case. One could even use the same mechanism to construct arrays from
expressions.

auto A = array(a=0:10,b=0:20)(2*a + b%3)

(Disregard the exact syntax here...)

use cases.

Greetings,
Norbert

Thank you. I understand the difficulty of finding good examples.  I did
take a look at my own research, but didn't find any good examples of
complex, single line expressions. Somehow, this feels like a narrow
problem area; simple things are easy to express using array ops, really
complex things take multiple lines to express and should really be done
with loops in the first place. Finding the middle ground and then finding
a concise way of correctly expressing it both seem like difficult
problems, though definitely ones worth investigating. Anyways, I've taken
a look at the new examples:

Let Map create an infinite array whose elements are generated via function
calls. (I've run into the function call issue before)
Let Index = Map!"a" or something similar
Let .B!D denote broadcasting an array along the D dimension
Let .* denote the element wise multiply. Practically, you might want to
use .A and .M to switch between element-wise array and matrix style math.
Practically, both Map and .B require the concept of one or more dimensions
which are 'unbounded'. Unbounded dimensions become bounded when they
interact with any operation that does bounds checking with a bounded
dimension.

sum(r1,r2)(rho[r1]*rho[r2]/(r1-r2))      => sum( rho.B!1.*rho.B!0./(
Index.B!1 - Index.B!0 ) )
sum(i,j)(f(i)*g(j)*A[i,j]))              => sum( Map!f.B!1.*Map!g.B!0.*A )
auto A = array(a=0:10,b=0:20)(2*a + b%3) => auto A = take([0,10],[0,20],
2*Index.B!1 + Index.B!0 % 3 );
auto A = take( Map!"2*a +
b%3"[0..10,0..20] )

Although the ideal syntax is definitely shorter and a bit more readable
than the array ops version, the array ops seems more concise than the
delegate based alternative:

sum( Map!((int i, int j){ return rho(i)*rho(j)/(i-j);
})[0..rho.length,0..rho.length] );

By the way, I'd recommend keeping all these examples for use in your
documentation, as people don't naturally think in arrays and array ops.

Feb 27 2010