digitalmars.D - [RFC] Add an operator for ranges to D. Pros and cons?

"Dejan Lekic" <dejan.lekic gmail.com> writes:
Dear D community, I do not know about you, but I certainly do not 
like writing code like:

inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

I also tried to use operators ">>" and "~" but these make it 
confusing and hard to understand what the statement actually does.

Therefore I would like to know what you think about the idea 
of having an additional operator made exclusively for ranges. This 
operator would make it obvious that data are "streamed" (for lack of 
a better term) among ranges.

The first name I could come up with was "opArrow", but "opData" 
could also be okay, and the operator would be either "~>" or "->".

This would give us an obvious, unambiguous statement:

Console.in ~> filter1(param) ~> fooRange ~> Console.out;
// Console is an imaginary class/struct

Or:
arr ~> odd ~> random ~> randomOdd;

I humbly believe that ranges are one of the most important 
concepts in D; that, plus the increase in readability, gives two 
valid reasons for having this new operator.

I am also asking this because my point of view is strictly 
pragmatic - there may be technical reasons why we should not have 
this, or why we should have it done some other way, so please 
share your opinion.

Kind regards
Nov 07 2012
"Dejan Lekic" <dejan.lekic gmail.com> writes:
On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 This operator would make it obvious that data are "streamed" (for lack 
 of a better term) among ranges.

EDIT: I really dislike that word "streamed" that I used. "Chained" would perhaps be a better one. :)
Nov 07 2012
"Peter Alexander" <peter.alexander.au gmail.com> writes:
On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 Therefore I would like to know what you think about the idea 
 of having an additional operator made exclusively for ranges. This 
 operator would make it obvious that data are "streamed" (for lack 
 of a better term) among ranges.

 The first name I could come up with was "opArrow", but "opData" 
 could also be okay, and the operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 Or:
 arr ~> odd ~> random ~> randomOdd;

I'm confused. So does this new operator just do the same thing as dot, but only work with ranges? Or does it have additional useful semantics?
Nov 07 2012
"Dejan Lekic" <dejan.lekic gmail.com> writes:
On Wednesday, 7 November 2012 at 13:21:36 UTC, Peter Alexander 
wrote:
 On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic 
 wrote:
 Therefore I would like to know what you think about the 
 idea of having an additional operator made exclusively for 
 ranges. This operator would make it obvious that data are 
 "streamed" (for lack of a better term) among ranges.

 The first name I could come up with was "opArrow", but "opData" 
 could also be okay, and the operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 Or:
 arr ~> odd ~> random ~> randomOdd;

 I'm confused. So does this new operator just do the same thing as dot, but only work with ranges? Or does it have additional useful semantics?

UFCS is what produces the code mess I started with. Imagine the ranges being members of some objects; I already gave the example of Console.in and Console.out. Now say they are nested even deeper, so you have to refer to them using obj1.member.range notation, and imagine using the dot operator in some complex operation where you chain five or more ranges... All those dots and parentheses can make one's head boil (at least they make mine boil, not to mention that my colleague can't easily understand such a statement at all when it is written using UFCS).
Nov 07 2012
"Tavi Cacina" <octavian.cacina outlook.com> writes:
On Wednesday, 7 November 2012 at 13:07:13 UTC, Dejan Lekic wrote:
 I humbly believe that ranges are one of the most important 
 concepts in D

Yeah, the range chaining is quite cool. I am just starting with D, and I find this formatting most appealing (taken from a comment on an article about components):

Console.in           // get some input
   .filter1(param)   // filter it based on X
   .fooRange         // tweak it some more
   .Console.out;     // beam it up, scotty

It lets you describe the chaining in place. You do have a point, though: if you are combining the ranges in a long 'sausage', a dedicated operator may increase the readability.
Nov 07 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Dejan Lekic:

 Dear D community, I do not know about you, but I certainly do 
 not like writing code like:

 inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

I suggest formatting it this way, it's more readable:

auto something = inRange
   .fooRange(param)
   .barRange()
   .bazRange(param1, param2)
   .outRange();

 Therefore I would like to know what you think about the idea 
 of having an additional operator made exclusively for ranges. This 
 operator would make it obvious that data are "streamed" (for lack 
 of a better term) among ranges.

 The first name I could come up with was "opArrow", but "opData" 
 could also be okay, and the operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

I think it doesn't give a significant improvement. But maybe there are more interesting use cases.

I'd like D ranges to support "~" (using a template mixin to give them such an operator), acting like chain. So instead of writing:

range1.chain(range2)

you write:

range1 ~ range2

It's also nice to have lazy lists, maybe based on fibers, with a few operators to concatenate them, etc.

Bye,
bearophile
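
For readers who want to see what a "~" that acts like chain could look like, here is a minimal sketch, not taken from the thread. It assumes a hypothetical mixin named ChainWithTilde and a hypothetical toy range named Countdown; neither exists in Phobos. Only std.range.chain, std.range.iota, std.algorithm.equal and isInputRange are real library symbols.

```d
import std.algorithm : equal;
import std.range : chain, iota;
import std.range.primitives : isInputRange;

// Hypothetical mixin (not part of Phobos): gives the enclosing range type
// a binary "~" that simply forwards to std.range.chain.
mixin template ChainWithTilde()
{
    auto opBinary(string op, R)(R rhs)
        if (op == "~" && isInputRange!R)
    {
        return chain(this, rhs);
    }
}

// Hypothetical toy input range that counts down from n to 1.
struct Countdown
{
    int n;

    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    mixin ChainWithTilde;
}

void main()
{
    // Reads like array concatenation, but the result is a lazy chained range.
    auto r = Countdown(3) ~ iota(10, 12);
    assert(equal(r, [3, 2, 1, 10, 11]));
}
```

Note one limitation of this sketch: the value returned by chain does not itself carry the mixin, so a third operand (a ~ b ~ c) would not compile unless the result were wrapped in a type that also mixes in the operator.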
Nov 07 2012
Dejan Lekic <dejan.lekic gmail.com> writes:
bearophile wrote:

 Dejan Lekic:
 
 Dear D community, I do not know about you, but I certainly do
 not like writing code like:

 inRange.fooRange(param).barRange
   .bazRange(param1, param2).outRange;

 I suggest formatting it this way, it's more readable:

 auto something = inRange
   .fooRange(param)
   .barRange()
   .bazRange(param1, param2)
   .outRange();

 Therefore I would like to know what you think about the idea
 of having an additional operator made exclusively for ranges. This
 operator would make it obvious that data are "streamed" (for lack
 of a better term) among ranges.

 The first name I could come up with was "opArrow", but "opData"
 could also be okay, and the operator would be either "~>" or "->".

 This would give us an obvious, unambiguous statement:

 Console.in ~> filter1(param) ~> fooRange ~> Console.out;
 // Console is an imaginary class/struct

 I think it doesn't give a significant improvement. But maybe there are more interesting use cases.

 I'd like D ranges to support "~" (using a template mixin to give them such an operator), acting like chain. So instead of writing:

 range1.chain(range2)

 you write:

 range1 ~ range2

 It's also nice to have lazy lists, maybe based on fibers, with a few operators to concatenate them, etc.

 Bye,
 bearophile

I already tried using the tilde operator for a while, then I realised that people were getting confused: they thought the line was concatenating strings, and only then realised those were ranges... That is exactly why I asked the D community what they think about having a new operator only for ranges...

I also do what you suggest quite a lot. In fact, I write it almost the same way you do in your example. But think about a potential scenario where you pass members of some structure as parameters:

auto something = inRange
   .fooRange(someObject.someMember.membersMember)
   .barRange(SomeClass.staticMember)
   .bazRange(/* etc */)
   .jarRange(param1, param2)
   .outRange(/* etc */);

Moreover, what if the developer does not add "Range" to the name (the typical case)? Imagine the confusion between such UFCS methods and properties...

-- 
Dejan Lekic - http://dejan.lekic.org
Nov 07 2012
Dejan Lekic <dejan.lekic gmail.com> writes:
bearophile wrote:

 Dejan Lekic:
 
 I already tried using the tilde operator for a while, then I
 realised that people were getting confused: they thought the line
 was concatenating strings, and only then realised those were
 ranges...

 "~" is used for all arrays (while array.Appender uses put for mysterious reasons). If more and more D code starts using "~" to concatenate ranges or arrays, I think D programmers will get used to this more general meaning.

 Bye,
 bearophile

I fear "~" has different semantics with arrays.

-- 
Dejan Lekic - http://dejan.lekic.org
Nov 07 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Dejan Lekic:

 I already tried using the tilde operator for a while, then I 
 realised that people were getting confused: they thought the line 
 was concatenating strings, and only then realised those were 
 ranges...

"~" is used for all arrays (while array.Appender uses put for mysterious reasons). If more and more D code starts using "~" to concatenate ranges or arrays, I think D programmers will get used to this more general meaning.

Bye,
bearophile
Nov 07 2012
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, November 07, 2012 14:07:12 Dejan Lekic wrote:
 Therefore I would like to know what you think about the idea
 of having an additional operator made exclusively for ranges. This
 operator would make it obvious that data are "streamed" (for lack of
 a better term) among ranges.

As far as I can tell, it adds zero functionality. It's purely a matter of trying to create cleaner looking code. That being the case, I would think that suggestions like

auto something = inRange
   .fooRange(param)
   .barRange()
   .bazRange(param1, param2)
   .outRange();

solve the problem quite nicely, though honestly, I have no problem with simply doing

auto something = outRange(bazRange(barRange(fooRange(inRange, param)), param1, param2));

though with that many chained items and several of them taking multiple parameters, something like

auto tempSomething = barRange(fooRange(inRange, param));
auto something = outRange(bazRange(tempSomething, param1, param2));

would probably be better. The first approach using UFCS seems rather popular though, and it's _very_ clean. My main gripe with it is that the flow is backwards, but I seem to be in the minority in thinking that.

Regardless, there are ways to format code so that it's quite clean without making any language changes. I don't see how adding an operator would really help. It just complicates the language further.

- Jonathan M Davis
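
To make the comparison between the nested-call and UFCS spellings concrete, here is a small sketch using real Phobos functions instead of the thread's imaginary fooRange/barRange names. The function names nestedPipeline and ufcsPipeline are made up for illustration; the point is only that both spellings build the same lazy range, so the choice is purely about readability.

```d
import std.algorithm : equal, filter, map;
import std.range : iota;

// Nested calls: the innermost call runs first, so the pipeline reads inside-out.
auto nestedPipeline()
{
    return map!(x => x * 10)(filter!(x => x % 2 == 1)(iota(1, 6)));
}

// UFCS: the same pipeline, one stage per line, in the order the data flows.
auto ufcsPipeline()
{
    return iota(1, 6)
        .filter!(x => x % 2 == 1)
        .map!(x => x * 10);
}

void main()
{
    // Both spellings produce the same elements: the odd numbers 1, 3, 5 times ten.
    assert(equal(nestedPipeline(), ufcsPipeline()));
    assert(equal(ufcsPipeline(), [10, 30, 50]));
}
```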
Nov 07 2012
Dejan Lekic <dejan.lekic gmail.com> writes:
 
 auto something = outRange(bazRange(barRange(fooRange(inRange, param)), param1,
 param2));

It looks readable in this case, but to keep it that clean your parameters have to be plain variables. Otherwise, imagine what happens when, in the middle of all that, you have function calls to obtain the arguments for some of those ranges:

...fooRange(someObject.getInRange(), foo!(bla)(param1...)),param1...

If I were the author, all would be fine, but if I gave that code to someone else, he/she would need time to understand what is actually happening...

-- 
Dejan Lekic - http://dejan.lekic.org
Nov 07 2012