www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - D with experimental features like tuple unpacking

reply "bearophile" <bearophileHUGS lycos.com> writes:
UFCS, plus some tweaks to the standard library like this one 
(http://d.puremagic.com/issues/show_bug.cgi?id=8755 ) help write 
D code in a more functional (flow programming, Walter calls it 
component programming) style.

So tuples become more and more useful, and the lack of a syntax 
to unpack them becomes worse. So with time this unpacking syntax 
has bubbled up to become about my top enhancement request.

One good way to show what I mean is to actually show some code 
without and with such hypothetical syntax. The code needs to 
work, to be short, and not to be contrived. This computes the 
Huffman encoding of a dchar[]:


// --------------------------------------
import std.stdio, std.algorithm, std.typecons, 
std.container,std.array;

auto encode(T)(Group!("a == b", T[]) sf) {
     auto heap = sf.map!(s => tuple(s[1], [tuple(s[0], "")]))
                 .array.heapify!q{b < a};

     while (heap.length > 1) {
         auto lo = heap.front;
         heap.removeFront;
         auto hi = heap.front;
         heap.removeFront;

         foreach (ref pair; lo[1])
             pair[1] = '0' ~ pair[1];
         foreach (ref pair; hi[1])
             pair[1] = '1' ~ pair[1];
         heap.insert(tuple(lo[0] + hi[0], lo[1] ~ hi[1]));
     }

     return heap.front[1].schwartzSort!(p => tuple(p[1].length, 
p[0]));
}

void main() {
     auto s = "this is an example for huffman encoding"d;
     foreach (p; s.dup.sort().release.group.encode)
         writefln("'%s'  %s", p.tupleof);
}
// --------------------------------------


sort is followed by () because for unknown reasons we have not 
yet deprecated/removed the broken built-in array sort.


The same code with unpacking syntax:


// --------------------------------------
import std.stdio, std.algorithm, std.typecons, 
std.container,std.array;

auto encode(T)(Group!("a == b", T[]) sf) {
     auto heap = sf.map!(((c, f)) => tuple(f, [tuple(c, "")]))
                 .array.heapify!q{b < a};

     while (heap.length > 1) {
         auto (lof, loa) = heap.front;
         heap.removeFront;
         auto (hif, hia) = heap.front;
         heap.removeFront;

         foreach ((c, ref e); loa)
             e = '0' ~ e;
         foreach ((c, ref e); hia)
             e = '1' ~ e;
         heap.insert(tuple(lof + hif, loa ~ hia));
     }

     return heap.front[1].schwartzSort!(((c, e)) => 
tuple(e.length, c));
}

void main() {
     auto s = "this is an example for huffman encoding"d;
     foreach ((c, e); s.dup.sort().release.group.encode)
         writefln("'%s'  %s", c, e);
}
// --------------------------------------


The noisy [0] [1] are vanished from the encoding function.

In that code tuple unpacking happens in normal assignments:

auto (lof, loa) = heap.front;


Inside function signatures; notice the double (()) needed for a 
lambda:

map!(((c, f)) => ...)


And in foreach:

foreach ((c, ref e); loa)


Walter said that it's hard to predict what bad interactions such 
syntax will have with the other parts of the D syntax. I agree 
with this. On the other hand being frozen with fear isn't enough, 
the partial patch by Hara is getting dust since months. Given the 
importance of this feature, to break this ice and start moving 
forward I suggest to distribute a dmd with this experimental 
feature, and let people use and try it for few months or one year 
or more. Hopefully this will be enough to uncover the problems.

One way to do it is to just distribute a compiled experimental 
dmd with this feature. But this dmd will become quickly obsolete. 
An alternative solution is to use the idea of the Haskell GHC 
compiler, and introduce a compiler switch like 
-experimental=tuple_unpack (it's off on default). Python uses 
"from future import ..." for similar purposes.

I think it's good to have such staging in D for new ideas (I 
think it will be less used than in GHC). Such staging will avoid 
design mistakes like  []   () for UDAs and similar. Having a 
feature implemented in a GitHub patch is not enough, because only 
1-5 persons try it for a very short time. And putting a feature 
in DMD to eventually remove it in a successive DMD if it's bad 
(like the built-in regular expression syntax experiment) can't 
work any more at the current advanced stage of the D development.

Bye,
bearophile
Mar 21 2013
next sibling parent "Rob T" <alanb ucora.com> writes:
I would like to see tuple syntax and abilities improved. It's 
been a while since I last tried to use them so I'm not prepared 
to explain in detail what I'd like to see for improvements, 
however I can say that when I did try to use them I remember they 
were much more unwieldy to use and more limited that I thought 
was necessary.

I tried to use them for named parameters and for a function that 
could return multiple results, both without resorting to structs, 
but those attempts failed.

I basically wrote tuples off as too complicated and awkward to 
bother with and have not attempted to use them since, but that is 
unfortunate.

--rt
Mar 21 2013
prev sibling next sibling parent reply Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
22.03.2013 1:13, bearophile пишет:
 And in foreach:

 foreach ((c, ref e); loa)

D already have undocumented[1] front tuple expansion in foreach over range: --- import std.algorithm; import std.stdio; void main() { enum s = "this is an example for huffman encoding"d; foreach (c, n; s.dup.sort().release().group()) writefln("'%s' %s", c, n); } --- [1] http://d.puremagic.com/issues/show_bug.cgi?id=7361 -- Денис В. Шеломовский Denis V. Shelomovskij
Mar 22 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Mar-2013 11:05, Denis Shelomovskij пишет:
 22.03.2013 1:13, bearophile пишет:
 And in foreach:

 foreach ((c, ref e); loa)

D already have undocumented[1] front tuple expansion in foreach over range: --- import std.algorithm; import std.stdio; void main() { enum s = "this is an example for huffman encoding"d; foreach (c, n; s.dup.sort().release().group()) writefln("'%s' %s", c, n); } --- [1] http://d.puremagic.com/issues/show_bug.cgi?id=7361

Nice, but where is the freaking manual :) -- Dmitry Olshansky
Mar 22 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Mar-2013 14:58, bearophile пишет:

 Jacob Carlborg:

 auto (lof, loa) = heap.front;

How is front suddenly returning two elements?

It is not returning two elements. In D you can't return two elements. It is returning a 2-Tuple. You see the 2-tuples are built by the map() and then used to create a heap: auto heap = sf.map!(((c, f)) => tuple(f, ...)).array.heapify!q{...}; And this proposed syntax performs pattern matching on that 2-tuple, unpacking it into two variables (Hara has implemented this already): auto (lof, loa) = ...; In Haskell, Scala, Python, F#, etc, the semantics and syntax are similar.

I'd hate any solution that distances ordinary structs, library tuples and fixed-sized arrays. If anything these should be trivially substitutable where makes sense. Thus I'd go for struct destructuring syntax like (god forgive) JS6. I even had proposal to about same effect, but it was ignored. Among other things it replace .tupleof with more general destructuring facility. -- Dmitry Olshansky
Mar 22 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Mar-2013 16:44, renoX пишет:
 On Friday, 22 March 2013 at 11:14:47 UTC, Dmitry Olshansky wrote:
 I'd hate any solution that distances ordinary structs, library tuples
 and fixed-sized arrays.

Are struct/tuples similar to fixed-sized array? I thought that both are different because you can select which element you access at runtime in an array, but usually this is not possible with structs and tuples where this is selected at compile time and then I thought about reflection and I'm not so sure..

Fundamentaly there is no difference at all as long as data layout is the same. The fact that you sort of can't easily index tuple of 3 ints as a fixed array of 3 ints is a clear indication of a tendency that I find disturbing. Anyway D is a systems language so you can just blast your way past these limitations: auto abc = Tuple!(int, int, int)(1, 4, 5); int[3] data = *cast(int[3]*)&abc; ... //now you can alternatively even to dynamic array: int[] dynViewOf = (cast(int*)&abc)[0..3]; There is no escaping the fact that it's just a piece of data structured in a certain way. -- Dmitry Olshansky
Mar 22 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Mar-2013 17:41, renoX пишет:
 The fact that you sort of can't easily index tuple of 3 ints as a
 fixed array of 3 ints is a clear indication of a tendency that I find
 disturbing.

But there is a reason for this: tuple/structs can contain heterogeneous type whereas array contains only homogeneous type..

Surely I know this don't you think ? ;)
 So if you want to index a tuple/struct of say {int,int[128]} you have to
 build an array of pointers to these index, possible but clearly not free..

This example is trivially done with single point provided you know the layout/padding. There are ways to peek at that in D too. In general yes, heterogeneous is what makes structs and tuples standout.
 So I wouldn't agree that tuple/struct are 'trivially substitutable' to
 array..

In the case of exactly the same data layout they do. Plenty of cases are like this e. g. 2-d, 3-d points being structs. These could be treated as arrays easily. Otherwise it is a good idea to not try to when it doesn't make sense. -- Dmitry Olshansky
Mar 22 2013
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Mar-2013 15:25, bearophile пишет:
 Dmitry Olshansky:

 auto (lof, loa) = ...;

 In Haskell, Scala, Python, F#, etc, the semantics and syntax are
 similar.

I'd hate any solution that distances ordinary structs, library tuples and fixed-sized arrays. If anything these should be trivially substitutable where makes sense.

In Bugzilla I suggested that syntax for arrays too, I'd like that: int[2] data = [10, 20]; auto (a, b) = data; Regarding tuples and structs, Tuples are implemented as structs, so forbidding struct unpacking is probably more work than not doing it. On the other hand structs and tuples are not exactly the same thing, despite one is implemented with the other.

What the heck this paragraph supposed to mean? Sorry I can't get the message out of it. What I think is that unpacking both tuples, structs and _fixed_ arrays has to be done the same way, period. That way has to be easy, concise and non error prone. Otherwise I'd call it all a failure. Dynamic arrays are the different beasts we may postpone decision on whether unpacking these in the same vein makes sense.
 I even had proposal to about same effect, but it was ignored. Among
 other things it replace .tupleof with more general destructuring
 facility.

I don't want to use ".tupleof" to unpack a Tuple. Because it's a very commonly done operation, so it needs a short syntax.

What I've meant is to add easily controllable way to slice up fields from structs as tuples (pretty much as .tupleof is doing but more fine-grained) see: struct P { int x; int y; double d; } P p = P(2, 5, 5.6); auto (x,y) = p.{x, y}; // + your unpacking p.{x,y} = p.{y,x}; // anmd many more short-hand things are possible Nested stuff is also no problem: struct J{ P a; P b; double r; } J j = ....; auto (first, second, last) = j.{a, b.{y}, r}; static assert(typeof(first) == P); static assert(typeof(second) == int); static assert(typeof(last) == double); Same with any nesting depth, b.{x,y} has type (int,int) as internal tuple or Tuple!(int, int) as in Phobos but it'd better be in druntime then. .tupleof is then done as trivial default: auto unpacked = j.{}; static assert(typeof(unpacked) == Tuple!(P, P, double)); -- Dmitry Olshansky
Mar 22 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-03-21 22:13, bearophile wrote:

 auto (lof, loa) = heap.front;

How is front suddenly returning two elements? -- /Jacob Carlborg
Mar 22 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Denis Shelomovskij:

 D already have undocumented[1] front tuple expansion in foreach 
 over range:
 ---
 import std.algorithm;
 import std.stdio;

 void main() {
     enum s = "this is an example for huffman encoding"d;
     foreach (c, n; s.dup.sort().release().group())
         writefln("'%s'  %s", c, n);
 }

I know, but it doesn't work in the case of my code I have shown, with a SortedRange of 2-tuples: void main() { auto s = "this is an example for huffman encoding"d; foreach (c, n; s.dup.sort().release.group.encode) writefln("'%s' %s", c, n); } --------------------------- Jacob Carlborg:
 auto (lof, loa) = heap.front;

How is front suddenly returning two elements?

It is not returning two elements. In D you can't return two elements. It is returning a 2-Tuple. You see the 2-tuples are built by the map() and then used to create a heap: auto heap = sf.map!(((c, f)) => tuple(f, ...)).array.heapify!q{...}; And this proposed syntax performs pattern matching on that 2-tuple, unpacking it into two variables (Hara has implemented this already): auto (lof, loa) = ...; In Haskell, Scala, Python, F#, etc, the semantics and syntax are similar. Bye, bearophile
Mar 22 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 22 March 2013 at 07:05:03 UTC, Denis Shelomovskij 
wrote:
 22.03.2013 1:13, bearophile пишет:
 And in foreach:

 foreach ((c, ref e); loa)

D already have undocumented[1] front tuple expansion in foreach over range: --- import std.algorithm; import std.stdio; void main() { enum s = "this is an example for huffman encoding"d; foreach (c, n; s.dup.sort().release().group()) writefln("'%s' %s", c, n); } --- [1] http://d.puremagic.com/issues/show_bug.cgi?id=7361

I've run in to this a few times and it's very irritating. The lack of an explicit syntax introduces an ambiguity: foreach (c, n; s.dup.sort().release().group()) writefln("'%s' %s", c, n); should c be the index, or should the tuple be unpacked? Currently the tuple is unpacked. In the general case, there is no easy way of having n be the tuple and c be the size_t index. Sure, you could write foreach (size_t c, n; s.dup.sort().release().group()) writefln("'%s' %s", c, n); forcing n to be a tuple, which I think should work ( but doesn't, with a lot of irrelevant error messages: http://dpaste.dzfl.pl/7589a472 ), but if the first element of the tuple is also type size_t then you have a true ambiguity. tldr: If we're going to have tuple unpacking, we need some way of controlling it.
Mar 22 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 auto (lof, loa) = ...;

 In Haskell, Scala, Python, F#, etc, the semantics and syntax 
 are similar.

I'd hate any solution that distances ordinary structs, library tuples and fixed-sized arrays. If anything these should be trivially substitutable where makes sense.

In Bugzilla I suggested that syntax for arrays too, I'd like that: int[2] data = [10, 20]; auto (a, b) = data; Regarding tuples and structs, Tuples are implemented as structs, so forbidding struct unpacking is probably more work than not doing it. On the other hand structs and tuples are not exactly the same thing, despite one is implemented with the other.
 I even had proposal to about same effect, but it was ignored. 
 Among other things it replace .tupleof with more general 
 destructuring facility.

I don't want to use ".tupleof" to unpack a Tuple. Because it's a very commonly done operation, so it needs a short syntax. --------------------------- John Colvin:
 tldr: If we're going to have tuple unpacking, we need some way 
 of controlling it.

I agree, that's why in my proposed code I have added parentheses: foreach ((c, ref e); loa) foreach ((c, e); s.dup.sort().release.group.encode) Bye, bearophile
Mar 22 2013
prev sibling next sibling parent "ixid" <nuaccount gmail.com> writes:
 foreach ((c, ref e); loa)


I think parens become rather confusing, what about something like: foreach (|c, ref e|; loa)
Mar 22 2013
prev sibling next sibling parent "renoX" <renozyx gmail.com> writes:
On Friday, 22 March 2013 at 11:14:47 UTC, Dmitry Olshansky wrote:
 I'd hate any solution that distances ordinary structs, library 
 tuples and fixed-sized arrays.

Are struct/tuples similar to fixed-sized array? I thought that both are different because you can select which element you access at runtime in an array, but usually this is not possible with structs and tuples where this is selected at compile time and then I thought about reflection and I'm not so sure..
Mar 22 2013
prev sibling next sibling parent "renoX" <renozyx gmail.com> writes:
 The fact that you sort of can't easily index tuple of 3 ints as 
 a fixed array of 3 ints is a clear indication of a tendency 
 that I find disturbing.

But there is a reason for this: tuple/structs can contain heterogeneous type whereas array contains only homogeneous type.. So if you want to index a tuple/struct of say {int,int[128]} you have to build an array of pointers to these index, possible but clearly not free.. So I wouldn't agree that tuple/struct are 'trivially substitutable' to array..
Mar 22 2013
prev sibling next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 22 March 2013 at 13:41:31 UTC, renoX wrote:
 The fact that you sort of can't easily index tuple of 3 ints 
 as a fixed array of 3 ints is a clear indication of a tendency 
 that I find disturbing.

But there is a reason for this: tuple/structs can contain heterogeneous type whereas array contains only homogeneous type.. So if you want to index a tuple/struct of say {int,int[128]} you have to build an array of pointers to these index, possible but clearly not free.. So I wouldn't agree that tuple/struct are 'trivially substitutable' to array..

No. A tuple is by definition an ordered, finite sequence of elements. There is no problem to index that. tuple a = { 5; true; 0.1 } is(typeof(a[0]) == int) is(typeof(a[1]) == bool)
Mar 22 2013
prev sibling next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 22 March 2013 at 14:23:53 UTC, Tobias Pankrath wrote:
 But there is a reason for this: tuple/structs can contain 
 heterogeneous type whereas array contains only homogeneous 
 type..
 So if you want to index a tuple/struct of say {int,int[128]} 
 you have to build an array of pointers to these index, 
 possible but clearly not free..
 So I wouldn't agree that tuple/struct are 'trivially 
 substitutable' to array..

No. A tuple is by definition an ordered, finite sequence of elements. There is no problem to index that. tuple a = { 5; true; 0.1 } is(typeof(a[0]) == int) is(typeof(a[1]) == bool)

There is no problem to index that.

because the layout should be known at compile time.
Mar 22 2013
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Denis Shelomovskij:

 D already have undocumented[1] front tuple expansion in 
 foreach over range:
 ---
 import std.algorithm;
 import std.stdio;

 void main() {
    enum s = "this is an example for huffman encoding"d;
    foreach (c, n; s.dup.sort().release().group())
        writefln("'%s'  %s", c, n);
 }

I know, but it doesn't work in the case of my code I have shown, with a SortedRange of 2-tuples: void main() { auto s = "this is an example for huffman encoding"d; foreach (c, n; s.dup.sort().release.group.encode) writefln("'%s' %s", c, n); }

So I have suggested to remove that syntax and replace it with a syntax that is more explicit, more flexible and more general: http://d.puremagic.com/issues/show_bug.cgi?id=9817 Bye, bearophile
Mar 26 2013