www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Add these to Phobos?

reply "Mehrdad" <wfunction hotmail.com> writes:
I think we need something like these in Phobos -- I was quite 
surprised that I didn't find these in Phobos. They're really 
handy in Python.


auto groupby(alias Key = k => k, alias Value = v => v, alias 
Equal = (a, b) => a == b, R)(R range)
	if (isInputRange!(R))
{
	struct GroupBy
	{
		alias typeof(Key(R.init.front)) TKey;
		R r;
		 property bool empty() { return this.r.empty; }
		void popFront()
		{
			auto k = Key(this.r.front);
			this.r.popFront();
			while (!this.r.empty && Equal(k, Key(this.r.front)))
			{ this.r.popFront(); }
		}
		 property auto front()
		{
			TKey k = Key(this.r.front);
			struct Grouper
			{
				TKey k;
				R r;
				 property bool empty()
				{ return this.r.empty || !Equal(this.k, Key(this.r.front)); }
				void popFront() { this.r.popFront(); }
				 property auto front()
				{ return Value(this.r.front); }
			}
			return Grouper(k, this.r);
		}
		static if (isForwardRange!(R))
		{  property typeof(this) save() { return 
typeof(this)(this.r.save()); } }
	}
	return GroupBy(range);
}

auto dict(R)(R range)
{
	ElementType!(R)[typeof(ElementType!(R).init[0])] result;
	foreach (t; range) { result[t[0]] = t /* or t[1]? */; }
	return result;
}

auto sorted(alias F = q{a < b}, SwapStrategy S = 
SwapStrategy.unstable, R)(R range)
{
	auto arr = range.array();
	arr.sort!(F, S)();
	return arr;
}


Ideas?
Oct 15 2012
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Mehrdad:

 auto groupby(alias Key = k => k, alias Value = v => v, alias 
 Equal = (a, b) => a == b, R)(R range)
 	if (isInputRange!(R))

I have not studied the semantics of this groupby, but Phobos contains a std.algorithm.group that is not too much different from the Python itertools function (there are some differences, but they are often acceptable).
 auto dict(R)(R range)
 {
 	ElementType!(R)[typeof(ElementType!(R).init[0])] result;
 	foreach (t; range) { result[t[0]] = t /* or t[1]? */; }
 	return result;
 }

There are various ways to design a similar function, and I think something similar is handy to have. I think I have already put an enhancement request for it in Bugzilla.
 auto sorted(alias F = q{a < b}, SwapStrategy S = 
 SwapStrategy.unstable, R)(R range)
 {
 	auto arr = range.array();
 	arr.sort!(F, S)();
 	return arr;
 }

http://d.puremagic.com/issues/show_bug.cgi?id=5076 Bye, bearophile
Oct 15 2012
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/12 5:35 PM, Mehrdad wrote:
 I think we need something like these in Phobos -- I was quite surprised
 that I didn't find these in Phobos. They're really handy in Python.

 Ideas?

Yes, I wanted to add some relational operators forever! There's a group() function in std.algorithm but doesn't offer a range of ranges. Andrei
Oct 15 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/12 7:39 PM, Mehrdad wrote:
 On Monday, 15 October 2012 at 22:35:41 UTC, Andrei Alexandrescu wrote:
 On 10/15/12 5:35 PM, Mehrdad wrote:
 I think we need something like these in Phobos -- I was quite surprised
 that I didn't find these in Phobos. They're really handy in Python.

 Ideas?

Yes, I wanted to add some relational operators forever! There's a group() function in std.algorithm but doesn't offer a range of ranges. Andrei

+1 yeah, group() was useless for me. I wanted to group consecutive items together by some criterion, which it (ironically) explicitly doesn't do.

It's not ironic, it's intentional. In libraries such as Scala's you specify GroupBy with whatever predicate against an arbitrary stream, and it takes care of all the intermediate data structures (e.g. hashes, arrays) and/or additional operations (sorting). In D, there's a strong emphasis of explicit costs and benefits, so to group by an arbitrary key you'd first sort by that key and then use the "dumb group" that only knows how to group on adjacent entries. I'm not sure which approach is best, but I tend to think the current approach is a better fit for D's general ethos. Andrei
Oct 17 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Monday, 15 October 2012 at 22:35:41 UTC, Andrei Alexandrescu 
wrote:
 On 10/15/12 5:35 PM, Mehrdad wrote:
 I think we need something like these in Phobos -- I was quite 
 surprised
 that I didn't find these in Phobos. They're really handy in 
 Python.

 Ideas?

Yes, I wanted to add some relational operators forever! There's a group() function in std.algorithm but doesn't offer a range of ranges. Andrei

+1 yeah, group() was useless for me. I wanted to group consecutive items together by some criterion, which it (ironically) explicitly doesn't do.
Oct 15 2012
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, October 15, 2012 23:35:59 Mehrdad wrote:
 auto sorted(alias F = q{a < b}, SwapStrategy S =
 SwapStrategy.unstable, R)(R range)
 {
 auto arr = range.array();
 arr.sort!(F, S)();
 return arr;
 }

What does this gain over sort? If you use sort auto result = sort(range); you get a SortedRange, which functions like find can take advantage of, and it's one line just like your sorted function. If you need a new array for it, then just call array. auto result = sort(array(range)); I don't see what your proposed function buys you. Is it just that you want to operate on the sorted array rather than a SortedRange? In that case, it's just auto arr = array(range); sort(arr); So, sorted would save you all of one line. Such a function seems incredibly lightweight to be adding it to the standard library. - Jonathan M Davis
Oct 15 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/12 9:29 PM, Mehrdad wrote:
 Hmm, I didn't know 'sort' returns a value. That's very confusing
 since now I have no idea if it's mutating or not. Python has both
 sort and sorted, with clear meanings. (The goal was to make a
 non-mutating version that I could call functional.) Maybe we
 should clarify what sort exactly does and doesn't do...

std.algorithm operates in place wherever possible, and sort's result is a view on the same range as the original, but with new capabilities created by the sorting process. I think it's a very elegant design. Andrei
Oct 17 2012
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/12 9:47 PM, Jonathan M Davis wrote:
 And if SortedRange is changed to provide access to its underlying range via a
 member named source (it was recently suggested by Andrei that we should
 standardize on doing that where appropriate), then it could become a one-
 liner:

 auto result = sort(array(range)).source;

Yes, I think we should move toward the ".source" convention for all applicable ranges, e.g. retro.source yields the original range etc. Andrei
Oct 17 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/12 10:35 PM, Mehrdad wrote:
 On Tuesday, 16 October 2012 at 01:47:58 UTC, Jonathan M Davis wrote:
 It should probably explain the rationale behind returning SortedRange
 so that it's much clearer as to why you'd want to use the return value
 rather than the original (now sorted) range.

+1 As it stands it's not at all clear from the documentation what the intention is, or how someone can go about sorting something without mutating it. Thanks for the responses.

Agreed. Part of the problem was that at the time I changed sort to return a value (it used to return void), severe compiler bugs forced me to take it out again for a while. So it's been "experimental" until it just silently started working and I forgot about it; and you know how documentation of experimental work goes... Regarding sorted(), one possibility would be to define a lazy sorting routine under that name or lazySort(). It would make for a better name than heap(), which implements the desired functionality. Andrei
Oct 17 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 16 October 2012 at 00:42:55 UTC, Jonathan M Davis
wrote:
 On Monday, October 15, 2012 23:35:59 Mehrdad wrote:
 auto sorted(alias F = q{a < b}, SwapStrategy S =
 SwapStrategy.unstable, R)(R range)
 {
 auto arr = range.array();
 arr.sort!(F, S)();
 return arr;
 }

What does this gain over sort? If you use sort auto result = sort(range); you get a SortedRange, which functions like find can take advantage of, and it's one line just like your sorted function. If you need a new array for it, then just call array. auto result = sort(array(range));

Hmm, I didn't know 'sort' returns a value. That's very confusing since now I have no idea if it's mutating or not. Python has both sort and sorted, with clear meanings. (The goal was to make a non-mutating version that I could call functional.) Maybe we should clarify what sort exactly does and doesn't do...
Oct 15 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, October 16, 2012 03:29:17 Mehrdad wrote:
 On Tuesday, 16 October 2012 at 00:42:55 UTC, Jonathan M Davis
 
 wrote:
 On Monday, October 15, 2012 23:35:59 Mehrdad wrote:
 auto sorted(alias F = q{a < b}, SwapStrategy S =
 SwapStrategy.unstable, R)(R range)
 {
 auto arr = range.array();
 arr.sort!(F, S)();
 return arr;
 }

What does this gain over sort? If you use sort auto result = sort(range); you get a SortedRange, which functions like find can take advantage of, and it's one line just like your sorted function. If you need a new array for it, then just call array. auto result = sort(array(range));

Hmm, I didn't know 'sort' returns a value. That's very confusing since now I have no idea if it's mutating or not. Python has both sort and sorted, with clear meanings. (The goal was to make a non-mutating version that I could call functional.) Maybe we should clarify what sort exactly does and doesn't do...

It sorts the range that's passed in and then returns a SortedRange which wraps the original range. The advantage of the SortedRange is that functions like find will then know that it's sorted and can take advantage of that, but if you just want to sort it, then just ignore the return value. The documentation could be clearer about the return type, but it _is_ listed as part of the signature. It should probably explain the rationale behind returning SortedRange so that it's much clearer as to why you'd want to use the return value rather than the original (now sorted) range. Regardless, if you want to sort a copy of a range but keep the same type, then all you have to do is auto newRange = array(range); sort(newRange); And if SortedRange is changed to provide access to its underlying range via a member named source (it was recently suggested by Andrei that we should standardize on doing that where appropriate), then it could become a one- liner: auto result = sort(array(range)).source; - Jonathan M Davis
Oct 15 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 16 October 2012 at 01:47:58 UTC, Jonathan M Davis 
wrote:
 It should probably explain the rationale behind returning 
 SortedRange so that it's much clearer as to why you'd want to 
 use the return value rather than the original (now sorted) 
 range.

+1 As it stands it's not at all clear from the documentation what the intention is, or how someone can go about sorting something without mutating it. Thanks for the responses.
Oct 15 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, October 16, 2012 03:47:43 Jonathan M Davis wrote:
 And if SortedRange is changed to provide access to its underlying range via
 a member named source (it was recently suggested by Andrei that we should
 standardize on doing that where appropriate), then it could become a one-
 liner:
 
 auto result = sort(array(range)).source;

Actually, it looks like SortedRange already provides access to the range that it wraps via release. It returns the wrapped range and removes it from the SortedRange, which is actually better than source, because if SortedRange had source, then it would be possible to make it so that it wasn't sorted anymore. In any case, that means that the one line becomes auto result = sort(array(range)).release(); - Jonathan M Davis
Oct 15 2012