www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Merging one Array with Another

reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
What's the fastest Phobos-way of doing either

     x ~= y; // append
     x = x.uniq; // remove duplicates

or

     x = (x ~ y).uniq; // append and remove duplicates in one go

provided that

     T[] x, y;

?
May 01 2015
next sibling parent reply "Ilya Yaroshenko" <ilyayaroshenko gmail.com> writes:
Both variants are wrong because uniq needs sorted ranges.

Probably you need something like that:

x = x.chain(y).sort.uniq.array;

On Friday, 1 May 2015 at 19:08:51 UTC, Per Nordlöw wrote:
 What's the fastest Phobos-way of doing either

     x ~= y; // append
     x = x.uniq; // remove duplicates

 or

     x = (x ~ y).uniq; // append and remove duplicates in one go

 provided that

     T[] x, y;

 ?
May 01 2015
next sibling parent reply "Ilya Yaroshenko" <ilyayaroshenko gmail.com> writes:
If x is already sorted

x = x.completeSort(y).uniq.array;

On Friday, 1 May 2015 at 19:30:08 UTC, Ilya Yaroshenko wrote:
 Both variants are wrong because uniq needs sorted ranges.

 Probably you need something like that:

 x = x.chain(y).sort.uniq.array;

 On Friday, 1 May 2015 at 19:08:51 UTC, Per Nordlöw wrote:
 What's the fastest Phobos-way of doing either

    x ~= y; // append
    x = x.uniq; // remove duplicates

 or

    x = (x ~ y).uniq; // append and remove duplicates in one go

 provided that

    T[] x, y;

 ?
May 01 2015
parent reply "Ilya Yaroshenko" <ilyayaroshenko gmail.com> writes:
fix:

	completeSort(x.assumeSorted, y);
	x = x.chain(y).uniq.array;

or

	completeSort(x.assumeSorted, y.uniq.array);
	x ~= y;


On Friday, 1 May 2015 at 19:34:20 UTC, Ilya Yaroshenko wrote:
 If x is already sorted

 x = x.completeSort(y).uniq.array;

 On Friday, 1 May 2015 at 19:30:08 UTC, Ilya Yaroshenko wrote:
 Both variants are wrong because uniq needs sorted ranges.

 Probably you need something like that:

 x = x.chain(y).sort.uniq.array;

 On Friday, 1 May 2015 at 19:08:51 UTC, Per Nordlöw wrote:
 What's the fastest Phobos-way of doing either

   x ~= y; // append
   x = x.uniq; // remove duplicates

 or

   x = (x ~ y).uniq; // append and remove duplicates in one go

 provided that

   T[] x, y;

 ?
May 01 2015
parent "Ilya Yaroshenko" <ilyayaroshenko gmail.com> writes:
fix:

  	completeSort(x.assumeSorted, y);
  	x = x.chain(y).uniq.array;

  or  (was fixed)

         y = y.sort().uniq.array;
  	completeSort(x.assumeSorted, y);
  	x ~= y;
May 01 2015
prev sibling next sibling parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Friday, 1 May 2015 at 19:30:08 UTC, Ilya Yaroshenko wrote:
 Probably you need something like that:

 x = x.chain(y).sort.uniq.array;
You're right: import std.stdio, std.algorithm, std.range; auto x = [11, 3, 2, 4, 5, 1]; auto y = [0, 3, 10, 2, 4, 5, 1]; writeln(x.chain(y).uniq); writeln(x.chain(y).sort.uniq); outputs [11, 3, 2, 4, 5, 1, 0, 3, 10, 2, 4, 5, 1] [0, 1, 2, 3, 4, 5, 10, 11] so why doesn't say anything about need for sortness!? I expected D to be strict here and SortedRange as input to uniq.
May 01 2015
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 05/01/2015 03:41 PM, "Per =?UTF-8?B?Tm9yZGzDtnci?= 
<per.nordlow gmail.com>" wrote:

 so why doesn't



 say anything about need for sortness!?
It is about the uniqueness of consecutive elements. It says "unique consecutive elements". Ali
May 01 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Friday, 1 May 2015 at 22:47:17 UTC, Ali Çehreli wrote:
 It is about the uniqueness of consecutive elements. It says 
 "unique consecutive elements".

 Ali
Ah. Then I guess that if input is SortedRange then SortedRange should be returned. Thanks.
May 01 2015
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 05/01/2015 06:39 PM, "Per =?UTF-8?B?Tm9yZGzDtnci?= 
<per.nordlow gmail.com>" wrote:> On Friday, 1 May 2015 at 22:47:17 UTC, 
Ali Çehreli wrote:
 It is about the uniqueness of consecutive elements. It says "unique
 consecutive elements".

 Ali
Ah. Then I guess that if input is SortedRange then SortedRange should be returned.
Interesting idea. I have defined a simple algorithm below to see how it could work (my skipped() function instead of uniq()). import std.stdio; import std.range; import std.algorithm; import std.exception; struct Skipper(R) { R range; this(R range) { enforce(range.length % 2 == 0, "This example requires even number of elements"); this.range = range; } property bool empty() { return range.empty; } property auto front() { return range.front; } void popFront() { range.popFront(); range.popFront(); } } auto skipped(R)(R range) { import std.traits; static if (isInstanceOf!(SortedRange, R)) { // Nice! Let's add an .assumeSorted at the end return Skipper!R(range).assumeSorted; } else { return Skipper!R(range); } } void use(R)(R range) { pragma(msg, R); writeln(range); writeln(); } void main() { auto a = [ 1, 2, 3, 4, 5, 6 ]; use(a.skipped); use(a.assumeSorted.skipped); } Prints at compile time: Skipper!(int[]) SortedRange!(Skipper!(SortedRange!(int[], "a < b")), "a < b") Prints at run time: [1, 3, 5] [1, 3, 5] One issue is that many standard algorithms could benefit from this as well. Should it be only for SortedRange? What about user defined types... What if I wanted MyCoolRange to be appended automatically as .myCoolRange. Could there we standard mechanism to support any range type? Maybe the answer is a helper mixin that defines a template specialization for SortedRange!(R2, Func) instead of my 'static if' solution. (I was too impatient to get that syntax right. :) ) Ali
May 01 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Saturday, 2 May 2015 at 04:08:04 UTC, Ali Çehreli wrote:
 Interesting idea. I have defined a simple algorithm below to 
 see how it could work (my skipped() function instead of uniq()).
That's a bit above my current D experience to decide. What about just tweaking uniq() for now to propagate SortedRange and leave the rest for later? Or will this break existing uses of uniq?
May 03 2015
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 05/03/2015 07:56 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?= 
<per.nordlow gmail.com>" wrote:
 On Saturday, 2 May 2015 at 04:08:04 UTC, Ali Çehreli wrote:
 Interesting idea. I have defined a simple algorithm below to see how
 it could work (my skipped() function instead of uniq()).
That's a bit above my current D experience to decide. What about just tweaking uniq() for now to propagate SortedRange and leave the rest for later?
The implementation seems trivial to me. If others don't object, I suggest you open an enhancement request.
 Or will this break existing uses of uniq?
Other than the fact that uniq may return SortedRange, I don't see any issue. If an existing piece of code was explicitly checking whether a certain range object was UniqResult, no code should be affected. Another possibility is to return UniqResult in both cases but expose the special SortedRange member functions on it if the input were SortedRange. (Again, trivial.) Ali
May 03 2015
prev sibling parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Friday, 1 May 2015 at 19:30:08 UTC, Ilya Yaroshenko wrote:
 Both variants are wrong because uniq needs sorted ranges.

 Probably you need something like that:

 x = x.chain(y).sort.uniq.array;
Interesting. Is x = x.chain(y).sort faster than x ~= y; x.sort; ? If so why?
May 02 2015
parent reply "Meta" <jared771 gmail.com> writes:
On Saturday, 2 May 2015 at 10:18:07 UTC, Per Nordlöw wrote:
 On Friday, 1 May 2015 at 19:30:08 UTC, Ilya Yaroshenko wrote:
 Both variants are wrong because uniq needs sorted ranges.

 Probably you need something like that:

 x = x.chain(y).sort.uniq.array;
Interesting. Is x = x.chain(y).sort faster than x ~= y; x.sort; ? If so why?
Probably the latter is slower than the former, at the very least because the latter requires memory allocation whereas the former does not.
May 02 2015
parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Saturday, 2 May 2015 at 11:16:30 UTC, Meta wrote:
 Probably the latter is slower than the former, at the very 
 least because the latter requires memory allocation whereas the 
 former does not.
Ahh!, auto x = [11, 3, 2, 4, 5, 1]; auto y = [0, 3, 10, 2, 4, 5, 1]; auto z = x.chain(y).sort; // sort them in place assert(x == [0, 1, 1, 2, 2, 3]); assert(y == [3, 4, 4, 5, 5, 10, 11]); is very cool, provided that y is allowed to mutate. Now I understand why chain is useful. A reallocation will however be needed in the final assignment though. But no temporary!
May 02 2015
prev sibling parent reply "Andrea Fontana" <nospam example.com> writes:
On Friday, 1 May 2015 at 19:08:51 UTC, Per Nordlöw wrote:
 What's the fastest Phobos-way of doing either

     x ~= y; // append
     x = x.uniq; // remove duplicates

 or

     x = (x ~ y).uniq; // append and remove duplicates in one go

 provided that

     T[] x, y;

 ?
Maybe a way like this could be useful: http://dpaste.dzfl.pl/7b4b37b490a7
May 06 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Wednesday, 6 May 2015 at 16:05:15 UTC, Andrea Fontana wrote:
 Maybe a way like this could be useful:
 http://dpaste.dzfl.pl/7b4b37b490a7
If r is a SortedRange this is very unneccesary wasteful because of the use AA. In that case you, instead, only want to remove equal consequtive elements without the need for any AA. I'm guessing there's already an algorithm for this somewhere in Phobos. Ideas anyone?
May 06 2015
parent reply "Andrea Fontana" <nospam example.com> writes:
On Thursday, 7 May 2015 at 06:53:39 UTC, Per Nordlöw wrote:
 On Wednesday, 6 May 2015 at 16:05:15 UTC, Andrea Fontana wrote:
 Maybe a way like this could be useful:
 http://dpaste.dzfl.pl/7b4b37b490a7
If r is a SortedRange this is very unneccesary wasteful because of the use AA. In that case you, instead, only want to remove equal consequtive elements without the need for any AA. I'm guessing there's already an algorithm for this somewhere in Phobos. Ideas anyone?
It's not that difficult to implement. You just need to implement a merge() range that returns the min of all ranges' front(). Then you can define distinct() for SortedRange as: merge(sortedrange1, sortedrange2, sortedrange3).uniq Andrea
May 07 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Thursday, 7 May 2015 at 08:03:41 UTC, Andrea Fontana wrote:
 It's not that difficult to implement.
 You just need to implement a merge() range that returns the min 
 of all ranges' front(). Then you can define distinct() for 
 SortedRange as:

 merge(sortedrange1, sortedrange2, sortedrange3).uniq

 Andrea
Why do you need variadics here? Why the need for sortedrange1, sortedrange2 and sortedrange3? I was only interested in removing equal consequtive elements within the same range.
May 07 2015
next sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Thursday, 7 May 2015 at 09:21:58 UTC, Per Nordlöw wrote:
 I was only interested in removing equal consequtive elements 
 within the same range.
I looked at UniqResult. What we need is to fix the typesystem with perhaps some traits the figure out which ranges (multi-layered meta-ranges) posses the sorted property or not.
May 07 2015
prev sibling parent reply "Andrea Fontana" <nospam example.com> writes:
On Thursday, 7 May 2015 at 09:21:58 UTC, Per Nordlöw wrote:
 On Thursday, 7 May 2015 at 08:03:41 UTC, Andrea Fontana wrote:
 It's not that difficult to implement.
 You just need to implement a merge() range that returns the 
 min of all ranges' front(). Then you can define distinct() for 
 SortedRange as:

 merge(sortedrange1, sortedrange2, sortedrange3).uniq

 Andrea
Why do you need variadics here? Why the need for sortedrange1, sortedrange2 and sortedrange3? I was only interested in removing equal consequtive elements within the same range.
Because it is a more generic operation and you can work on a lazy range. Anyway, to sort and to do uniq it isn't the fastest way. Or maybe I just didn't understand what you really need. :)
May 07 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Thursday, 7 May 2015 at 13:38:23 UTC, Andrea Fontana wrote:
 Because it is a more generic operation and you can work on a 
 lazy range.
 Anyway, to sort and to do uniq it isn't the fastest way.

 Or maybe I just didn't understand what you really need. :)
Thanks. These are good ideas in general. I'm curious to why something like merge isn't already in Phobos. The most similar existing range in Phobos is probably http://dlang.org/phobos/std_range.html#roundRobin I believe MergeRange will be bi-directional right? Further, I believe minElement and maxElement at https://github.com/nordlow/justd/blob/master/algorithm_ex.d#L215 should have a CT-specialization in the case when input is a SortedRange. In that case minElement should return front and maxElement should return back, right?
May 07 2015
next sibling parent reply "Andrea Fontana" <nospam example.com> writes:
On Thursday, 7 May 2015 at 21:53:24 UTC, Per Nordlöw wrote:
 On Thursday, 7 May 2015 at 13:38:23 UTC, Andrea Fontana wrote:
 Because it is a more generic operation and you can work on a 
 lazy range.
 Anyway, to sort and to do uniq it isn't the fastest way.

 Or maybe I just didn't understand what you really need. :)
Thanks. These are good ideas in general. I'm curious to why something like merge isn't already in Phobos. The most similar existing range in Phobos is probably http://dlang.org/phobos/std_range.html#roundRobin I believe MergeRange will be bi-directional right? Further, I believe minElement and maxElement at https://github.com/nordlow/justd/blob/master/algorithm_ex.d#L215 should have a CT-specialization in the case when input is a SortedRange. In that case minElement should return front and maxElement should return back, right?
Name could be misleading. This is a sortedrange: [4,3,2,1,0]. In your case minElement is 4, maxElement is 0 :) On ranges with more complex elements sort order can be even less obvious. I think first and back it's ok!
May 08 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Friday, 8 May 2015 at 08:27:19 UTC, Andrea Fontana wrote:
 Name could be misleading. This is a sortedrange: [4,3,2,1,0]. 
 In your case minElement is 4, maxElement is 0 :) On ranges with 
 more complex elements sort order can be even less obvious. I 
 think first and back it's ok!
Ok. I guess you mean front and back right?
May 08 2015
parent "Andrea Fontana" <nospam example.com> writes:
On Friday, 8 May 2015 at 09:23:42 UTC, Per Nordlöw wrote:
 On Friday, 8 May 2015 at 08:27:19 UTC, Andrea Fontana wrote:
 Name could be misleading. This is a sortedrange: [4,3,2,1,0]. 
 In your case minElement is 4, maxElement is 0 :) On ranges 
 with more complex elements sort order can be even less 
 obvious. I think first and back it's ok!
Ok. I guess you mean front and back right?
ahah :) You're right!
May 08 2015
prev sibling parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Thursday, 7 May 2015 at 21:53:24 UTC, Per Nordlöw wrote:
 Thanks. These are good ideas in general. I'm curious to why
 something like merge isn't already in Phobos. The most similar
 existing range in Phobos is probably
So I implemented a first working version of merge() by reusing roundRobin(). Working version at: https://github.com/nordlow/justd/blob/master/range_ex.d#L599 Destroy!
May 11 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 11 May 2015 at 07:05:30 UTC, Per Nordlöw wrote:
 So I implemented a first working version of merge() by reusing 
 roundRobin(). Working version at:

 https://github.com/nordlow/justd/blob/master/range_ex.d#L599

 Destroy!
I guess we should support bi-directional access through back() and popBack() aswell right?
May 11 2015
next sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 11 May 2015 at 07:12:05 UTC, Per Nordlöw wrote:
 I guess we should support bi-directional access through back() 
 and popBack() aswell right?
Does this mean that we need to statically check whether the SortedRanges support Bidirectional access or not? Or is a SortedRange by convention always random-access?
May 11 2015
prev sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 11 May 2015 at 07:12:05 UTC, Per Nordlöw wrote:
 I guess we should support bi-directional access through back() 
 and popBack() aswell right?
I believe I have BidirectionalRange support aswell now at https://github.com/nordlow/justd/blob/master/range_ex.d#L597
May 11 2015