www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Keeping a dynamic sorted range

reply "bearophile" <bearophileHUGS lycos.com> writes:
(This is a partial repost from a recent D.learn thread.)

In Phobos we have SortedRange and assumeSorted, but I do find 
them not very good for a common enough use case.

The use case is to keep a sorted array, keep adding items to it 
(adding larger and larger items at the end. Or sometimes even 
inserting items in the middle. In both cases I keep the sorting 
invariant). And while I add items, I also now and then want to 
perform a binary search on the sorted range.

So sometimes I'd like to do something like this (but a 
SortedRange doesn't have append):

struct Foo { int x; }
SortedRange!(Foo[], q{ a.x < b.x }) data;
data ~= Foo(5);
immutable n = data.upperBound(Foo(2)).length;

Bye,
bearophile
Nov 07 2014
next sibling parent reply Max Klyga <max.klyga gmail.com> writes:
On 2014-11-07 14:11:30 +0000, bearophile said:

 (This is a partial repost from a recent D.learn thread.)
 
 In Phobos we have SortedRange and assumeSorted, but I do find them not 
 very good for a common enough use case.
 
 The use case is to keep a sorted array, keep adding items to it (adding 
 larger and larger items at the end. Or sometimes even inserting items 
 in the middle. In both cases I keep the sorting invariant). And while I 
 add items, I also now and then want to perform a binary search on the 
 sorted range.
 
 So sometimes I'd like to do something like this (but a SortedRange 
 doesn't have append):
 
 struct Foo { int x; }
 SortedRange!(Foo[], q{ a.x < b.x }) data;
 data ~= Foo(5);
 immutable n = data.upperBound(Foo(2)).length;
 
 Bye,
 bearophile
Ranges are not container. They are meant for traversing. If you want a sorted range - use an underlying container that preserves ordering (trees, heaps)
Nov 07 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Max Klyga:

 Ranges are not container. They are meant for traversing. If you 
 want a sorted range - use an underlying container that 
 preserves ordering (trees, heaps)
Let's asssume that the underlying container is a sorted built-in dynamic array (that has more locality than a tree and allows very fast binary searches, unlike a heap). So what solution do you suggest? A pair of functions that work on a sorted container of that type for Phobos? Bye, bearophile
Nov 07 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Let's asssume that the underlying container is a sorted 
 built-in dynamic array (that has more locality than a tree and 
 allows very fast binary searches, unlike a heap).
An array, a sorted array, is a simple data structure that often wins in terms of memory, simplicity, and efficiency. Generally it's better to not use a more complex data structure if a simpler one suffices. Bye, bearophile
Nov 07 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/7/14 7:54 AM, bearophile wrote:
 Max Klyga:

 Ranges are not container. They are meant for traversing. If you want a
 sorted range - use an underlying container that preserves ordering
 (trees, heaps)
Let's asssume that the underlying container is a sorted built-in dynamic array (that has more locality than a tree and allows very fast binary searches, unlike a heap). So what solution do you suggest? A pair of functions that work on a sorted container of that type for Phobos? Bye, bearophile
One possibility is a function sortedChain that takes two or more sorted ranges and returns a lazy sorted range. -- Andrei
Nov 09 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 One possibility is a function sortedChain that takes two or 
 more sorted ranges and returns a lazy sorted range. -- Andrei
Perhaps you are answering another thread and another problem. In this problem there is only one range. Bye, bearophile
Nov 10 2014
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 10 November 2014 at 02:19:33 UTC, Andrei Alexandrescu 
wrote:
 One possibility is a function sortedChain that takes two or 
 more sorted ranges and returns a lazy sorted range. -- Andrei
setUnion? http://dlang.org/phobos/std_algorithm.html#.setUnion
Nov 10 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/14 5:06 AM, Vladimir Panteleev wrote:
 On Monday, 10 November 2014 at 02:19:33 UTC, Andrei Alexandrescu wrote:
 One possibility is a function sortedChain that takes two or more
 sorted ranges and returns a lazy sorted range. -- Andrei
setUnion? http://dlang.org/phobos/std_algorithm.html#.setUnion
That's right! -- Andrei
Nov 10 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 http://dlang.org/phobos/std_algorithm.html#.setUnion
That's right! -- Andrei
Still, this is very far from what I asked for... Bye, bearophile
Nov 10 2014
prev sibling next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 11/07/2014 06:11 AM, bearophile wrote:
 (This is a partial repost from a recent D.learn thread.)

 In Phobos we have SortedRange and assumeSorted, but I do find them not
 very good for a common enough use case.

 The use case is to keep a sorted array, keep adding items to it (adding
 larger and larger items at the end. Or sometimes even inserting items in
 the middle. In both cases I keep the sorting invariant). And while I add
 items, I also now and then want to perform a binary search on the sorted
 range.

 So sometimes I'd like to do something like this (but a SortedRange
 doesn't have append):

 struct Foo { int x; }
 SortedRange!(Foo[], q{ a.x < b.x }) data;
 data ~= Foo(5);
 immutable n = data.upperBound(Foo(2)).length;

 Bye,
 bearophile
If an array is sorted every time an element is added, then insertion is N.log(N) and searching is log(N). I don't know when that penalty is better than data locality that an array brings. A more traditional data structure is a binary tree in this case because it has log(N) for both insertion and search. On the other hand, array wins if the insertions are batched and then there is a single sort before many searches. As Max Klyga said, that single sort better be applied on the container, not on the range. Ali
Nov 07 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Ali Çehreli:

 If an array is sorted every time an element is added,
Items are most times added at the end, and they respect the sortness of the whole array. The array never gets sorted.
 On the other hand, array wins if the insertions are batched and
Insertions are not batched, and they are interspersed with searches. This is the common use case.
 As Max Klyga said, that single sort better be applied
 on the container, not on the range.
The container unfortunately doesn't know it is sorted, so I have to verify the sortness invariant manually, and before the search I need to use an assumeSorted(). This is not good. Bye, bearophile
Nov 07 2014
prev sibling next sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Friday, 7 November 2014 at 14:11:32 UTC, bearophile wrote:
 (This is a partial repost from a recent D.learn thread.)

 In Phobos we have SortedRange and assumeSorted, but I do find 
 them not very good for a common enough use case.

 The use case is to keep a sorted array, keep adding items to it 
 (adding larger and larger items at the end. Or sometimes even 
 inserting items in the middle. In both cases I keep the sorting 
 invariant). And while I add items, I also now and then want to 
 perform a binary search on the sorted range.

 So sometimes I'd like to do something like this (but a 
 SortedRange doesn't have append):

 struct Foo { int x; }
 SortedRange!(Foo[], q{ a.x < b.x }) data;
 data ~= Foo(5);
 immutable n = data.upperBound(Foo(2)).length;

 Bye,
 bearophile
Facing this same problem, a while ago I started work on a generic, higher-order container that provides insertion, deletion and search while keeping itself sorted: https://gist.github.com/JakobOvrum/f1738d31bb7ba7a46581 The above is just a WIP; it's not complete. Of course, positional container primitives like `insertFront` and `insertBack` will not be supported. The implementation is fairly messy due to the lack of traits for containers, as well as due to some deficiencies in `SortedRange`. It's obviously useful for arrays, and it's kind of clever how it can merge lists efficiently, but I'm not sure if it's really worth all the effort; is it really useful to have something like this that aims to support such a wide range of underlying containers? Is it actually useful in real programs for anything but arrays? So, I stopped working on it...
Nov 07 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Jakob Ovrum:

 Of course, positional container primitives like `insertFront` 
 and `insertBack` will not be supported.
Sometimes (or often) "insertBack" (or better the ~= operator) is what I want, because I add items larger than ones already present. insertBack has to verify the array is empty, or the last one is smaller (or verifies the sorting predicate) than the item I'm going to append.
 is it really useful to have something like this that aims to 
 support such a wide range of underlying containers? Is it 
 actually useful in real programs for anything but arrays? So, I 
 stopped working on it...
In most cases a built-in dynamic array is fine. Once in a while you want to use an std.array.Array. When they are not enough, I use a sorted tree or something else. Bye, bearophile
Nov 08 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
It doesn't really address your question, but if you're mostly 
inserting you may want to store it as a heap and sort before 
lookup.
Nov 10 2014
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
To address your question a bit more fully, it seems like this is 
something the range proposal for C++ is better suited for. Then 
appending would just be a special case of inserting at a specific 
position. I've got to say, if I had the time I'd implement the 
C++ ranges in D. They seem more powerful and easier to use.
Nov 10 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/10/14 7:42 AM, Sean Kelly wrote:
 To address your question a bit more fully, it seems like this is
 something the range proposal for C++ is better suited for. Then
 appending would just be a special case of inserting at a specific
 position. I've got to say, if I had the time I'd implement the C++
 ranges in D. They seem more powerful and easier to use.
What would be their advantages? -- Andrei
Nov 10 2014
parent "Sean Kelly" <sean invisibleduck.org> writes:
On Monday, 10 November 2014 at 19:59:02 UTC, Andrei Alexandrescu 
wrote:
 On 11/10/14 7:42 AM, Sean Kelly wrote:
 To address your question a bit more fully, it seems like this 
 is
 something the range proposal for C++ is better suited for. Then
 appending would just be a special case of inserting at a 
 specific
 position. I've got to say, if I had the time I'd implement the 
 C++
 ranges in D. They seem more powerful and easier to use.
What would be their advantages?
The ability to denote a position within a range is an important facet of some algorithms, such as rotate. For example, see: https://ericniebler.github.io/std/wg21/D4128.html#iterator-operations-are-primitive
Nov 10 2014