www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Fastest way to "ignore" elements from array without removal

reply z <z z.com> writes:
What would be the overall best manner(in ease of implementation 
and speed) to arbitrarily remove an item in the middle of an 
array while iterating through it?
So far i've thought about simply using D's standard [0...x] ~ 
[x+1..$] with an array of indexes but wouldn't that just cause 
slow reallocations? According to wikipedia the performance would 
be suboptimal.[1]
I've also thought about using a pointer array and just assigning 
a null pointer when the item doesn't need to be iterated on but 
i'm not sure which method will result in the fastest execution 
times for the general case where over half of the items will be 
removed from the index/pointer array.

Big thanks

[1] - https://en.wikipedia.org/wiki/Dynamic_array#Performance
Feb 15 2021
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 16 February 2021 at 04:20:06 UTC, z wrote:
 What would be the overall best manner(in ease of implementation 
 and speed) to arbitrarily remove an item in the middle of an 
 array while iterating through it?
http://phobos.dpldocs.info/std.algorithm.iteration.filter.html
Feb 15 2021
parent reply z <z z.com> writes:
On Tuesday, 16 February 2021 at 04:43:33 UTC, Paul Backus wrote:
 On Tuesday, 16 February 2021 at 04:20:06 UTC, z wrote:
 What would be the overall best manner(in ease of 
 implementation and speed) to arbitrarily remove an item in the 
 middle of an array while iterating through it?
http://phobos.dpldocs.info/std.algorithm.iteration.filter.html
Does filter support multiple arguments for the predicate?(i.e. using a function that has a "bool function(T1 a, T2 b)" prototype) If not could still implement the function inside the loop but that would be unwieldy. And does it create copies every call? this is important because if i end up using .filter it will be called a 6 to 8 digit number of times.
Feb 16 2021
parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 16 February 2021 at 09:08:33 UTC, z wrote:
 Does filter support multiple arguments for the predicate?(i.e. 
 using a function that has a "bool function(T1 a, T2 b)" 
 prototype)
I am not sure exactly what you are asking here, but you can probably accomplish what you want by combining filter with std.range.chunks or std.range.slide. http://phobos.dpldocs.info/std.range.chunks.html http://phobos.dpldocs.info/std.range.slide.html
 If not could still implement the function inside the loop but 
 that would be unwieldy.
 And does it create copies every call? this is important because 
 if i end up using .filter it will be called a 6 to 8 digit 
 number of times.
filter does not create any copies of the original array. The same is true for pretty much everything in std.range and std.algorithm.
Feb 16 2021
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Feb 16, 2021 at 04:20:06AM +0000, z via Digitalmars-d-learn wrote:
 What would be the overall best manner(in ease of implementation and
 speed) to arbitrarily remove an item in the middle of an array while
 iterating through it?
 So far i've thought about simply using D's standard [0...x] ~ [x+1..$]
 with an array of indexes but wouldn't that just cause slow
 reallocations?  According to wikipedia the performance would be
 suboptimal.[1]
 I've also thought about using a pointer array and just assigning a
 null pointer when the item doesn't need to be iterated on but i'm not
 sure which method will result in the fastest execution times for the
 general case where over half of the items will be removed from the
 index/pointer array.
[...] It depends on what your goal is. Do you want to permanently remove the items from the array? Or only skip over some items while iterating over it? For the latter, see std.algorithm.iteration.filter. For the former, you can use the read-head/write-head algorithm: keep two indices as you iterate over the array, say i and j: i is for reading (incremented every iteration) and j is for writing (not incremented if array[i] is to be deleted). Each iteration, if j < i, copy array[i] to array[j]. At the end of the loop, assign the value of j to the length of the array. Example: int[] array = ...; size_t i=0, j=0; while (i < array.length) { doSomething(array[i]); if (!shouldDelete(array[i])) j++; if (j < i) array[j] = array[i]; i++; } array.length = j; Basically, the loop moves elements up from the back of the array on top of elements to be deleted. This is done in tandem with processing each element, so it requires only traversing array elements once, and copies array elements at most once for the entire loop. Array elements are also read / copied sequentially, to maximize CPU cache-friendliness. T -- Without outlines, life would be pointless.
Feb 15 2021
next sibling parent z <z z.com> writes:
On Tuesday, 16 February 2021 at 06:03:50 UTC, H. S. Teoh wrote:
 It depends on what your goal is.  Do you want to permanently 
 remove the items from the array?  Or only skip over some items 
 while iterating over it?  For the latter, see 
 std.algorithm.iteration.filter.
The array itself is read only, so it'll have to be an array of pointers/indexes.
 For the former, you can use the read-head/write-head algorithm: 
 keep two indices as you iterate over the array, say i and j: i 
 is for reading (incremented every iteration) and j is for 
 writing (not incremented if array[i] is to be deleted).  Each 
 iteration, if j < i, copy array[i] to array[j].  At the end of 
 the loop, assign the value of j to the length of the array.

 Example:

 	int[] array = ...;
 	size_t i=0, j=0;
 	while (i < array.length)
 	{
 		doSomething(array[i]);
 		if (!shouldDelete(array[i]))
 			j++;
 		if (j < i)
 			array[j] = array[i];
 		i++;
 	}
 	array.length = j;

 Basically, the loop moves elements up from the back of the 
 array on top of elements to be deleted.  This is done in tandem 
 with processing each element, so it requires only traversing 
 array elements once, and copies array elements at most once for 
 the entire loop.  Array elements are also read / copied 
 sequentially, to maximize CPU cache-friendliness.


 T
This is most likely ideal for what i'm trying to do.(resizes/removes will probably have to propagate to other arrays) The only problem is that it does not work with the first element but i could always just handle the special case on my own.[1] I'll probably use .filter or an equivalent for an initial first pass and this algorithm for the rest, thank you both! [1] https://run.dlang.io/is/f9p29A (the first element is still there, and the last element is missing. both occur if the first element didn't pass the check.)
Feb 16 2021
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 2/16/21 1:03 AM, H. S. Teoh wrote:
 For the former, you can use the read-head/write-head algorithm: keep two
 indices as you iterate over the array, say i and j: i is for reading
 (incremented every iteration) and j is for writing (not incremented if
 array[i] is to be deleted).  Each iteration, if j < i, copy array[i] to
 array[j].  At the end of the loop, assign the value of j to the length
 of the array.
std.algorithm.mutation.remove does this for you. It's just a bit awkward as it doesn't do this based on values, you have to pass a lambda. auto removed = arr.remove!(v => v == target); // removed is now the truncated array And if you don't care about preserving the order, it can be done faster: auto removed = arr.remove!(v => v == target, SwapStrategy.unstable); -Steve
Feb 16 2021