www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Exposing pred from SortedRange and changes to algorithms that

reply aliak <something something.com> writes:
Would it be an acceptable enhancement to phobos to expose the 
predicate in SortedRange 
(https://dlang.org/library/std/range/sorted_range.html)?

The rationale behind it would be so that functions like 
setDifference 
(https://dlang.org/library/std/algorithm/setops/set_difference.html) or any
function that requires a sorted range to "run" does not need to be provided the
sorting predicate, ie:

setDifference
largestPartialIntersection
largestPartialIntersectionWeighted
multiwayUnion
setIntersection
setSymmetricDifference
setUnion
multiwayMerge

Where functions do required ranges to be sorted and a predicate 
passed in, this enhancement should reduce programmer errors (i.e. 
providing the wrong predicate to the algorithm that was used to 
sort the range that you want to operate on). The sortedrange 
knows how it was sorted, so there should be no need to duplicate 
that (unless there are maybe some valid reasons?).

It would also make for SortedRange itself to be made lighter 
(maybe not a good idea at this point but it can be done at least) 
by moving it's functionality in to algorithms that are already 
there. Eg, "SortedRange.contains" can be done by 
algorithm.canFind:

auto canFind(Range, T)(Range range, T element) if 
(isSortedRange!Range) {
   // do binary search for element
}

PS: Is this the correct place for posts like this? Apologies if 
not :s
Jan 12 2018
parent reply Seb <seb wilzba.ch> writes:
On Friday, 12 January 2018 at 09:52:36 UTC, aliak wrote:
 Would it be an acceptable enhancement to phobos to expose the 
 predicate in SortedRange 
 (https://dlang.org/library/std/range/sorted_range.html)?

 The rationale behind it would be so that functions like 
 setDifference 
 (https://dlang.org/library/std/algorithm/setops/set_difference.html) or any
function that requires a sorted range to "run" does not need to be provided the
sorting predicate, ie:

 setDifference
 largestPartialIntersection
 largestPartialIntersectionWeighted
 multiwayUnion
 setIntersection
 setSymmetricDifference
 setUnion
 multiwayMerge

 Where functions do required ranges to be sorted and a predicate 
 passed in, this enhancement should reduce programmer errors 
 (i.e. providing the wrong predicate to the algorithm that was 
 used to sort the range that you want to operate on). The 
 sortedrange knows how it was sorted, so there should be no need 
 to duplicate that (unless there are maybe some valid reasons?).

 It would also make for SortedRange itself to be made lighter 
 (maybe not a good idea at this point but it can be done at 
 least) by moving it's functionality in to algorithms that are 
 already there. Eg, "SortedRange.contains" can be done by 
 algorithm.canFind:

 auto canFind(Range, T)(Range range, T element) if 
 (isSortedRange!Range) {
   // do binary search for element
 }
canFind uses find internally, which already has a shortcut for SortedRange. I don't like contains either, but the idea was to have a separate method with different performance guarantees as canFind is typically O(n). Anyways I have tried to deprecate it: https://github.com/dlang/phobos/pull/5651 Maybe you have better arguments? ---- Now to your main question: Exposing doesn't help much, because you can't compare them, but there is WIP on lambda comparisons: https://github.com/dlang/dmd/pull/7484 With it, the default lamdas can be compared for equality and what you want to do can/will be done.
 PS: Is this the correct place for posts like this? Apologies if 
 not :s
Yeah I think it is.
Jan 12 2018
parent aliak <something something.com> writes:
On Friday, 12 January 2018 at 10:53:04 UTC, Seb wrote:
 canFind uses find internally, which already has a shortcut for 
 SortedRange.
 I don't like contains either, but the idea was to have a 
 separate method with different performance guarantees as 
 canFind is typically O(n).
 Anyways I have tried to deprecate it:

 https://github.com/dlang/phobos/pull/5651

 Maybe you have better arguments?
Ah I see. Nah I don't have better arguments :p Think yours were good enough. If I understood correctly the argument to keeping contains is because it means "here is a function that searches logarithmically"? Is that correct? While I do agree "contains" indicates some kind of quick(er) check functionality over an algorithm with the word find in it, I can't quite elaborate why that's the case, but I think it only applies when there's context, and not "generically", and because I've used hash maps a lot. The problem is that it's not enforceable and hence cannot be depended upon, so I don't think it's a good argument in the current scenario. A better convention would be to have the indiction of complexity explicitly in the name, i.e. name it "locagirhtmicFind" or "binarySearch", etc, and have that on a SortedRange. But I assume that ship has sailed... Tried looking for other languages which differentiate between find/search/exists/contains/etc based on complexity and I don't think there are any? Are there? Swift does "contains" C++ does "find" Rust does "contains" (and find??) Julia does "searchsorted" None of them make any complexity promises, the only one I can look at and would be surprised if it didn't do a faster search is the Julia one.
 ----

 Now to your main question:

 Exposing doesn't help much, because you can't compare them, but 
 there is WIP on lambda comparisons:

 https://github.com/dlang/dmd/pull/7484

 With it, the default lamdas can be compared for equality and 
 what you want to do can/will be done.
Aha ok so basically you're saying that even if pred was public then I can't make sure R1.pred == R2.pred so it does't make it a 100% guarantee right? But regardless of that wouldn't making pred public be needed to do the above anyway? I.e. if an algorithm wants to make sure the SortedRange preds were the same it still needs to access them right? And if the algorithms are single ranged, then you don't need to compare any lambdas then - i.e. canFind just uses whatever pred was in SortedRange. Or did I maybe misunderstand? Cheers
Jan 12 2018