www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Looking for something similar to Python's bisect_right

reply "Zeke" <lcarsos lcarsos.com> writes:
Hello, I'm on day 2 of learning D. I've learned C, C++, Java, 
Python, Ruby in University, but I wanted to broaden my palette by 
picking up D.

This project is a basic implementation of Project Euler problem 
10. I build an array of primes, and for each new test number I 
check if the test is divisible by any prime from 2 to sqrt(test) 
(which cuts down on number of iterations). Trouble is, I can't 
find a method that's in the libraries to calculate the index of 
where sqrt(test) would exist or would be inserted into a sorted 
array.

I tried casting the primes array to a SortedRange using 
assumeSorted() but it didn't work for me. countUntil() returns -1 
if the value doesn't exist. and I didn't find anything in the 
std.algorithm, std.array, or std.container.

Here's my current working implementation of is_prime and the 
functions it uses:

pure nothrow  safe bool is_prime(in ulong[] primes, ulong num) {
     // Need to find a more elegant way to find the index of where 
a number's
     // square root would be in a sorted array.
     auto upper = insind(primes, cast(ulong)sqrt(cast(double)num)) 
+ 1;
     // Check each prime in the array up to the square root of our 
current test
     // number to see if it is prime.
     foreach(ulong prime; primes[0..upper]) {
         if (num % prime == 0) {
             return false;
         }
     }
     return true;
}

/**
  * Friendly overload so I don't have to give the bounds of the 
array for the
  * recursive step.
  */
pure nothrow  safe ulong
insind(in ulong[] arr, ulong key)
{
     return insind(arr, key, 0, arr.length);
}

/**
  * Using tail-recursion (so we don't build up a huge call stack), 
binary search
  * the array for the index where the key would be found if it 
existed or were
  * to be inserted into the array.
  */
pure nothrow  safe ulong
insind(in ulong[] arr, ulong key, ulong lower, ulong upper)
{
     if (lower >= upper) {
         return lower;
     } else {
         ulong mid = (lower + upper) / 2;
         if (arr[mid] > key) {
             return insind(arr, key, lower, mid - 1);
         } else if (arr[mid] < key) {
             return insind(arr, key, mid + 1, upper);
         } else {
             return mid;
         }
     }
}

I would also appreciate tips to help my "D-ness" coding style 
develop.
Oct 29 2013
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 10/29/2013 11:04 PM, Zeke wrote:

lowerBound and friends are related:



Ali
Oct 29 2013
parent "Zeke" <lcarsos lcarsos.com> writes:
On Wednesday, 30 October 2013 at 06:10:59 UTC, Ali Çehreli wrote:
 On 10/29/2013 11:04 PM, Zeke wrote:

 lowerBound and friends are related:



 Ali
lowerBound returns a range with the last value being the sqrt, so I can't directly iterate over it because the squares of primes get into the primes array. My code looks like this now: ... auto sqrt = cast(ulong)sqrt(cast(double)num); auto sorted = assumeSorted(primes); ulong upper = sorted.lowerBound(sqrt).length + 1; foreach(ulong prime; sorted[0..upper]) { ... Which has no speed improvement, but is using a standard library, so that accomplishes that. I'd like to refactor so that the primes array is always a SortedRange, so I don't have to cast it for every call of is_prime(). What type would I need to change is_prime(ulong[] primes, ulong num) to so that I could give it the SortedRange? Also how would I insert a found prime onto the back of a SortedRange? Zeke
Oct 30 2013
prev sibling parent reply "qznc" <qznc web.de> writes:
On Wednesday, 30 October 2013 at 06:04:36 UTC, Zeke wrote:
 Hello, I'm on day 2 of learning D. I've learned C, C++, Java, 
 Python, Ruby in University, but I wanted to broaden my palette 
 by picking up D.

 This project is a basic implementation of Project Euler problem 
 10. I build an array of primes, and for each new test number I 
 check if the test is divisible by any prime from 2 to 
 sqrt(test) (which cuts down on number of iterations). Trouble 
 is, I can't find a method that's in the libraries to calculate 
 the index of where sqrt(test) would exist or would be inserted 
 into a sorted array.

 I tried casting the primes array to a SortedRange using 
 assumeSorted() but it didn't work for me. countUntil() returns 
 -1 if the value doesn't exist. and I didn't find anything in 
 the std.algorithm, std.array, or std.container.

 Here's my current working implementation of is_prime and the 
 functions it uses:

 pure nothrow  safe bool is_prime(in ulong[] primes, ulong num) {
     // Need to find a more elegant way to find the index of 
 where a number's
     // square root would be in a sorted array.
     auto upper = insind(primes, 
 cast(ulong)sqrt(cast(double)num)) + 1;
     // Check each prime in the array up to the square root of 
 our current test
     // number to see if it is prime.
     foreach(ulong prime; primes[0..upper]) {
         if (num % prime == 0) {
             return false;
         }
     }
     return true;
 }
Why do you want to find the exact prime first? Just check against sqrt(num) in the loop. auto upper = cast(ulong)sqrt(cast(double)num)) + 1; foreach(ulong prime; primes) { if (prime > upper) return true; if (num % prime == 0) return false; } assert (false); // should be unreachable?
Oct 30 2013
parent reply "Zeke" <lcarsos lcarsos.com> writes:
On Wednesday, 30 October 2013 at 14:17:22 UTC, qznc wrote:
 Why do you want to find the exact prime first? Just check 
 against sqrt(num) in the loop.

 auto upper = cast(ulong)sqrt(cast(double)num)) + 1;
 foreach(ulong prime; primes) {
   if (prime > upper) return true;
   if (num % prime == 0) return false;
 }
 assert (false); // should be unreachable?
Because having a branch inside the foreach causes a branch prediction error when you've found a prime number. Simply iterating up to the sqrt and then terminating the loop has no branch prediction error. Try it for yourself.
Oct 30 2013
parent "qznc" <qznc web.de> writes:
On Wednesday, 30 October 2013 at 18:56:42 UTC, Zeke wrote:
 On Wednesday, 30 October 2013 at 14:17:22 UTC, qznc wrote:
 Why do you want to find the exact prime first? Just check 
 against sqrt(num) in the loop.

 auto upper = cast(ulong)sqrt(cast(double)num)) + 1;
 foreach(ulong prime; primes) {
  if (prime > upper) return true;
  if (num % prime == 0) return false;
 }
 assert (false); // should be unreachable?
Because having a branch inside the foreach causes a branch prediction error when you've found a prime number. Simply iterating up to the sqrt and then terminating the loop has no branch prediction error. Try it for yourself.
Interesting thought. In your code, there are two conditional branches in each iteration: The check for the end of foreach range and the prime check. Why is one more condition for the upper check so bad? I admit, my code includes the superfluous foreach range check. Hence this improved version: ulong prime; int i = 0; assert (primes[$] > upper); do { prime = primes[i]; if (num % prime == 0) return false; i+=1; } while (prime < upper); return true; In this version there is still an implicit bounds check for the array access. For dmd "-noboundscheck" disables that branching. Optimizing for branch prediction sounds premature, since you are using a slow algorithm anyways. ;)
Oct 30 2013