www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - speeding up + ctfe question

reply maarten van damme <maartenvd1994 gmail.com> writes:
I finally made the primerange I wanted to make. I think I'm using a
rather dumb algorithm. The idea is to store the prime we're at in
curPrime and the amount of preceding primes in countPrime. Then
opindex is able to calculate how many primes following or preceding
curPrime we have to find to get the asked value.
I'm not english so some names can be poorly chosen. Code is at
http://dl.dropbox.com/u/15024434/d/primeRange.d

The problem is that it's pretty slow. I wanted to find the first prime
number with 10 digits and it literally takes forever (but manages it
though). Is there an easy way to speed up or is this the ceiling?
(better algorithms, stupidity's I've left,...)

I also have an array of primes I use to "quicktest" to see if
something is a prime. You can change it's size using setQuicktestSize
at runtime. I want to make it large at compile time though. as I'm new
to ctfe I have no idea as to how one can do this.
May 26 2012
next sibling parent sclytrack <sclytrack iq87.fr> writes:
On 05/26/2012 03:49 PM, maarten van damme wrote:
 I finally made the primerange I wanted to make. I think I'm using a
 rather dumb algorithm. The idea is to store the prime we're at in
 curPrime and the amount of preceding primes in countPrime. Then
 opindex is able to calculate how many primes following or preceding
 curPrime we have to find to get the asked value.
 I'm not english so some names can be poorly chosen. Code is at
 http://dl.dropbox.com/u/15024434/d/primeRange.d

 The problem is that it's pretty slow. I wanted to find the first prime
 number with 10 digits and it literally takes forever (but manages it
 though). Is there an easy way to speed up or is this the ceiling?
 (better algorithms, stupidity's I've left,...)

 I also have an array of primes I use to "quicktest" to see if
 something is a prime. You can change it's size using setQuicktestSize
 at runtime. I want to make it large at compile time though. as I'm new
 to ctfe I have no idea as to how one can do this.
See below for a rather dumb algorithm. The idea is to list a list of prime numbers. The problem is that it is pretty slow and you get a ... //src/main.d(57): Error: template instance main.isDivisable!(499,498) recursive expansion error when the number is too large. I'm not English so the names ARE poorly chosen. //writeln(primeList!(200)); //<---don't pick over 200 or it goes nuts //writeln(decompose!(49,2)); template primeList(int index) { static if (index > 1) { static if (isPrime!(index)) { enum int [] primeList = primeList!(index-1) ~ [index]; } else { enum int [] primeList = primeList!(index-1); } } else { enum int [] primeList = []; } } template isDivisable( int number, int divisor) { enum bool isDivisable = (number % divisor) == 0; } // 10 ---> [2,5] template decompose(int number, int d = 2) { static if (d < number) { static if (isDivisable!(number, d)) { enum int [] decompose = [d] ~ decompose!(number/d, d); } else { enum int [] decompose = decompose!(number, d+1); } } else { enum int [] decompose = [number]; } } template isPrimeDecompose(int number, int d = 2) { static if (d < number) { static if (isDivisable!(number, d)) { enum bool isPrimeDecompose = false; } else { enum bool isPrimeDecompose = isPrimeDecompose!(number, d+1); } } else { enum bool isPrimeDecompose = true; } } template isPrime(int number) { static if (number > 1) enum bool isPrime = isPrimeDecompose!(number, 2); else enum bool isPrime = false; }
May 26 2012
prev sibling parent reply "jerro" <a a.com> writes:
On Saturday, 26 May 2012 at 13:49:53 UTC, maarten van damme wrote:

Is there an easy way to speed up or is this the
 ceiling?
I got a 30 percent speedup by replacing this line: if(&& canFind(quicktestPrimes, possiblePrime)) with this: if(quicktestPrimes.back >= possiblePrime && canFind(quicktestPrimes, possiblePrime)) You could probably get a much better speedup by using some kind of a prime sieve. BTW you shouldn't be using opIndex like that. Calling opIndex shouldn't change observable state of the object and it should run in O(log(n)) or better. People expect an index operator to behave similarly to the index operator on arrays. If the function does something completely different, it is very confusing to make it an opIndex.
May 26 2012
parent reply maarten van damme <maartenvd1994 gmail.com> writes:
well, I was thinking about using a sieve but when you were to request
prime 400_000 you're sieve would blow up in size. That's why I opted
for something else (and I don't know if it was the right thing to do
though). (Ab)using opIndex like that is indeed a bit wrong but what
would be the alternative? remove slicing and introduce getPrime?
Wouldn't that be the same but without the great syntactic sugar?
May 26 2012
parent reply "jerro" <a a.com> writes:
On Saturday, 26 May 2012 at 18:40:53 UTC, maarten van damme wrote:
 well, I was thinking about using a sieve but when you were to 
 request
 prime 400_000 you're sieve would blow up in size.
Because you only need primes up to sqrt(n) to check if n is prime, you can make a sieve based range that only uses O(sqrt(n)) memory. I've put an example at https://gist.github.com/2795954 .It could be optimized further (skipping even numbers, keeping track of the multiples of small primes...), but it is already much faster than a range that does not use a sieve.
That's why I
 opted
 for something else (and I don't know if it was the right thing 
 to do
 though). (Ab)using opIndex like that is indeed a bit wrong but 
 what
 would be the alternative? remove slicing and introduce getPrime?
 Wouldn't that be the same but without the great syntactic sugar?
Syntax sugar isn't great if it confuses people and causes bugs. You can use r.drop(n).front to get nth element of a range. If you need to do it often, you could just write a nth() function.
May 26 2012
parent reply maarten van damme <maartenvd1994 gmail.com> writes:
well, maybe, looking back at it, a range isn't the ideal way to go
about generating primes easily. I'm going to implement the sieve of
Atkin and make it return an array of primes up to a given number. This
might be a bit more useful and fast.

Is there somewhere I can catch up on ctfe? after reading andrei's book
and the documentation it's still not clear to me how I could make my
setquicktestsize function run at compile time.
May 27 2012
parent reply sclytrack <sclytrack iq87.fr> writes:
On 05/27/2012 01:18 PM, maarten van damme wrote:
 well, maybe, looking back at it, a range isn't the ideal way to go
 about generating primes easily. I'm going to implement the sieve of
 Atkin and make it return an array of primes up to a given number. This
 might be a bit more useful and fast.

 Is there somewhere I can catch up on ctfe? after reading andrei's book
 and the documentation it's still not clear to me how I could make my
 setquicktestsize function run at compile time.
https://github.com/PhilippeSigaud/D-templates-tutorial
May 27 2012
parent reply maarten van damme <maartenvd1994 gmail.com> writes:
thank you :)

I wrote a sieve of aristotle because atkin's sieve needed waay to many
optimizations to beat aristotle's sieve by even a little bit so I
wanted to see if aristotle's was good enough.

I ran in two problems. It was extremely fast when sieving a byte array
of 1 million entries (without optimizations at all) but when trying
with 10_000_000 entries the program crashes before it even starts to
execute (main isn't reached).

Then I decided to benchmark the code and dmd fails to compile with
"Internal error: ..\ztc\cgcs.s 354".

I assume the first problem has to do with memory limits and that the
second is a little dmd bug. code is at
http://dl.dropbox.com/u/15024434/d/aristotleSieve.d
May 27 2012
next sibling parent sclytrack <sclytrack iq87.fr> writes:
On 05/27/2012 02:10 PM, maarten van damme wrote:
 thank you :)

 I wrote a sieve of aristotle because atkin's sieve needed waay to many
 optimizations to beat aristotle's sieve by even a little bit so I
 wanted to see if aristotle's was good enough.

 I ran in two problems. It was extremely fast when sieving a byte array
 of 1 million entries (without optimizations at all) but when trying
 with 10_000_000 entries the program crashes before it even starts to
 execute (main isn't reached).

 Then I decided to benchmark the code and dmd fails to compile with
 "Internal error: ..\ztc\cgcs.s 354".

 I assume the first problem has to do with memory limits and that the
 second is a little dmd bug. code is at
 http://dl.dropbox.com/u/15024434/d/aristotleSieve.d
Yeah I get the same errors. And I'm not acquainted with the benchmark code. -- http://dlang.org/concepts.html isprime there says that 2 is not a prime. enum result = isprime(2); compile time enum "result" says false. -- http://erdani.com/d/web/phobos-prerelease/std_benchmark.html (the future) -- Er is zon buiten. Zonnige zondag namiddag met priemgetallen in de plaats van buiten zitten. Tss tss. :-)
May 27 2012
prev sibling parent reply "jerro" <a a.com> writes:
 I ran in two problems. It was extremely fast when sieving a 
 byte array
 of 1 million entries (without optimizations at all) but when 
 trying
 with 10_000_000 entries the program crashes before it even 
 starts to
 execute (main isn't reached).
You say it does compile, but then crashes immediatly? I can't reproduce that. For me it is a compile time error if toSieve is 16MB or more, but otherwise it runs fine. Anyway you should use a dynamic array instead of a static one if you want really large arrays.
 Then I decided to benchmark the code and dmd fails to compile 
 with
 "Internal error: ..\ztc\cgcs.s 354".

 I assume the first problem has to do with memory limits and 
 that the
 second is a little dmd bug.
It is a bug, all internal errors are. It looks like you can avoid like this: auto tmp=(benchmark!(f0)(1000)); int time=tmp[0].to!("msecs", int);
May 27 2012
parent reply maarten van damme <maartenvd1994 gmail.com> writes:
ok, can't seem to reproduce the crashing. now on to optimizing my
sieve a bit more ,9 miliseconds for 1_000_000 is still to slow.
--
Er is zon buiten. Zonnige zondag namiddag met priemgetallen in de
plaats van buiten zitten. Tss tss. :-)
hoe? wie? x)
May 27 2012
parent reply "jerro" <a a.com> writes:
On Sunday, 27 May 2012 at 17:00:01 UTC, maarten van damme wrote:
 ok, can't seem to reproduce the crashing. now on to optimizing 
 my
 sieve a bit more ,9 miliseconds for 1_000_000 is still to slow.
 --
 Er is zon buiten. Zonnige zondag namiddag met priemgetallen in 
 de
 plaats van buiten zitten. Tss tss. :-)
 hoe? wie? x)
Have you tried turning optimization on? I get about 8ms with your code with no flags and about 2.9 with -O -inline -release. To optimize it further you could make a sieve that only works with odd numbers. I get (https://gist.github.com/2817289) about 1ms that way. I'm sure it is possible to do much better, but that would probably be much more work too.
May 27 2012
parent reply maarten van damme <maartenvd1994 gmail.com> writes:
I tried using 6k+-1 for all primes and for some reason it performed
slower. I think I have something completely inefficient somewhere,
can't figure out where though.
I think it has something to do with me increasing k and then
multiplying with k while I could have simply added 6 to K...

and I haven't tried with optimization on, first I want to get the
algorithm as good as possible before trying to squeeze out those extra
milliseconds :)
May 28 2012
parent reply "jerro" <a a.com> writes:
 I tried using 6k+-1 for all primes and for some reason it 
 performed
 slower. I think I have something completely inefficient 
 somewhere,
 can't figure out where though.
You aren't, by any chance, using divisions or remainders? those are much slower than, say, multiplications (unless the divisor is a power of two and known at compile time, in which case the compiler will optimize them to bitwise operations).
 and I haven't tried with optimization on, first I want to get 
 the
 algorithm as good as possible before trying to squeeze out 
 those extra
 milliseconds :)
I prefer to compile with optimizations on when trying to optimize something. That way I don't waste my time on micro-optimizations that the compiler could do for me.
May 28 2012
parent maarten van damme <maartenvd1994 gmail.com> writes:
I didn't use divisions or remainders, only multiplications. I've
changed it so it only uses addition and now it still doesn't
outperform a version that only checks odd's. it's not as fast as your
version where every index corresponds to i*2+1 because I fill every
even number with false...
May 28 2012