
digitalmars.D.learn - Why is this code slow?

reply Csaba <feketecsaba gmail.com> writes:
I know that benchmarks are always controversial and depend on a 
lot of factors. So far, I have read that D performs very well in 
benchmarks, as well as, if not better than, C.

I wrote a little program that approximates PI using the Leibniz 
formula. I implemented the same thing in C, D and Python, all of 
them execute 1,000,000 iterations 20 times and display the 
average time elapsed.
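
(For reference, the Leibniz series is π/4 = 1 - 1/3 + 1/5 - 1/7 + ..., truncated here after roughly 1,000,000 terms and multiplied by 4.)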

Here are the results:

C: 0.04s
Python: 0.33s
D: 0.73s

What the hell? D slower than Python? This cannot be real. I am 
sure I am making a mistake here. I'm sharing all 3 programs here:

C: https://pastebin.com/s7e2HFyL
D: https://pastebin.com/fuURdupc
Python: https://pastebin.com/zcXAkSEf

As you can see the function that does the job is exactly the same 
in C and D.

Here are the compile/run commands used:

C: `gcc leibniz.c -lm -oleibc`
D: `gdc leibniz.d -frelease -oleibd`
Python: `python3 leibniz.py`

PS. my CPU is AMD A8-5500B and my OS is Ubuntu Linux, if that 
matters.
Mar 24
next sibling parent matheus <matheus gmail.com> writes:
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
 ...

 Here are the results:

 C: 0.04s
 Python: 0.33s
 D: 0.73s
 
 ...
I think a few things can be going on, but one way to go is to try optimization flags like "-O2" and run again.

But anyway, looking through the generated assembly:

C: https://godbolt.org/z/45Kn1W93b
D: https://godbolt.org/z/Ghr3fqaTW

The Leibniz functions are very close to each other, except for one thing: the "pow" function on the D side. It's a template, so maybe you should start from there; in fact, I'd try the pow from C to see what happens.

Matheus.
Mar 24
prev sibling next sibling parent Sergey <kornburn yandex.ru> writes:
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
 As you can see the function that does the job is exactly the 
 same in C and D.
Not really. The speed of the Leibniz algo is mostly the same. You can check the code in this benchmark for example: https://github.com/niklas-heer/speed-comparison

What you could fix in your code:

* use enum for BENCHMARKS and ITERATIONS
* use pow from core.stdc.math
* use sw.reset() in the loop

So the main part could look like this:

```d
auto sw = StopWatch(AutoStart.no);
sw.start();
foreach (i; 0..BENCHMARKS) {
    result += leibniz(ITERATIONS);
    total_time += sw.peek.total!"nsecs";
    sw.reset();
}
sw.stop();
```
Mar 24
prev sibling next sibling parent reply kdevel <kdevel vogtner.de> writes:
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
 I know that benchmarks are always controversial and depend on a 
 lot of factors. So far, I have read that D performs very well in 
 benchmarks, as well as, if not better than, C.

 I wrote a little program that approximates PI using the Leibniz 
 formula. I implemented the same thing in C, D and Python, all 
 of them execute 1,000,000 iterations 20 times and display the 
 average time elapsed.

 Here are the results:

 C: 0.04s
 Python: 0.33s
 D: 0.73s

 What the hell? D slower than Python? This cannot be real. I am 
 sure I am making a mistake here. I'm sharing all 3 programs 
 here:

 C: https://pastebin.com/s7e2HFyL
 D: https://pastebin.com/fuURdupc
 Python: https://pastebin.com/zcXAkSEf
Usually you do not translate mathematical expressions directly into code:

```
n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0);
```

The term containing the `pow` invocation computes the alternating sequence -1, 1, -1, ..., which can be replaced by e.g.

```
immutable int [2] sign = [-1, 1];
n += sign [i & 1] / (i * 2.0 - 1.0);
```

This saves the expensive call to the pow function.
Mar 24
next sibling parent reply rkompass <rkompass gmx.de> writes:
 The term containing the `pow` invocation computes the 
 alternating sequence -1, 1, -1, ..., which can be replaced by 
 e.g.

 ```
    immutable int [2] sign = [-1, 1];
    n += sign [i & 1] / (i * 2.0 - 1.0);
 ```

 This saves the expensive call to the pow function.
I used the loop:

```d
for (int i = 1; i < iter; i++)
  n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
```

in both C and D, with gcc and gdc, and got average execution times:

```
     original   loop replacement   -O2
C    0.009989   0.003198           0.001335
D    0.230346   0.003083           0.001309
```

Almost no difference. But the D binary is much larger on my Linux: 4600920 bytes instead of 15504 bytes for the C version.

Are there some simple switches / settings to get a smaller binary?
Mar 24
next sibling parent reply Sergey <kornburn yandex.ru> writes:
On Sunday, 24 March 2024 at 22:16:06 UTC, rkompass wrote:
 Are there some simple switches / settings to get a smaller 
 binary?
1) If possible you can use "betterC" - to disable the runtime

2) otherwise:

```bash
--release --O3 --flto=full -fvisibility=hidden
-defaultlib=phobos2-ldc-lto,druntime-ldc-lto
-L=-dead_strip -L=-x -L=-S -L=-lz
```
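
To make option 1 concrete, here is a minimal sketch of what a betterC build of this benchmark could look like (my illustration, not Sergey's code; compile with `dmd -betterC` or `ldc2 -betterC`, gdc's closest switch being, I believe, `-fno-druntime`):

```d
// betterC: no druntime/Phobos, so no writeln or StopWatch;
// the C library provides both output and timing instead.
import core.stdc.stdio : printf;
import core.stdc.time : clock, clock_t, CLOCKS_PER_SEC;

extern(C) double leibniz(int iter) {
    double n = 1.0;
    for (int i = 2; i <= iter; i++)
        n += ((i % 2) ? 1.0 : -1.0) / (i * 2.0 - 1.0);
    return n * 4.0;
}

extern(C) int main() {
    clock_t t0 = clock();
    double result = leibniz(1_000_000);
    double secs = cast(double)(clock() - t0) / CLOCKS_PER_SEC;
    printf("%.15f\n%f s\n", result, secs);
    return 0;
}
```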
Mar 24
parent reply rkompass <rkompass gmx.de> writes:
On Sunday, 24 March 2024 at 23:02:19 UTC, Sergey wrote:
 On Sunday, 24 March 2024 at 22:16:06 UTC, rkompass wrote:
 Are there some simple switches / settings to get a smaller 
 binary?
 ...
Thank you. I succeeded with `gdc -Wall -O2 -frelease -shared-libphobos`.

A little remark: The approximation to pi is slow, but it oscillates up and down much more than its average. So taking the average of 2 successive steps gives many more precise digits. We can simulate this by doing a last step with half the size:

```d
double leibniz(int it) {
  double n = 1.0;
  for (int i = 1; i < it; i++)
    n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
  n += 0.5*((it%2) ? -1.0 : 1.0) / (it * 2.0 + 1.0);
  return n * 4.0;
}
```

Of course you may also combine the up(+) and down(-) steps into one:

1/i - 1/(i+2) = 2/(i*(i+2))

```d
double leibniz(int iter) {
  double n = 0.0;
  for (int i = 1; i < iter; i+=4)
    n += 2.0 / (i * (i+2.0));
  return n * 4.0;
}
```

or even combine both approaches. But of course, mathematically much more is possible. This was not about approximating pi as fast as possible...

The first approach above still works at the original speed, it only makes the result a little bit nicer.
Mar 25
parent Salih Dincer <salihdb hotmail.com> writes:
On Monday, 25 March 2024 at 14:02:08 UTC, rkompass wrote:
 
 Of course you may also combine the up(+) and down(-) step to 
 one:

 1/i - 1/(i+2) = 2/(i*(i+2))

 ```d
 double leibniz(int iter) {
   double n = 0.0;
   for (int i = 1; i < iter; i+=4)
     n += 2.0 / (i * (i+2.0));
   return n * 4.0;
 }
 ```
 or even combine both approaches. But of, course mathematically 
 much more is possible. This was not about approximating pi as 
 fast as possible...

 The above first approach still works with the original speed, 
 only makes the result a little bit nicer.
It's obvious that you are a good mathematician. You used sequence A005563.

First of all, I must apologize to the questioner for digressing from the topic. But I saw that there is a calculation difference between real and double. My goal was to see if there would be a change in speed. For example, with 250 million cycles (iter/4) I got the following result:

 3.14159265158976691 (250 million with real)
 3.14159264457621568 (250 million with double)
 3.14159265358979324 (std.math.constants.PI)

So my question is: Why do we see this calculation error with double? Could the changes I made to the algorithm have caused this? Here's an executable code snippet:

```d
enum step = 4;
enum loop = 250_000_000;

auto leibniz(T)(int iter) {
  T n = 2/3.0;
  for(int i = 5; i < iter; i += step) {
    T a = (2.0 + i) * i; // https://oeis.org/A005563
    n += 2/a;
  }
  return n * step;
}

import std.stdio : writefln;

void main() {
  enum iter = loop * step - 10;

  65358979323.writefln!"Compare.%s";
  iter.leibniz!double.writefln!"%.17f (double)";
  iter.leibniz!real.writefln!"%.17f (real)";
  imported!"std.math".PI.writefln!"%.17f (enum)";
}
/* Prints:
Compare.65358979323
3.14159264457621568 (double)
3.14159265158976689 (real)
3.14159265358979324 (enum)
*/
```

In fact, there are algorithms that calculate accurately up to 12 decimal places with fewer cycles. (e.g. 9999)

SDB 79
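
One way to probe that question (my sketch, not part of the original post; it assumes the gap comes from rounding error accumulated over the 250 million additions): Kahan (compensated) summation keeps the same double width but carries each addition's rounding error forward. If this version lands much closer to PI than the plain double one, accumulation error is the culprit.

```d
// Hypothetical compensated-summation variant of the loop above.
auto leibnizKahan(int iter) {
    double n = 2/3.0, c = 0.0;            // c holds the bits lost so far
    for (int i = 5; i < iter; i += 4) {
        double term = 2 / ((2.0 + i) * i);
        double y = term - c;              // re-inject the previously lost bits
        double t = n + y;
        c = (t - n) - y;                  // what this addition rounded away
        n = t;
    }
    return n * 4;
}
```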
Mar 26
prev sibling parent Salih Dincer <salihdb hotmail.com> writes:
On Sunday, 24 March 2024 at 22:16:06 UTC, Kdevel wrote:
 The term containing the `pow` invocation computes the 
 alternating sequence -1, 1, -1, ..., which can be replaced by 
 e.g.

 ```d
    immutable int [2] sign = [-1, 1];
    n += sign [i & 1] / (i * 2.0 - 1.0);
 ```

 This saves the expensive call to the pow function.
I also used this code:

```d
import std.stdio : writefln;
import std.datetime.stopwatch;

enum ITERATIONS = 1_000_000;
enum BENCHMARKS = 20;

auto leibniz(bool speed = true)(int iter) {
  double n = 1.0;
  static if(speed) const sign = [-1, 1];

  for(int i = 2; i < iter; i++) {
    static if(speed) {
      const m = i << 1;
      n += sign[i & 1] / (m - 1.0);
    } else {
      n += pow(-1, i - 1) / (i * 2.0 - 1.0);
    }
  }
  return n * 4.0;
}

auto pow(F, G)(F x, G n) @nogc @trusted pure nothrow {
  import std.traits : Unsigned, Unqual;

  real p = 1.0, v = void;
  Unsigned!(Unqual!G) m = n;

  if(n < 0) {
    if(n == -1) return 1 / x;
    m = cast(typeof(m))(0 - n);
    v = p / x;
  } else {
    switch(n) {
      case 0: return 1.0;
      case 1: return x;
      case 2: return x * x;
      default:
    }
    v = x;
  }

  while(true) {
    if(m & 1) p *= v;
    m >>= 1;
    if(!m) break;
    v *= v;
  }
  return p;
}

void main() {
  double result;
  long total_time = 0;

  for(int i = 0; i < BENCHMARKS; i++) {
    auto sw = StopWatch(AutoStart.no);
    sw.start();
    result = ITERATIONS.leibniz; //!false;
    sw.stop();
    total_time += sw.peek.total!"nsecs";
  }
  result.writefln!"%0.21f";
  writefln("Avg execution time: %f\n", total_time / BENCHMARKS / 1e9);
}
```

and the results:
 dmd -run "leibnizTest.d"
 3.141594653593692054727
 Avg execution time: 0.002005
If I compile with leibniz!false(ITERATIONS), the average execution time increases considerably:
 Avg execution time: 0.044435
Note, however, that this version is not linked against an external library: it uses a power function that works with integers. Normally the following library function would be called:
 Unqual!(Largest!(F, G)) pow(F, G)(F x, G y) @nogc @trusted pure 
 nothrow
 if (isFloatingPoint!(F) && isFloatingPoint!(G))
 ...
Now the person asking the question will, rightly, ask why D is slow even though we use exactly the same code as in C. You may think that the more watermelons you carry in your arms, the slower you naturally become. I think the important thing is not to drop the watermelons :)

SDB 79
Mar 24
prev sibling parent Csaba <feketecsaba gmail.com> writes:
On Sunday, 24 March 2024 at 21:21:13 UTC, kdevel wrote:
 Usually you do not translate mathematical expressions directly 
 into code:

 ```
    n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0);
 ```

 The term containing the `pow` invocation computes the 
 alternating sequence -1, 1, -1, ..., which can be replaced by 
 e.g.

 ```
    immutable int [2] sign = [-1, 1];
    n += sign [i & 1] / (i * 2.0 - 1.0);
 ```

 This saves the expensive call to the pow function.
I know that the code can be simplified/optimized; I just wanted to compare the same expression in C and D.
Mar 26
prev sibling parent reply Lance Bachmeier <no spam.net> writes:
On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
 ...

 Here are the results:

 C: 0.04s
 Python: 0.33s
 D: 0.73s

 ...
As others suggested, pow is the problem. I noticed that the C versions are often much faster than their D counterparts. (And I don't view that as a problem, since both are built into the language - my only thought is that the D version should call the C version.)

Changing

```
import std.math: pow;
```

to

```
import core.stdc.math: pow;
```

and leaving everything else unchanged, I get

C: Avg execution time: 0.007918
D (original): Avg execution time: 0.102612
D (using core.stdc.math): Avg execution time: 0.008134

So more or less the exact same numbers if you use core.stdc.math.
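
For readers who don't want to open the pastebins, here is a self-contained sketch of the benchmark with that one import changed (my reconstruction from the thread's description of the code, not the pastebin's exact text):

```d
import core.stdc.math : pow;  // the one-line change; was `import std.math : pow;`
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

enum ITERATIONS = 1_000_000;
enum BENCHMARKS = 20;

double leibniz(int iter) {
    double n = 1.0;
    for (int i = 2; i <= iter; i++)
        n += pow(-1.0, i - 1.0) / (i * 2.0 - 1.0);
    return n * 4.0;
}

void main() {
    long total;
    double result;
    foreach (_; 0 .. BENCHMARKS) {
        auto sw = StopWatch(AutoStart.yes);
        result = leibniz(ITERATIONS);
        total += sw.peek.total!"nsecs";
    }
    writefln("%.15f", result);
    writefln("Avg execution time: %f", total / BENCHMARKS / 1e9);
}
```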
Mar 26
parent reply Lance Bachmeier <no spam.net> writes:
On Tuesday, 26 March 2024 at 14:25:53 UTC, Lance Bachmeier wrote:
 On Sunday, 24 March 2024 at 19:31:19 UTC, Csaba wrote:
 ...

 So more or less the exact same numbers if you use core.stdc.math.
And then the other thing is changing

```
const int BENCHMARKS = 20;
```

to

```
enum BENCHMARKS = 20;
```

which should allow substitution of the constant directly into the rest of the program. That gives

```
Avg execution time: 0.007564
```

On my Ubuntu 22.04 machine, therefore, the LDC binary with no flags is slightly faster than the C code compiled with your flags.
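
The difference, for anyone learning D (my illustration, not from the post): an enum is a manifest constant that is pasted into each use site as a literal, while a const int is a typed variable that occupies storage and may be read through memory at run time.

```d
enum BENCHMARKS = 20;         // manifest constant: every use becomes the literal 20
const int BENCHMARKS_C = 20;  // typed constant: occupies storage and has an address
static assert(!__traits(compiles, &BENCHMARKS));  // an enum has no address to load from
```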
Mar 26
parent reply rkompass <rkompass gmx.de> writes:
I apologize for digressing a little bit further - just to share 
some insights with other learners.

I wondered why my binary was so big (> 4M) and discovered the
`gdc -Wall -O2 -frelease -shared-libphobos` options (now > 200K).
Then I tried to avoid the GC and learnt this: the GC in the 
Leibniz code is there only for the writeln. After changing it to 
(again standard C) printf, the `@nogc` modifier can be applied, 
and the binary gets down to ~17K, a size comparable to the C 
counterpart.

Another observation, regarding precision:
The iteration proceeds in the wrong order. Adding the small 
contributions first and the bigger ones last loses less precision 
when summing up the small parts below the final real/double LSB 
limit.

So I'm now at this code (dropping the average of 20 iterations as 
unnecessary):

```d
// import std.stdio;  // writeln would pull in the garbage collector
import core.stdc.stdio: printf;
import std.datetime.stopwatch;

const int ITERATIONS = 1_000_000_000;

@nogc pure double leibniz(int it) {  // sum up the small values first
   double n = 0.5*((it%2) ? -1.0 : 1.0) / (it * 2.0 + 1.0);
   for (int i = it-1; i >= 0; i--)
     n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
   return n * 4.0;
}

@nogc void main() {
     double result;
     double total_time = 0;
     auto sw = StopWatch(AutoStart.yes);
     result = leibniz(ITERATIONS);
     sw.stop();
     total_time = sw.peek.total!"nsecs";
     printf("%.16f\n", result);
     printf("Execution time: %f\n", total_time / 1e9);
}
```
result:
```
3.1415926535897931
Execution time: 1.068111
```
Mar 27
parent reply Salih Dincer <salihdb hotmail.com> writes:
On Wednesday, 27 March 2024 at 08:22:42 UTC, rkompass wrote:
 I apologize for digressing a little bit further - just to share 
 some insights with other learners.
Good thing you're digressing; I am 45 years old and I still cannot say that I am finished as a student! For me this is version 4, and it looks like we don't need a 3rd variable other than the function parameter and the return value:

```d
auto leibniz_v4(int i) @nogc pure {
  double n = 0.5*((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
  while(--i >= 0)
    n += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
  return n * 4.0;
}
/*
3.1415926535892931
3.141592653589 793238462643383279502884197169399375105
3.141593653590774200000 (v1)

Avg execution time: 0.000033
*/
```

SDB 79
Mar 27
parent reply rkompass <rkompass gmx.de> writes:
On Thursday, 28 March 2024 at 01:09:34 UTC, Salih Dincer wrote:
 Good thing you're digressing; I am 45 years old and I still 
 cannot say that I am finished as a student! For me this is 
 version 4 and it looks like we don't need a 3rd variable other 
 than the function parameter and return value:
So we go with another digression. I discovered parallel, and also avoided the extra variable, as suggested by Salih:

```d
import std.range;
import std.parallelism;
import core.stdc.stdio: printf;
import std.datetime.stopwatch;

enum ITERS = 1_000_000_000;
enum STEPS = 31;  // 5 is fine, even numbers (e.g. 10) may give bad precision (for math reason ???)

pure double leibniz(int i) {  // sum up the small values first
  double r = (i == ITERS) ? 0.5 * ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0) : 0.0;
  for (--i; i >= 0; i -= STEPS)
    r += ((i%2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
  return r * 4.0;
}

void main() {
  auto start = iota(ITERS, ITERS-STEPS, -1).array;
  auto sw = StopWatch(AutoStart.yes);
  double result = 0.0;
  foreach(s; start.parallel)
    result += leibniz(s);
  double total_time = sw.peek.total!"nsecs";
  printf("%.16f\n", result);
  printf("Execution time: %f\n", total_time / 1e9);
}
```

gives:

```
3.1415926535897931
Execution time: 0.211667
```

My laptop has 6 cores and obviously 5 are used in parallel by this.

The original question related to a comparison between C, D and Python. Turning back to this: Are there similarly simple libraries for C that allow for parallel computation?
Mar 28
parent reply Salih Dincer <salihdb hotmail.com> writes:
On Thursday, 28 March 2024 at 11:50:38 UTC, rkompass wrote:
 
 Turning back to this: Are there similarly simple libraries for 
 C, that allow for
 parallel computation?
You can achieve parallelism in C using libraries such as OpenMP, which provides a set of compiler directives and runtime library routines for parallel programming. Here's an example of how you might modify the code to use OpenMP for parallel processing:

```c
#include <stdio.h>
#include <time.h>
#include <omp.h>

#define ITERS 1000000000
#define STEPS 31

double leibniz(int i) {
  double r = (i == ITERS) ? 0.5 * ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0) : 0.0;
  for (--i; i >= 0; i -= STEPS)
    r += ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0);
  return r * 4.0;
}

int main() {
  double start_time = omp_get_wtime();
  double result = 0.0;

  #pragma omp parallel for reduction(+:result)
  for (int s = ITERS; s >= 0; s -= STEPS) {
    result += leibniz(s);
  }

  // Calculate the time taken
  double time_taken = omp_get_wtime() - start_time;

  printf("%.16f\n", result);
  printf("%f (seconds)\n", time_taken);

  return 0;
}
```

To compile this code with OpenMP support, you would use a command like `gcc -fopenmp your_program.c`. This tells the GCC compiler to enable OpenMP directives. The `#pragma omp parallel for` directive tells the compiler to parallelize the loop, and the `reduction` clause is used to safely accumulate the result variable across multiple threads.

SDB 79
Mar 28
parent reply rkompass <rkompass gmx.de> writes:
On Thursday, 28 March 2024 at 14:07:43 UTC, Salih Dincer wrote:
 On Thursday, 28 March 2024 at 11:50:38 UTC, rkompass wrote:
 
 Turning back to this: Are there similarly simple libraries for 
 C, that allow for
 parallel computation?
 You can achieve parallelism in C using libraries such as OpenMP, 
 which provides a set of compiler directives and runtime library 
 routines for parallel programming. Here's an example of how you 
 might modify the code to use OpenMP for parallel processing:

 ```c
 . . .
 #pragma omp parallel for reduction(+:result)
 for (int s = ITERS; s >= 0; s -= STEPS) {
     result += leibniz(s);
 }
 . . .
 ```

 To compile this code with OpenMP support, you would use a command 
 like `gcc -fopenmp your_program.c`. This tells the GCC compiler 
 to enable OpenMP directives. The `#pragma omp parallel for` 
 directive tells the compiler to parallelize the loop, and the 
 `reduction` clause is used to safely accumulate the result 
 variable across multiple threads.

 SDB 79
Nice, thank you. It ran endlessly until I saw that I had to correct the `for` to

`for (int s = ITERS; s > ITERS-STEPS; s--)`

Now the result is:

```
3.1415926535897936
Execution time: 0.212483 (seconds).
```

This result is sooo similar! I didn't know that OpenMP programming could be that easy. Binary size is 16K, same order of magnitude, although somewhat less. The D advantage is gone here, I would say.
Mar 28
next sibling parent Sergey <kornburn yandex.ru> writes:
On Thursday, 28 March 2024 at 20:18:10 UTC, rkompass wrote:
 D advantage is gone here, I would say.
It's hard to compare, actually. std.parallelism has a bit different mechanics, and I think it is easier to use. The syntax is nicer.

OpenMP is a well-known and highly adopted tool, which is also quite flexible, but usually used with initially sequential code. And the syntax is not very intuitive. There is an interesting point from Dr Russel here: https://forum.dlang.org/thread/qvksmhwkaxbrnggsvtxe@forum.dlang.org

However, since 2012 OpenMP has also seen some development and improvement, and the HPC world is pretty conservative. So it is one of the most popular tools in the area, together with MPI: https://www.openmp.org/wp-content/uploads/sc23-openmp-popularity-mattson.pdf

But probably with the AI and GPU revolution the balance will shift a bit towards CUDA-like technologies.
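
As a concrete illustration of the syntax difference (my sketch, not Sergey's code): taskPool.reduce gives the same safe accumulation that OpenMP's reduction(+:result) clause provides.

```d
import std.algorithm.iteration : map;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writeln;

void main() {
    enum ITERS = 1_000_000_000;
    // Each index yields one Leibniz term; reduce sums them across the pool.
    double pi = 4.0 * taskPool.reduce!"a + b"(
        0.0, iota(ITERS).map!(i => ((i % 2) ? -1.0 : 1.0) / (i * 2.0 + 1.0)));
    writeln(pi);
}
```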
Mar 28
prev sibling parent reply Salih Dincer <salihdb hotmail.com> writes:
On Thursday, 28 March 2024 at 20:18:10 UTC, rkompass wrote:
 I didn't know that OpenMP programming could be that easy.
 Binary size is 16K, same order of magnitude, although somewhat 
 less.
 D advantage is gone here, I would say.
There is no such thing as parallel programming in D anyway. At least it has the modules, but I haven't seen them work. Whenever I use the built-in parallel foreach() toys, it always ends in disappointment :)

SDB 79
Mar 28
parent reply Serg Gini <kornburn yandex.ru> writes:
On Thursday, 28 March 2024 at 23:15:26 UTC, Salih Dincer wrote:
 There is no such thing as parallel programming in D anyway. At 
 least it has modules, but I didn't see it being works. Whenever 
 I use toys built in foreach() it always ends in disappointment
I think it just works :) Which issues did you have with it?
Mar 28
parent Salih Dincer <salihdb hotmail.com> writes:
On Friday, 29 March 2024 at 00:04:14 UTC, Serg Gini wrote:
 On Thursday, 28 March 2024 at 23:15:26 UTC, Salih Dincer wrote:
 There is no such thing as parallel programming in D anyway. At 
 least it has modules, but I didn't see it being works. 
 Whenever I use toys built in foreach() it always ends in 
 disappointment
 I think it just works :) Which issues did you have with it?
A year has passed and I have tried almost everything! Either it went into an infinite loop or nothing changed in speed. At least things are not as simple as OpenMP on the D side! First I tried this code snippet: a futile attempt!

```d
struct RowlandSequence {
  import std.numeric : gcd;
  import std.format : format;
  import std.conv : text;

  long b, r, a = 3;
  enum empty = false;

  string[] front() {
    string result = format("%s, %s", b, r);
    return [text(a), result];
  }

  void popFront() {
    long result = 1;
    while(result == 1) {
      result = gcd(r++, b);
      b += result;
    }
    a = result;
  }
}

enum BP {
  f = 1, b = 7, r = 2, a = 1, /*
  f = 109, b = 186837516, r = 62279173, //*/
  s = 5
}

void main() {
  RowlandSequence rs;
  long start, skip;

  with(BP) {
    rs = RowlandSequence(b, r);
    start = f;
    skip = s;
  }
  rs.popFront();

  import std.stdio, std.parallelism;
  import std.range : take;

  auto rsFirst128 = rs.take(128);
  foreach(r; rsFirst128.parallel) {
    if(r[0].length > skip) {
      start.writeln(": ", r);
    }
    start++;
  }
}
```

SDB 79
Mar 28