digitalmars.D - Sanely optimized fizzbuzz

monkyyy <crazymonkyyy gmail.com> writes:
https://codegolf.stackexchange.com/questions/215216/high-throughput-fizz-buzz/236630#236630

So this was brought up on the Discord and it seems interesting, so 
how about a little contest?

Speed up fizzbuzz but let's do it a bit sanely:

Some rough rules:

* Fewer than 100 lines of code
* Let's not define your own asm macro language like the winning 
result
* Use D
Oct 29 2021
Brian Callahan <bcallah openbsd.org> writes:
On Saturday, 30 October 2021 at 05:31:48 UTC, monkyyy wrote:
 https://codegolf.stackexchange.com/questions/215216/high-throughput-fizz-buzz/236630#236630

 So this was brought up on the Discord and it seems interesting, 
 so how about a little contest?

 Speed up fizzbuzz but let's do it a bit sanely:

 Some rough rules:

 * Fewer than 100 lines of code
 * Let's not define your own asm macro language like the winning 
 result
 * Use D
On my very slow machine, the naive solution in C that you linked to runs at around 64.5 MB/s.

An equally naive version in D runs on the same machine at around 920 MB/s:

```d
import std.stdio;
import std.conv;

string fizzbuzz()
{
    string str;

    for (int i = 1; i <= 15000; i++)
    {
        if (i % 15 == 0)
            str ~= "FizzBuzz\n";
        else if (i % 3 == 0)
            str ~= "Fizz\n";
        else if (i % 5 == 0)
            str ~= "Buzz\n";
        else
            str ~= to!string(i) ~ "\n";
    }

    return str;
}

void main()
{
    string fb = fizzbuzz();

    for (int i = 0; i < 1000000; i++)
        write(fb);
}
```

You could do CTFE by turning `string fb = fizzbuzz();` into `static string fb = fizzbuzz();` but I found that reduces runtime performance to about 901 MB/s on my machine.
Oct 30 2021
Elronnd <elronnd elronnd.net> writes:
On Saturday, 30 October 2021 at 07:30:26 UTC, Brian Callahan 
wrote:
 On my very slow machine, the naive solution in C that you 
 linked to runs at around 64.5 MB/s.

 An equally naive version in D runs on the same machine at 
 around 920 MB/s:
 [...]
That's doing something completely different. C is generating _all_ fizzbuzz values up to (some very large number), and printing them out once. Your code is generating all the fizzbuzz values up to (some rather small number), and then printing them out over and over again; effectively a benchmark of i/o.
Oct 30 2021
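
For illustration, a minimal sketch of the generate-once behaviour described above, as opposed to replaying one precomputed block. The names `fizzbuzzRange`, `limit`, and `chunk` are made up for this example and the limit is arbitrary; this is not anyone's contest entry and it has not been benchmarked.

```d
import std.algorithm.comparison : min;
import std.array : appender;
import std.conv : to;
import std.stdio : stdout;

/// Build the FizzBuzz lines for first..last (both inclusive).
string fizzbuzzRange(ulong first, ulong last)
{
    auto buf = appender!string();

    foreach (i; first .. last + 1)
    {
        if (i % 15 == 0)
            buf.put("FizzBuzz\n");
        else if (i % 3 == 0)
            buf.put("Fizz\n");
        else if (i % 5 == 0)
            buf.put("Buzz\n");
        else
        {
            buf.put(i.to!string);
            buf.put('\n');
        }
    }

    return buf.data;
}

void main()
{
    enum ulong limit = 1_000_000; // hypothetical upper bound
    enum ulong chunk = 15_000;    // lines generated per write, arbitrary

    // Every number is generated and written exactly once; the counter
    // keeps climbing instead of restarting at 1.
    for (ulong start = 1; start <= limit; start += chunk)
        stdout.rawWrite(fizzbuzzRange(start, min(start + chunk - 1, limit)));
}
```

Each chunk is still built in memory first, so the chunk size trades memory for fewer write calls.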
Brian Callahan <bcallah openbsd.org> writes:
On Saturday, 30 October 2021 at 09:51:43 UTC, Elronnd wrote:
 On Saturday, 30 October 2021 at 07:30:26 UTC, Brian Callahan 
 wrote:
 On my very slow machine, the naive solution in C that you 
 linked to runs at around 64.5 MB/s.

 An equally naive version in D runs on the same machine at 
 around 920 MB/s:
 [...]
 That's doing something completely different. C is generating 
 _all_ fizzbuzz values up to (some very large number), and 
 printing them out once. Your code is generating all the 
 fizzbuzz values up to (some rather small number), and then 
 printing them out over and over again; effectively a benchmark 
 of i/o.
My machine doesn't have enough memory to do CTFE for values larger than 15000. If it did, then I would have done that. The rules do not state that you cannot pre-calculate. On my machine, the pre-calculating approach gives a significant throughput improvement.
Oct 30 2021
Elronnd <elronnd elronnd.net> writes:
On Saturday, 30 October 2021 at 11:35:10 UTC, Brian Callahan 
wrote:
 My machine doesn't have enough memory to do CTFE for values 
 larger than 15000. If it did, then I would have done that. The 
 rules do not state that you cannot pre-calculate. On my 
 machine, the pre-calculating approach gives a significant 
 throughput improvement.
1. That doesn't change the fact that the code you posted is meaningless as a benchmark.
2. If such an approach is valid, then one may use 'cat' as a fast fizzbuzz implementation. Clearly that is not very interesting or useful. The question is about _generating_ fizzbuzz, not reproducing arbitrary text. 'The rules do not state that you cannot pre-calculate' is equivocation; such an approach is clearly outside of its spirit.
3. In the limit, the pregeneration approach will be bottlenecked by disc, which is slow.
4. Pregeneration cannot reach the limit anyway, because disc space is limited (streams are not).
Oct 30 2021
monkyyy <crazymonkyyy gmail.com> writes:
 string fb = fizzbuzz();
 [...]
 You could do CTFE by turning `string fb = fizzbuzz();` into 
 `static string fb = fizzbuzz();`
There is a fair chance it's CTFE without it.
Oct 30 2021
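
A small way to check this, using nothing beyond the language-defined `__ctfe` flag; `howEvaluated` is a made-up helper that reports which mode it ran in. As far as the language rules go, CTFE is only triggered where a compile-time value is required (a `static` initializer, an `enum`, a `static assert`). An optimizer may still fold a plain call, but that is a separate mechanism and depends on the compiler and flags.

```d
import std.stdio : writeln;

/// Reports whether this call ran under CTFE or in the compiled program.
string howEvaluated()
{
    return __ctfe ? "compile time" : "run time";
}

void main()
{
    string a = howEvaluated();        // ordinary local initializer
    static string b = howEvaluated(); // static initializer needs a compile-time value
    enum c = howEvaluated();          // manifest constant, also forces CTFE

    writeln(a); // prints "run time"
    writeln(b); // prints "compile time"
    writeln(c); // prints "compile time"
}
```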
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Saturday, 30 October 2021 at 05:31:48 UTC, monkyyy wrote:
 https://codegolf.stackexchange.com/questions/215216/high-throughput-fizz-buzz/236630#236630

 So this was brought up on the Discord and it seems interesting, 
 so how about a little contest?

 Speed up fizzbuzz but let's do it a bit sanely:

 Some rough rules:

 * Fewer than 100 lines of code
 * Let's not define your own asm macro language like the winning 
 result
 * Use D
And maybe keep the metric as MiB/s for example
Oct 30 2021
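
To make the unit explicit, a small sketch of computing MiB/s (1 MiB = 1024 * 1024 bytes) from a byte count and an elapsed time. `mibPerSecond`, the payload, and the repeat count are hypothetical, and timing the writes from inside the program is only one way to measure; the figures quoted earlier in the thread do not say which unit convention or tool produced them.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : stderr, stdout;

/// Throughput in MiB/s, where 1 MiB = 1024 * 1024 bytes.
double mibPerSecond(ulong bytes, double seconds)
{
    return bytes / (1024.0 * 1024.0) / seconds;
}

void main()
{
    // Arbitrary payload, only there to have something to time.
    static immutable string block = "1\n2\nFizz\n4\nBuzz\n";
    enum repeats = 1_000_000;

    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. repeats)
        stdout.rawWrite(block);
    stdout.flush();
    sw.stop();

    immutable seconds = sw.peek.total!"usecs" / 1e6;

    // Report on stderr so the figure does not mix with the measured output.
    stderr.writefln("%.1f MiB/s", mibPerSecond(block.length * repeats, seconds));
}
```

By the same definition, 920 MB/s in decimal megabytes works out to roughly 877 MiB/s, so stating the unit matters when comparing results.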
monkyyy <crazymonkyyy gmail.com> writes:
My take:
https://pastebin.com/EQWj7sTk

Decompiling it and messing with flags, I couldn't convince writeln 
to be inlined, and it still seems slow, so it should be replaced 
with something else.

Each print was 2 asm instructions plus the writeln call, so the 
majority of the time is spent in writeln.
Oct 30 2021
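
One possible shape for the "replace writeln with something" idea: collect output in a fixed buffer, format the digits by hand, and pass the buffer to `stdout.rawWrite` in large pieces. `OutBuf` and its 64 KiB size are invented for this sketch; it is not the pastebin code and has not been measured.

```d
import std.stdio : stdout;

struct OutBuf
{
    char[64 * 1024] data = void; // buffer size is an arbitrary choice
    size_t len;

    /// Write out whatever has accumulated and start over.
    void flush()
    {
        stdout.rawWrite(data[0 .. len]);
        len = 0;
    }

    /// Append text, flushing first if it would not fit.
    void put(const(char)[] s)
    {
        assert(s.length <= data.length);
        if (len + s.length > data.length)
            flush();
        data[len .. len + s.length] = s[];
        len += s.length;
    }

    /// Append the decimal digits of n without allocating.
    void put(ulong n)
    {
        char[20] tmp = void; // ulong.max has 20 digits
        size_t pos = tmp.length;
        do
        {
            tmp[--pos] = cast(char)('0' + n % 10);
            n /= 10;
        } while (n != 0);
        put(tmp[pos .. $]);
    }
}

void main()
{
    OutBuf buf;

    foreach (ulong i; 1 .. 1_000_001) // arbitrary upper bound
    {
        if (i % 15 == 0)
            buf.put("FizzBuzz\n");
        else if (i % 3 == 0)
            buf.put("Fizz\n");
        else if (i % 5 == 0)
            buf.put("Buzz\n");
        else
        {
            buf.put(i);
            buf.put("\n");
        }
    }

    buf.flush(); // write out the final partial buffer
}
```

The per-line cost becomes a bounds check and a copy, with one `rawWrite` call per 64 KiB instead of one `writeln` per line.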