digitalmars.D.learn - Virtual value types during compile-time for static type safety, static
- Tamas (53/53) Jul 17 2015 I got inspired by Andrei's "Generic Programming Must Go" talk:
- ZombineDev (5/8) Jul 17 2015 Did you try looking at assembly generated by GDC or LDC with full
- ZombineDev (3/11) Jul 17 2015 Also, see the table at the bottom of this comment:
- Tamas (9/21) Jul 17 2015 Thanks for the pointers! My first priority a performant library,
- Adam D. Ruppe (2/3) Jul 17 2015 what compile flags?
- Tamas (198/198) Jul 18 2015 I made a thorough comparison using multiple compilers and a
- Tamas (9/9) Jul 18 2015 Sorry, the main function of positive0.d correctly looks like this:
- Adam D. Ruppe (4/7) Jul 18 2015 Try adding the automatic optimize flags in all your cases. For
- Tamas (8/15) Jul 18 2015 Thanks, indeed, after -O -inline the bodies of the two abs
- Baz (92/99) Jul 17 2015 Hi, I don't see the runtime penalities you talk about.
I got inspired by Andrei's "Generic Programming Must Go" talk: https://www.youtube.com/watch?v=mCrVYYlFTrA I.e. writing functions with static if-s, based on what we know about the input. So the question is how to annotate the input variables? Adam showed an excellent solution for this, by wrapping the value into a struct, and using alias: http://stackoverflow.com/questions/31442059/virtual-static-value-types Unfortunately, it's not without runtime penalty. I have checked this idea with ints, against the generated asm.dlang.org code. Unfortunately the struct wrapper results in longer asm code. In theory such run-time penalty is not necessary, as everything is available compile-time. To understand my use-case, here is an example "math library" that skips certain validations or computations altogether, whether some properties of the input can be inferred from how the value was produced: import std.math : sqrt; import std.conv; struct Positive { int num; alias num this; } Positive abs(T)(T n) { static if (is(T == Positive)) { return n; } if (n >= 0) return Positive(n); return Positive(-n); } Positive isqrt(T)(T x) { static if (is(T == Positive)) { return Positive(sqrt(x.to!float).to!int); } if (x<0) { throw new Exception("no root for negative numbers"); } return Positive(sqrt(x.to!float).to!int); } unittest { assert(abs(-4) == 4); // full abs runs assert(isqrt(4).abs() == 2); // full isqrt runs, abs returns n immediately. assert(abs(-4).isqrt() == 2); // full abs runs, isqrt return immediately, skipping validation; } Although this code is fully operational, presents nice api and compile-time optimizations, the extra Struct wrapper is not without runtime penalty. Is there a solution that results the same static optimizations, but has no runtime penalty, i.e. the functions just operates with ints? (At least when compiled) Thanks, Tamas
Jul 17 2015
On Friday, 17 July 2015 at 21:20:41 UTC, Tamas wrote:Is there a solution that results the same static optimizations, but has no runtime penalty, i.e. the functions just operates with ints? (At least when compiled)Did you try looking at assembly generated by GDC or LDC with full optimizations? For example GDC does quite better than DMD for the proposed SafeInt type: https://github.com/D-Programming-Language/phobos/pull/3389#issuecomment-119005595
Jul 17 2015
On Friday, 17 July 2015 at 23:15:31 UTC, ZombineDev wrote:On Friday, 17 July 2015 at 21:20:41 UTC, Tamas wrote:Also, see the table at the bottom of this comment: https://github.com/D-Programming-Language/phobos/pull/3389#issuecomment-117450524Is there a solution that results the same static optimizations, but has no runtime penalty, i.e. the functions just operates with ints? (At least when compiled)Did you try looking at assembly generated by GDC or LDC with full optimizations? For example GDC does quite better than DMD for the proposed SafeInt type: https://github.com/D-Programming-Language/phobos/pull/3389#issuecomment-119005595
Jul 17 2015
On Friday, 17 July 2015 at 23:16:51 UTC, ZombineDev wrote:On Friday, 17 July 2015 at 23:15:31 UTC, ZombineDev wrote:Thanks for the pointers! My first priority a performant library, secondary is a nice api, 3rd is a nice implementation. So that kind of rules put Amy degradation compared to ints. I used DMD, BTW. I see no reason for the necessity of performance degradation. Essentially I just want to assign a similar qualifyer like const or immutable. They are checked and used at compile time, but erased for runtime. Same here.On Friday, 17 July 2015 at 21:20:41 UTC, Tamas wrote:Also, see the table at the bottom of this comment: https://github.com/D-Programming-Language/phobos/pull/3389#issuecomment-117450524Is there a solution that results the same static optimizations, but has no runtime penalty, i.e. the functions just operates with ints? (At least when compiled)Did you try looking at assembly generated by GDC or LDC with full optimizations? For example GDC does quite better than DMD for the proposed SafeInt type: https://github.com/D-Programming-Language/phobos/pull/3389#issuecomment-119005595
Jul 17 2015
On Friday, 17 July 2015 at 23:44:25 UTC, Tamas wrote:I used DMD, BTW.what compile flags?
Jul 17 2015
I made a thorough comparison using multiple compilers and a summary of the findings. In short, there is a runtime overhead. I reduced the code to cut out the imports and made two versions with equivalent semantic content. positive0.d contains the hand written specializations of the abs function. positive.d contains the solution with function templates / static type analysis. /////// /* positive0.d: Compile & execute: $ dmd positive0.d; ./positive0; echo $? $ ldc2 positive0.d; ./positive0; echo $? generate ASM source: $ dmd positive0.d; gobjdump -d positive0.o > positive0.dmd.s $ ldc2 positive0.d -output-s */ int absPositive(int n) { return n; } int abs(int n) { return (n>=0) ? n : -n; } int square(int x) { return x * x; } int main() { return !((abs(-16) == 16) && (abs(3) == 3) && (square(5).abs == 25) && (square(-4).abs == 16)); } /////// /* positive.d: Compile & execute: $ dmd positive.d; ./positive; echo $? $ ldc2 positive.d; ./positive; echo $? generate ASM source: $ dmd positive.d; gobjdump -d positive.o > positive.dmd.s $ ldc2 positive.d -output-s */ struct Positive { int num; alias num this; } Positive abs(T)(T n) { static if (is(T == Positive)) { return n; } else { return Positive((n >= 0) ? n : -n); } } Positive square(int x) { return Positive(x * x); } int main() { return !((abs(-16) == 16) && (abs(3) == 3) && (square(5).abs == 25) && (square(-4).abs == 16)); } /////// I compared the generated asms. The asm code was substantially longer in case of non-hand written specializations of the abs function. The 'optimized' versions of the abs function were equivalent, but the 'non-optimzed' versions shows the runtime overhead for dmd and ldc2 as well, a double 'mov' commands instead of a single ones; The compiled hand written code was roughly half the size for both compilers: File sizes: ldc: 2678 positive0.s 4313 positive.s dmd: 3442 positive0.dmd.s 8701 positive.dmd.s You can see the abs functions below, and you can spot the double 'mov' operations: positive.dmd.s: 0000000000000230 <_D8positive10__T3absTiZ3absFNaNbNiNfiZS8positive8Positive>: 230: 55 push %rbp 231: 48 8b ec mov %rsp,%rbp 234: 48 83 ec 10 sub $0x10,%rsp 238: 85 ff test %edi,%edi 23a: 78 02 js 23e <_D8positive10__T3absTiZ3absFNaNbNiNfiZS8positive8Positive+0xe> 23c: eb 02 jmp 240 <_D8positive10__T3absTiZ3absFNaNbNiNfiZS8positive8Positive+0x10> 23e: f7 df neg %edi 240: 89 7d f0 mov %edi,-0x10(%rbp) 243: 48 89 f8 mov %rdi,%rax 246: c9 leaveq 247: c3 retq 0000000000000248 <_D8positive28__T3absTS8positive8PositiveZ3absFNaNbNiNfS8positive8PositiveZS8positive8Positive>: 248: 55 push %rbp 249: 48 8b ec mov %rsp,%rbp 24c: 48 83 ec 10 sub $0x10,%rsp 250: 48 89 f8 mov %rdi,%rax 253: c9 leaveq 254: c3 retq 255: 0f 1f 00 nopl (%rax) positive0.dmd.s: 00000000000000a0 <_D9positive011absPositiveFiZi>: a0: 55 push %rbp a1: 48 8b ec mov %rsp,%rbp a4: 48 83 ec 10 sub $0x10,%rsp a8: 48 89 f8 mov %rdi,%rax ab: c9 leaveq ac: c3 retq ad: 0f 1f 00 nopl (%rax) 00000000000000b0 <_D9positive03absFiZi>: b0: 55 push %rbp b1: 48 8b ec mov %rsp,%rbp b4: 48 83 ec 10 sub $0x10,%rsp b8: 85 ff test %edi,%edi ba: 78 05 js c1 <_D9positive03absFiZi+0x11> bc: 48 89 f8 mov %rdi,%rax bf: eb 05 jmp c6 <_D9positive03absFiZi+0x16> c1: 48 89 f8 mov %rdi,%rax c4: f7 d8 neg %eax c6: c9 leaveq c7: c3 retq ldc2: positive.s: __D8positive10__T3absTiZ3absFNaNbNiNfiZS8positive8Positive: .cfi_startproc movl %edi, -4(%rsp) cmpl $0, -4(%rsp) jl LBB2_2 leaq -4(%rsp), %rax movq %rax, -16(%rsp) jmp LBB2_3 LBB2_2: leaq -20(%rsp), %rax xorl %ecx, %ecx subl -4(%rsp), %ecx movl %ecx, -20(%rsp) movq %rax, -16(%rsp) LBB2_3: movq -16(%rsp), %rax movl (%rax), %ecx movl %ecx, -8(%rsp) movl %ecx, %eax retq .cfi_endproc .globl __D8positive28__T3absTS8positive8PositiveZ3absFNaNbNiNfS8positive8PositiveZS8positive8Positive .weak_definition __D8positive28__T3absTS8positive8PositiveZ3absFNaNbNiNfS8positive8PositiveZS8positive8Positive .align 4, 0x90 __D8positive28__T3absTS8positive8PositiveZ3absFNaNbNiNfS8positive8PositiveZS8positive8Positive: .cfi_startproc movl %edi, -8(%rsp) movl %edi, %eax retq .cfi_endproc .section __TEXT,__text,regular,pure_instructions .align 4, 0x90 positive0.s: __D9positive011absPositiveFiZi: .cfi_startproc movl %edi, -4(%rsp) movl -4(%rsp), %eax retq .cfi_endproc .globl __D9positive03absFiZi .align 4, 0x90 __D9positive03absFiZi: .cfi_startproc movl %edi, -4(%rsp) cmpl $0, -4(%rsp) jl LBB1_2 leaq -4(%rsp), %rax movq %rax, -16(%rsp) jmp LBB1_3 LBB1_2: leaq -20(%rsp), %rax xorl %ecx, %ecx subl -4(%rsp), %ecx movl %ecx, -20(%rsp) movq %rax, -16(%rsp) LBB1_3: movq -16(%rsp), %rax movl (%rax), %eax retq .cfi_endproc .globl __D9positive06squareFiZi .align 4, 0x90 my compilers: $ ldc2 -version LDC - the LLVM D compiler (6d3923): based on DMD v2.066.1 and LLVM 3.6.1 Default target: x86_64-apple-darwin14.4.0 Host CPU: core-avx2 $ dmd --version DMD64 D Compiler v2.067
Jul 18 2015
Sorry, the main function of positive0.d correctly looks like this: int main() { return !((abs(-16) == 16) && (abs(3) == 3) && (square(5).absPositive == 25) && (square(-4).absPositive == 16)); } But this does not affect the results, the asm file sizs or the asm abs function bodies.
Jul 18 2015
On Saturday, 18 July 2015 at 10:06:07 UTC, Tamas wrote:Compile & execute: $ dmd positive0.d; ./positive0; echo $? $ ldc2 positive0.d; ./positive0; echo $?Try adding the automatic optimize flags in all your cases. For dmd, `-O -inline`. Not sure about ldc but I think it is `-O` as well.
Jul 18 2015
On Saturday, 18 July 2015 at 13:16:26 UTC, Adam D. Ruppe wrote:On Saturday, 18 July 2015 at 10:06:07 UTC, Tamas wrote:Thanks, indeed, after -O -inline the bodies of the two abs functions are the same! :) The asm code of the templated version is still longer overall, but I think it's only some garbage that is not really executed. (e.g some with assert and unittest in the name, although I have none such) Soo thank you, it's really awesome! :)Compile & execute: $ dmd positive0.d; ./positive0; echo $? $ ldc2 positive0.d; ./positive0; echo $?Try adding the automatic optimize flags in all your cases. For dmd, `-O -inline`. Not sure about ldc but I think it is `-O` as well.
Jul 18 2015
On Friday, 17 July 2015 at 21:20:41 UTC, Tamas wrote:Although this code is fully operational, presents nice api and compile-time optimizations, the extra Struct wrapper is not without runtime penalty. Is there a solution that results the same static optimizations, but has no runtime penalty, i.e. the functions just operates with ints? (At least when compiled) Thanks, TamasHi, I don't see the runtime penalities you talk about. I've defined these alias: --- alias abs1 = abs!int; alias abs2 = abs!Positive; alias isqrt1 = isqrt!int; alias isqrt2 = isqrt!Positive; --- and the asm produced for the `Positive` type is clearly faster: ;------- abs1 ------- 00402074h enter 000Ch, 00h 00402078h test eax, eax 0040207Ah js 00402081h 0040207Ch mov dword ptr [ebp-0Ch], eax 0040207Fh leave 00402080h ret 00402081h neg eax 00402083h mov dword ptr [ebp-08h], eax 00402086h leave 00402087h ret ;---------------------------- ;------- abs2 (Positive)------- 00402088h enter 0004h, 00h 0040208Ch leave 0040208Dh ret ;---------------------------- ;------- isqrt1 ------- 00402090h enter 0008h, 00h 00402094h test eax, eax 00402096h jns 004020CFh 00402098h mov ecx, 00454D60h 0040209Dh push ecx 0040209Eh call 00403410h 004020A3h add esp, 04h 004020A6h push dword ptr [004540A4h] 004020ACh push dword ptr [004540A0h] 004020B2h push dword ptr [004540E4h] 004020B8h push dword ptr [004540E0h] 004020BEh push 0000001Bh 004020C0h push 00000000h 004020C2h call 004035C0h 004020C7h push eax 004020C8h call 00403400h 004020CDh jmp 004020E6h 004020CFh call 0040247Ch 004020D4h fsqrt 004020D6h sub esp, 04h 004020D9h fstp dword ptr [esp] 004020DCh call 00402494h 004020E1h mov dword ptr [ebp-08h], eax 004020E4h leave 004020E5h ret 004020E6h leave 004020E7h ret ;---------------------------- ;------- isqrt2 (Positive)------- 004020E8h enter 0008h, 00h 004020ECh call 00402560h 004020F1h fsqrt 004020F3h sub esp, 04h 004020F6h fstp dword ptr [esp] 004020F9h call 00402494h 004020FEh mov dword ptr [ebp-08h], eax 00402101h leave 00402102h ret ;---------------------------- (dmd win32 bit, -w -wi). Also note that i've tweaked the code to get rid of a few warnings: --- Positive abs(T)(T n) { static if (is(T == Positive)) { return n; } else { if (n >= 0) return Positive(n); else return Positive(-n); } } Positive isqrt(T)(T x) { static if (is(T == Positive)) { return Positive(sqrt(x.to!float).to!int); } else { if (x<0) throw new Exception("no root for negative numbers"); else return Positive(sqrt(x.to!float).to!int); } } ---
Jul 17 2015