digitalmars.D - Why does scope(success) have to use exceptions?

reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Sample code:

void callScope(ref int x)
{
    x = 1;
    scope(success) { x = 2; }
    x = 3;
    scope(success) { x = 4; }
}

void callFunc(ref int x)
{
    x = 1;
    x = 3;
    x = 4;
    x = 2;
}

void main()
{
    int x;
    callScope(x);
    assert(x == 2);

    callFunc(x);
    assert(x == 2);
}

I was expecting callScope to be lowered down to the handwritten code
in callFunc in assembly, but instead it uses exceptions. Here's some
simple ASM for callFunc compiled with -c -release -O (no inline):

_D4test8callFuncFKiZv:; Function begin, communal
        mov     dword [eax], 1                          ; 0000 _ C7. 00, 00000001
        mov     ecx, eax                                ; 0006 _ 89. C1
        mov     dword [ecx], 3                          ; 0008 _ C7. 01, 00000003
        mov     dword [ecx], 4                          ; 000E _ C7. 01, 00000004
        mov     dword [ecx], 2                          ; 0014 _ C7. 01, 00000002
        ret                                             ; 001A _ C3
; _D4test8callFuncFKiZv End of function

And this monster code for callScope:

_D4test9callScopeFKiZv:; Function begin, communal
        push    ebp                                     ; 0000 _ 55
        mov     ebp, esp                                ; 0001 _ 8B. EC
        mov     edx, dword [fs:__except_list]           ; 0003 _ 64: 8B. 15, 00000000(segrel)
        push    -1                                      ; 000A _ 6A, FF
        mov     ecx, 1                                  ; 000C _ B9, 00000001
        push    _D4test9callScopeFKiZv+0A6H             ; 0011 _ 68, 000000A6(segrel)
        push    edx                                     ; 0016 _ 52
        mov     dword [fs:__except_list], esp           ; 0017 _ 64: 89. 25, 00000000(segrel)
        sub     esp, 24                                 ; 001E _ 83. EC, 18
        push    ebx                                     ; 0021 _ 53
        push    esi                                     ; 0022 _ 56
        push    edi                                     ; 0023 _ 57
        mov     dword [ebp-18H], eax                    ; 0024 _ 89. 45, E8
        mov     dword [eax], ecx                        ; 0027 _ 89. 08
        mov     byte [ebp-1CH], 0                       ; 0029 _ C6. 45, E4, 00
        mov     dword [ebp-4H], 0                       ; 002D _ C7. 45, FC, 00000000
        mov     dword [ebp-4H], ecx                     ; 0034 _ 89. 4D, FC
        mov     dword [eax], 3                          ; 0037 _ C7. 00, 00000003
        xor     ecx, ecx                                ; 003D _ 31. C9
        mov     dword [eax], 4                          ; 003F _ C7. 00, 00000004
        mov     dword [ebp-4H], ecx                     ; 0045 _ 89. 4D, FC
        jmp     ?_002                                   ; 0048 _ EB, 0C

; Note: Inaccessible code
        mov     byte [ebp-1CH], 1                       ; 004A _ C6. 45, E4, 01
        push    dword [ebp-20H]                         ; 004E _ FF. 75, E0
        call    __d_throwc                              ; 0051 _ E8, 00000000(rel)
?_002:  mov     dword [ebp-4H], -1                      ; 0056 _ C7. 45, FC, FFFFFFFF
; Note: Displacement could be made smaller by sign extension
        lea     ecx, [ebp-0CH]                          ; 005D _ 8D. 8D, FFFFFFF4
        push    -1                                      ; 0063 _ 6A, FF
        push    ecx                                     ; 0065 _ 51
        push    FLAT:?_001                              ; 0066 _ 68, 00000000(segrel)
        call    __d_local_unwind2                       ; 006B _ E8, 00000000(rel)
        add     esp, 12                                 ; 0070 _ 83. C4, 0C
        call    ?_003                                   ; 0073 _ E8, 00000002
        jmp     ?_005                                   ; 0078 _ EB, 18

?_003:  mov     dword [ebp-4H], -1                      ; 007A _ C7. 45, FC, FFFFFFFF
        mov     al, byte [ebp-1CH]                      ; 0081 _ 8A. 45, E4
        xor     al, 01H                                 ; 0084 _ 34, 01
        jz      ?_004                                   ; 0086 _ 74, 09
        mov     edx, dword [ebp-18H]                    ; 0088 _ 8B. 55, E8
        mov     dword [edx], 2                          ; 008B _ C7. 02, 00000002
?_004:  ret                                             ; 0091 _ C3

?_005:
; Note: Displacement could be made smaller by sign extension
        mov     ecx, dword [ebp-0CH]                    ; 0092 _ 8B. 8D, FFFFFFF4
        mov     dword [fs:__except_list], ecx           ; 0098 _ 64: 89. 0D, 00000000(segrel)
        pop     edi                                     ; 009F _ 5F
        pop     esi                                     ; 00A0 _ 5E
        pop     ebx                                     ; 00A1 _ 5B
        mov     esp, ebp                                ; 00A2 _ 8B. E5
        pop     ebp                                     ; 00A4 _ 5D
        ret                                             ; 00A5 _ C3
; _D4test9callScopeFKiZv End of function

I'm trying to understand why. If an exception is thrown between one of
those assignment statements the stack will unwind and any following
assignment statements will not be called, so there shouldn't be a need
to check if an exception is thrown. Why doesn't the compiler simply
rewrite callScope to look like callFunc?
Jan 16 2013
next sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 16 January 2013 at 23:19:20 UTC, Andrej Mitrovic 
wrote:
 Sample code:
 I was expecting callScope to be lowered down to the handwritten 
 code in callFunc in assembly, but instead it uses exceptions. 
 Here's some simple ASM for callFunc compiled with -c -release 
 -O (no inline):

<snip>
 I'm trying to understand why. If an exception is thrown between 
 one of those assignment statements the stack will unwind and 
 any following assignment statements will not be called, so 
 there shouldn't be a need to check if an exception is thrown. 
 Why doesn't the compiler simply rewrite callScope to look like 
 callFunc?

Obviously your code won't throw; however, that's not how scope works. TDPL, pg. 84-88, explains that the lowering of scope is equivalent to hand-written try/catch/finally versions, but you don't have to worry about making a mistake in writing them, and adding more scopes is easier. Handling SQL, or several tasks that all need to succeed or have certain steps cleaned up afterwards, makes perfect sense for scope.

[quote]
Consider a block containing a scope(exit) statement:

{
    <statements1>
    scope(exit) <statement2>
    <statements3>
}

Let's pick the first scope in the block, so we can assume that <statements1> itself does not contain scope (but <statement2> and <statements3> might). Lowering transforms the code into this:

{
    <statements1>
    try
    {
        <statements3>
    }
    finally
    {
        <statement2>
    }
}

Following the transform, <statements3> and <statement2> are further lowered because they may contain additional scope statements. (The lowering always ends because the fragments are always strictly smaller than the initial sequence.) This means that code containing multiple scope(exit) statements is well defined, even in weird cases like scope(exit) scope(exit) scope(exit) writeln("?"). In particular let's see what happens in the interesting case of two scope(exit) statements in the same block.

{
    <statements1>
    scope(exit) <statement2>
    <statements3>
    scope(exit) <statement4>
    <statements5>
}

Let's assume that all statements do not contain additional scope(exit) statements. After lowering we obtain

{
    <statements1>
    try
    {
        <statements3>
        try
        {
            <statements5>
        }
        finally
        {
            <statement4>
        }
    }
    finally
    {
        <statement2>
    }
}

The purpose of showing this unwieldy code is to figure out the order of execution of multiple scope(exit) statements in the same block. Following the flow shows that <statement4> gets executed before <statement2>. In general, scope(exit) statements execute in a stack, LIFO manner, the reverse of their order in the execution flow.
[/quote]

Only asserts or exceptions can decide whether the block was successful or not, and the scope(s) can then manage cleanup (or final code) regardless of where it was put, without you having to do it yourself. If there's never any chance of it failing in the function then scope may not be what you want. However, if the compiler knows and can verify that the code is unable to fail (thus exceptions are not needed), perhaps an enhancement request could remove the unneeded try/catches...
Jan 16 2013
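
A minimal sketch of the LIFO ordering described in the TDPL passage above. The names `order`, `twoScopes` and `twoScopesLowered` are made up for illustration; the second function is a hand-written version of the nested try/finally lowering, not the compiler's actual output.

```d
import std.stdio : writeln;

// Records the order in which statements and scope(exit) bodies run.
string[] order;

// Two scope(exit) statements in the same block.
void twoScopes()
{
    order ~= "statements1";
    scope(exit) order ~= "statement2";
    order ~= "statements3";
    scope(exit) order ~= "statement4";
    order ~= "statements5";
}

// Hand-written equivalent following the nested try/finally lowering.
void twoScopesLowered()
{
    order ~= "statements1";
    try
    {
        order ~= "statements3";
        try
        {
            order ~= "statements5";
        }
        finally
        {
            order ~= "statement4";
        }
    }
    finally
    {
        order ~= "statement2";
    }
}

void main()
{
    twoScopes();
    auto viaScope = order.dup;

    order = null;
    twoScopesLowered();

    // Both versions run statement4 before statement2.
    assert(viaScope == order);
    writeln(order);
}
```

Both functions record statement4 before statement2, which is the stack-like behaviour the quoted text describes.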
next sibling parent Brad Roberts <braddr slice-2.puremagic.com> writes:
On Thu, 17 Jan 2013, Era Scarecrow wrote:

 On Wednesday, 16 January 2013 at 23:19:20 UTC, Andrej Mitrovic wrote:
 Sample code:
 I was expecting callScope to be lowered down to the handwritten code in
 callFunc in assembly, but instead it uses exceptions. Here's some simple ASM
 for callFunc compiled with -c -release -O (no inline):
 

 Obviously your code won't throw, however that's not how scope works. <snip> However if compiler knows and can verify the code is unable to fail (thus exceptions are not needed) perhaps an enhancement request that could remove the unneeded try/catches...

It's a QOI (quality of implementation) issue. I know that parts of DMD are nothrow aware and will collapse away unnecessary layers, but obviously not all are. The missing piece here is likely examining assignments to validate that they can't throw. Probably not hard to make a meaningful improvement here.

Later,
Brad
Jan 16 2013
prev sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 17/01/2013 00:06, Era Scarecrow wrote:
<snip>
   Consider a block containing a scope(exit) statement:

   However if compiler knows and can verify the code is unable to fail
 (thus exceptions are not needed) perhaps an enhancement request that
 could remove the unneeded try/catches...

The OP was talking about scope(success), not scope(exit). scope(success), by definition, won't be executed if the code fails. If the compiler implements it by changing it to a try/finally, it has to generate extra code to test whether the code succeeded or not. It's far simpler to just write out the scope(success) code at the end of the scope in question. As such, the compiler is _pessimising_ the code.

OK, so it's a bit more complicated if the scope can be exited via a return, break, continue or (dare I mention?) goto statement, but even then it ought to be quite easy to make the compiler generate more efficient code.

This should be fixed, as it is discouraging use of scope(success).

Stewart.
Jan 16 2013
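
A rough sketch of the two strategies contrasted above, assuming a single scope(success) in a function with no early exits. The `failed` flag and the function names are hypothetical; `viaFlag` only guesses at the general shape of a flag-based try/catch/finally lowering (the disassembly earlier in the thread does test a byte flag before storing 2), and is not a transcript of what DMD emits.

```d
// The scope(success) version.
void viaScope(ref int x)
{
    x = 1;
    scope(success) { x = 2; }
    x = 3;
}

// Flag-based lowering: remember whether the body threw,
// then test the flag in the finally block.
void viaFlag(ref int x)
{
    x = 1;
    bool failed = false; // hypothetical name for the success/failure flag
    try
    {
        x = 3;
    }
    catch (Throwable t)
    {
        failed = true;
        throw t;
    }
    finally
    {
        if (!failed)
            x = 2;
    }
}

// Direct version: append the scope(success) body to the end of the scope.
// It is reached only if nothing before it threw.
void viaAppend(ref int x)
{
    x = 1;
    x = 3;
    x = 2;
}

void main()
{
    int a, b, c;
    viaScope(a);
    viaFlag(b);
    viaAppend(c);
    assert(a == 2 && b == 2 && c == 2);
}
```

All three leave x at 2 on the normal path; the difference is only in how much exception-handling machinery surrounds the assignments.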
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/17/13, Era Scarecrow <rtcvb32 yahoo.com> wrote:
 Obviously your code won't throw, however that's not how scope
 works. Only asserts or exceptions would/can manage to decide if the
 block was successful or not.

That's true for scope(exit) and scope(failure), which need a try/catch/finally. But if an exception is thrown the stack will unwind, therefore the next statements won't be run, which is interesting for scope(success). Let's look at it this way:

void foo()
{
    int x;
    scope(success) { x = 1; }
    // < code which might throw >
    x = 2;
}

try/catch version:

void foo()
{
    int x;
    try
    {
        x = 2;
        // < code which might throw >
        x = 1;
    }
    catch (Throwable e)
    {
        throw e;
    }
}

But there's no need for a try/catch, you can rewrite this to:

void foo()
{
    int x;
    x = 2;
    // < code which might throw >
    x = 1;  // if there's no stack unwind, this gets executed, hence scope(success)
}
Jan 16 2013
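
A small runnable check of the argument above, under the assumption that the "code which might throw" is a call to a helper. `mayThrow`, `withScope`, `handWritten` and `earlyReturn` are names invented for this sketch. The last function illustrates the complication raised elsewhere in the thread: with an early return, the scope(success) body has to run on every non-throwing exit path, so a plain "append at the end" rewrite would need a copy of the body before each return.

```d
// Hypothetical stand-in for "code which might throw".
void mayThrow(bool fail)
{
    if (fail)
        throw new Exception("boom");
}

void withScope(ref int x, bool fail)
{
    scope(success) { x = 1; }
    x = 2;
    mayThrow(fail);
}

void handWritten(ref int x, bool fail)
{
    x = 2;
    mayThrow(fail);
    x = 1; // reached only if nothing above threw
}

void earlyReturn(ref int x, bool early)
{
    scope(success) { x = 1; }
    x = 2;
    if (early)
        return; // the scope(success) body still runs on this path
    x = 3;
}

void main()
{
    int a, b;

    // No exception: both versions end with x == 1.
    withScope(a, false);
    handWritten(b, false);
    assert(a == 1 && b == 1);

    // Exception: the success code is skipped in both versions.
    a = b = 0;
    try { withScope(a, true); } catch (Exception e) { }
    try { handWritten(b, true); } catch (Exception e) { }
    assert(a == 2 && b == 2);

    // Early return and normal fall-through both run the success code.
    int c;
    earlyReturn(c, true);
    assert(c == 1);
    earlyReturn(c, false);
    assert(c == 1);
}
```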
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 17 January 2013 at 00:41:17 UTC, Stewart Gordon 
wrote:
 The OP was talking about scope(success), not scope(exit). 
 scope(success), by definition, won't be executed if the code 
 fails.  If it does it by changing it to a try/finally, it has 
 to generate extra code to test whether the code succeeded or 
 not.  It's far simpler to just write out the scope(success) 
 code at the end of the scope in question.  As such, the 
 compiler is _pessimising_ the code.

I am aware of that. Although the examples used scope(exit), that doesn't change how it's rewritten, only whether finally is used. Depending on how much unrolling there is, it might not be worth trying to reduce the try/catches; three scope levels as it is can be very hard to write correctly.
 OK, so it's a bit more complicated if the scope can be exited 
 via a return, break, continue or (dare I mention?) goto 
 statement, but even then it ought to be quite easy to make the 
 compiler generate more efficient code.

Not everything that seems obvious to us is easy to write for a computer to recognize. Who knows, perhaps the compiler having a single copy of the 'success' code may end up managing it more efficiently by making it a delegate/lambda and calling it at each exit point, rather than inlining the code to handle all code paths.
 This should be fixed, as it is discouraging use of 
 scope(success).

I'll agree, if something can't throw in a given scope then it should be able to remove the try/catches (as much as possible), and if scope(failure) can never be called it should be removed (although a warning might be nice as notification, a hint of a bug perhaps?)
Jan 16 2013
prev sibling next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Wednesday, 16 January 2013 at 23:19:20 UTC, Andrej Mitrovic 
wrote:
 Sample code:

 void callScope(ref int x)
 {
     x = 1;
     scope(success) { x = 2; }
     x = 3;
     scope(success) { x = 4; }
 }

 void callFunc(ref int x)
 {
     x = 1;
     x = 3;
     x = 4;
     x = 2;
 }

 void main()
 {
     int x;
     callScope(x);
     assert(x == 2);

     callFunc(x);
     assert(x == 2);
 }

 I was expecting callScope to be lowered down to the handwritten 
 code
 in callFunc in assembly, but instead it uses exceptions. Here's 
 some
 simple ASM for callFunc compiled with -c -release -O (no 
 inline):

 I'm trying to understand why. If an exception is thrown between 
 one of
 those assignment statements the stack will unwind and any 
 following
 assignment statements will not be called, so there shouldn't be 
 a need
 to check if an exception is thrown. Why doesn't the compiler 
 simply
 rewrite callScope to look like callFunc?

On Linux, segfaults can be translated into exceptions using this module (https://github.com/D-Programming-Language/druntime/blob/master/src/etc/ inux/memoryerror.d); however, I do not know how to use it - I get linker errors.

On Windows, null pointer errors are translated into Object.Error (Access Violation) - I do not remember exactly.

In any case, your void callScope(ref int x) can be blown up by:

int* ptr; callScope(*ptr);

so exceptions may come when they are not expected.
Jan 17 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 17 January 2013 at 12:15:07 UTC, Maxim Fomin wrote:
 On linux, segfaults can be translated into exceptions using 
 this module 
 (https://github.com/D-Programming-Language/druntime/blob/master/src/etc/
inux/memoryerror.d) 
 however I do not know how to use it - I get linker errors.

Can you provide a sample code and the error message ?
Jan 17 2013
prev sibling next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 17 January 2013 at 13:05:00 UTC, deadalnix wrote:
 On Thursday, 17 January 2013 at 12:15:07 UTC, Maxim Fomin wrote:
 On linux, segfaults can be translated into exceptions using 
 this module 
 (https://github.com/D-Programming-Language/druntime/blob/master/src/etc/
inux/memoryerror.d) 
 however I do not know how to use it - I get linker errors.

Can you provide a sample code and the error message ?

import etc.linux.memoryerror;
import core.stdc.stdio : printf;

class A
{
    int x;
}

void main()
{
    A a;
    try
    {
        a.x = 0;
    }
    catch(NullPointerError er)
    {
        printf("catched\n");
    }
}

main.o:(.data+0xd0): undefined reference to `_D3etc5linux11memoryerror16NullPointerError7__ClassZ'
collect2: error: ld returned 1 exit status
--- errorlevel 1

InvalidPointerError does not work either.

Perhaps this is a wrong usage, but I do not see any spec at dlang.org or inside the source file.
Jan 17 2013
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 17 Jan 2013 14:45:25 +0100
schrieb "Maxim Fomin" <maxim maxim-fomin.ru>:
 main.o:(.data+0xd0): undefined reference to 
 `_D3etc5linux11memoryerror16NullPointerError7__ClassZ'
 collect2: error: ld returned 1 exit status
 --- errorlevel 1
 
 InvalidPointerError does not work either.
 
 Perhaps this is a wrong usage, but I do not see any spec at 
 dlang.org or inside source file.

Looks like etc.linux.memoryerror.d was not compiled into libdruntime/libphobos.a.
Jan 17 2013
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 17 Jan 2013 13:15:06 +0100
schrieb "Maxim Fomin" <maxim maxim-fomin.ru>:

 On linux, segfaults can be translated into exceptions using this 
 module 
 (https://github.com/D-Programming-Language/druntime/blob/master/src/etc/
inux/memoryerror.d) 
 however I do not know how to use it - I get linker errors.
 
 On windows null pointer errors are translated into Object.Error 
 (Access Violation) - I do not remember exactly.
 
 In any case, your void callScope(ref int x) can be blown up by:
 int* ptr; callScope(*ptr); so, exceptions may come when they are 
 not expected.

Uh, does that mean that a D compiler can never optimize by removing exception handling, even if it can prove that all involved methods are nothrow and don't throw errors as well?
Jan 17 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/17/13, Maxim Fomin <maxim maxim-fomin.ru> wrote:
 In any case, your void callScope(ref int x) can be blown up by:
 int* ptr; callScope(*ptr); so, exceptions may come when they are
 not expected.

That was the point of the sample code, exceptions can be thrown at any point and as a result the stack will unwind, hence the compiler can rewrite callScope to look like callFunc. This isn't about the *lack* of exceptions being thrown.
Jan 17 2013
prev sibling next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 17 January 2013 at 15:35:26 UTC, Andrej Mitrovic 
wrote:
 On 1/17/13, Maxim Fomin <maxim maxim-fomin.ru> wrote:
 In any case, your void callScope(ref int x) can be blown up by:
 int* ptr; callScope(*ptr); so, exceptions may come when they 
 are
 not expected.

That was the point of the sample code, exceptions can be thrown at any point and as a result the stack will unwind, hence the compiler can rewrite callScope to look like callFunc. This isn't about the *lack* of exceptions being thrown.

Well, with the scope(success) statement it seems you are right: the compiler can optimize it in cases like this, because if an exception is thrown when assigning 1 to x, no further statements should be performed within callScope.
Jan 17 2013
prev sibling next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 17 January 2013 at 15:16:07 UTC, Johannes Pfau wrote:
 Am Thu, 17 Jan 2013 13:15:06 +0100
 schrieb "Maxim Fomin" <maxim maxim-fomin.ru>:

 On linux, segfaults can be translated into exceptions using 
 this module 
 (https://github.com/D-Programming-Language/druntime/blob/master/src/etc/
inux/memoryerror.d) 
 however I do not know how to use it - I get linker errors.
 
 On windows null pointer errors are translated into 
 Object.Error (Access Violation) - I do not remember exactly.
 
 In any case, your void callScope(ref int x) can be blown up by:
 int* ptr; callScope(*ptr); so, exceptions may come when they 
 are not expected.

 Uh, does that mean that a D compiler can never optimize by removing exception handling, even if it can prove that all involved methods are nothrow and don't throw errors as well?

DMD should optimize, but I was arguing that in this situation it should not do this.
Jan 17 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 17 January 2013 at 16:05:05 UTC, Maxim Fomin wrote:
 DMD should optimize, but I was arguing that in this situation 
 it should not do this.

I *love* D's "scope". It re-organizes some of the ugliest code I have ever seen into really beautiful code.

However, having used it inside code that needs to run as fast as possible, I can tell you it is scary slow. As a matter of fact it was so scary slow one might even call it ridiculous. It *needs* to be improved if it wants to be taken seriously.

For non-inner loops, or non performance oriented operations, such as closing file handles, or for SQL transactions, I think it is fine.

However, don't use it in a loop. In particular, don't use it while decoding UTF!
Jan 17 2013
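
One way to check the overhead claim on a given compiler is a micro-benchmark along these lines. This is only a sketch: the function names and the loop body are invented, `std.datetime.stopwatch` is the current Phobos module (it did not exist under that name in 2013), and no timings are claimed here since results depend on the compiler, flags and platform.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Loop body with a scope(success) guard on every iteration.
ulong withScopeInLoop(size_t n)
{
    ulong total;
    foreach (i; 0 .. n)
    {
        scope(success) total += 1;
        total += i;
    }
    return total;
}

// Same work with the success code written out at the end of the body.
ulong withoutScope(size_t n)
{
    ulong total;
    foreach (i; 0 .. n)
    {
        total += i;
        total += 1;
    }
    return total;
}

void main()
{
    enum n = 10_000_000;

    auto sw = StopWatch(AutoStart.yes);
    immutable a = withScopeInLoop(n);
    immutable tScope = sw.peek;

    sw.reset();
    immutable b = withoutScope(n);
    immutable tPlain = sw.peek;

    // Both loops must compute the same result.
    assert(a == b);
    writefln("scope(success): %s, plain: %s", tScope, tPlain);
}
```

Building the same file with DMD, GDC and LDC and comparing the two timings gives a rough picture of how much each compiler's lowering costs in a hot loop.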
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 17, 2013 at 06:20:56PM +0100, monarch_dodra wrote:
 On Thursday, 17 January 2013 at 16:05:05 UTC, Maxim Fomin wrote:
 DMD should optimize, but I was arguing that in this situation it
 should not do this.

I *love* D's "scope". It re-organizes some of the ugliest code I have ever seen into really beautiful code.

Yeah, and not only that, it also makes what would have been tricky exception-safe code in C++ really easy to write, and easy to write *correctly*.
 However, having used it inside code that needs to run as fast as
 possible, I can tell you it is scary slow. As a matter of fact it
 was so scary slow one might even call it ridiculous. It *needs* to
 be improved if it wants to be taken seriously.

Well, there are a lot of areas in D that can use some implementation improvements. :) The language construct itself is awesome; what we need now is to make the implementation just as awesome.
 For non-inner loops, or non performance oriented operations, such as
 closing file handles, or for SQL transactions, I think it is fine.
 
 However, don't use it in a loop. In particular, don't use it while
 decoding UTF!

For UTF, I've always believed in state-machine based direct manipulation of UTF-8 (resp. -16, -32), ideally using a lexer generator or regex engine or some such mechanism. Decoding into a dchar range introduces painful amounts of overhead.

OTOH, if scope introduces extraneous overhead in loops, that's definitely something to be looked at. How does the performance of scope compare between DMD, GDC, and LDC? Does GDC's superior optimizer reduce the overhead sufficiently? Or does the front-end need improving?

T

-- 
Live for a century, learn for a century. And you'll still die a fool.
Jan 17 2013