D.gnu - Object file questions

Timo Sintonen (12/12) Aug 14 2014 I have been looking at object files to see if I can reduce the

Johannes Pfau (7/23) Aug 14 2014 Strange, could you post a testcase?

Timo Sintonen (8/34) Aug 14 2014 It seems this comes from libdruntime and it exists in object.o

Artur Skawina via D.gnu (15/27) Aug 14 2014 diff --git a/libphobos/libdruntime/gcc/atomics.d b/libphobos/libdruntime...
Johannes Pfau (22/64) Aug 14 2014 If you're referring to this:

Timo Sintonen (11/80) Aug 16 2014 Looks good. Template code is gone and init blocks have moved to

ketmar via D.gnu (8/9) Aug 16 2014 maybe this will work:
Johannes Pfau (11/102) Aug 16 2014 Iain recently pushed a commit to put zero initializers into bss, so

Timo Sintonen (35/53) Aug 16 2014 It is true that bss does not take place in the executable. But in

Johannes Pfau (4/29) Aug 16 2014 I just had a look at this and ClassInfo has a mutable 'monitor' field,

Mike (7/10) Aug 16 2014 This was discussed at DConf 2014.

Johannes Pfau (9/24) Aug 17 2014 Great! But I think this pull request addresses a different monitor

Mike (5/17) Aug 17 2014 I looked through the source code, and couldn't find any such

Johannes Pfau (14/35) Aug 17 2014 In gcc/d/d-objfile.cc: Search for

Artur Skawina via D.gnu (29/32) Aug 16 2014 [Only noticed this accidentally; using a mailing list

Mike (10/11) Aug 16 2014 It may be allowed, but it probably shouldn't be. Always-inlining

Artur Skawina via D.gnu (13/20) Aug 16 2014 Address-of should work -- disallowing it wouldn't help much, but would

Johannes Pfau (5/34) Aug 17 2014 We can make this explicit. I don't care enough to argue about that.

Artur Skawina via D.gnu (4/12) Aug 17 2014 *I* haven't encountered any problems and have been using functions+

Johannes Pfau (5/19) Aug 17 2014 Then I don't understand your statement at all. You said 'instead of

Artur Skawina via D.gnu (17/38) Aug 17 2014 I don't know - it wasn't me who proposed:

Mike (4/16) Aug 17 2014 Do you mean the problems with --gc-sections breaking code?

Timo Sintonen (7/43) Aug 16 2014 This seems to work.

Artur Skawina via D.gnu (41/45) Aug 16 2014 version (GNU) {
Artur Skawina via D.gnu (50/53) Aug 16 2014 I did not like that required dereference in the previous version,

Timo Sintonen (18/73) Aug 17 2014 This seems to work. With inlining the code is quite compact.

Johannes Pfau (12/100) Aug 17 2014 You mean __builtin_volatile_load/store? I'm not sure if compiler
Timo Sintonen (26/27) Aug 17 2014 This does not work with member functions

Artur Skawina via D.gnu (47/83) Aug 17 2014 It works for me:

Timo Sintonen (11/62) Aug 17 2014 I am compiling for arm and I am sorry I misinterpreted the

Artur Skawina via D.gnu (6/8) Aug 17 2014 Does declaring it as:

Timo Sintonen (11/26) Aug 17 2014 Yes, now it works.

Johannes Pfau (5/38) Aug 17 2014 r3 is an argument/scratch register, the callee can't rely on its

Johannes Pfau (3/4) Aug 17 2014 caller of course ;-)
Timo Sintonen (2/42) Aug 17 2014 So is this a bug or just undefined behavior?
Timo Sintonen (28/45) Aug 18 2014 I have had some weird bugs lately and then I looked my other

Artur Skawina via D.gnu (11/17) Aug 17 2014 Ensuring ordering, w/o it the compiler could reorder operations

Johannes Pfau (8/44) Aug 17 2014 That's a good start. Can you also get unary operators working?

Artur Skawina via D.gnu (93/99) Aug 17 2014 Unary ops are easy. If you mean post-inc and post-dec -- that's a

Johannes Pfau (10/12) Aug 17 2014 It's perfect for structs, but when simply declaring a Volatile!uint the

Artur Skawina via D.gnu (29/46) Aug 17 2014 Another D-problem - the language doesn't have /real/ refs. But...

"Timo Sintonen" <t.sintonen luukku.com> writes:

I have been looking at object files to see if I can reduce the 
memory usage for minimum systems. There are two things I have 
noticed:

1. In the data segment there is some source code as ascii text 
from a template in gcc/atomics.d . This is in the actual data 
segment and not in debug info segments and goes into the data 
segment of the executable. I do not see any code using this data. 
Why is this in the executable and is it possible to remove it?

2. In the data segment there is also __init for all types. I 
assume that they contain the initial values that are copied when 
a new object of this type is created. Is this data mutable and 
should it really be in data segment and not in rodata?

Aug 14 2014

Johannes Pfau <nospam example.com> writes:

On Thu, 14 Aug 2014 10:07:04 +0000,
"Timo Sintonen" <t.sintonen luukku.com> wrote:

 I have been looking at object files to see if I can reduce the 
 memory usage for minimum systems. There are two things I have 
 noticed:
 
 1. In the data segment there is some source code as ascii text 
 from a template in gcc/atomics.d . This is in the actual data 
 segment and not in debug info segments and goes into the data 
 segment of the executable. I do not see any code using this data. 
 Why is this in the executable and is it possible to remove it?
 

Strange, could you post a testcase?

 2. In the data segment there is also __init for all types. I 
 assume that they contain the initial values that are copied when 
 a new object of this type is created.

Correct, it's for '.init' (there's especially __..._TypeInfo_init which
is the initializer for typeinfo. I've implemented -fno-rtti in a private
git branch to get rid of typeinfo)
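
To illustrate what the `__init` data is for, here is a minimal sketch. The struct `UartConfig` is a hypothetical name, not something from the thread. A default-constructed value is a copy of the type's `.init` image, which is why the compiler has to keep that image somewhere in the object file.

```d
// Hypothetical example type; not taken from minlibd or druntime.
struct UartConfig
{
    int baud = 9600;   // non-zero default, so the init image is not all zeros
    ubyte parity;      // defaults to 0
}

void main()
{
    UartConfig c;                    // default construction copies UartConfig.init
    assert(c.baud == 9600);
    assert(c.parity == 0);
    assert(c == UartConfig.init);    // identical to the stored initial value
}
```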

 Is this data mutable and 
 should it really be in data segment and not in rodata?
 

I think it should be in rodata.

Aug 14 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Thursday, 14 August 2014 at 17:13:23 UTC, Johannes Pfau wrote:
 Am Thu, 14 Aug 2014 10:07:04 +0000
 schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 I have been looking at object files to see if I can reduce the 
 memory usage for minimum systems. There are two things I have 
 noticed:
 
 1. In the data segment there is some source code as ascii text 
 from a template in gcc/atomics.d . This is in the actual data 
 segment and not in debug info segments and goes into the data 
 segment of the executable. I do not see any code using this 
 data. Why is this in the executable and is it possible to 
 remove it?
 

 Strange, could you post a testcase?

It seems this comes from libdruntime; it exists in object.o 
and core/atomic.o. The test case is to compile the minlibd library as it 
is currently in the repo, using the makefile as is.
But I think it will be in any object file that imports 
gcc.atomics and uses the template in there.

 2. In the data segment there is also __init for all types. I 
 assume that they contain the initial values that are copied 
 when a new object of this type is created.

 Correct, it's for '.init' (there's especially 
 __..._TypeInfo_init which
 is the initializer for typeinfo. I've implemented -fno-rtti in 
 a private
 git branch to get rid of typeinfo)

 Is this data mutable and should it really be in data segment 
 and not in rodata?
 

 I think it should be in rodata.

So it is not a bug and not a feature. It is just because it does 
not matter? Maybe a feature request?

Aug 14 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/14/14 19:53, Timo Sintonen via D.gnu wrote:
 On Thursday, 14 August 2014 at 17:13:23 UTC, Johannes Pfau wrote:
 Am Thu, 14 Aug 2014 10:07:04 +0000
 schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 I have been looking at object files to see if I can reduce the memory usage
for minimum systems. There are two things I have noticed:

 1. In the data segment there is some source code as ascii text from a template
in gcc/atomics.d . This is in the actual data segment and not in debug info
segments and goes into the data segment of the executable. I do not see any
code using this data. Why is this in the executable and is it possible to
remove it?

 Strange, could you post a testcase?

 It seems this comes from libdruntime and it exists in object.o and
core/atomic.o, Testcase is to compile minlibd library as it is currently in the
repo using the makefile as such.
 But I think it will be in any object file that imports gcc.atomics and uses
the template in there.

diff --git a/libphobos/libdruntime/gcc/atomics.d b/libphobos/libdruntime/gcc/atomics.d
index 78e644191e8f..ee1a146b680e 100644
--- a/libphobos/libdruntime/gcc/atomics.d
+++ b/libphobos/libdruntime/gcc/atomics.d
@@ -28,7 +28,7 @@ import gcc.builtins;
  */
 private template __sync_op_and(string op1, string op2)
 {
-    const __sync_op_and = `
+    enum __sync_op_and = `
 T __sync_` ~ op1 ~ `_and_` ~ op2 ~ `(T)(const ref shared T ptr, T value)
 {
     static if (T.sizeof == byte.sizeof)

artur

Aug 14 2014

Johannes Pfau <nospam example.com> writes:

On Thu, 14 Aug 2014 17:53:32 +0000,
"Timo Sintonen" <t.sintonen luukku.com> wrote:

 On Thursday, 14 August 2014 at 17:13:23 UTC, Johannes Pfau wrote:
 Am Thu, 14 Aug 2014 10:07:04 +0000
 schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 I have been looking at object files to see if I can reduce the 
 memory usage for minimum systems. There are two things I have 
 noticed:
 
 1. In the data segment there is some source code as ascii text 
 from a template in gcc/atomics.d . This is in the actual data 
 segment and not in debug info segments and goes into the data 
 segment of the executable. I do not see any code using this 
 data. Why is this in the executable and is it possible to 
 remove it?
 

 Strange, could you post a testcase?

 It seems this comes from libdruntime and it exists in object.o 
 and core/atomic.o, Testcase is to compile minlibd library as it 
 is currently in the repo using the makefile as such.
 But I think it will be in any object file that imports 
 gcc.atomics and uses the template in there.
 

If you're referring to this:
http://dpaste.dzfl.pl/fe75e8c7dfca

This seems to be the const variable in __sync_op_and. Try to change the
code to "immutable __sync_op_and = " or "enum __sync_op_and = " and
file a bug report.
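
A minimal sketch of the suggested change, using a hypothetical template `makeGetter` rather than the real `__sync_op_and`. An `enum` string is a manifest constant: it has no address and exists only at compile time, so when it is used only as mixin input there is no variable for the compiler to emit. A `const` variable, by contrast, is an addressable object that may end up in the data segment, which matches what was observed in this thread.

```d
// Hypothetical helper for illustration; not the code from gcc/atomics.d.
template makeGetter(string name)
{
    // `enum` instead of `const`: a compile-time-only manifest constant.
    enum makeGetter = "int " ~ name ~ "() { return 42; }";
}

// The generated source text is consumed at compile time by the mixin.
mixin(makeGetter!"answer");

void main()
{
    assert(answer() == 42);
}
```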

 2. In the data segment there is also __init for all types. I 
 assume that they contain the initial values that are copied 
 when a new object of this type is created.

 Correct, it's for '.init' (there's especially 
 __..._TypeInfo_init which
 is the initializer for typeinfo. I've implemented -fno-rtti in 
 a private
 git branch to get rid of typeinfo)

 Is this data mutable and should it really be in data segment 
 and not in rodata?
 

 I think it should be in rodata.

 
 So it is not a bug and not a feature. It is just because it does 
 not matter? Maybe a feature request?

Seems to happen only for the TypeInfo init symbols. I can't run the
testsuite right now, but try this:

diff --git a/gcc/d/d-decls.cc b/gcc/d/d-decls.cc
index bd6f5f9..45d433a 100644
--- a/gcc/d/d-decls.cc
+++ b/gcc/d/d-decls.cc
@@ -274,6 +274,8 @@ TypeInfoDeclaration::toSymbol (void)
       // given TypeInfo.  It is the actual data, not a reference
       gcc_assert (TREE_CODE (TREE_TYPE (csym->Stree)) == REFERENCE_TYPE);
       TREE_TYPE (csym->Stree) = TREE_TYPE (TREE_TYPE (csym->Stree));
+      TREE_CONSTANT (csym->Stree) = true;
+      TREE_READONLY (csym->Stree) = true;
       relayout_decl (csym->Stree);
       TREE_USED (csym->Stree) = 1;

Aug 14 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

Looks good. Template code is gone and init blocks have moved to 
rodata. My simple test program compiles and runs.

There is still some __Class in data segment and init values for 
structs and arrays in bss segment. Is it possible to move these 
to rodata too?


In my application there will be several large structs. I never 
create anything of these types. Instead I use them to point to 
hardware registers and maybe on top of existing byte arrays like 
message buffers. There will still be initial values for these 
structs wasting memory. Is there any way to omit them?

Aug 16 2014

"ketmar via D.gnu" <d.gnu puremagic.com> writes:

On Sat, 16 Aug 2014 07:06:34 +0000
"Timo Sintonen via D.gnu" <d.gnu puremagic.com> wrote:

 structs wasting memory. Is there any way to omit them?

maybe this will work:

struct A {
  int n = void;
  uint[2] a = void;
  ...and so on for all fields
}
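
A runnable sketch of that idea, under the assumption that the struct is only ever used as an overlay on existing memory. The names `Registers` and `storage` are hypothetical. Whether void-initializing every field actually removes the initializer symbol from the object file depends on the compiler and version, so check the map file.

```d
// Hypothetical register layout; every field is void-initialized so the type
// carries no meaningful default values.
struct Registers
{
    uint control = void;
    uint[2] data = void;
}

void main()
{
    // Stand-in for a hardware block or message buffer. uint storage is used
    // so that the overlay is correctly aligned.
    uint[Registers.sizeof / uint.sizeof] storage;

    // No Registers value is ever constructed; the type is only a view.
    auto regs = cast(Registers*) storage.ptr;
    regs.control = 1;
    regs.data[1] = 7;

    assert(storage[0] == 1);   // control is the first word
    assert(storage[2] == 7);   // data[1] is the third word
}
```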

Aug 16 2014

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Aug 2014 07:06:34 +0000,
"Timo Sintonen" <t.sintonen luukku.com> wrote:

 
 Looks good. Template code is gone and init blocks have moved to 
 rodata. My simple test program compiles and runs.
 
 There is still some __Class in data segment and init values for 
 structs and arrays in bss segment. Is it possible to move these 
 to rodata too?
 

Iain recently pushed a commit to put zero initializers into bss, so
that's intentional:
http://bugzilla.gdcproject.org/show_bug.cgi?id=139
But I understand your point that it should be in rodata instead, you'll
have to discuss this with Iain.

Regarding __Class: Can you post a short example?

 
 In my application there will be several large structs. I never 
 create anything of these types. Instead I use them to point to 
 hardware registers and maybe on top of existing byte arrays like 
 message buffers. There will still be initial values for these 
 structs wasting memory. Is there any way to omit them?
 

See 
https://github.com/D-Programming-GDC/GDC/pull/82
 @attribute("noinit")

Aug 16 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Saturday, 16 August 2014 at 07:36:07 UTC, Johannes Pfau wrote:

 Iain recently pushed a commit to put zero initializers into 
 bss, so
 that's intentional:
 http://bugzilla.gdcproject.org/show_bug.cgi?id=139
 But I understand your point that it should be in rodata 
 instead, you'll
 have to discuss this with Iain.

It is true that bss does not take up space in the executable. But on 
small processors, even though there is nowadays plenty of ROM, there is 
not enough RAM. It is also a question of safety: in the long run, the 
data area may be corrupted by a buggy program or electrical disturbance, 
while rodata in ROM cannot be changed.
At least in my setup, gold maps bss into the executable anyway, while ld 
does not.

I noticed your comment in the bug report. I was just thinking the 
same: one big block of zeros that is used by all.
Another that I was thinking is that memset might be used for 
these types. Then there would be no block of zeros at all. But 
that would require an extra flag in typeinfo to separate these 
types from others...
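
The memset idea can be sketched as follows. This assumes a current druntime, where `TypeInfo.initializer()` is documented to return an array with a null `ptr` for types whose initial value is all zeros; that API postdates this thread. The helper `initialize` and both struct names are hypothetical.

```d
import core.stdc.string : memcpy, memset;

struct Zeroed  { int a; uint[4] b; }   // initial value is all zeros
struct NonZero { int a = 5; }          // needs a real init image

// Hypothetical helper: fill *target with T's initial value, using memset
// when no init image is stored and memcpy otherwise.
void initialize(T)(T* target)
{
    const(void)[] image = typeid(T).initializer();
    if (image.ptr is null)
        memset(target, 0, T.sizeof);            // no block of zeros needed
    else
        memcpy(target, image.ptr, T.sizeof);    // copy the stored init image
}

void main()
{
    Zeroed z = void;
    NonZero n = void;
    initialize(&z);
    initialize(&n);
    assert(z.a == 0 && z.b[3] == 0);
    assert(n.a == 5);
}
```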

 Regarding __Class: Can you post a short example?

Some lines from mapfile. Seems to be one for every type in the 
program:

  .data          0x0000000020001074      0x720 minlibd/libdruntime/libdruntime.a(object_.o)
                 0x0000000020001074                _D9Exception7__ClassZ
                 0x00000000200010c0                _D8TypeInfo7__ClassZ
                 0x000000002000110c                _D17TypeInfo_Function7__ClassZ
                 0x0000000020001158                _D17TypeInfo_Delegate7__ClassZ
                 0x00000000200011a4                _D14TypeInfo_Class7__ClassZ
                 0x00000000200011f0                _D18TypeInfo_Interface7__ClassZ
                 0x000000002000123c                _D15TypeInfo_Struct7__ClassZ
                 0x0000000020001288                _D16TypeInfo_Typedef7__ClassZ

 
 In my application there will be several large structs. I never 
 create anything of these types. Instead I use them to point to 
 hardware registers and maybe on top of existing byte arrays 
 like message buffers. There will still be initial values for 
 these structs wasting memory. Is there any way to omit them?
 

 See
 https://github.com/D-Programming-GDC/GDC/pull/82
  attribute("noinit")

Yes this will solve the problem.

Aug 16 2014

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Aug 2014 08:39:04 +0000,
"Timo Sintonen" <t.sintonen luukku.com> wrote:

 
 Regarding __Class: Can you post a short example?

 
 Some lines from mapfile. Seems to be one for every type in the 
 program:
 
   .data          0x0000000020001074      0x720 
 minlibd/libdruntime/libdruntime.a(object_.o)
                  0x0000000020001074                
 _D9Exception7__ClassZ
                  0x00000000200010c0                
 _D8TypeInfo7__ClassZ
                  0x000000002000110c                
 _D17TypeInfo_Function7__ClassZ
                  0x0000000020001158                
 _D17TypeInfo_Delegate7__ClassZ
                  0x00000000200011a4                
 _D14TypeInfo_Class7__ClassZ
                  0x00000000200011f0                
 _D18TypeInfo_Interface7__ClassZ
                  0x000000002000123c                
 _D15TypeInfo_Struct7__ClassZ
                  0x0000000020001288                
 _D16TypeInfo_Typedef7__ClassZ
 

I just had a look at this and ClassInfo has a mutable 'monitor' field,
so it can't be placed into read-only data.

Aug 16 2014

"Mike" <none none.com> writes:

On Saturday, 16 August 2014 at 09:29:14 UTC, Johannes Pfau wrote:

 I just had a look at this and ClassInfo has a mutable 'monitor' 
 field,
 so it can't be placed into read-only data.

This was discussed at DConf 2014.  
https://www.youtube.com/watch?v=TNvUIWFy02I#t=1008


There is currently a pull request to remove the monitor field 
from Object and therefore from all classes: 
https://github.com/D-Programming-Language/druntime/pull/789.

Mike

Aug 16 2014

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Aug 2014 10:36:19 +0000,
"Mike" <none none.com> wrote:

 On Saturday, 16 August 2014 at 09:29:14 UTC, Johannes Pfau wrote:
 
 I just had a look at this and ClassInfo has a mutable 'monitor' 
 field,
 so it can't be placed into read-only data.

 
 This was discussed at DConf 2014.  
 https://www.youtube.com/watch?v=TNvUIWFy02I#t=1008
 
 
 There is currently a pull request to remove the monitor from 
 object field from object and therefore all classes: 
 https://github.com/D-Programming-Language/druntime/pull/789.
 
 Mike

Great! But I think this pull request addresses a different monitor
problem: There's an implicit __monitor field in every class right now,
which makes every class _instance_ bigger.

But the monitor in TypeInfo/ClassInfo is different: ClassInfo exists
only once per class, it doesn't matter how many class instances you've
got. AFAIR this monitor is to support synchronize(ClassType) which
synchronizes on the class type, not on an instance.
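
A small sketch of the two kinds of locking being distinguished here, with a hypothetical class `Counter`. It assumes a hosted platform with a full druntime. Synchronizing on an instance uses that instance's monitor; synchronizing on the ClassInfo object uses the monitor of the single per-class TypeInfo object, which is why that object needs a writable field.

```d
// Hypothetical class for illustration.
class Counter
{
    int value;
}

void main()
{
    auto c = new Counter;

    // Locks the monitor of this particular instance.
    synchronized (c)
    {
        c.value++;
    }

    // Locks the monitor of the per-class ClassInfo object, which exists
    // once regardless of how many Counter instances there are.
    synchronized (Counter.classinfo)
    {
        c.value++;
    }

    assert(c.value == 2);
}
```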

Aug 17 2014

"Mike" <none none.com> writes:

On Sunday, 17 August 2014 at 08:26:40 UTC, Johannes Pfau wrote:
 Great! But I think this pull request addresses a different 
 monitor
 problem: There's an implicit __monitor field in every class 
 right now,
 which makes every class _instance_ bigger.

 But the monitor in TypeInfo/ClassInfo is different: ClassInfo 
 exists
 only once per class, it doesn't matter how many class instances 
 you've
 got. AFAIR this monitor is to support synchronize(ClassType) 
 which
 synchronizes on the class type, not on an instance.

I looked through the source code, and couldn't find any such
monitor.  Can you please point it out for me?

Thanks,
Mike

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

On Sun, 17 Aug 2014 10:44:34 +0000,
"Mike" <none none.com> wrote:

 On Sunday, 17 August 2014 at 08:26:40 UTC, Johannes Pfau wrote:
 Great! But I think this pull request addresses a different 
 monitor
 problem: There's an implicit __monitor field in every class 
 right now,
 which makes every class _instance_ bigger.

 But the monitor in TypeInfo/ClassInfo is different: ClassInfo 
 exists
 only once per class, it doesn't matter how many class instances 
 you've
 got. AFAIR this monitor is to support synchronize(ClassType) 
 which
 synchronizes on the class type, not on an instance.

 
 I looked through the source code, and couldn't find any such
 monitor.  Can you please point it out for me?
 
 Thanks,
 Mike

In gcc/d/d-objfile.cc: Search for

  /* Put out the ClassInfo.
   * The layout is:
   *  void **vptr;
   *  monitor_t monitor;
   *  byte[] initializer;         // static initialisation data

Actually I just realized that this is also true for all TypeInfo, so
I'll have to revert the commit which placed TypeInfo into .rodata

(Thinking more about it, it's more or less the same monitor as the
one referred in the pull request: TypeInfo are classes and for every
type there is one instance, then these instances have __monitor fields.
But the implementation in the compiler is slightly different)

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/16/14 09:33, Johannes Pfau via D.gnu wrote:
 https://github.com/D-Programming-GDC/GDC/pull/82

[Only noticed this accidentally; using a mailing list
instead of some web forum would increase visibility...]

  enum var = Volatile!(T,addr)(): doesn't allow |= on enum literals, even if
the type implements opAssign as there's no this pointer

   T volatile_load(T)(ref T v) nothrow {
      asm { "" : "+m" v; }
      T res = v;
      asm { "" : "+g" res; }
      return res;
   }

   void volatile_store(T)(ref T v, const T a) nothrow {
      asm { "" : : "m" v; }
      v = a;
      asm { "" : "+m" v; }
   }
      
   struct Volatile(T, alias /* T* */ A) {
       void opOpAssign(string OP)(const T rhs) nothrow {
           auto v = volatile_load(*A);
           mixin("v " ~ OP ~ "= rhs;");
           volatile_store(*A, v);
       }
   }

   enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();

   void main() {
      TimerB |= 0b1;
      TimerB += 1;
   }
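
For comparison, here is a portable sketch of the same read-modify-write wrapper that avoids GDC inline asm. It assumes the `volatileLoad`/`volatileStore` intrinsics from `core.volatile`, which were added to druntime after this thread, and it takes the address at run time instead of as a template alias. `VolatileRef` and `fakeRegister` are hypothetical names; an ordinary variable stands in for the hardware register so the example can run anywhere.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical wrapper: every access goes through a volatile load or store.
struct VolatileRef(T)
{
    T* address;

    // Implements |=, +=, and so on as load, modify, store.
    void opOpAssign(string op)(const T rhs)
    {
        T v = volatileLoad(address);
        mixin("v " ~ op ~ "= rhs;");
        volatileStore(address, v);
    }

    T get()
    {
        return volatileLoad(address);
    }
}

void main()
{
    uint fakeRegister = 0b100;                  // stand-in for a real register
    auto timer = VolatileRef!uint(&fakeRegister);
    timer |= 0b1;                               // 0b100 | 0b001 == 5
    timer += 1;                                 // 5 + 1 == 6
    assert(timer.get() == 6);
    assert(fakeRegister == 6);
}
```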

 not emitting force-inlined functions is a logical optimization for forceinline
(if a function is always inlined, there's no way to call it, so there's no need
to output it).

Taking the address of an always_inline function is allowed.

artur

Aug 16 2014

"Mike" <none none.com> writes:

On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via 
D.gnu wrote:
 Taking the address of an always_inline function is allowed.

It may be allowed, but it probably shouldn't be.  Always-inlining 
a function and taking the address of that function is 
contradictory.

But this situation demonstrates why having an intelligent linker 
is a better solution than decorating with attributes.  The linker 
should know if you took an address of an always-inlined function 
or not and decide whether or not to remove it from the binary.

Mike

Aug 16 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/16/14 12:41, Mike via D.gnu wrote:
 On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via D.gnu wrote:
 Taking the address of an always_inline function is allowed.

 
 It may be allowed, but it probably shouldn't be.  Always-inlining a function
and taking the address of that function is contradictory.

Address-of should work -- disallowing it wouldn't help much, but would
create problems for code that needs to call the function both directly
and indirectly. This is actually a larger problem for D than for C (where
it's allowed) because of generic code, templates and delegates. The
alternative would be requiring trivial not-@inline wrappers and compile
failures if one is accidentally forgotten.
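
A minimal sketch of code that needs both kinds of call. It uses the language-level `pragma(inline, true)`, which was added to D after this thread, as a stand-in for the GDC-specific attribute discussed here; the function `twice` is hypothetical. Because the address is taken, the compiler still has to emit an out-of-line copy for the indirect call.

```d
// Request inlining at every direct call site.
pragma(inline, true)
int twice(int x)
{
    return 2 * x;
}

void main()
{
    // Direct call: a candidate for inlining.
    assert(twice(3) == 6);

    // Indirect call through a function pointer: needs a real function body.
    int function(int) fp = &twice;
    assert(fp(4) == 8);
}
```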

A `@nocode` attribute would be a good idea, yes, but there's no need
to make it implicit for `@inline`.

 But this situation demonstrates why having an intelligent linker is a better
solution than decorating with attributes.  The linker should know if you took
an address of an always-inlined function or not and decide whether or not to
remove it from the binary.

It already does. Apparently there are some kind of problems with
certain setups, but, instead of addressing those problems, more and
more /language/ hacks are proposed...

artur

Aug 16 2014

Johannes Pfau <nospam example.com> writes:

On Sat, 16 Aug 2014 13:15:57 +0200,
"Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:

 On 08/16/14 12:41, Mike via D.gnu wrote:
 On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via
 D.gnu wrote:
 Taking the address of an always_inline function is allowed.

 
 It may be allowed, but it probably shouldn't be.  Always-inlining a
 function and taking the address of that function is contradictory.

 
 Address-of should work -- disallowing it wouldn't help much, but would
 create problems for code that needs to call the function both directly
 and indirectly. This is actually a larger problem for D than for C
 (where it's allowed) because of generic code, templates and
 delegates. The alternative would be requiring trivial not- inline
 wrappers and compile failures if one is accidentally forgotten.
 
 A ` nocode` attribute would be a good idea, yes, but there's no need
 to make it implicit for ` inline`.

We can make this explicit. I don't care enough to argue about that.

 But this situation demonstrates why having an intelligent linker is
 a better solution than decorating with attributes.  The linker
 should know if you took an address of an always-inlined function or
 not and decide whether or not to remove it from the binary.

 
 It already does. Apparently there are some kind of problems with
 certain setups, but, instead of addressing those problems, more and
 more /language/ hacks are proposed...
 
 artur

So as you know all these problems and you know exactly how to fix them,
where's your contribution?

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 10:31, Johannes Pfau via D.gnu wrote:
 Am Sat, 16 Aug 2014 13:15:57 +0200
 schrieb "Artur Skawina via D.gnu" <d.gnu puremagic.com>:
 
 It already does. Apparently there are some kind of problems with
 certain setups, but, instead of addressing those problems, more and
 more /language/ hacks are proposed...


 So as you know all these problems and you know exactly how to fix them,
 where's your contribution?

*I* haven't encountered any problems and have been using functions+
data+gc-sections for years...

artur

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

On Sun, 17 Aug 2014 13:38:36 +0200,
"Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:

 On 08/17/14 10:31, Johannes Pfau via D.gnu wrote:
 Am Sat, 16 Aug 2014 13:15:57 +0200
 schrieb "Artur Skawina via D.gnu" <d.gnu puremagic.com>:
 
 It already does. Apparently there are some kind of problems with
 certain setups, but, instead of addressing those problems, more and
 more /language/ hacks are proposed...


 
 So as you know all these problems and you know exactly how to fix
 them, where's your contribution?

 
 *I* haven't encountered any problems and have been using functions+
 data+gc-sections for years...
 

Then I don't understand your statement at all. You said 'instead of
addressing those problems' but there are no problems? Also what exactly
are 'more /language/ hacks'?

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 13:57, Johannes Pfau via D.gnu wrote:
 Am Sun, 17 Aug 2014 13:38:36 +0200
 schrieb "Artur Skawina via D.gnu" <d.gnu puremagic.com>:
 
 On 08/17/14 10:31, Johannes Pfau via D.gnu wrote:
 Am Sat, 16 Aug 2014 13:15:57 +0200
 schrieb "Artur Skawina via D.gnu" <d.gnu puremagic.com>:

 It already does. Apparently there are some kind of problems with
 certain setups, but, instead of addressing those problems, more and
 more /language/ hacks are proposed...


 So as you know all these problems and you know exactly how to fix
 them, where's your contribution?

 *I* haven't encountered any problems and have been using functions+
 data+gc-sections for years...

 
 Then I don't understand your statement at all. You said 'instead of
 addressing those problems' but there are no problems?

I don't know - it wasn't me who proposed:

- @attribute("noinit")
- @attribute("notypeinfo")
- @attribute("nocode")
- pragma(GNU_nomoduleinfo)

etc

 Also what exactly are 'more /language/ hacks'?

The above, volatile attribute etc. Note that I agree (some of) those
are necessary -- it's just that they are all useful for certain very
specific cases -- they are not a general solution to the codegen
bloat problem. A situation where practically every declaration and
almost every scope in a D program needs to be annotated with
compiler-specific non-portable annotations is not a good one. And not
even a practical one -- it is not reasonable to expect everyone to modify
the source of every used library (!) to match the requirements of
every project (some may need RTTI, others may not want it at all, etc).

artur

Aug 17 2014

"Mike" <none none.com> writes:

On Saturday, 16 August 2014 at 11:16:09 UTC, Artur Skawina via
D.gnu wrote:

 A `@nocode` attribute would be a good idea, yes, but there's no need
 to make it implicit for `@inline`.

 But this situation demonstrates why having an intelligent 
 linker is a better solution than decorating with attributes.  
 The linker should know if you took an address of an 
 always-inlined function or not and decide whether or not to 
 remove it from the binary.

 It already does. Apparently there are some kind of problems with
 certain setups, but, instead of addressing those problems, more 
 and
 more /language/ hacks are proposed...

Do you mean the problems with --gc-sections breaking code?

Mike

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Saturday, 16 August 2014 at 09:59:03 UTC, Artur Skawina via 
D.gnu wrote:
 On 08/16/14 09:33, Johannes Pfau via D.gnu wrote:
 https://github.com/D-Programming-GDC/GDC/pull/82

 [Only noticed this accidentally; using a mailing list
 instead of some web forum would increase visibility...]

  enum var = Volatile!(T,addr)(): doesn't allow |= on enum 
 literals, even if the type implements opAssign as there's no 
 this pointer

    T volatile_load(T)(ref T v) nothrow {
       asm { "" : "+m" v; }
       T res = v;
       asm { "" : "+g" res; }
       return res;
    }

    void volatile_store(T)(ref T v, const T a) nothrow {
       asm { "" : : "m" v; }
       v = a;
       asm { "" : "+m" v; }
    }
 
    struct Volatile(T, alias /* T* */ A) {
        void opOpAssign(string OP)(const T rhs) nothrow {
            auto v = volatile_load(*A);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(*A, v);
        }
    }

    enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();

    void main() {
       TimerB |= 0b1;
       TimerB += 1;
    }

 not emitting force-inlined functions is a logical optimization 
 for forceinline (if a function is always inlined, there's no 
 way to call it, so there's no need to output it).

 Taking the address of an always_inline function is allowed.

 artur

This seems to work.

I am not so familiar with these opAssign things, so how can I do a 
basic assignment: TimerB = 0x1234 ?

How can I use this with struct members ?

Is it possible to inline volatile_load and volatile_store ?

Aug 16 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/16/14 18:46, Timo Sintonen via D.gnu wrote:
 
 I am not so familiar with these opAssign things, so how can I do a basic
 assignment: TimerB = 0x1234 ?

 Is it possible to inline volatile_load and volatile_store ?

   version (GNU) {
   static import gcc.attribute;
   enum inline = gcc.attribute.attribute("forceinline");
   }
   
   extern int volatile_dummy;
   
   @inline T volatile_load(T)(ref T v) nothrow {
      asm { "" : "+m" v, "+m" volatile_dummy; }
      T res = v;
      asm { "" : "+g" res, "+m" volatile_dummy; }
      return res;
   }

   @inline void volatile_store(T, A)(ref T v, A a) nothrow {
      asm { "" : "+m" volatile_dummy : "m" v; }
      v = a;
      asm { "" : "+m" v, "+m" volatile_dummy; }
   }
   
   static struct Volatile(T, alias PTR) {
      static: nothrow: @inline:
      void opOpAssign(string OP)(const T rhs) {
           auto v = volatile_load(*PTR);
           mixin("v " ~ OP ~ "= rhs;");
           volatile_store(*PTR, v);
      }
      void opAssign()(const T rhs) { volatile_store(*PTR, rhs); }
      T opUnary(string OP:"*")() { return volatile_load(*PTR); }
   }

   enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();

   int main() {
      TimerB |= 0b1;
      TimerB += 1;
      TimerB = 42;
      return *TimerB;
   }

 How can I use this with struct members ?

One possibility would be to declare all members as `Volatile!...`, or
even create such a struct at CT. Another solution would be something
like http://forum.dlang.org/post/mailman.4237.1405540813.2907.digital
ars-d puremagic.com .

artur

Aug 16 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/16/14 20:40, Artur Skawina wrote:
 How can I use this with struct members ?

 
 One possibility would be to declare all members as `Volatile!...`, or

I did not like that required dereference in the previous version,
and tried a different approach:

   struct Timer
   {
       Volatile!uint control;
       Volatile!uint data;
   }

   enum timerA = cast(Timer*)0xDEADBEAF;

   int main() {
      timerA.control |= 0b1;
      timerA.control += 1;
      timerA.control = 42;
      int a = timerA.data - timerA.data;
      int b = timerA.control;
      return timerA.control;
   }

   version (GNU) {
   static import gcc.attribute;
   enum inline = gcc.attribute.attribute("forceinline");
   }
   
   extern int volatile_dummy;
   
   @inline T volatile_load(T)(ref T v) nothrow {
      asm { "" : "+m" v, "+m" volatile_dummy; }
      T res = v;
      asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
      return res;
   }

   @inline void volatile_store(T, A)(ref T v, A a) nothrow {
      asm { "" : "+m" volatile_dummy : "m" v; }
      v = a;
      asm { "" : "+m" v, "+m" volatile_dummy; }
   }
   
   struct Volatile(T) {
      T raw;
      nothrow: @inline:
      @disable this(this);
      void opAssign(A)(A a) { volatile_store(raw, a); }
      T load() @property { return volatile_load(raw); }
      alias load this;
      void opOpAssign(string OP)(const T rhs) {
           auto v = volatile_load(raw);
           mixin("v " ~ OP ~ "= rhs;");
           volatile_store(raw, v);
      }
   }


artur

Aug 16 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Saturday, 16 August 2014 at 20:01:06 UTC, Artur Skawina via
D.gnu wrote:
 On 08/16/14 20:40, Artur Skawina wrote:
 How can I use this with struct members ?

 
 One possibility would be to declare all members as 
 `Volatile!...`, or

 I did not like that required dereference in the previous 
 version,
 and tried a different approach:

    struct Timer
    {
        Volatile!uint control;
        Volatile!uint data;
    }

    enum timerA = cast(Timer*)0xDEADBEAF;

    int main() {
       timerA.control |= 0b1;
       timerA.control += 1;
       timerA.control = 42;
       int a = timerA.data - timerA.data;
       int b = timerA.control;
       return timerA.control;
    }

    version (GNU) {
    static import gcc.attribute;
    enum inline = gcc.attribute.attribute("forceinline");
    }
 
    extern int volatile_dummy;
 
    @inline T volatile_load(T)(ref T v) nothrow {
       asm { "" : "+m" v, "+m" volatile_dummy; }
       T res = v;
       asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
       return res;
    }

    @inline void volatile_store(T, A)(ref T v, A a) nothrow {
       asm { "" : "+m" volatile_dummy : "m" v; }
       v = a;
       asm { "" : "+m" v, "+m" volatile_dummy; }
    }
 
    struct Volatile(T) {
       T raw;
       nothrow: @inline:
       @disable this(this);
       void opAssign(A)(A a) { volatile_store(raw, a); }
       T load() @property { return volatile_load(raw); }
       alias load this;
       void opOpAssign(string OP)(const T rhs) {
            auto v = volatile_load(raw);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(raw, v);
       }
    }


 artur

This seems to work. With inlining the code is quite compact.

Not tested yet but the code for these constructs looks correct:
for (f=0;f<50;f++) { regs.txreg = somebuf[f]; }
while (regs.status == 0) {}

What is the purpose of volatile_dummy? Even if it is not used,
the address for it is calculated in several places.

The struct members are defined separately. This means the address
of every member is stored and fetched separately. The compiler
seems to remove some of these and use the pointer, but I am not
sure what happens when the structs are bigger.

It seems all loads and stores access the real memory, like
volatile should do. It is hard to follow the optimized code so I
am not yet sure that they have not been reordered in any way.

Anyway, this seems an acceptable solution to me.

Johannes, is this a good starting point for you or is your work with
compiler builtins giving us some more?

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

Am Sun, 17 Aug 2014 07:57:15 +0000
schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 On Saturday, 16 August 2014 at 20:01:06 UTC, Artur Skawina via
 D.gnu wrote:
 On 08/16/14 20:40, Artur Skawina wrote:
 How can I use this with struct members ?

 
 One possibility would be to declare all members as
 `Volatile!...`, or

 I did not like that required dereference in the previous
 version,
 and tried a different approach:

    struct Timer
    {
        Volatile!uint control;
        Volatile!uint data;
    }

    enum timerA = cast(Timer*)0xDEADBEAF;

    int main() {
       timerA.control |= 0b1;
       timerA.control += 1;
       timerA.control = 42;
       int a = timerA.data - timerA.data;
       int b = timerA.control;
       return timerA.control;
    }

    version (GNU) {
    static import gcc.attribute;
    enum inline = gcc.attribute.attribute("forceinline");
    }
 
    extern int volatile_dummy;
 
    @inline T volatile_load(T)(ref T v) nothrow {
       asm { "" : "+m" v, "+m" volatile_dummy; }
       T res = v;
       asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
       return res;
    }

    @inline void volatile_store(T, A)(ref T v, A a) nothrow {
       asm { "" : "+m" volatile_dummy : "m" v; }
       v = a;
       asm { "" : "+m" v, "+m" volatile_dummy; }
    }
 
    struct Volatile(T) {
       T raw;
       nothrow: @inline:
       @disable this(this);
       void opAssign(A)(A a) { volatile_store(raw, a); }
       T load() @property { return volatile_load(raw); }
       alias load this;
       void opOpAssign(string OP)(const T rhs) {
            auto v = volatile_load(raw);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(raw, v);
       }
    }


 artur

 
 This seems to work. With inlining the code is quite compact.
 
 Not tested yet but the code for these constructs looks correct:
 for (f=0;f<50;f++) { regs.txreg = somebuf[f]; }
 while (regs.status == 0) {}
 
 What is the purpose of volatile_dummy? Even if it is not used,
 the address for it is calculated in several places.
 
 The struct members are defined separately. This means the address
 of every member is stored and fetched separately. The compiler
 seems to remove some of these and use the pointer, but I am not
 sure what happens when the structs are bigger.
 
 It seems all loads and stores access the real memory, like
 volatile should do. It is hard to follow the optimized code so I
 am not yet sure that they have not been reordered in any way.
 
 Anyway, this seems an acceptable solution to me.
 
 Johannes, is this a good starting point for you or is your work with
 compiler builtins giving us some more?

You mean __builtin_volatile_load/store? I'm not sure if compiler
barriers and these builtins are 100% equal, I think I managed to
produce example code where the barriers didn't work 100% as expected.
But these builtins will need to be introduced anyway as
core.bitop.volatileLoad or whatever final name the DMD devs decide on.
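
A rough sketch of how the wrapper could sit on top of such intrinsics.
It assumes a druntime that ships volatileLoad/volatileStore in
core.volatile (the module name is an assumption about the toolchain in
use; they were first proposed for core.bitop). The Volatile name is
reused from the code quoted above; fakeReg and demo are hypothetical, and
an ordinary variable stands in for a hardware register so the example can
run on a hosted system:

```d
import core.volatile : volatileLoad, volatileStore;

struct Volatile(T) {
    T raw;
    @disable this(this);
    // Every access goes through the intrinsics, which the compiler
    // must not remove or reorder within the same thread.
    void opAssign(T a) { volatileStore(&raw, a); }
    T load() @property { return volatileLoad(&raw); }
    alias load this;
    void opOpAssign(string OP)(const T rhs) {
        auto v = volatileLoad(&raw);
        mixin("v " ~ OP ~ "= rhs;");
        volatileStore(&raw, v);
    }
}

// Hypothetical stand-in for a memory-mapped register.
__gshared Volatile!uint fakeReg;

uint demo() {
    fakeReg = 42;      // volatileStore
    fakeReg |= 0b1;    // load, modify, store
    fakeReg += 1;
    return fakeReg;    // volatileLoad through alias this
}

void main() {
    assert(demo() == 44);
}
```

No volatile_dummy is needed in this variant, since ordering is left to
the intrinsics rather than to asm constraints.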

Regarding nocode/typeinfo/noinit/GNU_nomoduleinfo: I think these are
useful anyway. The linker can strip these out, but I don't want to rely
on the linker and on the user to know all special linker flags only to
avoid some binary bloat which can be avoided in the compiler.

But overall this approach looks fine.

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Sunday, 17 August 2014 at 07:57:15 UTC, Timo Sintonen wrote:
 This seems to work.

This does not work with member functions

struct uartreg {
     Volatile!int sr;
     Volatile!int dr;
     Volatile!int brr;
     Volatile!int cr1;
     Volatile!int cr2;
     Volatile!int cr3;
     Volatile!int gtpr;

     // send a byte to the uart
     void send(int t) {
       while ((sr&0x80)==0)
       {  }
       dr=t;
     }

}

In this function the fetch of sr is omitted but compare is still
made against an invalid register value. Then address of dr is
omitted and store is made from wrong register to invalid address.
So the generated code is totally invalid.

If I move this function out of the struct then it is ok.
I use -O2, not tested what it would do without optimization.

Also if I have:
cr1=cr2=0;
I get: expression this.cr2.opAssign(0) is void and has no value

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 11:24, Timo Sintonen via D.gnu wrote:
 On Sunday, 17 August 2014 at 07:57:15 UTC, Timo Sintonen wrote:
 This seems to work.

 
 This does not work with member functions
 
 struct uartreg {
     Volatile!int sr;
     Volatile!int dr;
     Volatile!int brr;
     Volatile!int cr1;
     Volatile!int cr2;
     Volatile!int cr3;
     Volatile!int gtpr;
 
     // send a byte to the uart
     void send(int t) {
       while ((sr&0x80)==0)
       {  }
       dr=t;
     }
 
 }
 
 In this function the fetch of sr is omitted but compare is still
 made against an invalid register value. Then address of dr is
 omitted and store is made from wrong register to invalid address.
 So the generated code is totally invalid.
 
 If I move this function out of the struct then it is ok.
 I use -O2, not tested what it would do without optimization.

It works for me:

   import volat; // module w/ the last Volatile(T) implementation.

   struct uartreg {
       Volatile!int sr;
       Volatile!int dr;
       Volatile!int brr;
       Volatile!int cr1;
       Volatile!int cr2;
       Volatile!int cr3;
       Volatile!int gtpr;

       // send a byte to the uart
       void send(int t) {
         while ((sr&0x80)==0)
         {  }
         dr=t;
       }
   }

   enum uart = cast(uartreg*)0xDEADBEAF;

   void main() {
      uart.send(42);
   }

=>

0000000000403620 <_Dmain>:
  403620:       b8 af be ad de          mov    $0xdeadbeaf,%eax
  403625:       0f 1f 00                nopl   (%rax)
  403628:       b9 af be ad de          mov    $0xdeadbeaf,%ecx
  40362d:       8b 11                   mov    (%rcx),%edx
  40362f:       81 e2 80 00 00 00       and    $0x80,%edx
  403635:       74 f1                   je     403628 <_Dmain+0x8>
  403637:       bf b3 be ad de          mov    $0xdeadbeb3,%edi
  40363c:       31 c0                   xor    %eax,%eax
  40363e:       c7 07 2a 00 00 00       movl   $0x2a,(%rdi)
  403644:       c3                      retq   

Except for some obviously missed optimizations (dead eax load,
unnecessary ecx reload), the code seems fine. What platform
are you using and what does the emitted code look like?

 Also if I have:
 cr1=cr2=0;
 I get: expression this.cr2.opAssign(0) is void and has no value

That's because the opAssign returns void, which prevents this
kind of chaining. This was a deliberate choice, as I /wanted/ to
disallow that; it's already a bad idea for normal assignments;
for volatile ones, which can require a specific order, it's an
even worse one.
But it's trivial to "fix", just change

   void opAssign(A)(A a) { volatile_store(raw, a); }

to

   T opAssign(A)(A a) { volatile_store(raw, a); return a; }
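
A minimal sketch of the difference, runnable with any D compiler. A
plain store stands in for volatile_store, and the names Reg, Uart and
chained are hypothetical, chosen only for this illustration:

```d
// Hypothetical stand-in for Volatile!T: a plain store replaces
// volatile_store so the sketch does not need GDC inline asm.
struct Reg(T) {
    T raw;
    @disable this(this);
    // Returning the stored value (instead of void) allows a = b = x.
    T opAssign(A)(A a) { raw = a; return a; }
    T load() @property { return raw; }
    alias load this;
}

struct Uart {
    Reg!int cr1;
    Reg!int cr2;
}

int chained() {
    Uart u;
    u.cr1 = u.cr2 = 5;    // u.cr1.opAssign(u.cr2.opAssign(5))
    return u.cr1 + u.cr2; // alias this yields the loaded ints
}

void main() {
    assert(chained() == 10);
}
```

Note that the chained form passes on the value that was written, it does
not read cr2 back, which matters for registers that read differently
from what was stored.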

artur

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Sunday, 17 August 2014 at 11:35:33 UTC, Artur Skawina via 
D.gnu wrote:

 It works for me:

    import volat; // module w/ the last Volatile(T) implementation.

    struct uartreg {
        Volatile!int sr;
        Volatile!int dr;
        Volatile!int brr;
        Volatile!int cr1;
        Volatile!int cr2;
        Volatile!int cr3;
        Volatile!int gtpr;

        // send a byte to the uart
        void send(int t) {
          while ((sr&0x80)==0)
          {  }
          dr=t;
        }
    }

    enum uart = cast(uartreg*)0xDEADBEAF;

    void main() {
       uart.send(42);
    }

 =>

 0000000000403620 <_Dmain>:
   403620:       b8 af be ad de          mov    $0xdeadbeaf,%eax
   403625:       0f 1f 00                nopl   (%rax)
   403628:       b9 af be ad de          mov    $0xdeadbeaf,%ecx
   40362d:       8b 11                   mov    (%rcx),%edx
   40362f:       81 e2 80 00 00 00       and    $0x80,%edx
   403635:       74 f1                   je     403628 <_Dmain+0x8>
   403637:       bf b3 be ad de          mov    $0xdeadbeb3,%edi
   40363c:       31 c0                   xor    %eax,%eax
   40363e:       c7 07 2a 00 00 00       movl   $0x2a,(%rdi)
   403644:       c3                      retq

 Except for some obviously missed optimizations (dead eax load,
 unnecessary ecx reload), the code seems fine. What platform
 are you using and what does the emitted code look like?

 Also if I have:
 cr1=cr2=0;
 I get: expression this.cr2.opAssign(0) is void and has no value

 That's because the opAssign returns void, which prevents this
 kind of chaining. This was a deliberate choice, as I /wanted/ to
 disallow that; it's already a bad idea for normal assignments;
 for volatile ones, which can require a specific order, it's an
 even worse one.
 But it's trivial to "fix", just change

    void opAssign(A)(A a) { volatile_store(raw, a); }

 to

    T opAssign(A)(A a) { volatile_store(raw, a); return a; }

 artur

I am compiling for arm and I am sorry I misinterpreted the 
optimized code. Actually the code is correct but it still does 
not work.
The problem is that the call to get the tls pointer for 
volatile_dummy seems to corrupt the register (r3) where the this 
pointer is. The call is inside the while loop.  After removing 
the call by hand in the assembly everything works. R3 is usually 
pushed into stack when it is used in a function. I have to check 
what is wrong in this case.

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 15:44, Timo Sintonen via D.gnu wrote:

 I am compiling for arm and I am sorry I misinterpreted the optimized code.
 Actually the code is correct but it still does not work.
 The problem is that the call to get the tls pointer for volatile_dummy seems
 to corrupt the register (r3) where the this pointer is. The call is inside the
 while loop.  After removing the call by hand in the assembly everything works.
 R3 is usually pushed into stack when it is used in a function. I have to check
 what is wrong in this case.

Does declaring it as:

   extern __gshared int volatile_dummy;
 
help?

artur

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Sunday, 17 August 2014 at 13:59:03 UTC, Artur Skawina via 
D.gnu wrote:
 On 08/17/14 15:44, Timo Sintonen via D.gnu wrote:

 I am compiling for arm and I am sorry I misinterpreted the 
 optimized code. Actually the code is correct but it still does 
 not work.
 The problem is that the call to get the tls pointer for 
 volatile_dummy seems to corrupt the register (r3) where the 
 this pointer is. The call is inside the while loop.  After 
 removing the call by hand in the assembly everything works. R3 
 is usually pushed into stack when it is used in a function. I 
 have to check what is wrong in this case.

 Does declaring it as:

    extern __gshared int volatile_dummy;
 
 help?

 artur

Yes, now it works.

But the register corruption is still an issue. My tls function 
clearly uses r3 and does not save it.

Johannes, do you know the arm calling system? Is it caller or 
callee that should save r3?
In this case it is my function that has one function inlined that 
has another function inlined that contains a compiler generated 
function call. Could this be a bug in the compiler that it does 
not recognize the innermost call and does not save registers?

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

Am Sun, 17 Aug 2014 14:36:53 +0000
schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 On Sunday, 17 August 2014 at 13:59:03 UTC, Artur Skawina via 
 D.gnu wrote:
 On 08/17/14 15:44, Timo Sintonen via D.gnu wrote:

 I am compiling for arm and I am sorry I misinterpreted the 
 optimized code. Actually the code is correct but it still does 
 not work.
 The problem is that the call to get the tls pointer for 
 volatile_dummy seems to corrupt the register (r3) where the 
 this pointer is. The call is inside the while loop.  After 
 removing the call by hand in the assembly everything works. R3 
 is usually pushed into stack when it is used in a function. I 
 have to check what is wrong in this case.

 Does declaring it as:

    extern __gshared int volatile_dummy;
 
 help?

 artur

 
 Yes, now it works.
 
 But the register corruption is still an issue. My tls function 
 clearly uses r3 and does not save it.
 
 Johannes, do you know the arm calling system? Is it caller or 
 callee that should save r3?
 In this case it is my function that has one function inlined that 
 has another function inlined that contains a compiler generated 
 function call. Could this be a bug in the compiler that it does 
 not recognize the innermost call and does not save registers?

r3 is an argument/scratch register, the callee can't rely on its
contents after a function call. This could also be caused by the inline
ASM.

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

Am Sun, 17 Aug 2014 16:45:15 +0200
schrieb Johannes Pfau <nospam example.com>:

 the callee can't rely on its

caller of course ;-)

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Sunday, 17 August 2014 at 14:47:57 UTC, Johannes Pfau wrote:
 Am Sun, 17 Aug 2014 14:36:53 +0000
 schrieb "Timo Sintonen" <t.sintonen luukku.com>:

 On Sunday, 17 August 2014 at 13:59:03 UTC, Artur Skawina via 
 D.gnu wrote:
 On 08/17/14 15:44, Timo Sintonen via D.gnu wrote:

 I am compiling for arm and I am sorry I misinterpreted the 
 optimized code. Actually the code is correct but it still 
 does not work.
 The problem is that the call to get the tls pointer for 
 volatile_dummy seems to corrupt the register (r3) where the 
 this pointer is. The call is inside the while loop.  After 
 removing the call by hand in the assembly everything works. 
 R3 is usually pushed into stack when it is used in a 
 function. I have to check what is wrong in this case.

 Does declaring it as:

    extern __gshared int volatile_dummy;
 
 help?

 artur

 
 Yes, now it works.
 
 But the register corruption is still an issue. My tls function 
 clearly uses r3 and does not save it.
 
 Johannes, do you know the arm calling system? Is it caller or 
 callee that should save r3?
 In this case it is my function that has one function inlined 
 that has another function inlined that contains a compiler 
 generated function call. Could this be a bug in the compiler 
 that it does not recognize the innermost call and does not 
 save registers?

 r3 is an argument/scratch register, the callee can't rely on its
 contents after a function call. This could also be caused by 
 the inline
 ASM.

So is this a bug or just undefined behavior?

Aug 17 2014

"Timo Sintonen" <t.sintonen luukku.com> writes:

On Sunday, 17 August 2014 at 14:47:57 UTC, Johannes Pfau wrote:
 Am Sun, 17 Aug 2014 14:36:53 +0000
 schrieb "Timo Sintonen" <t.sintonen luukku.com>:
 
 But the register corruption is still an issue. My tls function 
 clearly uses r3 and does not save it.
 
 Johannes, do you know the arm calling system? Is it caller or 
 callee that should save r3?
 In this case it is my function that has one function inlined 
 that has another function inlined that contains a compiler 
 generated function call. Could this be a bug in the compiler 
 that it does not recognize the innermost call and does not 
 save registers?

 r3 is an argument/scratch register, the caller can't rely on its
 contents after a function call. This could also be caused by 
 the inline
 ASM.

I have had some weird bugs lately and then I looked at my other 
object files.
I think there is a bug because I found more like this:

This is a class function (actually a constructor) that writes 
constant values into two variables, one is a static class 
variable in tls and the other is an instance variable.

   27 0000 10B5     		push	{r4, lr}
   28 0002 0346     		mov	r3, r0
   29 0004 FFF7FEFF 		bl	__aeabi_read_tp	  load_tp_soft
   30 0008 034A     		ldr	r2, .L3


   33 0010 1150     		str	r1, [r2, r0]

   35 0014 1846     		mov	r0, r3
   36 0016 10BD     		pop	{r4, pc}
   37              	.L4:
   38              		.align	2
   39              	.L3:
   40 0018 00000000 		.word	.LANCHOR0(tpoff)
   41

In line 28 the this pointer is saved to r3, then the call in line 
29 returns the tls start address in r0. __aeabi_read_tp uses r3 
to fetch the address so r3 is corrupted. R3 is used in line 34 to 
store to address this+8 and then r3 is moved back to r0 returning 
an incorrect value for this.

Is this a gdc or gcc bug?

Aug 18 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 09:57, Timo Sintonen via D.gnu wrote:

 What is the purpose of volatile_dummy? Even if it is not used,

Ensuring ordering, w/o it the compiler could reorder operations
on different volatile objects. (Which isn't necessarily a bad thing,
but people expect certain semantics of 'volatile', so it would be
a bad and dangerous default)

 the address for it is calculated in several places.

It's completely optimized away for me (I'm testing on x86). Can you
show the emitted code?

 The struct members are defined separately. This means the address
 of every member is stored and fetched separately. The compiler
 seems to remove some of these and use the pointer, but I am not
 sure what happens when the structs are bigger.

Yes, the compiler does not generate optimal code, but so far I've
only seen dead immediate-constant->register loads; so it's not a
huge problem.

artur

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

Am Sat, 16 Aug 2014 11:58:49 +0200
schrieb "Artur Skawina via D.gnu" <d.gnu puremagic.com>:

 On 08/16/14 09:33, Johannes Pfau via D.gnu wrote:
 https://github.com/D-Programming-GDC/GDC/pull/82

 
 [Only noticed this accidentally; using a mailing list
 instead of some web forum would increase visibility...]
 
  enum var = Volatile!(T,addr)(): doesn't allow |= on enum literals,
 even if the type implements opAssign as there's no this pointer

 
    T volatile_load(T)(ref T v) nothrow {
       asm { "" : "+m" v; }
       T res = v;
       asm { "" : "+g" res; }
       return res;
    }
 
    void volatile_store(T)(ref T v, const T a) nothrow {
       asm { "" : : "m" v; }
       v = a;
       asm { "" : "+m" v; }
    }
       
    struct Volatile(T, alias /* T* */ A) {
        void opOpAssign(string OP)(const T rhs) nothrow {
            auto v = volatile_load(*A);
            mixin("v " ~ OP ~ "= rhs;");
            volatile_store(*A, v);
        }
    }
 
    enum TimerB = Volatile!(uint, cast(uint*)0xDEADBEEF)();
 
    void main() {
       TimerB |= 0b1;
       TimerB += 1;
    }

That's a good start. Can you also get unary operators working?
e.g
TimerB++;

Do you think it's possible to combine this with the other solution you
posted for struct fields? Or do we need separate Volatile!T and
VolatileField!T types?

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 10:49, Johannes Pfau via D.gnu wrote:

 That's a good start. Can you also get unary operators working?
 e.g
 TimerB++;

Unary ops are easy. If you mean post-inc and post-dec -- that's a 
language problem. At least for volatile, they will cause a compile
error; for atomic ops the naive `post-op->tmp-load+op+tmp` rewrite
can introduce bugs... D would need to make the post-ops overloadable
to get rid of these issues.

 Do you think it's possible to combine this with the other solution you
 posted for struct fields? Or do we need separate Volatile!T and
 VolatileField!T types?

Right now, I'd prefer this approach:

--------------------------------------------------------------
   module volat;
   
   version (GNU) {
   static import gcc.attribute;
   enum inline = gcc.attribute.attribute("forceinline");
   }
   
   extern int volatile_dummy;
   
   @inline T volatile_load(T)(ref T v) nothrow {
      asm { "" : "+m" v, "+m" volatile_dummy; }
      T res = v;
      asm { "" : "+g" res, "+m" v, "+m" volatile_dummy; }
      return res;
   }

   @inline void volatile_store(T, A)(ref T v, A a) nothrow {
      asm { "" : "+m" volatile_dummy : "m" v; }
      v = a;
      asm { "" : "+m" v, "+m" volatile_dummy; }
   }
   
   @inline void volatile_barrier(T)(ref T v) nothrow {
      asm { "" : "+m" v, "+m" volatile_dummy; }
   }
   
   struct Volatile(T) {
      T raw;
      nothrow: @inline:
      @disable this(this);
      void opAssign(A)(A a) { volatile_store(raw, a); }
      T load() @property { return volatile_load(raw); }
      alias load this;
      void opOpAssign(string OP)(const T b) {
           volatile_barrier(raw);
           mixin("raw " ~ OP ~ "= b;");
           volatile_barrier(raw);
      }
      T opUnary(string OP)() {
           volatile_barrier(raw);
           auto result = mixin(OP ~ "raw");
           volatile_barrier(raw);
           return result;
      }
   }
--------------------------------------------------------------
   import volat;

   struct Timer
   {
       Volatile!uint control;
       Volatile!uint data;
   }

   enum timerA = cast(Timer*)0xDEADBEAF;

   int main() {
      timerA.control |= 0b1;
      timerA.control += 1;
      timerA.control = 42;
      int a = timerA.data - timerA.data;
      int b = ++timerA.control;
      --timerA.data;
      timerA.control /= 2;
      return b;
   }
--------------------------------------------------------------

compiles to:

--------------------------------------------------------------
0000000000403620 <_Dmain>:
  403620:       ba af be ad de          mov    $0xdeadbeaf,%edx
  403625:       b9 b3 be ad de          mov    $0xdeadbeb3,%ecx
  40362a:       83 0a 01                orl    $0x1,(%rdx)
  40362d:       83 02 01                addl   $0x1,(%rdx)
  403630:       c7 02 2a 00 00 00       movl   $0x2a,(%rdx)
  403636:       8b 42 04                mov    0x4(%rdx),%eax
  403639:       8b 72 04                mov    0x4(%rdx),%esi
  40363c:       8b 02                   mov    (%rdx),%eax
  40363e:       83 c0 01                add    $0x1,%eax
  403641:       89 02                   mov    %eax,(%rdx)
  403643:       83 6a 04 01             subl   $0x1,0x4(%rdx)
  403647:       d1 2a                   shrl   (%rdx)
  403649:       c3                      retq   
--------------------------------------------------------------

Do you see any problems with it? (Other than gcc not removing
that dead constant load)

[The struct-with-volatile-fields can be built from a "normal"
 struct at CT. But that's just syntax sugar.]
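
A minimal sketch of what such a compile-time construction could look like, assuming Phobos' `std.traits.FieldNameTuple` is available. The names `VolatileFields`, `volatileFieldDecls` and `TimerRegs` are hypothetical and not part of the `volat` module above, and the `Volatile` here is only a stand-in so the snippet compiles on its own:

```d
import std.traits : FieldNameTuple;

// Stand-in for the thread's Volatile!T so this sketch compiles on its own;
// the real wrapper would route accesses through volatile_load/volatile_store.
struct Volatile(T)
{
    T raw;
}

// Hypothetical helper: returns one "Volatile!(FieldType) name;" declaration
// per field of S. It is only ever evaluated at compile time (CTFE).
string volatileFieldDecls(S)()
{
    string code;
    foreach (name; FieldNameTuple!S)
        code ~= "Volatile!(typeof(S." ~ name ~ ")) " ~ name ~ ";\n";
    return code;
}

// Hypothetical template: same field names and order as S, each field wrapped.
struct VolatileFields(S)
{
    mixin(volatileFieldDecls!S());
}

// A "normal" register layout, written without any volatile annotations.
struct TimerRegs
{
    uint control;
    uint data;
}

alias Timer = VolatileFields!TimerRegs;

// Types and layout are checked at compile time.
static assert(is(typeof(Timer.control) == Volatile!uint));
static assert(Timer.data.offsetof == TimerRegs.data.offsetof);
static assert(Timer.sizeof == TimerRegs.sizeof);
```

Since each wrapper holds exactly one `T`, the generated struct should keep the register offsets of the plain one, which is what the `static assert`s check. With the real `Volatile!T` swapped in, `cast(Timer*)` on a register base address would be used just like the hand-written `Timer` above.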

artur

Aug 17 2014

Johannes Pfau <nospam example.com> writes:

On Sun, 17 Aug 2014 15:15:12 +0200,
"Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:

> Do you see any problems with it? (Other than gcc not removing
> that dead constant load)

It's perfect for structs, but when simply declaring a Volatile!uint the
pointer dereference must be done manually, right?

----
enum TimerB = cast(Volatile!(uint)*)0xDEADBEEF;

*TimerB |= 0b1;
----

I don't think that's a huge problem though, just a little bit
inconvenient.

Aug 17 2014

"Artur Skawina via D.gnu" <d.gnu puremagic.com> writes:

On 08/17/14 16:16, Johannes Pfau via D.gnu wrote:
> On Sun, 17 Aug 2014 15:15:12 +0200,
> "Artur Skawina via D.gnu" <d.gnu puremagic.com> wrote:
>
>> Do you see any problems with it? (Other than gcc not removing
>> that dead constant load)
>
> It's perfect for structs, but when simply declaring a Volatile!uint the
> pointer dereference must be done manually, right?
>
> ----
> enum TimerB = cast(Volatile!(uint)*)0xDEADBEEF;
>
> *TimerB |= 0b1;
> ----
>
> I don't think that's a huge problem though, just a little bit
> inconvenient.

Another D-problem - the language doesn't have /real/ refs. But...

   import volat;

   @inline ref @property timerA() { return *cast(Volatile!uint*)0xDEADBEAF; }

   int main() {
      timerA |= 0b1;
      timerA += 1;
      timerA = 42;
      int a = timerA - timerA;
      int b = ++timerA;
      --timerA;
      timerA /= 2;
      return b;
   }

=>

0000000000403620 <_Dmain>:
  403620:       ba af be ad de          mov    $0xdeadbeaf,%edx
  403625:       83 0a 01                orl    $0x1,(%rdx)
  403628:       83 02 01                addl   $0x1,(%rdx)
  40362b:       c7 02 2a 00 00 00       movl   $0x2a,(%rdx)
  403631:       8b 02                   mov    (%rdx),%eax
  403633:       8b 0a                   mov    (%rdx),%ecx
  403635:       8b 02                   mov    (%rdx),%eax
  403637:       83 c0 01                add    $0x1,%eax
  40363a:       89 02                   mov    %eax,(%rdx)
  40363c:       83 2a 01                subl   $0x1,(%rdx)
  40363f:       d1 2a                   shrl   (%rdx)
  403641:       c3                      retq   

artur

Aug 17 2014
