
digitalmars.D.ldc - Compile time macro trouble.

reply Taylor Hillegeist <taylorh140 gmail.com> writes:
So here: https://dlang.org/pretod.html
there is a lovely section about how a normal function basically 
does the job of a preprocessor macro.

The C Preprocessor Way:
#define X(i)	((i) = (i) / 3)

The D Way:
int X(ref int i) { return i = i / 3; }

However I am running into troubles with the truthfulness of this 
on LDC for thumb cortex-m0plus.

MKL25Z4.d
line 5243:
uint32_t SIM_SOPT2_TPMSRC(uint32_t x)  {return ((x << 
SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);}

start.d
line 28:     int value = SIM_SOPT2_TPMSRC(1);
0x0000046E 2001      MOVS     r0,#0x01
0x00000470 9004      STR      r0,[sp,#0x10]
0x00000472 F000F8ED  BL.W     MKL25Z4.SIM_SOPT2_TPMSRC 
(0x00000650) //branch to crap
0x00000476 9005      STR      r0,[sp,#0x14]
...
0x00000650 B082      DCW      0xB082 < the crap
0x00000652 4601      DCW      0x4601
0x00000654 9001      DCW      0x9001
0x00000656 0600      DCW      0x0600
0x00000658 2203      DCW      0x2203

Is this an error on LDC's part or do I need to do something more 
to get the correct result?

I compiled with:
ldc2.exe -output-o -march=thumb -mcpu=cortex-m0plus 
-mtriple=arm-linux -g -c main.d start.d MKL25Z4.d
and linked with:
arm-unknown-linux-gnueabi-ld.exe -T mkl25z4.ld --gc-sections  
start.o MKL25Z4.o main.o  -o ldoutput.elf


And yes, there is a little bit of information missing; I've 
uploaded some files here.

https://github.com/taylorh140/DLANG_FRDM_KL25Z

I'm just trying to get more comfortable with working on 
microcontrollers with D. The good news is that I got a simple 
example running: an LED blinking three colors! I would like 
advice on the best way to handle direct interaction with 
registers, but maybe that's best for another post.
Mar 12 2016
next sibling parent reply Lass Safin <lasssafin gmail.com> writes:
On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist wrote:
 So here: https://dlang.org/pretod.html
 there is a lovely section about how doing normal functions is 
 basically a processor macro.

 The C Preprocessor Way:
 #define X(i)	((i) = (i) / 3)

 The D Way:
 int X(ref int i) { return i = i / 3; }

 However I am running into troubles with the truthfulness of 
 this on LDC for thumb cortex-m0plus.

 MKL25Z.d
 line 5243:
 uint32_t SIM_SOPT2_TPMSRC(uint32_t x)  {return ((x << 
 SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);}
 [...]
 Is this an error on ldc's part or do i need to do something 
 more to get the correct result?
Try using DMD, it may just be a bug in LDC.
In DMD you can also force inlining with pragma(inline, true) (or prevent it with pragma(inline, false)).

Another thing: I wouldn't use ref here. Do it like this instead:

int X(int i) { return i = i / 3; }
int i = 491;
i = X(i);

If you look at the assembly which LDC generates you see this:

mangled_name:
    movl (%rdi), %eax
    shll $2, %eax
    retq

As you can see, it dereferences the pointer in %rdi, which means that a pointer gets sent to the function when you call it. The caller therefore has to store the value in memory first (on the stack or the heap), which costs precious cycles, and the callee has to load it from memory again.
This may also be why it isn't inlined: perhaps LLVM detects that inlining won't improve performance, so it refrains from increasing the code size.
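A minimal sketch of the by-value suggestion combined with the inline pragma. This is only an illustration for a host build; whether a given LDC release honors pragma(inline, true) is an assumption to check against its release notes.

```d
// By-value variant of X: the argument is passed directly instead of
// through a pointer, so the callee has nothing to dereference.
pragma(inline, true)
int X(int i)
{
    return i / 3;
}

void main()
{
    int i = 491;
    i = X(i);            // the caller writes the result back explicitly
    assert(i == 163);    // integer division truncates 163.67 to 163
}
```

The caller now decides whether the result replaces the old value, which is the one behavioral difference from the ref version.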
Mar 13 2016
next sibling parent Taylor Hillegeist <taylorh140 gmail.com> writes:
On Sunday, 13 March 2016 at 12:52:01 UTC, Lass Safin wrote:
 On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist 
 wrote:
 [...]
Well, as nice as it would be to use DMD, I don't think it has an ARM backend. Also, I think GDC does the same thing, and I wonder if it is a configuration issue on my end. It seems like I may need some optimization flags, or something.
Mar 13 2016
prev sibling parent Kai Nacke <kai redstar.de> writes:
On Sunday, 13 March 2016 at 12:52:01 UTC, Lass Safin wrote:
 On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist 
 wrote:
 So here: https://dlang.org/pretod.html
 there is a lovely section about how doing normal functions is 
 basically a processor macro.

 The C Preprocessor Way:
 #define X(i)	((i) = (i) / 3)

 The D Way:
 int X(ref int i) { return i = i / 3; }

 However I am running into troubles with the truthfulness of 
 this on LDC for thumb cortex-m0plus.

 MKL25Z.d
 line 5243:
 uint32_t SIM_SOPT2_TPMSRC(uint32_t x)  {return ((x << 
 SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);}
 [...]
 Is this an error on ldc's part or do i need to do something 
 more to get the correct result?
Try using DMD, it may just be a bug in LDC.
Didn't know that DMD has a thumb cortex-m0plus backend ;-)
Mar 13 2016
prev sibling next sibling parent reply Dan Olson <gorox comcast.net> writes:
Taylor Hillegeist <taylorh140 gmail.com> writes:

 So here: https://dlang.org/pretod.html
 there is a lovely section about how doing normal functions is
 basically a processor macro.

 The C Preprocessor Way:
 #define X(i)	((i) = (i) / 3)

 The D Way:
 int X(ref int i) { return i = i / 3; }

 However I am running into troubles with the truthfulness of this on
 LDC for thumb cortex-m0plus.

 MKL25Z.d
 line 5243:
 uint32_t SIM_SOPT2_TPMSRC(uint32_t x)  {return ((x <<
 SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);}

 start.d
 line 28:     int value = SIM_SOPT2_TPMSRC(1);
 0x0000046E 2001      MOVS     r0,#0x01
 0x00000470 9004      STR      r0,[sp,#0x10]
 0x00000472 F000F8ED  BL.W     MKL25Z4.SIM_SOPT2_TPMSRC (0x00000650)
 //branch to crap
 0x00000476 9005      STR      r0,[sp,#0x14]
 ...
 0x00000650 B082      DCW      0xB082 < the crap
 0x00000652 4601      DCW      0x4601
 0x00000654 9001      DCW      0x9001
 0x00000656 0600      DCW      0x0600
 0x00000658 2203      DCW      0x2203

 Is this an error on ldc's part or do i need to do something more to
 get the correct result?

 I compiled with:
 ldc2.exe -output-o -march=thumb -mcpu=cortex-m0plus -mtriple=arm-linux
 -g -c main.d start.d MKL25Z4.d
 and linked with:
 arm-unknown-linux-gnueabi-ld.exe -T mkl25z4.ld --gc-sections  start.o
 MKL25Z4.o main.o  -o ldoutput.elf


 And yes there is a little bit of information missing i've uploaded
 some files here.

 https://github.com/taylorh140/DLANG_FRDM_KL25Z

 I'm just trying to get more comfortable with working on
 microcontrollers with d. The good news is that I got a simple example
 running. An led blinking three colors!  I would like advice on the
 best way to handle direct interaction with registers, but maybe that's
 best for another post.
Taylor, this should help:

Change your function to a template, and then it will get inlined (add the
extra, empty parens).

uint32_t SIM_SOPT2_TPMSRC()(uint32_t x)  {
  return ((x << SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);
}

I think LDC only inlines regular functions when they are in the same
module.  But templates get inlined across modules.

Make sure you are using LDC ltsmaster branch since it has some needed
improvements for ARM, although thumb is not well tested except on iOS
(but that is a different fork).
--
Dan
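A self-contained sketch of the empty-parens template trick, runnable on a host machine. The SHIFT and MASK values below are assumptions chosen for illustration; the real ones come from MKL25Z4.d. The enum line additionally shows that D can evaluate such a helper at compile time, which is the closest analogue to what the C macro does with a constant argument.

```d
import core.stdc.stdint : uint32_t;

// Assumed values for illustration only; use the definitions from MKL25Z4.d.
enum uint32_t SIM_SOPT2_TPMSRC_SHIFT = 24;
enum uint32_t SIM_SOPT2_TPMSRC_MASK  = 0x0300_0000;

// The empty template parameter list makes this a function template, so it
// is instantiated in every module that calls it and the optimizer can see
// the body there.
uint32_t SIM_SOPT2_TPMSRC()(uint32_t x)
{
    return (x << SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK;
}

void main()
{
    // Called exactly like the non-template version.
    uint32_t value = SIM_SOPT2_TPMSRC(1);
    assert(value == 0x0100_0000);

    // Initializing an enum forces compile-time evaluation, so no call
    // remains in the generated code for this constant.
    enum uint32_t atCompileTime = SIM_SOPT2_TPMSRC(3);
    static assert(atCompileTime == 0x0300_0000);
}
```

For constant arguments the enum form sidesteps the inlining question entirely; for run-time arguments the template still relies on the optimizer.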
Mar 13 2016
parent Dan Olson <gorox comcast.net> writes:
Dan Olson <gorox comcast.net> writes:

 Taylor Hillegeist <taylorh140 gmail.com> writes:

 I'm just trying to get more comfortable with working on
 microcontrollers with d. The good news is that I got a simple example
 running. An led blinking three colors!
That is cool!
 Taylor, this should help:

 Change your function to a template, and then it will get inlined (add the
 extra, empty parens).

 uint32_t SIM_SOPT2_TPMSRC()(uint32_t x)  {
   return ((x << SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);
 }

 I think LDC only inlines regular functions when they are in the same
 module.  But templates get inlined across modules.

 Make sure you are using LDC ltsmaster branch since it has some needed
 improvements for ARM, although thumb is not well tested except on iOS
 (but that is a different fork).
Oh, and compile with optimizer -O to get template inlining.
Mar 13 2016
prev sibling parent reply Kai Nacke <kai redstar.de> writes:
On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist wrote:
 So here: https://dlang.org/pretod.html
 there is a lovely section about how doing normal functions is 
 basically a processor macro.

 The C Preprocessor Way:
 #define X(i)	((i) = (i) / 3)

 The D Way:
 int X(ref int i) { return i = i / 3; }

 However I am running into troubles with the truthfulness of 
 this on LDC for thumb cortex-m0plus.

 MKL25Z.d
 line 5243:
 uint32_t SIM_SOPT2_TPMSRC(uint32_t x)  {return ((x << 
 SIM_SOPT2_TPMSRC_SHIFT) & SIM_SOPT2_TPMSRC_MASK);}

 start.d
 line 28:     int value = SIM_SOPT2_TPMSRC(1);
 0x0000046E 2001      MOVS     r0,#0x01
 0x00000470 9004      STR      r0,[sp,#0x10]
 0x00000472 F000F8ED  BL.W     MKL25Z4.SIM_SOPT2_TPMSRC 
 (0x00000650) //branch to crap
 0x00000476 9005      STR      r0,[sp,#0x14]
 ...
 0x00000650 B082      DCW      0xB082 < the crap
 0x00000652 4601      DCW      0x4601
 0x00000654 9001      DCW      0x9001
 0x00000656 0600      DCW      0x0600
 0x00000658 2203      DCW      0x2203

 Is this an error on ldc's part or do i need to do something 
 more to get the correct result?

 I compiled with:
 ldc2.exe -output-o -march=thumb -mcpu=cortex-m0plus 
 -mtriple=arm-linux -g -c main.d start.d MKL25Z4.d
 and linked with:
 arm-unknown-linux-gnueabi-ld.exe -T mkl25z4.ld --gc-sections  
 start.o MKL25Z4.o main.o  -o ldoutput.elf


 And yes there is a little bit of information missing i've 
 uploaded some files here.

 https://github.com/taylorh140/DLANG_FRDM_KL25Z

 I'm just trying to get more comfortable with working on 
 microcontrollers with d. The good news is that I got a simple 
 example running. An led blinking three colors!  I would like 
 advice on the best way to handle direct interaction with 
 registers, but maybe that's best for another post.
Hi Taylor,

I compiled your code to assembly. I get in start.s:

    .loc 1 28 5
    movs r0, #1
    str r0, [sp, #16]
    bl _D7MKL25Z416SIM_SOPT2_TPMSRCFkZk

and in MKL25Z4.s:

_D7MKL25Z416SIM_SOPT2_TPMSRCFkZk:
.Lfunc_begin197:
    .loc 1 5245 0
    .fnstart
    .cfi_startproc
    .loc 1 5245 10 prologue_end
    sub sp, #8
.Ltmp966:
    .cfi_def_cfa_offset 8

Looks good so far (except for the missing inlining). I did not try to link.

BTW: Which version do you use? I used the current version from ltsmaster:

LDC - the LLVM D compiler (c936f8):
  based on DMD v2.068.2 and LLVM 3.7.1
  Default target: armv7a-hardfloat-linux-gnueabi
  Host CPU: krait
  http://dlang.org - http://wiki.dlang.org/LDC

Regards,
Kai
Mar 13 2016
parent reply Taylor Hillegeist <taylorh140 gmail.com> writes:
On Sunday, 13 March 2016 at 14:43:27 UTC, Kai Nacke wrote:
 On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist 
 BTW: Which version do you use? I used current vesion from 
 ltsmaster:
I used the latest windows binary.

LDC - the LLVM D compiler (1.0.0-alpha1):
  based on DMD v2.069.2 and LLVM 3.7.1
  Default target: x86_64-pc-windows-msvc
  Host CPU: ivybridge
  http://dlang.org - http://wiki.dlang.org/LDC
Mar 13 2016
parent reply Taylor Hillegeist <taylorh140 gmail.com> writes:
On Sunday, 13 March 2016 at 21:54:16 UTC, Taylor Hillegeist wrote:
 On Sunday, 13 March 2016 at 14:43:27 UTC, Kai Nacke wrote:
 On Sunday, 13 March 2016 at 06:15:34 UTC, Taylor Hillegeist 
 BTW: Which version do you use? I used current vesion from 
 ltsmaster:
Could this be an issue of using --gc-sections for linking? I have quite a few symbols that need to be mapped with this not present.
Mar 13 2016
parent reply Kagamin <spam here.lot> writes:
On Monday, 14 March 2016 at 04:08:14 UTC, Taylor Hillegeist wrote:
 Could this be an issue of using --gc-sections for linking? I 
 have quite a few symbols that need to be mapped with this not 
 present.
For small projects I compile the entire codebase into bitcode and then compile it with llc or llvm-lto. This way LLVM handles inlining, and llvm-lto does even more :) I would recommend it.
Mar 14 2016
parent reply David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
On 14 Mar 2016, at 13:14, Kagamin via digitalmars-d-ldc wrote:
 For small projects I compile entire codebase into bitcode and then 
 compile it with llc or llvm-lto. This way llvm handles inlining and 
 llvm-lto does even more :) I would recommend it.
More or less equivalently, you can also use -singleobj (which is the default when using ldmd2). The only difference might be in the tuning parameters of various optimization passes. — David
Mar 14 2016
parent Taylor Hillegeist <taylorh140 gmail.com> writes:
On Monday, 14 March 2016 at 13:39:41 UTC, David Nadlinger wrote:
 On 14 Mar 2016, at 13:14, Kagamin via digitalmars-d-ldc wrote:
 For small projects I compile entire codebase into bitcode and 
 then compile it with llc or llvm-lto. This way llvm handles 
 inlining and llvm-lto does even more :) I would recommend it.
So I think the problem might be something a bit bigger. Not quite sure yet. I know just enough about the cortex-m0+ to be slightly familiar with it. What I am seeing is a disagreement between the disassembly of the cortex-m0+ code by gcc objdump and by the uvision disassembler.

objdump:

00000552 <_D7MKL25Z410SIM_MemMap7SOTP2_t6TPMSRCMFNdhZv>:
 552: 22c0  movs r2, #192 ; 0xc0
 554: 43d2  mvns r2, r2
 556: 6803  ldr r3, [r0, #0]
 558: 4013  ands r3, r2
 55a: 0189  lsls r1, r1, #6
 55c: b2c9  uxtb r1, r1
 55e: 4319  orrs r1, r3
 560: 6001  str r1, [r0, #0]
 562: 4770  bx lr

uvision:

SOTP2_t::MKL25Z4.SIM_MemMap.SOTP2_t.TPMSRC:
0x00000552 22C0      DCW      0x22C0
0x00000554 43D2      DCW      0x43D2
0x00000556 6803      DCW      0x6803
0x00000558 4013      DCW      0x4013
0x0000055A 0189      DCW      0x0189
0x0000055C B2C9      DCW      0xB2C9
0x0000055E 4319      DCW      0x4319
0x00000560 6001      DCW      0x6001
0x00000562 4770      DCW      0x4770
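For readers following the two listings: the halfwords at each address are the same in both dumps (22C0, 43D2, 6803, and so on), so the tools differ only in whether they decode those halfwords as Thumb instructions or print them as data words. Reading the objdump side, the setter clears bits 6-7 of the word, shifts the byte argument left by 6, truncates it to 8 bits, and ORs it back in. A rough D sketch of a property with that behaviour follows. The struct name and the field name `raw` are hypothetical, the real SOTP2_t layout is in the linked repository and may differ, and a real register access would also need to be volatile.

```d
// Hypothetical stand-in for SOTP2_t; `raw` is an assumed field name.
struct Sopt2Sketch
{
    uint raw;

    // Mirrors the listing: AND with ~0xC0, shift left by 6,
    // truncate to a byte (uxtb), OR the two together, store.
    @property void TPMSRC(ubyte v)
    {
        raw = (raw & ~0xC0u) | ((v << 6) & 0xFF);
    }
}

void main()
{
    auto r = Sopt2Sketch(0xFFFF_FFFF);

    r.TPMSRC = 1;                    // bits 7..6 become 01
    assert(r.raw == 0xFFFF_FF7F);

    r.TPMSRC = 3;                    // bits 7..6 become 11
    assert(r.raw == 0xFFFF_FFFF);
}
```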
Mar 14 2016