digitalmars.D - newCTFE Status July 2017

reply Stefan Koch <uplink.coder googlemail.com> writes:
Hi Guys,

It has been suggested that I open a new thread every month.
I am happy to do that as it will make documenting the progress in 
newCTFE easier.

Let me start with a funny error I just fixed :)
The following code:

static assert(() { return ulong(ushort.max | ulong(ushort.max) << 32); }() == 281470681808895LU);

resulted in the following error:

static assert  (*function () => 281470681808895LU)() == 281470681808895LU is false

because the computed value was 65535UL,
which is the expected value truncated to 32 bits.
The reason for this was that the interpreter was initially built 
for 32-bit values.
On the switch-over to 64-bit support I added the ability to work 
with 64-bit values.
However, since I did not want to bloat the instructions beyond 
64 bits, a 64-bit immediate Set (which is equivalent to a movq) is 
encoded as two instructions:

Set(lhs, imm32(imm64Value & uint.max));
SetHigh(lhs, imm32(imm64Value >> 32));

However, I forgot to issue the SetHigh,
which resulted in the observed behavior.
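
The two-instruction encoding can be modelled in a few lines of D. This is only a minimal sketch: Register, set, setHigh and the two loadImm64 functions are hypothetical stand-ins, not the actual newCTFE encoders. It shows why leaving out the high half yields 65535 for the constant from the failing test.

```d
// Hypothetical model of a 64-bit register that is written in two 32-bit halves.
struct Register
{
    ulong value;
}

// Models Set: writes the low 32 bits and clears the high 32 bits.
void set(ref Register r, uint imm32)
{
    r.value = imm32;
}

// Models SetHigh: keeps the low 32 bits and replaces the high 32 bits.
void setHigh(ref Register r, uint imm32)
{
    r.value = (r.value & uint.max) | (cast(ulong) imm32 << 32);
}

// Correct sequence: both halves are issued.
ulong loadImm64(ulong imm64)
{
    Register r;
    set(r, cast(uint)(imm64 & uint.max));
    setHigh(r, cast(uint)(imm64 >> 32));
    return r.value;
}

// Buggy sequence: the SetHigh is forgotten.
ulong loadImm64WithoutSetHigh(ulong imm64)
{
    Register r;
    set(r, cast(uint)(imm64 & uint.max));
    return r.value;
}

static assert(loadImm64(281470681808895UL) == 281470681808895UL);
static assert(loadImm64WithoutSetHigh(281470681808895UL) == 65535UL);
```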


Funnily enough, I found this issue because I suspected an ABI bug in 
how ulong and long struct members are handled.
And sure enough, on top of the bug described above there was an 
ABI bug as well,
which is now fixed too.

While these bugs are lurking, the development of the 
ctfe-debugger is impeded.
Jul 13
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

I've just added stack-map support to my debug output.
Although this is purely internal, I am proud enough of the improvement to post it.

The following is the bytecode for part of a foreach-loop.

Printout without stackMap:

290: Line #21
292: Set SP[208], #0
294: Set SP[212], SP[20]
296: Lt SP[208], SP[212]
298: Flg SP[216]
300: JmpZ SP[216], &358
302: Set SP[220], SP[208]
304: Line #23

Printout with stackMap:

290: Line #21
292: Set __key64, #0
294: Set __limit65, resultLength
296: Lt __key64, __limit65
298: Flg SP[216]
300: JmpZ SP[216], &358
302: Set i, __key64
304: Line #23

I really wish I had added the stackMap 6 months ago :)
I really don't miss the heaps of paper with handwritten address-to-name translation tables.

Cheers,
Stefan
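
For readers wondering what such a stack map amounts to, here is a rough sketch in D. It assumes the map is simply a lookup from stack offset to variable name; StackMap and operandName are hypothetical names, and the real newCTFE data structure may well look different.

```d
import std.format : format;

// Hypothetical stack map: stack offset -> source-level name.
alias StackMap = string[uint];

// Returns the variable name for a stack offset, or the raw SP[offset] form
// when the offset has no entry (for example, an unnamed temporary).
string operandName(StackMap stackMap, uint offset)
{
    if (auto name = offset in stackMap)
        return *name;
    return format("SP[%d]", offset);
}

unittest
{
    // Offsets and names taken from the two printouts in the post.
    StackMap stackMap;
    stackMap[20] = "resultLength";
    stackMap[208] = "__key64";
    stackMap[212] = "__limit65";
    stackMap[220] = "i";

    assert(operandName(stackMap, 208) == "__key64");
    assert(operandName(stackMap, 216) == "SP[216]"); // temporary, no name
}
```

The fallback to SP[offset] matches the printout, where the unnamed flag temporary at offset 216 is still shown by address.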
Jul 14
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

fresh ABI bugs!
Even with my improved debugging facilities they are still a pain in the neck.
See for yourself: https://www.youtube.com/watch?v=eX93aWDmiqE
Jul 14
prev sibling next sibling parent reply Dmitry <dmitry indiedev.ru> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 ...
Hi. Do you have a public roadmap (or something like it) for newCTFE? It would be useful to see what is planned, what is finished, etc.
Jul 15
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 15 July 2017 at 07:50:28 UTC, Dmitry wrote:
 On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 ...
Hi. Do you have a public roadmap (or something like it) for newCTFE? It would be useful to see what is planned, what is finished, etc.
All I have are the CTFE status threads, where one can see what I consider a working feature set.

What is planned is simple to state:
"Re-implement the full functionality of the current ctfe-interpreter in the new IR-based system"

It is hard for me to say in advance which features will be working, or indeed regress, next,
since the interactions between features form a complex system, which by definition is very hard to predict.

My current goal is to make complex structs work, meaning structs with other structs inside them.
For example:

struct SimpleStruct
{
    float c;
    uint a;
    ulong b;
}

struct complexStruct
{
    SimpleStruct[2] a;
    complexStruct* next;
}

The complex struct will currently produce a valid IR type, but will generate invalid code, since the ABI is currently in the process of changing and different parts of the code will interpret the data differently.

On top of that I have to rebuild a few of the expression handling routines to switch from the old reference-based ABI to the new hybrid ABI.
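
To make the goal concrete, here is a small sketch of the kind of compile-time code that such nested structs imply. The two struct definitions are the ones from the post; sumB and buildAndSum are hypothetical helpers made up for illustration, and the static assert is evaluated by whichever CTFE engine the compiler in use provides.

```d
struct SimpleStruct
{
    float c;
    uint a;
    ulong b;
}

struct complexStruct
{
    SimpleStruct[2] a;
    complexStruct* next;
}

// Hypothetical helper: sums the b members of every SimpleStruct
// reachable from head by following the next pointers.
ulong sumB(complexStruct* head)
{
    ulong sum = 0;
    for (auto node = head; node !is null; node = node.next)
    {
        foreach (ref element; node.a)
            sum += element.b;
    }
    return sum;
}

// Hypothetical helper: builds a two-node list and sums it.
ulong buildAndSum()
{
    auto tail = new complexStruct([SimpleStruct(1.0f, 1, 10), SimpleStruct(2.0f, 2, 20)], null);
    auto head = new complexStruct([SimpleStruct(3.0f, 3, 30), SimpleStruct(4.0f, 4, 40)], tail);
    return sumB(head);
}

// Forces evaluation at compile time: 10 + 20 + 30 + 40.
static assert(buildAndSum() == 100);
```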
Jul 15
parent reply Tourist <gravatar gravatar.com> writes:
On Saturday, 15 July 2017 at 09:02:02 UTC, Stefan Koch wrote:
 On Saturday, 15 July 2017 at 07:50:28 UTC, Dmitry wrote:
 ...
All I have are the CTFE status threads, where one can see what I consider a working feature set. What is planned is simple to state: "Re-implement the full functionality of the current ctfe-interpreter in the new IR-based system" It is hard for me to say in advance which features will be working, or indeed regress, next, since the interactions between features form a complex system, which by definition is very hard to predict. ...
It would indeed be nice to have a GitHub issue (or similar) with progress checkboxes of what works, what's in progress, and what is yet to be done.
Jul 15
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 15 July 2017 at 11:31:30 UTC, Tourist wrote:
 On Saturday, 15 July 2017 at 09:02:02 UTC, Stefan Koch wrote:
 On Saturday, 15 July 2017 at 07:50:28 UTC, Dmitry wrote:
 ...
It would indeed be nice to have a GitHub issue (or similar) with progress checkboxes of what works, what's in progress, and what is yet to be done.
Here is the current newCTFE test: /UplinkCoder/dmd/blob/newCTFE_on_master/test/compilable/ctfeTest.d Everything in there is supported by newCTFE, modulo regressions.
Jul 15
parent reply Tourist <gravatar gravatar.com> writes:
On Saturday, 15 July 2017 at 17:43:04 UTC, Stefan Koch wrote:
 On Saturday, 15 July 2017 at 11:31:30 UTC, Tourist wrote:
 On Saturday, 15 July 2017 at 09:02:02 UTC, Stefan Koch wrote:
 On Saturday, 15 July 2017 at 07:50:28 UTC, Dmitry wrote:
 ...
It would indeed be nice to have a GitHub issue (or similar) with progress checkboxes of what works, what's in progress, and what is yet to be done.
Here is the current newCTFE test: /UplinkCoder/dmd/blob/newCTFE_on_master/test/compilable/ctfeTest.d Everything in there is supported by newCTFE, modulo regressions.
Nice, but it's not as clear, and doesn't specify what's left to be done.
Jul 15
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 15 July 2017 at 21:35:30 UTC, Tourist wrote:
 Nice, but it's not as clear, and doesn't specify what's left to 
 be done.
Features left to be done are:
- && and ||
- multi-dimensional slices/arrays
- associative arrays
- complex structs
- classes
- unions
- unicode support.
Jul 16
parent jmh530 <john.michael.hall gmail.com> writes:
On Sunday, 16 July 2017 at 10:41:56 UTC, Stefan Koch wrote:
 On Saturday, 15 July 2017 at 21:35:30 UTC, Tourist wrote:
 Nice, but it's not as clear, and doesn't specify what's left 
 to be done.
Features left to be done are:
- && and ||
- multi-dimensional slices/arrays
- associative arrays
- complex structs
- classes
- unions
- unicode support.
You could start next month's thread with features implemented and then this list of features remaining.
Jul 17
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
I fixed my struct ABI issues for now.

The problem was the lowering of the IR types.
Since everything used to be expressible as a pointer, it was fine to convert everything to integers and treat them as pointers.
Since structs are treated as value types now, we can no longer just point to them.
However, due to the premature conversion we would stick random integers holding the pointer values inside the struct.

This is what should happen:

*(cast(struct_t*) targetMemory) = *(cast(struct_t*) sourceMemory);

And this is what actually did happen:

*(cast(size_t*) targetMemory) = cast(size_t) sourceMemory;

Due to bizarre coincidences this wrong behavior would actually pass a few tests which were supposed to catch this mistake :)

Long story short: this bug is now fixed.
I am assuming that there are still a few issues with arrayLiterals, but I will fix those another day.
I am spent.

Have a nice day,
Stefan.
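
The difference between the two assignments can be reproduced in ordinary D. The sketch below is an illustration only: Pair stands in for struct_t, and copyStruct and storeAddress are hypothetical names for the intended and the faulty behaviour.

```d
// Hypothetical stand-in for struct_t.
struct Pair
{
    uint a;
    uint b;
    ulong c;
}

// Intended behaviour: copy the whole struct value.
void copyStruct(void* targetMemory, const(void)* sourceMemory)
{
    *(cast(Pair*) targetMemory) = *(cast(const(Pair)*) sourceMemory);
}

// Faulty behaviour: store the source address itself into the first
// size_t-sized slot of the target and leave the rest untouched.
void storeAddress(void* targetMemory, const(void)* sourceMemory)
{
    *(cast(size_t*) targetMemory) = cast(size_t) sourceMemory;
}

unittest
{
    auto source = Pair(1, 2, 3);
    Pair good, bad;

    copyStruct(&good, &source);
    storeAddress(&bad, &source);

    assert(good == source);
    // The first bytes of bad now hold a pointer value, not field data.
    assert(*cast(size_t*) &bad == cast(size_t) &source);
    // The field at offset 8 was never written.
    assert(bad.c == 0);
}
```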
Jul 15
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
I just figured out the bug in test/runnable/template8.d

What happens somewhere inside those templates is that the following expression is executed:

"some string literal" ~ null;

When that happens, we execute the equivalent of the following in bytecode:

{
    size_t lhsorrhs = lhs | rhs;
    if (!lhsorrhs)
        return null; // needed to handle the special case null ~ null == null;

    immutable elemSize = elementSize(lhs);
    // can be assumed to be the same as rhs
    // sema would have complained otherwise

    int newSize = 0;
    int lhsSize = lhs ? getLength(lhs) * elemSize : 0;
    int rhsSize = rhs ? getLength(rhs) * elemSize : 0;
    newSize += lhsSize;
    newSize += (getLength(rhs) * elemSize);

    void* newString = allocateHeap(newSize + SliceDescriptor.sizeof);
    auto sliceDesc = cast(SliceDescriptor*) newString;
    sliceDesc.base = newString + SliceDescriptor.sizeof;
    sliceDesc.length = newSize / elemSize;
    newString += SliceDescriptor.sizeof;

    memcpy(newString, lhs, lhsSize);
    memcpy(newString + lhsSize, rhs, rhsSize);
}

Now what happens if either lhs OR rhs is null, but not both?
Right: a null pointer dereference.
And this is what happened here.

Why did it take so long to find?
Well, please scan the test https://github.com/dlang/dmd/blob/master/test/runnable/template8.d yourself and tell me where you see "something" ~ null :)
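
A null-safe variant of that concatenation might look like the sketch below. It is not the actual fix: it is ordinary D rather than bytecode, it assumes a simplified SliceDescriptor layout, and it uses the garbage-collected heap in place of the interpreter's allocateHeap. The point it illustrates is that each operand is only dereferenced after a null check, so the sizes computed under the guard are the only ones used.

```d
// Hypothetical, simplified descriptor for a slice on the interpreter heap.
struct SliceDescriptor
{
    ubyte* base;
    size_t length; // number of elements, not bytes
}

// Null-safe concatenation: a null operand is treated as an empty slice.
SliceDescriptor* concat(const(SliceDescriptor)* lhs,
                        const(SliceDescriptor)* rhs, size_t elemSize)
{
    if (lhs is null && rhs is null)
        return null; // null ~ null == null

    // Only dereference an operand after checking that it is not null.
    immutable size_t lhsSize = lhs ? lhs.length * elemSize : 0;
    immutable size_t rhsSize = rhs ? rhs.length * elemSize : 0;

    auto data = new ubyte[](lhsSize + rhsSize);
    if (lhsSize)
        data[0 .. lhsSize] = lhs.base[0 .. lhsSize];
    if (rhsSize)
        data[lhsSize .. $] = rhs.base[0 .. rhsSize];

    return new SliceDescriptor(data.ptr, (lhsSize + rhsSize) / elemSize);
}

unittest
{
    ubyte[] text = [1, 2, 3];
    auto lhs = SliceDescriptor(text.ptr, text.length);

    auto joined = concat(&lhs, null, 1); // the "literal" ~ null case
    assert(joined.length == 3);
    assert(joined.base[0 .. 3] == text);
}
```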
Jul 17
prev sibling next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

I have good news and bad news.

The good news: newCTFE just compiled my own version of std.bitmanip.bitfields, dubbed fastFields, which can be seen here:
https://gist.github.com/UplinkCoder/b3501425a4fb4992c6cf1c77d6c3638a

The bad news: it miscompiles it.
The generated code is bogus :)

Fixing this could take a while, because even with my improved debugging tools it's still over 3k instructions to look through.
It does not help that over 300 temporaries are allocated.

-- Stefan
Jul 18
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Jul 18, 2017 at 07:08:49PM +0000, Stefan Koch via Digitalmars-d wrote:
[...]
 The bad news:
 It miscompiles it.
 
 The generated code is bogus :)
 
 Fixing this could take a while.
 Because even with my improved debugging tools it's still over 3k of
 instructions to look through.
 
 It does not help that over 300 temporaries are allocated.
[...] Shouldn't there be a way to reduce the test case so that you don't have to look through 300 temporaries? T -- Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
Jul 18
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 18 July 2017 at 19:11:37 UTC, H. S. Teoh wrote:
 Shouldn't there be a way to reduce the test case so that you 
 don't have to look through 300 temporaries?
Yes. However, there is no automated way to reduce it. So to find the source code which actually leads to the mis-compiled part, I still have to consider an awful lot of code :)
Jul 18
parent Seb <seb wilzba.ch> writes:
On Tuesday, 18 July 2017 at 19:23:56 UTC, Stefan Koch wrote:
 On Tuesday, 18 July 2017 at 19:11:37 UTC, H. S. Teoh wrote:
 Shouldn't there be a way to reduce the test case so that you 
 don't have to look through 300 temporaries?
Yes. However, there is no automated way to reduce it. So to find the source code which actually leads to the mis-compiled part, I still have to consider an awful lot of code :)
Why can't you use DustMite? It's an amazing tool! https://github.com/CyberShadow/DustMite/wiki
Jul 18
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

The following code compiles now and runs in very reasonable time, even for unreasonable repeat counts.

string repeatString(string s, uint repeatCount)
{
    char[] result;
    uint sLength = cast(uint) s.length;

    result.length = sLength * repeatCount;
    uint p1 = 0;
    uint p2 = sLength;

    foreach (rc; 0 .. repeatCount)
    {
        result[p1 .. p2] = s[0 .. sLength];
        p1 += sLength;
        p2 += sLength;
    }

    return cast(string) result;
}

Note: a version that uses `result ~= s` runs much, much slower and will soon run out of the 16M heapMemory.

static assert(
    "o_o ".repeatString(24) ==
    `o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o o_o `
);
Jul 20
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

Another two work-filled days went by.
And here is the fruit of the labor:

int[2][3] split(int[6] a)
{
    int[2][3] result;
    foreach (i; 0 .. typeof(result[0]).length)
    {
        foreach (j; 0 .. result.length)
        {
            auto idx = i*result.length + j;
            result[j][i] = a[idx];
        }
    }
    return result;
}

static assert(split([1,2,3,4,5,6]) == [[1, 4], [2, 5], [3, 6]]);

The above code does now work at ctfe,
meaning we are well on our way to supporting multidimensional arrays.
Multidimensional slices, however, are a little trickier.
Jul 22
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

Due to improved ABI handling, a subset of slices of complex structs works now :)

The following code will correctly compile with newCTFE.

struct NA
{
    string name;
    uint age;
}

NA[] make5(string name)
{
    NA[] result;

    foreach (i; 1 .. 6)
    {
        string nameN = name ~ [cast(immutable char)('0' + (i % 10))];
        result ~= [NA(nameN, i-1)];
    }
    return result;
}

static assert(make5("Tony") == [NA("Tony1", 0u), NA("Tony2", 1u), NA("Tony3", 2u), NA("Tony4", 3u), NA("Tony5", 4u)]);

As soon as the foreach-loop is iterated more than 1000 times you will see a 7-10x speed improvement :)
Jul 22
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

I am currently fixing multi-dimensional arrays as outer parameters.
So the following does not work:

uint sumXd(uint[2][2]) { ... bla bla ... }

pragma(msg, sumXd([[2,4],[4,7]]));

This is pretty tricky, since we have the constraint of representing slices and arrays with the same ABI.
So far I have worked 15 hours on this issue and it looks like it's going to be a lot more :(
Jul 23
prev sibling next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

I managed to hack around the issue of multi-dimensional static array parameters,
making the following code work in newCTFE:

int[] fold(int[4][3] a)
{
    int[] result;
    result.length = 4 * 3;
    int pos;
    foreach (i; 0 .. 3)
    {
        foreach (j; 0 .. 4)
        {
            result[pos++] = a[i][j];
        }
    }
    return result;
}

static assert(fold([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]) ==
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);

Enjoy!
Jul 24
parent Olivier FAURE <olivier.faure epitech.eu> writes:
On Monday, 24 July 2017 at 11:17:23 UTC, Stefan Koch wrote:
 static assert (fold ([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 
 12]]) ==
               [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);

 Enjoy!
I barely have any idea what any of this means, but it looks really cool. Keep up the good work!
Jul 24
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys,

I just fixed another bug that had me puzzled for a while.
The following function

ulong swap(ulong val)
{
    ulong result;

    result |= (val & 0xFF) << 56;
    result |= (val & 0xFF00) << 40;
    result |= (val & 0xFF00_00) << 24;
    result |= (val & 0xFF00_0000) << 8;

    result |= (val & 0xFF00_0000_00) >> 8;
    result |= (val & 0xFF00_0000_0000) >> 24;
    result |= (val & 0xFF00_0000_0000_00) >> 40;
    result |= (val & 0xFF00_0000_0000_0000) >> 56;

    return result;
}

would return strange values.

pragma(msg, swap(0xABCD_EF01_2345_6789));
// returns 9900958322423496704
// expected 9900958322455989675

On further inspection, the actual return value has only zeros in the lower 32 bits.

The generated IR, however, looks completely correct.

Initialize();
Line(1);
auto val_1 = genParameter(BCType(BCTypeEnum.i64));//SP[4]
beginFunction(0);//swap
Line(2);
Line(3);
auto result_1 = genLocal(BCType(BCTypeEnum.i64), "result");//SP[8]
Set(result_1, BCValue(Imm64(0)));
Line(5);
auto tmp1 = genTemporary(BCType(BCTypeEnum.i64));//SP[16]
auto tmp2 = genTemporary(BCType(BCTypeEnum.i64));//SP[24]
And3(tmp2, val_1, BCValue(Imm64(255)));
auto tmp3 = genTemporary(BCType(BCTypeEnum.i32));//SP[32]
Le3(tmp3, BCValue(Imm32(56)), BCValue(Imm32(63)));
auto tmp4 = genTemporary(BCType(BCTypeEnum.i32));//SP[36]
Set(tmp4, BCValue(Imm32(56)));
auto tmp5 = genTemporary(BCType(BCTypeEnum.i32));//SP[40]
Set(tmp5, BCValue(Imm32(63)));
Assert(tmp3, Imm32(1) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(56)), BCValue(Imm32(63))*/;);
Lsh3(tmp1, tmp2, BCValue(Imm32(56)));
Or3(result_1, result_1, tmp1);
Line(6);
auto tmp6 = genTemporary(BCType(BCTypeEnum.i64));//SP[44]
auto tmp7 = genTemporary(BCType(BCTypeEnum.i64));//SP[52]
And3(tmp7, val_1, BCValue(Imm64(65280)));
auto tmp8 = genTemporary(BCType(BCTypeEnum.i32));//SP[60]
Le3(tmp8, BCValue(Imm32(40)), BCValue(Imm32(63)));
auto tmp9 = genTemporary(BCType(BCTypeEnum.i32));//SP[64]
Set(tmp9, BCValue(Imm32(40)));
auto tmp10 = genTemporary(BCType(BCTypeEnum.i32));//SP[68]
Set(tmp10, BCValue(Imm32(63)));
Assert(tmp8, Imm32(2) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(40)), BCValue(Imm32(63))*/;);
Lsh3(tmp6, tmp7, BCValue(Imm32(40)));
Or3(result_1, result_1, tmp6);
Line(7);
auto tmp11 = genTemporary(BCType(BCTypeEnum.i64));//SP[72]
auto tmp12 = genTemporary(BCType(BCTypeEnum.i64));//SP[80]
And3(tmp12, val_1, BCValue(Imm64(16711680)));
auto tmp13 = genTemporary(BCType(BCTypeEnum.i32));//SP[88]
Le3(tmp13, BCValue(Imm32(24)), BCValue(Imm32(63)));
auto tmp14 = genTemporary(BCType(BCTypeEnum.i32));//SP[92]
Set(tmp14, BCValue(Imm32(24)));
auto tmp15 = genTemporary(BCType(BCTypeEnum.i32));//SP[96]
Set(tmp15, BCValue(Imm32(63)));
Assert(tmp13, Imm32(3) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(24)), BCValue(Imm32(63))*/;);
Lsh3(tmp11, tmp12, BCValue(Imm32(24)));
Or3(result_1, result_1, tmp11);
Line(8);
auto tmp16 = genTemporary(BCType(BCTypeEnum.i64));//SP[100]
auto tmp17 = genTemporary(BCType(BCTypeEnum.i64));//SP[108]
And3(tmp17, val_1, BCValue(Imm64(4278190080)));
auto tmp18 = genTemporary(BCType(BCTypeEnum.i32));//SP[116]
Le3(tmp18, BCValue(Imm32(8)), BCValue(Imm32(63)));
auto tmp19 = genTemporary(BCType(BCTypeEnum.i32));//SP[120]
Set(tmp19, BCValue(Imm32(8)));
auto tmp20 = genTemporary(BCType(BCTypeEnum.i32));//SP[124]
Set(tmp20, BCValue(Imm32(63)));
Assert(tmp18, Imm32(4) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(8)), BCValue(Imm32(63))*/;);
Lsh3(tmp16, tmp17, BCValue(Imm32(8)));
Or3(result_1, result_1, tmp16);
Line(10);
auto tmp21 = genTemporary(BCType(BCTypeEnum.i64));//SP[128]
auto tmp22 = genTemporary(BCType(BCTypeEnum.i64));//SP[136]
And3(tmp22, val_1, BCValue(Imm64(1095216660480)));
auto tmp23 = genTemporary(BCType(BCTypeEnum.i32));//SP[144]
Le3(tmp23, BCValue(Imm32(8)), BCValue(Imm32(63)));
auto tmp24 = genTemporary(BCType(BCTypeEnum.i32));//SP[148]
Set(tmp24, BCValue(Imm32(8)));
auto tmp25 = genTemporary(BCType(BCTypeEnum.i32));//SP[152]
Set(tmp25, BCValue(Imm32(63)));
Assert(tmp23, Imm32(5) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(8)), BCValue(Imm32(63))*/;);
Rsh3(tmp21, tmp22, BCValue(Imm32(8)));
Or3(result_1, result_1, tmp21);
Line(11);
auto tmp26 = genTemporary(BCType(BCTypeEnum.i64));//SP[156]
auto tmp27 = genTemporary(BCType(BCTypeEnum.i64));//SP[164]
And3(tmp27, val_1, BCValue(Imm64(280375465082880)));
auto tmp28 = genTemporary(BCType(BCTypeEnum.i32));//SP[172]
Le3(tmp28, BCValue(Imm32(24)), BCValue(Imm32(63)));
auto tmp29 = genTemporary(BCType(BCTypeEnum.i32));//SP[176]
Set(tmp29, BCValue(Imm32(24)));
auto tmp30 = genTemporary(BCType(BCTypeEnum.i32));//SP[180]
Set(tmp30, BCValue(Imm32(63)));
Assert(tmp28, Imm32(6) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(24)), BCValue(Imm32(63))*/;);
Rsh3(tmp26, tmp27, BCValue(Imm32(24)));
Or3(result_1, result_1, tmp26);
Line(12);
auto tmp31 = genTemporary(BCType(BCTypeEnum.i64));//SP[184]
auto tmp32 = genTemporary(BCType(BCTypeEnum.i64));//SP[192]
And3(tmp32, val_1, BCValue(Imm64(71776119061217280)));
auto tmp33 = genTemporary(BCType(BCTypeEnum.i32));//SP[200]
Le3(tmp33, BCValue(Imm32(40)), BCValue(Imm32(63)));
auto tmp34 = genTemporary(BCType(BCTypeEnum.i32));//SP[204]
Set(tmp34, BCValue(Imm32(40)));
auto tmp35 = genTemporary(BCType(BCTypeEnum.i32));//SP[208]
Set(tmp35, BCValue(Imm32(63)));
Assert(tmp33, Imm32(7) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(40)), BCValue(Imm32(63))*/;);
Rsh3(tmp31, tmp32, BCValue(Imm32(40)));
Or3(result_1, result_1, tmp31);
Line(13);
auto tmp36 = genTemporary(BCType(BCTypeEnum.i64));//SP[212]
auto tmp37 = genTemporary(BCType(BCTypeEnum.i64));//SP[220]
And3(tmp37, val_1, BCValue(Imm64(18374686479671623680)));
auto tmp38 = genTemporary(BCType(BCTypeEnum.i32));//SP[228]
Le3(tmp38, BCValue(Imm32(56)), BCValue(Imm32(63)));
auto tmp39 = genTemporary(BCType(BCTypeEnum.i32));//SP[232]
Set(tmp39, BCValue(Imm32(56)));
auto tmp40 = genTemporary(BCType(BCTypeEnum.i32));//SP[236]
Set(tmp40, BCValue(Imm32(63)));
Assert(tmp38, Imm32(8) /*"shift by %d is outside the range 0..%d", BCValue(Imm32(56)), BCValue(Imm32(63))*/;);
Rsh3(tmp36, tmp37, BCValue(Imm32(56)));
Or3(result_1, result_1, tmp36);
Line(26);
Ret(result_1);
Line(27);
endFunction();
Line(28);
Finalize();

So what is going wrong here?

The 4 lines responsible for the lower 32 bits of the result have immediate and-instructions with constants larger than 32 bits, like:

And3(tmp37, val_1, BCValue(Imm64(18374686479671623680)));

which will cause the bytecode-interpreter to emit the immediate-bytecode instruction

And tmp37, (immvalue & 0xFFFF_FFFF)

which in this case will become

And tmp37, #0

(since the lower 32 bits of 18374686479671623680 are zero).

This is now fixed by pushing the 64-bit value into a register and issuing a reg-reg instruction.

So, you see, there is never a boring day in newCTFE-land :)

Cheers,
Stefan
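
The effect of the truncated immediate is easy to reproduce outside the interpreter. The following is a small sketch with hypothetical function names: one function models an And whose immediate is cut to 32 bits, the other models the fix of going through a full 64-bit register.

```d
// Hypothetical model of an And instruction whose immediate operand
// is only 32 bits wide: the upper half of the constant is lost.
ulong andWithTruncatedImmediate(ulong value, ulong immediate)
{
    return value & (immediate & 0xFFFF_FFFF);
}

// Hypothetical model of the fix: the full 64-bit constant is first
// placed in a register, then a register-register And is issued.
ulong andViaRegister(ulong value, ulong immediate)
{
    ulong maskRegister = immediate;
    return value & maskRegister;
}

enum ulong val = 0xABCD_EF01_2345_6789;
enum ulong mask = 0xFF00_0000_0000_0000; // 18374686479671623680

// The low 32 bits of the mask are zero, so the truncated form selects nothing.
static assert(andWithTruncatedImmediate(val, mask) == 0);
// With the full mask the top byte survives and lands in the low byte after the shift.
static assert((andViaRegister(val, mask) >> 56) == 0xAB);
```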
Jul 28
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hey Guys!

Working on the ctfe brainfuck compiler, I finally figured out what was going wrong.

Consider this:

uint fn()
{
    uint[1] pointlessArray = [0];
    foreach (_; 0 .. 42)
        pointlessArray[0]++;
    return pointlessArray[0];
}

static assert(fn() == 42);

Until a few minutes ago this would have failed and the output would have been 0.
Can you guess why?

Well, while ++array[0] would lower to a BinAssignExp (array[0] += 1), which does correctly deal with references, the postfix ++ is actually not lowered but is its own special expression,
which is handled with the following code:

auto expr = genExpr(e.e1); // in x++ expr is the x
if (!canWorkWithType(expr.type) || !canWorkWithType(retval.type))
{
    bailout("++ only i32 is supported not " ~ to!string(expr.type.type) ~ " -- " ~ e.toString);
    return;
}
assert(expr.vType != BCValueType.Immediate, "++ does not make sense as on an Immediate Value");

Set(retval, expr); // return a copy of the old value

// the following code adds one to the original value
if (expr.type.type == BCTypeEnum.f23)
{
    Add3(expr, expr, BCValue(Imm23f(1.0f)));
}
else if (expr.type.type == BCTypeEnum.f52)
{
    Add3(expr, expr, BCValue(Imm52f(1.0)));
}
else
{
    Add3(expr, expr, imm32(1));
}

Of course arr[x]++ will load the value into a temporary and add one to that temporary, never modifying the array.
Luckily I introduced a way for rmw (read-modify-write) operations to be done on structs and arrays a while back.
If the expr is not a normal local (i.e. a stack variable) it will have a heapRef, which tells you which pointer you have to write to in order to modify the actual value rather than just the temporary into which it was loaded.

So this was fixed by adding the following 3 lines:

if (expr.heapRef) {
    StoreToHeapRef(expr);
}

This will work for arrays, slices and structs alike :)
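
The same load-modify-forget pattern can be shown in plain D. This is only a sketch of the idea: LoadedValue, loadElement and the two postIncrement functions are hypothetical and merely mimic the role that heapRef and StoreToHeapRef play in the post.

```d
// Hypothetical stand-in for a value the interpreter has loaded into a temporary.
struct LoadedValue
{
    uint temporary; // copy of the value that the instructions operate on
    uint* heapRef;  // where the value came from, or null for a plain local
}

// Loads an element into a temporary and remembers its origin.
LoadedValue loadElement(uint[] array, size_t index)
{
    return LoadedValue(array[index], &array[index]);
}

// Old behaviour: only the temporary is incremented.
uint postIncrementTemporaryOnly(ref LoadedValue v)
{
    immutable old = v.temporary;
    v.temporary += 1;
    return old;
}

// Fixed behaviour: the new value is also stored back through heapRef.
uint postIncrementWithWriteBack(ref LoadedValue v)
{
    immutable old = v.temporary;
    v.temporary += 1;
    if (v.heapRef)
        *v.heapRef = v.temporary;
    return old;
}

unittest
{
    uint[1] broken = [0];
    uint[1] fixed = [0];
    foreach (_; 0 .. 42)
    {
        auto a = loadElement(broken[], 0);
        postIncrementTemporaryOnly(a);
        auto b = loadElement(fixed[], 0);
        postIncrementWithWriteBack(b);
    }
    assert(broken[0] == 0); // the array was never modified
    assert(fixed[0] == 42);
}
```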
Jul 30
prev sibling next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys, After getting the brainfuck to D transcompiler to work, I now made its output compatible with newCTFE. See it here: https://gist.github.com/UplinkCoder/002b31572073798897552af4e8de2024 Unfortunately the above code does seem to get mis-compiled, as it does not output Hello World, but rather:  
Jul 30
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sun, 30 Jul 2017 14:44:07 +0000,
Stefan Koch <uplink.coder googlemail.com> wrote:

 On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys, After getting the brainfuck to D transcompiler to work, I now made its output compatible with newCTFE. See it here: https://gist.github.com/UplinkCoder/002b31572073798897552af4e8de2024 Unfortunately the above code does seem to get mis-compiled, as it does not output Hello World, but rather: \x02\x01 \x02\x11
Funny, it is working and mis-compiling at the same time. I figure with such complex code, it is working if it ends up *printing anything* at all and not segfaulting. :)
--
Marco
Jul 31
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 31 July 2017 at 17:58:56 UTC, Marco Leise wrote:
 On Sun, 30 Jul 2017 14:44:07 +0000,
 Stefan Koch <uplink.coder googlemail.com> wrote:

 On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hi Guys, After getting the brainfuck to D transcompiler to work, I now made its output compatible with newCTFE. See it here: https://gist.github.com/UplinkCoder/002b31572073798897552af4e8de2024 Unfortunately the above code does seem to get mis-compiled, as it does not output Hello World, but rather:  
Funny, it is working and mis-compiling at the same time. I figure with such complex code, it is working if it ends up *printing anything* at all and not segfaulting. :)
I fixed the bug which caused this to miscompile; it works now at ctfe. This code is not really that complex, it only looks confusing. Complex code uses slices of struct-types and pointer-slicing and that stuff.
Jul 31
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [ ... ]
Hello Guys,

The bug preventing newCTFE from executing bf_ctfe[1] correctly (a peculiarity in which for-statement and if-statement conditions other than 32-bit integers were ignored) is now fixed.

newCTFE is about 5.7 times faster compiling bf_ctfe.
(Compiling bf_ctfe includes a test where a brainfuck interpreter written in brainfuck executes the brainfuck hello-world program.)

Here are the numbers:

uplink@uplink-desktop:~/d/dmd$ time src/dmd ../bf-ctfe/source/*.d -c -o- > x
Hello World!
Hello World!

real 0m0.113s
user 0m0.104s
sys 0m0.008s

uplink@uplink-desktop:~/d/dmd$ time dmd ../bf-ctfe/source/*.d -c > x
Hello World!
Hello World!

real 0m0.633s
user 0m0.600s
sys 0m0.028s

[1] https://github.com/UplinkCoder/bf_ctfe
Jul 30
parent reply Temtaime <temtaime gmail.com> writes:
On Sunday, 30 July 2017 at 20:40:24 UTC, Stefan Koch wrote:
 On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [...]
Hello Guys, The bug preventing newCTFE from executing bf_ctfe[1] correctly (a peculiarity in which for-statement and if-statement conditions other than 32-bit integers were ignored) is now fixed. [...]
Aren't you disabling codegen by passing a -o- to your engine, so it starts to compile faster?
Jul 31
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 31 July 2017 at 23:03:21 UTC, Temtaime wrote:
 On Sunday, 30 July 2017 at 20:40:24 UTC, Stefan Koch wrote:
 On Thursday, 13 July 2017 at 12:45:19 UTC, Stefan Koch wrote:
 [...]
Hello Guys, The bug preventing newCTFE from executing bf_ctfe[1] correctly (a peculiarity in which for-statement and if-statement conditions other than 32-bit integers were ignored) is now fixed. [...]
Aren't you disabling codegen by passing a -o- to your engine, so it starts to compile faster?
Ah yes. An oversight while posting the results. It does not affect the measurements in any real way though; the difference is 3-5 milliseconds. In the testcase the ctfe workload is totally dominant.
Jul 31