www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - A debug class has started

reply forkit <forkit gmail.com> writes:
ok. one line in this code is causing a problem (as noted in 
comments)

an explanation as to the cause, is welcome ;-)


// ------------------------
module test;

import std : writeln, writefln;
import std.conv : to;

void main()
{
     string str = "abc;def;ab";
     char* w = cast(char*)str;
     writeln(replaceChar(w, str.length, ';', 'X'));
}

immutable(char)[] replaceChar(char* str, ulong len, char ch1, 
char ch2)
{
     for (ulong i = 0; i < len; i++)
     {
         if (str[i] == ch1)
         {
             writefln("Found %c at str[%d]", ch1, i); // fine

             // problem is with this next line:
             // the line works fine using DMD on windows - but 
crashes if using LDC on windows
             // On Linux, both DMD and LDC -> sigsegv
             //str[i] = ch2;
             // seems simple enough
             // - just replace the char currently in element 
str[i]) with 'X'
         }
     }
     return to!(immutable(char)[])(str);
}
// ------------------------
Dec 13 2021
next sibling parent forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:
 
 
oh... Windows - dmd version is 2.098.0-dirty - ldc2 version is 1.28 (based on dmd v2.098.0) Linux - dmd version is 2.098 - ldc2 version is 1.20.1 (based on dmd v2.090.1)
Dec 13 2021
prev sibling parent reply forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:
 ....
char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
Dec 13 2021
parent reply drug <drug2004 bk.ru> writes:
On 13.12.2021 13:49, forkit wrote:
 On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:
 ....
char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0
Dec 13 2021
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 13.12.21 12:09, drug wrote:
 That's because `str` is initialized by a literal and you can not change 
 it by definition. When you call `toStringz` it duplicates that literal 
 (adding terminating zero at the end) and the duplicate is mutable. I 
 would recommend do not use `toStringz` and just make duplicate of the 
 literal - https://run.dlang.io/is/vaosW0
From the link:
 string str = "abc;def;ab".dup; // allocates the string in the heap
 char* w = cast(char*)str;
 writeln(replaceChar(w, str.length, ';', 'X'));
That still has undefined behavior. You cannot mutate the characters in a `string`. It doesn't matter if it came from a literal or `.dup`. Use `char[]` instead of `string`.
Dec 13 2021
parent drug <drug2004 bk.ru> writes:
On 13.12.2021 14:26, ag0aep6g wrote:
 On 13.12.21 12:09, drug wrote:
 That's because `str` is initialized by a literal and you can not 
 change it by definition. When you call `toStringz` it duplicates that 
 literal (adding terminating zero at the end) and the duplicate is 
 mutable. I would recommend do not use `toStringz` and just make 
 duplicate of the literal - https://run.dlang.io/is/vaosW0
From the link:
 string str = "abc;def;ab".dup; // allocates the string in the heap
 char* w = cast(char*)str;
 writeln(replaceChar(w, str.length, ';', 'X'));
That still has undefined behavior. You cannot mutate the characters in a `string`. It doesn't matter if it came from a literal or `.dup`. Use `char[]` instead of `string`.
You're right. I forget to change `string str` to `auto str` or `char[] str`.
Dec 13 2021
prev sibling parent reply WebFreak001 <d.forum webfreak.org> writes:
On Monday, 13 December 2021 at 11:09:18 UTC, drug wrote:
 On 13.12.2021 13:49, forkit wrote:
 On Monday, 13 December 2021 at 09:49:05 UTC, forkit wrote:
 ....
char* w = cast(char*)str.toStringz; // this seems to be the solution class has ended ;-)
That's because `str` is initialized by a literal and you can not change it by definition. When you call `toStringz` it duplicates that literal (adding terminating zero at the end) and the duplicate is mutable. I would recommend do not use `toStringz` and just make duplicate of the literal - https://run.dlang.io/is/vaosW0
important: toStringz _may_ do a copy with the current implementation but nothing in the docs states it actually does so. In fact there is commented out code where it [in the past](https://github.com/dlang/phobos/commit/bc412e7c3fa3f124d7f2785223318b45edd4b3e6#diff-b94766ba288f9b4b05ef1a4874e26724750e614afdcddaf4c 071d0f19d91595L217) just dereferenced the memory after the string and checked if it was 0. You should really use `.dup` if you want to mutate your string. (You would need to duplicate anyway if you don't want an unsafe cast) pro-tip for bugs like this: just slap ` safe:` at the start of every file, the compiler will tell you everything that is risky and the bug will 9/10 times just sort itself out by fixing what the compiler complains about. (just recently helped someone with this again, was a 3 minute fix for a big code-base where manually searching the issue would have taken much longer)
Dec 13 2021
parent reply forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:
 You should really use `.dup` if you want to mutate your string. 
 (You would need to duplicate anyway if you don't want an unsafe 
 cast)
(this produces an unpredictable result??) char* w = cast(char*)str.dup; (but this seems to work - as expected) char* w = strdup(cast(char*)str); // import core.stdc.string : strdup;
 pro-tip for bugs like this: just slap ` safe:` at the start of 
 every file, the compiler will tell you everything that is risky
I like this idea. Thanks ;-)
Dec 13 2021
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn wrote:
 On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:
 
 You should really use `.dup` if you want to mutate your string. (You
 would need to duplicate anyway if you don't want an unsafe cast)
(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Shouldn't you be using: char* w = str.dup.ptr; instead?? T -- It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
Dec 13 2021
next sibling parent reply forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:
 On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via 
 Digitalmars-d-learn wrote:
 On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:
 
 You should really use `.dup` if you want to mutate your 
 string. (You would need to duplicate anyway if you don't 
 want an unsafe cast)
(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Shouldn't you be using: char* w = str.dup.ptr; instead?? T
that also produces the same unpredictable result. i.e (an extra character from 'somewhere' appears in the output from line below) writeln(replaceChar(w, str.length, ';', 'X'));
Dec 13 2021
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 13, 2021 at 08:47:12PM +0000, forkit via Digitalmars-d-learn wrote:
 On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:
 On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via Digitalmars-d-learn
 wrote:
[...]
 (this produces an unpredictable result??)
 char* w = cast(char*)str.dup;
Shouldn't you be using: char* w = str.dup.ptr; instead??
[...]
 that also produces the same unpredictable result.
 
 i.e (an extra character from 'somewhere' appears in the output from
 line below)
 writeln(replaceChar(w, str.length, ';', 'X'));
[...] That's weird. Can you post the minimal code that exhibits this problem? T -- Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
Dec 13 2021
prev sibling parent reply forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 20:28:26 UTC, H. S. Teoh wrote:
 On Mon, Dec 13, 2021 at 08:04:24PM +0000, forkit via 
 Digitalmars-d-learn wrote:
 On Monday, 13 December 2021 at 12:06:53 UTC, WebFreak001 wrote:
 
 You should really use `.dup` if you want to mutate your 
 string. (You would need to duplicate anyway if you don't 
 want an unsafe cast)
(this produces an unpredictable result??) char* w = cast(char*)str.dup;
Shouldn't you be using: char* w = str.dup.ptr; instead?? T
so here are all the possible options I've tried. only 2 of these actually produce the expected result. // ------ module test; import std : writeln, writefln; import std.conv : to; import core.stdc.string : strdup; import std.string : toStringz; void main() { string str = "abc;def;ab"; //char* w = cast(char*)str; // NOPE! A string constant is immutable, so expect undefined behaviour. char* w = strdup(cast(char*)str); // ok //char* w = cast(char*)str.toStringz; // also ok // all these below result in an extra character from 'somewhere' appearing in the writeln output //char* w = cast(char*)str.dup; // nope //char* w = str.dup.ptr; // nope //char* w = &dup(cast(const(char)[])str)[0]; // nope writeln(replaceChar(w, str.length, ';', 'X')); } immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return to!(immutable(char)[])(str); } // ----------------
Dec 13 2021
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 13, 2021 at 08:58:42PM +0000, forkit via Digitalmars-d-learn wrote:
[...]
 immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2)
 {
     for (ulong i = 0; i < len; i++)
     {
         if (str[i] == ch1)
         {
             writefln("Found %c at str[%d]", ch1, i); // fine
             str[i] = ch2;
         }
     }
     return to!(immutable(char)[])(str);
This line is your problem: you have a raw pointer `str` and force-casting it to an array without specifying its length. Do not expect anything good to come of this. (In fact I'm surprised it .to even accepts such a thing!) What you should be doing is: return to!string(str[0 .. len]); Or just: return str[0 .. len].idup; T -- Being forced to write comments actually improves code, because it is easier to fix a crock than to explain it. -- G. Steele
Dec 13 2021
parent reply forkit <forkit gmail.com> writes:
On Monday, 13 December 2021 at 21:13:25 UTC, H. S. Teoh wrote:
 What you should be doing is:

 	return to!string(str[0 .. len]);

 Or just:

 	return str[0 .. len].idup;


 T
oh.. so many different ways...(to both produce the same bug, and also to produce the correct output). ... it's a little mind boggling ;-) // ---------- module test; import std : writeln, writefln, assumeUnique; import std.conv : to; import core.stdc.string : strdup; import std.string : toStringz; void main() { string str = "abc;def;ab"; //char* w = cast(char*)str; // nope. a pointer to a string constant is // (supposed to be) immutable, so expect undefined behaviour. char* w = strdup(cast(char*)str); // ok //char* w = cast(char*)str.toStringz; // also ok //char* w = cast(char*)str.dup; // also ok //char* w = str.dup.ptr; // also ok writeln(replaceChar(w, str.length, ';', 'X')); } immutable(char)[] replaceChar(char* str, ulong len, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } //return to!(immutable(char)[])(str); // nope .. issue with null terminator perhaps ?? return str[0 .. len].idup; // ok //return str[0 .. len].dup; // also ok //return to!string(str[0 .. len]); // also ok //return assumeUnique(str[0..len]); // also ok } // ---------------------
Dec 13 2021
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 13, 2021 at 10:43:14PM +0000, forkit via Digitalmars-d-learn wrote:
 On Monday, 13 December 2021 at 21:13:25 UTC, H. S. Teoh wrote:
 
 What you should be doing is:
 
 	return to!string(str[0 .. len]);
 
 Or just:
 
 	return str[0 .. len].idup;
[...]
 oh.. so many different ways...(to both produce the same bug, and also
 to produce the correct output).
 
 ... it's a little mind boggling ;-)
[...] It's very simple. In C, an array decays to a pure pointer. In D, an array is a pointer + length pair. Given a D array, if you want a pointer to its first element, you use .ptr. Given a D pointer, if you want an array, you slice it with [0 .. length]. That's all there is to it. T -- Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
Dec 13 2021
prev sibling parent reply WebFreak001 <d.forum webfreak.org> writes:
On Monday, 13 December 2021 at 22:43:14 UTC, forkit wrote:
 [...]

     //char* w = cast(char*)str; // nope. a pointer to a string 
 constant is
                                 // (supposed to be) immutable, 
 so expect undefined behaviour.
note:
     //char* w = cast(char*)str.toStringz; // also ok
this is also undefined behavior (toStringz returns an immutable(char)* which you cast away)
     char* w = strdup(cast(char*)str); // ok
this is a C library function - this is risky if your string is not a string literal (may copy too much or segfault) - I would recommend not using this. This will only work properly when you have string literals (strings that are created using `""` in code, no other strings like those that are for example read from user input, from files or dynamically created)
     //char* w = cast(char*)str.dup; // also ok
     //char* w = str.dup.ptr; // also ok

 [...]
the last two here are equivalent, I personally prefer the last one. I think these are the idiomatic way how to duplicate a string into writable memory and get the pointer to it. The best way would be not doing this at all - when you manipulate strings/arrays in D you can do so by just assigning the elements like this: ```d immutable(char)[] replaceChar(char[] str, char ch1, char ch2) { for (ulong i = 0; i < len; i++) { if (str[i] == ch1) { writefln("Found %c at str[%d]", ch1, i); // fine str[i] = ch2; } } return str.idup; } ``` then when you call it: ```d replaceChar(str.dup, ';', 'X'); ``` or the function more idiomatically: ```d string replaceChar(scope char[] str, char ch1, char ch2) { // ref makes the `c` variable an l-value / assignable and modifies the character when assigned foreach (i, ref c; str) { if (c == ch1) { writefln("Found %s at str[%s]", c, i); c = ch2; } } return str.idup; // you could also not .idup and return char[] and let the caller .idup it when needed } ``` You only really need to work with pointers when you interface with a C library that needs them.
Dec 14 2021
parent forkit <forkit gmail.com> writes:
On Tuesday, 14 December 2021 at 08:07:43 UTC, WebFreak001 wrote:
 The best way would be not doing this at all - when you 
 manipulate strings/arrays in D you can do so by just assigning 
 the elements like this:

 ```d
 immutable(char)[] replaceChar(char[] str, char ch1, char ch2)
 {
     for (ulong i = 0; i < len; i++)
     {
         if (str[i] == ch1)
         {
             writefln("Found %c at str[%d]", ch1, i); // fine
             str[i] = ch2;
         }
     }

     return str.idup;
 }
 ```

 then when you call it:
 ```d
 replaceChar(str.dup, ';', 'X');
 ```

 or the function more idiomatically:
 ```d
 string replaceChar(scope char[] str, char ch1, char ch2)
 {
     // ref makes the `c` variable an l-value / assignable and 
 modifies the character when assigned
     foreach (i, ref c; str)
     {
         if (c == ch1)
         {
             writefln("Found %s at str[%s]", c, i);
             c = ch2;
         }
     }

     return str.idup; // you could also not .idup and return 
 char[] and let the caller .idup it when needed
 }
 ```

 You only really need to work with pointers when you interface 
 with a C library that needs them.
This was of course just me 'playing around with pointer casting in D', and not code that I would have deployed. Debugging that code used up an hour of my life .. that I cannot get back I might try out safe instead ;-)
Dec 14 2021
prev sibling parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Monday, 13 December 2021 at 20:58:42 UTC, forkit wrote:

 immutable(char)[] replaceChar(char* str, ulong len, char ch1, 
 char ch2)
 //snip
     return to!(immutable(char)[])(str);
 }
You're calling a `to` on a char pointer, which, ostensibly, would look for null terminator. Which there may not be any if you do a .dup.
Dec 13 2021