digitalmars.D.learn - How to test if a string is pointing into read-only memory?

jfondren <julian.fondren gmail.com> writes:
std.string.toStringz always allocates a new string, but it has 
this note:

```d
/+ Unfortunately, this isn't reliable.
  We could make this work if string literals are put
  in read-only memory and we test if s[] is pointing into
  that.

  /* Peek past end of s[], if it's 0, no conversion necessary.
  * Note that the compiler will put a 0 past the end of static
  * strings, and the storage allocator will put a 0 past the end
  * of newly allocated char[]'s.
  */
  char* p = &s[0] + s.length;
  if (*p == 0)
  return s;
  +/
```

and string literals weren't reliably in read-only memory as 
recently as early 2017: 
https://github.com/dlang/dmd/pull/6546#issuecomment-280612721

What's a reliable test that could be used in a toStringz that 
skips allocation when given a string in read-only memory?

As for whether it's necessarily a good idea to patch toStringz, 
I'd worry that

1. someone will slice a string literal and pass the test while 
not having NUL where it's expected

2. people are probably relying by now on toStringz always 
allocating, to e.g. safely cast immutable off the result.
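
To make worry 1 concrete, here is a minimal sketch (the helper `byteAfter` is made up for illustration, it is not part of Phobos). It shows that a slice of a literal still points into the literal's storage but is followed by the next character of the literal rather than by NUL, so a read-only test alone would not be enough:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Hypothetical helper: returns the byte stored immediately after s.
/// Only safe here because every string we pass sits inside a literal,
/// and D string literals are followed by a 0 byte.
char byteAfter(string s)
{
    return *(s.ptr + s.length);
}

void main()
{
    string literal = "hello world";
    string slice = literal[0 .. 5];      // "hello", same storage as the literal

    assert(slice.ptr == literal.ptr);    // no copy was made
    assert(byteAfter(literal) == '\0');  // the literal itself is NUL-terminated
    assert(byteAfter(slice) == ' ');     // the slice is not

    // toStringz copies and appends the terminator, so C sees just "hello".
    assert(strlen(toStringz(slice)) == 5);
}
```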
Oct 12 2021
Elronnd <elronnd elronnd.net> writes:
On Tuesday, 12 October 2021 at 08:19:01 UTC, jfondren wrote:
 What's a reliable test that could be used in a toStringz that 
 skips allocation when given a string in read-only memory?
There is no good way.

- You could peek in /proc, but that's not portable.
- You could poke the data and catch the resulting fault, but that's: 1) horrible, 2) slow, 3) problematic wrt threading, 4) sensitive to user code mapping its own memory and then remapping as rw (or unmapping).
- You could make a global hash table into which the addresses of all rodata are registered, but that is difficult to get right across translation units, especially in the face of dynamic linking. This is probably the most feasible, but is really not worth the hassle.
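
For illustration only, the /proc option might look roughly like the sketch below. It is Linux-only, the function name `inReadOnlyMapping` is made up, and it assumes the usual "start-end perms ..." line layout of /proc/self/maps. It also reopens and rescans the file on every call, which is part of why this is not a serious candidate for toStringz:

```d
import std.algorithm.searching : findSplit;
import std.conv : parse;
import std.stdio : File;

/// Linux-only sketch (hypothetical helper): true if [p, p + len) lies inside
/// a single mapping that /proc/self/maps lists without the 'w' permission.
bool inReadOnlyMapping(const(void)* p, size_t len)
{
    immutable lo = cast(size_t) p;
    immutable hi = lo + len;

    foreach (line; File("/proc/self/maps").byLine)
    {
        // Assumed line layout: "start-end perms offset dev inode path"
        auto fields = line.findSplit(" ");
        auto addrs = fields[0].findSplit("-");
        auto startText = addrs[0];
        auto endText = addrs[2];
        immutable start = parse!size_t(startText, 16);
        immutable end = parse!size_t(endText, 16);
        auto perms = fields[2];              // begins with e.g. "r--p"

        if (lo >= start && hi <= end)
            return perms.length >= 2 && perms[1] != 'w';
    }
    return false;                            // not found in any mapping
}

void main()
{
    import std.stdio : writeln;

    char[] heap = "abc".dup;                 // GC memory is mapped read-write
    assert(!inReadOnlyMapping(heap.ptr, heap.length));

    // What this prints depends on where the compiler and linker put literals.
    writeln(inReadOnlyMapping("abc".ptr, 3));
}
```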
Oct 12 2021
Elronnd <elronnd elronnd.net> writes:
On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
 problematic wrt threading
Not to mention signals. Reentrancy's a bitch.
Oct 12 2021
IGotD- <nise nise.com> writes:
On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
 There is no good way.
Can't it be done using function overloading?
Oct 12 2021
Paul Backus <snarwin gmail.com> writes:
On Tuesday, 12 October 2021 at 21:42:45 UTC, IGotD- wrote:
 On Tuesday, 12 October 2021 at 09:20:42 UTC, Elronnd wrote:
 There is no good way.
 Can't it be done using function overloading?
Function overloading lets you distinguish between arguments with different types, but strings in read-only memory and strings in read-write memory both have the same type: string.
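
A small sketch of that point (the `describe` overloads are made up for illustration): a literal and a heap-allocated copy both have type `string`, so they select the same overload, and only a genuinely different type such as `char[]` selects another one.

```d
/// Hypothetical overload set, used only to show which overload is chosen.
string describe(string s) { return "string"; }
string describe(char[] s) { return "char[]"; }

void main()
{
    string literal = "abc";       // storage chosen by the compiler
    string onHeap = "abc".idup;   // fresh GC allocation, ordinary read-write memory
    char[] buffer = "abc".dup;    // mutable copy, a different type

    // Same static type, so overload resolution cannot tell them apart.
    static assert(is(typeof(literal) == typeof(onHeap)));
    assert(literal.ptr != onHeap.ptr);
    assert(describe(literal) == "string");
    assert(describe(onHeap) == "string");
    assert(describe(buffer) == "char[]");
}
```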
Oct 12 2021
ag0aep6g <anonymous example.com> writes:
On 12.10.21 10:19, jfondren wrote:
 ```d
 /+ Unfortunately, this isn't reliable.
   We could make this work if string literals are put
   in read-only memory and we test if s[] is pointing into
   that.
 
   /* Peek past end of s[], if it's 0, no conversion necessary.
   * Note that the compiler will put a 0 past the end of static
   * strings, and the storage allocator will put a 0 past the end
   * of newly allocated char[]'s.
   */
   char* p = &s[0] + s.length;
   if (*p == 0)
   return s;
   +/
 ```
[...]
 As for whether it's necessarily a good idea to patch toStringz, I'd 
 worry that
 
 1. someone will slice a string literal and pass the test while not 
 having NUL where it's expected
The (commented-out) code checks if the NUL is there. Just make sure that it's also read-only.
 2. people are probably relying by now on toStringz always allocating, to 
 e.g. safely cast immutable off the result.
It doesn't matter if the result is freshly allocated. Casting away immutable is only allowed as long as you don't use it to actually change the data (i.e. it remains de facto immutable).
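
As a sketch of what that looks like in practice, assume a legacy-style function that takes a mutable `char*` but only reads through it (`legacyLength` is hypothetical and written in D here so the example is self-contained):

```d
import std.string : toStringz;

/// Hypothetical legacy-style API: the parameter is a mutable char*,
/// but the function only reads through it.
extern (C) size_t legacyLength(char* p)
{
    size_t n;
    while (p[n] != 0)
        ++n;
    return n;
}

void main()
{
    immutable(char)* z = toStringz("hello");

    // Fine: the cast drops immutable, but nothing writes through the pointer.
    char* m = cast(char*) z;
    assert(legacyLength(m) == 5);

    // Not fine, whether or not z was freshly allocated: writing through m
    // would mutate immutable data, which is undefined behavior.
    // m[0] = 'H';
}
```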
Oct 12 2021
Kagamin <spam here.lot> writes:
On Tuesday, 12 October 2021 at 08:19:01 UTC, jfondren wrote:
 and string literals weren't reliably in read-only memory as 
 recently as early 2017: 
 https://github.com/dlang/dmd/pull/6546#issuecomment-280612721
Sometimes sections have defined symbols for their start and end, so you can check whether the string lies in the rdata section. On Windows you can test it generically with the IsBadWritePtr function.
Oct 12 2021