
digitalmars.D - Kernel buffer overflow exposes iPhone 11 Pro to radio based attacks

reply Paulo Pinto <pjmlp progtools.org> writes:
Yet another proof why system languages like D are required to 
take over OS development.

https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
Dec 02 2020
next sibling parent reply M.M. <matus email.cz> writes:
On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto wrote:
 Yet another proof why system languages like D are required to 
 take over OS development.

 https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
Why do you think that D is better than C++ in that respect?
Dec 02 2020
next sibling parent reply IGotD- <nise nise.com> writes:
On Wednesday, 2 December 2020 at 11:19:08 UTC, M.M. wrote:
 Why do you think that D is better than C++ in that respect?
C++ does not have array bounds checking, at least in its original form. C++ added std::array, which is a kind of wrapper around its static arrays. However, since it is in the STL you need to compile it, and it doesn't play that well with bare metal programming. Also, when you include the STL, binary size shoots through the roof. Therefore C++ + STL is often not suitable for kernels, and this problem applies to D as well. BetterC is more suitable for kernels.
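For example (a quick illustrative sketch; at() is the checked accessor, operator[]() is not):

```cpp
#include <array>
#include <stdexcept>

// std::array::at() checks the index and throws std::out_of_range;
// operator[] performs no check, so an out-of-range index is undefined
// behavior (silent memory corruption) rather than an error.
bool at_catches_out_of_range() {
    std::array<int, 4> a{1, 2, 3, 4};
    try {
        (void)a.at(10);  // index past the end: at() throws
    } catch (const std::out_of_range &) {
        return true;     // the overflow was caught, not silently executed
    }
    return false;
}
```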
Dec 02 2020
parent reply M.M. <matus email.cz> writes:
On Wednesday, 2 December 2020 at 12:06:57 UTC, IGotD- wrote:
 On Wednesday, 2 December 2020 at 11:19:08 UTC, M.M. wrote:
 Why do you think that D is better than C++ in that respect?
C++ does not have array bounds checking, at least in its original form. C++ added std::array, which is a kind of wrapper around its static arrays. However, since it is in the STL you need to compile it, and it doesn't play that well with bare metal programming. Also, when you include the STL, binary size shoots through the roof. Therefore C++ + STL is often not suitable for kernels, and this problem applies to D as well. BetterC is more suitable for kernels.
Oh, OK, I see. I was thinking that modern C++ would be equally suitable as D or Rust or any other "modern" language of that sort. But what you explain gives me a different view on that.
Dec 02 2020
next sibling parent reply IGotD- <nise nise.com> writes:
On Wednesday, 2 December 2020 at 12:32:02 UTC, M.M. wrote:
 Oh, OK, I see. I was thinking that modern C++ would be equally 
 suitable as D or Rust or any other "modern" language of that 
 sort. But what you explain gives me a different view on that.
I also forgot to mention that when there is an out-of-bounds access, both C++ and D throw an exception. Exceptions are usually a big no-no in kernels because of the potential memory allocation. In kernels you want to reduce, and have absolute control over, memory allocations. Only the new "lean exceptions" proposal for C++ might be interesting for kernels.
Dec 02 2020
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 2 December 2020 at 15:04:29 UTC, IGotD- wrote:
 I also forgot to mention that when there is an out-of-bounds access, both C++ and D throw an exception. Exceptions are usually a big no-no in kernels because of the potential memory allocation.
A D range error actually doesn't allocate memory: it throws a statically allocated object or, with the appropriate compiler switches, calls an abort function directly.
Dec 02 2020
parent reply IGotD- <nise nise.com> writes:
On Wednesday, 2 December 2020 at 15:13:35 UTC, Adam D. Ruppe 
wrote:
 D range error actually doesn't allocate memory, it throws a 
 statically allocated object, or with appropriate compile 
 switches, calls an abort function directly.
That's interesting, and of course people don't know this because it is not documented; you have to study the source code in detail to figure it out. Do we have a list of which libraries use statically allocated exceptions? Then comes the question: how is this statically allocated exception placed, on the stack or in TLS? Is it shared with all libraries that throw RangeError, or is it per library?
Dec 02 2020
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 2 December 2020 at 15:19:37 UTC, IGotD- wrote:
 That's interesting, and of course people don't know this because it is not documented
It is documented that catching an Error will not necessarily work. This is meant to give significant freedom to the implementation.

Note that Error and Exception are two different things. Both are thrown and sometimes caught, but they are two different branches of the class hierarchy, and the language definition treats them differently. Exceptions are meant to be caught, and unwind the stack as they work their way up it. Errors are meant to kill the program, and thus may or may not unwind the stack or be allowed to be caught.

(The `-checkaction` switch to dmd can change throwing an Error to simply aborting the program by a couple of means.)
 Do we have a list what libraries use statically allocated 
 exceptions?
Library exceptions are free to do this too, but they usually don't, since user code is permitted to catch and even retain references to normal exception objects. That makes it difficult to use tricks like this without breaking user code. But if a specific library were to do it, it would surely be documented and specially called out. No list as far as I know, though.
 Then comes the question, how is this statically allocated 
 exception placed, on stack or TLS.
It is implementation-defined, so a kernel implementation would surely do it differently if necessary, but in the default implementation it goes in TLS. There's one block it reuses for any Error that occurs. The most common examples of Errors are RangeError for out-of-bounds array access and AssertError for failed assertions.
 Is it shared with all libraries that throws RangeError or is it 
 per library?
This is part of druntime, so unless your libraries have multiple copies of the runtime (which is possible with Windows DLLs but not really anywhere else; the system linker will merge the definitions), there's just the one. If some library decides to `throw new RangeError`, of course, that would be an ordinary GC'd object. The language doesn't prohibit this, but it also doesn't allow it per se; it is still subject to the implementation's whim.
Dec 02 2020
prev sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 2 December 2020 at 15:19:37 UTC, IGotD- wrote:
 On Wednesday, 2 December 2020 at 15:13:35 UTC, Adam D. Ruppe 
 wrote:
 D range error actually doesn't allocate memory, it throws a 
 statically allocated object, or with appropriate compile 
 switches, calls an abort function directly.
That's interesting, and of course people don't know this because it is not documented; you have to study the source code in detail to figure it out. Do we have a list of which libraries use statically allocated exceptions?
I haven't looked to find whether it's properly documented, but you can find the announcement of the feature that allows choosing the behavior here:

https://dlang.org/changelog/2.084.0.html#checkaction

As for how built-in error handling (I mean out-of-bounds array checks, out-of-memory, etc.) is implemented, it depends on whether betterC [1] is enabled and whether exception support is enabled [2].

[1]: https://github.com/dlang/druntime/blob/v2.094.2/src/core/exception.d#L22
[2]: https://github.com/dlang/druntime/blob/v2.094.2/src/core/exception.d#L16

If exception support is enabled, we do e.g.:

throw staticError!RangeError(file, line, null);

(The last parameter is used for chained exceptions, which is not the case here.)

The implementation of staticError [3] uses a global TLS variable as storage, though I really think it should be implemented more along the lines of this:

https://run.dlang.io/gist/PetarKirov/f7b6c3b7e4f8fe10c0a2dbf0d3f44471

[3]: https://github.com/dlang/druntime/blob/v2.094.2/src/core/exception.d#L634
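The underlying pattern is simple enough to sketch in C++ terms (purely illustrative, not druntime's actual code): one preallocated thread-local buffer is reused for every error object, so raising an error never allocates.

```cpp
#include <cstddef>
#include <new>

// Sketch of the "statically allocated error" pattern: a single
// thread-local buffer is reused for every error object, so
// constructing the error never touches the heap. Illustrative only.
struct RangeErrorObj {
    const char *file;
    std::size_t line;
};

namespace {
thread_local alignas(RangeErrorObj)
    unsigned char g_error_storage[sizeof(RangeErrorObj)];
}

RangeErrorObj *static_range_error(const char *file, std::size_t line) {
    // Placement-new into the TLS buffer; a previously stored error
    // is simply overwritten.
    return new (g_error_storage) RangeErrorObj{file, line};
}
```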
Dec 02 2020
prev sibling next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On Wednesday, 2 December 2020 at 15:04:29 UTC, IGotD- wrote:
 On Wednesday, 2 December 2020 at 12:32:02 UTC, M.M. wrote:
 Oh, OK, I see. I was thinking that modern C++ would be equally 
 suitable as D or Rust or any other "modern" language of that 
 sort. But what you explain give me a different view on that.
I also forgot to mention that when there is an out-of-bounds access, both C++ and D throw an exception. Exceptions are usually a big no-no in kernels because of the potential memory allocation. In kernels you want to reduce, and have absolute control over, memory allocations. Only the new "lean exceptions" proposal for C++ might be interesting for kernels.
C++ can throw an exception iff either at() was used, or the STL was compiled with exceptions enabled for operator[](), and the developer isn't using C-style arrays anywhere in the code. Otherwise it will corrupt memory just like C, and no exception will be thrown no matter what.
Dec 02 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/2/2020 7:04 AM, IGotD- wrote:
 I also forgot to mention that when there is an out-of-bounds access, both C++ and D throw an exception. Exceptions are usually a big no-no in kernels because of the potential memory allocation. In kernels you want to reduce, and have absolute control over, memory allocations. Only the new "lean exceptions" proposal for C++ might be interesting for kernels.
D has configurable options on what to do on a buffer overflow. One of them, the simplest, is just to execute a halt instruction.
Dec 03 2020
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/3/2020 8:08 PM, Walter Bright wrote:
 On 12/2/2020 7:04 AM, IGotD- wrote:
 I also forgot to mention that when there is an out-of-bounds access, both C++ and D throw an exception. Exceptions are usually a big no-no in kernels because of the potential memory allocation. In kernels you want to reduce, and have absolute control over, memory allocations. Only the new "lean exceptions" proposal for C++ might be interesting for kernels.
D has configurable options on what to do on a buffer overflow. One of them, the simplest, is just to execute a halt instruction.
https://dlang.org/dmd-windows.html#switch-checkaction
Dec 03 2020
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 4 December 2020 at 04:08:31 UTC, Walter Bright wrote:
 One of them, the simplest, is just execute a halt instruction.
which wouldn't help kernel code at all fyi
Dec 03 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/3/2020 8:13 PM, Adam D. Ruppe wrote:
 On Friday, 4 December 2020 at 04:08:31 UTC, Walter Bright wrote:
 One of them, the simplest, is just execute a halt instruction.
which wouldn't help kernel code at all fyi
Infinitely better than a buffer overflow.
Dec 03 2020
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 04.12.20 08:03, Walter Bright wrote:
 On 12/3/2020 8:13 PM, Adam D. Ruppe wrote:
 On Friday, 4 December 2020 at 04:08:31 UTC, Walter Bright wrote:
 One of them, the simplest, is just execute a halt instruction.
which wouldn't help kernel code at all fyi
Infinitely better than a buffer overflow.
In ring 0, where the kernel runs, `HLT` does not prevent the buffer overflow, it's just delayed until the next external interrupt.

Essentially, it would behave in a way similar to this:

if(i > a.length){
    Thread.sleep();
}
a.ptr[i]=x;

The only reason why `HLT` terminates execution of userspace code is that such code does not have sufficient permissions to execute the instruction; in the kernel, it would not do much.
Dec 04 2020
next sibling parent reply IGotD- <nise nise.com> writes:
On Friday, 4 December 2020 at 09:24:43 UTC, Timon Gehr wrote:
 In ring 0 where the kernel runs, `HLT` does not prevent the 
 buffer overflow, it's just delayed until the next external 
 interrupt.

 Essentially, it would behave in a way similar to this:

 if(i > a.length){
     Thread.sleep();
 }
 a.ptr[i]=x;

 The only reason why `HLT` terminates execution of userspace 
 code is that such code does not have sufficient permissions to 
 execute the instruction; in the kernel, it would not do much.
Correct, so if this was a kernel, then if you get an interrupt, like pressing a key, or if there are any pending interrupts, the HLT instruction would just continue. For kernels, the best choice would be some kind of callback, or a panic function, that the programmer can fill in. This would be the most versatile option for those scenarios.
Dec 04 2020
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/4/2020 2:12 AM, IGotD- wrote:
 Correct, so if this was a kernel, then if you get an interrupt, like pressing a key, or if there are any pending interrupts, the HLT instruction would just continue. For kernels, the best choice would be some kind of callback, or a panic function, that the programmer can fill in. This would be the most versatile option for those scenarios.
And DMD has that option as well: -checkaction=C, which calls the C standard library's assert-failure function. Note that this function doesn't actually have to be in the C standard library; it just has to have the same name and arguments.
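Sketched in C-like terms (the names below are illustrative stand-ins; the exact symbol the compiler calls depends on the target C library, e.g. __assert_fail on glibc):

```cpp
// Sketch of a kernel-side assertion-failure handler. With
// -checkaction=C the compiler emits a call to the C library's
// assert-failure routine on a failed bounds check; a kernel can
// provide its own symbol with the same name and arguments instead.
// kernel_assert_fail and g_halt are hypothetical names.
namespace {
const char *g_last_file = nullptr;
unsigned    g_last_line = 0;
void (*g_halt)() = [] { /* real kernel: disable interrupts and spin */ };
}

extern "C" void kernel_assert_fail(const char *msg, const char *file,
                                   unsigned line, const char *func) {
    (void)msg; (void)func;
    g_last_file = file;  // record where the check fired...
    g_last_line = line;
    g_halt();            // ...then stop this CPU instead of unwinding
}
```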
Dec 04 2020
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 4 December 2020 at 09:24:43 UTC, Timon Gehr wrote:
 The only reason why `HLT` terminates execution of userspace 
 code is that such code does not have sufficient permissions to 
 execute the instruction; in the kernel, it would not do much.
I think every time dmd uses hlt it would be better off with int 3, the debug trap instruction. It is also one byte (0xCC) and is actually defined to do something more appropriate. Or maybe not, cuz of side effects... idk really, just the misuse of hlt has always bugged me.
Dec 04 2020
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/4/20 4:24 AM, Timon Gehr wrote:
 On 04.12.20 08:03, Walter Bright wrote:
 On 12/3/2020 8:13 PM, Adam D. Ruppe wrote:
 On Friday, 4 December 2020 at 04:08:31 UTC, Walter Bright wrote:
 One of them, the simplest, is just execute a halt instruction.
which wouldn't help kernel code at all fyi
Infinitely better than a buffer overflow.
In ring 0, where the kernel runs, `HLT` does not prevent the buffer overflow, it's just delayed until the next external interrupt.

Essentially, it would behave in a way similar to this:

if(i > a.length){
    Thread.sleep();
}
a.ptr[i]=x;

The only reason why `HLT` terminates execution of userspace code is that such code does not have sufficient permissions to execute the instruction; in the kernel, it would not do much.
Had no idea. Thanks!
Dec 04 2020
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2020-12-04 10:24, Timon Gehr wrote:

 In ring 0 where the kernel runs, `HLT` does not prevent the buffer 
 overflow, it's just delayed until the next external interrupt.
 
 Essentially, it would behave in a way similar to this:
 
 if(i > a.length){
      Thread.sleep();
 }
 a.ptr[i]=x;
 
 The only reason why `HLT` terminates execution of userspace code is that 
 such code does not have sufficient permissions to execute the 
 instruction; in the kernel, it would not do much.
Then just use another instruction that the kernel doesn't have access to. There's always a more privileged mode. -- /Jacob Carlborg
Dec 04 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/4/2020 1:24 AM, Timon Gehr wrote:
 The only reason why `HLT` terminates execution of userspace code is that such 
 code does not have sufficient permissions to execute the instruction; in the 
 kernel, it would not do much.
The compiler actually was changed to generate a UD2 instruction (0x0F0B) at the suggestion of (if I remember correctly) Iain. https://www.felixcloutier.com/x86/ud which raises the invalid opcode exception. https://github.com/dlang/dmd/blob/master/src/dmd/backend/cod2.d#L5723
Dec 04 2020
prev sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 2 December 2020 at 12:32:02 UTC, M.M. wrote:
 Oh, OK, I see. I was thinking that modern C++ would be equally 
 suitable as D or Rust or any other "modern" language of that 
 sort. But what you explain give me a different view on that.
PuTTY's code shows that traditions are fixable; the problem is finding people who will do it, and you might as well look for D or Rust programmers instead.
Dec 08 2020
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Wednesday, 2 December 2020 at 11:19:08 UTC, M.M. wrote:
 On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto 
 wrote:
 Yet another proof why system languages like D are required to 
 take over OS development.

 https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
Why do you think that D is better than C++ in that respect?
Bounds checking enabled by default. You need system code for those kinds of tricks, and it is relatively easy to track down where system or trusted code blocks are used.
Dec 02 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Dec 02, 2020 at 12:31:28PM +0000, Paulo Pinto via Digitalmars-d wrote:
 On Wednesday, 2 December 2020 at 11:19:08 UTC, M.M. wrote:
 On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto wrote:
 Yet another proof why system languages like D are required to take
 over OS development.
 
 https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
Why do you think that D is better than C++ in that respect?
Bounds checking enabled by default.
And also, arrays are fat pointers. It seems like a minor detail, but it makes a huge difference when the length of the array is always kept together with the pointer to the array contents, and is supported by the language. I work with C code daily, and I cannot tell you how many times I've seen absolutely terrifying code that simply passes a bare pointer around willy-nilly, making implicit assumptions about array size that, almost inevitably, some user code somewhere violates. Or the number of times I've fixed bugs involving checking the wrong size against the wrong pointer, because when you have to manually pass them around, it's easy to make mistakes.

The worst is C strings. The number of bugs I've caught involving (potentially) unterminated strings is absolutely scary. The worst ones come from unsanitized input, where the code simply *assumes* that some random data read from a file is null-terminated. Just as bad is code that looks like it was written in the 80's, freely calling strcpy on char* arguments passed in from who knows where, without a care in the world. And yes, such code is still around. And no, it was not written in the 80's; people still write that garbage *today*, believe it or not. *This* is why such things must be enforced *in the language*, because when people are free to do whatever they want, they inevitably gravitate to the pessimal option.

An equally bad thing about C strings is that utterly evil function known as strncpy. Why is it evil? Because it comes with the warning that the result may not be terminated if the target buffer is not large enough to contain the entire string. And guess how many people gloss over or simply forget that detail? Yep, I've fixed a whole bunch of bugs caused by that.

And don't get me started on casting void* to all sorts of things willy-nilly. (And in non-trivial code you cannot avoid this, because that's the only way you can write polymorphic code in C.)
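The fat-pointer idea can be sketched in C++ terms (just an illustration of the concept, not D's actual slice ABI):

```cpp
#include <cstddef>

// A "fat pointer" keeps the length next to the pointer, so a bounds
// check is always possible at the point of access. D's T[] slices
// work this way; this struct only illustrates the concept.
struct IntSlice {
    int        *ptr;
    std::size_t length;

    // Checked indexing: reports failure instead of reading out of bounds.
    bool get(std::size_t i, int &out) const {
        if (i >= length) return false;  // the length always travels with the pointer
        out = ptr[i];
        return true;
    }
};
```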
C is just a landmine of memory corruption bugs waiting to happen. The incentives are just all wrong. You have to work hard to make your program memory-safe, and you don't have to lift a finger to incur all sorts of nasty memory bugs. All it takes is *one* slip-up somewhere in some obscure corner of the code, and you have a memory corruption waiting to happen.

D made a bunch of seemingly-minor, but actually game-changing decisions that eliminate 95% of the above-mentioned problems. The single biggest one is probably the D array aka fat pointer, as far as memory bugs are concerned. There are a bunch of others, which others have mentioned. The general design in D is to make the simplest, most naïve code memory-safe, and you have to work at it if you want to bypass that safety net for systems programming reasons. Which means you'll be thinking harder about your code, and hopefully more aware of potential issues, and catch yourself before making slip-ups. That's the way the incentives should be, not the other way round as it is in C.

T -- Give a man a fish, and he eats once. Teach a man to fish, and he will sit forever.
Dec 02 2020
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/2/2020 9:52 AM, H. S. Teoh wrote:
 It seems like a minor detail, but it makes a huge difference when the
 length of the array is always kept together with the pointer to the
 array contents, and is supported by the language.  I work with C code
 daily, and I cannot tell you how many times I've seen absolutely
 terrifying code that simply passes a bare pointer around willy-nilly,
 making implicit assumptions about array size that, almost inevitably,
 some user code somewhere violates.  Or the number of times I've fixed
 bugs involving checking the wrong size against the wrong pointer,
 because when you have to manually pass them around, it's easy to make
 mistakes.
I wrote C every day for 15 years before I was able to reliably write complex code that didn't have buffer overflows and other pointer bugs. The conversion of DMD from C to D did not uncover a single pointer bug, which I'm rather proud of. But with D, there's no longer a need to train 15 years to write reliable code.
 The worst is C strings.  The number of bugs I've caught involving
 (potentially) unterminated strings is absolutely scary.
I've said many times that whenever I review C code, I look at the use of string functions first and will nearly always find a bug.
 D made a bunch of seemingly-minor, but actually game-changing decisions
 that eliminate 95% of the above-mentioned problems.  The single biggest
 one is probably the D array aka fat pointer, as far as memory bugs are
 concerned.  There are a bunch of others, which others have mentioned.
 The general design in D is to make the simplest, most naïve code
 memory-safe, and you have to work at it if you want to bypass that
 safety net for systems programming reasons.  Which means you'll be
 thinking harder about your code, and hopefully more aware of potential
 issues and catch yourself before making slip-ups.  That's the way the
 incentives should be, not the other way round as it is in C.
I couldn't have said it better!
Dec 03 2020
parent reply user1234 <user1234 12.de> writes:
On Friday, 4 December 2020 at 07:10:51 UTC, Walter Bright wrote:
 On 12/2/2020 9:52 AM, H. S. Teoh wrote:
 [...]
I wrote C every day for 15 years before I was able to reliably write complex code that didn't have buffer overflows and other pointer bugs. The conversion of DMD from C to D did not uncover a single pointer bug,
Yes *but*. There's still strange code in DMD. For example, in dmd.root.aav or dmd.doc there are strange lazy initializations with double pointers ( ** ;) ) to structures.
Dec 04 2020
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/4/2020 12:41 AM, user1234 wrote:
 Yes *but*. There's still strange code in DMD. For example, in dmd.root.aav or dmd.doc there are strange lazy initializations with double pointers ( ** ;) ) to structures.
Oh, I'm not going to argue that the code is a paragon of virtue. Just that it didn't have pointer bugs in it. (aav.d was optimized for speed, not clarity.)
Dec 04 2020
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2020-12-02 18:52, H. S. Teoh wrote:

 D made a bunch of seemingly-minor, but actually game-changing decisions
 that eliminate 95% of the above-mentioned problems.  The single biggest
 one is probably the D array aka fat pointer, as far as memory bugs are
 concerned.  There are a bunch of others, which others have mentioned.
 The general design in D is to make the simplest, most naïve code
 memory-safe, and you have to work at it if you want to bypass that
 safety net for systems programming reasons.  Which means you'll be
 thinking harder about your code, and hopefully more aware of potential
 issues and catch yourself before making slip-ups.  That's the way the
 incentives should be, not the other way round as it is in C.
Unfortunately it's still very easy to bypass most safety features in D, especially since everything is system by default. All the features of C are still available; one has to pick the D-specific features to be safe. I've seen many, many times on the forums that people ask questions with examples containing C-style code with raw pointers, calling functions in the C standard library instead of using the D equivalents. -- /Jacob Carlborg
Dec 04 2020
parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Friday, 4 December 2020 at 15:33:40 UTC, Jacob Carlborg wrote:
 On 2020-12-02 18:52, H. S. Teoh wrote:

 [...]
Unfortunately it's still very easy to bypass most safety features in D, especially since everything is system by default. All the features of C are still available; one has to pick the D-specific features to be safe. I've seen many, many times on the forums that people ask questions with examples containing C-style code with raw pointers, calling functions in the C standard library instead of using the D equivalents.
Yes. I could support a DIP making safe the default if it did not claim C is safe.
Dec 04 2020
next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Friday, 4 December 2020 at 17:52:28 UTC, Bruce Carneal wrote:
 On Friday, 4 December 2020 at 15:33:40 UTC, Jacob Carlborg 
 wrote:
 On 2020-12-02 18:52, H. S. Teoh wrote:

 [...]
Unfortunately it's still very easy to bypass most safety features in D, especially since everything is system by default. All the features of C are still available; one has to pick the D-specific features to be safe. I've seen many, many times on the forums that people ask questions with examples containing C-style code with raw pointers, calling functions in the C standard library instead of using the D equivalents.
Yes. I could support a DIP making safe the default if it did not claim C is safe.
Those bodyless C functions marked as safe were the biggest hole in that DIP; it took a backlash for the DIP to be withdrawn.
Dec 04 2020
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On Friday, 4 December 2020 at 17:52:28 UTC, Bruce Carneal wrote:
 On Friday, 4 December 2020 at 15:33:40 UTC, Jacob Carlborg 
 wrote:
 On 2020-12-02 18:52, H. S. Teoh wrote:

 [...]
Unfortunately it's still very easy to bypass most safety features in D, especially since everything is system by default. All the features of C are still available; one has to pick the D-specific features to be safe. I've seen many, many times on the forums that people ask questions with examples containing C-style code with raw pointers, calling functions in the C standard library instead of using the D equivalents.
Yes. I could support a DIP making safe the default if it did not claim C is safe.
That was my problem with the DIP as well.
Dec 04 2020
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh wrote:
 An equally bad thing about C strings is that utterly evil 
 function known as strncpy.  Why is it evil?  Because it comes 
 with the warning that the result may not be terminated if the 
 target buffer is not large enough to contain the entire string.
  And guess how many people gloss over or simply forget that 
 detail?  Yep, I've fixed a whole bunch of bugs caused by that.
The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function; it is a buffer function, and a mem*/buf* prefix would have gone a long way toward avoiding its misuse as a string function.

Beyond its truncation feature, it has a second behavior that most people do not know about and that makes it definitely different from the string functions: it overwrites the whole buffer with 0 to the end of it, often making it a performance hog:

char buffer[32000];
strncpy(buffer, "a", sizeof buffer);

will write 32000 bytes.

Historically it was invented for early Unix, to write the filename into the directory entry, which was size 14 at that time:

strncpy(direntry, filename, 14);

strncpy() has its uses, but it is important to know that it is NOT a string function. The new warning in gcc since version 9 is annoying and has to be silenced in some cases (with pragmas), as there are legitimate uses of strncpy (unlike gets(), which is always wrong).

Except for that, I completely agree with the rest of your rant.
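Both behaviors are easy to demonstrate (a small illustrative sketch; the buffer sizes are arbitrary):

```cpp
#include <cstring>

// Demonstrates the two points above: strncpy() zero-fills the
// destination up to n, and writes no terminator at all when the
// source is at least n characters long.
struct StrncpyDemo {
    char padded[8];
    char truncated[4];

    StrncpyDemo() {
        std::memset(padded, 'x', sizeof padded);
        std::strncpy(padded, "a", sizeof padded);   // 'a' plus seven '\0' bytes

        std::memset(truncated, 'x', sizeof truncated);
        std::strncpy(truncated, "toolong", sizeof truncated); // "tool", no '\0'!
    }
};
```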
Dec 09 2020
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 9 December 2020 at 08:26:35 UTC, Patrick Schluter 
wrote:
 On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh wrote:
[...]
The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function; it is a buffer function, and a mem*/buf* prefix would have gone a long way toward avoiding its misuse as a string function. Beyond its truncation feature, it has a second behavior that most people do not know about and that makes it definitely different from the string functions: it overwrites the whole buffer with 0 to the end of it, often making it a performance hog: [...]
Simplest implementation of strncpy:

char *strncpy(char *dest, const char *src, size_t n)
{
    memset(dest, 0, n);
    memcpy(dest, src, min(strlen(src), n));
}

Checking the man page on Linux, it perpetuates the error: strncpy() is grouped with strcpy(), which is wrong imo. As my implementation above shows, strncpy() is semantically closer to memcpy() than to strcpy().
Dec 09 2020
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 9 December 2020 at 08:52:10 UTC, Patrick Schluter 
wrote:
 On Wednesday, 9 December 2020 at 08:26:35 UTC, Patrick Schluter 
 wrote:
 On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh 
 wrote:
[...]
The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function; it is a buffer function, and a mem*/buf* prefix would have gone a long way toward avoiding its misuse as a string function. Beyond its truncation feature, it has a second behavior that most people do not know about and that makes it definitely different from the string functions: it overwrites the whole buffer with 0 to the end of it, often making it a performance hog: [...]
 Simplest implementation of strncpy:

 char *strncpy(char *dest, const char *src, size_t n)
 {
     memset(dest, 0, n);
     memcpy(dest, src, min(strlen(src), n));
return memcpy(dest, src, min(strlen(src), n)); obviously
     }

 Checking the man on Linux does perpetuate the error. strncpy() 
 is joined with strcpy(), which is wrong imo. As my 
 implementation above shows, strncpy() is semantically closer to 
 memcpy() than to strcpy().
Dec 09 2020
prev sibling next sibling parent reply TheGag96 <thegag96 gmail.com> writes:
On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto wrote:
 Yet another proof why system languages like D are required to 
 take over OS development.

 https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
How many times is this going to happen before people realize it's almost immoral to even continue writing things in C++? I saw a tweet a few days ago where someone pointed out rightly that C++ is not the future because of its memory unsafety, and there were some people trying to tell him he was wrong or being too harsh or that it's just cool to come up with reasons to hate C++ these days. Who would have thought he could be vindicated mere days later.
Dec 02 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Dec 03, 2020 at 01:36:15AM +0000, TheGag96 via Digitalmars-d wrote:
 On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto wrote:
 Yet another proof why system languages like D are required to take
 over OS development.
 
 https://googleprojectzero.blogspot.com/2020/12/an-ios-zero-click-radio-proximity.html
How many times is this going to happen before people realize it's almost immoral to even continue writing things in C++? I saw a tweet a few days ago where someone pointed out rightly that C++ is not the future because of its memory unsafety, and there were some people trying to tell him he was wrong or being too harsh or that it's just cool to come up with reasons to hate C++ these days. Who would have thought he could be vindicated mere days later.
Time will eventually prove who's right. It may not come as fast as we'd hope, but eventually, the weight of evidence against memory-unsafe languages will become so overwhelming that it would speak for itself, we wouldn't need to argue over it anymore. T -- The best way to destroy a cause is to defend it poorly.
Dec 02 2020
prev sibling parent reply IGotD- <nise nise.com> writes:
On Thursday, 3 December 2020 at 01:36:15 UTC, TheGag96 wrote:
 How many times is this going to happen before people realize 
 it's almost immoral to even continue writing things in C++? I 
 saw a tweet a few days ago where someone pointed out rightly 
 that C++ is not the future because of its memory unsafety, and 
 there were some people trying to tell him he was wrong or being 
 too harsh or that it's just cool to come up with reasons to 
 hate C++ these days. Who would have thought he could be 
 vindicated mere days later.
Kernel programming is not the same as application programming. In kernel programming you do so many tricks and quirks that you must operate outside what language designers consider safe. Sure, bounds checking helps, as does getting rid of those stupid zero-terminated strings, but a kernel written in a 100% safe language is just fantasy. Take a look at the Linux page structure (struct page). https://elixir.bootlin.com/linux/latest/source/include/linux/mm_types.h It's a structure full of unions, and we also have pointers to other structures, some forming a linked list. There is no way you can program this with safe code, and the structure requires this layout because its size is of high importance. Safe kernel programming? Just forget it. It's just that C has a bit too few features that can help reduce bugs.
Dec 02 2020
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Thursday, 3 December 2020 at 07:28:09 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 01:36:15 UTC, TheGag96 wrote:
 [...]
Kernel programming is not the same as application programming. In kernel programming you do so many tricks and quirks that you must operate outside what language designers consider safe. Sure, bounds checking helps, as does getting rid of those stupid zero-terminated strings, but a kernel written in a 100% safe language is just fantasy. [...]
F-Secure apparently is living in a phantasy world, https://www.f-secure.com/en/consulting/foundry/usb-armory
Dec 03 2020
parent reply IGotD- <nise nise.com> writes:
On Thursday, 3 December 2020 at 09:04:56 UTC, Paulo Pinto wrote:
 F-Secure apparently is living in a phantasy world, 
 https://www.f-secure.com/en/consulting/foundry/usb-armory
What does this product have to do with safe and unsafe language designs? On Thursday, 3 December 2020 at 09:16:06 UTC, Paulo Pinto wrote:
 Ah, and NVidia is going phantasy land as well, 
 https://blogs.nvidia.com/blog/2019/02/05/adacore-secure-autonomous-driving/
The answer is still no: you cannot write a kernel in a safe language without breaking outside the safety features. You can write a kernel in Rust, but many portions must be in unsafe mode; for example, unions are not safe in Rust. Ada and SPARK are likely to be good languages if you want to create stable SW. However, like I wrote before, in order to write a kernel you're likely to step outside the safety harness in these languages as well. Kernel SW is inherently unsafe.
Dec 03 2020
parent reply Max Haughton <maxhaton gmail.com> writes:
On Thursday, 3 December 2020 at 10:07:19 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 09:04:56 UTC, Paulo Pinto wrote:
 F-Secure apparently is living in a phantasy world, 
 https://www.f-secure.com/en/consulting/foundry/usb-armory
What does this product have to do with safe and unsafe language designs? On Thursday, 3 December 2020 at 09:16:06 UTC, Paulo Pinto wrote:
 Ah, and NVidia is going phantasy land as well, 
 https://blogs.nvidia.com/blog/2019/02/05/adacore-secure-autonomous-driving/
The answer is still no: you cannot write a kernel in a safe language without breaking outside the safety features. You can write a kernel in Rust, but many portions must be in unsafe mode; for example, unions are not safe in Rust. Ada and SPARK are likely to be good languages if you want to create stable SW. However, like I wrote before, in order to write a kernel you're likely to step outside the safety harness in these languages as well. Kernel SW is inherently unsafe.
Just because you have to take some liberties with the safety features doesn't mean you have to avoid them entirely. If you look at the Linux kernel, a huge amount of the code could nearly be in userspace when you take into account how it actually works. For example, implementing a high level system call like perf_event_open is practically doable as a library with the right msr set. With a safe language you can carefully contain and test unsafe constructs, with C you don't even have the option of doing that.
Dec 03 2020
parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Thursday, 3 December 2020 at 10:12:35 UTC, Max Haughton wrote:
 On Thursday, 3 December 2020 at 10:07:19 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 09:04:56 UTC, Paulo Pinto 
 wrote:
 F-Secure apparently is living in a phantasy world, 
 https://www.f-secure.com/en/consulting/foundry/usb-armory
What does this product have to do with safe and unsafe language designs?
The whole software stack is 100% written in TamaGo (bare metal Go for ARM SoCs), which obviously you didn't bother to read about on the product page.
 On Thursday, 3 December 2020 at 09:16:06 UTC, Paulo Pinto 
 wrote:
 Ah, and NVidia is going phantasy land as well, 
 https://blogs.nvidia.com/blog/2019/02/05/adacore-secure-autonomous-driving/
The answer is still no: you cannot write a kernel in a safe language without breaking outside the safety features. You can write a kernel in Rust, but many portions must be in unsafe mode; for example, unions are not safe in Rust. Ada and SPARK are likely to be good languages if you want to create stable SW. However, like I wrote before, in order to write a kernel you're likely to step outside the safety harness in these languages as well. Kernel SW is inherently unsafe.
Just because you have to take some liberties with the safety features doesn't mean you have to avoid them entirely. If you look at the Linux kernel, a huge amount of the code could nearly be in userspace when you take into account how it actually works. For example, implementing a high level system call like perf_event_open is practically doable as a library with the right msr set. With a safe language you can carefully contain and test unsafe constructs, with C you don't even have the option of doing that.
Hence why Go has an unsafe package for breaking outside the safety features. Ironically, what I get from these kinds of replies is that companies like F-Secure or PTC/Aicas/Astrobe (*bare metal* Java/Oberon) deliver, whereas here one keeps arguing how bad GC is and that it is time to reboot the whole D language for an imaginary crowd to finally adopt it. In the end, maybe those languages are the ones being adopted. https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2020_06_SSTIC/SSTIC2020%20-%20Pursuing%20Durably%20Safe%20Systems%20Software.pdf
Dec 03 2020
next sibling parent reply IGotD- <nise nise.com> writes:
On Thursday, 3 December 2020 at 10:36:45 UTC, Paulo Pinto wrote:
 The whole software stack is 100% written in TamaGo (bare metal 
 Go for ARM SoCs), which obviously you didn't bother to read 
 about on the product page.
The product page doesn't say anything about that, just a link to the TamaGo github page in "related resources". However, it does say that it supports running Linux, and its connectivity options suggest that this is the intended use for this product.
Dec 03 2020
parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Thursday, 3 December 2020 at 10:51:09 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 10:36:45 UTC, Paulo Pinto wrote:
 The whole software stack is 100% written in TamaGo (bare metal 
 Go for ARM SoCs), which obviously you didn't bother to read 
 about on the product page.
The product page doesn't say anything about that, just a link to the TamaGo github page in "related resources". However, it does say that it supports running Linux and its connectivity options suggests that this is the intended use for this product.
A link that you obviously didn't follow. No, it doesn't run Linux at all. Had you followed the link, https://github.com/f-secure-foundry/tamago
 The project spawns from the desire of reducing the attack 
 surface of embedded systems firmware by removing any runtime 
 dependency on C code and Operating Systems.
And better yet, were you actually curious to learn about TamaGo, you would eventually land on this CCC talk. "TamaGo - bare metal Go framework for ARM SoCs. Reducing the attack surface with pure embedded Go. https://media.ccc.de/v/36c3-10597-tamago_-_bare_metal_go_framework_for_arm_socs However I see I keep wasting my time on D forums.
Dec 03 2020
parent reply IGotD- <nise nise.com> writes:
On Thursday, 3 December 2020 at 14:08:45 UTC, Paulo Pinto wrote:
 "TamaGo - bare metal Go framework for ARM SoCs.
 Reducing the attack surface with pure embedded Go.

 https://media.ccc.de/v/36c3-10597-tamago_-_bare_metal_go_framework_for_arm_socs

 However I see I keep wasting my time on D forums.
You can go and do programming in TamaGo anytime rather than waste your time here. Go is similar to D in that it requires a runtime in order to have most useful features. It would be interesting to know how big this runtime is. This is also not really a kernel but a bare-metal framework + app support, which can have interesting applications. What I have seen is that D + runtime is too big for any kernel right now.
Dec 03 2020
parent reply Abdulhaq <alynch4047 gmail.com> writes:
On Thursday, 3 December 2020 at 14:53:10 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 14:08:45 UTC, Paulo Pinto wrote:
 "TamaGo - bare metal Go framework for ARM SoCs.
 Reducing the attack surface with pure embedded Go.

 https://media.ccc.de/v/36c3-10597-tamago_-_bare_metal_go_framework_for_arm_socs

 However I see I keep wasting my time on D forums.
You can go and do programming in TamaGo anytime rather than waste your time here. Go is similar to D in that it requires a runtime in order to have most useful features. It would be interesting to know how big this runtime is. This is also not really a kernel but a bare-metal framework + app support, which can have interesting applications. What I have seen is that D + runtime is too big for any kernel right now.
Mmmm, human lives are more and more becoming dependent on the safe and reliable operation of hardware and software. "Small" bugs can cause a cascade of downstream hazards to thousands of people. "My programming language must have unions" and "I really really want the kernel to be small" will eventually not be acceptable excuses for non-maximally-safe software, whether it's a kernel or not.
Dec 03 2020
parent David Gileadi <gileadisNOSPM gmail.com> writes:
On 12/3/20 9:52 AM, Abdulhaq wrote:
 On Thursday, 3 December 2020 at 14:53:10 UTC, IGotD- wrote:
 What I have seen is that D + runtime is too big for any kernel right now.
Mmmm, human lives are more and more becoming dependent on the safe and reliable operation of hardware and software. "Small" bugs can cause a cascade of downstream hazards to thousands of people. "My programming language must have unions" and "I really really want the kernel to be small" will eventually not be acceptable excuses for non-maximally-safe software, whether it's a kernel or not.
If enough programmers band together and demand change, perhaps unions will end up killing unions ;)
Dec 03 2020
prev sibling next sibling parent aberba <karabutaworld gmail.com> writes:
On Thursday, 3 December 2020 at 10:36:45 UTC, Paulo Pinto wrote:
 On Thursday, 3 December 2020 at 10:12:35 UTC, Max Haughton 
 wrote:
 [...]
The whole software stack is 100% written in TamaGo (bare metal Go for ARM SoCs), which obviously you didn't bother to read about on the product page.
 [...]
Hence why Go has unsafe package for breaking outside the safety features. Ironically, what I get from these kind of replies is that companies like F-Secure or PTC/Aicas/Astrobe (*bare metal* Java/Oberon) deliver, whereas here one keeps arguing how bad GC is and it is time to reboot the whole D language for an imaginary crowd to finally adopt it. In the end, maybe those languages are the ones being adopted.
"Don't put the Geeks in charge of management" 😁 BTW, that's pretty deep. There are so many unexplored territories in D but we sometimes keep ourselves hooked to the imaginary ones.
 https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2020_06_SSTIC/SSTIC2020%20-%20Pursuing%20Durably%20Safe%20Systems%20Software.pdf
Dec 05 2020
prev sibling parent reply Testle <turtle.testle.junior gmail.com> writes:
On Thursday, 3 December 2020 at 10:36:45 UTC, Paulo Pinto wrote:
 On Thursday, 3 December 2020 at 10:12:35 UTC, Max Haughton 
 wrote:
 On Thursday, 3 December 2020 at 10:07:19 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 09:04:56 UTC, Paulo Pinto 
 wrote:
 F-Secure apparently is living in a phantasy world, 
 https://www.f-secure.com/en/consulting/foundry/usb-armory
What does this product have to do with safe and unsafe language designs?
The whole software stack is 100% written in TamaGo (bare metal Go for ARM SoCs), which obviously you didn't bother to read about on the product page.
It is also severely limited. It looks like it is designed to be used for purpose-built tasks, not as a general-use operating system. It works because all the complexity of an OS like Linux isn't implemented at all. This is done intentionally. You are comparing apples to oranges and then saying you don't understand why the people talking about apples don't like your oranges. I don't think any amount of rebooting D will make it more popular with people that are flocking to Rust. D proposed a system that didn't get traction; another language came along, proposed something different, and did get traction. There are many problems with D as it stands. I use it for smaller hobby projects, but it would be a nightmare to use in a larger project that needs to be maintained. Part of that problem is with how the project is managed. Sure, everyone has their own opinions on this, and that's one of the reasons this problem keeps getting brought up. Problems users face aren't taken seriously; it seems management is more concerned with following trends like implementing borrow ownership than addressing actual problems users have brought up. The @safe-by-default DIP is a good example of that. Rather than listening to the outcry from the community to change one thing, the entire thing was just shut down.
Dec 05 2020
parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Saturday, 5 December 2020 at 19:32:53 UTC, Testle wrote:
 [discussion including criticism of how "management" dropped DIP 
 1028 completely rather than amending it to remove the wildly 
 unpopular "C is @safe" verbiage]
Even though I love working on projects that should operate near HW limits, I would like to see defaults favor newcomers and prototyping (@safe, throw, gc). 1028 without the "C is @safe" falsehood could work.
Dec 05 2020
parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 6 December 2020 at 04:52:19 UTC, Bruce Carneal wrote:
 Even though I love working on projects that should operate near 
 HW limits, I would like to see defaults favor newcomers and 
 prototyping (@safe, throw, gc).

 1028 without the "C is @safe" falsehood could work.
Consider a module with `@safe:` or `@trusted:` at top. The problem with the rule "external C functions can't be @safe" or "can't be @safe by default" is that you do not know why the annotation is at the top of the module. It could be because it has been reviewed, or it could be just to get functions calling it to quickly work. Theoretically, if C code is considered `@safe` by default, you can tell them from each other. No attributes: `@system`, but considered `@safe` for now, meaning you want to review the module in the near future. `@safe` or `@trusted`: someone has checked the module, nothing to worry about. I don't think this is a good argument, because a comment about whether a review is done is so easy and recommended to add in any case. But then again, it should not make much difference. It is still better to greenwash the C headers as @safe (with an appropriate comment about the hack) and use them from `@safe` code than to mark all your code `@system`. At least you're checking the D code. So I think that the issue has been exaggerated. Either way, it'd be progress from the current language.
Dec 08 2020
parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 8 December 2020 at 22:52:29 UTC, Dukc wrote:
 On Sunday, 6 December 2020 at 04:52:19 UTC, Bruce Carneal wrote:
 Even though I love working on projects that should operate 
 near HW limits, I would like to see defaults favor newcomers 
 and prototyping (@safe, throw, gc).

 1028 without the "C is @safe" falsehood could work.
Consider a module with `@safe:` or `@trusted:` at top. The problem with the rule "external C functions can't be @safe" or "can't be @safe by default" is that you do not know why the annotation is at the top of the module. It could be because it has been reviewed, or it could be just to get functions calling it to quickly work. Theoretically, if C code is considered `@safe` by default, you can tell them from each other. No attributes: `@system`, but considered `@safe` for now, meaning you want to review the module in the near future. `@safe` or `@trusted`: someone has checked the module, nothing to worry about.
The problem with this is that there is existing *correct* D code that relies on "no attributes" meaning @system, and which would silently become incorrectly-annotated if the default were changed. For example, there are many external @system functions in the D runtime that do not have an explicit @system annotation. You flip the switch, your tests pass, and then months or years later, you discover that a memory-corruption bug has snuck its way into your @safe code, because no one ever got around to putting an explicit @system annotation on some external function deep in one of your dependencies. How would you react? Personally, I'd jump ship to Rust and never look back. Of course, you can try to argue that it's the fault of the library maintainer for not realizing that they need to re-review all of their external function declarations--but why should they have to, when the compiler can just as easily flag those functions automatically? Isn't the whole reason we have automatic memory-safety checks in the first place to *avoid* relying on programmer discipline for this kind of thing?
Dec 08 2020
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Dec 09, 2020 at 01:23:37AM +0000, Paul Backus via Digitalmars-d wrote:
 On Tuesday, 8 December 2020 at 22:52:29 UTC, Dukc wrote:
[...]
 Consider a module with `@safe:` or `@trusted:` at top. The problem
 with the rule "external C functions can't be @safe" or "can't be
 @safe by default" is that you do not know why the annotation is at
 the top of the module. It could be because it has been reviewed, or it
 could be just to get functions calling it to quickly work.
IMO, if the module was reviewed, individual functions would be annotated. I would not trust a "review" that simply slaps a blanket @trusted on the top of the file. Individually-annotated functions increase my confidence that somebody has at least put in the effort to look over each function.
 Theoretically, if C code is considered `@safe` by default, you can
 tell them from each other. No attributes: `@system`, but considered
 `@safe` for now, meaning you want to review the module in the near
 future.  `@safe` or `@trusted`: someone has checked the module,
 nothing to worry about.
The problem with this is that there is existing *correct* D code that relies on "no attributes" meaning @system, and which would silently become incorrectly-annotated if the default were changed. For example, there are many external @system functions in the D runtime that do not have an explicit @system annotation.
Yeah, this is one of the main reasons there was a big backlash against that DIP. [...]
 Of course, you can try to argue that it's the fault of the library
 maintainer for not realizing that they need to re-review all of their
 external function declarations--but why should they have to, when the
 compiler can just as easily flag those functions automatically? Isn't
 the whole reason we have automatic memory-safety checks in the first
 place to *avoid* relying on programmer discipline for this kind of
 thing?
Yeah, D's mantra all these years has always been automatic verification rather than programming by convention. This DIP undermines that principle. T -- Guns don't kill people. Bullets do.
Dec 08 2020
prev sibling next sibling parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Wednesday, 9 December 2020 at 01:23:37 UTC, Paul Backus wrote:
 The problem with this is that there is existing *correct* D 
 code that relies on "no attributes" meaning @system, and which 
 would silently become incorrectly-annotated if the default were 
 changed. For example, there are many external @system functions 
 in the D runtime that do not have an explicit @system 
 annotation.
IIUC, such functions in existing .o files and libs would not be designated (mangled) @safe so I'd expect linker errors, not silence. New compilations will have the source body and will, of course, reject non-@safe code so, again, not silent. What have I misunderstood? What is the "silent" problem? Is there some transitive issue? Note: @safe designation should be part of the external mangle of any future defaulted-and-verified-@safe function. I don't see how it works otherwise.
 You flip the switch, your tests pass, and then months or years 
 later, you discover that a memory-corruption bug has snuck its 
 way into your @safe code, because no one ever got around to 
 putting an explicit @system annotation on some external 
 function deep in one of your dependencies. How would you react? 
 Personally, I'd jump ship to Rust and never look back.
How do your tests pass? How does the code even compile? If the default moves from lax (@system) to strict (@safe) I see how a lot of code that formerly compiled would stop compiling/linking, an ongoing concern were the DIP edited and re-introduced, but I don't see how you get bugs "sneaking" in or lying dormant. Absent explicit greenwashing by the programmer, how do the bugs sneak in?
 Of course, you can try to argue that it's the fault of the 
 library maintainer for not realizing that they need to 
 re-review all of their external function declarations--but why 
 should they have to, when the compiler can just as easily flag 
 those functions automatically? Isn't the whole reason we have 
 automatic memory-safety checks in the first place to *avoid* 
 relying on programmer discipline for this kind of thing?
Well, @safe by default is about as automatic/not-relying-on-discipline as it gets. Unless annotated otherwise, all functions with source are flagged at compile time if not verified @safe. Extern declarations against old object files and libs should flag errors at link time. Again I feel that I must be missing something. What "programmer discipline" are you referring to?
Dec 09 2020
parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal 
wrote:
 IIUC, such functions in existing .o files and libs would not be 
 designated (mangled) @safe so I'd expect linker errors, not 
 silence.  New compilations will have the source body and will, 
 of course, reject non-@safe code so, again, not silent. What 
 have I misunderstood?  What is the "silent" problem?  Is there 
 some transitive issue?

 Note: @safe designation should be part of the external mangle 
 of any future defaulted-and-verified-@safe function.  I don't 
 see how it works otherwise.
This does not work for extern(C) functions because their names are not mangled.
Dec 09 2020
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09.12.20 14:06, Paul Backus wrote:
 On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal wrote:
 IIUC, such functions in existing .o files and libs would not be 
 designated (mangled) @safe so I'd expect linker errors, not silence.  
 New compilations will have the source body and will, of course, reject 
 non-@safe code so, again, not silent. What have I misunderstood?  What 
 is the "silent" problem?  Is there some transitive issue?

 Note: @safe designation should be part of the external mangle of any 
 future defaulted-and-verified-@safe function.  I don't see how it 
 works otherwise.
This does not work for extern(C) functions because their names are not mangled.
It does not even work for extern(D) functions because their return types are not mangled.
Dec 09 2020
parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Wednesday, 9 December 2020 at 13:28:14 UTC, Timon Gehr wrote:
 On 09.12.20 14:06, Paul Backus wrote:
 On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal 
 wrote:
 [...]
This does not work for extern(C) functions because their names are not mangled.
It does not even work for extern(D) functions because their return types are not mangled.
I did not know this. If we lose information when we compile separately it's over. If you can't close the proof you've got no proof.
Dec 09 2020
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Dec 09, 2020 at 04:02:46PM +0000, Bruce Carneal via Digitalmars-d wrote:
 On Wednesday, 9 December 2020 at 13:28:14 UTC, Timon Gehr wrote:
 On 09.12.20 14:06, Paul Backus wrote:
 On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal wrote:
 [...]
This does not work for extern(C) functions because their names are not mangled.
It does not even work for extern(D) functions because their return types are not mangled.
I did not know this. If we lose information when we compile separately it's over. If you can't close the proof you've got no proof.
This has nothing to do with separate compilation. The C ABI simply does not encode parameter or return types, let alone function attributes. The D ABI does not encode return type information, because doing so would imply overloading by return type, which is not possible in D. You cannot "prove" anything in the way of proving it down to the raw machine level. At some point, you have to depend on the consistency of the layer below you. Just because you mangle return types does not prevent lower-level breakage of your proof. For example, you can declare a @safe function at the D language level, but link it to a C function that just happens to have the same name as the mangled D name, and you're free to make this C function do whatever you want, and there's nothing the compiler can do to enforce anything. Just because the mangled name says it's @safe does not guarantee it's safe. Even if you somehow require the D compiler to take over the entire process down to the executable, bypassing the system linker, that still guarantees nothing. I can use a hex editor to modify the executable after the fact to break the @safe-ty proof, and there's nothing the compiler or the language can do about it. Even if you somehow "protect" your executable, I can run it on a modified OS that edits the program in memory while it's executing, and there's nothing the language can do about it either. Basically, the automatic verification of @safe, etc., relies on the consistency of the lower-level layers. It does not "prove" anything in the absolute sense. You might be able to do that if you invent your own hardware from the ground up, starting from the transistor level. But you can't do that with a programming language that's intended to run on a large variety of preexisting systems. Just like in mathematical proofs, you can only prove down to the axioms; you cannot prove the axioms themselves. If the axioms are violated, your proof collapses.
The guarantees of @safe only hold if certain assumptions about the lower-level layers hold; however, there is nothing you can do to guarantee that. You cannot prove your axioms, you can only assume them. T -- I see that you JS got Bach.
Dec 09 2020
parent Bruce Carneal <bcarneal gmail.com> writes:
On Wednesday, 9 December 2020 at 16:34:28 UTC, H. S. Teoh wrote:
 On Wed, Dec 09, 2020 at 04:02:46PM +0000, Bruce Carneal via 
 Digitalmars-d wrote:
 On Wednesday, 9 December 2020 at 13:28:14 UTC, Timon Gehr 
 wrote:
 On 09.12.20 14:06, Paul Backus wrote:
 On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce 
 Carneal wrote:
 [...]
This does not work for extern(C) functions because their names are not mangled.
It does not even work for extern(D) functions because their return types are not mangled.
I did not know this. If we lose information when we compile separately it's over. If you can't close the proof you've got no proof.
This has nothing to do with separate compilation. The C ABI simply does not encode parameter or return types, needless to say function attributes. The D ABI does not encode return type information, because doing so would imply overloading by return type, which is not possible in D.
If you compile from source, @safe can be enforced because the compiler has all the information it needs. However things are represented in .o files and libs, if that representation does not give you the necessary information then you can't enforce @safe mechanically.
 You cannot "prove" anything in the way of proving it down to 
 the raw machine level.  At some point, you have to depend on 
 the consistency of the layer below you.  Just because you 
 mangle return types, does not prevent lower-level breakage of 
 your proof.  For example, you can declare a @safe function at 
 the D language level, but link it to a C function that just 
 happens to have the same name as the mangled D name, and you're 
 free to make this C function do whatever you want, and there's 
 nothing the compiler can do to enforce anything.  Just because 
 the mangled name says it's @safe, does not guarantee it's safe.
I agree. If your information is probabilistic then you do not have a hard proof.
 Even if you somehow require the D compiler to take over the 
 entire process down to the executable, bypassing the system 
 linker, that still guarantees nothing.  I can use a hex editor 
 to modify the executable after-the-fact to break the @safe-ty 
 proof, and there's nothing the compiler or the language can do 
 about it.  Even if you somehow "protect" your executable, I can 
 run it on a modified OS that edits the program in-memory while 
 it's executing, and there's nothing the language can do about 
 it either.
No one can prove anything about a system that is modified after it leaves the proof domain. Does anyone believe otherwise?
 Basically, the automatic verification of @safe, etc., relies on 
 the consistency of the lower-level layers. It does not "prove" 
 anything in the absolute sense.  You might be able to do that 
 if you invent your own hardware from ground up, starting from 
 the transistor level.  But you can't do that with a programming 
 language that's intended to run on a large variety of 
 preexisting systems.  Just like in mathematical proofs, you can 
 only prove down to the axioms, you cannot prove the axioms 
 themselves.  If the axioms are violated, your proof collapses.
Does anyone think otherwise?
 The guarantees of @safe only hold if certain assumptions about 
 the lower-level layers hold; however, there is nothing you can 
 do to guarantee that.  You cannot prove your axioms, you can 
 only assume them.
Again, I don't see this as an issue. As you note, there is nothing any proof system can do when the "axioms" are violated. We're not trying to prove that all hardware is correctly implemented, for example, or that it can withstand cosmic rays or a power spike, or, for that matter, that an arbitrarily patched binary has any properties at all.

I agree that, even within the compiled-language domain, "proof" is probably better thought of as aspirational rather than literal. There may be errors in the "proof": compiler implementation errors, language-definition ambiguities/errors that were "correctly" implemented, and probably other modes of failure that I've not thought of. However, I do not agree that because we cannot prove everything we should stop trying to prove anything.

Finally, thanks for bringing up some of the hazards of using the word "proof" with regard to computer systems.
 T
Dec 09 2020
prev sibling parent reply Dukc <ajieskola@gmail.com> writes:
On Wednesday, 9 December 2020 at 01:23:37 UTC, Paul Backus wrote:
 The problem with this is that there is existing *correct* D 
 code that relies on "no attributes" meaning @system, and which 
 would silently become incorrectly-annotated if the default were 
 changed. For example, there are many external @system functions 
 in the D runtime that do not have an explicit @system 
 annotation.

 You flip the switch, your tests pass, and then months or years 
 later, you discover that a memory-corruption bug has snuck its 
 way into your @safe code, because no one ever got around to 
 putting an explicit @system annotation on some external 
 function deep in one of your dependencies. How would you react? 
 Personally, I'd jump ship to Rust and never look back.
Yes, this would be a problem, but I believe it'd be less of a problem than you think. If you use some third-party library, you need to think twice about how much you trust its `@safe`ty in any case. At least if you're that strict about it. No annotations anywhere is about the easiest thing to spot when considering that.
 Of course, you can try to argue that it's the fault of the 
 library maintainer for not realizing that they need to 
 re-review all of their external function declarations--but why 
 should they have to, when the compiler can just as easily flag 
 those functions automatically?
It might have some benefit: If non-annotated C libraries are considered `@safe`, it'll mean that not-so-quality code is using compromised `@safe`. Bad. But if they are considered `@system`, not-so-quality code will not be using `@safe` AT ALL. Even worse.

Now, I understand there is a drawback for higher-quality code. You have to either copy-paste the C library header and add `@system:` to the top of it, or make a module that wraps the header as `@system`. That's more work than just importing the C header, and thus will result in more greenwashed headers than C headers being `@system` by default.

Also, it sure sucks that the compiler would do the wrong thing by default, but would the pragmatic downsides be even worse for the Common Sense option? I don't know, but I'm saying that 1: it's a judgement call, not anything absolute, and 2: one could live with either way, and either would be far from making `@safe` meaningless.
Dec 09 2020
next sibling parent reply Timon Gehr <timon.gehr@gmx.ch> writes:
On 09.12.20 12:46, Dukc wrote:
 
 It might have some benefit: If non-annotated C libraries are considered 
 `@safe`, it'll mean that not-so-quality code is using compromised 
 `@safe`. Bad. But if they are considered `@system`, not-so-quality code 
 will not be using `@safe` AT ALL. Even worse.
That's a bit like saying it's bad if products produced using slave labour don't get a fair trade label.

Anyway, extern(C) code that may corrupt memory is not even necessarily buggy; some C functions just have an unsafe interface. If you want the `@safe` checks, make small `@trusted` wrappers around those functions so that the interface becomes safe, and explicitly annotate those extern(C) functions whose interface you think is safe with `@trusted`. And if you don't care about `@safe`, that's fine too. If someone wants to use your library from `@safe` code they can add the required annotations themselves and send you a pull request.
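A minimal sketch of that wrapper pattern (hypothetical helper name; `atoi` is just a convenient libc example): the `extern(C)` function keeps its unsafe `@system` interface, and a small `@trusted` function exposes a safe one by taking a D string instead of a raw pointer.

```d
import std.string : toStringz;

// The raw C interface: a bare pointer, no length, @system.
extern(C) int atoi(scope const(char)* s) nothrow @nogc @system;

// The @trusted bridge: the *interface* (a D string) is safe, and
// the body upholds atoi's precondition by NUL-terminating the input.
int parseInt(string s) @trusted
{
    return atoi(s.toStringz);
}

void main() @safe
{
    // @safe code can now use the C function without touching pointers.
    assert(parseInt("42") == 42);
}
```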
 
 
 Also it sure sucks that the compiler would do the wrong thing by default, but
would the pragmatic downsides be even worse for the Common Sense option? I
don't know, But I'm saying that 1: it's a judgement call, not anything
absolute, 2: either way one could live with, and would be far from making
`@safe` meaningless. 
I'm not really willing to debate the pragmatic upsides of encouraging dishonesty in a modular verification context. There are none.
Dec 09 2020
parent reply Dukc <ajieskola@gmail.com> writes:
On Wednesday, 9 December 2020 at 13:25:49 UTC, Timon Gehr wrote:
 On 09.12.20 12:46, Dukc wrote:
 
 It might have some benefit: If non-annotated C libraries are 
 considered `@safe`, it'll mean that not-so-quality code is 
 using compromised `@safe`. Bad. But if they are considered 
 `@system`, not-so-quality code will not be using `@safe` AT 
 ALL. Even worse.
That's a bit like saying it's bad if products produced using slave labour don't get a fair trade label.
You're thinking of `@safe` as a certificate. It can definitely help in certifying reviews, but it's also supposed to be a tool to catch mistakes, for all code, not just for code that wants to certify. That it won't catch mistakes from using the C code does not prevent it from catching other, unrelated mistakes. That's still better than nothing, as long as we don't pretend that the C headers are certified. One can still add a comment to describe why the code is annotated `@safe` or `@trusted`.
Dec 09 2020
parent reply Timon Gehr <timon.gehr@gmx.ch> writes:
On 09.12.20 16:42, Dukc wrote:
 On Wednesday, 9 December 2020 at 13:25:49 UTC, Timon Gehr wrote:
 On 09.12.20 12:46, Dukc wrote:
 It might have some benefit: If non-annotated C libraries are 
 considered `@safe`, it'll mean that not-so-quality code is using 
 compromised `@safe`. Bad. But if they are considered `@system`, 
 not-so-quality code will not be using `@safe` AT ALL. Even worse.
That's a bit like saying it's bad if products produced using slave labour don't get a fair trade label.
You're thinking of `@safe` as a certificate.
It is, that's why it's on the function signature and causes function type incompatibility. `@safe` has an actual modular meaning that is communicated to the caller; it's not supposed to be just a collection of lint heuristics. It's supposed to contain all memory-safety errors within `@trusted` functions.
 It can definitely help in 
 doing certifying reviews, but it's also supposed to be a tool to catch 
 mistakes - for all code,
No, this feature does not exist currently, it is not the way `@safe` has been advertised, and separating code into `@system`/`@trusted`/`@safe` would make no sense if it was. You'd just want different strictness levels with no modular guarantees at all for most of them. If that's useful to you, feel free to advocate for it, but this is not what `@safe` is.
 not just for code that wants to certify. That 
 it won't catch mistakes from using the C code does not prevent it from 
 catching other unrelated mistakes. That's still better than nothing if 
 we don't pretend that the C headers are certified.
 
 One can still add a comment to describe why the code is annotated 
 `@safe` or `@trusted`.
 
You can't have the documentation state one thing and silently start practicing the opposite. It's a plain, corrupt lie. Why is this not obvious? You'll be right back in fairy-tale wonderland where good programmers don't make mistakes and everyone else does not matter.
Dec 09 2020
next sibling parent IGotD- <nise@nise.com> writes:
On Wednesday, 9 December 2020 at 18:20:48 UTC, Timon Gehr wrote:
[...]

Not to reply to your post in particular, but a general reply: I 
couldn't care less about this @safe lobbyism, it really doesn't 
do anything for me. @safe is just a limitation of the features in 
D, so it isn't really safe, or maybe it will be somewhere in the 
distant future. Rust started this safe nonsense but in reality 
your program is as safe as you make it; it's just a word that is 
used for marketing. Earlier languages didn't 
use that kind of marketing because programmers understood what 
they are about anyway.

Programming languages today seem to be victims of "Objects in 
mirror are closer than they appear" jargon. In reality you will 
not be safer because of that stupid sentence, which the rest of 
the world doesn't seem to need.

I don't really care what happens to the @safe DIP as long as I 
have an easy escape from it. If you want to be safe, don't do 
what I do, like changing the stack pointer in the middle of 
execution. That would be perfectly OK in @safe code with DIP 
1028, as changing the stack pointer is done via an assembler 
function. I don't really mind, because if I mess up, it's my fault.
Dec 09 2020
prev sibling parent reply Dukc <ajieskola@gmail.com> writes:
On Wednesday, 9 December 2020 at 18:20:48 UTC, Timon Gehr wrote:
 On 09.12.20 16:42, Dukc wrote:
 You're thinking of `@safe` as a certificate.
It is, that's why it's on the function signature and causes function type incompatibility. `@safe` has an actual modular meaning that is communicated to the caller; it's not supposed to be just a collection of lint heuristics. It's supposed to contain all memory-safety errors within `@trusted` functions.
I guess the stance on this is where the fundamental disagreement about C code's default `@safe`ty lies. If one considers, even in internal code, that `@safe` and `@trusted` are always declarations that calling the function can never corrupt memory unless there's an honest mistake in the implementation, your stance on this is perfectly sound. But as it can also work as a limited down-and-dirty bug-finder, I really think we should consider that a valid use case, while continuing to consider its use as a certifying aid equally valid.
Dec 09 2020
parent reply Paul Backus <snarwin@gmail.com> writes:
On Wednesday, 9 December 2020 at 20:30:57 UTC, Dukc wrote:
 I guess the stance on this is where the fundamental 
 disagreement about C code's default `@safe`ty lies. If one 
 considers, even in internal code, that `@safe` and `@trusted` 
 are always declarations that calling the function can never corrupt 
 memory unless there's an honest mistake in the implementation, 
 your stance on this is perfectly sound.
The meanings of @safe and @trusted are spelled out in the language spec. [1] They are not a matter of opinion. [1] https://dlang.org/spec/function.html#function-safety
 But as it can also work as a limited down-and-dirty bug-finder,
@safe does not find bugs; it prevents them.
Dec 09 2020
parent reply Dukc <ajieskola@gmail.com> writes:
On Wednesday, 9 December 2020 at 21:21:52 UTC, Paul Backus wrote:
 The meanings of @safe and @trusted are spelled out in the 
 language spec. [1] They are not a matter of opinion.

 [1] https://dlang.org/spec/function.html#function-safety
It seems so. So the greenwashing approach for partial `@safe`ty is not currently officially supported. I'll have to remember that. It can still be debated whether it should be, though.
 @safe does not find bugs; it prevents them.
Ah, that's better wording. It isn't really a bug if it does not pass compilation.
Dec 09 2020
parent Paul Backus <snarwin@gmail.com> writes:
On Wednesday, 9 December 2020 at 21:49:54 UTC, Dukc wrote:
 On Wednesday, 9 December 2020 at 21:21:52 UTC, Paul Backus 
 wrote:
  @safe does not find bugs; it prevents them.
Ah, that's better wording. It isn't really a bug if it does not pass compilation.
Well, also, the things that @safe flags are not actually bugs, but rather things that could *potentially* lead to bugs somewhere down the line. It is entirely possible to write code that has no memory-safety bugs, but is also not @safe.
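For instance (a made-up snippet, not from the post): the following function never corrupts memory for the inputs its contract admits, yet it cannot be `@safe`, because `@safe` bans the *operations* (taking `.ptr` of a slice, pointer arithmetic) wholesale rather than judging whether this particular use is correct.

```d
// Bug-free under its contract, but unavoidably @system:
// slice .ptr and pointer arithmetic are rejected in @safe code
// regardless of whether they are actually wrong here.
int sumFirstTwo(int[] a) @system
in (a.length >= 2)
{
    int* p = a.ptr;        // not allowed in @safe
    return *p + *(p + 1);  // pointer arithmetic: @system only
}

void main()
{
    assert(sumFirstTwo([3, 4]) == 7);
}
```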
Dec 09 2020
prev sibling parent reply Paul Backus <snarwin@gmail.com> writes:
On Wednesday, 9 December 2020 at 11:46:41 UTC, Dukc wrote:
 It might have some benefit: If non-annotated C libraries are 
 considered `@safe`, it'll mean that not-so-quality code is 
 using compromised `@safe`. Bad. But if they are considered 
 `@system`, not-so-quality code will not be using `@safe` AT 
 ALL. Even worse.
Using compromised @safe is much, much worse than not using @safe at all. Not using @safe at all means you still have the option of migrating to @safe in the future. If you're using compromised @safe, you have no migration path.
Dec 09 2020
parent reply Dukc <ajieskola@gmail.com> writes:
On Wednesday, 9 December 2020 at 14:05:44 UTC, Paul Backus wrote:
 Using compromised @safe is much, much worse than not using 
 @safe at all. Not using @safe at all means you still have the 
 option of migrating to @safe in the future. If you're using 
 compromised @safe, you have no migration path.
Yes you do, assuming you have documented appropriately. Look for comments saying "greenwashed as @trusted" or whatever, and recheck the annotations of those places.

Note that I meant internal usage. With library APIs you're right. Yeah, I guess that's another problem with Walter's DIP, but personally I think that if the APIs had been annotated within 6-12 months of the DIP's acceptance, the blame for the breakage would be on the library user. And if it took longer, the library was probably abandoned anyway.
Dec 09 2020
parent Paul Backus <snarwin@gmail.com> writes:
On Wednesday, 9 December 2020 at 15:56:18 UTC, Dukc wrote:
 Note that I meant internal usage. With library APIs you're 
 right. Yeah, I guess that's another problem with Walter's DIP, 
 but personally I think that if the APIs had been 
 annotated within 6-12 months of the DIP's acceptance, the blame 
 for the breakage would be on the library user. And if it took 
 longer, the library was probably abandoned anyway.
Ultimately, breakage is breakage. It doesn't really matter who the blame falls on; it's still a problem.
Dec 09 2020
prev sibling parent Paulo Pinto <pjmlp@progtools.org> writes:
On Thursday, 3 December 2020 at 07:28:09 UTC, IGotD- wrote:
 On Thursday, 3 December 2020 at 01:36:15 UTC, TheGag96 wrote:
 [...]
Kernel programming is not the same as application programming. In kernel programming you do so many tricks and quirks that you must operate outside what is considered safe by language designers. Sure, bounds checking helps, as does getting rid of those stupid zero-terminated strings, but a kernel written in a 100% safe language is just fantasy. [...]
Ah, and NVidia is going to fantasy land as well, https://blogs.nvidia.com/blog/2019/02/05/adacore-secure-autonomous-driving/
Dec 03 2020
prev sibling parent Elronnd <elronnd@elronnd.net> writes:
On Wednesday, 2 December 2020 at 11:07:54 UTC, Paulo Pinto wrote:
 system languages like D are required to take over OS 
 development.
Note that D isn't actually safe for kernel code, at least not with the current implementation of @safe (and it would require a fairly large overhaul to fix). https://forum.dlang.org/thread/qxxdmkhxydcpmdyhrpxd@forum.dlang.org
Dec 02 2020