
digitalmars.D - Fantastic exchange from DConf

Joakim <dlang joakim.fea.st> writes:
Walter Bright: I firmly believe that memory safety is gonna be an 
absolute requirement moving forward, very soon, for programming 
language selection.

Scott Meyers: For, for what kinds of applications?

Walter: Anything that goes on the internet.

Scott: Uh, let me just, sort of as background, given the 
remaining popularity of C, unbelievable popularity of C, which is 
far from a memory-safe language, do you think that that... I'm 
having trouble reconciling the ongoing popularity of C with the 
claim that you're making that this is going to be an absolute 
requirement for programming languages going forward.

Walter: I believe memory safety will kill C.

Scott: ... Wow.
https://www.youtube.com/watch?v=_gfwk-zRwmk#t=8h35m18s

The whole exchange starts with a question at the 8h:33m mark and 
goes on for about 13 mins, worth listening to.

I agree with Walter that safety will be big going forward, should 
have been big already.
May 05
qznc <qznc web.de> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 [...]
Hm, Sociomantic removes the live captures the next day? One request: Chop the panel discussion into one clip per question/topic, please. Alternatively, provide some means to easily jump to the start of each question.
May 06
Joakim <dlang joakim.fea.st> writes:
On Saturday, 6 May 2017 at 09:53:52 UTC, qznc wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 [...]
Hm, Sociomantic removes the live captures the next day? One request: Chop the panel discussion into one clip per question/topic, please. Alternatively, provide some means to easily jump to the start of each question.
Video of the exchange is now back up: https://www.youtube.com/watch?v=Lo6Q2vB9AAg#t=24m37s Question now starts at 22m:19s mark.
May 12
Nemanja Boric <4burgos gmail.com> writes:
On Friday, 12 May 2017 at 18:52:43 UTC, Joakim wrote:
 [...]
Video of the exchange is now back up: https://www.youtube.com/watch?v=Lo6Q2vB9AAg#t=24m37s Question now starts at 22m:19s mark.
Oh no, my accent is terrible! Time to stand in front of a mirror and rehearse :-). When I said "outside community pressure", I meant "trends", but didn't make it clear then :(.
May 12
Joakim <dlang joakim.fea.st> writes:
On Friday, 12 May 2017 at 18:52:43 UTC, Joakim wrote:
 [...]
Video of the exchange is now back up: https://www.youtube.com/watch?v=Lo6Q2vB9AAg#t=24m37s Question now starts at 22m:19s mark.
Hmm, this talk has become the most-viewed from this DConf, by far beating Scott's keynote. Wonder how, as this seems to be the only link to it, hasn't been posted on reddit/HN. I guess people like panels, the process panel last year is one of the most viewed videos also.
May 17
Walter Bright <newshound2 digitalmars.com> writes:
On 5/17/2017 3:21 AM, Joakim wrote:
 Hmm, this talk has become the most-viewed from this DConf, by far beating
 Scott's keynote.  Wonder how, as this seems to be the only link to it, hasn't
 been posted on reddit/HN.  I guess people like panels, the process panel last
 year is one of the most viewed videos also.
I received -2 net votes on Hackernews for suggesting that the takeaway from the WannaCry fiasco for developers should be to use memory safe languages. Maybe the larger community isn't punished enough yet.
May 17
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 17, 2017 at 01:41:43PM -0700, Walter Bright via Digitalmars-d wrote:
 [...]
I received -2 net votes on Hackernews for suggesting that the takeaway from the WannaCry fiasco for developers should be to use memory safe languages. Maybe the larger community isn't punished enough yet.
People aren't willing to accept that their cherished choice of language may have been the wrong one, especially if they have invested much of their lives in mastering said language.

Though from what I can tell, the WannaCry fiasco is more than merely a matter of memory safety; it also involves backdoors. Backdoors are always considered "safe" by the people who implement them, but unfortunately, time and time again history has proven that backdoors always get found by the wrong people, usually with disastrous consequences. Security by obscurity does not work, yet people continue to believe it does.

T

-- 
Long, long ago, the ancient Chinese invented a device that lets them see through walls. It was called the "window".
May 17
Walter Bright <newshound2 digitalmars.com> writes:
On 5/17/2017 1:46 PM, H. S. Teoh via Digitalmars-d wrote:
 People aren't willing to accept that their cherished choice of language
 may have been the wrong one, especially if they have invested much of
 their lives in mastering said language.
It may not be the developers that initiate this change. It'll be the managers and the customers who force the issue - as those are the people who'll pay the bill for the problems.
 Though from what I can tell, the WannaCry fiasco is more than merely a
 matter of memory safety;
It may very well be. But if memory safety is part of the problem, then it is part of the solution.
May 17
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 17, 2017 at 04:16:59PM -0700, Walter Bright via Digitalmars-d wrote:
 [...]
It may not be the developers that initiate this change. It'll be the managers and the customers who force the issue - as those are the people who'll pay the bill for the problems.
That may or may not force a shift to a different language. In fact, the odds are heavily stacked against a language change. Most management are concerned (and in many cases, rightly so) about the cost of rewriting decades-old "proven" software as opposed to merely plugging the holes in the existing software. As long as they have enough coders plugging away at the bugs, they're likely to be inclined to say "good enough".

The only way this will change is if a nasty enough exploit causes a large enough incident, and if it's something that shakes the very foundations of practically all code in that language -- say a fundamental flaw in the C standard library, or something of that scale that has no easy fix except rewriting major chunks of all code that uses the C library. Either that, or if a continuous stream of high-visibility exploits occurs over an extended period of time, all related to flaws in C or the C standard library, such that people eventually grow disgusted enough at yet another buffer overflow or yet another stack overflow, etc., that they seriously start considering alternatives.

Barring these extreme scenarios, I see the more likely outcome as an increasing adoption of safe coding conventions that may help in the short term, but are ultimately unable to address fundamental language design issues.

Perhaps what will eventually cause a change is education: if the next generation of programmers are educated to be aware of security issues and language design issues, they may be more inclined to choose a memory-safe language when they are given a choice. Eventually, if they become the decision-makers, that is when the shift will happen.
 Though from what I can tell, the WannaCry fiasco is more than merely
 a matter of memory safety;
It may very well be. But if memory safety is part of the problem, then it is part of the solution.
Memory safety is only part of the story. I'm tempted to quote you saying that it's only plugging one hole in a cheesegrater, but in this case it's a pretty darn big hole. :-P

But there are other issues that memory safety doesn't even begin to address, like race conditions, resource leakage (leading to DoS attacks), improper use (or lack of use) of cryptographically-secure primitives, access control, data sanitization, leakage of sensitive data, inherently insecure designs (e.g., backdoors), etc..

After having been involved in a major code audit project at my work, I'm increasingly of the opinion that the vast majority of coders have no idea how to write secure code (or they are just too indifferent to bother), and that the vast majority of non-trivial codebases running our present-day systems right now are riddled chockful of security holes just waiting for someone to devise an exploit for. Only some of these flaws are related to memory safety.

T

-- 
Once bitten, twice cry...
May 17
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 17, 2017 at 04:16:59PM -0700, Walter Bright via Digitalmars-d
wrote:
 [...]
That may or may not force a shift to a different language. In fact, the odds are heavily stacked against a language change. Most management are concerned (and in many cases, rightly so) about the cost of rewriting decades-old "proven" software as opposed to merely plugging the holes in the existing software. As long as they have enough coders plugging away at the bugs, they're likely to be inclined to say "good enough".
What will cause a shift is continuous business loss.

If businesses A and B are competing in the same space, and business A has a larger market share but experiences a customer data breach, business B consumes many of A's customers and takes over the market, and it turns out that the reason B wasn't affected was that they used a memory-safe language. Business cases like this will continue to pile up until it is considered ignorant to use a non-memory-safe language. It will be even more obvious when companies like B are much smaller and less funded than companies like A, but can still overtake them because of the advantage.

At least, this is the only way I can see C ever "dying". And of course by dying, I mean that it just won't be selected for large startup projects. It will always live on in low-level libraries and large existing projects (e.g. Linux). I wonder how much something like D in betterC mode can take over some of these tasks?

-Steve
May 17
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 17, 2017 at 08:58:31PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
[...]
 What will cause a shift is a continuous business loss.
 
 If business A and B are competing in the same space, and business A
 has a larger market share, but experiences a customer data breach.
 Business B consumes many of A's customers, takes over the market, and
 it turns out that the reason B wasn't affected was that they used a
 memory-safe language.
 
 The business cases like this will continue to pile up until it will be
 considered ignorant to use a non-memory safe language. It will be even
 more obvious when companies like B are much smaller and less funded
 than companies like A, but can still overtake them because of the
 advantage.
This is a possible scenario, but I don't see it being particularly likely, because in terms of data breaches, memory safety is only part of the equation. Other factors will also come into play in determining the overall susceptibility of a system: things like secure coding practices, and by that I include more than just memory safety, such as resource management, proper use of cryptographic technology, privilege separation, access control, data sanitation, etc..

In spite of C's flaws, it *is* still possible to create a relatively secure system in C. It's harder, no argument about that, but possible. It depends on how the company implements secure coding practices (or not). In a memory-safe language you can still make blunders that allow breaches like SQL injection in spite of memory safety.
 At least, this is the only way I can see C ever "dying". And of course
 by dying, I mean that it just won't be selected for large startup
 projects. It will always live on in low level libraries, and large
 existing projects (e.g. Linux).
[...]

If that's your definition of "dying", then C has been steadily dying over the past decade or two already. :-) Since the advent of Java, C#, and the rest of that ilk, large droves of programmers have been leaving C and adopting these other languages instead. I don't have concrete data to back this up, but my suspicion is that the majority of new projects started today are not in C, but in Java, C#, Javascript, and similar languages, and a smaller percentage in C++, depending on the type of project. Perhaps in gaming circles C++ might still be dominant, but in the business applications world and in the web apps world the trend is definitely toward Java, C#, et al.

C's role has pretty much been shrinking to embedded software and low-level stuff like OSes (mainly Posix) and low-level network code. (Unfortunately it still accounts for a significant amount of low-level network code, especially code running on network hardware like routers and firewalls, which is why security issues in C code are still a concern today.)

Nevertheless, there is still an ongoing stream of exploits and security incidents in the web programming world largely driven by supposedly memory-safe languages like Java or Javascript. (Well, there is that disaster called PHP that's still prevalent on the web; maybe that accounts for some percentage of these exploits. But that's mostly in the implementation of PHP rather than the language itself, since AFAIK it doesn't let you manipulate memory directly in an unsafe way like C does. But it does let you do a lot of other stupid things, security-wise, that will still pose problems even though it's technically memory-safe. That's why I said, memory safety only goes so far -- you need a lot more than that to stand in the face of today's security threats.)

T

-- 
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
May 17
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/18/17 12:40 AM, H. S. Teoh via Digitalmars-d wrote:
 [...]
This is a possible scenario, but I don't see it being particularly likely, because in terms of data breaches, memory safety is only part of the equation. Other factors will also come into play in determining the overall susceptibility of a system. Things like secure coding practices, and by that I include more than just memory safety, such as resource management, proper use of cryptographic technology, privilege separation, access control, data sanitation, etc.. In spite of C's flaws, it *is* still possible to create a relatively secure system in C. It's harder, no argument about that, but possible. It depends on how the company implements secure coding practices (or not). In a memory safe language you can still make blunders that allow breaches like SQL injection in spite of memory safety.
Of course. But what business people would see is a huge company like facebook being marginalized by a small startup, and having the analysts say "well, it's mostly because they used Rust/D". The game would be over at that point, regardless of the technical details of the "true" root cause.

Note: I just use facebook as an example of a company that is so large and pervasive that everyone thinks they are unkillable; I don't really think the strawman scenario above is likely.

Remember the old saying, "Nobody ever got fired for picking IBM"? How relevant is that today?
 Nevertheless, there is still an ongoing stream of exploits and security
 incidents in the web programming world largely driven by supposedly
 memory-safe languages like Java or Javascript. (Well, there is that
 disaster called PHP that's still prevalent on the web, maybe that
 accounts for some percentage of these exploits. But that's mostly in the
 implementation of PHP rather than the language itself, since AFAIK it
 doesn't let you manipulate memory directly in an unsafe way like C does.
 But it does let you do a lot of other stupid things, security-wise, that
 will still pose problems even though it's technically memory-safe.
 That's why I said, memory safety only goes so far -- you need a lot more
 than that to stand in the face of today's security threats.)
Speaking of "memory safe" languages like PHP whose implementation is not necessarily memory safe, there is a danger here also in how D is moving towards memory safety. We still allow unsafe operations inside @safe code, using @trusted. This is a necessary evil, but it's so very important that the base libraries (druntime and phobos) keep this to a minimum, and that we review those @trusted blocks to death.

-Steve
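For reference, a minimal sketch of the escape hatch being described (hypothetical code, not from druntime/Phobos): @safe code may not call @system code directly, but a @trusted lambda lets it through on the programmer's promise.

```d
// deref is @system: the compiler will not let @safe code call it directly.
int deref(int* p) @system
{
    return *p;
}

int viaTrusted(int* p) @safe
{
    // Calling deref() directly here would be a compile error; the
    // @trusted lambda is a manual promise that this particular use
    // is memory safe -- exactly the kind of spot reviewers must audit.
    return () @trusted { return deref(p); }();
}
```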
May 18
"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, May 18, 2017 at 08:12:18AM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 Of course. But what business people would see is a huge company like
 facebook being marginalized by a small startup, and having the
 analysts say "well, it's mostly because they used Rust/D". The game
 would be over at that point, regardless of the technical details of
 the "true" root cause.
But how likely is it for the analysts to say "it's because they used Rust/D instead of C"?
 Note: I just use facebook as an example of a company that is so large
 and pervasive that everyone thinks they are unkillable, I don't really
 think the strawman scenario above is likely. Remember the old saying,
 "Nobody ever got fired for picking IBM"? How relevant is that today?
Yeah, probably the shift away from C will be gradual, rather than overnight. [...]
 Speaking of "memory safe" languages like PHP whose implementation is
 not necessarily memory safe, there is a danger here also in how D is
 moving towards memory safety. We still allow unsafe operations inside
 @safe code, using @trusted. This is a necessary evil, but it's so very
 important that the base libraries (druntime and phobos) keep this to a
 minimum, and that we review those @trusted blocks to death.
[...]

Yes, and that is why it's a grave concern that Phobos has (or used to have) giant blocks of code under the heading `@trusted:`. Even entire functions marked @trusted are a concern, to me, if the function is more than 5-10 lines long.

In the long run, I fear that if there are too many @trusted blocks in a given codebase (not necessarily Phobos), it will become too onerous to review, and could lead to hidden exploits that are overlooked by reviewers. I don't know how to solve this conundrum.

T

-- 
"Hi." "'Lo."
May 18
next sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Thursday, 18 May 2017 at 17:53:52 UTC, H. S. Teoh wrote:

 In the long run, I fear that if there are too many @trusted 
 blocks in a given codebase (not necessarily Phobos), it will 
 become too onerous to review, and could lead to hidden exploits 
 that are overlooked by reviewers.  I don't know how to solve 
 this conundrum.
Simple. You reject such codebase from the get-go ;)
May 18
Moritz Maxeiner <moritz ucworks.org> writes:
On Thursday, 18 May 2017 at 18:15:28 UTC, Stanislav Blinov wrote:
 On Thursday, 18 May 2017 at 17:53:52 UTC, H. S. Teoh wrote:

 In the long run, I fear that if there are too many @trusted 
 blocks in a given codebase (not necessarily Phobos), it will 
 become too onerous to review, and could lead to hidden 
 exploits that are overlooked by reviewers.  I don't know how 
 to solve this conundrum.
Simple. You reject such codebase from the get-go ;)
To be honest, I don't think you *can* solve this problem (rejecting such a codebase is a workaround that may or may not work, depending on the use case and what the codebase has to do; there are valid reasons why the majority of a codebase may need to be @trusted, such as OS abstractions). As long as we build software on top of operating systems with APIs that may or may not be unsafe, we *need* such an unsafe layer, and any codebase that heavily interacts with the OS will be littered with @trusted. All you can do is educate people to spot when @trusted is actually necessary and when something could genuinely be written @safe without @trusted, and educate them to choose the latter when and if possible.
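A minimal sketch (hypothetical names, POSIX-only) of the kind of OS abstraction meant here: a single small @trusted shim around a system call, so the code above it can stay @safe and never touch raw pointers.

```d
import core.sys.posix.unistd : read;
import core.sys.posix.sys.types : ssize_t;

// The one @trusted spot of this abstraction: handing a raw
// pointer/length pair to the C API. Safety argument: buf.ptr and
// buf.length come from a valid D slice, so read(2) cannot write
// past memory we own.
ssize_t safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}
```

Auditing this codebase then amounts to reviewing shims like `safeRead`, not every @safe function built on top of them.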
May 19
Walter Bright <newshound2 digitalmars.com> writes:
On 5/18/2017 10:53 AM, H. S. Teoh via Digitalmars-d wrote:
 Yes, and that is why it's a grave concern that Phobos has (or used to
 have) giant blocks of code under the heading `@trusted:`. Even entire
 functions marked @trusted are a concern, to me, if the function is more
 than 5-10 lines long.
Please pick one and submit a PR to fix it!
May 18
Moritz Maxeiner <moritz ucworks.org> writes:
On Thursday, 18 May 2017 at 12:12:18 UTC, Steven Schveighoffer 
wrote:
 [...]

 We still allow unsafe operations inside @safe code, using 
 @trusted. This is a necessary evil, but it's so very important 
 that the base libraries (druntime and phobos) keep this to a 
 minimum, and that we review those @trusted blocks to death.
That and we need to make sure it is understood by everyone using third party @safe code that it is *not* an "I don't have to audit this code" free card. It merely reduces the amount of code you need to review to what is marked as @trusted (with regards to memory safety); as long as you don't *know* whether some third party code is @safe or @trusted, you (as the programmer) have to assume it is @trusted, and that means you have to extend trust to the author and cannot assume any of the @safe guarantees for that code.
May 19
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/19/17 5:12 AM, Moritz Maxeiner wrote:
 On Thursday, 18 May 2017 at 12:12:18 UTC, Steven Schveighoffer wrote:
 [...]

 We still allow unsafe operations inside @safe code, using @trusted.
 This is a necessary evil, but it's so very important that the base
 libraries (druntime and phobos) keep this to a minimum, and that we
 review those @trusted blocks to death.
That and we need to make sure it is understood by everyone using third party @safe code that it is *not* an "I don't have to audit this code" free card. It merely reduces the amount of code you need to review to what is marked as @trusted (with regards to memory safety); as long as you don't *know* whether some third party code is @safe or @trusted, you (as the programmer) have to assume it is @trusted and that means you have to extend trust to the author and cannot assume any of the @safe guarantees for that code.
What we need are 2 things:

1. @trusted blocks need to be rock-solid in Phobos and Druntime. And as rare as possible. This provides a foundation to build completely @safe libraries. It's like atomics -- they are hugely important and very easy to get wrong. Leave the actual implementation to the pros. We should be the pros on phobos/druntime safety.

2. @trusted blocks in any project need to be considered red flags. You should not need to audit @safe code. What you need to do is audit @trusted code when it interacts with @safe code. If you can prove that in *all cases* the @safe code is still @safe even with the included @trusted blocks, then you don't have to audit @safe code that calls that "tainted" function.

If we get into "@safe really means @trusted" territory, we have lost.

-Steve
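A toy illustration of point 2 (hypothetical code, not from Phobos): the @trusted block below can be audited locally, since i is always less than a.length the pointer dereference is in bounds, after which the whole function is a verified @safe building block and its @safe callers need no review of their own.

```d
// sum contains the only @trusted code; audit it once, in place.
int sum(const(int)[] a) @safe
{
    int total = 0;
    foreach (i; 0 .. a.length)
    {
        // Audit note: i < a.length on every iteration, so the
        // dereference cannot leave the slice's memory.
        total += () @trusted { return *(a.ptr + i); }();
    }
    return total;
}

// A @safe caller of the verified function needs no audit.
int sumPlusOne(const(int)[] a) @safe
{
    return sum(a) + 1;
}
```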
May 19
Moritz Maxeiner <moritz ucworks.org> writes:
On Friday, 19 May 2017 at 11:53:57 UTC, Steven Schveighoffer 
wrote:
 [...]
 What we need are 2 things: 1. @trusted blocks need to be rock-solid in Phobos and Druntime. And as rare as possible.
Agreed 100%.
 This provides a foundation to build completely @safe libraries.
Agreed if you mean libraries being marked completely as @safe (which I assume). Disagreed if you mean libraries that are proven to never corrupt memory (not possible with an unsafe operating system).
 It's like atomics -- they are hugely important and very easy to 
 get wrong.
Sure.
 Leave the actual implementation to the pros.
If you mean the act of implementing: Yes, agreed. If you mean the entire mind space of the implementation, aka you (the programmer) receive a "get out of auditing this" free card because "professionals" wrote it: No. You are *always* responsible for verifying *all* third party @trusted code *yourself*.
 We should be the pros on phobos/druntime safety.
Agreed.
 2. @trusted blocks in any project need to be considered red 
 flags. You should not need to audit @safe code.
Yes you do, because it can call into @trusted like this:

---
void foo(int[] bar) @safe
{
    () @trusted {
        // Exploitable code here
    }();
}
---

You *must* audit third party @safe code for such hidden @trusted code (e.g. grep recursively through such third party code for @trusted and verify).
 What you need to do is audit @trusted code when it interacts 
 with @safe code.
You need to audit *all* @trusted code, because you don't necessarily control who calls it.
 If you can prove that in *all cases* the @safe code is still 
 @safe even with the included @trusted blocks, then you don't 
 have to audit @safe code that calls that "tainted" function.
s/prove/promise/ But yes, that is precisely what I wrote in the above with regards to the reduction of what you have to audit.
 If we get into "@safe really means @trusted" territory, we have 
 lost.
For code that you write yourself, @safe means @safe, of course. For code other people write and you want to call, it being marked @safe does really mean @trusted as long as you yourself have not looked inside it and verified there either is no hidden @trusted, or verified *yourself* that the hidden @trusted is memory safe. I consider any other behaviour to be negligent to the degree of "you don't actually care about memory safety at all".
May 19
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/19/17 9:46 AM, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 11:53:57 UTC, Steven Schveighoffer wrote:
 This provides a foundation to build completely @safe libraries.
Agreed if you mean libraries being marked completely as @safe (which I assume). Disagreed if you mean libraries that are proven to never corrupt memory (not possible with an unsafe operating system).
I mean libraries which only contain @safe and @system calls. i.e.:

$ grep -R '@trusted' libsafe | wc -l
0
 2. @trusted blocks in any project need to be considered red flags. You
 should not need to audit @safe code.
Yes you do, because it can call into @trusted like this:

---
void foo(int[] bar) @safe
{
    () @trusted {
        // Exploitable code here
    }();
}
---

You *must* audit third party @safe code for such hidden @trusted code (e.g. grep recursively through such third party code for @trusted and verify).
This is what I mean by auditing @trusted when it interacts with @safe code. Using your example, if we confirm that no matter how you call foo, the @trusted block cannot break memory safety, then foo becomes a verified @safe function. Then any @safe function that calls foo can be considered @safe without auditing. This is actually a necessity, because templates can infer safety, so you may not even know the call needs to be audited.

The most dangerous thing I think is to have @trusted blocks which use templated types. A real example recently was a PR that added @safe to a function, and made the following call:

() @trusted { return CreateDirectoryW(pathname.tempCStringW(), null); }

Where pathname was a range. The @trusted block is to allow the call to CreateDirectoryW, but inadvertently, you are trusting all the range functions inside pathname, whatever that type is! The correct solution is:

auto cstr = pathname.tempCStringW();
() @trusted { return CreateDirectoryW(cstr, null); }

So yes, if the third party has @trusted code, you need to audit it. But once you have audited the *block* of @trusted code, and how it interacts within its function, you can consider the calling @safe functions actually @safe without auditing.
 If we get into "@safe really means @trusted" territory, we have lost.
For code that you write yourself, @safe means safe, of course. For code other people write and you want to call, it being marked @safe does really mean trusted, as long as you yourself have not looked inside it and verified there either is no hidden @trusted, or verified *yourself* that the hidden @trusted is memory safe. I consider any other behaviour to be negligent to the degree of "you don't actually care about memory safety at all".
I think there will be a good market for separating libraries between @trusted-containing libraries, and only @safe-containing libraries. This will make the auditing more focused, and more shareable. I don't expect people to use Phobos and audit all the @trusted blocks personally. If "D is memory safe" means "D is memory safe ONLY if you verify all of the standard library personally", we still have lost.

-Steve
May 19
next sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Friday, 19 May 2017 at 15:12:20 UTC, Steven Schveighoffer 
wrote:
 On 5/19/17 9:46 AM, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 11:53:57 UTC, Steven Schveighoffer 
 wrote:
 This provides a foundation to build completely @safe 
 libraries.
Agreed if you mean libraries being marked completely as @safe (which I assume). Disagreed if you mean libraries that are proven to never corrupt memory (not possible with an unsafe operating system).
I mean libraries which only contain @safe and @system calls, i.e.:

$ grep -R '@trusted' libsafe | wc -l
0
In that case, agreed. There will be no need (with regards to memory safety) to audit such libraries.
 2. @trusted blocks in any project need to be considered red 
 flags. You
 should not need to audit @safe code.
Yes you do, because it can call into @trusted like this:

---
void foo(int[] bar) @safe
{
    () @trusted {
        // Exploitable code here
    }();
}
---

You *must* audit third party @safe code for such hidden @trusted code (e.g. grep recursively through such third party code for @trusted and verify).
This is what I mean by auditing @trusted when it interacts with @safe code.
Ok, I was not sure what exactly you meant (because interaction is a broad concept), so I went explicit as I did not want to assume.
 Using your example, if we confirm that no matter how you call 
 foo, the @trusted block cannot break memory safety, then foo 
 becomes a verified @safe function. Then any @safe function that 
 calls foo can be considered @safe without auditing.
Yes (assuming any such @safe function does not call another @trusted function that wasn't verified).
 This is actually a necessity, because templates can infer 
 safety, so you may not even know the call needs to be audited. 
 The most dangerous thing I think is to have @trusted blocks 
 which use templated types.
 
 A real example recently was a PR that added @safe to a 
 function, and made the following call:
 
 () @trusted { return CreateDirectoryW(pathname.tempCStringW(), 
 null); }
 
 Where pathname was a range. The @trusted block is to allow the 
 call to CreateDirectoryW, but inadvertently, you are trusting 
 all the range functions inside pathname, whatever that type is!
 
 The correct solution is:
 
 auto cstr = pathname.tempCStringW();
 () @trusted { return CreateDirectoryW(cstr, null); }
Interesting case. I have not encountered something like it before, since all code that I mark @trusted uses OS-provided structs, primitive types, or arrays (nothing fancy like ranges).
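As a hedged illustration of that style (a hypothetical wrapper, not code from any real project), a typical @trusted block of this kind only hands a slice's pointer/length pair and primitive types to the OS, so the surface to audit stays tiny:

```d
// POSIX-only sketch.
import core.sys.posix.unistd : read;

// Hypothetical example: the @trusted surface only touches a file
// descriptor, a slice, and primitive types - nothing templated.
ptrdiff_t readInto(int fd, ubyte[] buf) @trusted
{
    // The pointer/length pair comes from a D slice, so read()
    // cannot be handed a length exceeding the buffer's allocation.
    return read(fd, buf.ptr, buf.length);
}
```

Because no user-supplied type participates, verifying this block once is enough; no caller can smuggle unaudited code into it.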
 So yes, if the third party has @trusted code, you need to audit 
 it. But once you have audited the *block* of @trusted code, and 
 how it interacts within its function, you can consider the 
 calling @safe functions actually safe without auditing.
Indeed (again, though, only if that calling @safe function doesn't call some other @trusted you forgot to audit).
 If we get into "@safe really means @trusted" territory, we 
 have lost.
For code that you write yourself, @safe means safe, of course. For code other people write and you want to call, it being marked @safe does really mean trusted, as long as you yourself have not looked inside it and verified there either is no hidden @trusted, or verified *yourself* that the hidden @trusted is memory safe. I consider any other behaviour to be negligent to the degree of "you don't actually care about memory safety at all".
 I think there will be a good market for separating libraries between @trusted-containing libraries, and only @safe-containing libraries.
I totally agree. Unfortunately, most of what I use D for is replacing C in parts where I have to directly interface with C, specifically glibc-wrapped syscalls. I considered calling the syscalls directly and completely bypassing C, but since syscalls on Linux are often in reality available in userspace without a context switch (vDSO), the code is not trivial and I decided to just interface with the C wrappers.
 This will make the auditing more focused, and more shareable.
Sure.
 I don't expect people to use Phobos and audit all the @trusted 
 blocks personally.
As long as they don't actually call them, that's reasonable. But if your application ends up calling @trusted code and you did not audit that @trusted yourself, you have violated the @trusted requirement: You cannot promise to the compiler that the code is memory safe since you have no knowledge of what it actually does.
 If "D is memory safe" means "D is memory safe ONLY if you 
 verify all of the standard library personally", we still have 
 lost.
It is more like "D is memory safe" meaning "D is memory safe ONLY if you verify all of the @trusted code your application ends up compiling in / linking against". There is no way around that, as far as I can see, without getting rid of @trusted, which is impossible for a systems PL.
May 19
parent reply Dominikus Dittes Scherkl <Dominikus.Scherkl continental-corporation.com> writes:
On Friday, 19 May 2017 at 15:52:52 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 15:12:20 UTC, Steven Schveighoffer
 I don't expect people to use Phobos and audit all the @trusted 
 blocks personally.
As long as they don't actually call them, that's reasonable. But if your application ends up calling @trusted code and you did not audit that @trusted yourself, you have violated the @trusted requirement: You cannot promise to the compiler that the code is memory safe since you have no knowledge of what it actually does.
No. @trusted is about trust: you cannot rely on the compiler to verify it, but the code is reviewed by humans. So there is a list of reviewers, and if this list contains some names you happen to trust (sic!) you don't have to audit the code yourself. Especially basic libraries will over time become tested and audited by very many people or even organizations. So after some time they really can be trusted.
 If "D is memory safe" means "D is memory safe ONLY if you 
 verify all of the standard library personally", we still have 
 lost.
It is more like "D is memory safe" meaning "D is memory safe ONLY if you verify all of the @trusted code your application ends up compiling in / linking against". There is no way around that, as far as I can see, without getting rid of @trusted, which is impossible for a systems PL.
For bigger projects you always need to trust in some previous work. But having the @trusted and @safe mechanism makes the resulting code a whole lot more trustworthy than any C library can ever be - just by reducing the number of lines of code that really need to be audited. I personally would not go beyond probing some few functions within a library which I think are more complicated and fragile, and if I find them ok, my trust in what else the authors have marked @trusted increases likewise.
May 19
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Friday, 19 May 2017 at 17:21:23 UTC, Dominikus Dittes Scherkl 
wrote:
 On Friday, 19 May 2017 at 15:52:52 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 15:12:20 UTC, Steven Schveighoffer
 I don't expect people to use Phobos and audit all the 
 @trusted blocks personally.
As long as they don't actually call them, that's reasonable. But if your application ends up calling @trusted code and you did not audit that @trusted yourself, you have violated the @trusted requirement: You cannot promise to the compiler that the code is memory safe since you have no knowledge of what it actually does.
No. @trusted is about trust: you cannot rely on the compiler to verify it, but the code is reviewed by humans.
Precisely. It is about trust the compiler extends to you, the programmer, instead of a mechanical proof (@safe): "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function. Generally, trusted functions should be kept small so that they are easier to manually verify." [1] If you write an application that uses @trusted code - even from a third party library - *you* are the programmer that the compiler extends the trust to.
 So there is a list of reviewers and if this list contains some 
 names you happen to trust (sic!) you don't have to audit the 
 code yourself.
Trust, but verify: Considering the damages already caused via memory corruption, I would argue that even if you have a list of people you trust to both write @trusted and review @trusted code (both of which is fine imho), reviewing them yourself (when writing an application) is the prudent (and sane) course of action.
 Especially basic libraries will over time become tested and 
 audited by very many people or even organizations. So after 
 some time they really can be trusted.
Absolutely not. This kind of mentality is what allowed bugs like heartbleed to rot for years[2], or even decades[3]. Unsafe code can never be *inherently* trusted.
 If "D is memory safe" means "D is memory safe ONLY if you 
 verify all of the standard library personally", we still have 
 lost.
It is more like "D is memory safe" meaning "D is memory safe ONLY if you verify all of the @trusted code your application ends up compiling in / linking against". There is no way around that, as far as I can see, without getting rid of @trusted, which is impossible for a systems PL.
For bigger projects you always need to trust in some previous work.
Not really. You can always verify any @trusted code (and if the amount of @trusted code you have to verify is large, then I argue that you are using the wrong previous work with regards to memory safety).
 But having the @trusted and @safe mechanism makes the resulting 
 code a whole lot more trustworthy than any C library can ever 
 be - just by reducing the number of lines of code that really 
 need be audited.
I agree with that viewpoint (and wrote about the reduced auditing work previously in this conversation), but the quote you responded to here was about using D in general being memory safe (which is a binary "yes/no"), not any particular library's degree of trustworthiness with regards to memory safety (which is a continuous scale).
 I personally would not go beyond probing some few functions 
 within a library which I think are more complicated and 
 fragile, and if I find them ok, my trust in what else the 
 authors have marked @trusted increases likewise.
That is your choice, but the general track record of trusting others to get it right without verifying it yourself remains atrocious, and I would still consider you negligent for doing so, because while in C one has had little other choice historically - since without a @safe concept the amount of code one would have to verify reaches gargantuan size - in D we can (and should imho) have only small amounts of @trusted code.

[1] https://dlang.org/spec/function.html#trusted-functions
[2] https://www.theguardian.com/technology/2014/apr/11/heartbleed-developer-error-regrets-oversight
[3] https://www.theguardian.com/technology/2014/jun/06/heartbleed-openssl-bug-security-vulnerabilities
May 19
parent reply Dominikus Dittes Scherkl <dominikus scherkl.de> writes:
On Friday, 19 May 2017 at 20:19:46 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 17:21:23 UTC, Dominikus Dittes 
 Scherkl wrote:
 You cannot promise to the compiler that the code is memory 
 safe since you have no knowledge of what it actually does.
No. @trusted is about trust: you cannot rely on the compiler to verify it, but the code is reviewed by humans.
Precisely. It is about trust the compiler extends to you, the programmer, instead of a mechanical proof (@safe): "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function. Generally, trusted functions should be kept small so that they are easier to manually verify." [1]
I take this to mean the programmer who wrote the library, not every user of the library. Ok, it's better the more people have checked it, but it need not always be me. Hm - we should have some mechanism to add to some list of people who already trust the code because they checked it.
 If you write an application that uses @trusted code - even from 
 a third party library - *you* are the programmer that the 
 compiler extends the trust to.
This is not my point of view. Especially if I had paid for some library, even legally it's not my fault if it fails. For public domain, ok, the end user is theoretically responsible for everything that goes wrong, but even there nobody can check everything or even a relevant portion of it.
 Trust, but verify: Considering the damages already caused via 
 memory corruption, I would argue that even if you have a list 
 of people you trust to both write @trusted and review @trusted 
 code (both of which is fine imho), reviewing them yourself 
 (when writing an application) is the prudent (and sane) course 
 of action.
This is infeasible, even if @safe and @trusted reduce the Herculean task considerably.
 Especially basic libraries will over time become tested and 
 audited by very many people or even organizations. So after 
 some time they really can be trusted.
Absolutely not. This kind of mentality is what allowed bugs like heartbleed to rot for years[2], or even decades[3]. Unsafe code can never be *inherently* trusted.
In addition to @trusted, D has unittests that - in harsh contrast to C - are run by most users. And especially @trusted functions have extensive tests - even more so if they ever showed some untrustworthy behaviour. These growing unittest blocks make older and more used libraries indeed more reliable, even if a function is changed (also in contrast to C, where a changed function starts again at zero trust, while a D function has to pass all the old unittests and therefore starts with a high trust level).
 For bigger projects you always need to trust in some previous 
 work.
Not really. You can always verify any @trusted code (and if the amount of @trusted code you have to verify is large, then I argue that you are using the wrong previous work with regards to memory safety).
Sorry. Reviewing everything you use is impossible. I just can't believe you if you claim to do so.
 But having the @trusted and @safe mechanism makes the 
 resulting code a whole lot more trustworthy than any C library 
 can ever be - just by reducing the number of lines of code 
 that really need be audited.
I agree with that viewpoint (and wrote about the reduced auditing work previously in this conversation), but the quote you responded to here was about using D in general being memory safe (which is a binary "yes/no"), not any particular library's degree of trustworthiness with regards to memory safety (which is a continuous scale).
No. Declaring a function @safe is still no binary "yes". I don't believe in such absolute values. Clearly the likelihood of memory corruption will be orders of magnitude lower, but never zero. The compiler may have bugs, the system the software runs on will have bugs, even hardware failures are possible. Everything is about trust.
 I personally would not go beyond probing some few functions 
 within a library which I think are more complicated and 
 fragile, and if I find them ok, my trust in what else the 
 authors have marked @trusted increases likewise.
That is your choice, but the general track record of trusting others to get it right without verifying it yourself remains atrocious, and I would still consider you negligent for doing so, because while in C one has had little other choice historically - since without a @safe concept the amount of code one would have to verify reaches gargantuan size - in D we can (and should imho) have only small amounts of @trusted code.
Of course. And a decreasing amount. But what we have is already a huge step in the right direction. We should live in reality. Everybody's time is scarce. So you can always spend your time checking code only for the parts which are most important for you and which you suspect the most. Claiming otherwise is - believe it or not - making you less trustworthy to me.
May 19
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Friday, 19 May 2017 at 20:54:40 UTC, Dominikus Dittes Scherkl 
wrote:
 On Friday, 19 May 2017 at 20:19:46 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 17:21:23 UTC, Dominikus Dittes 
 Scherkl wrote:
 You cannot promise to the compiler that the code is memory 
 safe since you have no knowledge of what it actually does.
No. @trusted is about trust: you cannot rely on the compiler to verify it, but the code is reviewed by humans.
Precisely. It is about trust the compiler extends to you, the programmer, instead of a mechanical proof (@safe): "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function. Generally, trusted functions should be kept small so that they are easier to manually verify." [1]
I take this to mean the programmer who wrote the library, not every user of the library.
I take this to mean any programmer that ends up compiling it (if you use a precompiled version that you only link against, that would be different).
 Hm - we should have some mechanism to add to some list of 
 people who already trust the code because they checked it.
Are you talking about inside the code itself? If so, I imagine digitally signing the functions source code (ignoring whitespace) and adding the signature to the docstring would do it.
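A rough sketch of what would be signed (hypothetical helper name; this only shows the whitespace-insensitive fingerprint a reviewer would compute and then sign, not the signing itself):

```d
import std.algorithm.iteration : filter;
import std.array : array;
import std.ascii : isWhite;
import std.conv : to;
import std.digest.sha : sha256Of, toHexString;
import std.string : representation;

// Hypothetical helper: a whitespace-insensitive fingerprint of a
// function's source, which a reviewer would digitally sign and
// attach to the docstring of the @trusted function they audited.
string reviewFingerprint(string source)
{
    // Strip all whitespace so pure formatting changes don't alter the hash.
    auto normalized = source.representation.filter!(b => !b.isWhite).array;
    return sha256Of(normalized).toHexString.to!string;
}
```

That way two reviewers who audited differently formatted copies of the same function would still produce (and sign) the same digest, while any substantive edit to the function invalidates all existing signatures.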
 If you write an application that uses @trusted code - even 
 from a third party library - *you* are the programmer that the 
 compiler extends the trust to.
This is not my point of view. Especially if I had paid for some library, even legally it's not my fault if it fails.
It is mine, because even if you payed for the library, when you compile it, the compiler cannot know where the library came from. It only knows you (the programmer who invoked it), as the one it extends trust to. I am specifically not talking about what is legally your fault or not, because I consider that an entirely different matter.
 For public domain ok, the end user is theoretically responsible 
 for everything that goes wrong but even there nobody can check 
 everything or even a relevant portion of it.
That entirely depends on how much @trusted code you have.
 Trust, but verify: Considering the damages already caused via 
 memory corruption, I would argue that even if you have a list 
 of people you trust to both write @trusted and review @trusted 
 code (both of which is fine imho), reviewing them yourself 
 (when writing an application) is the prudent (and sane) course 
 of action.
This is infeasible, even if @safe and @trusted reduce the Herculean task considerably.
I disagree.
 Especially basic libraries will over time become tested and 
 audited by very many people or even organizations. So after 
 some time they really can be trusted.
Absolutely not. This kind of mentality is what allowed bugs like heartbleed to rot for years[2], or even decades[3]. Unsafe code can never be *inherently* trusted.
In addition to @trusted, D has unittests that - in harsh contrast to C - are run by most users. And especially @trusted functions have extensive tests - even more so if they ever showed some untrustworthy behaviour. These growing unittest blocks make older and more used libraries indeed more reliable, even if a function is changed (also in contrast to C, where a changed function starts again at zero trust, while a D function has to pass all the old unittests and therefore starts with a high trust level).
Those are valuable tools, but they do not make any piece of code *inherently* trustworthy.
 For bigger projects you always need to trust in some previous 
 work.
 Not really. You can always verify any @trusted code (and if the amount of @trusted code you have to verify is large, then I argue that you are using the wrong previous work with regards to memory safety).
Sorry. Reviewing everything you use is impossible. I just can't believe you if you claim to do so.
I specifically stated reviewing any @trusted code, not all code. And so far I have made no claim about not being negligent myself with regards to memory safety.
 But having the @trusted and @safe mechanism makes the 
 resulting code a whole lot more trustworthy than any C 
 library can ever be - just by reducing the number of lines of 
 code that really need be audited.
I agree with that viewpoint (and wrote about the reduced auditing work previously in this conversation), but the quote you responded to here was about using D in general being memory safe (which is a binary "yes/no"), not any particular library's degree of trustworthiness with regards to memory safety (which is a continuous scale).
No. Declaring a function @safe is still no binary "yes".
Again, this was not about any particular function (or library), but about using D in general.
 I don't believe in such absolute values. Clearly the likelihood 
 of memory corruption will be orders of magnitude lower, but 
 never zero. The compiler may have bugs, the system the software 
 runs on will have bugs, even hardware failures are possible. 
 Everything is about trust.
I agree in principle, but the statement I responded to was "D is memory safe", which either does or does not hold. I also believe that considering the statement's truthfulness only makes sense under the assumption that nothing *below* it violates that, since the statement is about language theory.
 I personally would not go beyond probing some few 
 functions within a library which I think are more complicated 
 and fragile, and if I find them ok, my trust in what else the 
 authors have marked @trusted increases likewise.
That is your choice, but the general track record of trusting others to get it right without verifying it yourself remains atrocious, and I would still consider you negligent for doing so, because while in C one has had little other choice historically - since without a @safe concept the amount of code one would have to verify reaches gargantuan size - in D we can (and should imho) have only small amounts of @trusted code.
Of course. And a decreasing amount. But what we have is already a huge step in the right direction.
Yes, it is.
 We should live in reality. Everybody's time is scarce. So 
 you can always spend your time checking code only for the 
 parts which are most important for you and which you suspect 
 the most.
Of course anyone can choose to check whatever they wish. That does not change what *I* consider negligent.
 Claiming otherwise is - believe it or not - making you less 
 trustworthy to me.
Where did I claim otherwise? Errare humanum est, sed in errare perseverare diabolicum (to err is human, but to persist in error is diabolical). In this context: It is one thing to be negligent (and I explicitly do not claim *not* to be negligent myself), but something completely different to pretend that being negligent is OK.
May 19
parent reply Dominikus Dittes Scherkl <dominikus scherkl.de> writes:
On Friday, 19 May 2017 at 22:06:59 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 20:54:40 UTC, Dominikus Dittes 
 Scherkl wrote:
 I take this to mean the programmer who wrote the library, not 
 every user of the library.
I take this to mean any programmer that ends up compiling it (if you use a precompiled version that you only link against, that would be different).
Why? Because you don't have the source? Go, get the source - at least for open source projects this should be possible. I can't see the difference.
 Hm - we should have some mechanism to add to some list of 
 people who already trust the code because they checked it.
Are you talking about inside the code itself? If so, I imagine digitally signing the functions source code (ignoring whitespace) and adding the signature to the docstring would do it.
Yeah. But I mean: we need such a mechanism in the D review process. It would be nice to have something standardized, so that if I checked something to be really trustworthy, I want to make it public, so that everybody can see who already checked the code - and maybe concentrate on reviewing something that has not yet been reviewed by many people, or not by anybody they trust most.
 This is not my point of view. Especially if I had paid for 
 some library, even legally it's not my fault if it fails.
It is mine, because even if you paid for the library, when you compile it, the compiler cannot know where the library came from.
Yeah, but you (should) do. For me it doesn't matter who actually compiled the code - if anything, my trust would be higher if I compiled it myself, because I don't know what compiler or what settings have been used for a pre-compiled library.
 [the compiler] only knows you (the programmer who invoked it), 
 as the one it extends trust to.
The compiler "trusts" anybody using it. This is of no value. The important thing is who YOU trust. Or who you want the user of your program to trust. Oftentimes it may be more convincing to the user of your program if you want them to trust company X, where you bought some library from, than trusting in your own ability to prove the memory safety of the code built upon it - no matter if you compiled the library yourself or had it done by company X.
 I am specifically not talking about what is legally your fault 
 or not, because I consider that an entirely different matter.
Different matter, but same chain of trust.
 nobody can check everything or even a relevant portion of it.
 That entirely depends on how much @trusted code you have.
Of course. But no matter how glad I would be to be able to check e.g. my operating system for memory safety, even if it would be only 1% of its code that is merely @trusted instead of @safe, it would still be too much for me. This is only feasible if you shrink your view far enough. And in reverse: the more code is @safe, the further I can expand my checking activities - but I still don't believe I will ever be able to check everything.
 I specifically stated reviewing any @trusted code, not all code.
Yes. Still too much, I think.
 I agree in principle, but the statement I responded to was "D 
 is memory safe", which either does or does not hold.
And I say: No, D is not memory safe. In practice. Good, but not 100%.
 I also believe that considering the statement's truthfulness 
 only makes sense under the assumption that nothing *below* it 
 violates that, since the statement is about language theory.
Ok, this is what I mean by "shrinking your view until it's possible to check everything" - or being able to prove something, in this case. But by doing so you also neglect things. Many things.
 Of course anyone can choose to check whatever they wish. That 
 does not change what *I* consider negligent.
But neglecting is a necessity. Your view is reduced to a D specification to make statements about it in language theory, where you check everything - and thereby decide to neglect everything else below that, including the buggy implementation of that spec running on a buggy OS on buggy hardware.
 In this context: It is one thing to be negligent (and I 
 explicitly do not claim *not* to be negligent myself), but 
 something completely different to pretend that being negligent 
 is OK.
It's not only ok. It's a necessity. The necessity of a limited being in an infinite universe. We can only hope to not neglect the important things - and trust in others is one way to increase the number of things we can hope to be ok. Making things @safe instead of only @trusted is another way. Both largely increase the view we have. But trying to check everything yourself is still in vain.
May 19
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Friday, 19 May 2017 at 23:56:55 UTC, Dominikus Dittes Scherkl 
wrote:
 On Friday, 19 May 2017 at 22:06:59 UTC, Moritz Maxeiner wrote:
 On Friday, 19 May 2017 at 20:54:40 UTC, Dominikus Dittes 
 Scherkl wrote:
 I take this to mean the programmer who wrote the library, not 
 every user of the library.
I take this to mean any programmer that ends up compiling it (if you use a precompiled version that you only link against that would be different).
Why? Because you don't have the source? Go, get the source - at least for open source projects this should be possible. I can't see the difference.
Because imo the specification statement covers compiling @trusted code, not linking to already compiled @trusted code; so I excluded it from my statement. Whether you can get the source is not pertinent to the statement.
 Hm - we should have some mechanism to add to some list of 
 people who already trust the code because they checked it.
Are you talking about inside the code itself? If so, I imagine digitally signing the functions source code (ignoring whitespace) and adding the signature to the docstring would do it.
Yeah. But I mean: we need such a mechanism in the D review process. It would be nice to have something standardized, so that if I checked something to be really trustworthy, I want to make it public, so that everybody can see who already checked the code - and maybe concentrate on reviewing something that has not yet been reviewed by many people, or not by anybody they trust most.
Well, are you electing yourself to be the champion of this? Because I don't think it will happen without one.
 [the compiler] only knows you (the programmer who invoked it), 
 as the one it extends trust to.
 The compiler "trusts" anybody using it. This is of no value.
The compiler extends trust to whoever invokes it, that is correct (and what I wrote). That person then either explicitly or implicitly manages that trust further. You can, obviously, manage that trust however you see fit, but *I* will still consider it negligence if you - as the author of some application - have not verified all the @trusted code you use.
 The important thing is who YOU trust. Or who you want the user 
 of your program to trust.
 Oftentimes it may be more convincing to the user of your 
 program if you want them to trust company X, where you bought 
 some library from, than trusting in your own ability to prove 
 the memory safety of the code built upon it - no matter if 
 you compiled the library yourself or had it done by company 
 X.
And MY trust is not transitive. If I trust person A, and A trusts person B, I would NOT implicitly trust person B. As such, if A wrote me a @safe application that uses @trusted code written by B, and A told me that he/she/it had not verified B's code, I would consider A to be negligent.
 I am specifically not talking about what is legally your fault 
 or not, because I consider that an entirely different matter.
Different matter, but same chain of trust.
 nobody can check everything or even a relevant portion of it.
 That entirely depends on how much @trusted code you have.
Of course. But no matter how glad I would be to be able to check e.g. my operating system for memory safety, even if it would be only 1% of its code that is merely @trusted instead of @safe, it would still be too much for me. This is only feasible if you shrink your view far enough.
What you call shrinking your view, information science calls using the appropriate abstraction for the problem domain: It makes absolutely no sense whatsoever to even talk about the memory safety of a programming language if the infrastructure below it is not excluded from the view. If you do not use the appropriate abstraction, you can always say: high enough radiation can always just randomly flip bits in your computer and you're screwed, memory safety does not exist. That, while true in practice, is of no help when designing applications that will run on computers not exposed to excessive radiation, so you exclude it.
 And in reverse: the more code is @safe, the further I can 
 expand my checking activities - but I still don't believe I 
 will ever be able to check everything.
Again: I specifically wrote about @trusted code, not all code.
 I specifically stated reviewing any @trusted code, not all 
 code.
Yes. Still too much, I think.
I do not.
 I agree in principle, but the statement I responded to was "D 
 is memory safe", which either does or does not hold.
And I say: No, D is not memory safe. In practice. Good, but not 100%.
 I also believe that considering the statement's truthfulness 
 only makes sense under the assumption that nothing *below* it 
 violates that, since the statement is about language theory.
Ok, this is what I mean by "shrinking your view until it's possible to check everything" - or being able to prove something, in this case. But by doing so you also neglect things. Many things.
As stated above, this is choosing the appropriate abstraction for the problem domain. I know what can go wrong in the lower layers, but all of that is irrelevant to the problem domain of a programming language. It only becomes relevant again when you ask "Will my @safe D program (which is memory safe, I verified all @trusted code myself!) be memory safe when running on this specific system?", to which the generic answer is "it depends".
 Of course anyone can choose to check whatever they wish. That 
 does not change what *I* consider negligent.
But neglecting is a necessity.
It may be a necessity for you (and, I assume, probably for most programmers), but that does not make it generally true.
 In this context: It is one thing to be negligent (and I 
 explicitly do not claim *not* to be negligent myself), but 
 something completely different to pretend that being negligent 
 is OK.
It's not only ok. It's a necessity.
First, something being a necessity does not make it OK. Second: Whether it is a necessity is once more use case dependent.
 The necessity of a limited being in an infinite universe.
The amount of @trusted code is limited, and thus the time needed to review it is as well.
May 19
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 19.05.2017 17:12, Steven Schveighoffer wrote:
 I mean libraries which only contain @safe and @system calls.

 i.e.:

 $ grep -R '@trusted' libsafe | wc -l
 0
mixin(" "~"trusted void nasty(){ corruptAllTheMemory(); }");
May 19
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 19 May 2017 at 16:29:59 UTC, Timon Gehr wrote:
 On 19.05.2017 17:12, Steven Schveighoffer wrote:
 I mean libraries which only contain @safe and @system calls.

 i.e.:

 $ grep -R '@trusted' libsafe | wc -l
 0
mixin(" "~"trusted void nasty(){ corruptAllTheMemory(); }");
dmd -vcg-ast *.d
May 19
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/19/17 12:29 PM, Timon Gehr wrote:
 On 19.05.2017 17:12, Steven Schveighoffer wrote:
 I mean libraries which only contain @safe and @system calls.

 i.e.:

 $ grep -R '@trusted' libsafe | wc -l
 0
mixin(" "~"trusted void nasty(){ corruptAllTheMemory(); }");
Yeah. There's that. But I think we're OK even with that loophole :) -Steve
May 19
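Timon's loophole in full, as a self-contained sketch (hypothetical file, not from the thread): the source never contains the literal token "@trusted", so the grep above reports zero hits, yet after mixin expansion the compiler sees an ordinary @trusted function.

```d
import std.stdio;

// The string below is assembled at compile time, so a textual search
// for "@trusted" finds nothing -- but the compiler still sees a
// normal @trusted function after mixin expansion.
mixin("@" ~ "trusted void nasty() { writeln(\"could corrupt memory here\"); }");

@safe void main()
{
    nasty();  // callable from @safe code, exactly as if written @trusted
}
```

As Stefan suggests, `dmd -vcg-ast` prints the post-expansion AST with the attribute visible, so a grep-based audit would have to run over that output rather than the raw source.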
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 18 May 2017 at 00:58:31 UTC, Steven Schveighoffer 
wrote:
 On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 17, 2017 at 04:16:59PM -0700, Walter Bright via 
 Digitalmars-d wrote:
 On 5/17/2017 1:46 PM, H. S. Teoh via Digitalmars-d wrote:
 [...]
It may not be the developers that initiate this change. It'll be the managers and the customers who force the issue - as those are the people who'll pay the bill for the problems.
That may or may not force a shift to a different language. In fact, the odds are heavily stacked against a language change. Most management are concerned (and in many cases, rightly so) about the cost of rewriting decades-old "proven" software as opposed to merely plugging the holes in the existing software. As long as they have enough coders plugging away at the bugs, they're likely to be inclined to say "good enough".
What will cause a shift is a continuous business loss.

If business A and B are competing in the same space, and business A has a larger market share, but experiences a customer data breach. Business B consumes many of A's customers, takes over the market, and it turns out that the reason B wasn't affected was that they used a memory-safe language.

The business cases like this will continue to pile up until it will be considered ignorant to use a non-memory safe language. It will be even more obvious when companies like B are much smaller and less funded than companies like A, but can still overtake them because of the advantage.

At least, this is the only way I can see C ever "dying". And of course by dying, I mean that it just won't be selected for large startup projects. It will always live on in low level libraries, and large existing projects (e.g. Linux). I wonder how much something like D in betterC mode can take over some of these tasks?
If you get it to compile for and run the code on an AVR, Cortex R0 or other 16 bit µC, then it would have a chance to replace C. As it stands, C is the only general "high-level" language that can be used for some classes of cpu's. D requires afaict at least a 32 bit system with virtual memory, which is already a steep requirement for embedded stuff. C will remain relevant in everything below that.
May 17
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Thursday, 18 May 2017 at 05:07:38 UTC, Patrick Schluter wrote:
 On Thursday, 18 May 2017 at 00:58:31 UTC, Steven Schveighoffer 
 wrote:
 On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
 [...]
What will cause a shift is a continuous business loss. If business A and B are competing in the same space, and business A has a larger market share, but experiences a customer data breach. Business B consumes many of A's customers, takes over the market, and it turns out that the reason B wasn't affected was that they used a memory-safe language. The business cases like this will continue to pile up until it will be considered ignorant to use a non-memory safe language. It will be even more obvious when companies like B are much smaller and less funded than companies like A, but can still overtake them because of the advantage. At least, this is the only way I can see C ever "dying". And of course by dying, I mean that it just won't be selected for large startup projects. It will always live on in low level libraries, and large existing projects (e.g. Linux). I wonder how much something like D in betterC mode can take over some of these tasks?
If you get it to compile for and run the code on an AVR, Cortex R0 or other 16 bit µC, then it would have a chance to replace C. As it stands, C is the only general "high-level" language that can be used for some classes of cpu's. D requires afaict at least a 32 bit system with virtual memory, which is already a steep requirement for embedded stuff. C will remain relevant in everything below that.
https://www.mikroe.com/products/#compilers-software One of the few companies that thinks there is more to AVR, Cortex R0 or other 16 bit µC than just C. On this specific case they also sell Basic and Pascal (TP compatible) compilers. There are other companies selling alternatives to C and still in business, one just has to look beyond FOSS.
May 17
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 18 May 2017 at 06:36:55 UTC, Paulo Pinto wrote:
 On Thursday, 18 May 2017 at 05:07:38 UTC, Patrick Schluter 
 wrote:
 On Thursday, 18 May 2017 at 00:58:31 UTC, Steven Schveighoffer 
 wrote:
 On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
 [...]
What will cause a shift is a continuous business loss. If business A and B are competing in the same space, and business A has a larger market share, but experiences a customer data breach. Business B consumes many of A's customers, takes over the market, and it turns out that the reason B wasn't affected was that they used a memory-safe language. The business cases like this will continue to pile up until it will be considered ignorant to use a non-memory safe language. It will be even more obvious when companies like B are much smaller and less funded than companies like A, but can still overtake them because of the advantage. At least, this is the only way I can see C ever "dying". And of course by dying, I mean that it just won't be selected for large startup projects. It will always live on in low level libraries, and large existing projects (e.g. Linux). I wonder how much something like D in betterC mode can take over some of these tasks?
If you get it to compile for and run the code on an AVR, Cortex R0 or other 16 bit µC, then it would have a chance to replace C. As it stands, C is the only general "high-level" language that can be used for some classes of cpu's. D requires afaict at least a 32 bit system with virtual memory, which is already a steep requirement for embedded stuff. C will remain relevant in everything below that.
https://www.mikroe.com/products/#compilers-software One of the few companies that thinks there is more to AVR, Cortex R0 or other 16 bit µC than just C. On this specific case they also sell Basic and Pascal (TP compatible) compilers. There are other companies selling alternatives to C and still in business, one just has to look beyond FOSS.
The thing with C is that it is available from the tiniest to the biggest. I remember my former workplace, where the assets of the company were communication protocols (mainframe, telecom, LAN, industrial). The same sources were used on Z80, x86 (from 80186 to Pentium), 68030, ARM, AVR and 8051 (granted, the last 2 didn't use much of the C code). Except for C, I'm not aware of any language capable of that spread.

This doesn't mean that it won't change or that something similar wasn't possible with other languages. Pascal was a good contender in the 90s, but paradoxically it was the success of Turbo Pascal that killed it (i.e. there was no chance for the ISO standard to be appealing). As for Basic, the big issue with it is that it is not even portable within a platform or between versions.

Don't get me wrong, the products you listed are nice and a good step in the right direction, but they are far from there. I love D, but it is not unfair to notice that it lacks platform diversity (I wanted to use D for years for our project at work, but the lack of a Solaris/SPARCv9 version was an insurmountable issue).
May 18
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/17/2017 10:07 PM, Patrick Schluter wrote:
 D requires afaict at least a 32 bit system
Yes.
 with virtual memory,
No.
May 18
next sibling parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Thursday, 18 May 2017 at 08:24:18 UTC, Walter Bright wrote:
 On 5/17/2017 10:07 PM, Patrick Schluter wrote:
 D requires afaict at least a 32 bit system
Yes.
What are the technical limitations of this?

* LLVM has 16bit targets
* Nobody would use druntime on 16bit anyway and would not generate module info either
* User code is the user's problem

That leaves the front end and LDC as potential sources of limitation.
May 18
prev sibling parent Johannes Pfau <nospam example.com> writes:
On Thursday, 18 May 2017 at 08:24:18 UTC, Walter Bright wrote:
 On 5/17/2017 10:07 PM, Patrick Schluter wrote:
 D requires afaict at least a 32 bit system
Yes.
You've said this a few times before but never explained why there's such a limitation? I've actually used GDC to run code on 8bit AVR as well as 16bit MSP430 controllers.

The only thing I can think of is 'far pointer' support, but the times have changed in this regard as well: TI implements 16bit or 20bit pointers for their 16 bit MSP architecture, but they never mix pointers: [1]
 The problem with a "medium" model, or any model where size_t 
 and sizeof(void *)
 are not the same, is that they technically violate the ISO C 
 standard. GCC has
 minimal support for such models, and having done some in the 
 past, I recommend against it.
AVR for a long time only allowed access to high memory using special functions, no compiler support [2]. Nowadays GCC supports named address spaces [3], but I think we could implement this completely in library code: basically, using a type wrapper template should work. The only difficulty is making it work with volatile_load, and if we can't find a better solution we'll need a new intrinsic data_load!(Flags = volatile, addrSpace = addrspace(foo), ...)(address).

Then there's the small additional 'problem' that slices will be more expensive on these architectures: if you already need 2 registers to fit a pointer and 2 for size_t, a slice will need 4 registers. So there may be some performance penalty, but OTOH these RISC machines usually have more general purpose registers available than X86.

[1] https://e2e.ti.com/support/development_tools/compiler/f/343/t/451127
[2] http://www.nongnu.org/avr-libc/user-manual/pgmspace.html
[3] https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html

-- Johannes
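The type-wrapper idea could look roughly like this - a minimal sketch under stated assumptions: the address spaces are simulated here by plain byte arrays, standing in for whatever target intrinsic or inline asm a real MCU port would need for the actual loads and stores.

```d
import std.stdio;

// Hypothetical library-level "far pointer" wrapper for named address
// spaces. Each space is simulated by a byte array; on real hardware
// load()/store() would lower to address-space-aware instructions.
ubyte[256][2] simulatedMemory;  // two fake address spaces

struct FarPtr(T, uint addrSpace)
{
    uint address;  // raw address within the given space

    T load() const
    {
        T value;
        (cast(ubyte*) &value)[0 .. T.sizeof] =
            simulatedMemory[addrSpace][address .. address + T.sizeof];
        return value;
    }

    void store(T value)
    {
        simulatedMemory[addrSpace][address .. address + T.sizeof] =
            (cast(ubyte*) &value)[0 .. T.sizeof];
    }
}

void main()
{
    auto p = FarPtr!(ushort, 1)(4);
    p.store(0xBEEF);
    writeln(p.load());  // reads the value back through address space 1
}
```

The volatile_load caveat stands: a pure library wrapper like this cannot express volatile semantics by itself, which is where the data_load-style intrinsic mentioned above would come in.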
May 18
prev sibling parent Laeeth Isharc <laeeth nospamlaeeth.com> writes:
On Thursday, 18 May 2017 at 00:58:31 UTC, Steven Schveighoffer 
wrote:
 On 5/17/17 8:27 PM, H. S. Teoh via Digitalmars-d wrote:
 [...]
What will cause a shift is a continuous business loss. If business A and B are competing in the same space, and business A has a larger market share, but experiences a customer data breach. Business B consumes many of A's customers, takes over the market, and it turns out that the reason B wasn't affected was that they used a memory-safe language. The business cases like this will continue to pile up until it will be considered ignorant to use a non-memory safe language. It will be even more obvious when companies like B are much smaller and less funded than companies like A, but can still overtake them because of the advantage. At least, this is the only way I can see C ever "dying". And of course by dying, I mean that it just won't be selected for large startup projects. It will always live on in low level libraries, and large existing projects (e.g. Linux). I wonder how much something like D in betterC mode can take over some of these tasks? -Steve
Is there any other way other than to do good work that's recognised as such and happens to be written in D? And I guess open sourcing dmd back end will make it a more acceptable language for such in time.
May 17
prev sibling parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2017-05-17 at 17:27 -0700, H. S. Teoh via Digitalmars-d wrote:
 […]
 odds are heavily stacked against a language change. Most management are
 concerned (and in many cases, rightly so) about the cost of rewriting
 decades-old "proven" software as opposed to merely plugging the holes in
 the existing software. As long as they have enough coders plugging away
 at the bugs, they're likely to be inclined to say "good enough".
[…]

If a lump of software is allowed into the "it works, do not touch it" category, then that is the beginning of the end for that product and that company. The accountants probably haven't realised it at the time they make that decision, but they have just signed the death warrant on that part of their organisation.

An organisation that keeps all of its software in development at all times may appear to spend more on development, but they are keeping the organisation's codebase in a fit state for evolution. As the market changes, the organisation can change without massive revolution. The difference here is between an organisation that treats software as a cost versus software as an asset. As long as you do not measure the asset by lines of code, obviously.

The rather interesting anecdote of the moment is FORTRAN (and Fortran). Various code bases written in the 1960s must still be compilable by current Fortran compilers because no-one is allowed to alter the source code of the 1960s codes. This makes Fortran one of the weirdest languages, and their compiler writers some of the best. Note though that all the organisations which followed the "the source code is fine now" line are having real trouble hiring FORTRAN and Fortran developers, c.f. UK government and NASA. I believe some organisations are having to hire at £2000 per day for these people.

So for the accountants: you need to look further than the next three months when it comes to your assets and bottom line over the lifetime of the organisation.

--
Russel.
===========================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
May 17
prev sibling parent Joakim <dlang joakim.fea.st> writes:
On Wednesday, 17 May 2017 at 20:41:43 UTC, Walter Bright wrote:
 On 5/17/2017 3:21 AM, Joakim wrote:
 Hmm, this talk has become the most-viewed from this DConf, by 
 far beating
 Scott's keynote.  Wonder how, as this seems to be the only 
 link to it, hasn't
 been posted on reddit/HN.  I guess people like panels, the 
 process panel last
 year is one of the most viewed videos also.
Heh, someone just posted it to HN: https://hn.algolia.com/?query=dconf2017%20walter%20safety
 I received -2 net votes on Hackernews for suggesting that the 
 takeaway from the WannaCry fiasco for developers should be to 
 use memory safe languages.

 Maybe the larger community isn't punished enough yet.
HN votes are an isolated case, but I'm not sure how much wider recognition there is that memory safety is a big part of the problem and that there exist viable languages that offer a way out, as Andrei said in the panel. There is a long way to go in publicizing these new languages that offer better solutions.
May 17
prev sibling next sibling parent reply thedeemon <dlang thedeemon.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
And then null safety will kill D. ;)
May 06
parent deadalnix <deadalnix gmail.com> writes:
On Saturday, 6 May 2017 at 17:59:38 UTC, thedeemon wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
And then null safety will kill D. ;)
I actually think this is more likely than memory safety killing C. Just because both are very important but D is just easier to kill than C for historical reasons.
May 09
prev sibling next sibling parent reply Jerry <hurricane hereiam.com> writes:
Anything that goes on the internet already has memory safety. The 
things that need it aren't written in C, there's a lot of 
programs out there that just don't require it. C won't be killed, 
there's too much already written in it. Sure there might be 
nothing new getting written in it but there will still be tons of 
software that needs to be maintained even if nothing new is being 
written in it. D also won't be that far behind it if that's truly 
the reason C gets 'killed'.

Anyways can't watch the discussion as it's private.
May 08
next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 8 May 2017 at 18:33:08 UTC, Jerry wrote:
 Anything that goes on the internet already has memory safety.
BS, a damn buffer overflow bug caused Cloudflare to spew its memory all over the internet just a couple of months ago. Discussed here: https://forum.dlang.org/post/bomiwvlcdhxfegvxxier@forum.dlang.org

These things still happen all the time. Especially when companies realize that transitioning from a Python/Ruby backend to a C++ one can save tens of thousands in server costs.
May 08
parent Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 8 May 2017 at 19:37:05 UTC, Jack Stouffer wrote:
 ...
Wrong link: https://forum.dlang.org/post/novsplitocprdvpookre@forum.dlang.org
May 08
prev sibling next sibling parent Moritz Maxeiner <moritz ucworks.org> writes:
On Monday, 8 May 2017 at 18:33:08 UTC, Jerry wrote:
 Anything that goes on the internet already has memory safety.
Bait [1]?
 The things that need it aren't written in C
Except - of course - for virtually all of our entire digital infrastructure.
 there's a lot of programs out there that just don't require it.
Just not anything that may run on a system connected to the internet. [1] https://nvd.nist.gov/vuln/search/results?adv_search=false&form_type=basic&results_type=overview&search_type=all&query=remote+buffer+overflow
May 08
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via Digitalmars-d wrote:
 Anything that goes on the internet already has memory safety.
Is that a subtle joke, or are you being serious?

A LOT of code out on the internet, both in infrastructure and as applications, runs C code. And if you know the typical level of quality of a large C project written by 50-100 (or more) employees who have a rather high turnover, you should be peeing your pants right now. A frightening amount of C code both in infrastructure (by that I mean stuff like routers, switches, firewalls, core services like DNS, etc.) and in applications (application-level services like webservers, file servers, database servers, etc.) is literally riddled with buffer overflows, null pointer dereference bugs, off-by-1 string manipulations, and other such savorable things.

Recently I've had the dubious privilege of being part of a department-wide push on the part of my employer to audit our codebases (mostly C, with a smattering of C++ and other code, all dealing with various levels of network services and running on hardware expected to be "enterprise" quality and "secure") and fix security problems and other such bugs, with the help of some static analysis tools. I have to say that even given my general skepticism about the quality of so-called "enterprise" code, I was rather shaken not only to find lots of confirmation of my gut feeling that there are major issues in our codebase, but even more by just HOW MANY of them there are.

An unsettlingly large percentage of bugs / problematic code is in the realm of not handling null pointers correctly. The simplest is checking for null correctly at the beginning of the function, but then proceeding to dereference the possibly-null pointer with wild abandon thereafter. This may seem like not such a big problem, until you realize that all it takes is for *one* of these literally *hundreds* of instances of wrong code to get exposed to a public interface, and you have a DDOS attack waiting for you in your glorious future.

Another unsettlingly common problem is the off-by-1 error in string handling.
Actually, the most unsettling thing in this area is the pervasiveness of strcpy() and strcat() -- even after decades of experience that these functions are inherently unsafe and should be avoided if at all possible. Yet they still appear with persistent frequency, introducing hidden vulnerabilities that people overlook because, oh well, we trust the guy who wrote it 'cos he's an expert C coder, so he must have already made sure it's actually OK. Unfortunately, upon closer inspection, there are actual bugs in a large percentage of such code.

Next to this is strncpy(), the touted "safe" variant of strcpy / strcat, except that people keep writing this:

    strncpy(buf, src, sizeof(buf));

Quick, without looking: what's wrong with the above line of code?

Not so obvious, huh? The problem is that strncpy is, in spite of being the "safe" version of strcpy, badly designed. It does not guarantee buf is null-terminated if src was too long to fit in buf! Next thing you know -- why, hello, unterminated string used to inject shellcode into your "secure" webserver!

The "obvious" fix, of course, is to leave 1 byte for the \0 terminator:

    strncpy(buf, src, sizeof(buf)-1);

Except that this is *still* wrong, because strncpy doesn't write a '\0' to the end. You have to manually put one there:

    strncpy(buf, src, sizeof(buf)-1);
    buf[sizeof(buf)-1] = '\0';

The second line there has a -1 that lazy/careless C coders often forget, so you end up *introducing* a buffer overrun in the name of fixing another. This single problem area (improper use of strncpy) accounts for a larger chunk of code I've audited than I dare to admit -- all just timebombs waiting for somebody to write an exploit for.

Then there's the annoyingly common matter of checking for return codes.
Walter has said this before, and he's spot on: 90% of C code out there ignores error codes where it shouldn't, so as soon as a normally-working syscall fails for whatever reason, the code cascades down a chain of unexpected control flow changes and ends in catastrophe. Or rather, in silent corruption of internal data, because any signs that something has gone wrong were conveniently ignored by the caller, of course.

And even when you *do* meticulously check for every single darn error code evah, it's so ridiculously easy to make a blunder:

    int my_func(mytype_t *input, outbuf_t *output_buf, char *buffer, int size)
    {
        /* Typical lazy way of null-checking (that will blow up
         * later) */
        myhandle_t *h = input ? input->handle : 0;
        writer_t *w = output_buf ? output_buf->writer : 0;
        char *block = (char *)malloc(size);
        FILE *fp;
        int i;

        if (!buffer)
            return -1;  /* typical useless error return code */
                        /* (also, memory leak) */

        if (h->control_block) {  /* oops, possible null deref */
            fp = fopen("blah", "w");
            if (!fp)
                return -1;  /* oops, memory leak */
        }
        if (w->buffered) {  /* oops, possible null deref */
            strncpy(buffer, input->data, size);  /* oops, unterminated string */
            if (w->write(buffer, size) != 0)
                /* hmm, is 0 the error status, or is it -1? */
                /* also, what if w->write == null? */
            {
                return -1;  /* oops, memory leak AND file descriptor leak */
            }
        }
        for (i = 0; i <= input->size; i++) {  /* oops, off-by-1 error */
            ...  /* more nauseating nasty stuff here */
            if (error)
                goto EXIT;
            ...  /* ad nauseum */
        }
    EXIT:
        if (fp)          /* oops, uninitialized ptr deref */
            fclose(fp);
        free(block);
        /* Typical lazy way of evading more tedious `if
         * (func(...) == error) goto EXIT;` style code, which
         * ends up being even more error-prone */
        return error ? -1 : w->num_bytes_written();
            /* oops, what if w or w->num_bytes_written is
             * null? */
    }

If you look hard enough, almost every line of C code has one potential problem or another.
OK, I exaggerate, but in a large codebase written by 100 people, many of whom have since left the company for greener fields, code of this sort can be found everywhere. And nobody realizes just how bad it is, because everyone is too busy fixing pointer bugs in their own code to have time to read code written by somebody else that doesn't directly concern them.

Another big cause of bugs is C/C++'s lovely convention of not initializing local variables. Hello, random stack value that just so happens to be usually 0 when we test the code, but becomes something else when the customer runs it, and BOOM, the code tumbles onto an unexpected and disastrous chain of wrong steps. And you'd better be praying that this wasn't a pointer, 'cos you know what that means... if you're lucky, it's a random memory corruption that, for the most part, goes undetected until a customer with an obscure setup triggers a visible effect. Then you spend the next 6 months trying to trace the bug from the visible effect, which is long, long past the actual cause. If you're not so lucky, though, this could result in leakage of unrelated memory locations (think Cloudbleed) or worse, arbitrary code execution. Hello, random remote hacker, welcome to DoD Nuclear Missile Control Panel. Who would you like to nuke today?

C++ has a lovely footnote to add to this: class/struct members aren't initialized by default, so the ctor has to do it, *explicitly*. How many programmers aren't too lazy to just skip this part and trust that the methods will initialize whatever members need initializing? Keep in mind that a lot of C++ code out there has yet to catch up with the latest "correct" C++ coding style -- there are still way too many god-object classes with far more fields than they should have, and of course, the guy who wrote the ctor was too lazy to initialize all of them explicitly. (You should be thankful already that he wasn't too lazy to skip writing the ctor altogether!)
And inevitably, later on some method will read the uninitialized value and do something unexpected. This doesn't happen when you first write the class, of course. But 50 ex-employees later, 'tis a whole new landscape out there.

These are some of the simpler common flaws I've come across. I've also seen several very serious bugs that could lead to actual remote exploits, if somebody tried hard enough to find a path to them from the outside.

tl;dr: the C language simply isn't friendly towards memory-safe code. Most C coders (including myself, I'll admit, before this code audit) are unaware of just how bad it is, because over the years, we've accumulated a set of idioms of how to write safe code (and the scars to prove that, at least in *some* cases, they result in safer code). Too bad our accumulated wisdom isn't enough to prevent *all* of the blunders that we still regularly commit, except now we're even less aware of them, because, after all, we are experienced C coders now, so surely we've outgrown such elementary mistakes! Due to past experience we've honed our eagle eyes to catch mistakes we've learned from... unfortunately, that also distracts us from *other* mistakes the language also happily lets slip by. It gives us a false sense of security that we could easily detect blunders just by eyeing the code carefully. I was under that false sense... until I participated in the audit, and discovered to my chagrin that there are a LOT of other bugs that I often fail to catch because I simply didn't have them in mind, or they were more subtle to detect, or just by pure habit I was looking in other directions and therefore missed an otherwise obvious problem spot.

Walter is probably right that one of C's biggest blunders was to conflate arrays and pointers.
I'd say 85-90% of the bugs I found were directly or indirectly caused by C arrays not carrying length information along with the reference to the data, of which the misuse of strncpy and off-by-1 errors in loops are prime examples. Another big part of C's unsafety is the legacy C library that contains far too many misdesigned safety-bombs like strcpy and strcat, which are there merely for legacy reasons but really ought to have been killed with fire 2 decades ago. You'd think people know better by now, but no, they STILL keep writing code that calls these badly designed functions...

In this day and age of automated exploit-hunting bots, it's only a matter of time before somebody, or some*thing*, discovers that sending a certain kind of packet to a certain port on a certain firewall produces an unusual response... and pretty soon, somebody is worming his way into your supposedly secure local network and doing who knows what. And it's scary how much poorly-written C code is running on the targeted machine, and how much more poor C code is being written every day, even today, for stuff that's going to be running on the backbone of the internet or on some wide-impact online applications (hello, Heartbleed!). Something's gonna give eventually.

As I was participating in the code audit, I couldn't help thinking how many of D's features would have outright prevented a large percentage of the bugs I found before the code found its way into production.

1) D arrays have length! Imagine that! This singlehandedly eliminates an entire class of bugs with one blow. No more strcpy/strncpy monstrosities. No more buffer overruns -- thanks to array bounds checks. Not to mention slices eliminating the need for the ubiquitous string copying in C/C++ that represents a not-often-thought-of background source of performance degradation.

2) D variables are initialized by default: this would have prevented all of the uninitialized value bugs I found.
And where performance matters (hint: it usually doesn't matter where you think it does -- please fess up: which C coder here regularly uses a profiler? Probably only a minority.), you ask for =void explicitly. So when there's a bug involving uninitialized values later, the offending line is easily found.

3) Exceptions, love it or hate it, dispense with that horrible C idiom of repeatedly writing `if (func(...) != OK) goto EXIT;` that's so horribly error-prone, and so non-DRY that the programmer is basically motivated to find an excuse NOT to write it that way (and thereby inevitably introduce a bug). Here I have to add that there's this perverse thought out there that C++ exceptions are "expensive", and so in the name of performance people would outlaw using try/catch blocks, substituting homegrown (and inevitably buggy -- and not necessarily less expensive) alternatives instead. All the while ignoring the fact that C/C++ arrays being what they are, too much array copying (esp. string copying) is happening where in D you'd just be taking a slice in O(1) time.

4) D switches would eliminate certain very nasty bugs that I discovered, involving code that assumes a variable can only hold certain values when it actually doesn't, in which case nasty problems happen. In D, a non-final switch requires a default case... seemingly onerous, but it prevents this class of bugs. Also, the deprecation of switch case fallthrough is a step in the right direction, along with goto case for when you *want* fallthrough. Inadvertent fallthrough was one of the bugs I found that happens every so often -- but at the same time there were a number of false positives where fallthrough was intentional. D solves both problems by having the code explicitly document intent with goto case.

5) scope(exit) is also great for preventing resource leaks in a way that doesn't require "code at a distance" (very error prone).

The one big class of issues that D doesn't solve is null pointer handling.
Exceptions help to some extent by making it possible to do a check at the top and abort immediately if something is null, but it's still possible to have some nasty null pointer bugs in D. Fortunately, D mostly dispenses with the need to directly manipulate pointers, so the surface area for bugs is somewhat smaller than in C (and older-style C++ code -- unfortunately still prevalent even today). A good part of the null pointer bugs I found were related to C's lack of closures -- D's delegates would obviate the need for direct pointer manipulation in this case, even if D still suffers from null handling issues.

Anyway, the point of all this is that C/C++'s safety problems are very real, and C/C++ code is very widespread in online services, and more C/C++ code is still being written every day for online services, and a lot of that code is still being affected by the lack of safety in C/C++. So safety problems in C/C++ are very relevant today, and will continue being a major concern in the near future. If we can complete the implementation of SafeD (and plug the existing holes in @safe), it could have significant impact in this area.

T

-- 
Not all rumours are as misleading as this one.
May 08
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:


 	strncpy(buf, src, sizeof(buf));

 Quick, without looking: what's wrong with the above line of 
 code?

 Not so obvious, huh?  The problem is that strncpy is, in spite 
 of being the "safe" version of strcpy, badly designed. It does 
 not guarantee buf is null-terminated if src was too long to fit 
 in buf!  Next thing you know -- why, hello, unterminated string 
 used to inject shellcode into your "secure" webserver!

 The "obvious" fix, of course, is to leave 1 byte for the \0 
 terminator:

 	strncpy(buf, src, sizeof(buf)-1);

 Except that this is *still* wrong, because strncpy doesn't 
 write a '\0' to the end. You have to manually put one there:

 	strncpy(buf, src, sizeof(buf)-1);
 	buf[sizeof(buf)-1] = '\0';

 The second line there has a -1 that lazy/careless C coders 
 often forget, so you end up *introducing* a buffer overrun in 
 the name of fixing another.

 This single problem area (improper use of strncpy) accounts for 
 a larger chunk of code I've audited than I dare to admit -- all 
 just timebombs waiting for somebody to write an exploit for.
Adding to that, strncpy() is also a performance trap. strncpy will not stop when the input string is finished; it will fill the rest of the buffer up with 0. So

	char buff[4000];
	strncpy(buff, "hello", sizeof buff);

will write 4000 bytes on every call.

The thing with strncpy() is that it's a badly named function. It is named as a string function but isn't a string function. Had it been named memncpy() or something like that, it wouldn't confuse most C programmers. If I get my C lore right, the function was initially written for writing the file name into the Unix directory entry:

	strncpy(dirent, filename, 14);

or something like that.
May 09
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 	int my_func(mytype_t *input, outbuf_t *output_buf,
 	            char *buffer, int size)
 	{
 		/* Typical lazy way of null-checking (that will blow up
 		 * later) */
 		myhandle_t *h = input ? input->handle : 0;
 		writer_t *w = output_buf ? output_buf->writer : 0;
 		char *block = (char *)malloc(size);
Hey, you've been outed as a C++ programmer. A real C programmer never casts a void *.

In that specific case, casting away the malloc() return can mask a nasty bug. If you have forgotten to include the header declaring the function, the compiler would assume an int-returning function and the cast would suppress the righteous warning message of the compiler. On 64-bit machines the returned pointer would be truncated to the lower half. Unfortunately on Linux, as the heap starts in the lower 4 GiB of address space, the code would run for a long time before it crashed. On Solaris/SPARC it would crash directly, as binaries are loaded at address 0x1_0000_0000 of the address space.
 		FILE *fp;
 		int i;

 		if (!buffer)
 			return -1; /* typical useless error return code */
 				/* (also, memory leak) */

 		if (h->control_block) { /* oops, possible null deref */
 			fp = fopen("blah", "w");
 			if (!fp)
 				return -1; /* oops, memory leak */
 		}
 		if (w->buffered) { /* oops, possible null deref */
 			strncpy(buffer, input->data, size); /* oops, unterminated 
 string */
 			if (w->write(buffer, size) != 0)
 				/* hmm, is 0 the error status, or is it -1? */
 				/* also, what if w->write == null? */
Or is it inspired by fwrite, which returns the number of written records? In that case a 0 return might be an error or not, depending on size.
 			{
 				return -1; /* oops, memory leak AND file
 						descriptor leak */
 			}
 		}
 		for (i = 0; i <= input->size; i++) {	/* oops, off-by-1 error 
 */
 			... /* more nauseating nasty stuff here */
 			if (error)
 				goto EXIT;
 			... /* ad nauseum */
 		}
 	EXIT:
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on a FILE opened for writing. fclose() can fail if the disk was full. As the FILE is buffered, the last fwrite might not have flushed it yet. So it is the fclose() that will try to write the last block, and that can fail -- but the app wouldn't even be able to report it.
 		free(block);

 		/* Typical lazy way of evading more tedious `if
 		 * (func(...) == error) goto EXIT;` style code, which
 		 * ends up being even more error-prone */
 		return error ? -1 : w->num_bytes_written();
 			/* oops, what if w or w->num_bytes_written is
 			 * null? */
 	}
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via Digitalmars-d
wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 
 	int my_func(mytype_t *input, outbuf_t *output_buf,
 	            char *buffer, int size)
 	{
 		/* Typical lazy way of null-checking (that will blow up
 		 * later) */
 		myhandle_t *h = input ? input->handle : 0;
 		writer_t *w = output_buf ? output_buf->writer : 0;
 		char *block = (char *)malloc(size);
Hey, you've been outed as a C++ programmer. A real C programmer never casts a void *. In that specific case, casting away the malloc() return can mask a nasty bug. If you have forgotten to include the header declaring the function, the compiler would assume an int returning function and the cast would suppress the righteous warning message of the compiler. On 64 bit machines the returned pointer would be truncated to the lower half. Unfortunately on Linux, as the heap starts in the lower 4 GiB of address space, the code would run for a long time before it crashed. On Solaris-SPARC it would crash directly as binaries are loaded address 0x1_0000_0000 of the address space.
Ouch. Haha, even I forgot about this particularly lovely aspect of C. Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?

There's an even more pernicious version of this, in that the compiler blindly believes whatever you declare a symbol to be, and the declaration doesn't even have to be in a .h file or anything even remotely related to the real definition. Here's a (greatly) reduced example (paraphrased from an actual bug I discovered):

	module.c:
	-------
	int get_passwd(char *buf, int size);

	int func() {
		char passwd[100];
		if (!get_passwd(passwd, 100)) return -1;
		do_something(passwd);
	}

	passwd.c:
	---------
	void get_passwd(struct user_db *db, struct login_record *rec) {
		... // stuff
	}

	old_passwd.c:
	-------------
	/* Please don't use this code anymore, it's deprecated. */
	/* ^^^^ gratuitous useless comment */
	int get_passwd(char *buf, int size) { ... /* old code */ }

Originally, in the makefile, module.o is linked with libutil.so, which in turn is built from old_passwd.o and a bunch of other stuff. Later on, passwd.o was added to libotherutil.so, which was listed after libutil.so in the linker command, so the symbol conflict was masked because the linker found the libutil.so version of get_passwd first.

Then one day, somebody changed the order of libraries in the makefile, and suddenly func() mysteriously starts malfunctioning because get_passwd now links to the wrong version of the function!

Worse yet, the makefile was written to be "smart", as in, it uses wildcards to pick up .so files (y'know, us lazy programmers don't wanna have to manually type out the name of every library).
So when somebody tried to fix this bug by removing old_passwd.o from libotherutil.so altogether, the bug was still happening on other developers' machines, because a stale copy of the old version of libotherutil.so was still left in their source tree. So when *they* built the executable, it contained the bug, but the bug vanished when built from a fresh checkout. Who knows how many hours were wasted chasing after this heisenbug.

[...]
 		if (w->buffered) { /* oops, possible null deref */
 			strncpy(buffer, input->data, size); /* oops, unterminated string */
 			if (w->write(buffer, size) != 0)
 				/* hmm, is 0 the error status, or is it -1? */
 				/* also, what if w->write == null? */
Or is it inspired by fwrite, which returns the number of written records? In that case 0 return might be an error or not, depends on size.
Yep, fwrite has an utterly lovely interface. The epitome of API design. :-D [...]
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on writing FILE. fclose() can fail if the disk was full. As the FILE is buffered, the last fwrite might not have flushed it yet. So it is the fclose() that will try to write the last block and that can fail, but the app wouldn't be able to even report it.
[...]

Haha, you're right. NONE of the code I've ever dealt with even considers this case. None at all. In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose(). Maybe I've written it *once* in my lifetime when I was young and naïve, and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!

So again Walter was spot on: fill up the disk to 99% full, and 99% of C programs would start malfunctioning and showing all kinds of odd behaviours, because they never check the return code of printf, fprintf, or fclose, or any of a whole bunch of other syscalls that are regularly *assumed* to just work, when in reality they *can* fail.

The worst part of all this is, this kind of C code is prevalent everywhere in C projects, including those intended for supposedly security-aware software. Basically, the language itself is just so unfriendly to safe coding practices that it's nigh impossible to write safe code in it. It's *theoretically* possible, certainly, but in practice nobody writes C code that way. It is a scary thought indeed, how much of our current infrastructure relies on software running this kind of code. Something's gotta give, eventually. And it ain't gonna be pretty when it all starts crumbling down.

T

-- 
Caffeine underflow. Brain dumped.
May 09
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via
[...]
 Ouch.  Haha, even I forgot about this particularly lovely 
 aspect of C. Hooray, freely call functions without declaring 
 them, and "obviously" they return int! Why not?

 There's an even more pernicious version of this, in that the 
 compiler blindly believes whatever you declare a symbol to be, 
 and the declaration doesn't even have to be in a .h file or 
 anything even remotely related to the real definition. Here's a 
 (greatly) reduced example (paraphrased from an actual bug I 
 discovered):

 	module.c:
 	-------
 	int get_passwd(char *buf, int size);
Yeah, this is a code smell: a non-static function prototype declared in a C file. That raises the alarm bells automatically now.

The same issue, but much more frequent to observe: extern variable declarations in .c files. That one is really widespread, and few see it as an anti-pattern. An extern global variable should always be put in the header file, never in the C file -- exactly for the same reason as your example with the wrong prototype below: non-matching types that the linker will join wrongly.
 	int func() {
 		char passwd[100];
 		if (!get_passwd(buf, 100)) return -1;
 		do_something(passwd);
 	}

 	passwd.c:
 	---------
 	void get_passwd(struct user_db *db, struct login_record *rec) {
 		... // stuff
 	}

 	old_passwd.c:
 	-------------
 	/* Please don't use this code anymore, it's deprecated. */
 	/* ^^^^ gratuitous useless comment */
 	int get_passwd(char *buf, int size) { ... /* old code */ }

 Originally, in the makefile, module.o is linked with 
 libutil.so, which in turn is built from old_passwd.o and a 
 bunch of other stuff. Later on, passwd.o was added to 
 libotherutil.so, which was listed after libutil.so in the 
 linker command, so the symbol conflict was masked because the 
 linker found the libutil.so version of get_passwd first.

 Then one day, somebody changed the order of libraries in the 
 makefile, and suddenly func() mysteriously starts 
 malfunctioning because get_passwd now links to the wrong 
 version of the function!

 Worse yet, the makefile was written to be "smart", as in, it 
 uses wildcards to pick up .so files (y'know, us lazy 
 programmers don't wanna have to manually type out the name of 
 every library).
Yeah, we also had makefiles using wildcards. It took a long time but I managed to get rid of them.
 So when somebody tried to fix this bug by removing old_passwd.o 
 from libotherutil.so altogether, the bug was still happening in 
 other developers' machines, because a stale copy of the old 
 version of libotherutil.so was still left in their source tree, 
 so when *they* built the executable, it contains the bug, but 
 the bug vanishes when built from a fresh checkout. Who knows 
 how many hours were wasted chasing after this heisenbug.


 [...]
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on writing FILE. fclose() can fail if the disk was full. As the FILE is buffered, the last fwrite might not have flushed it yet. So it is the fclose() that will try to write the last block and that can fail, but the app wouldn't be able to even report it.
[...] Haha, you're right. NONE of the code I've ever dealt with even considers this case. None at all. In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose(). Maybe I've written it *once* in my lifetime when I was young and naïve, and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!
I discovered that one only a few months ago. I have now around 30 places in our code base to fix. It's only important for FILEs opened for writing; for reading, the return value can be ignored.
 So again Walter was spot on: fill up the disk to 99% full, and 
 99% of C programs would start malfunctioning and showing all 
 kinds of odd behaviours, because they never check the return 
 code of printf, fprintf, or fclose, or any of a whole bunch of 
 other syscalls that are regularly *assumed* to just work, when 
 in reality they *can* fail.

 The worst part of all this is, this kind of C code is prevalent 
 everywhere in C projects, including those intended for 
 supposedly security-aware software.  Basically, the language 
 itself is just so unfriendly to safe coding practices that it's 
 nigh impossible to write safe code in it.  It's *theoretically* 
 possible, certainly, but in practice nobody writes C code that 
 way.  It is a scary thought indeeed, how much of our current 
 infrastructure relies on software running this kind of code.  
 Something's gotta give, eventually. And it ain't gonna be 
 pretty when it all starts crumbling down.
Agreed. That's why I'm learning D now; it's probably the only language that will be able to progressively replace our C code base in skunkworks fashion. I wanted to do it already 5 or 6 years ago but couldn't, as we were on Solaris/SPARC back then. Now that we have migrated to Linux-AMD64 there's not much holding us back. The Oracle client is maybe still an issue, though.
May 09
prev sibling parent reply Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 Ouch.  Haha, even I forgot about this particularly lovely 
 aspect of C. Hooray, freely call functions without declaring 
 them, and "obviously" they return int! Why not?
To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strncpy). The remaining problems theoretically shouldn't occur with disciplined use of commonly accepted C99 guidelines. But I agree that even then, and with the use of sanitizers, writing correct C code remains very hard.
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 11:09:27PM +0000, Guillaume Boucher via Digitalmars-d
wrote:
 On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 Ouch.  Haha, even I forgot about this particularly lovely aspect of
 C.  Hooray, freely call functions without declaring them, and
 "obviously" they return int! Why not?
To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strnlen).
The problem is that warnings don't work, because people ignore them. Everybody knows warnings shouldn't be ignored, but let's face it, when you make a 1-line code change and run make, and the output is 250 pages long (large project, y'know), any warnings that are buried somewhere in there won't even be noticed, much less acted on.

In this sense I agree with Walter that warnings are basically useless, because they're not enforced. Either something is correct and compiles, or it should be an error that stops compilation. Anything else, and you start having people ignore warnings. Yes I know, there's gcc -Werror and the analogous flags for the other compilers, but in a sufficiently large project, -Werror is basically impractical because too much of the legacy code will just refuse to compile, and it's not feasible to rewrite / waste time fixing it.

As for avoiding de-facto-deprecated functions, I've already said it: *everybody* knows strcat is bad, and strcpy is bad, and so on and so forth. So how come I still see new C code being written almost every day that continues to use these functions? It's not that the coders refuse to cooperate... I've seen a lot of code in my project where people meticulously use strncpy instead of strcat / strcpy -- I presume out of the awareness that they are "bad". But when push comes to shove and there's a looming deadline, all scruples are thrown to the winds and people just take the path of least resistance. The mere fact that strcat and strcpy exist means that somebody, sometime, will use them, and usually to disastrous consequences.

And *that's* the fundamental problem with C (and on the same principle, C++): the correct way to write code is also a very onerous, fragile, error-prone, and verbose way of writing code. The "obvious" and "easy" way to write C code is almost always the wrong way. The incentives are all wrong, and so there's a big temptation for people to cut corners and take the easy way out.
It's much easier to write this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc = ctx->data;
		FILE *fp = fopen(desc->filename, "w");
		char *tmp = malloc(1000);

		strcpy(tmp, desc->data1);
		fwrite(tmp, strlen(tmp), 1, fp);
		strcpy(tmp, desc->data2);
		fwrite(tmp, strlen(tmp), 1, fp);
		strcpy(desc->cache, tmp);
		fclose(fp);
		free(tmp);
		return 0;
	}

rather than this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc;
		FILE *fp;
		char *tmp;
		size_t bufsz;

		if (!ctx) return INVALID_CONTEXT;
		desc = ctx->data;
		if (!desc->data1 || !desc->data2) return INVALID_ARGS;

		fp = fopen("blah", "w");
		if (!fp) return CANT_OPEN_FILE;

		bufsz = desc->data1_len + desc->data2_len + 1;
		tmp = malloc(bufsz);
		if (!tmp) return OUT_OF_MEMORY;

		strncpy(tmp, desc->data1, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1) {
			fclose(fp);
			unlink("blah");
			free(tmp);
			return IO_ERROR;
		}
		strncpy(tmp, desc->data2, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1) {
			fclose(fp);
			unlink("blah");
			free(tmp);
			return IO_ERROR;
		}
		if (desc->cache)
			strncpy(desc->cache, tmp, sizeof(desc->cache));

		if (fclose(fp) != 0) {
			WARN("I/O error");
			free(tmp);
			return IO_ERROR;
		}
		free(tmp);
		return OK;
	}

Most people would probably write something in between, which is neither completely wrong, nor completely right. But it works for 90% of the cases, and since it meets the deadline, it's "good enough".

Notice how much longer and more onerous it is to write the "correct" version of the code than the easy way. A properly-designed language ought to reverse the incentives: the default, "easy" way to write code should be the "correct", safe, non-leaking way. Potentially unsafe, potentially resource-leaking behaviour should require work on the part of the coder, so that he'd only do it when there's a good reason for it (optimization, or writing system code that needs to go outside the confines of the default safe environment, etc.).

In this respect, D scores much better than C/C++. Very often, the "easy" way to write something in D is also the correct way.
It may not be the fastest way for the performance-obsessed premature-optimizing C hacker crowd (and I include myself among them), but it won't leak memory, overrun buffers, act on random stack values from uninitialized local variables, etc.. Your program is correct to begin with, which then gives you a stable footing to start working on improving its performance. In C/C++, your program is most likely wrong to begin with, so imagine what happens when you try to optimize that wrong code in typical C/C++ hacker premature optimization fashion.

(Nevermind the elephant in the room that 80-90% of the "optimizations" C/C++ coders -- including myself -- have programmed into their finger reflexes are actually irrelevant at best, because either compilers already do those optimizations for you, or the hot spot simply isn't where we'd like to believe it is; or outright de-optimizing at worst, because we've successfully defeated the compiler's optimizer by writing inscrutable code.)

Null dereference is one area where D does no better than C/C++, though even in that case, language features like closures help alleviate much of the kind of code that would otherwise need to deal with pointers directly. (Yes, I'm aware C++ now has closures... but most of the C++ code out in the industry -- and C++ coders themselves -- have a *long* ways to go before they can catch up with the latest C++ standards. Until then, it's lots of manual pointer manipulations that are ready to explode in your face anytime.)
 The remaining problems theoretically shouldn't occur by disciplined
 use of commonly accepted C99 guidelines.  But I agree that even then
 and with the use of sanitizers writing correct C code remains very
 hard.
That's another fundamental problem with the C/C++ world: coding by convention. We all know all too well that *if* we'd only abide by such-and-such coding guidelines and recommendations, our code would actually stand a chance of being correct, safe, non-leaking, etc.. However, the problem with conventions is that they are just that: conventions. They get broken all the time, with disastrous consequences.

I used to believe in convention -- after all, who wouldn't want to be a goodie-two-shoes coder who abides by all the rules and can take pride in his shiny, perfect code? Unfortunately, after almost 20 years working in the industry and seeing "enterprise" code that makes my eyes bleed, I've lost all confidence that conventions are of any help. I've seen code written by supposedly "renowned" or "expert" C coders that represents some of the most repulsive, stomach-turning examples of antipatterns I've ever had the dubious pleasure of needing to debug.

D's stance of static verifiability and compile-time guarantees is an oft under-appreciated big step in the right direction. In the long run, conventions will not solve anything; you need *enforcement*. The compiler has to be able to prove, at compile-time, that function X is actually pure, or nothrow, or safe, or whatever, for those things to have any value whatsoever. And for this to be possible, the language itself needs to have these notions built-in, rather than have them tacked on by an external tool (that people will be reluctant to use, or outright ignore, or that doesn't work with their strange build system, target arch, or whatever).

Sure, there are currently implementation bugs that make @safe not quite so safe in some cases, or too much of Phobos is still @safe-incompatible. But still, these are implementation quality issues. The concept itself is a sound and powerful one. A compiler-verified attribute is far more effective than any blind-faith trust in convention ever will be, e.g., D's immutable vs. C++'s easy-to-cast-away const -- that we *trust* people won't attempt. Yes, I'm aware of bugs in the current implementation that allow you to bypass immutable, but still, it's a QoI issue. And yes, there are areas in the spec that have holes, etc.. But assuming these QoI issues and spec holes / inconsistencies are fixed, what we have is a powerful system that will actually deliver compile-time guarantees about memory safety, rather than a system of conventions that you can never be too sure that somebody somewhere didn't break, and therefore you can only *hope* that it is memory-safe.

T

-- 
Life is too short to run proprietary software. -- Bdale Garbee
May 09
next sibling parent reply "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 In this sense I agree with Walter that warnings are basically useless,
 because they're not enforced. Either something is correct and compiles,
 or it should be an error that stops compilation. Anything else, and you
 start having people ignore warnings.
Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")

And then the flip side is that some code smells are just too pedantic to justify breaking the build while the programmer is in the middle of some debugging or refactoring or some such. That puts me strongly in the philosophy of "Code containing warnings: allowed while compiling, disallowed when committing (with allowances for mitigating circumstances)."

C/C++ doesn't demonstrate that warnings are doomed to be useless and "always" ignored. What it demonstrates is that warnings are NOT an appropriate strategy for fixing language problems.
 As for avoiding de-facto-deprecated functions, I've already said it:
 *everybody* knows strcat is bad, and strcpy is bad, and so on and so
 forth.  So how come I still see new C code being written almost every
 day that continues to use these functions?  It's not that the coders
 refuse to cooperate... I've seen a lot of code in my project where
 people meticulously use strncpy instead of strcat / strcpy -- I presume
 out of the awareness that they are "bad".  But when push comes to shove
 and there's a looming deadline, all scruples are thrown to the winds and
 people just take the path of least resistance.  The mere fact that
 strcat and strcpy exist means that somebody, sometime, will use them,
 and usually to disastrous consequences.
The moral of this story: Sometimes, breaking people's code is GOOD! ;)
 And *that's* the fundamental problem with C (and in the same principle,
 C++): the correct way to write code is also a very onerous, fragile,
 error-prone, and verbose way of writing code. The "obvious" and "easy"
 way to write C code is almost always the wrong way.  The incentives are
 all wrong, and so there's a big temptation for people to cut corners and
 take the easy way out.
Damn straight :)
 (Nevermind the elephant in the room that 80-90% of the "optimizations"
 C/C++ coders -- including myself -- have programmed into their finger
 reflexes are actually irrelevant at best, because either compilers
 already do those optimizations for you, or the hot spot simply isn't
 where we'd like to believe it is; or outright de-optimizing at worst,
 because we've successfully defeated the compiler's optimizer by writing
 inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
 That's another fundamental problem with the C/C++ world: coding by
 convention.  We all know all too well that *if* we'd only abide by
 such-and-such coding guidelines and recommendations, our code would
 actually stand a chance of being correct, safe, non-leaking, etc..
Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)
May 09
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky (Abscissa) via
Digitalmars-d wrote:
 On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 
 In this sense I agree with Walter that warnings are basically
 useless, because they're not enforced. Either something is correct
 and compiles, or it should be an error that stops compilation.
 Anything else, and you start having people ignore warnings.
 
Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")
I'd much rather the compiler say "Hey, you! This piece of code is probably wrong, so please fix it! If it was intentional, please write it another way that makes that clear!" - and abort with a compile error.

This is actually one of the things I like about D. For example, if you wrote:

    switch (e) {
    case 1: return "blah";
    case 2: return "bluh";
    }

the compiler will refuse to compile the code until you either add a default case, or make it a final switch (in which case the compiler will refuse to compile the code unless every possible case is in fact covered).

Now imagine if this was merely a warning that people could just ignore. Yep, we're squarely back in good ole C/C++ land, where an unexpected value of e causes the code to amble down an unexpected path, with the consequent hilarity that ensues. IOW, it should not be possible to write tricky stuff by default; you should need to ask for it explicitly so that intent is clear.

Another switch example:

    switch (e) {
    case 1: x = 2;
    case 2: x = 3;
    default: x = 4;
    }

In C, the compiler happily compiles the code for you. In D, at least the latest dmd will give you deprecation warnings (and presumably, in the future, actual compile errors) for forgetting to write `break;`. But if the fallthrough was intentional, you document that with an explicit `goto case ...`. IOW, the default behaviour is the safe one (no fallthrough), and the non-default behaviour (fallthrough) has to be explicitly asked for. Much, much better.
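For what it's worth, the C-side hazard is easy to reproduce; with the `break`s forgotten, every case below the match also runs, and the compiler accepts it without complaint (a sketch, with a made-up `classify` function):

    #include <assert.h>

    /* Matching case 1 falls through all the way to default. */
    static int classify(int e) {
        int x = 0;
        switch (e) {
        case 1: x = 2; /* fall through -- no break */
        case 2: x = 3; /* fall through -- no break */
        default: x = 4;
        }
        return x;
    }

    int main(void) {
        /* Every input ends up at 4 -- almost certainly not what was meant. */
        assert(classify(1) == 4);
        assert(classify(2) == 4);
        assert(classify(99) == 4);
        return 0;
    }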
 And then the flip side is that some code smells are just too pedantic
 to justify breaking the build while the programmer is in the middle of
 some debugging or refactoring or some such.
 
 That puts me strongly in the philosophy of "Code containing warnings:
 Allowed while compiling, disallowed when committing (with allowances
 for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code. But I definitely agree that code with warnings should never make it into the code repo. The problem is that it's not enforced by the compiler, so *somebody* somewhere will inevitably bypass it.
 C/C++ doesn't demonstrate that warnings are doomed to be useless and
 "always" ignored. What it demonstrates is that warnings are NOT an
 appropriate strategy for fixing language problems.
Point. I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work. It's entirely possible that it's a bias specific to my job, but somehow I have a suspicion that this isn't completely the case. Humans tend to be lazy, and ignoring compiler warnings is rather high up on the list of things lazy people tend to do. The likelihood increases with the presence of other factors like looming deadlines, unreasonable customer requests, ambiguous feature specs handed down from the PTBs, or just plain having too much on your plate to be bothered with "trivialities" like fixing compiler warnings.

That's why my eventual conclusion is that anything short of enforcement will ultimately fail. Unless there is no way you can actually get an executable out of badly-written code, there will always be *somebody* out there who will write bad code. And by Murphy's Law, that somebody will eventually be someone on your team, and chances are you'll be the one cleaning up the mess afterwards. Not something I envy doing (I've already had to do too much of that).

[...]
 The moral of this story: Sometimes, breaking people's code is GOOD! ;)
Tell that to Walter / Andrei. ;-) [...]
 (Nevermind the elephant in the room that 80-90% of the
 "optimizations" C/C++ coders -- including myself -- have programmed
 into their finger reflexes are actually irrelevant at best, because
 either compilers already do those optimizations for you, or the hot
 spot simply isn't where we'd like to believe it is; or outright
 de-optimizing at worst, because we've successfully defeated the
 compiler's optimizer by writing inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong.

Or maybe my perceptions are just heavily colored by the supposedly "expert" C coders I've met, who wrote supposedly better code that I eventually realized was actually not better, but in many ways actually worse -- less readable, less maintainable, more error-prone to write, and at the end of the day arguably less performant, because it ultimately led to far too much boilerplate and other sources of code bloat, excessive string copying, too much indirection (cache unfriendliness), and other such symptoms that C coders often overlook.

(And meanwhile, the mere mention of the two letters "G C" and they instantly recoil, and rattle off an interminable list of 20-years-outdated GC-phobic excuses, preferring rather to die the death of a thousand pointer bugs (and memory leaks, and overrun buffers) than succumb to the Java of the early 90's with its klunky, poorly-performing GC of spotted repute that has long since been surpassed. And of course, any mention of any evidence that Java *might* actually perform better than poorly-written C code in some cases will incite instant vehement denial. After all, how can an "interpreted" language possibly outperform poorly-designed, over-engineered C scaffolding that necessitates far too much excessive buffer copying and destroys cache coherence with far too many unnecessary indirections? Inconceivable!)
 That's another fundamental problem with the C/C++ world: coding by
 convention.  We all know all too well that *if* we'd only abide by
 such-and-such coding guidelines and recommendations, our code would
 actually stand a chance of being correct, safe, non-leaking, etc..
Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)
Exactly. Not compiler warnings... :-D T -- You have to expect the unexpected. -- RL
May 09
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 5/10/17 8:28 AM, H. S. Teoh via Digitalmars-d wrote:
 C++'s fundamental paradigm has always been "Premature-optimization
 oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong. Or maybe my perceptions are just heavily colored by the supposedly "expert" C coders I've met, who wrote supposedly better code that I eventually realized was actually not better, but in many ways actually worse -- less readable, less maintainable, more error-prone to write, and at the end of the day arguably less performant because it ultimately led to far too much boilerplate and other sources of code bloat, excessive string copying, too much indirection (cache unfriendliness), and other such symptoms that C coders often overlook.
Just to add a different perspective - the people I work with are the kind of guys who know when not to trust the profiler and what to try if the profile is flat. There is no question raised about whether you should run it; it's just assumed you always do.

P.S. Can't wait to see an "Are we fast yet?" graph for Phobos functions.

--- Dmitry Olshansky
May 10
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky 
 (Abscissa) via Digitalmars-d wrote:
 On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 
 In this sense I agree with Walter that warnings are 
 basically useless, because they're not enforced. Either 
 something is correct and compiles, or it should be an error 
 that stops compilation. Anything else, and you start having 
 people ignore warnings.
 
Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")
I'd much rather the compiler say "Hey, you! This piece of code is probably wrong, so please fix it! If it was intentional, please write it another way that makes that clear!" - and abort with a compile error.

This is actually one of the things I like about D. For example, if you wrote:

    switch (e) {
    case 1: return "blah";
    case 2: return "bluh";
    }

the compiler will refuse to compile the code until you either add a default case, or make it a final switch (in which case the compiler will refuse to compile the code unless every possible case is in fact covered).

Now imagine if this was merely a warning that people could just ignore. Yep, we're squarely back in good ole C/C++ land, where an unexpected value of e causes the code to amble down an unexpected path, with the consequent hilarity that ensues. IOW, it should not be possible to write tricky stuff by default; you should need to ask for it explicitly so that intent is clear.

Another switch example:

    switch (e) {
    case 1: x = 2;
    case 2: x = 3;
    default: x = 4;
    }

In C, the compiler happily compiles the code for you. In D, at least the latest dmd will give you deprecation warnings (and presumably, in the future, actual compile errors) for forgetting to write `break;`. But if the fallthrough was intentional, you document that with an explicit `goto case ...`. IOW, the default behaviour is the safe one (no fallthrough), and the non-default behaviour (fallthrough) has to be explicitly asked for. Much, much better.
 And then the flip side is that some code smells are just too
 pedantic to justify breaking the build while the programmer is
 in the middle of some debugging or refactoring or some such.
 
 That puts me strongly in the philosophy of "Code containing 
 warnings: Allowed while compiling, disallowed when committing 
 (with allowances for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code. But I definitely agree that code with warnings should never make it into the code repo. The problem is that it's not enforced by the compiler, so *somebody* somewhere will inevitably bypass it.
 C/C++ doesn't demonstrate that warnings are doomed to be 
 useless and "always" ignored. What it demonstrates is that 
 warnings are NOT an appropriate strategy for fixing language 
 problems.
Point. I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work. It's entirely possible that it's a bias specific to my job, but somehow I have a suspicion that this isn't completely the case. Humans tend to be lazy, and ignoring compiler warnings is rather high up on the list of things lazy people tend to do. The likelihood increases with the presence of other factors like looming deadlines, unreasonable customer requests, ambiguous feature specs handed down from the PTBs, or just plain having too much on your plate to be bothered with "trivialities" like fixing compiler warnings.

That's why my eventual conclusion is that anything short of enforcement will ultimately fail. Unless there is no way you can actually get an executable out of badly-written code, there will always be *somebody* out there who will write bad code. And by Murphy's Law, that somebody will eventually be someone on your team, and chances are you'll be the one cleaning up the mess afterwards. Not something I envy doing (I've already had to do too much of that).

[...]
 The moral of this story: Sometimes, breaking people's code is 
 GOOD! ;)
Tell that to Walter / Andrei. ;-) [...]
 (Nevermind the elephant in the room that 80-90% of the 
 "optimizations" C/C++ coders -- including myself -- have 
 programmed into their finger reflexes are actually 
 irrelevant at best, because either compilers already do 
 those optimizations for you, or the hot spot simply isn't 
 where we'd like to believe it is; or outright de-optimizing 
 at worst, because we've successfully defeated the compiler's 
 optimizer by writing inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong.
The likelihood of a randomly picked C/C++ programmer not even knowing what a profiler is, much less having used one, is extremely high in my experience. I worked with a lot of embedded C programmers with several years of experience who knew nothing but embedded C. We're talking dozens of people here. Not one of them had ever used a profiler. In fact, a senior developer (now tech lead) doubted I could make our build system any faster. I did, by 2 orders of magnitude. When I presented the result to him he said in disbelief: "But, how? I mean, if it's doing exactly the same thing, how can it be faster?". Big O? Profiler? What are those? I actually stood there for a few seconds with my mouth open because I didn't know what to say back to him.

These people are also likely to raise concerns about performance during code review despite having no idea what a cache line is. They still opine that one shouldn't add another function call for readability because that'll hurt performance. No need to measure anything; we all know calling functions is bad, even when they're in the same file and the callee is `static`.

I think a lot of us underestimate just how bad the "average" developer is. A lot of them write C code, which is like giving chainsaws to chimpanzees.
 (And meanwhile, the mere mention of the two letters "G C" and 
 they instantly recoil, and rattle off an interminable list of
That's cognitive dissonance: there's not much anyone can do about that. Unfortunately, facts don't matter, feelings do. Atila
May 10
next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
[...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gear) and I can say that there is a good reason for that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and GNU tools, but at that time it wasn't). The tools don't exist, or if they do, the instrumentation breaks the constraints of the controller.

This was also one of the reasons we chose our embedded CPUs very carefully. We always chose processors for which there existed mainstream desktop versions, so that we could at least use the comfortable tooling to test some parts of the code in a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT).

TL;DR: profiling for embedded is orders of magnitude harder than for nice OS environments.
May 10
next sibling parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 12:18:40 UTC, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
 [...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gear) and I can say that there is a good reason for that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and GNU tools, but at that time it wasn't). The tools don't exist, or if they do, the instrumentation breaks the constraints of the controller.

This was also one of the reasons we chose our embedded CPUs very carefully. We always chose processors for which there existed mainstream desktop versions, so that we could at least use the comfortable tooling to test some parts of the code in a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT).

TL;DR: profiling for embedded is orders of magnitude harder than for nice OS environments.
That doesn't mean they shouldn't know what a profiler is. The response would then be (assuming they're competent) "I wish I could use a profiler, but I can't because...", not "how can two programs output the same thing in different amounts of time?".

Also, there's a good way around this sort of thing, and it applies to testing as well: run the tools on a development machine (and the tests). Write portable standards-compliant code, make a thin wrapper where needed, and suddenly you can write tests easily, run valgrind, use address sanitizer, ... There's no good reason why you can't profile pure algorithms: C code is C code and has specified semantics whether it's running on a dev machine or a controller. The challenge is to write mostly pure code with thin IO wrappers. It's always a win/win though.

Atila
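The pure-algorithm-plus-thin-wrapper idea can be sketched in a few lines of standard C. The ADC conversion and its constants here are hypothetical, just to show the shape:

    #include <assert.h>
    #include <stdint.h>

    /* Pure, standard C: converts a raw 12-bit ADC reading to millivolts
     * against a 3300 mV reference. No IO, no hardware registers, so it
     * compiles and runs identically on the target and on a dev box. */
    static uint32_t adc_to_millivolts(uint16_t raw) {
        return ((uint32_t)raw * 3300u) / 4095u;
    }

    /* On the target, a thin wrapper reads the hardware register and calls
     * adc_to_millivolts(); on the host we just test the pure part. */
    int main(void) {
        assert(adc_to_millivolts(0) == 0);
        assert(adc_to_millivolts(4095) == 3300);
        assert(adc_to_millivolts(2048) == 1650);
        return 0;
    }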
May 10
prev sibling parent Adrian Matoga <dlang.spam matoga.info> writes:
On Wednesday, 10 May 2017 at 12:18:40 UTC, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
 [...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gears) and I can say that there is a good reason to that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and gnu tools but at that time it wasn't). The tools don't exist or if they do, the instrumentation breaks the constraints of the controller. This was also one of the reason we chose our embedded CPU's very carefully. We always chose processors for which there existed mainstream desktop versions so that we could at least use the confortable tooling to test some parts of the code on a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT). TL;DR profiling for embedded is order of magnitudes harder than for nice OS environments.
IMO it's just different. The thing is, the tools you can use don't need to be marketed as "profilers". People will always find excuses if they lack the time, will or knowledge. In practice, there's always a way to profile and debug, even if you don't have dedicated tools for it.

It's also a lot easier to reason about performance on small chips with no caches, ILP, etc. and with fixed instruction timing than it is on modern complex CPUs with hundreds of tasks competing for resources. One universal tool is the oscilloscope; for sure you have one on your colleague's desk if you really do embedded stuff.

A common way to profile on home computers from the '80s, such as the Atari XE (6502), was simply to change screen colors. That way you always knew the time taken by the measured code with 1-cycle precision. 13.5 scanlines are white? That's 1539 cycles! The time it took to execute a tight loop could even be computed accurately with pen and paper by just looking at the assembly. It's also a lot easier to implement a cycle-exact emulator for such simple chips, and then you can measure everything without the observer effect.
May 14
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via Digitalmars-d wrote:
[...]
 The likelihood of a randomly picked C/C++ programmer not even knowing
 what a profiler is, much less having used one, is extremely high in my
 experience.  I worked with a lot of embedded C programmers with
 several years of experience who knew nothing but embedded C. We're
 talking dozens of people here. Not one of them had ever used a
 profiler. In fact, a senior developer (now tech lead) doubted I could
 make our build system any faster. I did by 2 orders of magnitude.
Very nice! Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins.

(Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.)
 When I presented the result to him he said in disbelief: "But, how? I
 mean, if it's doing exactly the same thing, how can it be faster?".
 Big O?  Profiler? What are those? I actually stood there for a few
 seconds with my mouth open because I didn't know what to say back to
 him.
Glad to hear I'm not the only one faced with senior programmers who show surprising ignorance in matters you'd think they really ought to know like the back of their hand.
 These people are also likely to raise concerns about performance
 during code review despite having no idea what a cache line is. They
 still opine that one shouldn't add another function call for
 readability because that'll hurt performance. No need to measure
 anything, we all know calling functions is bad, even when they're in
 the same file and the callee is `static`.
Yep, typical C coder premature optimization syndrome. I would not be surprised if today there's still a significant number of C coders who believe that writing "i++;" is faster than writing "i=i+1;". Ironically, these same people would also come up with harebrained schemes of avoiding something they are prejudiced against, like C++ standard library string types, while ignoring the cost of needing to constantly call O(n) algorithms for string processing (strlen, strncpy, etc.).

I remember many years ago, when I was still young and naïve, in one of my projects I spent days micro-optimizing my code to eliminate every last CPU cycle I could from my linked-list type, only to discover to my chagrin that the bottleneck was nowhere near it -- it was caused by a debugging fprintf() that I had forgotten to take out. And I had only found this out because I finally conceded to run a profiler. That was when this amazing concept finally dawned on me that I could possibly be *wrong* about my ideas of performant code, imagine that!

(Of course, then later on I discovered that my meticulously optimized linked list was ultimately worthless, because it had O(n) complexity, whereas had I just used a library type instead, I could've had O(log n) complexity. But I had dismissed the library type because it was obviously "too complex" to possibly be performant enough for my oh-so-performance-critical code. (Ahem. It was a *game*, and not even a good one. But it absolutely needed every last drop of juice I could squeeze from the CPU. Oh yes.))
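The classic instance of the O(n) string-function trap is calling strlen() in a loop condition, which quietly turns a linear pass into a quadratic one. A sketch (the space-counting functions are made up for illustration):

    #include <assert.h>
    #include <string.h>

    /* O(n^2): strlen() rescans the whole string on every iteration. */
    static size_t count_spaces_slow(const char *s) {
        size_t n = 0;
        for (size_t i = 0; i < strlen(s); i++)  /* strlen re-runs each pass */
            if (s[i] == ' ') n++;
        return n;
    }

    /* O(n): hoist the length out of the loop condition. */
    static size_t count_spaces_fast(const char *s) {
        size_t n = 0;
        for (size_t i = 0, len = strlen(s); i < len; i++)
            if (s[i] == ' ') n++;
        return n;
    }

    int main(void) {
        const char *s = "the quick brown fox";
        assert(count_spaces_slow(s) == 3);
        assert(count_spaces_fast(s) == 3);
        return 0;
    }

Both give the same answer, which is exactly why the slow one survives code review; only a profiler (or big inputs) exposes the difference.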
 I think a lot of us underestimate just how bad the "average" developer
 is. A lot of them write C code, which is like giving chainsaws to
 chimpanzees.
[...] Hmm. What would giving them D be equivalent to, then? :-D T -- If you're not part of the solution, you're part of the precipitate.
May 10
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 18:58:35 UTC, H. S. Teoh wrote:
 On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via 
 Digitalmars-d wrote: [...]
 [...]
Very nice! Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins. (Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.) [...]
 Hmm. What would giving them D be equivalent to, then? :-D
I'm not sure! If I knew you were going to ask that I'd probably have picked a different analogy ;) Atila
May 10
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded 
 hypothesis is that the majority of C/C++ programmers ...
Just a nitpick, could we also please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but it's generally far from it. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
May 10
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 12:06:46PM +0000, Patrick Schluter via Digitalmars-d
wrote:
 On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded hypothesis
 is that the majority of C/C++ programmers ...
Just a nitpick, could we also please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but it's generally far from it. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
OK, I'll try to stop conflating them... but the main reason for that is because I find myself stuck in-between the two, having started myself on C (well, assembly before that, but anyway), then moved on to C++, only to grow skeptical of C++'s direction of development and eventually settle on a hybrid of the two commonly known as "C with classes" (i.e., a dialect of C++ without some of what I consider to be poorly-designed features). Recently, though, I've mostly been working on pure C because of my job. I used to still use "C with classes" in my own projects, but after I found D, I'd essentially sworn myself off ever using C++ in my own projects again.

My experience reviewing the C++ code that comes up every now and then at work, though, tells me that the average typical C++ programmer is probably worse than the average typical C programmer when it comes to code quality. And C++ gives you just so many more ways to shoot yourself in the foot. The joke used to go that C gives you many ways to shoot yourself in the foot, but C++ gives you many ways to shoot yourself in the foot and then encapsulate all the evidence away, all packaged in one convenient wrapper.

(And don't get me started on C++ "experts" who invent extravagantly over-engineered class hierarchies that nobody can understand, 90% of which is actually completely irrelevant to the task at hand, resulting in such abysmal performance that people just bypass the whole thing in the first place and revert to copy-pasta-ism and using C hacks in C++ code, causing double the carnage.

Once I had to invent a stupendous hack to bridge a C++ daemon with a C module whose owners flatly refused to link in any C++ libraries. The horrendous result had 7 layers of abstraction just to make a single function call, one of which involved fwrite()-ing function arguments to a file, fork-and-exec'ing, and fread()-ing them from the other end. Why didn't I just open a socket to the daemon directly? Because the ridiculously over-engineered daemon only understands the reverse-encrypted Klingon protocol spoken by a makefile-generated IPC wrapper file containing 2000 procedurally-generated templates (I kid you not, I'm not talking about 2000 instantiations of one template, I'm talking about 2000 templates which are themselves procedurally generated), and the only way you could speak this protocol was to use the resultant ridiculously bloated C++ library. Which the PTBs have dictated that I cannot link into the C module. What else was a man to do?)

T -- Try to keep an open mind, but not so open your brain falls out. -- theboz
May 10
prev sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/10/2017 08:06 AM, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded hypothesis is
 that the majority of C/C++ programmers ...
Just a nitpick, could we also please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but it's generally far from it. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
I wouldn't know the difference all that well anyway. Aside from a brief stint playing around with the Marmalade engine, the last time I was still really using C *or* C++ was back when C++ *did* mean little more than "C with classes" (and there was this new "templates" thing that was considered best avoided for the time being because all the implementations were known buggy). I left them when I could tell the complexity of getting things done (in either) was falling way behind the modern curve, and there were other languages which offered sane productivity without completely sacrificing low-level capabilities.
May 11
prev sibling parent reply "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/10/2017 02:28 AM, H. S. Teoh via Digitalmars-d wrote:
 I'd much rather the compiler say "Hey, you! This piece of code is
 probably wrong, so please fix it! If it was intentional, please write it
 another way that makes that clear!" - and abort with a compile error.
In the vast majority of cases, yes, I agree. But I've seen good ideas of useful heads-ups the compiler *could* provide get shot down in favor of silence because making it an error would, indeed, be a pedantic pain. As I see it, an argument against warnings is an argument against lint tools. And lint messages are *less* likely to get heeded, because the user has to actually go ahead and bother to install and run them.
 That puts me strongly in the philosophy of "Code containing warnings:
 Allowed while compiling, disallowed when committing (with allowances
 for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code.
I find anything too pedantic to be an outright error will *seriously* get in my way and break my workflow on the task at hand when I'm dealing with refactoring, debugging, playing around with an idea, etc., if I'm required to compulsively "clean them all up" at every little step along the way (it'd be like working with my mother hovering over my shoulder...). And that's been the case even for things I would normally want to be informed of. Dead/unreachable code and unused variables are two examples that come to mind.
 The problem is that
 it's not enforced by the compiler, so *somebody* somewhere will
 inevitably bypass it.
I never understood the "Some people ignore it, therefore it's good to remove it and prevent anyone else from ever benefiting" line of reasoning. I don't want all "caution" road signs ("stop sign ahead", "hidden driveway", "speed limit decreases ahead", etc.) ripped out of the ground and tossed just because there are some jackasses who ignore them and cause trouble. Bad things happen when people ignore road signs, and they do ignore road signs, therefore let's get rid of road signs. That wouldn't make any shred of sense, would it?

It's the same thing here: I'd rather have somebody somewhere bypass that enforcement than render EVERYONE completely unable to benefit from it, ever. When the compiler keeps silent about a code smell instead of emitting a warning, that's exactly the same as emitting a warning but *requiring* that *everybody* *always* ignores it. "Sometimes" missing a heads-up is better than "always" missing it.
 C/C++ doesn't demonstrate that warnings are doomed to be useless and
 "always" ignored. What it demonstrates is that warnings are NOT an
 appropriate strategy for fixing language problems.
Point. I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work.
So nobody else should have the opportunity to benefit from them? Because that's what the alternative is. As soon as we buy into the "error" vs "totally ok" false dichotomy, we start hitting (and this is exactly what did happen in D many years ago) cases where a known code smell is too pedantic to be justifiable as a build-breaking error. So if we buy into the "error/ok" dichotomy, those code smells are forced into the "A-Ok!" bucket, guaranteeing that nobody benefits. Those "X doesn't fit into the error vs ok dichotomy" realities are exactly why DMD wound up with a set of warnings despite Walter's philosophical objections to them.
 That's why my eventual conclusion is that anything short of enforcement
 will ultimately fail. Unless there is no way you can actually get an
 executable out of badly-written code, there will always be *somebody*
 out there that will write bad code. And by Murphy's Law, that somebody
 will eventually be someone in your team, and chances are you'll be the
 one cleaning up the mess afterwards.  Not something I envy doing (I've
 already had to do too much of that).
And when I am tasked with cleaning up that bad code, I *really* hope it's from me being the only one to read the warnings, and not because I just wasted the whole day tracking down some weird bug only to find it was caused by something the compiler *could* have warned me about, but chose not to because the compiler doesn't believe in warnings out of fear that somebody, somewhere might ignore it.
May 11
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/11/2017 10:20 PM, Nick Sabalausky (Abscissa) wrote:
 On 05/10/2017 02:28 AM, H. S. Teoh via Digitalmars-d wrote:
 I'm on the fence about the former.  My current theory is that being
 forced to write "proper" code even while refactoring actually helps the
 quality of the resulting code.
I find anything too pedantic to be an outright error will *seriously* get in my way and break my workflow on the task at hand when I'm dealing with refactoring, debugging, playing around with an idea, etc., if I'm required to compulsively "clean them all up" at every little step along the way
Another thing to keep in mind is that deprecations are nothing more than a special type of warning. If code must be either "error" or "non-error" with no in-between, then that rules out deprecations. They would be forced to either become fatal errors (thus defeating the whole point of keeping an old symbol around marked as deprecated) or go away entirely.
May 11
prev sibling parent Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Wednesday, 10 May 2017 at 01:19:08 UTC, Nick Sabalausky 
(Abscissa) wrote:
 The moral of this story: Sometimes, breaking people's code is 
 GOOD! ;)
I don't get the hate that compiler warnings get in the D community. Sure, you can disable them if you don't care, but then don't complain about C being inherently unsafe and bug-prone while praising D for breaking things.

Uninitialized variables are an example of something that I think does not need to be a language feature: If the compiler can prove the usage is sound, everything is fine. The compiler has much deeper knowledge about the concrete case than static language rules do. If the analysis fails, issue a warning. Usually the problematic code is far from obvious, and refactoring is a good idea. If the programmer still thinks that no action is needed, just suppress that warning with a pragma.
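A minimal sketch of the distinction being described, assuming gcc or clang (the function names are invented; `-Wmaybe-uninitialized` is the warning this kind of flow analysis feeds):

```c
#include <assert.h>

/* The first function is provably sound: every path assigns `result`
 * before it is read, so the compiler's flow analysis stays silent.
 * The second leaves one path unassigned, which is exactly the case
 * where gcc/clang's -Wmaybe-uninitialized issues a warning rather
 * than a hard error. */
int classify_sound(int x)
{
    int result;          /* not initialized at declaration... */
    if (x > 0)
        result = 1;
    else
        result = -1;     /* ...but every branch assigns it */
    return result;       /* fine: provably initialized */
}

int classify_unsound(int x)
{
    int result;
    if (x > 0)
        result = 1;      /* x <= 0 falls through unassigned */
    return result;       /* -Wmaybe-uninitialized warns here */
}
```

Compiled with `gcc -Wall`, only the second function draws a diagnostic, and a pragma such as `#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"` can suppress it in the cases the programmer has judged to be false positives.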
May 10
prev sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}

 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning. Side note: scope(exit) is one of the best inventions in PLs ever.
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
 
 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point. Even when you're consciously trying to write "proper" C code, there are just far too many ways things can go wrong that slip-ups are practically inevitable. Eventually, the idiom that I (and others) converged on looks something like this:

	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
		void *resource1, *resource2, *resource3;
		int ret = RET_ERROR;

		/* Vet arguments */
		if (!blah || !bleh || !bluh)
			return ret;

		/* Acquire resources */
		resource1 = acquire_resource(blah->blah);
		if (!resource1) goto EXIT;

		resource2 = acquire_resource(bleh->bleh);
		if (!resource1) goto EXIT;

		resource3 = acquire_resource(bluh->bluh);
		if (!resource1) goto EXIT;

		/* Do actual work */
		if (do_step1(blah, resource1) == RET_ERROR)
			goto EXIT;

		if (do_step2(blah, resource1) == RET_ERROR)
			goto EXIT;

		if (do_step3(blah, resource1) == RET_ERROR)
			goto EXIT;

		ret = RET_OK;
	EXIT:
		/* Cleanup everything */
		if (resource3) release(resource3);
		if (resource2) release(resource2);
		if (resource1) release(resource1);

		return ret;
	}

In other words, we just converged to what essentially amounts to a try-catch block with the manual equivalent of RAII. After a while, this is pretty much the only way to have any confidence at all that there aren't any hidden resource leaks or other such errors in the code.

Of course, this is only the first part of the equation. There's also managing buffers and arrays safely, which still needs to be addressed. We haven't quite gotten there yet, but at least some of the code now has started moving away from C standard library string functions completely, and towards a common string buffer type where you work with a struct wrapper with functions for appending data, extracting the result, etc. IOW, we're slowly reinventing a properly-encapsulated string type that's missing from the language.
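The kind of string-buffer wrapper described above might look something like this minimal sketch (all names here are invented for illustration; the actual codebase's type is not shown in the thread):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A growable, always-NUL-terminated string buffer: the struct owns
 * its storage and the append function handles bounds and resizing,
 * so callers never touch strcpy/strncpy directly. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} strbuf;

static int strbuf_init(strbuf *sb)
{
    sb->cap = 16;
    sb->len = 0;
    sb->data = malloc(sb->cap);
    if (!sb->data)
        return -1;
    sb->data[0] = '\0';
    return 0;
}

static int strbuf_append(strbuf *sb, const char *s)
{
    size_t n = strlen(s);
    while (sb->len + n + 1 > sb->cap) {
        size_t newcap = sb->cap * 2;
        char *p = realloc(sb->data, newcap);
        if (!p)
            return -1;          /* original buffer still owned by sb */
        sb->data = p;
        sb->cap = newcap;
    }
    memcpy(sb->data + sb->len, s, n + 1);   /* copies the '\0' too */
    sb->len += n;
    return 0;
}

static void strbuf_free(strbuf *sb)
{
    free(sb->data);
    sb->data = NULL;
    sb->len = sb->cap = 0;
}
```

The point of the encapsulation is that the truncation and overflow logic lives in exactly one place instead of being re-derived (and occasionally botched) at every call site.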
So eventually, after so much effort chasing down pointer bugs, buffer overflows, resource leaks, and the rest of C's endless slew of pitfalls, we're gradually reinventing RAII, try-catch blocks, and string types all over again. It's like historians are repeating each other^W^W^W^W^W I mean, history is repeating itself. :-D

It makes me wonder how much longer it will take for us to gradually converge onto features that today we're enjoying in D. Will it take another decade of segfaults, untraceable pointer bugs, security holes, and memory leaks? Who knows. I'm secretly hoping that between now and then, D finally takes off and we can finally shed this dinosaur-age language that should have died after the 70's, or the 80's at the latest, yet still persists to this day.
 Side note: scope(exit) is one of the best inventions in PLs ever.
Ironically, D has gone so far past the woes that still plague C coders every day that scope(exit) is hardly ever used in D anymore. :-P It has its uses, certainly, but in my regular D code, I'm already benefitting so much from D's other features that I can hardly think of a use case for scope(exit) anymore, in the context of idiomatic D. I do regularly find myself wishing for scope(exit) in my C code, though! T -- Век живи - век учись. А дураком помрёшь.
May 09
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 05/09/2017 10:26 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via 
Digitalmars-d wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}

 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point.
I caught that too but I thought you were testing whether we were listening. ;)
 Eventually, the idiom that I (and others) eventually converged on looks
 something like this:

 	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
 		void *resource1, *resource2, *resource3;
 		int ret = RET_ERROR;

 		/* Vet arguments */
 		if (!blah || !bleh || !bluh)
 			return ret;

 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;
Copy paste error! :p (resource1 should be resource2.)
 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;
Ditto.
 		/* Do actual work */
 		if (do_step1(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step2(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step3(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		ret = RET_OK;
 	EXIT:
 		/* Cleanup everything */
 		if (resource3) release(resource3);
 		if (resource2) release(resource2);
 		if (resource1) release(resource1);

 		return ret;
 	}
As an improvement, consider hiding the checks and the goto statements in macros:

	resource2 = acquire_resource(bleh->bleh);
	exit_if_null(resource1);

	err = do_step2(blah, resource1);
	exit_if_error(err);

Or something similar... Obviously, it requires certain standardization like functions never having a goto statement, yet all having an EXIT area, etc. It makes C code very uniform, which is a good thing as you notice nonstandard idioms quickly.

This safer way of needing to do everything in steps of two lines is one of the reasons why I was convinced that exceptions are superior to return codes.

Ali
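One plausible way the exit_if_null / exit_if_error macros suggested above could be written (the definitions are a guess at the convention; they assume every function declares `int ret = RET_ERROR;` and ends in an EXIT: cleanup label, and all the other names are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

#define RET_ERROR (-1)
#define RET_OK      0

/* The do/while(0) wrapper makes each macro behave like a single
 * statement; both jump to the function's shared EXIT label. */
#define exit_if_null(p)   do { if (!(p)) goto EXIT; } while (0)
#define exit_if_error(e)  do { if ((e) == RET_ERROR) goto EXIT; } while (0)

static int do_step(int x) { return x >= 0 ? RET_OK : RET_ERROR; }

static int demo(int *resource, int x)
{
    int ret = RET_ERROR;
    int err;

    exit_if_null(resource);     /* argument/resource check */

    err = do_step(x);           /* one step... */
    exit_if_error(err);         /* ...one check */

    ret = RET_OK;
EXIT:
    /* cleanup would go here, guarded by null checks */
    return ret;
}
```

With this shape, a function either reaches `ret = RET_OK;` or falls through to EXIT with the error code intact, which is the uniformity the convention is after.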
May 10
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 04:38:48AM -0700, Ali ehreli via Digitalmars-d wrote:
 On 05/09/2017 10:26 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d
wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
[...]
 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point.
I caught that too but I thought you were testing whether we were listening. ;)
Haha, I guess I'm not as good of a C coder as I'd like to think I am. :-D [...]
 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;
Copy paste error! :p (resource1 should be resource2.)
 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;
Ditto.
Ouch. Ouch. :-D But then again, I've actually seen similar copy-paste errors in real code before, too. Sometimes they could be overlooked for >5 years (I kid you not, I actually checked the date in svn blame / svn log). [...]
 As an improvement, consider hiding the checks and the goto statements
 in macros:
 
     resource2 = acquire_resource(bleh->bleh);
     exit_if_null(resource1);
 
     err = do_step2(blah, resource1);
     exit_if_error(err);
 
 Or something similar... Obviously, it requires certain standardization
 like functions never having a goto statement, yet all having an EXIT
 area, etc.  It makes C code very uniform, which is a good thing as you
 notice nonstandard idioms quickly.
Yes, eventually this is the only sane and consistent way to deal with these problems. Unfortunately, in C this can only be done by convention, which means that some non-conforming code will inevitably slip through and cause havoc.

Also, this starts running dangerously near the slippery slope down into macro hell, where the project accretes its own idiomatic set of inscrutable macro usage conventions and eventually almost all of the C syntax has disappeared and the code no longer looks like C. Then along comes New Recruit, and he makes a right mess with it because he doesn't understand the 15-level-deep nested macros in the global include/macros.h file that's become a 5200-line monstrosity of unreadable CPP hacks.

(Also not exaggerating: the very project I'm working on has a module that's written this way, and only the initiated dare dream of fixing bugs in those macros. Fortunately, they have not yet nested to 15 levels deep, so for the most part you just copy and paste existing working code and pray that it will Just Work by analogy. Actually understand what you just wrote? Pfeh! You don't have time for that. The customer wants the release by last week. Copy-n-paste cargo cult FTW!)
 This safer way of needing to do everything in steps of two lines is
 one of the reasons why I was convinced that exceptions are superior to
 return codes.
[...] Yeah, once practically every single statement in your function is an if-statement checking for error codes, you start wondering, why can't the language abstract this nasty boilerplate away for me?! And then the need for exceptions becomes clear. T -- Written on the window of a clothing store: No shirt, no shoes, no service.
May 10
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 10 May 2017 at 17:51:38 UTC, H. S. Teoh wrote:
 Haha, I guess I'm not as good of a C coder as I'd like to think 
 I am. :-D
That comment puts you ahead of the pack already :)
May 11
prev sibling parent reply Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Wednesday, 10 May 2017 at 05:26:11 UTC, H. S. Teoh wrote:
 	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
 		void *resource1, *resource2, *resource3;
 		int ret = RET_ERROR;

 		/* Vet arguments */
 		if (!blah || !bleh || !bluh)
 			return ret;

 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;

 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;

 		/* Do actual work */
 		if (do_step1(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step2(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step3(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		ret = RET_OK;
 	EXIT:
 		/* Cleanup everything */
 		if (resource3) release(resource3);
 		if (resource2) release(resource2);
 		if (resource1) release(resource1);

 		return ret;
 	}
In modern C and with GLib (which makes use of a gcc/clang extension) you can write this as:

	gboolean myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
		/* Cleanup everything automatically at the end */
		g_autoptr(GResource) resource1 = NULL, resource2 = NULL,
			resource3 = NULL;
		gboolean ok;

		/* Vet arguments */
		g_return_val_if_fail(blah != NULL, FALSE);
		g_return_val_if_fail(bleh != NULL, FALSE);
		g_return_val_if_fail(bluh != NULL, FALSE);

		/* Acquire resources */
		ok = acquire_resource(resource1, blah->blah);
		g_return_val_if_fail(ok, FALSE);

		ok = acquire_resource(resource2, bleh->bleh);
		g_return_val_if_fail(ok, FALSE);

		ok = acquire_resource(resource3, bluh->bluh);
		g_return_val_if_fail(ok, FALSE);

		/* Do actual work */
		ok = do_step1(blah, resource1);
		g_return_val_if_fail(ok, FALSE);

		ok = do_step2(blah, resource1);
		g_return_val_if_fail(ok, FALSE);

		return do_step3(blah, resource1);
	}

Some random example of this style of coding: https://github.com/flatpak/flatpak/blob/master/common/flatpak-db.c
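For reference, g_autoptr is built on the gcc/clang `cleanup` variable attribute, which can be used without GLib at all. A minimal stand-alone sketch of the same mechanism (the `autofree` macro and the allocation counter are invented for illustration):

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocations = 0;

/* Called automatically when a variable marked `autofree` goes out of
 * scope -- the same mechanism GLib's g_autoptr wraps. */
static void free_and_count(char **p)
{
    if (*p) {
        free(*p);
        live_allocations--;
    }
}

#define autofree __attribute__((cleanup(free_and_count)))

static int use_buffer(void)
{
    autofree char *buf = malloc(64);
    if (!buf)
        return -1;
    live_allocations++;
    buf[0] = 'x';
    return 0;   /* buf is freed automatically on every return path */
}
```

Every return path, including early error returns, runs the cleanup handler, which is what makes the goto-EXIT boilerplate unnecessary in this style.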
May 10
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 12:34:05PM +0000, Guillaume Boucher via Digitalmars-d
wrote:
[...]
 In modern C and with GLib (which makes use of a gcc/clang extension) you can
 write this as:
 
 gboolean myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
         /* Cleanup everything automatically at the end */
         g_autoptr(GResource) resource1 = NULL, resource2 = NULL,
                 resource3 = NULL;
         gboolean ok;
 
         /* Vet arguments */
         g_return_val_if_fail(blah != NULL, FALSE);
         g_return_val_if_fail(bleh != NULL, FALSE);
         g_return_val_if_fail(bluh != NULL, FALSE);
 
         /* Acquire resources */
         ok = acquire_resource(resource1, blah->blah);
         g_return_val_if_fail(ok, FALSE);
 
         ok = acquire_resource(resource2, bleh->bleh);
         g_return_val_if_fail(ok, FALSE);
 
         ok = acquire_resource(resource3, bluh->bluh);
         g_return_val_if_fail(ok, FALSE);
 
         /* Do actual work */
         ok = do_step1(blah, resource1);
         g_return_val_if_fail(ok, FALSE);
 
         ok = do_step2(blah, resource1);
         g_return_val_if_fail(ok, FALSE);
 
         return do_step3(blah, resource1);
 }
[...] Yes, this would address the problem somewhat, but the problem is again, this is programming by convention. The language doesn't enforce that you have to write code this way, and because it's not enforced, *somebody* will ignore it and write things the Bad Ole Way. You're essentially writing in what amounts to a subdialect of C using GLib idioms, and that's not a bad thing in and of itself. But the larger language that includes all the old unsafe ways of writing code is still readily accessible. By Murphy's Law, somebody will eventually write something that breaks the idiom and causes problems.

Also, because this way of writing code is not part of the language, the compiler cannot verify that you're using the macros correctly. And it cannot verify that you didn't write goto labels or other things that might conflict with the way the macros are implemented. Lack of hygiene in C macros does not help in this respect.

I don't dispute that there are ways of writing correct (or mostly correct) C code. But the problem is that these ways of writing correct C code are (1) non-obvious to someone not already in the know, and so you will always have people who either don't know about them or aren't sufficiently well-versed in them to use them effectively; and (2) not statically enforceable because they are not a part of the language.

Lack of enforcement, in the long run, can only end in disaster, because programming by convention does not work. It works as long as the convention is kept, but humans are fallible, and we all know how well humans are at keeping conventions over a sustained period of time (or even just short periods of time).

Not even D is perfect in this regard, but it has taken significant steps in the right directions. Correct-by-default (well, for the most part anyway, barring compiler bugs / spec issues) and static guarantees (verified by the compiler -- again barring compiler bugs) are major steps forward.
Ultimately, I'm unsure how far a language can go with static guarantees: I think somewhere along the line human error will still be unpreventable, because you start running into the halting problem when verifying certain things. But I certainly think there's still a LOT that can be done by the language between here and there, much more than what we have today. T -- Mediocrity has been pushed to extremes.
May 10
prev sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
May 09
next sibling parent reply Adrian Matoga <dlang.spam matoga.info> writes:
On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+1
May 09
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/09/2017 06:29 AM, Adrian Matoga wrote:
 On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+1
+ a kajillion (give or take a few hundred)
May 09
prev sibling parent Martin Tschierschke <mt smartdolphin.de> writes:
On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+=1; Yes, good idea!
May 09
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Monday, May 08, 2017 23:15:12 H. S. Teoh via Digitalmars-d wrote:
 Recently I've had the dubious privilege of being part of a department
 wide push on the part of my employer to audit our codebases (mostly C,
 with a smattering of C++ and other code, all dealing with various levels
 of network services and running on hardware expected to be "enterprise"
 quality and "secure") and fix security problems and other such bugs,
 with the help of some static analysis tools. I have to say that even
 given my general skepticism about the quality of so-called "enterprise"
 code, I was rather shaken not only to find lots of confirmation of my
 gut feeling that there are major issues in our codebase, but even more
 by just HOW MANY of them there are.
In a way, it's amazing how successful folks can be with software that's quite buggy. A _lot_ of software works just "well enough" that it gets the job done but is actually pretty terrible. And I've had coworkers argue to me before that writing correct software really doesn't matter - it just has to work well enough to get the job done. And sadly, to a great extent, that's true.

However, writing software that works just "well enough" does come at a cost, and if security is a real concern (as it increasingly is), then that sort of attitude is not going to cut it. But since the cost often comes later, I don't think that it's at all clear that we're going to really see a shift towards languages that prevent such bugs. Up front costs tend to have a powerful impact on decision making - especially when the cost that could come later is theoretical rather than guaranteed.

Now, given that D is also a very _productive_ language to write in, it stands to reduce up front costs as well, and that combined with its ability to reduce the theoretical security costs, we could have a real win. But with how entrenched C and C++ are, and how much many companies are geared towards not caring about security or software quality so long as the software seems to get the job done, I think that it's going to be a _major_ uphill battle for a language like D to really gain mainstream use on anywhere near the level that languages like C and C++ have. But for those who are willing to use a language that makes it harder to write code with memory safety issues, there's a competitive advantage to be gained.

- Jonathan M Davis
May 11
next sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/11/2017 11:53 AM, Jonathan M Davis via Digitalmars-d wrote:
 In a way, it's amazing how successful folks can be with software that's
 quite buggy. A _lot_ of software works just "well enough" that it gets the
 job done but is actually pretty terrible. And I've had coworkers argue to me
 before that writing correct software really doesn't matter - it just has to
 work well enough to get the job done. And sadly, to a great extent, that's
 true.

 However, writing software that's works just "well enough" does come at a
 cost, and if security is a real concern (as it increasingly is), then that
 sort of attitude is not going to cut it. But since the cost often comes
 later, I don't think that it's at all clear that we're going to really see a
 shift towards languages that prevent such bugs. Up front costs tend to have
 a powerful impact on decision making - especially when the cost that could
 come later is theoretical rather than guaranteed.

 Now, given that D is also a very _productive_ language to write in, it
 stands to reduce up front costs as well, and that combined with its ability
 to reduce the theoretical security costs, we could have a real win, but with
 how entrenched C and C++ are and how much many companies are geared towards
 not caring about security or software quality so long as the software seems
 to get the job done, I think that it's going to be a _major_ uphill battle
 for a language like D to really gain mainstream use on anywhere near the
 level that languages like C and C++ have. But for those who are willing to
 use a language that makes it harder to write code with memory safety issues,
 there's a competitive advantage to be gained.
All very, unfortunately, true. It's like I say, the tech industry isn't engineering, it's fashion. There is no meritocracy here, not by a long shot. In tech: What's popular is right and what's right is popular, period.
May 11
prev sibling parent reply Laeeth Isharc <laeethnospam nospam.laeeth.com> writes:
On Thursday, 11 May 2017 at 15:53:40 UTC, Jonathan M Davis wrote:
 On Monday, May 08, 2017 23:15:12 H. S. Teoh via Digitalmars-d 
 wrote:
 Recently I've had the dubious privilege of being part of a 
 department wide push on the part of my employer to audit our 
 codebases (mostly C, with a smattering of C++ and other code, 
 all dealing with various levels of network services and 
 running on hardware expected to be "enterprise" quality and 
 "secure") and fix security problems and other such bugs, with 
 the help of some static analysis tools. I have to say that 
 even given my general skepticism about the quality of 
 so-called "enterprise" code, I was rather shaken not only to 
 find lots of confirmation of my gut feeling that there are 
 major issues in our codebase, but even more by just HOW MANY 
 of them there are.
In a way, it's amazing how successful folks can be with software that's quite buggy. A _lot_ of software works just "well enough" that it gets the job done but is actually pretty terrible. And I've had coworkers argue to me before that writing correct software really doesn't matter - it just has to work well enough to get the job done. And sadly, to a great extent, that's true. However, writing software that's works just "well enough" does come at a cost, and if security is a real concern (as it increasingly is), then that sort of attitude is not going to cut it. But since the cost often comes later, I don't think that it's at all clear that we're going to really see a shift towards languages that prevent such bugs. Up front costs tend to have a powerful impact on decision making - especially when the cost that could come later is theoretical rather than guaranteed. Now, given that D is also a very _productive_ language to write in, it stands to reduce up front costs as well, and that combined with its ability to reduce the theoretical security costs, we could have a real win, but with how entrenched C and C++ are and how much many companies are geared towards not caring about security or software quality so long as the software seems to get the job done, I think that it's going to be a _major_ uphill battle for a language like D to really gain mainstream use on anywhere near the level that languages like C and C++ have. But for those who are willing to use a language that makes it harder to write code with memory safety issues, there's a competitive advantage to be gained. - Jonathan M Davis
D wasn't ready for mainstream adoption until quite recently, I think. The documentation for Phobos when I started looking at D in 2014 was perfectly clear if you were more theoretically minded, but not for other people. In a previous incarnation I tried to get one trader who writes Python to look at D, and he was terrified of it because of the docs. And I used to regularly have compiler crashes, and ldc was always too far behind dmd.

If you wanted to find commercial users, there didn't seem to be so many, and it was hard to point to successful projects in D that people would have heard of or could recognise - at least not enough of them. Perception has threshold effects and isn't linear. There wasn't that much on the numerical front either. The D Foundation didn't exist, and Andrei played superhero in his spare time.

All that's changed now in every respect. I can point to the documentation and say we should have docs like that, with runnable tests/examples. Most code builds fine with ldc, there are plenty of numerical libraries - thanks, Ilya - and perception is quite different about commercial successes. Remember that what's really just incremental in reality can be a step change in perception.

I don't think the costs of adopting D are tiny upfront. Putting aside the fact that people expect better IDE support than we have, and that we have quite frequent releases (not a bad thing, but it's where we are in maturity) of which some are a bit unfinished and others break things for good reasons, build systems are not that great even for middling projects (200k sloc). Dub is an amazing accomplishment for Sonke as one of many part-time projects, but it's not yet so mature as a build tool.

We have extern(C++), which is great, and no other language has it. But that's not the same thing as saying it's trivial to use a C++ library from D (and I don't think it's yet mature bugwise). No STL yet. Even for C, compare the steps involved vs LuaJIT FFI.
Dstep is a great tool but not without some friction and it only works for C. So one should expect to pay a price with all of this, and I think most of the price is upfront (also because you might want to wrap the libraries you use most often). And the price is paid by having to deal with things people often take for granted, so even if it's small in the scheme of things, it's more noticeable. A community needs energy coming into it to grow, but if there's too quick an influx of newcomers that wouldn't be good either. Eg if dconf were twice the size it would be a very different experience, not only in a positive way. I think new things often grow not by taking the dominant player head on, but by growing in the interstices. By taking hold in obscure niches nobody cares about you gain power to take on bigger niches and over time turns out some of those niches weren't so unimportant after all. It's a positive for the health of D that it's dismissed and yet keeps growing; just imagine if Stroustrup had had a revelation, written a memo "the static if tidal wave" (BG 1995), persuaded the committee to deprecate all the features and mistakes that hold C++ back and stolen all D's best features in a single language release. A challenger language doesn't want all people to take it seriously because it doesn't have the strength to win a direct contest. It just needs more people to take it seriously. Thr best measure of the health of the language and its community might be are more people using the language to get real work done and is it more or less helping them do so; and what is the quality of new people becoming involved. If those things are positive then if external conditions are favourable then I think it bodes well for the future. And by external conditions I mean that people have gotten used to squandering performance and users' time - see Jonathan Blow on Photoshop for example. 
If you have an abundance of a resource and keep squandering it, eventually you will run out of abundance. Storage prices are collapsing, data sets are growing, Moore's Law isn't what it was, and even with dirt cheap commodity hardware it's not necessarily the case that one is I/O bound any more. An NVMe drive does 2.5 GB/sec, and we are happy when we can parse JSON at 200 MB/sec. People who misquote Knuth seem to write slow code, and life is too short to be waiting unnecessarily. At some point people get fed up with slow code.

Maybe it's wrong to think about there being one true inheritor of the mantle of C and C++. Maybe no new language will gain the market share that C has, and if so that's probably a good thing. Mozilla probably never had any moments when they woke up and thought hmm, maybe we should have used Go instead, and I doubt people writing network services think maybe Rust would have been better.

I said to Andrei at dconf that principals rather than agents are much more likely to be receptive towards the adoption of D. If you take an unconventional decision and it doesn't work out, you look doubly stupid - it didn't work out, and on top of that nobody else made that mistake: what were you thinking? So by far the best strategy - unless you're in a world of pain, and desperate for a way out - is to copy what everyone else is doing. But if you're a principal - i.e. in some way an owner of a business - you haven't got the luxury of fooling yourself, not if you want to survive and flourish. The buck stops here, so it's a risk to use D, but it's also a risk not to use D - you can't pretend the conventional wisdom is without risk when it may not suit the problem that's before you. And it's your problem today and it's still your problem tomorrow, and that leads to a different orientation towards the future than being a cog in a vast machine where the top guy is measured by whether he beats earnings next quarter.

The web guys do have a lot of engineers, but they have an inordinate influence on the culture. Lots more code gets written in enterprises, and you never hear about it because it's proprietary and people aren't allowed to or don't have time to discuss it. And maybe it's simply not even interesting to talk about, which doesn't mean it's not interesting to you, and economically important. D covers an enormous surface area - a much larger potential domain set than Go or Rust. Things are more spread out, hence the amusing phenomenon on Reddit and the like of people thinking that because they personally don't know anyone that uses D, nothing is happening and adoption isn't growing. So assessing things by adoption within the niches where people are chatty is interesting but doesn't tell you much. I don't think most users post on the forum much; it's a subset of people that like posting on the forum, for intrinsic or instrumental reasons, that do.

So if I am right about the surface area and the importance of principals, then you should over time see people popping up from areas you had never thought of who have the power to make decisions and trust their own judgement because they have to. That's how you know the language is healthy - that they start using D and enough of them have success with it. Liran at Weka had never heard of D not long before he based his company on it. I had never imagined a ship design company might use Extended Pascal, let alone that D might be a clearly sensible option for automated code conversion and be a great fit for new code.

And I am sure Walter is right about the importance of memory safety. But outside of certain areas D isn't in a battle with Rust; memory safety is one more appealing modern feature of D. To say it's important to get it right isn't to say it has to defeat Rust. Not that you implied this, but some people at dconf seemed to implicitly think that way.

Laeeth
May 11
next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Friday, May 12, 2017 04:08:52 Laeeth Isharc via Digitalmars-d wrote:
 And I am sure Walter is right about the importance of memory
 safety.  But outside of certain areas D isn't in a battle with
 Rust; memory safety is one more appealing modern feature of D.
 To say it's important to get it right isn't to say it has to
 defeat Rust. Not that you implied this, but some people at dconf
 seemed to implicitly think that way.
I think that we're far past the point that any language is going to beat everyone else out. Some languages will have higher usage than others, and it's going to vary quite a lot between different domains. Really, it's more of a question of whether a language can compete well enough to be relevant and be used by a lot of developers, not whether it's used by most developers.

For instance, D and Go are clearly languages that appeal to a vastly different set of developers, and while they do compete on some level, I think that they're ultimately just going to be used by very different sets of people, because they're just too different (e.g. compare Go's complete lack of generics with D's templates). Rust, on the other hand, seems to have a greater overlap with D, so there's likely to be greater competition there (certainly more competition with regards to replacing C++ in places where C++ is replaced), but they're still going to appeal to different sets of developers to an extent, just like C++ and D have a lot of overlap but don't appeal to the same set of developers.

I fully expect that both Rust and D have bright futures, but I also don't really expect either to become dominant. That's just too hard for a language to do, especially since older languages don't really seem to go away. The programming language ecosystem just becomes more diverse. At most, a language is dominant in a particular domain, not the software industry as a whole.

I would love for D to become a serious player in the programming language space such that you see D jobs out there like we currently see C/C++ or Java jobs (right now, as I understand it, even Sociomantic Labs advertises for C++ programmers, not D programmers). But ultimately, what I care about is being able to use D when I program and have enough of an ecosystem around it that there are useful libraries and frameworks that I can use and build upon, because D is the language that I prefer and want to program in.

Having D destroy C/C++ or Java or C# or Rust or whatever really isn't necessary for that. It just needs to become big enough that it has a real presence, whereas right now, it seems more like the folks who use it professionally are doing so in stealth mode (even if they're not doing so purposefully). Anyone who wants to get a job somewhere and work in D is usually going to have a hard time of it right now, even though such jobs do exist. As it stands, I think a relatively small percentage of D's contributors are able to use D for their day jobs. And if we can really change _that_, then we'll have gotten somewhere big, regardless of what happens with other languages.

- Jonathan M Davis
May 12
prev sibling parent Ola Fosheim Grøstad writes:
On Friday, 12 May 2017 at 04:08:52 UTC, Laeeth Isharc wrote:
 build tool.  We have extern(C++) which is great, and no other 
 language has it.
Objective-C++/Swift.
 Maybe it's wrong to think about there being one true inheritor 
 of the mantle of C and C++.  Maybe no new language will gain 
 the market share that C has, and if so that's probably a good 
 thing.  Mozilla probably never had any moments when they woke 
 up and thought hmm maybe we should have used Go instead, and I 
 doubt people writing network services think maybe Rust would 
 have been better.
Yes, I think this is right, although C++ is taking over more and more of C's space. But there are still niches where C++ has a hard time going and C still dominates. The problem is, of course, that fewer and fewer software projects benefit from what C offers...
 But if you're a principal - ie in some way an owner of a 
 business - you haven't got the luxury of fooling yourself, not 
 if you want to survive and flourish.  The buck stops here, so 
 it's a risk to use D, but it's also a risk not to use D - you 
 can't pretend the conventional wisdom is without risk when it 
 may not suit the problem that's before you. And it's your 
 problem today and it's still your problem tomorrow, and that 
 leads to a different orientation towards the future than being 
 a cog in a vast machine where the top guy is measured by 
 whether he beats earnings next quarter.
I don't really think all that many principals make such decisions without pressure from the engineers in the organization, unless it is for going with some big league name... In general, many leaders have been burned by using tooling from companies that have folded or by not being able to fix issues, which is a very good reason for going with the safe and well known. Most projects have enough uncertainty factors already, so adding an extra uncertainty factor in the tooling is usually not the right choice.
 The web guys do have a lot of engineers but they have an 
 inordinate influence on the culture.  Lots more code gets
Right, the web guys adopt bleeding edge tech like crazy, because the risk is low. The projects are small and they can start over with a new tech on the next project in a few months. They don't have to plan for sticking with the same tooling for years and years.
 And I am sure Walter is right about the importance of memory 
 safety.  But outside of certain areas D isn't in a battle with 
 Rust; memory safety is one more appealing modern feature of D.  
 To say it's important to get it right isn't to say it has to 
 defeat Rust. Not that you implied this, but some people at 
 dconf seemed to implicitly think that way.
Well, memory safety isn't a modern feature at all actually. Most languages provide it, C is a notable exception...
May 12
prev sibling next sibling parent reply John Carter <john.carter taitradio.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer. I think you will find everything that really matters and is internet facing has been run up under a tool like that. They are truly wonderful, powerful tools... with the limitation that they are run time. I.e. if you don't run that line of code... they won't tell you if you have it wrong.

Index out of bounds exceptions are great... but the elements of Walter's talk where bugs are banished at compile time are more compelling. Now if we can get to the point where there is no undefined behaviour in any safe code... that would be a major step forward.

Languages like Ruby are memory safe... but they are written in C and hence have a very long catalog of bugs found and fixed in the interpreter and supporting libraries. D has the interesting promise of being memory safe, with the compiler and libraries themselves written in D.
May 08
next sibling parent reply John Carter <john.carter taitradio.com> writes:
On Monday, 8 May 2017 at 20:55:02 UTC, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
Google makes my point for me.... https://opensource.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html
 Index out of bounds exceptions are great... but the elements of 
 Walter's talk where bugs are banished at compile time are more 
 compelling.

 Now if we can get to the point where there is no undefined 
 behaviour in any safe code... that would be a major step 
 forward.
May 08
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/08/2017 08:42 PM, John Carter wrote:
 On Monday, 8 May 2017 at 20:55:02 UTC, John Carter wrote:
 C/C++ has been granted an extension of life by the likes of valgrind
 and purify and *-sanitizer.
Google makes my point for me.... https://opensource.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html
That reminds me, I've been thinking for awhile: We need a dead-simple D/dub-ified tool to fuzz-test our D projects. Even if it's just a trivial wrapper and DUB package for an existing fuzz tester (heck, probably that's the right way to go anyway) we really should make fuzz testing just as common & easy a thing for D projects as doc-generation and unittests.
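For what it's worth, the wrapper could likely target the conventional libFuzzer entry point; a minimal sketch in C, where `parse_packet` is a made-up stand-in for the code under test:

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up function under test: a parser that rejects empty or
 * oversized input. The fuzzer's job is to find inputs that make
 * code like this misbehave. */
int parse_packet(const uint8_t *data, size_t size) {
    if (size == 0 || size > 1024)
        return -1;
    /* ... real parsing logic would go here ... */
    return 0;
}

/* Standard libFuzzer entry point: the fuzzer calls this in a loop
 * with mutated inputs, and a sanitizer flags any memory error it
 * triggers. Build: clang -g -fsanitize=address,fuzzer harness.c */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_packet(data, size);
    return 0;   /* non-zero return values are reserved */
}
```

A D/dub tool would presumably generate and link such an entry point automatically, much the way dub already wires up unittest builds.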
May 08
prev sibling next sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/08/2017 04:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer. I think you will find everything that really matters and is internet facing has been run up under a tool like that.
Like Cloudflare and OpenSSL?
May 08
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools:

1. it isn't part of the language

2. it may not be available on your platform

3. somebody has to find it, install it, and integrate it into the dev/test process

4. it's incredibly slow to run valgrind, so there are powerful tendencies to skip it

valgrind is a stopgap measure, and has saved me much grief over the years, but it is NOT the solution.
May 09
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
 I agree. But one inevitably runs into problems relying on 
 valgrind and other third party tools:

 1. it isn't part of the language

 2. it may not be available on your platform

 3. somebody has to find it, install it, and integrate it into 
 the dev/test process

 4. it's incredibly slow to run valgrind, so there are powerful 
 tendencies to skip it

 valgrind is a stopgap measure, and has saved me much grief over 
 the years, but it is NOT the solution.
And it doesn't catch everything. I had the case yesterday at work where one of the file converters that had been written 15 years ago suddenly crashed in production*. It came from an upstream bug in a script that filled one attribute with garbage. I tried to reproduce the bug in the development environment and, funnily, it didn't crash with the newest version of the base library (the production library is one version behind). The garbage in the attribute triggered a buffer overflow in a fixed-size array (496 UTF-16 characters in a buffer of 200 characters). This converter is one of the last ones with fixed-size arrays.

The interesting observation was that valgrind catches the buffer overflow when linked with version 2.31 of the main library but is completely silent when using version 2.32. The changes in that library are minimal and in parts that have nothing to do with this app; it is solely the placement of variables in the data and bss segments that changes. It is surprising to see such a big buffer overflow completely missed by valgrind.

TL;DR: valgrind does not always catch buffer overflows, especially if the memory overwritten is not in the heap but in the data or bss segment. There it cannot add guard pages as it does on the heap.

* To give a little context: I work at the European Commission on the central translation memory system called Euramis (probably the biggest in the world, with more than a billion segments). The system is used intensively by all translators of all European institutions, and without it nothing would be possible. The issue with it is that the back end is written in C and the code goes back to 1990. My colleagues and I managed to modernize the system and catch most of the code issues with intensive use of C99 idioms, the newest gcc and clang diagnostics, and also valgrind and such things.
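A sketch of the failure mode described above (the buffer size is from the post; everything else is invented): the bug amounts to an unchecked copy into a fixed-size global. Valgrind guards heap blocks with redzones but cannot interpose on plain stores to the data/bss segment, so an overflow there silently tramples whichever global the linker placed next; compiler-inserted instrumentation (e.g. ASan's global redzones) is what catches it.

```c
#include <stdio.h>
#include <string.h>

/* Fixed-size global buffer, as in the old converter. A 496-char
 * attribute copied in without a length check would write far past
 * the end, into neighbouring globals, with no heap redzone for
 * valgrind to trip on. The length check below is the fix. */
char attr_buf[200];

void store_attr(const char *attr, size_t len) {
    if (len >= sizeof(attr_buf)) {
        fprintf(stderr, "attribute too long (%zu chars), truncating\n", len);
        len = sizeof(attr_buf) - 1;
    }
    memcpy(attr_buf, attr, len);
    attr_buf[len] = '\0';
}
```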
May 09
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 07:13:31AM -0700, Walter Bright via Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 
 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools: 1. it isn't part of the language
And it doesn't catch everything.
 2. it may not be available on your platform
And may not be compatible with your target architecture -- a lot of C code, especially in the embedded realm, have curious target archs that could be problematic for 3rd party tools that need to inject runtime instrumentation.
 3. somebody has to find it, install it, and integrate it into the
 dev/test process
This is a big one. Many large C projects have their own idiomatic way of building, which is often incompatible with 3rd party tools. This is a major demotivator for people to want to use those tools, because it's a big time investment to configure the tool to work with the build scripts, and an error-prone and painful process to rework the build scripts to work with the tool. "Why break our delicate tower-of-cards build system that's worked just fine for 20 years, just to run this new-fangled whatever 3rd party thingy promulgated by these young upstarts these days?"
 4. it's incredibly slow to run valgrind, so there are powerful
 tendencies to skip it
Yes, it's an extra step that the developer has to manually run, when he is already under an unreasonable deadline to meet an unreasonable customer request upon which hinges a $1M deal so you can't turn it down no matter how unreasonable it is. He barely has enough time to write code that won't crash outright, nevermind writing *proper* code. Yet another extra step to run manually? Forget it, not gonna happen. Not until a major crash happens on the customer's site that finally convinces the PTB to dictate the use of valgrind as a part of regular work routine. Other than that, the chances of someone actually bothering to do it are slim indeed.
 valgrind is a stopgap measure, and has saved me much grief over the
 years, but it is NOT the solution.
Yes, it's a patch over the current festering wound so that, at least for the time being, the blood is out of sight. But you can't wear that patch forever. Sooner or later the gangrene will be visible on the surface. :-D

T

-- 
Change is inevitable, except from a vending machine.
May 09
prev sibling next sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tuesday, May 09, 2017 07:13:31 Walter Bright via Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools:
 2. it may not be available on your platform
The fact that it's not available on Windows is extremely annoying. Some tools do exist on Windows, but you have to pay for them, and in my experience, they don't work very well. And with my current job, they _definitely_ don't work, because we mix C++ and C# (via COM). Nothing seems to be able to handle that mixture properly, and it's _really_ hard to track down memory problems.
 4. it's incredibly slow to run valgrind, so there are powerful tendencies
 to skip it
There are cases where you literally _can't_ run it, because it's simply too slow. For instance, when dealing with live video from a camera, the odds are very high that under valgrind the program won't be able to keep up. And if you're doing something like streaming 16 cameras at once (which happens in the security industry all the time), there's no way that it's going to work.

Valgrind is a fantastic tool, but saying that valgrind is enough is like saying that dynamic type checking is as good as compile-time type checking. It isn't, and it can't be. So, yes, valgrind can be a lifesaver, but preventing the bugs that it would find from even being possible is _far_ more valuable.

That being said, with the push for @nogc and the allocators and whatnot, we're once again stuck needing to valgrind D code to catch bugs. It's still not as bad as C/C++, because the problems are much more restricted in scope, but avoiding the GC comes at a real cost. Atila commented at dconf that working with allocators in D code for the Excel wrapper library he had worked on was like being stuck in C++ again, with all of the memory problems that he had there. @safe and the GC have _huge_ value.

- Jonathan M Davis
May 10
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 11:50:32 UTC, Jonathan M Davis wrote:
 On Tuesday, May 09, 2017 07:13:31 Walter Bright via 
 Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 Atila commented at dconf that working with allocators in D code 
 for the excel wrapper library he had worked on was like he was 
 stuck in C++ again with all of the memory problems that he had. 
  safe and the GC have _huge_ value.

 - Jonathan M Davis
Actually, it was worse than being back in C++ land: there I can use valgrind and address sanitizer. With D's allocators I was lost. I'd forgotten how much "fun" it was to print pointer values to the terminal to track down memory bugs. It's especially fun when you're on Windows, your code is in a DLL loaded by a program you don't control and DebugViewer is your only friend. Atila
May 10
prev sibling next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 2. it may not be available on your platform
I just had to use valgrind for the first time in years at work (mostly Python code there) and I realized that there's no version that works on the latest OS X version. So valgrind runs on about 2.5% of computers in existence. Fun!
May 11
parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 11 May 2017 at 21:20:35 UTC, Jack Stouffer wrote:
 On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 2. it may not be available on your platform
I just had to use valgrind for the first time in years at work (mostly Python code there) and I realized that there's no version that works on the latest OS X version. So valgrind runs on about 2.5% of computers in existence. Fun!
Use ASAN.
May 11
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2017-05-09 16:13, Walter Bright wrote:

 I agree. But one inevitably runs into problems relying on valgrind and
 other third party tools:

 1. it isn't part of the language

 2. it may not be available on your platform

 3. somebody has to find it, install it, and integrate it into the
 dev/test process

 4. it's incredibly slow to run valgrind, so there are powerful
 tendencies to skip it

 valgrind is a stopgap measure, and has saved me much grief over the
 years, but it is NOT the solution.
AddressSanitizer [1] is a tool similar to Valgrind which is built into the Clang compiler; just add an additional flag. It instruments the binary with the help of the compiler, so the execution speed will not be that much slower compared to a regular build. Clang also contains ThreadSanitizer [2], which is supposed to detect data races.

[1] https://clang.llvm.org/docs/AddressSanitizer.html
[2] https://clang.llvm.org/docs/ThreadSanitizer.html

-- 
/Jacob Carlborg
May 12
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: Anything that goes on the internet.
https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.
May 11
next sibling parent Joakim <dlang joakim.fea.st> writes:
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: Anything that goes on the internet.
https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.
To be fair, if you're not on the internet, you're unlikely to get any files that will trigger that bug in Microsoft's malware checker, as they noted that they first saw it on a website on the internet. Of course, you could still get such files on a USB stick, which just highlights that unless you completely shut your computer off from the world, you can get bit, just more slowly and with fewer consequences than on the internet.

I wondered what that Project Zero topic had to do with Chromium; turns out it's a security team that Google started three years ago to find zero-day holes in almost any software. The guy from that team also found the recently famous Cloudbleed bug that affected Cloudflare. They have a blog up that details holes they found in all kinds of stuff, security porn if you will: ;)

https://googleprojectzero.blogspot.com
May 11
prev sibling parent Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
 https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a
vulnerability in an application that doesn't go on the internet.
This link got me thinking: When will we see the first class action lawsuit for criminal negligence for not catching a buffer overflow (or other commonly known bug) which causes identity theft or loss of data? Putting aside the moral questions, the people suing would have a good case, given the wide knowledge of these bugs and the availability of tools to catch/fix them. I think they could prove negligence/incompetence and win given the right circumstances. Would be an interesting question to pose to any managers who don't want to spend time on security.
May 11
prev sibling next sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
Hi, I think that comparing languages like D to C is not appropriate. C is a high level assembler and has different design goals. A useful document to refer to is: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1250.pdf

In particular (although note the addition of facet f, which echoes the sentiment that security is important):

Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. For the Cx1 revision there is consensus to add a new facet f to the original list of facets. The new spirit of C can be summarized in phrases like:

(a) Trust the programmer.
(b) Don't prevent the programmer from doing what needs to be done.
(c) Keep the language small and simple.
(d) Provide only one way to do an operation.
(e) Make it fast, even if it is not guaranteed to be portable.
(f) Make support for safety and security demonstrable.

Proverb e needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.

I think Linus Torvalds makes an important observation in one of his talks: the reason he likes C is that when he writes C code he can visualize what the machine code will look like. My feeling is that C has traditionally been used in contexts where it probably should not be used - i.e. as a general purpose application development language. But I don't see how languages like D or Rust can replace C for certain types of use cases.

Regards
Dibyendu
May 13
next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 (a) Trust the programmer.
That's the first and most deadly mistake. Buffer overflows and null pointers alone have caused hundreds of millions of dollars of damages. I think we can say that this trust is misplaced.
 (b) Don't prevent the programmer from doing what needs to be 
 done.
In reality this manifests as "Don't prevent the programmer from doing anything, especially if they're about to shoot themself". See the code examples throughout this thread.
 (c) Keep the language small and simple.
 (d) Provide only one way to do an operation.
lol
 (f) Make support for safety and security demonstrable.
LOL http://article.gmane.org/gmane.comp.compilers.llvm.devel/87749
 My conclusion is that C, and derivatives like C++, is a very 
 dangerous language to write safety/correctness critical 
 software in, and my personal opinion is that it is almost 
 impossible to write *security* critical software in it.
(that's from the creator of Clang, btw)
 But I don't see how languages like D or Rust can replace C for 
 certain types of use cases.
Maybe you can argue for the use of C in embedded systems and in OSes, although I see no reason why Rust can't eventually overtake C there. However, much of the internet's security critical systems (openssl, openssh, DNS systems, router firmware) are in C, and if Google's Project Zero is any indication, they all have ticking time bombs in them.

As I stated earlier in the thread, at some point some company is going to get sued for criminal negligence for shipping software with a buffer overflow bug that caused a security breach. It almost happened with Toyota. The auto industry has a C coding convention for safety called MISRA C, and it was brought up in court as to why Toyota's acceleration problems were entirely their fault. You can bet this will be brought up again.
May 13
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 May 2017 at 01:30:47 UTC, Jack Stouffer wrote:
 It almost happened with Toyota. The auto industry has a C 
 coding convention for safety called MISRA C, and it was brought 
 up in court as to why Toyota's acceleration problems were 
 entirely their fault. You can bet this will be brought up again.
1. Changing language won't change this; for that you need something that is formally proven (and even that assumes that the requirements spec is correct). I found this book from 2012 on industry use of formal methods which seems to be available on Google Books: https://books.google.no/books?id=E5sdDs00MuwC

2. What good does it do you to have your source code proven formally correct if your compiler can contain bugs? To get around that you need a formally verified compiler: http://compcert.inria.fr/

So... We are back to C again.
May 14
prev sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 01:30:47 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 (a) Trust the programmer.
That's the first and most deadly mistake. Buffer overflows and null pointers alone have caused hundreds of millions of dollars of damages. I think we can say that this trust is misplaced.
I should have added that the C11 charter also says:

<quote>
12. Trust the programmer, as a goal, is outdated in respect to the security and safety programming communities. While it should not be totally disregarded as a facet of the spirit of C, the C11 version of the C Standard should take into account that programmers need the ability to check their work.
<endquote>

In real terms, though, tools like ASAN and Valgrind, if used from the start, usually allow you to catch most of the issues. Most likely even better tools for C will come about in time.
 But I don't see how languages like D or Rust can replace C for 
 certain types of use cases.
Maybe you can argue for the use of C in embedded systems and in OS's, although I see no reason why Rust can't eventually overtake C there.
I think Rust is a promising language but I don't know enough about it to comment. My impression of Rust is that:

a) Rust has a steep learning curve as a language.

b) If you want to do things that C allows you to do, then Rust is no safer than C.

Regards
May 14
parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 In real terms though tools like ASAN and Valgrind if used from 
 the start usually allow you to catch most of the issues. Most 
 likely even better tools for C will come about in time.
See Walter's comment earlier in this thread and my reply.
 I think Rust is a promising language but I don't know enough 
 about it to comment. My impression about Rust is that:

 a) Rust has a steep learning curve as a language.
So does C, if you're doing C "correctly".
 b) If you want to do things that C allows you to do, then Rust
 is no safer than C.
That's the entire bloody point, isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
May 14
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 May 2017 at 21:01:40 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 b) If you want to do things that C allows you to do, then Rust
 is no safer than C.
That's the entire bloody point isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
Like building a graph? Sure, Rust is perfect if you can model your world like a tree, but that is usually not what you want if you are looking for performance. You could replace pointers with integer-ids, but that is just emulating pointers with a construct that may be harder to check for in an automated fashion. So that is not a good solution either.
May 14
prev sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 21:01:40 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 b) If you want to do things that C allows you to do, then Rust
 is no safer than C.
That's the entire bloody point isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
Hi, I think you are missing the point. I am talking here about things you need to do rather than writing code just for the heck of it.
May 15
prev sibling next sibling parent reply bachmeier <no spam.net> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
May 13
next sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
Hi - I think this point really is saying that the type system in C is for convenience only; ultimately, if you as a programmer want to manipulate memory in a certain way, then C assumes you know what you are doing and why. As I said, C is really a high-level assembler. Regards
May 14
parent bachmeier <no spam.net> writes:
On Sunday, 14 May 2017 at 09:56:18 UTC, Dibyendu Majumdar wrote:
 On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar 
 wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
Hi - I think this point really is saying that the type system in C is for convenience only - ultimately if you as a programmer want to manipulate memory in a certain way then C assumes you know what you are doing and why. As I said C is really a high level assembler. Regards
I guess my point is that C only trusts programmers in one direction. You can go as low-level as you want, but it doesn't trust you to use more productive features when that is better (but it certainly gives you the tools to roll your own buggy, hard-to-share version of those features). D, C++, and Rust really do trust the programmer.
May 14
prev sibling parent qznc <qznc web.de> writes:
On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
The C99 Rationale also says: "The Committee is content to let C++ be the big and ambitious language. While some features of C++ may well be embraced, it is not the Committee’s intention that C become C++."

I read that as: C is mostly in preservation and fossilization mode. If you want new features, look elsewhere. We will not rock the boat.

That is probably a good thing. C has its niche and it is comfortable there. If you want to beat C, it will not fight back. The only problem is to convince the C programmers to move.
May 14
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
Hi, I think that comparing languages like D to C is not appropriate. C is a high level assembler and has different design goals. A useful document to refer to is: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1250.pdf

In particular (although note the addition of facet f, which echoes the sentiment that security is important):

Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. For the Cx1 revision there is consensus to add a new facet f to the original list of facets. The new spirit of C can be summarized in phrases like:

(a) Trust the programmer.
(b) Don't prevent the programmer from doing what needs to be done.
(c) Keep the language small and simple.
(d) Provide only one way to do an operation.
(e) Make it fast, even if it is not guaranteed to be portable.
(f) Make support for safety and security demonstrable.

Proverb e needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.
If only the gcc and clang designers followed that rule. These <beeep> consider that undefined behaviour allows them to break the code in any way they fancy (the nasal demon thing), while pragmaticists interpret it as: do the thing that is simplest to implement on that hardware.

The most ridiculous example is the undefined behaviour of signed integer overflow. Signed integer overflow is undefined in C because some obscure platforms may not use two's complement for the representation of integers, so INT_MAX+1 does not necessarily result in INT_MIN. But completely removing the code when one encounters, for example, if(val+1 == INT_MIN) is simply nuts.
May 14
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 14.05.2017 11:42, Patrick Schluter wrote:
 ...
 (a) Trust the programmer.
 (b) Don't prevent the programmer from doing what needs to be done.
 (c) Keep the language small and simple.
 (d) Provide only one way to do an operation.
 (e) Make it fast, even if it is not guaranteed to be portable.
 (f) Make support for safety and security demonstrable.

 Proverb e needs a little explanation. The potential for efficient code
 generation is one of the most important strengths of C. To help ensure
 that no code explosion occurs for what appears to be a very simple
 operation, many operations are defined to be how the target machine's
 hardware does it rather than by a general abstract rule. An example of
 this willingness to live with what the machine does can be seen in the
 rules that govern the widening of char objects for use in expressions:
 whether the values of char objects widen to signed or unsigned
 quantities typically depends on which byte operation is more
 efficient on the target machine.
If only the gcc and clang designers followed that rule.
It's precisely what they do. You are blaming the wrong people.
 These <beeep>
 consider that undefined behaviour allows them to break the code in any way
 they fancy (the nasal demon thing), while pragmaticists interpret it as:
 do the thing that is simplest to implement on that hardware.
Those "pragmaticists" cannot be trusted, therefore they are not programmers. Why do they matter?
 The
 most ridiculous example being the undefined behaviour of signed integer
 overflow. Signed integer overflow is undefined in C because some obscure
 platforms may not use two's complement for the representation of integers.
 So INT_MAX+1 does not necessarily result in INT_MIN.
It's _undefined_, not implementation-defined or unspecified. Excerpt from the C standard:

 3.4.1
 1 implementation-defined behavior
   unspecified behavior where each implementation documents how the choice is made
 ...
 3.4.3
 1 undefined behavior
   behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
 ...
 3.4.4
 1 unspecified behavior
   use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance
What is it about "no requirements" that "pragmaticists" fail to understand? Not inventing artificial additional requirements is among the most pragmatic things to do.
 But completely removing the code when one encounters for example:
 if(val+1 == INT_MIN) is simply nuts.
Why? This is simple dead code elimination. The programmer clearly must have known that it is dead code and the compiler trusts the programmer. The programmer would _never_ break that trust and make a program evaluate INT_MAX+1 ! The corollary to 'trust the programmer' is 'blame the programmer'. Don't use C if you want to blame the compiler.
May 14
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 May 2017 at 12:07:40 UTC, Timon Gehr wrote:
 On 14.05.2017 11:42, Patrick Schluter wrote:
 But completely removing the code when one encounters for 
 example:
 if(val+1 == INT_MIN) is simply nuts.
Why? This is simple dead code elimination. The programmer clearly must have known that it is dead code and the compiler trusts the programmer. The programmer would _never_ break that trust and make a program evaluate INT_MAX+1 !
Well, actually, it makes sense to issue a warning in C. But in C++ it makes less sense, since meta-programming can easily generate such code without breaking the semantics of the program.
 The corollary to 'trust the programmer' is 'blame the 
 programmer'. Don't use C if you want to blame the compiler.
Oh well, there are lots of different checkers for C, so I guess it would be more like "don't blame the compiler, blame the verifier".
May 14
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
What does that snippet do? What should it do?

int caca(void)
{
   for(int i=0xFFFFFFFF; i!=0x80000000; i++)
     printf("coucou");
}
May 14
parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 May 2017 at 16:44:10 UTC, Patrick Schluter wrote:
 What does that snippet do ? What should it do?

 int caca(void)
 {
   for(int i=0xFFFFFFFF; i!=0x80000000; i++)
     printf("coucou");
 }
Implicit coercion is a design bug in both C and D... :-P
May 14
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Sunday, 14 May 2017 at 19:10:05 UTC, Ola Fosheim Grøstad wrote:
 On Sunday, 14 May 2017 at 16:44:10 UTC, Patrick Schluter wrote:
 What does that snippet do ? What should it do?

 int caca(void)
 {
   for(int i=0xFFFFFFFF; i!=0x80000000; i++)
     printf("coucou");
 }
Implicit coercion is a design bug in both C and D... :-P
Of course the annoying part is that C allows 2s-complement notation for integer literals, so with warnings on:

int i = 0xFFFFFFFF;    // passes without warning.
int i = 0xFFFFFFFFUL;  // warning is issued.
May 14
prev sibling parent Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Sunday, 14 May 2017 at 09:42:05 UTC, Patrick Schluter wrote:
 But completely removing the code when one encounters for 
 example: if(val+1 == INT_MIN) is simply nuts.
Removing such code is precisely what dmd does: https://issues.dlang.org/show_bug.cgi?id=16268
May 14
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/5/2017 11:26 PM, Joakim wrote:
 Walter: I believe memory safety will kill C.
I can't find any definitive explanation of what the Wannacry exploit is. One person told me it was an overflow bug, another that it was truncation from converting 32 to 16 bits. Anyhow, the Wannacry disaster looks to be a very expensive lesson in using memory unsafe languages for critical software. I know Microsoft has worked for years to use their own C which is memory-safer; apparently it is not enough. https://blogs.msdn.microsoft.com/martynl/2005/10/10/annotations-yet-more-help-finding-buffer-overflows/
May 16
next sibling parent Joakim <dlang joakim.fea.st> writes:
On Tuesday, 16 May 2017 at 15:19:54 UTC, Walter Bright wrote:
 On 5/5/2017 11:26 PM, Joakim wrote:
 Walter: I believe memory safety will kill C.
I can't find any definitive explanation of what the Wannacry exploit is. One person told me it was an overflow bug, another that it was truncation from converting 32 to 16 bits. Anyhow, the Wannacry disaster looks to be a very expensive lesson in using memory unsafe languages for critical software. I know Microsoft has worked for years to use their own C which is memory safer, apparently it is not enough. https://blogs.msdn.microsoft.com/martynl/2005/10/10/annotations-yet-more-help-finding-buffer-overflows/
I happened to be reading this blog post concerning the issue today: https://www.troyhunt.com/dont-tell-people-to-turn-off-windows-update-just-dont/

It links to this official MS page from a couple months ago, which lists several CVE entries, which explicitly say they're different exploits: https://technet.microsoft.com/en-us/library/security/ms17-010.aspx

Googling for that security update turns up this script, which claims a buffer overflow, but that could be just one of the holes: https://github.com/RiskSense-Ops/MS17-010/blob/master/exploits/eternalblue/ms17_010_eternalblue.rb

I don't believe MS has disclosed the exact exploits, so it would depend on someone reversing the updates, and since there are so many, they're likely different types.

For those like Scott who say C has survived this long, I say it isn't unprecedented for tech with fairly well-known design flaws to last much longer than it should, until a crisis springing from those flaws finally kills it off. People usually ignore the potential problems until it blows up in front of their face.

I agree that this current constant security crisis, now that everything's on the internet, will kill off a lot of old tech, including C. It is one of the reasons IoT is currently stillborn. It is the biggest flaw in Android, where you're selling a billion+ mobile devices a year, and almost none of them get any security updates: https://arstechnica.com/gadgets/2017/05/op-ed-google-should-take-full-control-of-androids-security-updates/

It will get a lot worse before it gets better, because it has been neglected for so long. :|
May 16
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/16/17 11:19 AM, Walter Bright wrote:
 On 5/5/2017 11:26 PM, Joakim wrote:
 Walter: I believe memory safety will kill C.
I can't find any definitive explanation of what the Wannacry exploit is. One person told me it was an overflow bug, another that it was truncation from converting 32 to 16 bits. Anyhow, the Wannacry disaster looks to be a very expensive lesson in using memory unsafe languages for critical software. I know Microsoft has worked for years to use their own C which is memory safer, apparently it is not enough. https://blogs.msdn.microsoft.com/martynl/2005/10/10/annotations-yet-more-help-finding-buffer-overflows/
Scott: "I am skeptical of the claim that memory safety is going to kill [C] off because it has been known that this is not a memory safe language for decades."

Dylan: "Do you think that maybe Walter and Andrei planted the memory safety topic just to try to kill C?"

Scott: "You know that would be like them..."

1 week later: WanaCry. Both Walter and WanaCry start with W. Hm....

-Steve
May 16
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/16/2017 10:29 AM, Steven Schveighoffer wrote:
 1 week later: WanaCry.  Both Walter and WanaCry start with W. Hm....
No need to breed mosquitos to promote a cure for malaria :-)
May 16