
digitalmars.D - Fantastic exchange from DConf

Joakim <dlang joakim.fea.st> writes:
Walter Bright: I firmly believe that memory safety is gonna be an 
absolute requirement moving forward, very soon, for programming 
language selection.

Scott Meyers: For, for what kinds of applications?

Walter: Anything that goes on the internet.

Scott: Uh, let me just, sort of as background, given the 
remaining popularity of C, unbelievable popularity of C, which is 
far from a memory-safe language, do you think that that... I'm 
having trouble reconciling the ongoing popularity of C with the 
claim that you're making that this is going to be an absolute 
requirement for programming languages going forward.

Walter: I believe memory safety will kill C.

Scott: ... Wow.
https://www.youtube.com/watch?v=_gfwk-zRwmk#t=8h35m18s

The whole exchange starts with a question at the 8h:33m mark and 
goes on for about 13 minutes; worth listening to.

I agree with Walter that safety will be big going forward; it 
should have been big already.
May 05
qznc <qznc web.de> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter Bright: I firmly believe that memory safety is gonna be 
 an absolute requirement moving forward, very soon, for 
 programming language selection.

 Scott Meyers: For, for what kinds of applications?

 Walter: Anything that goes on the internet.

 Scott: Uh, let me just, sort of as background, given the 
 remaining popularity of C, unbelievable popularity of C, which 
 is far from a memory-safe language, do you think that that... 
 I'm having trouble reconciling the ongoing popularity of C with 
 the claim that you're making that this is going to be an 
 absolute requirement for programming languages going forward.

 Walter: I believe memory safety will kill C.

 Scott: ... Wow.
 https://www.youtube.com/watch?v=_gfwk-zRwmk#t=8h35m18s

 The whole exchange starts with a question at the 8h:33m mark 
 and goes on for about 13 mins, worth listening to.

 I agree with Walter that safety will be big going forward, 
 should have been big already.
Hm, Sociomantic removes the live captures the next day? One request: Chop the panel discussion into one clip per question/topic, please. Alternatively, provide some means to easily jump to the start of each question.
May 06
Joakim <dlang joakim.fea.st> writes:
On Saturday, 6 May 2017 at 09:53:52 UTC, qznc wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 [...]
Hm, Sociomantic removes the live captures the next day? One request: Chop the panel discussion into one clip per question/topic, please. Alternatively, provide some means to easily jump to the start of each question.
Video of the exchange is now back up: https://www.youtube.com/watch?v=Lo6Q2vB9AAg#t=24m37s

The question now starts at the 22m:19s mark.
May 12
Nemanja Boric <4burgos gmail.com> writes:
On Friday, 12 May 2017 at 18:52:43 UTC, Joakim wrote:
 On Saturday, 6 May 2017 at 09:53:52 UTC, qznc wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 [...]
Hm, Sociomantic removes the live captures the next day? One request: Chop the panel discussion into one clip per question/topic, please. Alternatively, provide some means to easily jump to the start of each question.
Video of the exchange is now back up: https://www.youtube.com/watch?v=Lo6Q2vB9AAg#t=24m37s Question now starts at 22m:19s mark.
Oh no, my accent is terrible! Time to stand in front of a mirror and rehearse :-). When I said "outside community pressure", I meant "trends", but didn't make that clear at the time :(.
May 12
thedeemon <dlang thedeemon.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
And then null safety will kill D. ;)
May 06
deadalnix <deadalnix gmail.com> writes:
On Saturday, 6 May 2017 at 17:59:38 UTC, thedeemon wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
And then null safety will kill D. ;)
I actually think this is more likely than memory safety killing C. Both are very important, but D is just easier to kill than C, for historical reasons.
May 09
Jerry <hurricane hereiam.com> writes:
Anything that goes on the internet already has memory safety. The 
things that need it aren't written in C; there are a lot of 
programs out there that just don't require it. C won't be killed: 
there's too much already written in it. Sure, there might be 
nothing new getting written in it, but there will still be tons 
of software that needs to be maintained. D also won't be that far 
behind it if that's truly the reason C gets 'killed'.

Anyway, I can't watch the discussion as it's private.
May 08
Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 8 May 2017 at 18:33:08 UTC, Jerry wrote:
 Anything that goes on the internet already has memory safety.
BS, a damn buffer overflow bug caused Cloudflare to spew its memory all over the internet just a couple of months ago. Discussed here: https://forum.dlang.org/post/bomiwvlcdhxfegvxxier forum.dlang.org

These things still happen all the time, especially when companies realize that transitioning from a Python/Ruby backend to a C++ one can save tens of thousands in server costs.
May 08
Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 8 May 2017 at 19:37:05 UTC, Jack Stouffer wrote:
 ...
Wrong link: https://forum.dlang.org/post/novsplitocprdvpookre forum.dlang.org
May 08
Moritz Maxeiner <moritz ucworks.org> writes:
On Monday, 8 May 2017 at 18:33:08 UTC, Jerry wrote:
 Anything that goes on the internet already has memory safety.
Bait [1]?
 The things that need it aren't written in C
Except - of course - for virtually our entire digital infrastructure.
 there's a lot of programs out there that just don't require it.
Just not anything that may run on a system connected to the internet.

[1] https://nvd.nist.gov/vuln/search/results?adv_search=false&form_type=basic&results_type=overview&search_type=all&query=remote+buffer+overflow
May 08
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via Digitalmars-d wrote:
 Anything that goes on the internet already has memory safety.
Is that a subtle joke, or are you being serious?

A LOT of code out on the internet, both in infrastructure and in applications, is C code. And if you know the typical level of quality of a large C project written by 50-100 (or more) employees who have a rather high turnover, you should be peeing your pants right now.

A frightening amount of C code, both in infrastructure (by that I mean stuff like routers, switches, firewalls, core services like DNS, etc.) and in applications (application-level services like webservers, file servers, database servers, etc.), is literally riddled with buffer overflows, null pointer dereference bugs, off-by-1 string manipulations, and other such savorable things.

Recently I've had the dubious privilege of being part of a department-wide push on the part of my employer to audit our codebases (mostly C, with a smattering of C++ and other code, all dealing with various levels of network services and running on hardware expected to be "enterprise" quality and "secure") and fix security problems and other such bugs, with the help of some static analysis tools. I have to say that even given my general skepticism about the quality of so-called "enterprise" code, I was rather shaken, not only to find lots of confirmation of my gut feeling that there are major issues in our codebase, but even more by just HOW MANY of them there are.

An unsettlingly large percentage of bugs / problematic code is in the realm of not handling null pointers correctly. The simplest is checking for null correctly at the beginning of the function, but then proceeding to dereference the possibly-null pointer with wild abandon thereafter. This may seem like not such a big problem, until you realize that all it takes is for *one* of these literally *hundreds* of instances of wrong code to get exposed to a public interface, and you have a DDOS attack waiting for you in your glorious future.

Another unsettlingly common problem is the off-by-1 error in string handling. Actually, the most unsettling thing in this area is the pervasiveness of strcpy() and strcat() -- even after decades of experience that these functions are inherently unsafe and should be avoided if at all possible. Yet they still appear with persistent frequency, introducing hidden vulnerabilities that people overlook because, oh well, we trust the guy who wrote it 'cos he's an expert C coder, so he must have already made sure it's actually OK. Unfortunately, upon closer inspection, there are actual bugs in a large percentage of such code.

Next to this is strncpy(), the touted "safe" variant of strcpy / strcat, except that people keep writing this:

	strncpy(buf, src, sizeof(buf));

Quick, without looking: what's wrong with the above line of code?

Not so obvious, huh? The problem is that strncpy is, in spite of being the "safe" version of strcpy, badly designed. It does not guarantee buf is null-terminated if src was too long to fit in buf! Next thing you know -- why, hello, unterminated string used to inject shellcode into your "secure" webserver!

The "obvious" fix, of course, is to leave 1 byte for the \0 terminator:

	strncpy(buf, src, sizeof(buf)-1);

Except that this is *still* wrong, because strncpy doesn't write a '\0' to the end. You have to manually put one there:

	strncpy(buf, src, sizeof(buf)-1);
	buf[sizeof(buf)-1] = '\0';

The second line there has a -1 that lazy/careless C coders often forget, so you end up *introducing* a buffer overrun in the name of fixing another.
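(For contrast, a minimal D sketch of the same copy -- the helper name is invented for illustration. Since D arrays carry their length, there is no terminator to forget, and an oversized source fails loudly instead of silently truncating:)

	void copyInto(char[] buf, string src) {
		// Bounds-checked slice assignment: if src doesn't fit,
		// this throws a RangeError instead of quietly leaving
		// a truncated, unterminated buffer behind.
		buf[0 .. src.length] = src[];
	}

	void main() {
		char[64] buf;
		copyInto(buf[], "hello");
	}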
This single problem area (improper use of strncpy) accounts for a larger chunk of the code I've audited than I dare to admit -- all just timebombs waiting for somebody to write an exploit for.

Then there's the annoyingly common matter of checking for return codes. Walter has said this before, and he's spot on: 90% of C code out there ignores error codes where it shouldn't, so as soon as a normally-working syscall fails for whatever reason, the code cascades down a chain of unexpected control flow changes and ends in catastrophe. Or rather, in silent corruption of internal data, because any sign that something had gone wrong was conveniently ignored by the caller, of course.

And even when you *do* meticulously check for every single darn error code evah, it's so ridiculously easy to make a blunder:

	int my_func(mytype_t *input, outbuf_t *output_buf,
	            char *buffer, int size)
	{
		/* Typical lazy way of null-checking (that will blow up
		 * later) */
		myhandle_t *h = input ? input->handle : 0;
		writer_t *w = output_buf ? output_buf->writer : 0;
		char *block = (char *)malloc(size);
		FILE *fp;
		int i;

		if (!buffer)
			return -1; /* typical useless error return code */
				/* (also, memory leak) */

		if (h->control_block) { /* oops, possible null deref */
			fp = fopen("blah", "w");
			if (!fp)
				return -1; /* oops, memory leak */
		}
		if (w->buffered) { /* oops, possible null deref */
			strncpy(buffer, input->data, size); /* oops, unterminated string */
			if (w->write(buffer, size) != 0)
				/* hmm, is 0 the error status, or is it -1? */
				/* also, what if w->write == null? */
			{
				return -1; /* oops, memory leak AND file
						descriptor leak */
			}
		}
		for (i = 0; i <= input->size; i++) {	/* oops, off-by-1 error */
			... /* more nauseating nasty stuff here */
			if (error)
				goto EXIT;
			... /* ad nauseum */
		}
	EXIT:
		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
		free(block);

		/* Typical lazy way of evading more tedious `if
		 * (func(...) == error) goto EXIT;` style code, which
		 * ends up being even more error-prone */
		return error ? -1 : w->num_bytes_written();
			/* oops, what if w or w->num_bytes_written is
			 * null? */
	}

If you look hard enough, almost every line of C code has one potential problem or another. OK, I exaggerate, but in a large codebase written by 100 people, many of whom have since left the company for greener fields, code of this sort can be found everywhere. And nobody realizes just how bad it is, because everyone is too busy fixing pointer bugs in their own code to have time to read code written by somebody else that doesn't directly concern them.

Another big cause of bugs is C/C++'s lovely convention of not initializing local variables. Hello, random stack value that just so happens to be usually 0 when we test the code, but becomes something else when the customer runs it, and BOOM, the code tumbles down an unexpected and disastrous chain of wrong steps. And you'd better be praying that this wasn't a pointer, 'cos you know what that means... If you're lucky, it's a random memory corruption that, for the most part, goes undetected until a customer with an obscure setup triggers a visible effect. Then you spend the next 6 months trying to trace the bug from the visible effect, which is long, long past the actual cause. If you're not so lucky, though, this could result in leakage of unrelated memory locations (think Cloudbleed) or worse, arbitrary code execution. Hello, random remote hacker, welcome to the DoD Nuclear Missile Control Panel. Who would you like to nuke today?
C++ has a lovely footnote to add to this: class/struct members aren't initialized by default, so the ctor has to do it, *explicitly*. How many programmers aren't too lazy to just skip this part and trust that the methods will initialize whatever members need initializing? Keep in mind that a lot of C++ code out there has yet to catch up with the latest "correct" C++ coding style -- there are still way too many god-object classes with far more fields than they should have, and of course, the guy who wrote the ctor was too lazy to initialize all of them explicitly. (You should be thankful already that he wasn't too lazy to skip writing the ctor altogether!) And inevitably, later on some method will read the uninitialized value and do something unexpected. This doesn't happen when you first write the class, of course. But 50 ex-employees later, 'tis a whole new landscape out there.

These are some of the simpler common flaws I've come across. I've also seen several very serious bugs that could lead to actual remote exploits, if somebody tried hard enough to find a path to them from the outside.

tl;dr: the C language simply isn't friendly towards memory-safe code. Most C coders (including myself, I'll admit, before this code audit) are unaware of just how bad it is, because over the years we've accumulated a set of idioms for writing safe code (and the scars to prove that, at least in *some* cases, they result in safer code). Too bad our accumulated wisdom isn't enough to prevent *all* of the blunders that we still regularly commit, except now we're even less aware of them, because, after all, we are experienced C coders now, so surely we've outgrown such elementary mistakes! Due to past experience we've honed our eagle eyes to catch mistakes we've learned from... unfortunately, that also distracts us from *other* mistakes the language also happily lets slip by. It gives us a false sense of security that we could easily detect blunders just by eyeing the code carefully. I was under that false sense... until I participated in the audit, and discovered to my chagrin that there are a LOT of other bugs that I often failed to catch because I simply didn't have them in mind, or they were more subtle to detect, or just by pure habit I was looking in other directions and therefore missed an otherwise obvious problem spot.

Walter is probably right that one of C's biggest blunders was to conflate arrays and pointers. I'd say 85-90% of the bugs I found were directly or indirectly caused by C arrays not carrying length information along with the reference to the data, of which the misuse of strncpy and off-by-1 errors in loops are prime examples. Another big part of C's unsafety is the legacy C library that contains far too many misdesigned safety-bombs like strcpy and strcat, which are there merely for legacy reasons but really ought to have been killed with fire two decades ago. You'd think people would know better by now, but no, they STILL keep writing code that calls these badly designed functions...

In this day and age of automated exploit-hunting bots, it's only a matter of time before somebody, or some*thing*, discovers that sending a certain kind of packet to a certain port on a certain firewall produces an unusual response... and pretty soon, somebody is worming his way into your supposedly secure local network and doing who knows what.
And it's scary how much poorly-written C code is running on the targeted machines, and how much more poor C code is being written every day, even today, for stuff that's going to be running on the backbone of the internet or in some wide-impact online application (hello, Heartbleed!). Something's gonna give eventually.

As I was participating in the code audit, I couldn't help thinking how many of D's features would have outright prevented a large percentage of the bugs I found before the code found its way into production.

1) D arrays have length! Imagine that! This singlehandedly eliminates an entire class of bugs with one blow. No more strcpy/strncpy monstrosities. No more buffer overruns -- thanks to array bounds checks. Not to mention slices eliminating the need for the ubiquitous string copying in C/C++ that represents a not-often-thought-of background source of performance degradation.

2) D variables are initialized by default: this would have prevented all of the uninitialized-value bugs I found. And where performance matters (hint: it usually doesn't matter where you think it does -- please fess up: which C coder here regularly uses a profiler? Probably only a minority.), you ask for =void explicitly. So when there's a bug involving uninitialized values later, the offending line is easily found.

3) Exceptions, love them or hate them, dispense with that horrible C idiom of repeatedly writing `if (func(...)!=OK) goto EXIT;` that's so horribly error-prone, and so non-DRY that the programmer is basically motivated to find an excuse NOT to write it that way (and thereby inevitably introduce a bug). Here I have to add that there's this perverse thought out there that C++ exceptions are "expensive", and so in the name of performance people would outlaw using try/catch blocks, using homegrown (and inevitably buggy -- and not necessarily less expensive) alternatives instead. All the while ignoring the fact that, C/C++ arrays being what they are, too much array copying (esp. string copying) is happening where in D you'd just be taking a slice in O(1) time.

4) D switches would eliminate certain very nasty bugs that I've discovered involving some code assuming that a variable can only hold certain values, when it actually doesn't, in which case nasty problems happen. In D, a non-final switch requires a default case... seemingly onerous, but it prevents this class of bugs. Also, the deprecation of implicit switch-case fallthrough is a step in the right direction, along with goto case for when you *want* fallthrough. Inadvertent fallthrough was one of the bugs I found that happens every so often -- but at the same time there were a number of false positives where fallthrough was intentional. D solves both problems by having the code explicitly document intent with goto case.

5) scope(exit) is also great for preventing resource leaks in a way that doesn't require "code at a distance" (very error-prone).

The one big class of issues that D doesn't solve is null pointer handling. Exceptions help to some extent by making it possible to do a check at the top and abort immediately if something is null, but it's still possible to have some nasty null pointer bugs in D. Fortunately, D mostly dispenses with the need to directly manipulate pointers, so the surface area for bugs is somewhat smaller than in C (and older-style C++ code -- unfortunately still prevalent even today).
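(To make the list above concrete, here's a minimal sketch of points 1, 2, 4, and 5 together -- the enum and file name are invented for illustration:)

	import std.stdio;

	enum Color { red, green, blue }

	void demo(Color c) {
		int[8] buf;               // 2) initialized by default (to 0)
		int[] slice = buf[];      // 1) a slice carries its own length,
		slice[$ - 1] = 42;        //    and every access is bounds-checked

		auto f = File("blah.txt", "w");
		scope(exit) f.close();    // 5) cleanup sits next to the acquisition

		final switch (c) {        // 4) refuses to compile unless every
		case Color.red:           //    member of Color is handled
			f.writeln("red");
			break;
		case Color.green:
			goto case Color.blue; // fallthrough only when explicitly asked
		case Color.blue:
			f.writeln("not red");
			break;
		}
	}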
A good part of the null pointer bugs I found were related to C's lack of closures -- D's delegates would obviate the need for direct pointer manipulation in this case, even if D still suffers from null-handling issues.

Anyway, the point of all this is that C/C++'s safety problems are very real; C/C++ code is very widespread in online services, more C/C++ code is still being written every day for online services, and a lot of that code is still affected by the lack of safety in C/C++. So safety problems in C/C++ are very relevant today, and will continue to be a major concern in the near future. If we can complete the implementation of SafeD (and plug the existing holes in @safe), it could have a significant impact in this area.

T

--
Not all rumours are as misleading as this one.
May 08
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:


 	strncpy(buf, src, sizeof(buf));

 Quick, without looking: what's wrong with the above line of 
 code?

 Not so obvious, huh?  The problem is that strncpy is, in spite 
 of being the "safe" version of strcpy, badly designed. It does 
 not guarantee buf is null-terminated if src was too long to fit 
 in buf!  Next thing you know -- why, hello, unterminated string 
 used to inject shellcode into your "secure" webserver!

 The "obvious" fix, of course, is to leave 1 byte for the \0 
 terminator:

 	strncpy(buf, src, sizeof(buf)-1);

 Except that this is *still* wrong, because strncpy doesn't 
 write a '\0' to the end. You have to manually put one there:

 	strncpy(buf, src, sizeof(buf)-1);
 	buf[sizeof(buf)-1] = '\0';

 The second line there has a -1 that lazy/careless C coders 
 often forget, so you end up *introducing* a buffer overrun in 
 the name of fixing another.

 This single problem area (improper use of strncpy) accounts for 
 a larger chunk of code I've audited than I dare to admit -- all 
 just timebombs waiting for somebody to write an exploit for.
Adding to that, strncpy() is also a performance trap. strncpy will not stop when the input string is finished; it will fill the rest of the buffer up with 0. So

	char buff[4000];
	strncpy(buff, "hello", sizeof buff);

will write 4000 bytes on every call.

The thing with strncpy() is that it's a badly named function. It is named as a string function, but it isn't one. Had it been named memncpy() or something like that, it wouldn't confuse most C programmers. If I get my C lore right, the function was initially written for writing the file name into a Unix directory entry:

	strncpy(dirent, filename, 14);

or something like that.
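(For contrast, a hedged D sketch of the same copy -- a slice assignment copies exactly the bytes asked for and zero-fills nothing:)

	void main() {
		char[4000] buff;
		const src = "hello";
		buff[0 .. src.length] = src[]; // copies 5 chars; the other
		                               // 3995 bytes are left untouched
	}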
May 09
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 	int my_func(mytype_t *input, outbuf_t *output_buf,
 	            char *buffer, int size)
 	{
 		/* Typical lazy way of null-checking (that will blow up
 		 * later) */
 		myhandle_t *h = input ? input->handle : 0;
 		writer_t *w = output_buf ? output_buf->writer : 0;
 		char *block = (char *)malloc(size);
Hey, you've been outed as a C++ programmer. A real C programmer never casts a void *.

In that specific case, casting away the malloc() return can mask a nasty bug. If you have forgotten to include the header declaring the function, the compiler will assume an int-returning function, and the cast will suppress the righteous warning message from the compiler. On 64-bit machines the returned pointer would be truncated to its lower half. Unfortunately on Linux, as the heap starts in the lower 4 GiB of address space, the code would run for a long time before it crashed. On Solaris-SPARC it would crash directly, as binaries are loaded at address 0x1_0000_0000 of the address space.
 		FILE *fp;
 		int i;

 		if (!buffer)
 			return -1; /* typical useless error return code */
 				/* (also, memory leak) */

 		if (h->control_block) { /* oops, possible null deref */
 			fp = fopen("blah", "w");
 			if (!fp)
 				return -1; /* oops, memory leak */
 		}
 		if (w->buffered) { /* oops, possible null deref */
 			strncpy(buffer, input->data, size); /* oops, unterminated 
 string */
 			if (w->write(buffer, size) != 0)
 				/* hmm, is 0 the error status, or is it -1? */
 				/* also, what if w->write == null? */
Or is it inspired by fwrite, which returns the number of records written? In that case a 0 return might or might not be an error; it depends on size.
 			{
 				return -1; /* oops, memory leak AND file
 						descriptor leak */
 			}
 		}
 		for (i = 0; i <= input->size; i++) {	/* oops, off-by-1 error 
 */
 			... /* more nauseating nasty stuff here */
 			if (error)
 				goto EXIT;
 			... /* ad nauseum */
 		}
 	EXIT:
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on a FILE opened for writing. fclose() can fail if the disk is full. As the FILE is buffered, the last fwrite might not have flushed it yet, so it is the fclose() that will try to write the last block, and that can fail -- but the app wouldn't even be able to report it.
 		free(block);

 		/* Typical lazy way of evading more tedious `if
 		 * (func(...) == error) goto EXIT;` style code, which
 		 * ends up being even more error-prone */
 		return error ? -1 : w->num_bytes_written();
 			/* oops, what if w or w->num_bytes_written is
 			 * null? */
 	}
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via Digitalmars-d
wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 
 	int my_func(mytype_t *input, outbuf_t *output_buf,
 	            char *buffer, int size)
 	{
 		/* Typical lazy way of null-checking (that will blow up
 		 * later) */
 		myhandle_t *h = input ? input->handle : 0;
 		writer_t *w = output_buf ? output_buf->writer : 0;
 		char *block = (char *)malloc(size);
Hey, you've been outed as a C++ programmer. A real C programmer never casts a void *. In that specific case, casting away the malloc() return can mask a nasty bug. If you have forgotten to include the header declaring the function, the compiler would assume an int returning function and the cast would suppress the righteous warning message of the compiler. On 64 bit machines the returned pointer would be truncated to the lower half. Unfortunately on Linux, as the heap starts in the lower 4 GiB of address space, the code would run for a long time before it crashed. On Solaris-SPARC it would crash directly as binaries are loaded address 0x1_0000_0000 of the address space.
Ouch. Haha, even I forgot about this particularly lovely aspect of C. Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?

There's an even more pernicious version of this, in that the compiler blindly believes whatever you declare a symbol to be, and the declaration doesn't even have to be in a .h file or anything even remotely related to the real definition. Here's a (greatly) reduced example (paraphrased from an actual bug I discovered):

	module.c:
	-------
	int get_passwd(char *buf, int size);

	int func() {
		char passwd[100];
		if (!get_passwd(passwd, 100)) return -1;
		do_something(passwd);
	}

	passwd.c:
	---------
	void get_passwd(struct user_db *db, struct login_record *rec) {
		... // stuff
	}

	old_passwd.c:
	-------------
	/* Please don't use this code anymore, it's deprecated. */
	/* ^^^^ gratuitous useless comment */
	int get_passwd(char *buf, int size) { ... /* old code */ }

Originally, in the makefile, module.o was linked with libutil.so, which in turn was built from old_passwd.o and a bunch of other stuff. Later on, passwd.o was added to libotherutil.so, which was listed after libutil.so in the linker command, so the symbol conflict was masked because the linker found the libutil.so version of get_passwd first.

Then one day, somebody changed the order of libraries in the makefile, and suddenly func() mysteriously started malfunctioning, because get_passwd now linked to the wrong version of the function!

Worse yet, the makefile was written to be "smart", as in, it used wildcards to pick up .so files (y'know, us lazy programmers don't wanna have to manually type out the name of every library). So when somebody tried to fix this bug by removing old_passwd.o from libotherutil.so altogether, the bug was still happening on other developers' machines, because a stale copy of the old version of libotherutil.so was still left in their source trees, so when *they* built the executable, it contained the bug, but the bug vanished when built from a fresh checkout. Who knows how many hours were wasted chasing after this heisenbug.

[...]
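(For contrast, a hedged sketch of why this particular trap can't happen in D -- module and type names are invented. The declaration always comes from the module that defines the symbol, so a mismatch is a compile-time error rather than a link-time surprise:)

	// passwd.d
	module passwd;

	struct UserDb {}
	struct LoginRecord {}

	void get_passwd(UserDb* db, LoginRecord* rec) { /* stuff */ }

	// app.d
	module app;
	import passwd; // no hand-copied prototype to get wrong

	int func() {
		char[100] buf;
		// get_passwd(buf.ptr, 100); // compile error: the compiler sees
		//                           // the real signature via the import
		get_passwd(null, null);      // OK: matches the actual definition
		return 0;
	}

(D's name mangling also encodes the parameter types, so even a stale object file would fail to link rather than silently bind to the wrong function.)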
 		if (w->buffered) { /* oops, possible null deref */
 			strncpy(buffer, input->data, size); /* oops, unterminated string */
 			if (w->write(buffer, size) != 0)
 				/* hmm, is 0 the error status, or is it -1? */
 				/* also, what if w->write == null? */
Or is it inspired by fwrite, which returns the number of written records? In that case 0 return might be an error or not, depends on size.
Yep, fwrite has an utterly lovely interface. The epitome of API design. :-D [...]
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on writing FILE. fclose() can fail if the disk was full. As the FILE is buffered, the last fwrite might not have flushed it yet. So it is the fclose() that will try to write the last block and that can fail, but the app wouldn't be able to even report it.
[...]

Haha, you're right. NONE of the code I've ever dealt with even considers this case. None at all. In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose(). Maybe I've written it *once* in my lifetime, when I was young and naïve and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!

So again Walter was spot on: fill up the disk to 99% full, and 99% of C programs will start malfunctioning and showing all kinds of odd behaviours, because they never check the return code of printf, fprintf, or fclose, or of a whole bunch of other syscalls that are regularly *assumed* to just work, when in reality they *can* fail.

The worst part of all this is, this kind of C code is prevalent everywhere in C projects, including those intended for supposedly security-aware software. Basically, the language itself is just so unfriendly to safe coding practices that it's nigh impossible to write safe code in it. It's *theoretically* possible, certainly, but in practice nobody writes C code that way. It is a scary thought indeed, how much of our current infrastructure relies on software running this kind of code. Something's gotta give, eventually. And it ain't gonna be pretty when it all starts crumbling down.

T

--
Caffeine underflow. Brain dumped.
May 09
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via
[...]
 Ouch.  Haha, even I forgot about this particularly lovely 
 aspect of C. Hooray, freely call functions without declaring 
 them, and "obviously" they return int! Why not?

 There's an even more pernicious version of this, in that the 
 compiler blindly believes whatever you declare a symbol to be, 
 and the declaration doesn't even have to be in a .h file or 
 anything even remotely related to the real definition. Here's a 
 (greatly) reduced example (paraphrased from an actual bug I 
 discovered):

 	module.c:
 	-------
 	int get_passwd(char *buf, int size);
Yeah, this is a code smell: a function prototype not declared static in a .c file. It raises alarm bells automatically now. The same issue, but much more frequent to observe: extern variable declarations in .c files. That one is really widespread, and few see it as an anti-pattern. An extern global variable should always be put in the header file, never in the C file -- for exactly the same reason as your example with the wrong prototype below: non-matching types that the linker will join wrongly.
 	int func() {
 		char passwd[100];
 		if (!get_passwd(passwd, 100)) return -1;
 		do_something(passwd);
 	}

 	passwd.c:
 	---------
 	void get_passwd(struct user_db *db, struct login_record *rec) {
 		... // stuff
 	}

 	old_passwd.c:
 	-------------
 	/* Please don't use this code anymore, it's deprecated. */
 	/* ^^^^ gratuitous useless comment */
 	int get_passwd(char *buf, int size) { ... /* old code */ }

 Originally, in the makefile, module.o is linked with 
 libutil.so, which in turn is built from old_passwd.o and a 
 bunch of other stuff. Later on, passwd.o was added to 
 libotherutil.so, which was listed after libutil.so in the 
 linker command, so the symbol conflict was masked because the 
 linker found the libutil.so version of get_passwd first.

 Then one day, somebody changed the order of libraries in the 
 makefile, and suddenly func() mysteriously starts 
 malfunctioning because get_passwd now links to the wrong 
 version of the function!

 Worse yet, the makefile was written to be "smart", as in, it 
 uses wildcards to pick up .so files (y'know, us lazy 
 programmers don't wanna have to manually type out the name of 
 every library).
Yeah, we also had makefiles using wildcards. It took a long time, but I managed to get rid of them.
 So when somebody tried to fix this bug by removing old_passwd.o 
 from libotherutil.so altogether, the bug was still happening in 
 other developers' machines, because a stale copy of the old 
 version of libotherutil.so was still left in their source tree, 
 so when *they* built the executable, it contains the bug, but 
 the bug vanishes when built from a fresh checkout. Who knows 
 how many hours were wasted chasing after this heisenbug.


 [...]
 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
Worse, you didn't check the return of fclose() on writing FILE. fclose() can fail if the disk was full. As the FILE is buffered, the last fwrite might not have flushed it yet. So it is the fclose() that will try to write the last block and that can fail, but the app wouldn't be able to even report it.
[...] Haha, you're right. NONE of the code I've ever dealt with even considers this case. None at all. In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose(). Maybe I've written it *once* in my lifetime when I was young and naïve, and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!
I discovered that one only a few months ago. I now have around 30 places in our code base to fix. It's only important for FILEs opened for writing; reading FILEs can ignore the return values.
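(For what it's worth, a minimal sketch of the D counterpart -- the file name is invented. std.stdio.File reports a failed close by throwing an ErrnoException, so the error can't just be silently dropped:)

	import std.stdio;

	void writeLog(string msg) {
		auto f = File("log.txt", "w"); // a failed open throws
		scope(exit) f.close();         // and so does a failed flush/close
		f.writeln(msg);
	}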
 So again Walter was spot on: fill up the disk to 99% full, and 
 99% of C programs would start malfunctioning and showing all 
 kinds of odd behaviours, because they never check the return 
 code of printf, fprintf, or fclose, or any of a whole bunch of 
 other syscalls that are regularly *assumed* to just work, when 
 in reality they *can* fail.

 The worst part of all this is, this kind of C code is prevalent 
 everywhere in C projects, including those intended for 
 supposedly security-aware software.  Basically, the language 
 itself is just so unfriendly to safe coding practices that it's 
 nigh impossible to write safe code in it.  It's *theoretically* 
 possible, certainly, but in practice nobody writes C code that 
 way.  It is a scary thought indeeed, how much of our current 
 infrastructure relies on software running this kind of code.  
 Something's gotta give, eventually. And it ain't gonna be 
 pretty when it all starts crumbling down.
Agreed. That's why I'm learning D now: it's probably the only language that will be able to progressively replace our C code base in a skunkworks fashion. I wanted to do it already 5 or 6 years ago, but couldn't, as we were on Solaris/SPARC back then. Now that we have migrated to Linux-AMD64, there's not much holding us back. The Oracle client is maybe still an issue, though.
May 09
Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 Ouch.  Haha, even I forgot about this particularly lovely 
 aspect of C. Hooray, freely call functions without declaring 
 them, and "obviously" they return int! Why not?
To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strncpy). The remaining problems theoretically shouldn't occur with disciplined use of commonly accepted C99 guidelines. But I agree that even then, and with the use of sanitizers, writing correct C code remains very hard.
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 11:09:27PM +0000, Guillaume Boucher via Digitalmars-d
wrote:
 On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
 Ouch.  Haha, even I forgot about this particularly lovely aspect of
 C.  Hooray, freely call functions without declaring them, and
 "obviously" they return int! Why not?
 To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strncpy).
The problem is that warnings don't work, because people ignore them. Everybody knows warnings shouldn't be ignored, but let's face it: when you make a 1-line code change and run make, and the output is 250 pages long (large project, y'know), any warnings that are buried somewhere in there won't even be noticed, much less acted on.

In this sense I agree with Walter that warnings are basically useless, because they're not enforced. Either something is correct and compiles, or it should be an error that stops compilation. Anything else, and you start having people ignore warnings.

Yes, I know, there's gcc -Werror and the analogous flags for the other compilers, but in a sufficiently large project, -Werror is basically impractical, because too much of the legacy code will just refuse to compile, and it's not feasible to rewrite / waste time fixing it.

As for avoiding de-facto-deprecated functions, I've already said it: *everybody* knows strcat is bad, and strcpy is bad, and so on and so forth. So how come I still see new C code being written almost every day that continues to use these functions? It's not that the coders refuse to cooperate... I've seen a lot of code in my project where people meticulously use strncpy instead of strcat / strcpy -- I presume out of the awareness that they are "bad". But when push comes to shove and there's a looming deadline, all scruples are thrown to the winds and people just take the path of least resistance. The mere fact that strcat and strcpy exist means that somebody, sometime, will use them, and usually to disastrous consequences.

And *that's* the fundamental problem with C (and, on the same principle, C++): the correct way to write code is also a very onerous, fragile, error-prone, and verbose way of writing code. The "obvious" and "easy" way to write C code is almost always the wrong way. The incentives are all wrong, and so there's a big temptation for people to cut corners and take the easy way out.

It's much easier to write this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc = ctx->data;
		FILE *fp = fopen(desc->filename, "w");
		char *tmp = malloc(1000);
		strcpy(tmp, desc->data1);
		fwrite(tmp, strlen(tmp), 1, fp);
		strcpy(tmp, desc->data2);
		fwrite(tmp, strlen(tmp), 1, fp);
		strcpy(desc->cache, tmp);
		fclose(fp);
		free(tmp);
		return 0;
	}

rather than this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc;
		FILE *fp;
		char *tmp;
		size_t bufsz;

		if (!ctx) return INVALID_CONTEXT;
		desc = ctx->data;
		if (!desc->data1 || !desc->data2) return INVALID_ARGS;

		fp = fopen("blah", "w");
		if (!fp) return CANT_OPEN_FILE;

		bufsz = desc->data1_len + desc->data2_len + 1;
		tmp = malloc(bufsz);
		if (!tmp) {
			fclose(fp);
			return OUT_OF_MEMORY;
		}

		strncpy(tmp, desc->data1, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1) {
			fclose(fp);
			unlink("blah");
			free(tmp);
			return IO_ERROR;
		}
		strncpy(tmp, desc->data2, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1) {
			fclose(fp);
			unlink("blah");
			free(tmp);
			return IO_ERROR;
		}
		if (desc->cache)
			strncpy(desc->cache, tmp, sizeof(desc->cache));

		if (fclose(fp) != 0) {
			WARN("I/O error");
			free(tmp);
			return IO_ERROR;
		}
		free(tmp);
		return OK;
	}

Most people would probably write something in between, which is neither completely wrong, nor completely right. But it works for 90% of the cases, and since it meets the deadline, it's "good enough".

Notice how much longer and more onerous it is to write the "correct" version of the code than the easy way. A properly-designed language ought to reverse the incentives: the default, "easy" way to write code should be the "correct", safe, non-leaking way.
Potentially unsafe, potentially resource-leaking behaviour should require work on the part of the coder, so that he'd only do it when there's a good reason for it (optimization, or writing system code that needs to go outside the confines of the default safe environment, etc.).

In this respect, D scores much better than C/C++. Very often, the "easy" way to write something in D is also the correct way. It may not be the fastest way for the performance-obsessed premature-optimizing C hacker crowd (and I include myself among them), but it won't leak memory, overrun buffers, act on random stack values from uninitialized local variables, etc. Your program is correct to begin with, which then gives you a stable footing to start working on improving its performance. In C/C++, your program is most likely wrong to begin with, so imagine what happens when you try to optimize that wrong code in typical C/C++ hacker premature-optimization fashion.

(Nevermind the elephant in the room that 80-90% of the "optimizations" C/C++ coders -- including myself -- have programmed into their finger reflexes are actually irrelevant at best, because either compilers already do those optimizations for you, or the hot spot simply isn't where we'd like to believe it is; or outright de-optimizing at worst, because we've successfully defeated the compiler's optimizer by writing inscrutable code.)

Null dereference is one area where D does no better than C/C++, though even in that case, language features like closures help alleviate much of the kind of code that would otherwise need to deal with pointers directly. (Yes, I'm aware C++ now has closures... but most of the C++ code out in the industry -- and C++ coders themselves -- have a *long* way to go before they can catch up with the latest C++ standards. Until then, it's lots of manual pointer manipulations that are ready to explode in your face anytime.)
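(To make the contrast concrete, a hedged D sketch of the earlier myfunc -- the type and field names mirror the hypothetical C ones. With GC-managed strings, slices, exceptions, and scope(exit), the short version *is* the checked version:)

	import std.stdio;

	struct DataDesc {
		string filename;
		string data1, data2;
		string cache;
	}

	struct Context {
		DataDesc* data;
	}

	void myfunc(Context* ctx) {
		auto desc = ctx.data;               // a null ctx fails fast and loudly;
		                                    // it can't silently corrupt memory
		auto f = File(desc.filename, "w");  // open failure throws
		scope(exit) f.close();              // so does a failed close -- no
		                                    // forgotten fclose() check
		auto tmp = desc.data1 ~ desc.data2; // no buffer to size or terminate
		f.write(tmp);                       // write failure throws
		desc.cache = tmp;                   // slice assignment, not strncpy
	}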
 The remaining problems theoretically shouldn't occur by disciplined
 use of commonly accepted C99 guidelines.  But I agree that even then
 and with the use of sanitizers writing correct C code remains very
 hard.
That's another fundamental problem with the C/C++ world: coding by convention. We all know all too well that *if* we'd only abide by such-and-such coding guidelines and recommendations, our code would actually stand a chance of being correct, safe, non-leaking, etc. However, the problem with conventions is that they are just that: conventions. They get broken all the time, with disastrous consequences.

I used to believe in convention -- after all, who wouldn't want to be a goodie-two-shoes coder who abides by all the rules so that they can take pride in their shiny, perfect code? Unfortunately, after almost 20 years working in the industry and seeing "enterprise" code that makes my eyes bleed, I've lost all confidence that conventions are of any help. I've seen code written by supposedly "renowned" or "expert" C coders that represents some of the most repulsive, stomach-turning examples of antipatterns I've ever had the dubious pleasure of needing to debug.

D's stance of static verifiability and compile-time guarantees is an oft under-appreciated big step in the right direction. In the long run, conventions will not solve anything; you need *enforcement*. The compiler has to be able to prove, at compile time, that function X is actually pure, or nothrow, or @safe, or whatever, for those things to have any value whatsoever. And for this to be possible, the language itself needs to have these notions built in, rather than have them tacked on by an external tool (which people will be reluctant to use, or outright ignore, or which doesn't work with their strange build system, target arch, or whatever).

Sure, there are currently implementation bugs that make @safe not quite so safe in some cases, and too much of Phobos is still @safe-incompatible. But still, these are implementation quality issues. The concept itself is a sound and powerful one. A compiler-verified attribute is far more effective than any blind-faith trust in convention ever will be, e.g., D's immutable vs. C++'s easy-to-cast-away const -- which we *trust* people won't attempt. Yes, I'm aware of bugs in the current implementation that allow you to bypass immutable, but still, it's a QoI issue. And yes, there are areas in the spec that have holes, etc. But assuming these QoI issues and spec holes / inconsistencies are fixed, what we have is a powerful system that will actually deliver compile-time guarantees about memory safety, rather than a system of conventions that you can never be too sure somebody somewhere didn't break, and therefore you can only *hope* is memory-safe.

T

--
Life is too short to run proprietary software. -- Bdale Garbee
May 09
next sibling parent reply "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 In this sense I agree with Walter that warnings are basically useless,
 because they're not enforced. Either something is correct and compiles,
 or it should be an error that stops compilation. Anything else, and you
 start having people ignore warnings.
Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings, so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")

And then the flip side is that some code smells are just too pedantic to justify breaking the build while the programmer is in the middle of some debugging or refactoring or some such.

That puts me strongly in the philosophy of "Code containing warnings: allowed while compiling, disallowed when committing (with allowances for mitigating circumstances)."

C/C++ doesn't demonstrate that warnings are doomed to be useless and "always" ignored. What it demonstrates is that warnings are NOT an appropriate strategy for fixing language problems.
 As for avoiding de-facto-deprecated functions, I've already said it:
 *everybody* knows strcat is bad, and strcpy is bad, and so on and so
 forth.  So how come I still see new C code being written almost every
 day that continues to use these functions?  It's not that the coders
 refuse to cooperate... I've seen a lot of code in my project where
 people meticulously use strncpy instead of strcat / strcpy -- I presume
 out of the awareness that they are "bad".  But when push comes to shove
 and there's a looming deadline, all scruples are thrown to the winds and
 people just take the path of least resistance.  The mere fact that
 strcat and strcpy exist means that somebody, sometime, will use them,
 and usually to disastrous consequences.
The moral of this story: Sometimes, breaking people's code is GOOD! ;)
 And *that's* the fundamental problem with C (and in the same principle,
 C++): the correct way to write code is also a very onerous, fragile,
 error-prone, and verbose way of writing code. The "obvious" and "easy"
 way to write C code is almost always the wrong way.  The incentives are
 all wrong, and so there's a big temptation for people to cut corners and
 take the easy way out.
Damn straight :)
 (Nevermind the elephant in the room that 80-90% of the "optimizations"
 C/C++ coders -- including myself -- have programmed into their finger
 reflexes are actually irrelevant at best, because either compilers
 already do those optimizations for you, or the hot spot simply isn't
 where we'd like to believe it is; or outright de-optimizing at worst,
 because we've successfully defeated the compiler's optimizer by writing
 inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
 That's another fundamental problem with the C/C++ world: coding by
 convention.  We all know all too well that *if* we'd only abide by
 such-and-such coding guidelines and recommendations, our code would
 actually stand a chance of being correct, safe, non-leaking, etc..
Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)
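(A minimal sketch of that enforcement in D, assuming current dmd behaviour -- uncommenting either line is a compile error, not a warning:)

	@safe pure nothrow int twice(int x) {
		return 2 * x; // the compiler *proves* these attributes hold
	}

	@safe void demo() {
		immutable int answer = 42;
		// answer = 43;                 // error: cannot modify immutable
		// auto p = cast(int*) &answer; // error: rejected in @safe code
		int y = twice(answer);          // checked, not taken on faith
	}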
May 09
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky (Abscissa) via
Digitalmars-d wrote:
 On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 
 In this sense I agree with Walter that warnings are basically
 useless, because they're not enforced. Either something is correct
 and compiles, or it should be an error that stops compilation.
 Anything else, and you start having people ignore warnings.
 
 Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")
I'd much rather the compiler say "Hey, you! This piece of code is probably wrong, so please fix it! If it was intentional, please write it another way that makes that clear!" - and abort with a compile error.

This is actually one of the things I like about D. For example, if you wrote:

	switch (e) {
		case 1: return "blah";
		case 2: return "bluh";
	}

the compiler will refuse to compile the code until you either add a default case, or make it a final switch (in which case the compiler will refuse to compile the code unless every possible case is in fact covered).

Now imagine if this were merely a warning that people could just ignore. Yep, we're squarely back in good ole C/C++ land, where an unexpected value of e causes the code to amble down an unexpected path, with the consequent hilarity that ensues. IOW, it should not be possible to write tricky stuff by default; you should need to ask for it explicitly so that intent is clear.

Another switch example:

	switch (e) {
		case 1: x = 2;
		case 2: x = 3;
		default: x = 4;
	}

In C, the compiler happily compiles the code for you. In D, at least the latest dmd will give you deprecation warnings (and presumably, in the future, actual compile errors) for forgetting to write `break;`. But if the fallthrough was intentional, you document that with an explicit `goto case ...;`. IOW, the default behaviour is the safe one (no fallthrough), and the non-default behaviour (fallthrough) has to be explicitly asked for. Much, much better.
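(A complete, compilable variant of the same idea -- the enum is invented for illustration. Remove a case from the final switch and the build fails; the intentional fallthrough is spelled out with goto case:)

	enum Mode { read, write, append }

	string describe(Mode m) {
		string s;
		final switch (m) {   // compile error if any Mode member is missing
		case Mode.read:
			s = "read-only";
			break;
		case Mode.write:
			goto case Mode.append; // intentional fallthrough, documented
		case Mode.append:
			s = "writable";
			break;
		}
		return s;
	}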
 And then the flip side is that some code smells are just too pedantic
 to justify breaking the build while the programmer is in the middle of
 some debugging or refactoring or some such.
 
 That puts me strongly in the philosophy of "Code containing warnings:
 Allowed while compiling, disallowed when committing (with allowances
 for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code. But I definitely agree that code with warnings should never make it into the code repo. The problem is that it's not enforced by the compiler, so *somebody* somewhere will inevitably bypass it.
 C/C++ doesn't demonstrate that warnings are doomed to be useless and
 "always" ignored. What it demonstrates is that warnings are NOT an
 appropriate strategy for fixing language problems.
Point. I suppose YMMV, but IME, unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work. It's entirely possible that it's a bias specific to my job, but somehow I have a suspicion that this isn't completely the case. Humans tend to be lazy, and ignoring compiler warnings is rather high up on the list of things lazy people tend to do. The likelihood increases with the presence of other factors like looming deadlines, unreasonable customer requests, ambiguous feature specs handed down from the PTBs, or just plain having too much on your plate to bother with "trivialities" like fixing compiler warnings.

That's why my eventual conclusion is that anything short of enforcement will ultimately fail. Unless there is no way you can actually get an executable out of badly-written code, there will always be *somebody* out there who will write bad code. And by Murphy's Law, that somebody will eventually be someone on your team, and chances are you'll be the one cleaning up the mess afterwards. Not something I envy doing (I've already had to do too much of that).

[...]
 The moral of this story: Sometimes, breaking people's code is GOOD! ;)
Tell that to Walter / Andrei. ;-) [...]
 (Nevermind the elephant in the room that 80-90% of the
 "optimizations" C/C++ coders -- including myself -- have programmed
 into their finger reflexes are actually irrelevant at best, because
 either compilers already do those optimizations for you, or the hot
 spot simply isn't where we'd like to believe it is; or outright
 de-optimizing at worst, because we've successfully defeated the
 compiler's optimizer by writing inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong.

Or maybe my perceptions are just heavily colored by the supposedly "expert" C coders I've met, who wrote supposedly better code that I eventually realized was actually not better, but in many ways actually worse -- less readable, less maintainable, more error-prone to write, and at the end of the day arguably less performant, because it ultimately led to far too much boilerplate and other sources of code bloat, excessive string copying, too much indirection (cache unfriendliness), and other such symptoms that C coders often overlook.

(And meanwhile, the mere mention of the two letters "G C" and they instantly recoil, and rattle off an interminable list of 20-years-outdated GC-phobic excuses, preferring rather to die the death of a thousand pointer bugs (and memory leaks, and overrun buffers) than succumb to the Java of the early 90's with its klunky, poorly-performing GC of spotted repute that has long since been surpassed. And of course, any mention of any evidence that Java *might* actually perform better than poorly-written C code in some cases will incite instant vehement denial. After all, how can an "interpreted" language possibly outperform poorly-designed, over-engineered C scaffolding that necessitates far too much excessive buffer copying and destroys cache coherence with far too many unnecessary indirections? Inconceivable!)
 That's another fundamental problem with the C/C++ world: coding by
 convention.  We all know all too well that *if* we'd only abide by
 such-and-such coding guidelines and recommendations, our code would
 actually stand a chance of being correct, safe, non-leaking, etc..
Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)
Exactly. Not compiler warnings... :-D

T

--
You have to expect the unexpected. -- RL
May 09
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 5/10/17 8:28 AM, H. S. Teoh via Digitalmars-d wrote:
 C++'s fundamental paradigm has always been "Premature-optimization
 oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong. Or maybe my perceptions are just heavily colored by the supposedly "expert" C coders I've met, who wrote supposedly better code that I eventually realized was actually not better, but in many ways actually worse -- less readable, less maintainable, more error-prone to write, and at the end of the day arguably less performant because it ultimately led to far too much boilerplate and other sources of code bloat, excessive string copying, too much indirection (cache unfriendliness), and other such symptoms that C coders often overlook.
Just to add a different perspective - the people I work with are the kind of guys who know when not to trust the profiler and what to try if the profile is flat. Nobody questions whether you should run it; it's just assumed you always do. P.S. Can't wait to see an "Are we fast yet?" graph for Phobos functions. --- Dmitry Olshansky
May 10
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky 
 (Abscissa) via Digitalmars-d wrote:
 On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
 
 In this sense I agree with Walter that warnings are 
 basically useless, because they're not enforced. Either 
 something is correct and compiles, or it should be an error 
 that stops compilation. Anything else, and you start having 
 people ignore warnings.
 
Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")
I'd much rather the compiler say "Hey, you! This piece of code is probably wrong, so please fix it! If it was intentional, please write it another way that makes that clear!" - and abort with a compile error.

This is actually one of the things I like about D. For example, if you wrote:

	switch (e) {
		case 1: return "blah";
		case 2: return "bluh";
	}

the compiler will refuse to compile the code until you either add a default case, or make it a final switch (in which case the compiler will refuse to compile the code unless every possible case is in fact covered). Now imagine if this was merely a warning that people could just ignore. Yep, we're squarely back in good ole C/C++ land, where an unexpected value of e causes the code to amble down an unexpected path, with the consequent hilarity that ensues. IOW, it should not be possible to write tricky stuff by default; you should need to ask for it explicitly so that intent is clear.

Another switch example:

	switch (e) {
		case 1: x = 2;
		case 2: x = 3;
		default: x = 4;
	}

In C, the compiler happily compiles the code for you. In D, at least the latest dmd will give you deprecation warnings (and presumably, in the future, actual compile errors) for forgetting to write `break;`. But if the fallthrough was intentional, you document that with an explicit `goto case ...`. IOW, the default behaviour is the safe one (no fallthrough), and the non-default behaviour (fallthrough) has to be explicitly asked for. Much, much better.
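For reference, here's a minimal sketch of the accepted forms (the enum and the variable names are made up for illustration):

	enum Color { red, green, blue }

	string name(Color c)
	{
		// final switch: refuses to compile unless every member
		// of Color is covered; no default case is allowed.
		final switch (c)
		{
			case Color.red:   return "red";
			case Color.green: return "green";
			case Color.blue:  return "blue";
		}
	}

	int classify(int e)
	{
		int x;
		switch (e)
		{
			case 1:
				x = 2;
				goto case; // intentional fallthrough, spelled out
			case 2:
				x = 3;
				break;
			default:       // a plain switch must have a default
				x = 4;
				break;
		}
		return x;
	}

	unittest
	{
		assert(name(Color.green) == "green");
		assert(classify(1) == 3); // the fallthrough lands in case 2
	}

Both the missing default and the silent fallthrough are hard errors here, not warnings that a hurried reader can skip over.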
 And then the flip side is that some code smells are just too 
 pedantic to justify breaking the build while the programmer is 
 in the middle of some debugging or refactoring or some such.
 
 That puts me strongly in the philosophy of "Code containing 
 warnings: Allowed while compiling, disallowed when committing 
 (with allowances for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code. But I definitely agree that code with warnings should never make it into the code repo. The problem is that it's not enforced by the compiler, so *somebody* somewhere will inevitably bypass it.
 C/C++ doesn't demonstrate that warnings are doomed to be 
 useless and "always" ignored. What it demonstrates is that 
 warnings are NOT an appropriate strategy for fixing language 
 problems.
Point. I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work. It's entirely possible that it's a bias specific to my job, but somehow I have a suspicion that this isn't completely the case. Humans tend to be lazy, and ignoring compiler warnings is rather high up on the list of things lazy people tend to do. The likelihood increases with the presence of other factors like looming deadlines, unreasonable customer requests, ambiguous feature specs handed down from the PTBs, or just plain having too much on your plate to be bothering with "trivialities" like fixing compiler warnings. That's why my eventual conclusion is that anything short of enforcement will ultimately fail. Unless there is no way you can actually get an executable out of badly-written code, there will always be *somebody* out there that will write bad code. And by Murphy's Law, that somebody will eventually be someone in your team, and chances are you'll be the one cleaning up the mess afterwards. Not something I envy doing (I've already had to do too much of that). [...]
 The moral of this story: Sometimes, breaking people's code is 
 GOOD! ;)
Tell that to Walter / Andrei. ;-) [...]
 (Nevermind the elephant in the room that 80-90% of the 
 "optimizations" C/C++ coders -- including myself -- have 
 programmed into their finger reflexes are actually 
 irrelevant at best, because either compilers already do 
 those optimizations for you, or the hot spot simply isn't 
 where we'd like to believe it is; or outright de-optimizing 
 at worst, because we've successfully defeated the compiler's 
 optimizer by writing inscrutable code.)
C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.
LOL!! Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong.
The likelihood of a randomly picked C/C++ programmer not even knowing what a profiler is, much less having used one, is extremely high in my experience. I worked with a lot of embedded C programmers with several years of experience who knew nothing but embedded C. We're talking dozens of people here. Not one of them had ever used a profiler.

In fact, a senior developer (now tech lead) doubted I could make our build system any faster. I did by 2 orders of magnitude. When I presented the result to him he said in disbelief: "But, how? I mean, if it's doing exactly the same thing, how can it be faster?". Big O? Profiler? What are those? I actually stood there for a few seconds with my mouth open because I didn't know what to say back to him.

These people are also likely to raise concerns about performance during code review despite having no idea what a cache line is. They still opine that one shouldn't add another function call for readability because that'll hurt performance. No need to measure anything, we all know calling functions is bad, even when they're in the same file and the callee is `static`.

I think a lot of us underestimate just how bad the "average" developer is. A lot of them write C code, which is like giving chainsaws to chimpanzees.
 (And meanwhile, the mere mention of the two letters "G C" and 
 they instantly recoil, and rattle of an interminable list of
That's cognitive dissonance: there's not much anyone can do about that. Unfortunately, facts don't matter, feelings do. Atila
May 10
next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
[...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gear) and I can say that there is a good reason for that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and gnu tools, but at that time it wasn't). The tools don't exist, or if they do, the instrumentation breaks the constraints of the controller. This was also one of the reasons we chose our embedded CPUs very carefully. We always chose processors for which there existed mainstream desktop versions, so that we could at least use the comfortable tooling to test some parts of the code in a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT). TL;DR: profiling for embedded is orders of magnitude harder than for nice OS environments.
May 10
next sibling parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 12:18:40 UTC, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
 [...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gear) and I can say that there is a good reason for that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and gnu tools, but at that time it wasn't). The tools don't exist, or if they do, the instrumentation breaks the constraints of the controller. This was also one of the reasons we chose our embedded CPUs very carefully. We always chose processors for which there existed mainstream desktop versions, so that we could at least use the comfortable tooling to test some parts of the code in a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT). TL;DR: profiling for embedded is orders of magnitude harder than for nice OS environments.
That doesn't mean they shouldn't know what a profiler is. The response would then be (assuming they're competent) "I wish I could use a profiler, but I can't because...", not "how can two programs output the same thing in different amounts of time".

Also, there's a good way around this sort of thing, and it applies to testing as well: run the tools on a development machine (and the tests). Write portable standards-compliant code, make a thin wrapper where needed, and suddenly you can write tests easily, run valgrind, use address sanitizer, ... There's no good reason why you can't profile pure algorithms: C code is C code and has specified semantics whether it's running on a dev machine or a controller. The challenge is to write mostly pure code with thin IO wrappers. It's always a win/win though.

Atila
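P.S. As an illustration of that split, here's a sketch in D rather than C (all names made up): the pure core compiles and tests on any dev machine, and only the thin wrapper has to care about the target environment.

	// Pure core: no IO, no globals -- unit-testable anywhere.
	uint checksum(const(ubyte)[] data) pure nothrow @nogc @safe
	{
		uint sum = 0;
		foreach (b; data)
			sum = (sum << 1) ^ b;
		return sum;
	}

	unittest
	{
		assert(checksum([]) == 0);
		assert(checksum([1]) == 1);
	}

	// Thin IO wrapper: the only part that needs swapping out
	// (or stubbing) when the code runs on the actual device.
	uint checksumFile(string path)
	{
		import std.file : read;
		return checksum(cast(const(ubyte)[]) read(path));
	}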
May 10
prev sibling parent Adrian Matoga <dlang.spam matoga.info> writes:
On Wednesday, 10 May 2017 at 12:18:40 UTC, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 11:16:57 UTC, Atila Neves wrote:
 [...]
 The likelihood of a randomly picked C/C++ programmer not even 
 knowing what a profiler is, much less having used one, is 
 extremely high in my experience. I worked with a lot of 
 embedded C programmers with several years of experience who 
 knew nothing but embedded C. We're talking dozens of people 
 here. Not one of them had ever used a profiler.
I've worked 10 years in embedded (industry, time acquisition and network gear) and I can say that there is a good reason for that. It's nearly impossible to profile in an embedded system (nowadays it's often possible because of the generalization of Linux and gnu tools, but at that time it wasn't). The tools don't exist, or if they do, the instrumentation breaks the constraints of the controller. This was also one of the reasons we chose our embedded CPUs very carefully. We always chose processors for which there existed mainstream desktop versions, so that we could at least use the comfortable tooling to test some parts of the code in a nice environment. We used Z80 (CP/M), 80186 (MS-C on DOS) and then 68030 (Pure-C on Atari TT). TL;DR: profiling for embedded is orders of magnitude harder than for nice OS environments.
IMO it's just different. The thing is, the tools you can use don't need to be marketed as "profilers". People will always find excuses if they lack the time, will or knowledge. In practice, there's always a way to profile and debug, even if you don't have dedicated tools for it. It's also a lot easier to reason about performance on small chips with no caches, ILP, etc. and with fixed instruction timing, than it is on modern complex CPUs with hundreds of tasks competing for resources.

One universal tool is the oscilloscope; for sure there's one on your colleague's desk if you really do embedded stuff. A common way to profile on home computers from the '80s, such as the Atari XE (6502), was simply to change screen colors. That way you always knew the time taken by the measured code with 1-cycle precision. 13.5 scanlines are white? That's 1539 cycles! The time it took to execute a tight loop could even be computed accurately with pen and paper by just looking at the assembly.

It's also a lot easier to implement a cycle-exact emulator for such simple chips, and then you can measure everything without any observer effect.
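On a hosted system the same poor-man's measurement is a one-liner; a minimal sketch in D (the loop body is just a stand-in for the code under test):

	import std.datetime.stopwatch : AutoStart, StopWatch;
	import std.stdio : writeln;

	void main()
	{
		auto sw = StopWatch(AutoStart.yes);
		long sum = 0;
		foreach (i; 0 .. 1_000_000) // code being measured
			sum += i;
		sw.stop();
		writeln("took ", sw.peek.total!"usecs", " usecs (sum = ", sum, ")");
	}

The point stands either way: whether it's a border color, a scope probe or a stopwatch, you can always measure *something*.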
May 14
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via Digitalmars-d wrote:
[...]
 The likelihood of a randomly picked C/C++ programmer not even knowing
 what a profiler is, much less having used one, is extremely high in my
 experience.  I worked with a lot of embedded C programmers with
 several years of experience who knew nothing but embedded C. We're
 talking dozens of people here. Not one of them had ever used a
 profiler. In fact, a senior developer (now tech lead) doubted I could
 make our build system any faster. I did by 2 orders of magnitude.
Very nice! Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins. (Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.)
 When I presented the result to him he said in disbelief: "But, how? I
 mean, if it's doing exactly the same thing, how can it be faster?".
 Big O?  Profiler? What are those? I actually stood there for a few
 seconds with my mouth open because I didn't know what to say back to
 him.
Glad to hear I'm not the only one faced with senior programmers who show surprising ignorance in matters you'd think they really ought to know like the back of their hand.
 These people are also likely to raise concerns about performance
 during code review despite having no idea what a cache line is. They
 still opine that one shouldn't add another function call for
 readability because that'll hurt performance. No need to measure
 anything, we all know calling functions is bad, even when they're in
 the same file and the callee is `static`.
Yep, typical C coder premature optimization syndrome. I would not be surprised if today there's still a significant number of C coders who believe that writing "i++;" is faster than writing "i=i+1;". Ironically, these same people would also come up with harebrained schemes of avoiding something they are prejudiced against, like C++ standard library string types, while ignoring the cost of needing to constantly call O(n) algorithms for string processing (strlen, strncpy, etc.).

I remember many years ago, when I was still young and naïve, in one of my projects I spent days micro-optimizing my code to eliminate every last CPU cycle I could from my linked-list type, only to discover to my chagrin that the bottleneck was nowhere near it -- it was caused by a debugging fprintf() that I had forgotten to take out. And I had only found this out because I finally conceded to run a profiler. That was when this amazing concept finally dawned on me that I could possibly be *wrong* about my ideas of performant code, imagine that!

(Of course, then later on I discovered that my meticulously optimized linked list was ultimately worthless, because it had O(n) complexity, whereas had I just used a library type instead, I could've had O(log n) complexity. But I had dismissed the library type because it was obviously "too complex" to possibly be performant enough for my oh-so-performance-critical code. (Ahem. It was a *game*, and not even a good one. But it absolutely needed every last drop of juice I could squeeze from the CPU. Oh yes.))
 I think a lot of us underestimate just how bad the "average" developer
 is. A lot of them write C code, which is like giving chainsaws to
 chimpanzees.
[...] Hmm. What would giving them D be equivalent to, then? :-D T -- If you're not part of the solution, you're part of the precipitate.
May 10
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 18:58:35 UTC, H. S. Teoh wrote:
 On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via 
 Digitalmars-d wrote: [...]
 [...]
Very nice! Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins. (Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.) [...]
 Hmm. What would giving them D be equivalent to, then? :-D
I'm not sure! If I knew you were going to ask that I'd probably have picked a different analogy ;) Atila
May 10
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded 
 hypothesis is that the majority of C/C++ programmers ...
Just a nitpick, but could we please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but they're generally far off. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
May 10
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 12:06:46PM +0000, Patrick Schluter via Digitalmars-d
wrote:
 On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded hypothesis
 is that the majority of C/C++ programmers ...
Just a nitpick, but could we please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but they're generally far off. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
OK, I'll try to stop conflating them... but the main reason for that is that I find myself stuck in between the two, having started on C (well, assembly before that, but anyway), then moved on to C++, only to grow skeptical of C++'s direction of development, eventually settling on a hybrid of the two commonly known as "C with classes" (i.e., a dialect of C++ without some of what I consider to be poorly-designed features). Recently, though, I've mostly been working on pure C because of my job. I used to still use "C with classes" in my own projects, but after I found D, I essentially swore myself off ever using C++ in my own projects again.

My experience reviewing the C++ code that comes up every now and then at work, though, tells me that the average typical C++ programmer is probably worse than the average typical C programmer when it comes to code quality. And C++ gives you just so many more ways to shoot yourself in the foot. The joke used to go that C gives you many ways to shoot yourself in the foot, but C++ gives you many ways to shoot yourself in the foot and then encapsulate all the evidence away, all packaged in one convenient wrapper.

(And don't get me started on C++ "experts" who invent extravagantly over-engineered class hierarchies that nobody can understand and 90% of which is actually completely irrelevant to the task at hand, resulting in such abysmal performance that people just bypassed the whole thing in the first place and reverted to copy-pasta-ism and using C hacks in C++ code, causing double the carnage. Once I had to invent a stupendous hack to bridge a C++ daemon with a C module whose owners flatly refused to link in any C++ libraries. The horrendous result had 7 layers of abstraction just to make a single function call, one of which involved fwrite()-ing function arguments to a file, fork-and-exec'ing, and fread()-ing it from the other end. Why didn't I just open a socket to the daemon directly? Because the ridiculously over-engineered daemon only understands the reverse-encrypted Klingon protocol spoken by a makefile-generated IPC wrapper file containing 2000 procedurally-generated templates (I kid you not, I'm not talking about 2000 instantiations of one template, I'm talking about 2000 templates which are themselves procedurally generated), and the only way you could speak this protocol was to use the resultant ridiculously bloated C++ library. Which the PTBs have dictated that I cannot link into the C module. What else was a man to do?)

T -- Try to keep an open mind, but not so open your brain falls out. -- theboz
May 10
prev sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/10/2017 08:06 AM, Patrick Schluter wrote:
 On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
 On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
[...]
 Perhaps I'm just being cynical, but my current unfounded hypothesis is
 that the majority of C/C++ programmers ...
Just a nitpick, but could we please stop conflating C and C++ programmers? My experience is that C++ programmers are completely clueless when it comes to C programming. They think they know C, but they're generally far off. The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).
I wouldn't know the difference all that well anyway. Aside from a brief stint playing around with the Marmalade engine, the last time I was still really using C *or* C++ was back when C++ *did* mean little more than "C with classes" (and there was this new "templates" thing that was considered best avoided for the time being because all the implementations were known buggy). I left them when I could tell the complexity of getting things done (in either) was falling way behind the modern curve, and there were other languages which offered sane productivity without completely sacrificing low-level capabilities.
May 11
prev sibling parent reply "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/10/2017 02:28 AM, H. S. Teoh via Digitalmars-d wrote:
 I'd much rather the compiler say "Hey, you! This piece of code is
 probably wrong, so please fix it! If it was intentional, please write it
 another way that makes that clear!" - and abort with a compile error.
In the vast majority of cases, yes, I agree. But I've seen good ideas for useful heads-ups the compiler *could* provide get shot down in favor of silence, because making them errors would, indeed, be a pedantic pain. As I see it, an argument against warnings is an argument against lint tools. And lint messages are *less* likely to get heeded, because the user has to actually go ahead and bother to install and run them.
 That puts me strongly in the philosophy of "Code containing warnings:
 Allowed while compiling, disallowed when committing (with allowances
 for mitigating circumstances)."
I'm on the fence about the former. My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code.
I find that turning anything too pedantic into an outright error will *seriously* get in my way and break my workflow on the task at hand when I'm dealing with refactoring, debugging, playing around with an idea, etc., if I'm required to compulsively "clean them all up" at every little step along the way (it'd be like working with my mother hovering over my shoulder...). And that's been the case even for things I would normally want to be informed of. Dead/unreachable code and unused variables are two examples that come to mind.
 The problem is that
 it's not enforced by the compiler, so *somebody* somewhere will
 inevitably bypass it.
I never understood the "Some people ignore it, therefore it's good to remove it and prevent anyone else from ever benefiting" line of reasoning. I don't want all the "caution" road signs ("stop sign ahead", "hidden driveway", "speed limit decreases ahead", etc.) ripped out of the ground and tossed just because there are some jackasses who ignore them and cause trouble. Bad things happen when people ignore road signs, and they do ignore road signs, therefore let's get rid of road signs. That wouldn't make any shred of sense, would it?

It's the same thing here: I'd rather have somebody somewhere bypass that enforcement than render EVERYONE completely unable to benefit from it, ever. When the compiler keeps silent about a code smell instead of emitting a warning, that's exactly the same as emitting a warning but *requiring* that *everybody* *always* ignores it. "Sometimes" missing a heads-up is better than "always" missing it.
 C/C++ doesn't demonstrate that warnings are doomed to be useless and
 "always" ignored. What it demonstrates is that warnings are NOT an
 appropriate strategy for fixing language problems.
Point. I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work.
So nobody else should have the opportunity to benefit from them? Because that's what the alternative is. As soon as we buy into the "error" vs "totally ok" false dichotomy, we start hitting (and this is exactly what did happen in D many years ago) cases where a known code smell is too pedantic to be justifiable as a build-breaking error. So if we buy into the "error/ok" dichotomy, those code smells are forced into the "A-Ok!" bucket, guaranteeing that nobody benefits. Those "X doesn't fit into the error vs ok dichotomy" realities are exactly why DMD wound up with a set of warnings despite Walter's philosophical objections to them.
 That's why my eventual conclusion is that anything short of enforcement
 will ultimately fail. Unless there is no way you can actually get an
 executable out of badly-written code, there will always be *somebody*
 out there that will write bad code. And by Murphy's Law, that somebody
 will eventually be someone in your team, and chances are you'll be the
 one cleaning up the mess afterwards.  Not something I envy doing (I've
 already had to do too much of that).
And when I am tasked with cleaning up that bad code, I *really* hope it's from me being the only one to read the warnings, and not because I just wasted the whole day tracking down some weird bug only to find it was caused by something the compiler *could* have warned me about, but chose not to because the compiler doesn't believe in warnings out of fear that somebody, somewhere might ignore it.
May 11
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/11/2017 10:20 PM, Nick Sabalausky (Abscissa) wrote:
 On 05/10/2017 02:28 AM, H. S. Teoh via Digitalmars-d wrote:
 I'm on the fence about the former.  My current theory is that being
 forced to write "proper" code even while refactoring actually helps the
 quality of the resulting code.
I find anything too pedantic to be an outright error will *seriously* get in my way and break my workflow on the task at hand when I'm dealing with refactoring, debugging, playing around with an idea, etc., if I'm required to compulsively "clean them all up" at every little step along the way
Another thing to keep in mind is that deprecations are nothing more than a special type of warning. If code must be either "error" or "non-error" with no in-between, then that rules out deprecations. They would be forced to either become fatal errors (thus defeating the whole point of keeping an old symbol around marked as deprecated) or go away entirely.
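In D that in-between state is spelled right into the language; a tiny sketch (names made up):

	// The old symbol stays available, but every use is flagged.
	deprecated("use newApi instead")
	void oldApi() {}

	void newApi() {}

	void main()
	{
		oldApi(); // compiles; dmd prints a deprecation message
	}

By default dmd merely prints the message; -de turns deprecations into hard errors and -d silences them entirely - exactly the error/warning middle ground being discussed.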
May 11
prev sibling parent Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Wednesday, 10 May 2017 at 01:19:08 UTC, Nick Sabalausky 
(Abscissa) wrote:
 The moral of this story: Sometimes, breaking people's code is 
 GOOD! ;)
I don't get the hate that compiler warnings get in the D community. Sure, you can disable them if you don't care, but then don't complain about C being inherently unsafe and bug-prone while praising D for breaking things.

Uninitialized variables are an example of a check that I think does not need to be a language feature: if the compiler can prove the usage is sound, everything is fine. The compiler has much deeper knowledge of the concrete case than static language rules do. If the analysis fails, issue a warning. Usually the problematic code is far from obvious and refactoring is a good idea. If the programmer still thinks that no action is needed, just suppress that warning with a pragma.
May 10
prev sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}

 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning. Side note: scope(exit) is one of the best inventions in PLs ever.
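For reference, a minimal sketch of that pattern in D, using plain malloc/free as stand-in resources (names made up):

	import core.stdc.stdlib : free, malloc;

	bool myfunc(size_t n)
	{
		void* buf1 = malloc(n);
		if (!buf1) return false;
		scope(exit) free(buf1); // runs on *every* exit path below

		void* buf2 = malloc(n);
		if (!buf2) return false;
		scope(exit) free(buf2); // cleanups run in reverse order

		// ... actual work; an early return or a thrown exception
		// can no longer leak either buffer.
		return true;
	}

	void main()
	{
		assert(myfunc(64));
	}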
May 09
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
 
 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point. Even when you're consciously trying to write "proper" C code, there are just far too many ways things can go wrong, so slip-ups are practically inevitable. Eventually, the idiom that I (and others) converged on looks something like this:

	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
		void *resource1, *resource2, *resource3;
		int ret = RET_ERROR;

		/* Vet arguments */
		if (!blah || !bleh || !bluh)
			return ret;

		/* Acquire resources */
		resource1 = acquire_resource(blah->blah);
		if (!resource1) goto EXIT;

		resource2 = acquire_resource(bleh->bleh);
		if (!resource1) goto EXIT;

		resource3 = acquire_resource(bluh->bluh);
		if (!resource1) goto EXIT;

		/* Do actual work */
		if (do_step1(blah, resource1) == RET_ERROR)
			goto EXIT;

		if (do_step2(blah, resource1) == RET_ERROR)
			goto EXIT;

		if (do_step3(blah, resource1) == RET_ERROR)
			goto EXIT;

		ret = RET_OK;
	EXIT:
		/* Cleanup everything */
		if (resource3) release(resource3);
		if (resource2) release(resource2);
		if (resource1) release(resource1);

		return ret;
	}

In other words, we just converged to what essentially amounts to a try-catch block with the manual equivalent of RAII. After a while, this is pretty much the only way to have any confidence at all that there aren't any hidden resource leaks or other such errors in the code.

Of course, this is only the first part of the equation. There's also managing buffers and arrays safely, which still needs to be addressed. We haven't quite gotten there yet, but at least some of the code has now started moving away from C standard library string functions completely, and towards a common string buffer type where you work with a struct wrapper with functions for appending data, extracting the result, etc. IOW, we're slowly reinventing a properly-encapsulated string type that's missing from the language.

So eventually, after so much effort chasing down pointer bugs, buffer overflows, resource leaks, and the rest of C's endless slew of pitfalls, we're gradually reinventing RAII, try-catch blocks, and string types all over again. It's like historians are repeating each other^W^W^W^W^W I mean, history is repeating itself. :-D

It makes me wonder how much longer it will take for us to gradually converge onto features that today we're enjoying in D. Will it take another decade of segfaults, untraceable pointer bugs, security holes, and memory leaks? Who knows. I'm secretly hoping that between now and then, D finally takes off and we can finally shed this dinosaur-age language that should have died after the 70's, or the 80's at the latest, yet still persists to this day.
 Side note: scope(exit) is one of the best inventions in PLs ever.
Ironically, D has gone so far past the woes that still plague C coders every day that scope(exit) is hardly ever used in D anymore. :-P It has its uses, certainly, but in my regular D code, I'm already benefitting so much from D's other features that I can hardly think of a use case for scope(exit) anymore, in the context of idiomatic D. I do regularly find myself wishing for scope(exit) in my C code, though! T -- Век живи - век учись. А дураком помрёшь.
May 09
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 05/09/2017 10:26 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via 
Digitalmars-d wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
 		strncpy(tmp, desc->data1, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}

 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point.
I caught that too but I thought you were testing whether we were listening. ;)
 Eventually, the idiom that I (and others) eventually converged on looks
 something like this:

 	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
 		void *resource1, *resource2, *resource3;
 		int ret = RET_ERROR;

 		/* Vet arguments */
 		if (!blah || !bleh || !bluh)
 			return ret;

 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;
Copy paste error! :p (resource1 should be resource2.)
 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;
Ditto.
 		/* Do actual work */
 		if (do_step1(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step2(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step3(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		ret = RET_OK;
 	EXIT:
 		/* Cleanup everything */
 		if (resource3) release(resource3);
 		if (resource2) release(resource2);
 		if (resource1) release(resource1);

 		return ret;
 	}
As an improvement, consider hiding the checks and the goto statements in macros:

	resource2 = acquire_resource(bleh->bleh);
	exit_if_null(resource2);

	err = do_step2(blah, resource1);
	exit_if_error(err);

Or something similar... Obviously, it requires a certain standardization, like functions never containing a raw goto statement yet all having an EXIT area, etc. It makes C code very uniform, which is a good thing, as you notice nonstandard idioms quickly.

This safer style of doing everything in two-line steps is one of the reasons why I became convinced that exceptions are superior to return codes.

Ali
May 10
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 04:38:48AM -0700, Ali ehreli via Digitalmars-d wrote:
 On 05/09/2017 10:26 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d
wrote:
 On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
[...]
 		strncpy(tmp, desc->data2, bufsz);
 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
 		{
 			fclose(fp);
 			unlink("blah");
 			return IO_ERROR;
 		}
I think you cause a memory leak in these branches because you forget to free tmp before returning.
Well, there ya go. Case in point.
I caught that too but I thought you were testing whether we were listening. ;)
Haha, I guess I'm not as good of a C coder as I'd like to think I am. :-D [...]
 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;
Copy paste error! :p (resource1 should be resource2.)
 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;
Ditto.
Ouch. Ouch. :-D But then again, I've actually seen similar copy-paste errors in real code before, too. Sometimes they could be overlooked for >5 years (I kid you not, I actually checked the date in svn blame / svn log). [...]
 As an improvement, consider hiding the checks and the goto statements
 in macros:
 
     resource2 = acquire_resource(bleh->bleh);
     exit_if_null(resource2);
 
     err = do_step2(blah, resource1);
     exit_if_error(err);
 
 Or something similar... Obviously, it requires certain standardization
 like functions never having a goto statement, yet all having an EXIT
 area, etc.  It makes C code very uniform, which is a good thing as you
 notice nonstandard idioms quickly.
Yes, eventually this is the only sane and consistent way to deal with these problems. Unfortunately, in C this can only be done by convention, which means that some non-conforming code will inevitably slip through and cause havoc.

Also, this starts running dangerously near the slippery slope down into macro hell, where the project accretes its own idiomatic set of inscrutable macro usage conventions and eventually almost all of the C syntax has disappeared and the code no longer looks like C. Then along comes New Recruit, and he makes a right mess with it because he doesn't understand the 15-level-deep nested macros in the global include/macros.h file that's become a 5200-line monstrosity of unreadable CPP hacks.

(Also not exaggerating: the very project I'm working on has a module that's written this way, and only the initiated dare dream of fixing bugs in those macros. Fortunately, they have not yet nested to 15 levels deep, so for the most part you just copy and paste existing working code and pray that it will Just Work by analogy. Actually understand what you just wrote? Pfeh! You don't have time for that. The customer wants the release by last week. Copy-n-paste cargo cult FTW!)
 This safer style of doing everything in two-line steps is one of
 the reasons why I became convinced that exceptions are superior
 to return codes.
[...] Yeah, once practically every single statement in your function is an if-statement checking for error codes, you start wondering, why can't the language abstract this nasty boilerplate away for me?! And then the need for exceptions becomes clear. T -- Written on the window of a clothing store: No shirt, no shoes, no service.
May 10
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 10 May 2017 at 17:51:38 UTC, H. S. Teoh wrote:
 Haha, I guess I'm not as good of a C coder as I'd like to think 
 I am. :-D
That comment puts you ahead of the pack already :)
May 11
prev sibling parent reply Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Wednesday, 10 May 2017 at 05:26:11 UTC, H. S. Teoh wrote:
 	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
 		void *resource1, *resource2, *resource3;
 		int ret = RET_ERROR;

 		/* Vet arguments */
 		if (!blah || !bleh || !bluh)
 			return ret;

 		/* Acquire resources */
 		resource1 = acquire_resource(blah->blah);
 		if (!resource1) goto EXIT;

 		resource2 = acquire_resource(bleh->bleh);
 		if (!resource1) goto EXIT;

 		resource3 = acquire_resource(bluh->bluh);
 		if (!resource1) goto EXIT;

 		/* Do actual work */
 		if (do_step1(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step2(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		if (do_step3(blah, resource1) == RET_ERROR)
 			goto EXIT;

 		ret = RET_OK;
 	EXIT:
 		/* Cleanup everything */
 		if (resource3) release(resource3);
 		if (resource2) release(resource2);
 		if (resource1) release(resource1);

 		return ret;
 	}
In modern C and with GLib (which makes use of a gcc/clang extension) you can write this as:

	gboolean myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
		/* Cleanup everything automatically at the end */
		g_autoptr(GResource) resource1 = NULL, resource2 = NULL, resource3 = NULL;
		gboolean ok;

		/* Vet arguments */
		g_return_val_if_fail(blah != NULL, FALSE);
		g_return_val_if_fail(bleh != NULL, FALSE);
		g_return_val_if_fail(bluh != NULL, FALSE);

		/* Acquire resources */
		ok = acquire_resource(resource1, blah->blah);
		g_return_val_if_fail(ok, FALSE);

		ok = acquire_resource(resource2, bleh->bleh);
		g_return_val_if_fail(ok, FALSE);

		ok = acquire_resource(resource3, bluh->bluh);
		g_return_val_if_fail(ok, FALSE);

		/* Do actual work */
		ok = do_step1(blah, resource1);
		g_return_val_if_fail(ok, FALSE);

		ok = do_step2(blah, resource1);
		g_return_val_if_fail(ok, FALSE);

		return do_step3(blah, resource1);
	}

Some random example of this style of coding: https://github.com/flatpak/flatpak/blob/master/common/flatpak-db.c
May 10
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, May 10, 2017 at 12:34:05PM +0000, Guillaume Boucher via Digitalmars-d
wrote:
[...]
 In modern C and with GLib (which makes use of a gcc/clang extension) you can
 write this as:
 
 gboolean myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
         /* Cleanup everything automatically at the end */
         g_autoptr(GResource) resource1 = NULL, resource2 = NULL, resource3 = NULL;
         gboolean ok;
 
         /* Vet arguments */
         g_return_val_if_fail(blah != NULL, FALSE);
         g_return_val_if_fail(bleh != NULL, FALSE);
         g_return_val_if_fail(bluh != NULL, FALSE);
 
         /* Acquire resources */
         ok = acquire_resource(resource1, blah->blah);
         g_return_val_if_fail(ok, FALSE);
 
         ok = acquire_resource(resource2, bleh->bleh);
         g_return_val_if_fail(ok, FALSE);
 
         ok = acquire_resource(resource3, bluh->bluh);
         g_return_val_if_fail(ok, FALSE);
 
         /* Do actual work */
         ok = do_step1(blah, resource1);
         g_return_val_if_fail(ok, FALSE);
 
         ok = do_step2(blah, resource1);
         g_return_val_if_fail(ok, FALSE);
 
         return do_step3(blah, resource1);
 }
Yes, this would address the problem somewhat, but the problem is again, this is programming by convention. The language doesn't enforce that you have to write code this way, and because it's not enforced, *somebody* will ignore it and write things the Bad Ole Way. You're essentially writing in what amounts to a subdialect of C using Glib idioms, and that's not a bad thing in and of itself. But the larger language that includes all the old unsafe ways of writing code is still readily accessible. By Murphy's Law, somebody will eventually write something that breaks the idiom and causes problems.

Also, because this way of writing code is not part of the language, the compiler cannot verify that you're using the macros correctly. And it cannot verify that you didn't write goto labels or other things that might conflict with the way the macros are implemented. Lack of hygiene in C macros does not help in this respect.

I don't dispute that there are ways of writing correct (or mostly correct) C code. But the problem is that these ways of writing correct C code are (1) non-obvious to someone not already in the know, and so you will always have people who either don't know about them or aren't sufficiently well-versed in them to use them effectively; and (2) not statically enforceable, because they are not a part of the language. Lack of enforcement, in the long run, can only end in disaster, because programming by convention does not work. It works as long as the convention is kept, but humans are fallible, and we all know how good humans are at keeping conventions over a sustained period of time (or even just short periods of time).

Not even D is perfect in this regard, but it has taken significant steps in the right directions. Correct-by-default (well, for the most part anyway, barring compiler bugs / spec issues) and static guarantees (verified by the compiler -- again barring compiler bugs) are major steps forward.

Ultimately, I'm unsure how far a language can go at static guarantees: I think somewhere along the line human error will still be unpreventable, because you start running into the halting problem when verifying certain things. But I certainly think there's still a LOT that can be done by the language between here and there, much more than what we have today.

T -- Mediocrity has been pushed to extremes.
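P.S. To make the "statically enforced" point concrete, a minimal D sketch:

	@safe int first(int[] a)
	{
		// return *(a.ptr + 1); // Error: not allowed in @safe code
		return a[0];            // bounds-checked access is fine
	}

	void main() @safe
	{
		assert(first([42, 7]) == 42);
	}

The unsafe variant isn't a warning to ignore; it simply does not compile unless you explicitly drop into @system/@trusted code, which keeps the part that needs auditing small.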
May 10
prev sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
May 09
next sibling parent reply Adrian Matoga <dlang.spam matoga.info> writes:
On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+1
May 09
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/09/2017 06:29 AM, Adrian Matoga wrote:
 On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+1
+ a kajillion (give or take a few hundred)
May 09
prev sibling parent Martin Tschierschke <mt smartdolphin.de> writes:
On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
 On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
 On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via 
 Digitalmars-d wrote:
 [...]
Is that a subtle joke, or are you being serious? [...]
Heh, I saw you wrote the post and knew it would be long, then I kept scrolling and scrolling... :) Please, please, please submit this as a post on the D blog, perhaps prefaced by the Walter/Scott exchange and with some links to the issues you mention and the relevant portions of the D reference. I think it would do well.
+=1; Yes, good idea!
May 09
prev sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Monday, May 08, 2017 23:15:12 H. S. Teoh via Digitalmars-d wrote:
 Recently I've had the dubious privilege of being part of a department
 wide push on the part of my employer to audit our codebases (mostly C,
 with a smattering of C++ and other code, all dealing with various levels
 of network services and running on hardware expected to be "enterprise"
 quality and "secure") and fix security problems and other such bugs,
 with the help of some static analysis tools. I have to say that even
 given my general skepticism about the quality of so-called "enterprise"
 code, I was rather shaken not only to find lots of confirmation of my
 gut feeling that there are major issues in our codebase, but even more
 by just HOW MANY of them there are.
In a way, it's amazing how successful folks can be with software that's quite buggy. A _lot_ of software works just "well enough" that it gets the job done but is actually pretty terrible. And I've had coworkers argue to me before that writing correct software really doesn't matter - it just has to work well enough to get the job done. And sadly, to a great extent, that's true.

However, writing software that works just "well enough" does come at a cost, and if security is a real concern (as it increasingly is), then that sort of attitude is not going to cut it. But since the cost often comes later, I don't think that it's at all clear that we're going to really see a shift towards languages that prevent such bugs. Up front costs tend to have a powerful impact on decision making - especially when the cost that could come later is theoretical rather than guaranteed.

Now, given that D is also a very _productive_ language to write in, it stands to reduce up front costs as well, and that combined with its ability to reduce the theoretical security costs, we could have a real win. But with how entrenched C and C++ are and how much many companies are geared towards not caring about security or software quality so long as the software seems to get the job done, I think that it's going to be a _major_ uphill battle for a language like D to really gain mainstream use on anywhere near the level that languages like C and C++ have. But for those who are willing to use a language that makes it harder to write code with memory safety issues, there's a competitive advantage to be gained.

- Jonathan M Davis
May 11
next sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/11/2017 11:53 AM, Jonathan M Davis via Digitalmars-d wrote:
 In a way, it's amazing how successful folks can be with software that's
 quite buggy. A _lot_ of software works just "well enough" that it gets the
 job done but is actually pretty terrible. And I've had coworkers argue to me
 before that writing correct software really doesn't matter - it just has to
 work well enough to get the job done. And sadly, to a great extent, that's
 true.

 However, writing software that works just "well enough" does come at a
 cost, and if security is a real concern (as it increasingly is), then that
 sort of attitude is not going to cut it. But since the cost often comes
 later, I don't think that it's at all clear that we're going to really see a
 shift towards languages that prevent such bugs. Up front costs tend to have
 a powerful impact on decision making - especially when the cost that could
 come later is theoretical rather than guaranteed.

 Now, given that D is also a very _productive_ language to write in, it
 stands to reduce up front costs as well, and that combined with its ability
 to reduce the theoretical security costs, we could have a real win, but with
 how entrenched C and C++ are and how much many companies are geared towards
 not caring about security or software quality so long as the software seems
 to get the job done, I think that it's going to be a _major_ uphill battle
 for a language like D to really gain mainstream use on anywhere near the
 level that languages like C and C++ have. But for those who are willing to
 use a language that makes it harder to write code with memory safety issues,
 there's a competitive advantage to be gained.
All very, unfortunately, true. It's like I say, the tech industry isn't engineering, it's fashion. There is no meritocracy here, not by a long shot. In tech: What's popular is right and what's right is popular, period.
May 11
prev sibling parent reply Laeeth Isharc <laeethnospam nospam.laeeth.com> writes:
On Thursday, 11 May 2017 at 15:53:40 UTC, Jonathan M Davis wrote:
 On Monday, May 08, 2017 23:15:12 H. S. Teoh via Digitalmars-d 
 wrote:
 Recently I've had the dubious privilege of being part of a 
 department wide push on the part of my employer to audit our 
 codebases (mostly C, with a smattering of C++ and other code, 
 all dealing with various levels of network services and 
 running on hardware expected to be "enterprise" quality and 
 "secure") and fix security problems and other such bugs, with 
 the help of some static analysis tools. I have to say that 
 even given my general skepticism about the quality of 
 so-called "enterprise" code, I was rather shaken not only to 
 find lots of confirmation of my gut feeling that there are 
 major issues in our codebase, but even more by just HOW MANY 
 of them there are.
In a way, it's amazing how successful folks can be with software that's quite buggy. A _lot_ of software works just "well enough" that it gets the job done but is actually pretty terrible. And I've had coworkers argue to me before that writing correct software really doesn't matter - it just has to work well enough to get the job done. And sadly, to a great extent, that's true. However, writing software that's works just "well enough" does come at a cost, and if security is a real concern (as it increasingly is), then that sort of attitude is not going to cut it. But since the cost often comes later, I don't think that it's at all clear that we're going to really see a shift towards languages that prevent such bugs. Up front costs tend to have a powerful impact on decision making - especially when the cost that could come later is theoretical rather than guaranteed. Now, given that D is also a very _productive_ language to write in, it stands to reduce up front costs as well, and that combined with its ability to reduce the theoretical security costs, we could have a real win, but with how entrenched C and C++ are and how much many companies are geared towards not caring about security or software quality so long as the software seems to get the job done, I think that it's going to be a _major_ uphill battle for a language like D to really gain mainstream use on anywhere near the level that languages like C and C++ have. But for those who are willing to use a language that makes it harder to write code with memory safety issues, there's a competitive advantage to be gained. - Jonathan M Davis
D wasn't ready for mainstream adoption until quite recently, I think. The documentation for Phobos when I started looking at D in 2014 was perfectly clear if you were more theoretically minded, but not for other people. In a previous incarnation I tried to get one trader who writes Python to look at D, and he was terrified of it because of the docs. And I used to regularly have compiler crashes, and ldc was always too far behind dmd. If you wanted to find commercial users, there didn't seem to be very many, and it was hard to point to successful projects in D that people would have heard of or could recognise - at least not enough of them. Perception has threshold effects and isn't linear. There wasn't that much on the numerical front either. The D Foundation didn't exist, and Andrei played superhero in his spare time.

All that's changed now in every respect. I can point to the documentation and say we should have docs like that, with runnable tests/examples. Most code builds fine with ldc, there are plenty of numerical libraries - thanks Ilya - and perception is quite different about commercial successes. Remember that what's really just incremental in reality can be a step change in perception.

I don't think the costs of adopting D are tiny upfront. Putting aside the fact that people expect better IDE support than we have, and that we have quite frequent releases (not a bad thing, but it's where we are in maturity) of which some are a bit unfinished and others break things for good reasons, build systems are not that great even for middling projects (200k sloc). Dub is an amazing accomplishment for Sönke as one of many part-time projects, but it's not yet that mature as a build tool. We have extern(C++), which is great, and no other language has it. But that's not the same thing as saying it's trivial to use a C++ library from D (and I don't think it's yet mature bugwise). No STL yet. Even for C, compare the steps involved vs LuaJIT FFI. Dstep is a great tool, but not without some friction, and it only works for C. So one should expect to pay a price with all of this, and I think most of the price is upfront (also because you might want to wrap the libraries you use most often). And the price is paid by having to deal with things people often take for granted, so even if it's small in the scheme of things, it's more noticeable.

A community needs energy coming into it to grow, but too quick an influx of newcomers wouldn't be good either. Eg if dconf were twice the size, it would be a very different experience, and not only in a positive way.

I think new things often grow not by taking the dominant player head on, but by growing in the interstices. By taking hold in obscure niches nobody cares about, you gain the power to take on bigger niches, and over time it turns out some of those niches weren't so unimportant after all. It's a positive for the health of D that it's dismissed and yet keeps growing; just imagine if Stroustrup had had a revelation, written a memo "the static if tidal wave" (BG 1995), persuaded the committee to deprecate all the features and mistakes that hold C++ back, and stolen all of D's best features in a single language release. A challenger language doesn't want all people to take it seriously, because it doesn't have the strength to win a direct contest. It just needs more people to take it seriously.
The best measure of the health of the language and its community might be: are more people using the language to get real work done, is it helping them do so more or less than before, and what is the quality of the new people becoming involved? If those things are positive and external conditions are favourable, then I think it bodes well for the future.

And by external conditions I mean that people have gotten used to squandering performance and users' time - see Jonathan Blow on Photoshop, for example. If you have an abundance of a resource and keep squandering it, eventually you will run out of abundance. Storage prices are collapsing, data sets are growing, Moore's Law isn't what it was, and even with dirt cheap commodity hardware it's not necessarily the case that one is I/O bound any more. An NVMe drive does 2.5 GB/sec, and we are happy when we can parse JSON at 200 MB/sec. People who misquote Knuth seem to write slow code, and life is too short to be waiting unnecessarily. At some point people get fed up with slow code.

Maybe it's wrong to think about there being one true inheritor of the mantle of C and C++. Maybe no new language will gain the market share that C has, and if so that's probably a good thing. Mozilla probably never had any moments when they woke up and thought hmm, maybe we should have used Go instead, and I doubt people writing network services think maybe Rust would have been better.

I said to Andrei at dconf that principals rather than agents are much more likely to be receptive towards the adoption of D. If you take an unconventional decision and it doesn't work out, you look doubly stupid - it didn't work out, and on top of that nobody else made that mistake: what were you thinking? So by far the best strategy - unless you're in a world of pain, and desperate for a way out - is to copy what everyone else is doing. But if you're a principal - ie in some way an owner of a business - you haven't got the luxury of fooling yourself, not if you want to survive and flourish. The buck stops here, so it's a risk to use D, but it's also a risk not to use D - you can't pretend the conventional wisdom is without risk when it may not suit the problem that's before you. And it's your problem today and it's still your problem tomorrow, and that leads to a different orientation towards the future than being a cog in a vast machine where the top guy is measured by whether he beats earnings next quarter.

The web guys do have a lot of engineers, but they have an inordinate influence on the culture. Lots more code gets written in enterprises, and you never hear about it because it's proprietary and people aren't allowed to or don't have time to discuss it. And maybe it's simply not even interesting to talk about, which doesn't mean it's not interesting to you, and economically important.

D covers an enormous surface area - a much larger potential domain set than Go or Rust. Things are more spread out, hence the amusing phenomenon on Reddit and the like of people thinking that because they personally don't know anyone that uses D, nothing is happening and adoption isn't growing. So assessing things by adoption within the niches where people are chatty is interesting but doesn't tell you much. I don't think most users post on the forum much. It's only the subset of people who, for intrinsic or instrumental reasons, like posting on the forum that do.
So if I am right about the surface area and the importance of principals then you should over time see people popping up from areas you had never thought of that have the power to make decisions and trust their own judgement because they have to. That's how you know the language is healthy - that they start using D and enough of them have success with it. Liran at Weka had never heard of D not long before he based his company on it. I had never imagined a ship design company might use Extended Pascal, let alone that D might be a clearly sensible option for automated code conversion and be a great fit for new code.

And I am sure Walter is right about the importance of memory safety. But outside of certain areas D isn't in a battle with Rust; memory safety is one more appealing modern feature of D. To say it's important to get it right isn't to say it has to defeat Rust. Not that you implied this, but some people at dconf seemed to implicitly think that way.

Laeeth
May 11
next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Friday, May 12, 2017 04:08:52 Laeeth Isharc via Digitalmars-d wrote:
 And I am sure Walter is right about the importance of memory
 safety.  But outside of certain areas D isn't in a battle with
 Rust; memory safety is one more appealing modern feature of D.
 To say it's important to get it right isn't to say it has to
 defeat Rust. Not that you implied this, but some people at dconf
 seemed to implicitly think that way.
I think that we're far past the point that any language is going to beat everyone else out. Some languages will have higher usage than others, and it's going to vary quite a lot between different domains. Really, it's more of a question of whether a language can compete well enough to be relevant and be used by a lot of developers, not whether it's used by most developers. For instance, D and Go are clearly languages that appeal to a vastly different set of developers, and while they do compete on some level, I think that they're ultimately just going to be used by very different sets of people, because they're just too different (e.g. compare Go's complete lack of generics with D's templates). Rust, on the other hand, seems to have a greater overlap with D, so there's likely to be greater competition there (certainly more competition with regards to replacing C++ in places where C++ is replaced), but they're still going to appeal to different sets of developers to an extent, just like C++ and D have a lot of overlap but don't appeal to the same set of developers.

I fully expect that both Rust and D have bright futures, but I also don't really expect either to become dominant. That's just too hard for a language to do, especially since older languages don't really seem to go away. The programming language ecosystem just becomes more diverse. At most, a language is dominant in a particular domain, not the software industry as a whole.

I would love for D to become a serious player in the programming language space such that you see D jobs out there like we currently see C/C++ or Java jobs out there (right now, as I understand it, even Sociomantic Labs advertises for C++ programmers, not D programmers). But ultimately, what I care about is being able to use D when I program and have enough of an ecosystem around it that there are useful libraries and frameworks that I can use and build upon, because D is the language that I prefer and want to program in. Having D destroy C/C++ or Java or C# or Rust or whatever really isn't necessary for that. It just needs to become big enough that it has a real presence, whereas right now, it seems more like the folks who use it professionally are doing so in stealth mode (even if they're not doing so purposefully). Anyone who wants to get a job somewhere and work in D is usually going to have a hard time of it right now, even though such jobs do exist. As it stands, I think a relatively small percentage of D's contributors are able to use D for their day jobs. And if we can really change _that_, then we'll have gotten somewhere big, regardless of what happens with other languages.

- Jonathan M Davis
May 12
prev sibling parent Ola Fosheim Grøstad writes:
On Friday, 12 May 2017 at 04:08:52 UTC, Laeeth Isharc wrote:
 build tool.  We have extern(C++) which is great, and no other 
 language has it.
Objective-C++/Swift.
 Maybe it's wrong to think about there being one true inheritor 
 of the mantle of C and C++.  Maybe no new language will gain 
 the market share that C has, and if so that's probably a good 
 thing.  Mozilla probably never had any moments when they woke 
 up and thought hmm maybe we should have used Go instead, and I 
 doubt people writing network services think maybe Rust would 
 have been better.
Yes, I think this is right, although C++ is taking over more and more of C's space. But there are still niches where C++ has a hard time going and C still dominates. The problem is, of course, that fewer and fewer software projects benefit from what C offers...
 But if you're a principal - ie in some way an owner of a 
 business - you haven't got the luxury of fooling yourself, not 
 if you want to survive and flourish.  The buck stops here, so 
 it's a risk to use D, but it's also a risk not to use D - you 
 can't pretend the conventional wisdom is without risk when it 
 may not suit the problem that's before you. And it's your 
 problem today and it's still your problem tomorrow, and that 
 leads to a different orientation towards the future than being 
 a cog in a vast machine where the top guy is measured by 
 whether he beats earnings next quarter.
I don't really think all that many principals make such decisions without pressure from the engineers in the organization, unless it is for going with some big league name... In general, many leaders have been burned by using tooling from companies that have folded or by not being able to fix issues. Which is a very good reason for going with the safe and well known. Most projects have enough uncertainty factors already, so adding an extra uncertainty factor in the tooling is usually not the right choice.
 The web guys do have a lot of engineers but they have an 
 inordinate influence on the culture.  Lots more code gets
Right, the web guys adopt bleeding edge tech like crazy, because the risk is low. The projects are small and they can start over with a new tech on the next project in a few months. They don't have to plan for sticking with the same tooling for years and years.
 And I am sure Walter is right about the importance of memory 
 safety.  But outside of certain areas D isn't in a battle with 
 Rust; memory safety is one more appealing modern feature of D.  
 To say it's important to get it right isn't to say it has to 
 defeat Rust. Not that you implied this, but some people at 
 dconf seemed to implicitly think that way.
Well, memory safety isn't a modern feature at all actually. Most languages provide it, C is a notable exception...
May 12
prev sibling next sibling parent reply John Carter <john.carter taitradio.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer. I think you will find everything that really matters and is internet facing has been run up under a tool like that. They are truly wonderful power tools... with the limitation that they are run time. ie. If you don't run that line of code... they won't tell you if you have it wrong.

Index out of bounds exceptions are great... but the elements of Walter's talk where bugs are banished at compile time are more compelling.

Now if we can get to the point where there is no undefined behaviour in any safe code... that would be a major step forward.

Languages like Ruby are memory safe... but they are written in C and hence have a very long catalog of bugs found and fixed in the interpreter and supporting libraries. D has the interesting promise of being memory safe and the compiler and libraries being written in D.
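To make the "if you don't run that line of code" limitation concrete, here is a minimal C sketch of my own (not from the thread): the out-of-bounds write sits on an error path, so a run-time tool like valgrind has nothing to report until a test actually drives execution down that branch.

#include <stdlib.h>

/* Hypothetical example: the overflow only happens on the error path. */
static void record(char *buf, size_t len, int code)
{
    if (code < 0)
        buf[len] = '!';      /* writes one past the end, but only for negative codes */
    else
        buf[0] = '.';
}

int main(void)
{
    char *buf = malloc(8);
    if (!buf) return 1;
    record(buf, 8, 0);       /* happy path: valgrind reports nothing for this run */
    /* record(buf, 8, -1); would be the overflow - invisible until something runs it */
    free(buf);
    return 0;
}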
May 08
next sibling parent reply John Carter <john.carter taitradio.com> writes:
On Monday, 8 May 2017 at 20:55:02 UTC, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
Google makes my point for me.... https://opensource.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html
 Index out of bounds exceptions are great... but the elements of 
 Walter's talk where bugs are banished at compile time are more 
 compelling.

 Now if we can get to the point where there is no undefined 
 behaviour in any safe code... that would be a major step 
 forward.
May 08
parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/08/2017 08:42 PM, John Carter wrote:
 On Monday, 8 May 2017 at 20:55:02 UTC, John Carter wrote:
 C/C++ has been granted an extension of life by the likes of valgrind
 and purify and *-sanitizer.
Google makes my point for me.... https://opensource.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html
That reminds me, I've been thinking for a while: We need a dead-simple D/dub-ified tool to fuzz-test our D projects. Even if it's just a trivial wrapper and DUB package for an existing fuzz tester (heck, probably that's the right way to go anyway), we really should make fuzz testing just as common & easy a thing for D projects as doc-generation and unittests.
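For reference, a harness for LLVM's libFuzzer (one existing fuzzer such a wrapper could delegate to) is tiny. This C sketch uses a hypothetical parse_record function as the code under test; recent clang releases can build it with -fsanitize=fuzzer,address.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical function under test - stands in for whatever the
 * project wants fuzzed; a real harness would link the real parser. */
static int parse_record(const uint8_t *data, size_t len)
{
    return (len > 0 && data[0] == 'R') ? 0 : -1;
}

/* libFuzzer repeatedly calls this entry point with generated inputs;
 * any crash, hang, or sanitizer report becomes a saved test case. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    parse_record(data, size);
    return 0;   /* non-crashing inputs are simply uninteresting */
}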
May 08
prev sibling next sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 05/08/2017 04:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer. I think you will find everything that really matters and is internet facing has been run up under a tool like that.
Like Cloudflare and OpenSSL?
May 08
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools:

1. it isn't part of the language

2. it may not be available on your platform

3. somebody has to find it, install it, and integrate it into the dev/test process

4. it's incredibly slow to run valgrind, so there are powerful tendencies to skip it

valgrind is a stopgap measure, and has saved me much grief over the years, but it is NOT the solution.
May 09
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:

 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools: 1. it isn't part of the language 2. it may not be available on your platform 3. somebody has to find it, install it, and integrate it into the dev/test process 4. it's incredibly slow to run valgrind, so there are powerful tendencies to skip it valgrind is a stopgap measure, and has saved me much grief over the years, but it is NOT the solution.
And it doesn't catch everything. I had the case yesterday at work where one of the file converters that had been written 15 years ago suddenly crashed in production*. It came from an upstream bug in a script that filled one attribute with garbage. I tried to reproduce the bug in the development environment, and funnily it didn't crash with the newest version of the base library. The production library is one version behind. The garbage in the attribute triggered a buffer overflow in a fixed-size array (496 UTF-16 characters in a 200-character buffer). This converter is one of the last ones with fixed-size arrays.

The interesting observation was that valgrind catches the buffer overflow when linked with version 2.31 of the main library but is completely silent when using version 2.32. The changes in that library are minimal and in parts that have nothing to do with this app. It is solely the placement of variables in the data and the bss segments that changes. It is surprising to see such a big buffer overflow completely missed by valgrind.

TL;DR: valgrind does not always catch buffer overflows, especially if the memory overwritten is not in the heap but in the data or the bss segment. There it cannot add guard pages as it does on the heap.

* To give a little context: I work at the European Commission on the central translation memory system called Euramis (probably the biggest in the world, with more than a billion segments). The system is used intensively by all translators of all European institutions, and without it nothing would be possible. The issue with it is that the back end is written in C and the code goes back to 1990. Me and my colleagues managed to modernize the system and catch most of the code issues with intensive use of C99 idioms, the newest gcc and clang diagnostics, and also valgrind and such things.
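A reduced C illustration of that blind spot - my sketch, not the actual Euramis code: valgrind's memcheck puts redzones around heap blocks only, so an overflow of a file-scope array in the data segment goes unreported, whereas a build with -fsanitize=address instruments globals and reports it as a global-buffer-overflow.

#include <string.h>

static char attr[200];          /* fixed-size buffer in the data segment */

static void store_attr(const char *src)
{
    strcpy(attr, src);          /* no length check: oversized input overflows attr */
}

int main(void)
{
    char big[500];
    memset(big, 'x', sizeof big - 1);
    big[sizeof big - 1] = '\0';
    store_attr(big);            /* ~300 bytes past the end; memcheck stays silent */
    return 0;
}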
May 09
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, May 09, 2017 at 07:13:31AM -0700, Walter Bright via Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 
 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools: 1. it isn't part of the language
And it doesn't catch everything.
 2. it may not be available on your platform
And may not be compatible with your target architecture -- a lot of C code, especially in the embedded realm, have curious target archs that could be problematic for 3rd party tools that need to inject runtime instrumentation.
 3. somebody has to find it, install it, and integrate it into the
 dev/test process
This is a big one. Many large C projects have their own idiomatic way of building, which is often incompatible with 3rd party tools. This is a major demotivator for people to want to use those tools, because it's a big time investment to configure the tool to work with the build scripts, and an error-prone and painful process to rework the build scripts to work with the tool. "Why break our delicate tower-of-cards build system that's worked just fine for 20 years, just to run this new-fangled whatever 3rd party thingy promulgated by these young upstarts these days?"
 4. it's incredibly slow to run valgrind, so there are powerful
 tendencies to skip it
Yes, it's an extra step that the developer has to manually run, when he is already under an unreasonable deadline to meet an unreasonable customer request upon which hinges a $1M deal so you can't turn it down no matter how unreasonable it is. He barely has enough time to write code that won't crash outright, nevermind writing *proper* code. Yet another extra step to run manually? Forget it, not gonna happen. Not until a major crash happens on the customer's site that finally convinces the PTB to dictate the use of valgrind as a part of regular work routine. Other than that, the chances of someone actually bothering to do it are slim indeed.
 valgrind is a stopgap measure, and has saved me much grief over the
 years, but it is NOT the solution.
Yes, it's a patch over the current festering wound so that, at least for the time being, the blood is out of sight. But you can't wear that patch forever. Sooner or later the gangrene will be visible on the surface. :-D T -- Change is inevitable, except from a vending machine.
May 09
prev sibling next sibling parent reply Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tuesday, May 09, 2017 07:13:31 Walter Bright via Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
I agree. But one inevitably runs into problems relying on valgrind and other third party tools:
 2. it may not be available on your platform
The fact that it's not available on Windows is extremely annoying. Some tools do exist on Windows, but you have to pay for them, and in my experience, they don't work very well. And with my current job, they _definitely_ don't work, because we mix C++ and C# (via COM). Nothing seems to be able to handle that mixture properly, and it's _really_ hard to track down memory problems.
 4. it's incredibly slow to run valgrind, so there are powerful tendencies
 to skip it
There are cases where you literally _can't_ run it, because it's simply too slow. For instance, when dealing with live video from a camera, the odds are very high that under valgrind, the program won't be able to keep up. And if you're doing something like streaming 16 cameras at once (which happens in the security industry all the time), there's no way that it's going to work.

Valgrind is a fantastic tool, but saying that valgrind is enough is like saying that dynamic type checking is as good as compile-time type checking. It isn't, and it can't be. So, yes, valgrind can be a lifesaver, but preventing the bugs that it would find from even being possible is _far_ more valuable.

That being said, with the push for @nogc and the allocators and whatnot, we're then once again stuck needing to valgrind D code to catch bugs. It's still not as bad as C/C++, because the problems are much more restricted in scope, but avoiding the GC comes at a real cost. Atila commented at dconf that working with allocators in D code for the excel wrapper library he had worked on was like he was stuck in C++ again, with all of the memory problems that he had. @safe and the GC have _huge_ value.

- Jonathan M Davis
May 10
parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 10 May 2017 at 11:50:32 UTC, Jonathan M Davis wrote:
 On Tuesday, May 09, 2017 07:13:31 Walter Bright via 
 Digitalmars-d wrote:
 On 5/8/2017 1:55 PM, John Carter wrote:
 Atila commented at dconf that working with allocators in D code 
 for the excel wrapper library he had worked on was like he was 
 stuck in C++ again with all of the memory problems that he had. 
 @safe and the GC have _huge_ value.

 - Jonathan M Davis
Actually, it was worse than being back in C++ land: there I can use valgrind and address sanitizer. With D's allocators I was lost. I'd forgotten how much "fun" it was to print pointer values to the terminal to track down memory bugs. It's especially fun when you're on Windows, your code is in a DLL loaded by a program you don't control and DebugViewer is your only friend. Atila
May 10
prev sibling next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 2. it may not be available on your platform
I just had to use valgrind for the first time in years at work (mostly Python code there) and I realized that there's no version that works on the latest OS X version. So valgrind runs on about 2.5% of computers in existence. Fun!
May 11
parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 11 May 2017 at 21:20:35 UTC, Jack Stouffer wrote:
 On Tuesday, 9 May 2017 at 14:13:31 UTC, Walter Bright wrote:
 2. it may not be available on your platform
I just had to use valgrind for the first time in years at work (mostly Python code there) and I realized that there's no version that works on the latest OS X version. So valgrind runs on about 2.5% of computers in existence. Fun!
Use ASAN.
May 11
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2017-05-09 16:13, Walter Bright wrote:

 I agree. But one inevitably runs into problems relying on valgrind and
 other third party tools:

 1. it isn't part of the language

 2. it may not be available on your platform

 3. somebody has to find it, install it, and integrate it into the
 dev/test process

 4. it's incredibly slow to run valgrind, so there are powerful
 tendencies to skip it

 valgrind is a stopgap measure, and has saved me much grief over the
 years, but it is NOT the solution.
AddressSanitizer [1] is a tool similar to Valgrind which is built into the Clang compiler; just add an additional flag. It instruments the binary with the help of the compiler, so the execution speed will not be that much slower compared to a regular build. Clang also contains ThreadSanitizer [2], which is supposed to detect data races. [1] https://clang.llvm.org/docs/AddressSanitizer.html [2] https://clang.llvm.org/docs/ThreadSanitizer.html -- /Jacob Carlborg
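As a usage sketch (assuming a reasonably recent clang), the extra flag is all it takes; the hypothetical example below contains a use-after-free that an ASan-instrumented binary reports at run time with a full stack trace.

/* Build and run: clang -g -fsanitize=address uaf.c && ./a.out */
#include <stdlib.h>

int main(void)
{
    int *p = malloc(sizeof *p);
    if (!p) return 1;
    *p = 42;
    free(p);
    return *p;   /* heap-use-after-free: ASan aborts here with a report */
}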
May 12
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: Anything that goes on the internet.
https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.
May 11
next sibling parent Joakim <dlang joakim.fea.st> writes:
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: Anything that goes on the internet.
https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.
To be fair, if you're not on the internet, you're unlikely to get any files that will trigger that bug in Microsoft's malware checker, as they noted that they first saw it on a website on the internet. Of course, you could still get such files on a USB stick, which just highlights that unless you completely shut your computer off from the world, you can still get bitten, just more slowly and with fewer consequences than on the internet.

I wondered what that Project Zero topic had to do with Chromium; it turns out it's a security team that Google started three years ago to find zero-day holes in almost any software. That guy from the team also found the recently famous Cloudbleed bug that affected Cloudflare. They have a blog up that details holes they found in all kinds of stuff, security porn if you will: ;) https://googleprojectzero.blogspot.com
May 11
prev sibling parent Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
 https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a
vulnerability in an application that doesn't go on the internet.
This link got me thinking: When will we see the first class action lawsuit for criminal negligence for not catching a buffer overflow (or other commonly known bug) which causes identity theft or loss of data? Putting aside the moral questions, the people suing would have a good case, given the wide knowledge of these bugs and the availability of tools to catch/fix them. I think they could prove negligence/incompetence and win given the right circumstances. Would be an interesting question to pose to any managers who don't want to spend time on security.
May 11
prev sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
Hi, I think that comparing languages like D to C is not appropriate. C is a high level assembler and has different design goals. A useful document to refer to is: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1250.pdf

In particular (although note the addition of facet f, which echoes the sentiment that security is important):

Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. For the Cx1 revision there is consensus to add a new facet f to the original list of facets. The new spirit of C can be summarized in phrases like:

(a) Trust the programmer.
(b) Don't prevent the programmer from doing what needs to be done.
(c) Keep the language small and simple.
(d) Provide only one way to do an operation.
(e) Make it fast, even if it is not guaranteed to be portable.
(f) Make support for safety and security demonstrable.

Proverb e needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.

I think Linus Torvalds makes an important observation - he says in one of his talks that the reason he likes C is that when he writes C code he can visualize what the machine code will look like.

My feeling is that C has traditionally been used in contexts where probably it should not be used - i.e. as a general purpose application development language. But I don't see how languages like D or Rust can replace C for certain types of use cases.

Regards
Dibyendu
May 13
next sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 (a) Trust the programmer.
That's the first and most deadly mistake. Buffer overflows and null pointers alone have caused hundreds of millions of dollars of damages. I think we can say that this trust is misplaced.
 (b) Don't prevent the programmer from doing what needs to be 
 done.
In reality this manifests as "Don't prevent the programmer from doing anything, especially if they're about to shoot themself". See the code examples throughout this thread.
 (c) Keep the language small and simple.
 (d) Provide only one way to do an operation.
lol
 (f) Make support for safety and security demonstrable.
LOL http://article.gmane.org/gmane.comp.compilers.llvm.devel/87749
My conclusion is that C, and derivatives like C++, is a very dangerous language to write safety/correctness critical software in, and my personal opinion is that it is almost impossible to write *security* critical software in it.
(that's from the creator of clang btw)
 But I don't see how languages like D or Rust can replace C for 
 certain types of use cases.
Maybe you can argue for the use of C in embedded systems and in OS's, although I see no reason why Rust can't eventually overtake C there. However, much of the internet's security critical systems (openssl, openssh, DNS systems, router firmware) are in C, and if Google's Project Zero are any indication, they all have ticking time bombs in them. As I stated earlier in the thread, at some point, some company is going to get sued for criminal negligence for shipping software with a buffer overflow bug that caused a security breach. It almost happened with Toyota. The auto industry has a C coding convention for safety called MISRA C, and it was brought up in court as to why Toyota's acceleration problems were entirely their fault. You can bet this will be brought up again.
May 13
next sibling parent Ola Fosheim Grøstad writes:
On Sunday, 14 May 2017 at 01:30:47 UTC, Jack Stouffer wrote:
 It almost happened with Toyota. The auto industry has a C 
 coding convention for safety called MISRA C, and it was brought 
 up in court as to why Toyota's acceleration problems were 
 entirely their fault. You can bet this will be brought up again.
1. Changing language won't change this, for that you need something that is formally proven (and even that assumes that the requirements spec is correct). I found this book from 2012 on industry use of formal methods which seems to be available on Google Books: https://books.google.no/books?id=E5sdDs00MuwC 2. What good does it do you to have your source code proven formally correct if your compiler can contain bugs? To get around that you need a formally verified compiler: http://compcert.inria.fr/ So... We are back to C again.
May 14
prev sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 01:30:47 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 (a) Trust the programmer.
That's the first and most deadly mistake. Buffer overflows and null pointers alone have caused hundreds of millions of dollars of damages. I think we can say that this trust is misplaced.
I should have added that the C11 charter also says: <quote> 12. Trust the programmer, as a goal, is outdated in respect to the security and safety programming communities. While it should not be totally disregarded as a facet of the spirit of C, the C11 version of the C Standard should take into account that programmers need the ability to check their work. <endquote>

In real terms, though, tools like ASAN and Valgrind, if used from the start, usually allow you to catch most of the issues. Most likely even better tools for C will come about in time.
 But I don't see how languages like D or Rust can replace C for 
 certain types of use cases.
Maybe you can argue for the use of C in embedded systems and in OS's, although I see no reason why Rust can't eventually overtake C there.
I think Rust is a promising language but I don't know enough about it to comment. My impression of Rust is that:

a) Rust has a steep learning curve as a language.
b) If you want to do things that C allows you to do, then Rust is no safer than C.

Regards
May 14
parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 In real terms though tools like ASAN and Valgrind if used from 
 the start usually allow you to catch most of the issues. Most 
 likely even better tools for C will come about in time.
See Walter's comment earlier in this thread and my reply.
 I think Rust is a promising language but I don't know enough 
 about it to comment. My impression about Rust is that:

 a) Rust has a steep learning curve as a language.
So does C, if you're doing C "correctly".
 b) If you want to do things that C allows you to do, then Rust 
 is no safer than C.
That's the entire bloody point isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
May 14
next sibling parent Ola Fosheim Grøstad writes:
On Sunday, 14 May 2017 at 21:01:40 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 b) If you want to do things that C allows you to do, then Rust 
 is no safer than C.
That's the entire bloody point isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
Like building a graph? Sure, Rust is perfect if you can model your world like a tree, but that is usually not what you want if you are looking for performance. You could replace pointers with integer-ids, but that is just emulating pointers with a construct that may be harder to check for in an automated fashion. So that is not a good solution either.
May 14
prev sibling parent Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 21:01:40 UTC, Jack Stouffer wrote:
 On Sunday, 14 May 2017 at 10:10:41 UTC, Dibyendu Majumdar wrote:
 b) If you want to do things that C allows you to do, then Rust 
 is no safer than C.
That's the entire bloody point isn't it? Maybe you shouldn't be doing a lot of the things that C allows you to do.
Hi, I think you are missing the point. I am talking here about things you need to do rather than writing code just for the heck of it.
May 15
prev sibling next sibling parent reply bachmeier <no spam.net> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
May 13
next sibling parent reply Dibyendu Majumdar <d.majumdar gmail.com> writes:
On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
Hi - I think this point really is saying that the type system in C is for convenience only - ultimately if you as a programmer want to manipulate memory in a certain way then C assumes you know what you are doing and why. As I said C is really a high level assembler. Regards
May 14
parent bachmeier <no spam.net> writes:
On Sunday, 14 May 2017 at 09:56:18 UTC, Dibyendu Majumdar wrote:
 On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar 
 wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
Hi - I think this point really is saying that the type system in C is for convenience only - ultimately if you as a programmer want to manipulate memory in a certain way then C assumes you know what you are doing and why. As I said C is really a high level assembler. Regards
I guess my point is that C only trusts programmers in one direction. You can go as low-level as you want, but it doesn't trust you to use more productive features when that is better (but it certainly gives you the tools to roll your own buggy, hard-to-share version of those features). D, C++, and Rust really do trust the programmer.
May 14
prev sibling parent qznc <qznc web.de> writes:
On Sunday, 14 May 2017 at 02:11:36 UTC, bachmeier wrote:
 On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:

 (a) Trust the programmer.
I don't understand this point. C doesn't offer the programmer much to work with. If you trust the programmer, shouldn't that mean you provide a large set of tools and let them decide which parts to use? C is pretty much "here are some pointers, go have fun".
The C99 Rationale also says: "The Committee is content to let C++ be the big and ambitious language. While some features of C++ may well be embraced, it is not the Committee’s intention that C become C++." I read that as: C is mostly in preservation and fossilization mode. If you want new features look elsewhere. We will not rock the boat. That is probably a good thing. C has its niche and it is comfortable there. If you want to beat C, it will not fight back. The only problem is to convince the C programmers to move.
May 14
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Sunday, 14 May 2017 at 00:05:56 UTC, Dibyendu Majumdar wrote:
 On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
 Walter: I believe memory safety will kill C.
Hi, I think that comparing languages like D to C is not appropriate. C is a high level assembler and has different design goals. A useful document to refer to is: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1250.pdf In particular: (although note the addition of facet f, which echoes the sentiment that security is important) Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. For the Cx1 revision there is consensus to add a new facet f to the original list of facets. The new spirit of C can be summarized in phrases like: (a) Trust the programmer. (b) Don't prevent the programmer from doing what needs to be done. (c) Keep the language small and simple. (d) Provide only one way to do an operation. (e) Make it fast, even if it is not guaranteed to be portable. (f) Make support for safety and security demonstrable. Proverb e needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.
If only the gcc and clang designers followed that rule. These <beeep> consider that undefined behaviour allows them to break the code in any way they fancy (the nasal demon thing), while pragmaticists interpret it as: do the thing that is simplest to implement on that hardware. The most ridiculous example is the undefined behaviour of signed integer overflow. Signed integer overflow is undefined in C because some obscure platforms may not use two's complement for the representation of integers, so INT_MAX+1 does not necessarily result in INT_MIN. But completely removing the code when one encounters, for example, if(val+1 == INT_MIN) is simply nuts.
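To make the removal concrete (my sketch, not Patrick's code): val+1 can only equal INT_MIN if val == INT_MAX and the addition wraps, which is undefined for signed int, so the optimizer is allowed to assume it never happens and fold the test to false. The portable rewrite tests the operand before doing the arithmetic.

#include <limits.h>

/* Fragile: relies on signed wraparound, which is undefined behaviour,
 * so gcc/clang may fold this whole test to 0 at -O2. */
int about_to_wrap_fragile(int val)
{
    return val + 1 == INT_MIN;
}

/* Portable: checks before the arithmetic, so no overflow is ever
 * evaluated and there is nothing for the optimizer to delete. */
int about_to_wrap_safe(int val)
{
    return val == INT_MAX;
}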
May 14
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 14.05.2017 11:42, Patrick Schluter wrote:
 ...
 (a) Trust the programmer.
 (b) Don't prevent the programmer from doing what needs to be done.
 (c) Keep the language small and simple.
 (d) Provide only one way to do an operation.
 (e) Make it fast, even if it is not guaranteed to be portable.
 (f) Make support for safety and security demonstrable.

 Proverb e needs a little explanation. The potential for efficient code
 generation is one of the most important strengths of C. To help ensure
 that no code explosion occurs for what appears to be a very simple
 operation, many operations are defined to be how the target machine's
 hardware does it rather than by a general abstract rule. An example of
 this willingness to live with what the machine does can be seen in the
 rules that govern the widening of char objects for use in expressions:
 whether the values of char objects widen to signed or unsigned
 quantities typically depends on which byte operation is more
 efficient on the target machine.
If only the gcc and clang designers followed that rule.
It's precisely what they do. You are blaming the wrong people.
 These <beeep>
 consider that undefined behaviour allows to break the code in any way
 they fancy (the nasal demon thing). While pragmaticists interpret it as
 do the thing that is the simplest to implement on that hardware.
Those "pragmaticists" cannot be trusted, therefore they are not programmers. Why do they matter?
 The
 most ridiculous example being the undefined behaviour of signed integer
 overflow. Signed integer overflow is undefined in C because some obscure
 platforms may not use 2 complements for the representation of integers.
 So INT_MAX+1 does not necessarily result in INT_MIN.
It's _undefined_, not implementation-defined or unspecified. Excerpt from the C standard:
 3.4.1
 1 implementation-defined behavior
   unspecified behavior where each implementation documents how the choice is made
 ...
 3.4.3
 1 undefined behavior
   behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
 ...
 3.4.4
 1 unspecified behavior
   use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance
 ...
What is it about "no requirements" that "pragmaticists" fail to understand? Not inventing artificial additional requirements is among the most pragmatic things to do.
 But completely removing the code when one encounters for example:
 if(val+1 == INT_MIN) is simply nuts.
Why? This is simple dead code elimination. The programmer clearly must have known that it is dead code and the compiler trusts the programmer. The programmer would _never_ break that trust and make a program evaluate INT_MAX+1 ! The corollary to 'trust the programmer' is 'blame the programmer'. Don't use C if you want to blame the compiler.
May 14
next sibling parent Ola Fosheim Grøstad writes:
On Sunday, 14 May 2017 at 12:07:40 UTC, Timon Gehr wrote:
 On 14.05.2017 11:42, Patrick Schluter wrote:
 But completely removing the code when one encounters for 
 example:
 if(val+1 == INT_MIN) is simply nuts.
Why? This is simple dead code elimination. The programmer clearly must have known that it is dead code and the compiler trusts the programmer. The programmer would _never_ break that trust and make a program evaluate INT_MAX+1 !
Well, actually, it makes sense to issue a warning in C. But in C++ it makes less sense, since meta-programming can easily generate such code without breaking the semantics of the program.
 The corollary to 'trust the programmer' is 'blame the 
 programmer'. Don't use C if you want to blame the compiler.
Oh well, there are lots of different checkers for C, so I guess it would be more like "don't blame the compiler, blame the verifier".
May 14
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
What does that snippet do? What should it do?

#include <stdio.h>

int caca(void)
{
    for (int i = 0xFFFFFFFF; i != 0x80000000; i++)
        printf("coucou");
    return 0;
}
May 14
parent reply Ola Fosheim Grøstad writes:
On Sunday, 14 May 2017 at 16:44:10 UTC, Patrick Schluter wrote:
 What does that snippet do ? What should it do?

 int caca(void)
 {
   for(int i=0xFFFFFFFF; i!=0x80000000; i++)
     printf("coucou");
 }
Implicit coercion is a design bug in both C and D... :-P
May 14
parent Ola Fosheim Grøstad writes:
On Sunday, 14 May 2017 at 19:10:05 UTC, Ola Fosheim Grøstad wrote:
 On Sunday, 14 May 2017 at 16:44:10 UTC, Patrick Schluter wrote:
 What does that snippet do ? What should it do?

 int caca(void)
 {
   for(int i=0xFFFFFFFF; i!=0x80000000; i++)
     printf("coucou");
 }
Implicit coercion is a design bug in both C and D... :-P
Of course the annoying part is that C allows 2s-complement notation for integer literals, so with warnings on:

int i = 0xFFFFFFFF;    // passes without warning
int i = 0xFFFFFFFFUL;  // warning is issued
May 14
prev sibling parent Guillaume Boucher <guillaume.boucher.d gmail.com> writes:
On Sunday, 14 May 2017 at 09:42:05 UTC, Patrick Schluter wrote:
 But completely removing the code when one encounters for 
 example: if(val+1 == INT_MIN) is simply nuts.
Removing such code is precisely what dmd does: https://issues.dlang.org/show_bug.cgi?id=16268
May 14