
digitalmars.D - Have language researchers gotten it all wrong?

Walter Bright <newshound1 digitalmars.com> writes:
Certainly, this is a very interesting topic for D's development.

http://www.reddit.com/r/programming/comments/8yeb0/cacm_almost_the_entire_software_community_has/
Jul 05 2009
bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 Certainly, this is a very interesting topic for D's development.
 http://www.reddit.com/r/programming/comments/8yeb0/cacm_almost_the_entire_software_community_has/

Two comments about this topic.

Having a compiler able to catch more bugs is really useful. The less bug-prone nature of D programs compared to C ones is probably one of the most important selling points of D for me. But in this field D is not doing enough yet; there are many other situations where D could catch bugs, like: stack overflow protection, run-time integer overflow checks, some region/zone pointer analysis to catch low-level pointer-related bugs (because if you never use pointers then it's better to just use Mono C# instead of D), bugs caused by a missing break in a switch, missing type checks for printf (because printf is used in D programs too), some safer C std lib functions (because D programs sometimes use C stdlib functions, especially in C code ported to D), some integral promotions/conversions (for the third time I have a bug in a D program caused by automatic conversion to uint), and removal of most or all of the undefined corner cases of C (like the order of evaluation of sub-expressions in function calls, etc. Java shows that all such traps can be removed from the language, yet Java is nowadays very fast, sometimes faster than D).

People can talk all they want about the advantages of 'classic' static typing, but sometimes they are wrong. I've programmed in statically typed languages most of my time, yet if I write a small but complex (~300-line) Python program I am usually able to make it run correctly in a fairly short time, shorter than the time to write the same (buggy) program in D, and in the end such a D program may not even work correctly and I may lose interest in finding the bug. In some situations this isn't caused by my ignorance of D, or by the limits of my brain, or by my laziness, but by the intrinsically less bug-prone nature of Python compared to D programs that use pointers a bit. So while I like programming in D a lot, I don't agree with people who say that 'classic' static typing is able to avoid more bugs.

On the other hand, if you refer to the modern type systems of languages like Haskell, then their type system is probably able to actually give you some help in writing more correct programs. But in languages with a primitive/simple type system like Java (and D), having a static type system isn't so important in reducing bugs in my small programs.

Bye,
bearophile
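The uint-conversion trap bearophile mentions comes from the C-family rule that mixing signed and unsigned integers silently converts the signed operand to unsigned. A small Python sketch using `ctypes` to emulate 32-bit unsigned arithmetic illustrates it (the helper name is mine, not from the thread):

```python
import ctypes

def as_uint32(x):
    """Emulate truncation to a 32-bit unsigned integer, as in C or D."""
    return ctypes.c_uint32(x).value

# Mixing signed and unsigned: in C or D, -1 silently converts to uint.
print(as_uint32(-1))  # 4294967295, not -1

# The classic loop trap: with an unsigned length of 0, `length - 1`
# wraps around, so a condition like `i <= length - 1` never terminates.
length = 0
print(as_uint32(length - 1))  # 4294967295
```

Python itself sidesteps this class of bug with arbitrary-precision integers, which is part of the contrast bearophile is drawing.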
Jul 05 2009
"Unknown W. Brackets" <unknown simplemachines.org> writes:
Well, I think it's more simple than that.  Suppose I have a problem.

In Python, JavaScript, PHP, or some other similar language, I can solve 
the problem in 1,000 lines of code.

In C++, etc. I can solve the problem in 5,000 lines of code.

Which is likely to have the most bugs?  1,000 lines or 5,000?

Obviously, compiler checking helps.  But, there are logical bugs that 
cannot be caught no matter what.  You will have 5 times as many in this 
example.  And those are the harder ones to find/fix anyway.

The best of both worlds is the answer: 1,000 lines with the compiler 
finding some bugs.  This results in the best code and least bugs. 
Having the compiler do this fast is even better.

D is, in my opinion, moving in the correct direction - toward this 
crossroads.

-[Unknown]

bearophile wrote:
 People can talk all they want about the advantages of 'classic' static typing,
 but sometimes they are wrong. I've programmed in statically typed languages most
 of my time, yet if I write a small but complex (~300-line) Python program I am
 usually able to make it run correctly in a fairly short time, shorter than the
 time to write the same (buggy) program in D, and in the end such a D program may
 not even work correctly and I may lose interest in finding the bug. In some
 situations this isn't caused by my ignorance of D, or by the limits of my brain,
 or by my laziness, but by the intrinsically less bug-prone nature of Python
 compared to D programs that use pointers a bit. So while I like programming in D
 a lot, I don't agree with people who say that 'classic' static typing is able to
 avoid more bugs. On the other hand, if you refer to the modern type systems of
 languages like Haskell, then their type system is probably able to actually
 give you some help in writing more correct programs.

Jul 05 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Unknown W. Brackets (unknown simplemachines.org)'s article
 Well, I think it's more simple than that.  Suppose I have a problem.
 In Python, JavaScript, PHP, or some other similar language, I can solve
 the problem in 1,000 lines of code.
 In C++, etc. I can solve the problem in 5,000 lines of code.
 Which is likely to have the most bugs?  1,000 lines or 5,000?
 Obviously, compiler checking helps.  But, there are logical bugs that
 cannot be caught no matter what.  You will have 5 times as many in this
 example.  And those are the harder ones to find/fix anyway.
 The best of both worlds is the answer: 1,000 lines with the compiler
 finding some bugs.  This results in the best code and least bugs.
 Having the compiler do this fast is even better.
 D is, in my opinion, moving in the correct direction - toward this
 crossroads.
 -[Unknown]

Yes, I personally find that, while they may not be the majority of the bugs I create, the majority of *time* I spend debugging is on high-level logic errors--things like incorrect algorithms, input cases I hadn't considered, bad assumptions about the way stuff I'm interacting with works, etc. This is why I tend to be skeptical of "safety" features in D and to push more for features that make it easier to write good libraries. The only way I feel a language can help with high-level bugs like these is to make it easy to write reusable code with clean abstractions so you're more inclined to write a general high-level library to do something once, meaning you only have to debug it and get it right once.
Jul 05 2009
Derek Parnell <derek psych.ward> writes:
On Sun, 05 Jul 2009 14:02:54 -0700, Walter Bright wrote:

 Certainly, this is a very interesting topic for D's development.
 
 http://www.reddit.com/r/programming/comments/8yeb0/cacm_almost_the_entire_software_community_has/

Yes, it is interesting, and no, language researchers have not got it /all/ wrong.

If /bug/ is defined as application behaviour that is contrary to its specification, then what we developers need when creating applications is a toolset that enables us to translate specifications into programs. Currently, a large part of that toolset is human skill: understanding a specification, selecting the appropriate algorithm, selecting the appropriate implementation of algorithms, avoiding scope creep, manually copying specification excerpts to code (eg. text of error messages, scalar factors, etc.) ... and the list goes on. These are areas in which research is sorely needed.

Writing code is a minor part of application development. One can write code which works 100% correctly according to the developer's understanding of the issue being addressed, but that doesn't mean the application will satisfy the people who commissioned it.

Static typing can help us avoid a specific subset of bug types, and unit testing can help us avoid another specific subset. These two subsets are not disjoint. They are also not the entire set of possible bug types.

Another important consideration, and this comes from over 30 years in the development world, is that many programs are not required to be 100% correct. Customers are often happy to trade off time and cost for a small number of known bugs. The existence of bugs is not the main issue; the impact of those bugs is. If they are trivial (from the point of view of the people paying for the development) then spending money on avoiding them is not rational. This is often hard for us quality purists to accept, and I grate every time I'm asked to compromise. But that's how the system works.

This, I suspect, is why dynamically typed languages are proving popular: you can get a 99% correct program shipped without having to spend 200% of the money available.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
Jul 05 2009
"David B. Held" <dheld codelogicconsulting.com> writes:
Derek Parnell wrote:
 [...]
 This I suspect is why dynamic typed languages are proving popular, because
 you can get a 99% correct program shipped without having to spend 200% of
 the money available.

The main problem I see with dynamically typed languages is simply that they are too concise. That is fine for small-scale development with just one or two programmers on a team who own the software for its entire lifetime. When it comes to enterprise software, the author might own the code for less than 6 months before they move on to another team or get reorganized into a different area. Not only that, the software most likely has to interact with other software, and people on other teams must then understand its interfaces.

Because dynamic languages explicitly leave out the details from the interface, you have to rely on the documentation that the original author didn't write to figure out how to use that software. You also have to rely on the tests they didn't write to verify that it works correctly. Static typing doesn't fix any of this by itself, but it does prevent the stupid-but-all-too-common errors where you call a function with the wrong number of arguments, transpose some arguments, etc. It also self-documents what is generally expected for each function in the interface and what kinds of objects will get constructed and passed around.

The more dynamic languages cater to large-scale development, the more they begin to look like statically typed languages. So the argument that "it's faster to prototype with dynamic languages" is only relevant for small-scale software, IMO. In every case I've seen where there is a large system written in a dynamically typed language, I see a push by the newer engineers to rewrite it in a statically typed language, for all the reasons I stated above.

The economics of large-scale software engineering are such that writing the code constitutes only a small portion of the time spent creating the software. All that "burden" of writing types and making sure they match is very, very small compared to the additional safety they provide in an environment that will eventually be full of bugs by the time it hits production.
Of course, that implies that for tasks which do not require a lot of engineering, it is entirely appropriate and even advantageous to use a dynamic language. This tends to be utility/infrastructure coding to automate things in 1,000 lines or less. Perl is great as long as the user of the Perl script is a programmer.

So I think the whiner in the original article is missing the point. Dynamic languages are not, as far as I can tell, taking over the world... They mostly take up a small-scale niche where they can press their strengths. I would be very surprised to hear about a large-scale project in Python/Ruby/etc. (100k+ lines).

Dave
Jul 05 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from David B. Held (dheld codelogicconsulting.com)'s article
 Derek Parnell wrote:
 [...]
 This I suspect is why dynamic typed languages are proving popular, because
 you can get a 99% correct program shipped without having to spend 200% of
 the money available.

 The main problem I see with dynamically typed languages is simply that
 they are too concise. That is fine for small-scale development with just one or
 two programmers on a team who own the software for its entire lifetime. When it
 comes to enterprise software, the author might own the code for less than 6
 months before they move on to another team or get reorganized into a different
 area. Not only that, the software most likely has to interact with other
 software, and people on other teams must then understand its interfaces.
 Because dynamic languages explicitly leave out the details from the interface,
 you have to rely on the documentation that the original author didn't write to
 figure out how to use that software. You also have to rely on the tests they
 didn't write to verify that it works correctly. Static typing doesn't fix any
 of this by itself, but it does prevent the stupid-but-all-too-common errors
 where you call a function with the wrong number of arguments, transpose some
 arguments, etc. It also self-documents what is generally expected for each
 function in the interface and what kinds of objects will get constructed and
 passed around. The more dynamic languages cater to large-scale development, the
 more they begin to look like statically typed languages. So the argument that
 "it's faster to prototype with dynamic languages" is only relevant for
 small-scale software, IMO. In every case I've seen where there is a large
 system written in a dynamically typed language, I see a push by the newer
 engineers to rewrite it in a statically typed language, for all the reasons I
 stated above. The economics of large-scale software engineering are such that
 writing the code constitutes only a small portion of the time creating the
 software. All that "burden" of writing types and making sure they match is
 very, very small compared to the additional safety they provide in an
 environment that will eventually be full of bugs by the time it hits production.
 Of course, that implies that for tasks which do not require a lot of
 engineering, it is entirely appropriate and even advantageous to use a dynamic
 language. This tends to be utility/infrastructure coding to automate things in
 1,000 lines or less. Perl is great as long as the user of the Perl script is a
 programmer. So I think the whiner in the original article is missing the point.
 Dynamic languages are not, as far as I can tell, taking over the world... They
 mostly take up a small-scale niche where they can press their strengths. I
 would be very surprised to hear about a large-scale project in
 Python/Ruby/etc. (100k+ lines). Dave

1. Perl is a straw-man argument. It is a language that was shoe-horned into being a "real" programming language and bears way too much cruft from when it was really intended just for simple scripts.

2. Yes, the lack of type checking lets large *numbers* of bugs seep into code, but they're usually the kind of bugs that are easy to find and fix. Your code craps out with a decent error message, a line number and a stack trace when you pass a non-duck to a function that expects a duck. No ambiguous "segmentation fault" kinds of error messages.

3. In addition to dynamic languages becoming more like static ones, I think D represents the reverse happening. When programming in D, my style is to use so many templates and so much auto type inference that I almost feel like I'm programming in some super-efficient dynamic language, rather than a "real" static language.

4. I don't know if it's just me, but I can never figure out how to use an API written by someone else if the docs suck, even if it is written in a static language. If the docs suck, I usually figure the code sucks too and I may as well roll my own. Heck, even if the docs are good, if it's overengineered and shoe-horned into a poor excuse for a language like Java, where there is One True Way of programming, it can still be a PITA. (The Weka machine learning software comes to mind.) At least with dynamic languages (and D, which, with templates, is a kinda-sorta dynamic language), you can design an API so that it is intuitive, flexible and maps well to the problem domain instead of jumping through hoops to satisfy the compiler's requirements for explicit type information at the API's compile time.

That said, efficiency and the existence of templates are my main arguments in favor of static typing. While dynamic languages can be made efficient in speed, it's a heck of a lot harder than with a static language, and they still probably have tons of memory overhead and don't let you do close-to-the-metal stuff. Furthermore, with D's templates and type inference, I hardly even notice that the language is statically typed, and it very seldom gets in my way.
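Point 2 is easy to demonstrate: passing a "non-duck" in a dynamic language fails with a readable exception, a line number, and a traceback rather than a segfault. A minimal Python sketch (the names are illustrative, not from the thread):

```python
class Duck:
    def quack(self):
        return "quack"

def make_noise(animal):
    # Duck typing: anything with a quack() method will do.
    return animal.quack()

print(make_noise(Duck()))  # quack

try:
    make_noise(42)  # pass a non-duck
except AttributeError as err:
    # A clear message with a traceback and line number, not a segfault:
    print(err)  # 'int' object has no attribute 'quack'
```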
Jul 05 2009
Walter Bright <newshound1 digitalmars.com> writes:
I tend to agree that dynamic languages work best for small projects. The 
larger it is, the more advantages accrue to static type checking.

The reality, though, is that programs most often start out as small 
ones, and grow!
Jul 05 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Walter Bright wrote:
 
 I tend to agree that dynamic languages work best for small projects. The
 larger it is, the more advantages accrue to static type checking.
 
 The reality, though, is that programs most often start out as small
 ones, and grow!

I suspect that's why several dynamic languages are looking at, or are at least interested in, adding optional static types. That way, the code can be "locked down" as it grows.
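The "lock it down" idea can be sketched as an opt-in checking layer inside a dynamic language. This is a hypothetical illustration in Python (the decorator name and flag are mine, not a feature of any language discussed in the thread):

```python
CHECK_TYPES = True  # flip to False and the code runs fully dynamically

def locked(*types):
    """Opt-in 'lock down': enforce argument types only when enabled."""
    def deco(fn):
        def wrapper(*args):
            if CHECK_TYPES:
                for i, (arg, t) in enumerate(zip(args, types)):
                    if not isinstance(arg, t):
                        raise TypeError(
                            f"{fn.__name__}: argument {i} should be "
                            f"{t.__name__}, got {type(arg).__name__}")
            return fn(*args)
        return wrapper
    return deco

@locked(int, int)
def add(a, b):
    return a + b

print(add(1, 2))  # 3
# add("1", "2") now raises TypeError instead of returning "12"
```

Note the weakness Nick raises downthread: because the check is off unless you opt in, it is easy to side-step.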
Jul 05 2009
"Nick Sabalausky" <a a.a> writes:
"Daniel Keep" <daniel.keep.lists gmail.com> wrote in message 
news:h2rssa$2q8k$1 digitalmars.com...
 Walter Bright wrote:
 I tend to agree that dynamic languages work best for small projects. The
 larger it is, the more advantages accrue to static type checking.

 The reality, though, is that programs most often start out as small
 ones, and grow!

I suspect that's why several dynamic languages are looking at, or are at least interested in, adding optional static types. That way, the code can be "locked down" as it grows.

Yea, I've noticed that. The problem with that approach, though, is that for compatibility's sake the "lock it down" features inevitably end up being off by default and optionally enabled, which makes them too easy to accidentally side-step and defeats the point. It's better to go the other way around, and start with a language whose locked-down features can be optionally disabled for the sake of all the "nearsighted" coders.

What I mean by "nearsighted": much of the argument in favor of these loose-and-dirty dynamic features centers around "it's quicker/easier for small programs!" What? For *small* programs? So what? Small programs are easy regardless of which route you take. I'm supposed to care about the minuscule amount of time saved by ignoring types, declarations and such in a program of, what, a few hundred lines (ie. about the only size for which I would agree it might actually be easier)? Maybe if it's something trivial and I know it'll remain trivial. But if there's any chance I'll want to expand it, I'm better off just writing it in a scalable language in the first place. Even if I misjudge and some of those small apps never end up expanding, I'm still better off overall for never having to switch languages midway or start over (which, while still perfectly doable for a small program, is not nearly as simple as just doing the whole thing in the scalable language in the first place).

And of course, when I refer to static, locked-down, scalable languages here, I mean something more like D, not like Java, where it takes a million lines just to wipe your...umm...well...in any case, I'm talking about something sensible like D.
Jul 05 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Nick Sabalausky wrote:
 (Stuff)

I'm going to have to disagree with you on this from personal experience.

Warning: the below has turned into a semi-rant, semi-story. Feel free to disregard it in favour of this executive summary, given in response to your post: nuh-uh!

A few years ago, I wrote a tiny, throw-away tool called DSTP for my research project. It was a little 822-line (yes, I went back through source control to find it :P) Python script that I wrote the first version of in, literally, a few hours. I'm fairly certain it would have taken significantly longer had I tried to write it in D.

That said, the program ended up becoming a fairly important part of the research program's toolchain, so I rewrote it to be a little less hacky, and it eventually grew into a small 2,292-line Python program. Lack of static typing was an issue in one or two spots, but again the speed with which I built it and could modify it outweighed that. Note that I consider 2k lines "small". You're right in that "Hello, World!" is easy no matter what language you use [1].

In the end, the other researchers were throwing large enough files at it that its being written in Python was becoming an issue [2]. So I took a few weeks off the main project to rewrite DSTP in D. It's now a 5,014-line D program that does *most* of what the original Python program does. There are still a few things I need to add back in (like scripting support, which is obviously several orders of magnitude harder in D than in Python) but it's basically the same program. It uses the same algorithms and data structures. It's now about twice the size it was originally, but it *is* around 10 times faster, depending on how you test it.

And you know what? The D version isn't really any easier or harder to write or maintain. I would say, however, that it's harder to prototype in. One of the huge advantages of something like Python that a lot of people, shockingly, seem unable to grasp is the interactive interpreter. I can load my program, type in a few lines of code to see how they work, fiddle with them until they do what I want, then lock that down into a function. Reload and continue. With D, this is much slower and, lacking Python's built-in dir() and help() functions, harder. (Incidentally, please no one argue that D can do this. It can't. If you think it can, you simply don't understand what I'm talking about.)

Really, I think the major benefit of a dynamic language is that it lets you think about what you're trying to do, as opposed to thinking about how to work within the type system. I lost quite a few nice internal features of the codebase in the transition, like the ability to hook up a processing function like this:
 @bind_to('pick-uniform-random', no_contents=True)
 def pick_uniform_random_bind(context, attrs):
     range,words = bind_exactly_one_arg(attrs,'range,words')
     integer = get_arg(attrs,'integer')

     context.write(unicode(pick_uniform_random(range,words,integer)))

Let's just say that the D equivalent is... not as pretty. It's 45 lines of not-as-pretty. That one bind_to call does a LOT of work that I can't easily paper over in D; a lot of that is thanks to dynamic typing.

In contrast, static types would have made the original code easier to read in some cases. One problem I had was that I was constantly going between something like four or so different representations of the XML being processed, which meant I had to be very careful about which variables I passed to which functions.

In the end, I think that anyone who takes the stance that either static or dynamic typing is "right" is just outright missing the point. Correct tools for the job: dynamic typing mops the floor with static typing when you don't know what the types are, and vice versa.

Ok, I'm done. Now get off-a mah lawn! :D

[1] unless you've been in the University of Wollongong's introductory Java subject, where the lecturer SERIOUSLY told us that a three-file, 5-page "Hello, World" program was an improvement over the 5-line version. I swear I heard someone crying behind me in the hall when he said "this is Java, this is good!"

[2] to be fair, the actual problem was that Python's XML libraries just didn't seem to have been built with my use-case in mind. I was doing WAY too many copies and transforms of the data to get what I needed. Slices and having the source to Tango's pull parser helped enormously.
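For readers unfamiliar with the pattern, a decorator like Daniel's `bind_to` can be sketched as a simple registry that maps element names to handler functions. This is purely a guessed reconstruction (Daniel's actual implementation is not shown in the thread; `HANDLERS`, `get_arg` and the `shout` handler are mine):

```python
# Global registry mapping element names to handler functions.
HANDLERS = {}

def bind_to(name, no_contents=False):
    """Register the decorated function as the handler for element `name`."""
    def deco(fn):
        HANDLERS[name] = (fn, no_contents)
        return fn
    return deco

def get_arg(attrs, name, default=None):
    return attrs.get(name, default)

@bind_to('shout', no_contents=True)
def shout_bind(context, attrs):
    context.append(get_arg(attrs, 'text', '').upper())

# Dispatch: look up the handler by element name and call it.
out = []
handler, no_contents = HANDLERS['shout']
handler(out, {'text': 'hello'})
print(out[0])  # HELLO
```

The terseness comes from functions being first-class values and attributes arriving as an untyped dict, which is exactly what a statically typed port has to replace with explicit plumbing.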
Jul 06 2009
Lutger <lutger.blijdestijn gmail.com> writes:
David B. Held wrote:

 Derek Parnell wrote:
 [...]
 This I suspect is why dynamic typed languages are proving popular,
 because you can get a 99% correct program shipped without having to spend
 200% of the money available.

The main problem I see with dynamically typed languages is simply that they are too concise. That is fine for small-scale development with just one or two programmers on a team who own the software for its entire lifetime. When it comes to enterprise software, the author might own the code for less than 6 months before they move on to another team or get reorganized into a different area. Not only that, the software most likely has to interact with other software, and people on other teams must then understand its interfaces. Because dynamic languages explicitly leave out the details from the interface, you have to rely on the documentation that the original author didn't write to figure out how to use that software. You also have to rely on the tests they didn't write to verify that it works correctly. Static typing doesn't fix any of this by itself, but it does prevent the stupid-but-all-too-common errors where you call a function with the wrong number of arguments, transpose some arguments, etc. It also self-documents what is generally expected for each function in the interface and what kinds of objects will get constructed and passed around.

Software written in dynamic languages won't scale without a stricter development process. Mandatory testing compensates for the lack of static typing, and more: with testing you cover the part that the compiler does in static languages, but you do it at runtime.
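The point above, that tests do at run time part of what a compiler does at compile time, can be made concrete with a tiny example (illustrative code, not from the thread):

```python
import unittest

def area(width, height):
    return width * height

class TestArea(unittest.TestCase):
    def test_numbers(self):
        self.assertEqual(area(3, 4), 12)

    def test_catches_type_mistake(self):
        # A static compiler would reject area("3", 4) at compile time.
        # Python happily returns "3333" (string repetition); only a
        # test notices the mistake, at run time.
        self.assertNotEqual(area("3", 4), 12)

# Run the suite without exiting the interpreter.
unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestArea))
```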
 The more dynamic languages cater to large-scale development, the more
 they begin to look like statically-typed languages.  So the argument
 that "it's faster to prototype with dynamic languages" is only relevant
 for small-scale software, IMO.  In every case I've seen where there is a
 large system written in a dynamically typed language, I see a push by
 the newer engineers to rewrite it in a statically typed language, for
 all the reasons I stated above.
 
 The economics of large-scale software engineering are such that writing
 the code constitutes only a small portion of the time creating the
 software.  All that "burden" of writing types and making sure they match
 is very, very small compared to the additional safety they provide in an
 environment that will eventually be full of bugs by the time it hits
 production.

I have participated briefly in one large project written in Ruby (on Rails). For every bit of code, tests had to be written. Code coverage verified that 100% of the code was covered by these tests. Writing tests and creating fixtures were maybe half the work. When you checked in code that broke the tests, you had to stay in until you fixed it.

When you do have a runtime bug, it is much easier to debug in a dynamic language than in a static language. Dynamic languages don't do type checking at compile time, but they usually do way more of it at runtime. This is a point often not mentioned by their critics.

I think productivity was much better than it would have been in a static language, but testing was essential and considered part of the code. One of the benefits of this kind of development is that it becomes very easy to refactor and adapt to new demands.
 Of course, that implies that for tasks which do not require a lot of
 engineering, it is entirely appropriate and even advantageous to use a
 dynamic language.  This tends to be utility/infrastructure coding to
 automate things in 1,000 lines or less.  Perl is great as long as the
 user of the Perl script is a programmer.
 
 So I think the whiner in the original article is missing the point.
 Dynamic languages are not, as far as I can tell, taking over the
 world...  They mostly take up a small-scale niche where they can press
 their strengths.  I would be very surprised to hear about a large-scale
 project in Python/Ruby/etc. (100k+ lines).
 
 Dave

A large-scale project of 100K+ lines in Ruby/Python is equivalent to, say, 1000K+ lines of Java/C# code. I'm not saying dynamic languages scale as well as static languages, but you have to take these two points into account: 1) development in these languages must follow a different methodology, and 2) it's not just writing fewer lines of code that makes them more productive.
Jul 06 2009