digitalmars.D - D for project in computational chemistry

reply "Yura" <min_yura mail.ru> writes:
Dear D coders/developers,

I am planning a project in computational chemistry, and I am 
finding it difficult to pick the right language to write it in. 
The project will deal with the generation of molecular 
structures and will to some extent resemble bioinformatics 
work. Personally I code in two languages: Python, and a little 
bit of C (I have just started to learn it).

While it is easy to code in Python, there are two things I do not 
like:

1) Python is slow for nested loops (much slower than C).
2) Python is not compiled. However, I want to work with code 
that can be compiled and distributed as binaries (at least at 
the beginning).

When it comes to C, I find it very difficult to code in (I am a 
chemist rather than a computer scientist). Pointers, manual memory 
allocation, the absence of truly dynamic arrays, etc. make coding 
very slow. C is too low-level, I believe.

I just wonder how suitable D would be for my purpose. Please 
correct me if I am wrong, but in D the need for pointers is 
minimal, there is a garbage collector, arrays can be dynamically 
allocated, sliced, and appended to with ~=, etc., which makes it 
similar to Python to some extent. I tried writing a little code 
in D and it was very intuitive and similar to what I have done 
in both Python and C.
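
For readers who have not tried D yet, the array features mentioned above (dynamic allocation, slicing, ~=) might look roughly like this. It is only a minimal sketch; the function name squares is made up for illustration and is not part of any library.

```d
import std.stdio : writeln;

/// Collects the squares of 0 .. n-1 in a dynamic array using ~=.
/// (Hypothetical helper, for illustration only.)
int[] squares(int n)
{
    int[] result;              // starts empty; memory is managed by the GC
    foreach (i; 0 .. n)
        result ~= i * i;       // append, much like list.append in Python
    return result;
}

void main()
{
    auto sq = squares(6);      // [0, 1, 4, 9, 16, 25]
    auto middle = sq[1 .. 4];  // slice: a view of elements 1, 2 and 3, no copy
    writeln(sq.length, " ", middle);   // prints: 6 [1, 4, 9]
    writeln(sq[$ - 1]);        // $ is the array length, so this prints 25
}
```

No pointers or manual allocation appear anywhere, which is the point being made in the paragraph above.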

Any hints/thoughts/advice?

With kind regards,
Yury
Aug 02 2015
next sibling parent "ZombineDev" <valid_email he.re> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 Dear D coders/developers,

 I am just thinking on one project in computational chemistry, 
 and it is sort of difficult for me to pick up the right 
 language this project to be written. The project is going to 
 deal with the generation of the molecular structures and will 
 resemble to some extent some bio-informatic stuff. Personally I 
 code in two languages - Python, and a little bit in C (just 
 started to learn this language).

 While it is easy to code in Python there are two things I do 
 not like:

 1) Python is slow for nested loops (much slower comparing to C)
 2) Python is not compiled. However, I want to work with a code 
 which can be compiled and distributed as binaries (at least at 
 the beginning).

 When it comes to C, it is very difficult to code (I am a 
 chemist rather than computer scientist). The pointers, memory 
 allocation, absence of the truly dynamically allocated arrays, 
 etc, etc make the coding very long. C is too low level I 
 believe.

 I just wander how D would be suitable for my purpose? Please, 
 correct me if I am wrong, but in D the need of pointers is 
 minimal, there is a garbage collector, the arrays can be 
 dynamically allocated, the arrays can be sliced, ~=, etc which 
 makes it similar to python at some extent. I tried to write a 
 little code in D and it was very much intuitive and similar to 
 what I did both in Python and C.

 Any hints/thoughts/advises?

 With kind regards,
 Yury
I'd say go for it. My experience with D is that you can use it 
both for fast (to write and execute) scripts and for large 
enterprise applications. You can certainly view it as an easier 
version of C, though it can offer a lot more if you need it. 90% 
of the syntax is the same as C, so there shouldn't be gotchas in 
the basic stuff.

Recently at DConf [1] John Colvin gave a talk [2] about using D 
for science, which will probably be interesting for you.

Good luck :)

[1]: http://dconf.org/2015/schedule/index.html
[2]: https://www.youtube.com/watch?v=edjrSDjkfko (D Is For Science)
Aug 02 2015
prev sibling next sibling parent "Daniel N" <ufo orbiting.us> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 Any hints/thoughts/advises?

 With kind regards,
 Yury
Dear Yura,

D is a perfect fit.

For performance reasons, when releasing your binary make sure to 
use one of these compilers:

GDC - GCC D Compiler (comparable performance to gcc)
LDC - LLVM D Compiler (comparable performance to clang)

Usually D is on par with C. If performance is lower than 
expected, it probably means you forgot an optimization switch. 
In particular, -singleobj is non-intuitive for ldc2 but can have 
a dramatic impact.

For example:

ldc2 -O5 -inline -release -singleobj -boundscheck=off

Best Regards,
Daniel N
Aug 02 2015
prev sibling next sibling parent "bachmeier" <no spam.net> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 I just wander how D would be suitable for my purpose? Please, 
 correct me if I am wrong, but in D the need of pointers is 
 minimal, there is a garbage collector, the arrays can be 
 dynamically allocated, the arrays can be sliced, ~=, etc which 
 makes it similar to python at some extent. I tried to write a 
 little code in D and it was very much intuitive and similar to 
 what I did both in Python and C.
If you can do it in C, you can do it in D. Full stop. You get a 
lot of nice additional features with D, and unlike C++, there 
are not a thousand ways to shoot yourself in the foot with a 
hundred-line program.

You can easily call C libraries from D, so you lose nothing with 
respect to legacy C code. You can use pyd to interoperate with 
Python, so you can move as quickly or as slowly as desired from 
Python to D. You can even call D from C, so if D doesn't work 
out, you don't lose the code you've written.

I primarily use R but have been moving a lot of code into D for 
speed and the nice language features for two years now, and I 
have no regrets. When I started with D, I read lots of comments 
about the compiler being buggy, but I have yet to encounter a 
compiler bug. Thankfully that myth seems to be dying.
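
To make the C interop point concrete, here is a small sketch of both directions. strlen is the real C standard library function; scaleValue is a made-up name used only for illustration. Note that a C program calling into D code that uses the garbage collector would normally have to initialise the D runtime first, which this sketch does not show.

```d
import std.stdio : writeln;

// Declaration of a function from the C standard library. D links against
// the C runtime by default, so no wrapper code is needed.
extern (C) size_t strlen(const(char)* s);

// A D function with C linkage and an unmangled name (hypothetical example).
// C code could declare it as `double scaleValue(double, double);` and call it.
extern (C) double scaleValue(double value, double factor)
{
    return value * factor;
}

void main()
{
    // D string literals are zero-terminated, so .ptr can be passed to C here.
    writeln(strlen("benzene".ptr));    // prints 7
    writeln(scaleValue(1.5, 2.0));     // prints 3
}
```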
Aug 02 2015
prev sibling next sibling parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 3/08/2015 4:25 a.m., Yura wrote:
 Dear D coders/developers,

 I am just thinking on one project in computational chemistry, and it is
 sort of difficult for me to pick up the right language this project to
 be written. The project is going to deal with the generation of the
 molecular structures and will resemble to some extent some
 bio-informatic stuff. Personally I code in two languages - Python, and a
 little bit in C (just started to learn this language).

 While it is easy to code in Python there are two things I do not like:

 1) Python is slow for nested loops (much slower comparing to C)
 2) Python is not compiled. However, I want to work with a code which can
 be compiled and distributed as binaries (at least at the beginning).

 When it comes to C, it is very difficult to code (I am a chemist rather
 than computer scientist). The pointers, memory allocation, absence of
 the truly dynamically allocated arrays, etc, etc make the coding very
 long. C is too low level I believe.

 I just wander how D would be suitable for my purpose? Please, correct me
 if I am wrong, but in D the need of pointers is minimal, there is a
 garbage collector, the arrays can be dynamically allocated, the arrays
 can be sliced, ~=, etc which makes it similar to python at some extent.
 I tried to write a little code in D and it was very much intuitive and
 similar to what I did both in Python and C.

 Any hints/thoughts/advises?

 With kind regards,
 Yury
Everyone else seems to be focusing on the technical aspects of 
why to choose D or not.

To put it simply, just have a go! Write a small prototype.

- Did you enjoy it?
- Did it reflect what you were thinking well?
- Can others understand it?

If you need help, feel free to jump on and post on D.learn.
If you need more interactive help, come on IRC. We have a 
channel on FreeNode and even OFTC.
Aug 02 2015
prev sibling next sibling parent reply "yawniek" <dlang srtnwz.com> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:

 While it is easy to code in Python there are two things I do 
 not like:

 1) Python is slow for nested loops (much slower comparing to C)
 2) Python is not compiled. However, I want to work with a code 
 which can be compiled and distributed as binaries (at least at 
 the beginning).
you can use the best of both worlds with pyd:
https://github.com/ariovistus/pyd

- write python Modules in D
and/or
- make your D code scriptable with python
Aug 02 2015
parent "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Monday, 3 August 2015 at 06:16:57 UTC, yawniek wrote:
 On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:

 While it is easy to code in Python there are two things I do 
 not like:

 1) Python is slow for nested loops (much slower comparing to C)
 2) Python is not compiled. However, I want to work with a code 
 which can be compiled and distributed as binaries (at least at 
 the beginning).

 you can use the best of both worlds with pyd:
 https://github.com/ariovistus/pyd

 - write python Modules in D
 and/or
 - make your D code scriptable with python

Also, note that you can write D in the ipython/jupyter notebook and have it interoperate with D libraries from code.dlang.org and with python. It's at an early stage, but so far I have found it to work well. https://github.com/DlangScience/PydMagic
Aug 03 2015
prev sibling next sibling parent reply "FreeSlave" <freeslave93 gmail.com> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 Dear D coders/developers,

 I am just thinking on one project in computational chemistry, 
 and it is sort of difficult for me to pick up the right 
 language this project to be written. The project is going to 
 deal with the generation of the molecular structures and will 
 resemble to some extent some bio-informatic stuff. Personally I 
 code in two languages - Python, and a little bit in C (just 
 started to learn this language).

 [...]
Did you try the PyPy implementation of Python? It's claimed to 
be faster than CPython. If it's still not enough for you, then 
try D for sure. Write a sample program that does calculations on 
real data, use gdc or ldc to get optimized code, and see if 
you're happy with the results.
Aug 03 2015
parent "jmh530" <john.michael.hall gmail.com> writes:
On Monday, 3 August 2015 at 14:25:21 UTC, FreeSlave wrote:
 On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 Dear D coders/developers,

 I am just thinking on one project in computational chemistry, 
 and it is sort of difficult for me to pick up the right 
 language this project to be written. The project is going to 
 deal with the generation of the molecular structures and will 
 resemble to some extent some bio-informatic stuff. Personally 
 I code in two languages - Python, and a little bit in C (just 
 started to learn this language).

 [...]

 Did you try the PyPy implementation of Python? It's claimed to 
 be faster than CPython. If it's still not enough for you, then 
 try D for sure. Write a sample program that does calculations 
 on real data, use gdc or ldc to get optimized code, and see if 
 you're happy with the results.

Last time I checked there's lots of stuff that you can't use with pypy.
Aug 03 2015
prev sibling next sibling parent reply "Chris" <wendlec tcd.ie> writes:
On Sunday, 2 August 2015 at 16:25:18 UTC, Yura wrote:
 Dear D coders/developers,

 I am just thinking on one project in computational chemistry, 
 and it is sort of difficult for me to pick up the right 
 language this project to be written. The project is going to 
 deal with the generation of the molecular structures and will 
 resemble to some extent some bio-informatic stuff. Personally I 
 code in two languages - Python, and a little bit in C (just 
 started to learn this language).

 While it is easy to code in Python there are two things I do 
 not like:

 1) Python is slow for nested loops (much slower comparing to C)
 2) Python is not compiled. However, I want to work with a code 
 which can be compiled and distributed as binaries (at least at 
 the beginning).

 When it comes to C, it is very difficult to code (I am a 
 chemist rather than computer scientist). The pointers, memory 
 allocation, absence of the truly dynamically allocated arrays, 
 etc, etc make the coding very long. C is too low level I 
 believe.

 I just wander how D would be suitable for my purpose? Please, 
 correct me if I am wrong, but in D the need of pointers is 
 minimal, there is a garbage collector, the arrays can be 
 dynamically allocated, the arrays can be sliced, ~=, etc which 
 makes it similar to python at some extent. I tried to write a 
 little code in D and it was very much intuitive and similar to 
 what I did both in Python and C.

 Any hints/thoughts/advises?

 With kind regards,
 Yury
I agree with bachmeier. You cannot go wrong. You mentioned 
nested loops. D allows you to concatenate (or "pipe") loops. So 
instead of

foreach
{
  foreach
  {
    foreach
    {
    }
  }
}

you have something like

int[] numbers = [-2, 1, 6, -3, 10];
foreach (n; numbers
  .map!(a => a * 5)  // multiply each value by 5
  .filter!(a => a > 0))  // drop values that are 0 or less
{
  //  Do something
}

or just write

auto result = numbers.map!(a => a * 5).filter!(a => a > 0);
// ==> result yields 5, 30, 50 (lazily; use .array to get an int[])

You'd probably want to have a look at:

http://dlang.org/phobos/std_algorithm.html

and ranges (a very important concept in D):

http://ddili.org/ders/d.en/ranges.html
http://wiki.dlang.org/Component_programming_with_ranges

Excessive use of nested loops is not necessary in D, nor is it 
very common. This makes the code easier to maintain and less 
buggy in the end.
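
For anyone who wants to compile the snippet above, a complete version could look something like the following. The function name scaledPositives is invented for this sketch; map, filter and array are the standard Phobos functions.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.stdio : writeln;

/// Multiplies each value by 5 and keeps only the positive results.
/// (Hypothetical helper, for illustration only.)
int[] scaledPositives(int[] numbers)
{
    return numbers
        .map!(a => a * 5)      // lazy: nothing is computed yet
        .filter!(a => a > 0)   // still lazy
        .array;                // walk the range once and store the results
}

void main()
{
    writeln(scaledPositives([-2, 1, 6, -3, 10]));   // prints [5, 30, 50]
}
```

The .array call at the end is what turns the lazy range into an ordinary int[]; without it nothing is allocated and values are produced only as they are consumed.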
Aug 04 2015
next sibling parent reply maarten van damme via Digitalmars-d <digitalmars-d puremagic.com> writes:
I'm not a programmer myself and used D for a project in computational
electromagnetics. While I had to implement numerical integration and a bit
of linear algebra, which was annoying (they would be really useful in
phobos), it was a joy to work with and the resulting program was
incredibly fast. Most others used matlab and the difference in speed was
more than a factor of 100. Not only that, prototyping went quicker in D.

I've also written a simulation of the double-slit experiment which I'll
drop somewhere on github once the code is presentable.

So, if you don't mind having to implement a few algorithms that are already
available in numpy, D will be pleasant and fast.

Aug 04 2015
parent reply "Chris" <wendlec tcd.ie> writes:
On Tuesday, 4 August 2015 at 13:25:22 UTC, maarten van damme 
wrote:
 I'm not a programmer myself and used D for a project in 
 computational
 electromagnetics. While I had to implement numerical 
 integration and a bit
 of linear algebra which was annoying (would be really useful in 
 phobos), it
 was a joy to work with and the resulting program was incredibly 
 fast.
 Most others used matlab and the difference in speed was more 
 than a factor
 100. Not only that, prototyping went quicker in D.
Good that you point that out. Most people I know claim that it's 
easier to develop/prototype with Matlab. Apart from execution 
speed and fast prototyping, Matlab is proprietary. This alone is 
a deal breaker.

 I've also written a simulation of the dual slit experiment 
 which I'll drop somewhere on github once the code is 
 presentable.

 So, if you don't mind having to implement a few algorithms that 
 are already available in numpy, D will be pleasant and fast.
Aug 04 2015
next sibling parent reply "bachmeier" <no spam.com> writes:
On Tuesday, 4 August 2015 at 13:42:15 UTC, Chris wrote:

 Good that you point that out. Most people I know claim that 
 it's easier to develop/prototype in with Matlab. Apart from 
 execution speed and fast prototyping, Matlab is proprietary. 
 This alone is a deal breaker.
I can only imagine it being faster to prototype in Matlab if there are additional libraries available. D's just a way better language - Matlab was designed as a replacement for FORTRAN 77 - and static typing means the compiler catches a lot of bugs that you don't want to think about while prototyping.
Aug 04 2015
next sibling parent "Chris" <wendlec tcd.ie> writes:
On Tuesday, 4 August 2015 at 13:58:02 UTC, bachmeier wrote:
 On Tuesday, 4 August 2015 at 13:42:15 UTC, Chris wrote:

 Good that you point that out. Most people I know claim that 
 it's easier to develop/prototype in with Matlab. Apart from 
 execution speed and fast prototyping, Matlab is proprietary. 
 This alone is a deal breaker.

 I can only imagine it being faster to prototype in Matlab if 
 there are additional libraries available. D's just a way 
 better language - Matlab was designed as a replacement for 
 FORTRAN 77 - and static typing means the compiler catches a 
 lot of bugs that you don't want to think about while 
 prototyping.

I think it's a myth that Matlab is better for prototyping. A lot of people don't want to learn D (or any other language) and say that prototyping in Matlab is faster (which it is, of course, if you don't know D at all). Have you ever seen any of those quickly written Matlab programs? Oh deary me!
Aug 04 2015
prev sibling parent reply "jmh530" <john.michael.hall gmail.com> writes:
On Tuesday, 4 August 2015 at 13:58:02 UTC, bachmeier wrote:
 I can only imagine it being faster to prototype in Matlab if 
 there are additional libraries available. D's just a way better 
 language - Matlab was designed as a replacement for FORTRAN 77 
 - and static typing means the compiler catches a lot of bugs 
 that you don't want to think about while prototyping.
It depends on what you need to do. I typically use 
Matlab/R/Python for statistics, matrix math, and optimization. 
Because all the libraries are either part of the language or 
readily available, I can do what I need to easily...much easier 
than in D.

Wrt static typing, I don't think the issue is about catching 
bugs in Matlab. I just haven't had much of an issue with mixing 
up types in Matlab. I can see the arguments about static typing 
and ahead-of-time compilation for performance, but that's less 
important for prototyping. One good thing about dynamic typing 
is that it allows for easier manipulation of matrices.
Aug 04 2015
parent reply "bachmeier" <no spam.com> writes:
On Tuesday, 4 August 2015 at 19:40:30 UTC, jmh530 wrote:

 Wrt static typing, I don't think the issue is about catching 
 bugs in Matlab. I just haven't had much an issue with mixing up 
 types in Matlab. I can see the arguments about static typing 
 and ahead of time compilation for performance, but that's less 
 important for prototyping. One good thing about dynamic typing 
 is that it allows for easier manipulation of matrices.
For me, the big win in prototyping comes from the ability to 
easily make significant changes to my program. With D, I am able 
to give the compiler a bunch of information about the types of 
data I'm working with, and the compiler handles all the details 
for me, better than a good research assistant. If I change the 
function to take different input or return different output, I 
don't have to look through 500 lines of code to make sure I'm 
keeping everything consistent.

A couple of things that might make D more pleasant for me are:
- I do a lot of simulation-related things, where the inputs and 
outputs can change a lot as I figure out how I want to do 
things, and
- I use R. R was invented down the hall from C, and AFAICT, the 
C and R guys were believers that silent casting and undefined 
behavior are the foundation of a good programming language.
Aug 04 2015
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 4 August 2015 at 20:37:18 UTC, bachmeier wrote:
 A couple of things that might make D more pleasant for me are:
 - I do a lot of simulation-related things, where the inputs and 
 outputs can change a lot as I figure out how I want to do 
 things, and
 - I use R. R was invented down the hall from C, and AFAICT, the 
 C and R guys were believers that silent casting and undefined 
 behavior are the foundation of a good programming language.
We should talk. I work in a research group with a lot of simulation work going on, your perspectives on D for simulations could be useful to me. If you're interested, drop me an email: john dot loughran dot colvin at gmail dot com
Aug 04 2015
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 4 August 2015 at 13:42:15 UTC, Chris wrote:
 On Tuesday, 4 August 2015 at 13:25:22 UTC, maarten van damme 
 wrote:
 I'm not a programmer myself and used D for a project in 
 computational
 electromagnetics. While I had to implement numerical 
 integration and a bit
 of linear algebra which was annoying (would be really useful 
 in phobos), it
 was a joy to work with and the resulting program was 
 incredibly fast.
 Most others used matlab and the difference in speed was more 
 than a factor
 100. Not only that, prototyping went quicker in D.

 Good that you point that out. Most people I know claim that 
 it's easier to develop/prototype with Matlab. Apart from 
 execution speed and fast prototyping, Matlab is proprietary. 
 This alone is a deal breaker.

Matlab is a linear algebra DSL with a general purpose mathematical programming language caked on to it, held in place purely by money and consumer lock-in.
Aug 04 2015
prev sibling parent reply "jmh530" <john.michael.hall gmail.com> writes:
On Tuesday, 4 August 2015 at 09:48:07 UTC, Chris wrote:
 I agree with bachmeier. You cannot go wrong. You mentioned 
 nested loops. D allows you to concatenate (or "pipe") loops. So 
 instead of

 foreach
 {
   foreach
   {
     foreach
     {
     }
   }
 }

 you have something like

 int[] numbers = [-2, 1, 6, -3, 10];
 foreach (ref n; numbers
   .map!(a => a * 5)  // multiply each value by 5
   .filter!(a => a > 0))  // filter values that are 0 or less
 {
   //  Do something
 }
I don't think I had seen an example like this before (though it is obvious in retrospect). Is there any advantage in terms of performance?
Aug 04 2015
next sibling parent "bachmeier" <no spam.com> writes:
On Tuesday, 4 August 2015 at 18:56:20 UTC, jmh530 wrote:

 I don't think I had seen an example like this before (though it 
 is obvious in retrospect). Is there any advantage in terms of 
 performance?
The big win is in terms of being able to write complicated, 
correct code easily. However, there was a recent thread on the 
topic:

http://forum.dlang.org/post/mailman.4829.1434623275.7663.digitalmars-d puremagic.com

Walter said, "I expect that at the moment, range+algorithms code 
will likely be somewhat slower than old fashioned loops. This is 
because code generators have been tuned for decades to do a 
great job with loops. There's no intrinsic reason why ranges 
must do worse, so I expect they'll achieve parity. Ranges can 
move ahead because they can reduce the algorithmic complexity, 
whereas user written loops tend to be suboptimal."
Aug 04 2015
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 4 August 2015 at 18:56:20 UTC, jmh530 wrote:
 On Tuesday, 4 August 2015 at 09:48:07 UTC, Chris wrote:
 I agree with bachmeier. You cannot go wrong. You mentioned 
 nested loops. D allows you to concatenate (or "pipe") loops. 
 So instead of

 foreach
 {
   foreach
   {
     foreach
     {
     }
   }
 }

 you have something like

 int[] numbers = [-2, 1, 6, -3, 10];
 foreach (ref n; numbers
   .map!(a => a * 5)  // multiply each value by 5
   .filter!(a => a > 0))  // filter values that are 0 or less
 {
   //  Do something
 }

 I don't think I had seen an example like this before (though 
 it is obvious in retrospect). Is there any advantage in terms 
 of performance?

ldc and gdc can often achieve parity with explicit loops.
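
If you want to check this for your own workload, one way is to write the same computation both ways and time each with your compiler and flags. The two functions below are only an illustrative sketch (the names are made up, and no timing results are claimed here); they compute the sum of the squares of the positive entries.

```d
import std.algorithm : filter, map, sum;
import std.stdio : writeln;

/// Explicit loop: sum of the squares of all positive entries.
double sumSquaresLoop(const(double)[] xs)
{
    double total = 0;
    foreach (x; xs)
        if (x > 0)
            total += x * x;
    return total;
}

/// The same computation written as a lazy range pipeline.
double sumSquaresRange(const(double)[] xs)
{
    return xs.filter!(x => x > 0).map!(x => x * x).sum(0.0);
}

void main()
{
    auto data = [-1.0, 2.0, 3.0];
    writeln(sumSquaresLoop(data));    // prints 13
    writeln(sumSquaresRange(data));   // prints 13
}
```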
Aug 04 2015
prev sibling next sibling parent reply "Yura" <min_yura mail.ru> writes:
Dear all,

Thank you for your replies. I am now really convinced that D is a 
decent choice for my project (I am also really happy to see that 
the forum is so active and that apparently many of you use D for 
your scientific projects). I am looking forward to writing the 
code. I had a very quick look at the lecture given at DConf 2015: 
a good talk, and I believe D has great promise in science. 
Perhaps the only problem is the lack of a mathematical library 
like numpy.

Until now I have usually written prototype algorithms in Python 
and then translated the code into C for speed. It would be a 
dream to use only one language. The dominant languages in science 
for production codes are now Fortran and C/C++; maybe D could 
become another option?

With kind regards,
Yury
Aug 05 2015
next sibling parent reply "Chris" <wendlec tcd.ie> writes:
On Wednesday, 5 August 2015 at 17:47:49 UTC, Yura wrote:
 Dear all,

 Thank you for your replies. I am now really convinced that D is 
 a decent choice for my project (also I am really happy to see 
 that the forum is really active and apparently many of you use 
 D for your scientific projects). I am just looking forward to 
 writing the code. I had a very quick look at lecture given at 
 DConf 2015 - good talk, and I believe D has a big promise in 
 Science. Perhaps the only problem being is the mathematical 
 library, like numpy.

 Until now I usually wrote the prototype algorithms in Python 
 and then translated the code onto C for speed. It would be just 
 dream to use only one language. The dominant languages in 
 science now for production codes are Fortran or C/C++, may be D 
 could become another option?

 With kind regards,
 Yury
I think NumPy was written in C(++) and is imported as a Python 
module. So if you can get your hands on the original underlying 
C(++) library, you can call NumPy directly from D, can't you? In 
case you do this, let me know how you fared with it. NumPy is 
usually the killer argument when people have to choose between 
Python and other languages.
Aug 05 2015
parent "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 5 August 2015 at 18:20:20 UTC, Chris wrote:
 On Wednesday, 5 August 2015 at 17:47:49 UTC, Yura wrote:
 Dear all,

 Thank you for your replies. I am now really convinced that D 
 is a decent choice for my project (also I am really happy to 
 see that the forum is really active and apparently many of you 
 use D for your scientific projects). I am just looking forward 
 to writing the code. I had a very quick look at lecture given 
 at DConf 2015 - good talk, and I believe D has a big promise 
 in Science. Perhaps the only problem being is the mathematical 
 library, like numpy.

 Until now I usually wrote the prototype algorithms in Python 
 and then translated the code onto C for speed. It would be 
 just dream to use only one language. The dominant languages in 
 science now for production codes are Fortran or C/C++, may be 
 D could become another option?

 With kind regards,
 Yury

 I think NumPy was written in C(++) and is imported as a Python 
 module. So if you can get your hands on the original 
 underlying C(++) library, you can call NumPy directly from D, 
 can't you? In case you do this, let me know how you fared with 
 it. NumPy is usually the killer argument when people have to 
 choose between Python and other languages.

Isn't the useful bit of NumPy more like a set of wrappers around other C/C++/Fortran libraries? And the whole point of NumPy is that you can call it easily from python, which doesn't make for a nice calling convention from D (although you can call from PyD if you want). Anyway, I agree that it's a big project, but not an infeasible one to implement similar functionality in D.
Aug 05 2015
prev sibling parent reply "bachmeier" <no spam.com> writes:
On Wednesday, 5 August 2015 at 17:47:49 UTC, Yura wrote:

 The dominant languages in science now for production codes are 
 Fortran or C/C++, may be D could become another option?

 With kind regards,
 Yury
Yes. The question is whether we can put together a group of 
developers to build the infrastructure, which is a lot more than 
just code. That means, in particular, good documentation and 
using it for our own projects.

Everyone these days talks about how Python is a powerhouse 
scientific programming language. A decade ago it was crap. I 
know, because I watched it for years wishing I could use it. 
There were some poorly documented, domain-specific, 
hacked-together libraries, but Python was not for the most part 
a suitable choice.

There is no reason we can't do the same for D. The main question 
is whether we are sufficiently committed to that goal. Others 
may consider Python, Julia, and Matlab to be good enough 
alternatives (I don't, but not everyone necessarily agrees with 
me).
Aug 05 2015
next sibling parent reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 5 August 2015 at 18:49:21 UTC, bachmeier wrote:
 On Wednesday, 5 August 2015 at 17:47:49 UTC, Yura wrote:

 The dominant languages in science now for production codes are 
 Fortran or C/C++, may be D could become another option?

 With kind regards,
 Yury

 Yes. The question is whether we can put together a group of 
 developers to build the infrastructure, which is a lot more 
 than just code. That means, in particular, good documentation 
 and using it for our own projects.

 Everyone these days talks about how Python is a powerhouse 
 scientific programming language. A decade ago it was crap. I 
 know, because I watched it for years wishing I could use it. 
 There were some poorly documented, domain-specific, 
 hacked-together libraries, but Python was not for the most 
 part a suitable choice.

 There is no reason we can't do the same for D. The main 
 question is whether we are sufficiently committed to that 
 goal. Others may consider Python, Julia, and Matlab to be good 
 enough alternatives (I don't, but not everyone necessarily 
 agrees with me).

Yes - I fully agree with everything you say here. John Colvin has done great work in putting together a scientific computing portal for D, and writing a wrapper so you can write D within an IPython notebook (and call it transparently from Python code). Great for rapid iteration and exploration of data, and it means you don't need to write the whole stack from scratch in D.

One place I started in a small way was implementing a limited subset of dataframe functionality. There's not much to it, but it's very handy to be able to slurp in data from CSV or HDF5 that might not fit in a more rigid format and do spreadsheet-type stuff with it. So instead of an array of variants, you define a type for each column (every entry in a column is the same type, but different columns may be of different types). In addition, each column has a name, and you can add and remove columns easily. I've implemented a just-about-usable version of that, but it's not pretty, rigorous, or especially efficient. The next stage is creating indexing by columns a la pandas.

Dataframes aren't intellectually very exciting, but they are very useful for iterative data exploration and quick prototyping since all of that starts with getting the data in from somewhere in a standard format.

The problem I have is that I have an ambitious project and too few resources for now. So I can't at this stage put much time into making anything someone else could use. But maybe we could work together on parts of this, if that would be interesting. I am speaking to Vlad Lefenfeld about this a bit too. On the pure numerical stuff, speak to John Colvin. If you want you can email me laeeth laeeth dot com.

Laeeth.
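A rough sketch of the column model described above, written in Python only to keep it short. It is not the D implementation mentioned in this post, and the names `Column` and `Frame` and all their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One named column; every entry must have the declared type."""
    name: str
    dtype: type
    values: list = field(default_factory=list)

    def append(self, value):
        # Enforce "every entry in the column is the same type".
        if not isinstance(value, self.dtype):
            raise TypeError(f"{self.name}: expected {self.dtype.__name__}")
        self.values.append(value)


class Frame:
    """A collection of equally long, individually typed columns."""

    def __init__(self):
        self.columns = {}  # column name -> Column, in insertion order

    def row_count(self):
        # All columns share one length, so the first one is representative.
        if not self.columns:
            return 0
        return len(next(iter(self.columns.values())).values)

    def add_column(self, name, dtype, values=()):
        col = Column(name, dtype)
        for v in values:
            col.append(v)
        if self.columns and len(col.values) != self.row_count():
            raise ValueError("column length does not match the frame")
        self.columns[name] = col

    def remove_column(self, name):
        del self.columns[name]

    def __getitem__(self, name):
        # Column lookup by name, loosely in the spirit of pandas.
        return self.columns[name].values


frame = Frame()
frame.add_column("atom", str, ["C", "H", "O"])
frame.add_column("mass", float, [12.011, 1.008, 15.999])
frame.remove_column("atom")
```

Storing one typed list per column, rather than one list of mixed-type rows, is what makes adding and removing columns cheap.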
Aug 05 2015
parent reply "jmh530" <john.michael.hall gmail.com> writes:
On Wednesday, 5 August 2015 at 23:37:37 UTC, Laeeth Isharc wrote:
 Dataframes aren't intellectually very exciting, but they are 
 very useful for iterative data exploration and quick 
 prototyping since all of that starts with getting the data in 
 from somewhere in a standard format.
May not be intellectually exciting, but look at how popular pandas is for Python.
Aug 05 2015
parent reply "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
On Thursday, 6 August 2015 at 02:38:15 UTC, jmh530 wrote:
 On Wednesday, 5 August 2015 at 23:37:37 UTC, Laeeth Isharc 
 wrote:
 Dataframes aren't intellectually very exciting, but they are 
 very useful for iterative data exploration and quick 
 prototyping since all of that starts with getting the data in 
 from somewhere in a standard format.
May not be intellectually exciting, but look at how popular pandas is for Python.
Yes - exactly my point. I figured I needed to acknowledge this, however, since smart people often work on what's stimulating rather than what's most useful, given the choice. Having dataframes will open up more possibilities. Via PyD and some extra glue, it shouldn't be hard to make D dataframes easily convertible to pandas dataframes and back. That's perfect for the IPython notebook because you then don't need to convert all your code at once, and you still have access to Python notebooks.
Aug 05 2015
parent "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
On Thursday, 6 August 2015 at 06:56:11 UTC, Laeeth Isharc wrote:
 On Thursday, 6 August 2015 at 02:38:15 UTC, jmh530 wrote:
 On Wednesday, 5 August 2015 at 23:37:37 UTC, Laeeth Isharc 
 wrote:
 Dataframes aren't intellectually very exciting, but they are 
 very useful for iterative data exploration and quick 
 prototyping since all of that starts with getting the data in 
 from somewhere in a standard format.
May not be intellectually exciting, but look at how popular pandas is for Python.
Yes - exactly my point. I figured I needed to acknowledge this, however, since smart people often work on what's stimulating rather than what's most useful, given the choice. Having dataframes will open up more possibilities. Via PyD and some extra glue, it shouldn't be hard to make D dataframes easily convertible to pandas dataframes and back. That's perfect for the IPython notebook because you then don't need to convert all your code at once, and you still have access to Python notebooks.
I mean access to Python libraries.
Aug 05 2015
prev sibling parent reply "Gerald Jansen" <gjansen ownmail.net> writes:
On Wednesday, 5 August 2015 at 18:49:21 UTC, bachmeier wrote:

 Yes. The question is whether we can put together a group of 
 developers to build the infrastructure, which is a lot more 
 than just code. That means, in particular, good documentation 
 and using it for our own projects.
Right on! I would be willing to help with documentation if there were a concerted effort in this direction. There have been a number of failed individual efforts over the years. So how can a group effort be promoted? Is the Dscience github project an adequate platform? How can other people get involved? Is a dedicated discussion group needed? Can we develop a plan of some sort rather than just a scatter of individual efforts?
 Everyone these days talks about how Python is a powerhouse 
 scientific programming language. A decade ago it was crap. I 
 know, because I watched it for years wishing I could use it. 
 There were some poorly documented, domain-specific, 
 hacked-together libraries, but Python was not for the most part 
 a suitable choice.
I started with Python in the years of the Numeric+numarray->NumPy transition. It was messy. Personally I think a unified library like NumPy, to underpin other more specialized libraries, is of paramount importance to any success of D in science. Ideally there would be a NumPy/ndarray usage-compatible module in D. That would make D much more attractive to potential Python converts and lower the entry barrier considerably.
Aug 06 2015
next sibling parent "ixid" <nuaccount gmail.com> writes:
On Thursday, 6 August 2015 at 08:11:49 UTC, Gerald Jansen wrote:
 Is the Dscience github project an adequate platform? How can 
 other people get involved? Is a dedicated discussion group 
 needed? Can we develop a plan of some sort rather than just a 
 scatter of individual efforts?
If a group do go ahead with this it would be good to add a discussion group on this forum for it to maximize visibility and involvement.
Aug 06 2015
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 6 August 2015 at 08:11:49 UTC, Gerald Jansen wrote:
 On Wednesday, 5 August 2015 at 18:49:21 UTC, bachmeier wrote:

 Yes. The question is whether we can put together a group of 
 developers to build the infrastructure, which is a lot more 
 than just code. That means, in particular, good documentation 
 and using it for our own projects.
Right on! I would be willing to help with documentation if there were a concerted effort in this direction. There have been a number of failed individual efforts over the years. So how can a group effort be promoted? Is the Dscience github project an adequate platform? How can other people get involved? Is a dedicated discussion group needed? Can we develop a plan of some sort rather than just a scatter of individual efforts?
Yes, come join https://github.com/DlangScience. Ilya and I have a plan of sorts, but it needs formally writing down. So far we have been using a private Gitter room for discussion, which has been OK but likely won't scale. Anyone serious about getting involved, drop a message here: https://gitter.im/DlangScience/public and we can start to build a picture of what expertise we have and what we're missing. A proper public forum could be great, but I don't personally have time to set up something like that at the moment.
Aug 06 2015
prev sibling parent reply "Yura" <min_yura mail.ru> writes:
Good afternoon, gentlemen,

I just want to describe my very limited experience. I have 
rewritten about half of my Python code in D, and it became about 
6 times faster. This is good news.

However, I was amazed by the performance of D vs Python for the 
following simple nested loops (see below). D was faster by two 
orders of magnitude!

Bearing in mind that Python is really used in computational 
chemistry/bioinformatics, I am sure D can be a good option in 
this field. In the modern strategy for computational software, 
Python is used as a glue language and the number-crunching parts 
are usually written in Fortran or C/C++. Apparently, with D one 
language can be used to write the entire code. Please also look 
at this article:

http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf

Also, I wonder about the results of this internship:

http://forum.dlang.org/post/laha9j$pc$1 digitalmars.com

With kind regards,
Yury


Python:


import sys, string, os, glob, random
from math import *

a = 0

l = 1000

for i in range(l):
         for j in range(l):
                 for m in range(l):
                         a = a +i*i*0.7+j*j*0.8+m*m*0.9

print a

D:

import std.stdio;
// command line argument
import std.getopt;
import std.string;
import std.array;
import std.conv;
import std.math;

// main program starts here
void main(string[] args) {


int l = 1000;
double a = 0;
for (auto i=0;i<l;i++){
         for (auto j=0;j<l;j++) {
                 for (auto m=0;m<l;m++) {
                         a = a + i*i*0.7+j*j*0.8+m*m*0.9;
                         }

         }
}
writeln(a);
}
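
For anyone repeating the comparison on Python 3, where `print a` is a syntax error, here is a hedged port of the Python loop with a simple timer. The function name `triple_loop` and the reduced bound are choices made for this sketch and are not part of the benchmark above.

```python
import time


def triple_loop(l):
    """Same accumulation as the Python listing above, for an arbitrary bound l."""
    a = 0.0
    for i in range(l):
        for j in range(l):
            for m in range(l):
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


if __name__ == "__main__":
    # The post uses l = 1000, i.e. 10**9 iterations, which can take
    # minutes in CPython; a smaller bound keeps the sketch quick to run.
    n = 100
    start = time.perf_counter()
    result = triple_loop(n)
    elapsed = time.perf_counter() - start
    print(result, f"({elapsed:.3f} s for l = {n})")
```

Timings from a reduced bound only indicate the trend; the numbers quoted in this thread refer to the full l = 1000 run.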
Aug 16 2015
next sibling parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 17/08/2015 1:11 a.m., Yura wrote:
 Good afternoon, gentlemen,

 I just want to describe my very limited experience. I have rewritten
 about half of my Python code in D, and it became about 6 times faster.
 This is good news.

 However, I was amazed by the performance of D vs Python for the following
 simple nested loops (see below). D was faster by two orders of magnitude!

 Bearing in mind that Python is really used in computational
 chemistry/bioinformatics, I am sure D can be a good option in this
 field. In the modern strategy for computational software, Python is
 used as a glue language and the number-crunching parts are usually
 written in Fortran or C/C++. Apparently, with D one language can be used
 to write the entire code. Please also look at this article:

 http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf

 Also, I wonder about the results of this internship:

 http://forum.dlang.org/post/laha9j$pc$1 digitalmars.com

 With kind regards,
 Yury


 Python:


 import sys, string, os, glob, random
 from math import *

 a = 0

 l = 1000

 for i in range(l):
          for j in range(l):
                  for m in range(l):
                          a = a +i*i*0.7+j*j*0.8+m*m*0.9

 print a

 D:

 import std.stdio;
 // command line argument
 import std.getopt;
 import std.string;
 import std.array;
 import std.conv;
 import std.math;

 // main program starts here
 void main(string[] args) {


 int l = 1000;
 double a = 0;
 for (auto i=0;i<l;i++){
          for (auto j=0;j<l;j++) {
                  for (auto m=0;m<l;m++) {
                          a = a + i*i*0.7+j*j*0.8+m*m*0.9;
                          }

          }
 }
 writeln(a);
 }
Any chance, when you have the time and the content, that you could write a research paper based on your use case? It would be amazing publicity, even more so if it got published! Otherwise, we could always do with another user story :)
Aug 16 2015
prev sibling next sibling parent reply "Idan Arye" <GenericNPC gmail.com> writes:
On Sunday, 16 August 2015 at 13:11:12 UTC, Yura wrote:
 Good afternoon, gentlemen,

 I just want to describe my very limited experience. I have 
 rewritten about half of my Python code in D, and it became about 
 6 times faster. This is good news.

 However, I was amazed by the performance of D vs Python for the 
 following simple nested loops (see below). D was faster by two 
 orders of magnitude!

 Bearing in mind that Python is really used in computational 
 chemistry/bioinformatics, I am sure D can be a good option in 
 this field. In the modern strategy for computational software, 
 Python is used as a glue language and the number-crunching 
 parts are usually written in Fortran or C/C++. 
 Apparently, with D one language can be used to write the entire 
 code. Please also look at this article:

 http://www.worldcomp-proceedings.com/proc/p2012/PDP3426.pdf

 Also, I wonder about the results of this internship:

 http://forum.dlang.org/post/laha9j$pc$1 digitalmars.com

 With kind regards,
 Yury


 Python:


 import sys, string, os, glob, random
 from math import *

 a = 0

 l = 1000

 for i in range(l):
         for j in range(l):
                 for m in range(l):
                         a = a +i*i*0.7+j*j*0.8+m*m*0.9

 print a

 D:

 import std.stdio;
 // command line argument
 import std.getopt;
 import std.string;
 import std.array;
 import std.conv;
 import std.math;

 // main program starts here
 void main(string[] args) {


 int l = 1000;
 double a = 0;
 for (auto i=0;i<l;i++){
         for (auto j=0;j<l;j++) {
                 for (auto m=0;m<l;m++) {
                         a = a + i*i*0.7+j*j*0.8+m*m*0.9;
                         }

         }
 }
 writeln(a);
 }
Initially I thought the Python version is so slow because it uses `range` instead of `xrange`, but I tried them both and they both take about the same, so I guess the Python JIT (or even interpreter!) can optimize these allocations away.

BTW - if you want to iterate over a range of numbers in D, you can use a foreach loop:

    foreach (i; 0 .. l) {
        foreach (j; 0 .. l) {
            foreach (m; 0 .. l) {
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
            }
        }
    }

Or, to make it look more like the Python version, you can iterate over a range-returning function:

    import std.range : iota;
    foreach (i; iota(l)) {
        foreach (j; iota(l)) {
            foreach (m; iota(l)) {
                a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
            }
        }
    }

There are also functions for building ranges from other ranges:

    import std.algorithm : cartesianProduct;
    import std.range : iota;
    foreach (i, j, m; cartesianProduct(iota(l), iota(l), iota(l))) {
        a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
    }

Keep in mind though that using these functions, while making the code more readable (to those with some experience in D, at least), is bad for performance - for my first version I got about 5 seconds when building with DMD in debug mode, while for the last version I get 13 seconds when building with LDC in release mode.
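
For comparison, the closest Python counterpart to the `cartesianProduct` form is `itertools.product`, which also flattens the three loops into one. This is only a small sketch: the function name `flat_loop` is made up for illustration, and no claim is made here about its speed relative to the nested version.

```python
from itertools import product


def flat_loop(l):
    """Accumulate over all (i, j, m) triples with a single flattened loop."""
    a = 0.0
    # product(range(l), repeat=3) yields (0, 0, 0), (0, 0, 1), ... in the
    # same order as three nested for-loops over range(l).
    for i, j, m in product(range(l), repeat=3):
        a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9
    return a


print(flat_loop(10))
```

Because the iteration order matches the nested loops, the floating-point additions happen in the same sequence and the result should be identical to the nested form.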
Aug 16 2015
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 16 August 2015 at 13:59:33 UTC, Idan Arye wrote:
 Initially I thought the Python version is so slow because it 
 uses `range` instead of `xrange`, but I tried them both and 
 they both take about the same, so I guess the Python JIT(or 
 even interpreter!) can optimize these allocations away.

 BTW - if you want to iterate over a range of numbers in D, you 
 can use a foreach loop:

     foreach (i; 0 .. l) {
         foreach (j; 0 .. l) {
             foreach (m; 0 .. l) {
                 a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
             }

         }
     }

 Or, to make it look more like the Python version, you can 
 iterate over a range-returning function:

     import std.range : iota;
     foreach (i; iota(l)) {
         foreach (j; iota(l)) {
             foreach (m; iota(l)) {
                 a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
             }

         }
     }

 There are also functions for building ranges from other ranges:

     import std.algorithm : cartesianProduct;
     import std.range : iota;
     foreach (i, j, m; cartesianProduct(iota(l), iota(l), iota(l))) {
         a = a + i * i * 0.7 + j * j * 0.8 + m * m * 0.9;
     }

 Keep in mind though that using these functions, while making 
 the code more readable(to those with some experience in D, at 
 least), is bad for performance - for my first version I got 
 about 5 seconds when building with DMD in debug mode, while for 
 the last version I get 13 seconds when building with LDC in 
 release mode.
There is a new implementation of cartesianProduct that makes the performance difference disappear for me with ldc and dmd. It's not in ldc's phobos yet so I had to copy it manually, but hopefully it will be in the next release.
Aug 17 2015
prev sibling parent "jmh530" <john.michael.hall gmail.com> writes:
On Sunday, 16 August 2015 at 13:11:12 UTC, Yura wrote:
 Python:


 import sys, string, os, glob, random
 from math import *

 a = 0

 l = 1000

 for i in range(l):
         for j in range(l):
                 for m in range(l):
                         a = a +i*i*0.7+j*j*0.8+m*m*0.9

 print a
While starting over with D might make a better framework for going forward, it might be less work than you think to speed up your existing Python code base. Loops in Python are notoriously slow, and the code you're using is a classic example of something that could be sped up. You could rewrite the slow parts with Cython, which compiles them to C. Alternatively, you could play with Numba's JIT.
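
Neither Cython nor Numba is shown here, since both need extra packages. As a standard-library-only illustration of how much slack this particular loop has: each term depends on just one index, so the triple sum factorises into l*l times a single sum of squares. The function name `closed_form` is hypothetical, and this is a different technique from the two tools named above.

```python
def closed_form(l):
    """Value of the triple loop without visiting every (i, j, m).

    sum over i, j, m of (0.7*i*i + 0.8*j*j + 0.9*m*m)
      = l*l * (0.7 + 0.8 + 0.9) * sum of i*i for i in range(l),
    because each term is constant across the two indices it does not use.
    """
    sum_sq = sum(i * i for i in range(l))  # exact integer arithmetic
    return l * l * (0.7 + 0.8 + 0.9) * sum_sq


print(closed_form(1000))
```

For l = 1000 this comes to roughly 7.988e14, matching the loop up to floating-point rounding. It only works because the benchmark body is so regular; real code rarely collapses like this, which is where Cython or Numba come in.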
Aug 17 2015