digitalmars.D - Scientific computing in D

reply Márcio Martins <marcioapm gmail.com> writes:
I have been running some MCMC simulations in Python and it's hard 
to cope with how unbelievably slow it is.
Takes me almost a minute to run a few hundred thousand samples on 
my laptop whereas I can run the same simulation with a million 
samples in under 100ms, on my phone with JavaScript on a browser.

Then, you spend a minute waiting for the simulation to finish, to 
find out you had an error in your report code that would have 
been easily caught with static typing. So annoying...

Is anyone doing similar stuff with D? Unfortunately, I couldn't 
find any plotting libraries nor MATLAB-like numerical/stats libs 
in dub.

This seems like another area where D could easily pick up 
momentum with RDMD and perhaps an integration with Jupyter which 
is becoming very very popular.
Nov 09 2015
next sibling parent cym13 <cpicard openmailbox.org> writes:
On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
 I have been running some MCMC simulations in Python and it's 
 hard to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples 
 on my laptop whereas I can run the same simulation with a 
 million samples in under 100ms, on my phone with JavaScript on 
 a browser.

 Then, you spend a minute waiting for the simulation to finish, 
 to find out you had an error in your report code that would 
 have been easily caught with static typing. So annoying...

 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats 
 libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
I think you may like to have a look at dscience (https://github.com/dscience-developers/dscience), a collective effort to give D tools for bioinformatics (mainly).
Also note that, if you want, you can just plug some Python code in using PyD (https://github.com/ariovistus/pyd). A concrete example can be found here: http://d.readthedocs.org/en/latest/examples.html#plotting-with-matplotlib-python
This video could also be of some interest to you: http://dconf.org/2015/talks/colvin.html
Nov 09 2015
prev sibling next sibling parent reply Gerald Jansen <gjansen ownmail.net> writes:
On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
 I have been running some MCMC simulations in Python ...

 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats 
 libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
see http://dlangscience.github.io/
Nov 09 2015
parent reply bachmeier <no spam.com> writes:
On Monday, 9 November 2015 at 20:30:49 UTC, Gerald Jansen wrote:
 On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins 
 wrote:
 I have been running some MCMC simulations in Python ...

 Is anyone doing similar stuff with D? Unfortunately, I 
 couldn't find any plotting libraries nor MATLAB-like 
 numerical/stats libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
see http://dlangscience.github.io/
And here is the gitter discussion site: https://gitter.im/DlangScience/public
I've got a project to embed D inside R on Linux: https://bitbucket.org/bachmeil/dmdinline2
Unfortunately the documentation isn't good. I'm currently working on going in the other direction, embedding R inside D. There are, of course, many good MCMC options in R that you could call from your D code.
Nov 09 2015
next sibling parent reply Idan Arye <GenericNPC gmail.com> writes:
On Monday, 9 November 2015 at 21:05:35 UTC, bachmeier wrote:
 On Monday, 9 November 2015 at 20:30:49 UTC, Gerald Jansen wrote:
 On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins 
 wrote:
 I have been running some MCMC simulations in Python ...

 Is anyone doing similar stuff with D? Unfortunately, I 
 couldn't find any plotting libraries nor MATLAB-like 
 numerical/stats libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
see http://dlangscience.github.io/
And here is the gitter discussion site: https://gitter.im/DlangScience/public
I've got a project to embed D inside R on Linux: https://bitbucket.org/bachmeil/dmdinline2
Unfortunately the documentation isn't good. I'm currently working on going in the other direction, embedding R inside D. There are, of course, many good MCMC options in R that you could call from your D code.
Weird approach. Usually, when one wants to use an interpreted language as a host to a compiled language, the strategy is to precompile the compiled language's code and load it as extensions in the interpreted code.
Nov 09 2015
parent reply bachmeier <no spam.net> writes:
On Tuesday, 10 November 2015 at 00:01:19 UTC, Idan Arye wrote:
 Weird approach. Usually, when one wants to use an interpreted 
 language as a host to a compiled language, the strategy is to 
 precompile the compiled language's code and load it as 
 extensions in the interpreted code.
That's what dmdinline does.
Nov 09 2015
parent reply Idan Arye <GenericNPC gmail.com> writes:
On Tuesday, 10 November 2015 at 01:53:25 UTC, bachmeier wrote:
 On Tuesday, 10 November 2015 at 00:01:19 UTC, Idan Arye wrote:
 Weird approach. Usually, when one wants to use an interpreted 
 language as a host to a compiled language, the strategy is to 
 precompile the compiled language's code and load it as 
 extensions in the interpreted code.
That's what dmdinline does.
From the examples, it seems like it doesn't. It seems like it's compiling D code on the fly, rather than loading pre-compiled libraries as R extensions.
Nov 10 2015
parent bachmeier <no spam.com> writes:
On Tuesday, 10 November 2015 at 10:18:14 UTC, Idan Arye wrote:
 That's what dmdinline does.
From the examples, it seems like it doesn't. It seems like it's compiling D code on the fly, rather than loading pre-compiled libraries as R extensions.
Okay, I see what you're saying. You are correct that the examples look like that. That's because it is common in the R community to write your C/C++ code inline. It becomes an interactive process, and it works well for debugging.
You can take the same D functions, insert them in a D file inside an R package, include a Makefile, and your D functions become part of that package. They'll be compiled when you install the package. That way you only compile once.
Nov 10 2015
prev sibling parent reply Laeeth Isharc <laeethnospam nospam.laeeth.com> writes:
On Monday, 9 November 2015 at 21:05:35 UTC, bachmeier wrote:
 On Monday, 9 November 2015 at 20:30:49 UTC, Gerald Jansen wrote:
 On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins 
 wrote:
 I have been running some MCMC simulations in Python ...

 Is anyone doing similar stuff with D? Unfortunately, I 
 couldn't find any plotting libraries nor MATLAB-like 
 numerical/stats libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
see http://dlangscience.github.io/
And here is the gitter discussion site: https://gitter.im/DlangScience/public
I've got a project to embed D inside R on Linux: https://bitbucket.org/bachmeil/dmdinline2
Unfortunately the documentation isn't good. I'm currently working on going in the other direction, embedding R inside D. There are, of course, many good MCMC options in R that you could call from your D code.
Hi bachmeier. Hope you're well. What's the current status of calling R from D and D from R? A friend who is global head of derivatives research in London for a bank was asking me as he is receptive to exploring alternatives. It's for research not production so rough around the edges is acceptable - provided one knows what one is dealing with beforehand. Thanks. Laeeth.
Feb 05
next sibling parent bachmeier <no spam.com> writes:
On Friday, 5 February 2016 at 20:13:42 UTC, Laeeth Isharc wrote:
 On Monday, 9 November 2015 at 21:05:35 UTC, bachmeier wrote:
 On Monday, 9 November 2015 at 20:30:49 UTC, Gerald Jansen 
 wrote:
 On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins 
 wrote:
 I have been running some MCMC simulations in Python ...

 Is anyone doing similar stuff with D? Unfortunately, I 
 couldn't find any plotting libraries nor MATLAB-like 
 numerical/stats libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
see http://dlangscience.github.io/
And here is the gitter discussion site: https://gitter.im/DlangScience/public
I've got a project to embed D inside R on Linux: https://bitbucket.org/bachmeil/dmdinline2
Unfortunately the documentation isn't good. I'm currently working on going in the other direction, embedding R inside D. There are, of course, many good MCMC options in R that you could call from your D code.
Hi bachmeier. Hope you're well. What's the current status of calling R from D and D from R? A friend who is global head of derivatives research in London for a bank was asking me as he is receptive to exploring alternatives. It's for research not production so rough around the edges is acceptable - provided one knows what one is dealing with beforehand. Thanks. Laeeth.
On Linux, it works fine in either direction, though my recent efforts have been on embedding R inside D, so the underlying library for that is much more advanced than the older library for calling D from R. I've set up a simple HTML page here: http://lancebachmeier.com/rdlang/
I've gotten most of it working on Windows, but it's been slow, since I don't use or understand that OS for development. I expect to have it working fully in the next couple of weeks. It would probably be easy for someone with Windows experience to get it working now.
If you have any questions, send an email to the address at this page: http://www.k-state.edu/economics/staff/bios/bachmeier.html
Feb 05
prev sibling parent bachmeier <no spam.net> writes:
On Friday, 5 February 2016 at 20:13:42 UTC, Laeeth Isharc wrote:

 Hi bachmeier.

 Hope you're well.

 What's the current status of calling R from D and D from R?  A 
 friend who is global head of derivatives research in London for 
 a bank was asking me as he is receptive to exploring 
 alternatives.  It's for research not production so rough around 
 the edges is acceptable - provided one knows what one is 
 dealing with beforehand.


 Thanks.


 Laeeth.
Something I didn't emphasize before is that embedding R inside D does more than open the door to mixing R and D code together in one program. We now have an interface available in D to any compiled C, C++, or Fortran library for which there is an R package. You can call those libraries from D, no glue code required and using the existing documentation for the R package, yet it's fully efficient because you're passing a pointer to the data from D into the underlying library. R code only gets in the way (i.e., slows your program down) if you want to run functions written in R.
Feb 06
prev sibling next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Mon, 2015-11-09 at 19:31 +0000, Márcio Martins via Digitalmars-d wrote:
 I have been running some MCMC simulations in Python and it's hard 
 to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples on 
 my laptop whereas I can run the same simulation with a million 
 samples in under 100ms, on my phone with JavaScript on a browser.
Are you using NumPy with Python, or pure Python? In either case you will be better served by profiling your code to find out which part is actually the performance bottleneck and then doing one of:
1. Use Numba to JIT the Python and achieve native code speed.
2. Replace the Python code with D code using either CFFI or PyD.
3. Replace the Python code with Chapel code using either CFFI or PyChapel.
You could also use C++ or Rust via CFFI, but I have little experience using Rust with Python and am trying to avoid C++.
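For anyone following along, the "profile first" step might look roughly like the sketch below. It is not the original poster's code (that was never shown in the thread): `log_target`, `metropolis` and `profile_sampler` are made-up names, and the target is just a standard normal so that the example stays self-contained. Only the standard library is used.

```python
import cProfile
import io
import math
import pstats
import random


def log_target(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def metropolis(n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler; returns the list of draws."""
    rng = random.Random(seed)
    x = 0.0
    lp = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_target(proposal)
        # Accept with probability min(1, exp(lp_new - lp)).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            x, lp = proposal, lp_new
        samples.append(x)
    return samples


def profile_sampler(n_samples=20_000):
    """Run the sampler under cProfile; return (samples, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    samples = metropolis(n_samples)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return samples, buf.getvalue()


if __name__ == "__main__":
    draws, report = profile_sampler()
    print(report)
```

The report shows where the time goes (the sampler loop itself, the log-density, the random number calls), which is the information needed before choosing between options 1 to 3.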
 Then, you spend a minute waiting for the simulation to finish, to 
 find out you had an error in your report code that would have 
 been easily caught with static typing. So annoying...

 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats libs 
 in dub.
No need if you leave the coordination code in Python and then use Matplotlib, etc. Just put the computationally expensive code into Chapel or D, and leave the rest of your code in Python.
 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter which 
 is becoming very very popular.
Jupyter (née IPython) is only popular in one workflow: creating papers for people to read and play with the executable code fragments. This is a big area but only one of many.
Nov 10 2015
parent Márcio Martins <marcioapm gmail.com> writes:
On Tuesday, 10 November 2015 at 10:13:46 UTC, Russel Winder wrote:
 On Mon, 2015-11-09 at 19:31 +0000, Márcio Martins via Digitalmars-d wrote:
 I have been running some MCMC simulations in Python and it's hard 
 to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples on 
 my laptop whereas I can run the same simulation with a million 
 samples in under 100ms, on my phone with JavaScript on a browser.
Are you using NumPy with Python, or pure Python? In either case you will be better served by profiling your code to find out which part is actually the performance bottleneck and then doing one of:
1. Use Numba to JIT the Python and achieve native code speed.
2. Replace the Python code with D code using either CFFI or PyD.
3. Replace the Python code with Chapel code using either CFFI or PyChapel.
You could also use C++ or Rust via CFFI, but I have little experience using Rust with Python and am trying to avoid C++.
 Then, you spend a minute waiting for the simulation to finish, 
 to find out you had an error in your report code that would 
 have been easily caught with static typing. So annoying...
 
 Is anyone doing similar stuff with D? Unfortunately, I 
 couldn't find any plotting libraries nor MATLAB-like 
 numerical/stats libs in dub.
No need if you leave the coordination code in Python and then use Matplotlib, etc. Just put the computationally expensive code into Chapel or D, and leave the rest of your code in Python.
 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
Jupyter (née IPython) is only popular in one workflow: creating papers for people to read and play with the executable code fragments. This is a big area but only one of many.
Numba worked great! I had never heard of it before, but it seems like an elegant enough and practical solution. Still slightly slower than JavaScript, but fast enough for prototyping.
No wonder Python is so popular despite being so slow: everything just works and feels elegant, with no frustration. Going from not even having Python installed to having everything set up with Miniconda and my code running took under 10 minutes. The only thing that wasn't smooth was the speed, which was really frustrating at the time but in a way was to be expected, and that has now been elegantly fixed with Numba. No hacks, no platform issues, just silky smoothness :)
Thanks!
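For readers who have not seen Numba, the change is typically a one-line decorator on the hot loop. The sketch below is an illustration only, not the code discussed above: `chain_mean` is a made-up function, the target is a standard normal, and the `try`/`except` fallback (which simply runs plain Python when Numba is not installed) is an addition so that the example runs anywhere.

```python
import math
import random
import time

try:
    from numba import njit  # optional third-party dependency
except ImportError:
    def njit(func):
        """Fallback when Numba is absent: use the plain Python function."""
        return func


@njit
def chain_mean(n_samples, step, seed):
    """Mean of a random-walk Metropolis chain targeting a standard normal."""
    random.seed(seed)  # under Numba this seeds Numba's own generator
    x = 0.0
    total = 0.0
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        # Log acceptance ratio for the standard normal target.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0.0 or random.random() < math.exp(log_ratio):
            x = proposal
        total += x
    return total / n_samples


if __name__ == "__main__":
    chain_mean(10, 1.0, 0)  # with Numba, the first call triggers compilation
    start = time.perf_counter()
    mean = chain_mean(200_000, 1.0, 42)
    print(f"mean={mean:.3f} in {time.perf_counter() - start:.3f}s")
```

The loop sticks to scalars and to `math`/`random` calls on purpose, since that is the kind of code a JIT compiler handles best; timings will of course depend on the machine and on whether Numba is actually present.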
Nov 10 2015
prev sibling next sibling parent Chris <wendlec tcd.ie> writes:
On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
 I have been running some MCMC simulations in Python and it's 
 hard to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples 
 on my laptop whereas I can run the same simulation with a 
 million samples in under 100ms, on my phone with JavaScript on 
 a browser.

 Then, you spend a minute waiting for the simulation to finish, 
 to find out you had an error in your report code that would 
 have been easily caught with static typing. So annoying...

 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats 
 libs in dub.

 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
To give D a try in this field is definitely a good idea. It will pay in the long run, I believe. Where D lacks libraries and the like, there's always the option to interface to C (and C++ to a certain extent).
Nov 10 2015
prev sibling next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
 I have been running some MCMC simulations in Python and it's 
 hard to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples 
 on my laptop whereas I can run the same simulation with a 
 million samples in under 100ms, on my phone with JavaScript on 
 a browser.

 Then, you spend a minute waiting for the simulation to finish, 
 to find out you had an error in your report code that would 
 have been easily caught with static typing. So annoying...
No matter how interested I am in Bayesian statistics, I would think that an MCMC library is relatively lower in importance than a number of other libraries.
I've written some Gibbs samplers in Matlab and Python. They are quite slow in those languages, but I didn't notice that the Python code was more than 2X or so slower than the Matlab code, and I believe that was almost entirely due to Matlab using Intel MKL and Numpy using a slightly less efficient implementation.
While I learned a lot about MCMC by writing my own Gibbs samplers, I don't write them much anymore. I make more use of MC Stan, which can be called from Python with PyStan (maybe easier if you're on Linux than Windows, I tend to use rstan more which works easily with Windows). PyMC is another option that I've heard good things about, but I haven't tried it.
I think the simplest way forward would be something like wrappers to functionality in other languages. With PyD, it shouldn't be inconceivable to have a wrapper to PyMC. Alternately, MC Stan is written in C++ and has interfaces to a number of languages. Being able to call Stan from D would be cool, especially since I don't even think there's a C++ interface yet (you have to use the command line or R or whatever). I have essentially no idea how to do that.
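For context, a hand-written Gibbs sampler of the kind mentioned above can be very small. The sketch below is a textbook toy, not code from this thread: it samples a bivariate standard normal with correlation `rho` by alternating the two conditional draws, x | y ~ N(rho*y, 1 - rho^2) and y | x ~ N(rho*x, 1 - rho^2). The function names are made up for the illustration.

```python
import math
import random


def gibbs_bivariate_normal(n_samples, rho, burn_in=500, seed=0):
    """Gibbs sampler for a bivariate standard normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of each conditional
    x = y = 0.0
    draws = []
    for i in range(n_samples + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)  # draw y given the new x
        if i >= burn_in:
            draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(50_000, rho=0.8)
    print(f"estimated correlation: {sample_correlation(draws):.3f}")
```

The inner loop is all scalar arithmetic and random draws, which is exactly the pattern that runs slowly in an interpreter and that a compiled language or a JIT handles well.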
 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats 
 libs in dub.
My attitude is the more the better.
 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
That would be interesting, but I'm not sure how high a priority it is.
Nov 10 2015
prev sibling parent reply Laeeth Isharc <nospamlaeeth nospamlaeeth.com> writes:
On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
 I have been running some MCMC simulations in Python and it's 
 hard to cope with how unbelievably slow it is.
 Takes me almost a minute to run a few hundred thousand samples 
 on my laptop whereas I can run the same simulation with a 
 million samples in under 100ms, on my phone with JavaScript on 
 a browser.

 Then, you spend a minute waiting for the simulation to finish, 
 to find out you had an error in your report code that would 
 have been easily caught with static typing. So annoying...

 Is anyone doing similar stuff with D? Unfortunately, I couldn't 
 find any plotting libraries nor MATLAB-like numerical/stats 
 libs in dub.
dlangscience has some energy and people behind it. John Colvin is heavily involved with it, but it's a joint project, and there are others too. He wrote a draft white paper really thinking through the best design approach, and that will pay dividends over time, but in the meantime it's a central point for different scientific computing libraries.
plotting is a work in progress, I think. there are some options. for my stuff, it's not particularly clever so I use the D bindings to mathgl (a nice and simple C library), but depending on what you want to do, other choices may be more suitable.
it's pretty easy to call python libraries from D, and I have done that initially for plotting using bokeh. the only problem was for callbacks that then meant potentially dealing with javascript as well, and I drew the line at three languages for such a simple thing.
 This seems like another area where D could easily pick up 
 momentum with RDMD and perhaps an integration with Jupyter 
 which is becoming very very popular.
Actually John Colvin has written an extension called pydmagic that allows you to write D code within a Jupyter notebook - it integrates with PyD so you can call D from python and call python from D (even embedded as a string if you like). It's not yet highly-polished, but it works, and I have used it to get work done.
Nov 10 2015
parent jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 11 November 2015 at 03:29:56 UTC, Laeeth Isharc 
wrote:
 plotting is a work in progress, I think.  there are some 
 options.
  for my stuff, it's not particularly clever so I use the D 
 bindings to mathgl (a nice and simple C library), but depending 
 on what you want to do, other choices may be more suitable.
I just came across https://github.com/BlackEdder/ggplotd which looks promising.
Nov 18 2015