
digitalmars.D - My Framework wishlist For D

bioinfornatics <bioinfornatics fedoraproject.org> writes:
First, my needs are around data processing and knowledge 
extraction, so this is not a generalization of everyone's needs. 
Moreover, some of these tools/frameworks already have a D 
alternative (often not mature enough).

Data computing:
  - job scheduling (YARN from Hadoop, Celery from the Python 
world, or Slurm from the HPC world)
  - data storage: at least read and write Parquet files (through 
the Apache Arrow lib)
  - multinode processing, as done by Ray: 
https://docs.ray.io/en/master/
  - data processing «à la» Pandas/Dask
  - scipy and numpy libraries
  - a web project generator, as done by JHipster: 
https://www.jhipster.tech/
  - an AI library (maybe); if we can store to Parquet, that 
implies we are able to load the data from Python and run 
TensorFlow, PyTorch, or others …

and many other things
Apr 28
bachmeier <no spam.net> writes:
On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics wrote:
 [...]
Which of these can be done by calling other languages (easy to handle) and which would need to be written in D (probably won't happen)? Is Windows support necessary or is WSL sufficient?
Apr 28
jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 28 April 2021 at 15:20:40 UTC, bachmeier wrote:
 On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics 
 wrote:
 [...]
Which of these can be done by calling other languages (easy to handle) and which would need to be written in D (probably won't happen)? Is Windows support necessary or is WSL sufficient?
For some of these, the OP references libraries, like numpy/scipy/pandas, that are largely user-friendly wrappers over C or Fortran libraries, at least for the most computationally intensive parts. Wrapping something like LAPACK in D is doing the same thing, and you get the benefit of keeping it all in D (the scripting languages may also have used different default settings, which trade performance for accuracy).

Apache Arrow is a C++ library. I don't have any idea how difficult it would be to get working in D, but there is a GLib version with a C API that should be easy enough to get working in D.
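The GLib C API can in principle be declared directly from D with extern(C), without generating full bindings first. The sketch below is a hypothetical illustration, assuming parquet-glib and arrow-glib are installed and linked; the function names follow Arrow GLib's naming convention, but exact signatures should be checked against the installed headers before relying on this.

```d
// Hedged sketch: reading a Parquet file via the Arrow GLib C API from D.
// Assumes linking against -lparquet-glib -larrow-glib -lglib-2.0.
// Opaque handles are modelled as void*; real bindings would use proper types.
extern (C)
{
    struct GError { uint domain; int code; char* message; }

    void* gparquet_arrow_file_reader_new_path(const(char)* path, GError** error);
    void* gparquet_arrow_file_reader_read_table(void* reader, GError** error);
    ulong garrow_table_get_n_rows(void* table);
}

void main()
{
    import std.stdio : writeln;
    import std.string : fromStringz;

    GError* err = null;
    auto reader = gparquet_arrow_file_reader_new_path("data.parquet", &err);
    if (reader is null)
    {
        writeln("open failed: ", err.message.fromStringz);
        return;
    }

    auto table = gparquet_arrow_file_reader_read_table(reader, &err);
    if (table !is null)
        writeln("rows: ", garrow_table_get_n_rows(table));
}
```

A generated binding would expose the full API; the point here is only that the C entry points are directly callable from D.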
Apr 28
bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Wednesday, 28 April 2021 at 15:20:40 UTC, bachmeier wrote:
 On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics 
 wrote:
 [...]
Which of these can be done by calling other languages (easy to handle) and which would need to be written in D (probably won't happen)? Is Windows support necessary or is WSL sufficient?
Yes, I agree some of these can be done by calling other languages. I work only on Linux, as our webapps run on this OS.
Apr 28
Tobias Pankrath <tobias+dlang pankrath.net> writes:
On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics wrote:
 [...]
I don't know what JHipster does exactly, but something simple to quickly set up microservices would be great. I think most of the ingredients are already here, but not neatly packaged.
Apr 28
bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Wednesday, 28 April 2021 at 16:01:46 UTC, Tobias Pankrath 
wrote:
 On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics 
 wrote:
 [...]
I don't know what JHipster does exactly, but something simple to quickly set up microservices would be great. I think most of the ingredients are already here, but not neatly packaged.
Yes, it is a project generator which generates all the configuration for a microservice architecture (with Consul), with SSO (through Keycloak or others) and an ORM (Hibernate) included.
Apr 28
Andre Pany <andre s-e-a-p.de> writes:
On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics wrote:
 [...]
Regarding reading and writing Parquet files using Apache Arrow: this is more or less easily possible. You can use dpp, but it takes some small effort afterwards; see here: https://github.com/atilaneves/dpp/issues/242

Kind regards
Andre
Apr 28
bioinfornatics <bioinfornatics fedoraproject.org> writes:
On Wednesday, 28 April 2021 at 18:44:52 UTC, Andre Pany wrote:
 On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics 
 wrote:
 [...]
Regarding reading and writing Parquet files using Apache arrow, this is more or less easily possible. You can use DPP, but you have some small effort afterwards, see here https://github.com/atilaneves/dpp/issues/242 Kind regards Andre
Yes, of course, with some effort it is possible, but that means the ecosystem is not ready, and you will lose every possible candidate who might choose D. In C++ you have Apache Arrow and DataFrame (https://github.com/hosseinmoein/DataFrame) in place of Python's pyarrow/pandas, and they are ready to use.
Apr 28
Andre Pany <andre s-e-a-p.de> writes:
On Wednesday, 28 April 2021 at 19:23:44 UTC, bioinfornatics wrote:
 On Wednesday, 28 April 2021 at 18:44:52 UTC, Andre Pany wrote:
 On Wednesday, 28 April 2021 at 12:47:49 UTC, bioinfornatics 
 wrote:
 [...]
Regarding reading and writing Parquet files using Apache arrow, this is more or less easily possible. You can use DPP, but you have some small effort afterwards, see here https://github.com/atilaneves/dpp/issues/242 Kind regards Andre
Yes of course with some effort it is possible that means the ecosystem is not ready for and you will loose Every possible candidate to choose D in c++ you have apache arrow and Dataframe (https://github.com/hosseinmoein/DataFrame) in place of python (pyarrow/pandas) . And it is ready to use
Yes, I am working in the area of big data / cloud with Python (numpy/pandas) and D. And yes, you are right: while the dub package list is growing, the scientific area is really small. You have to leave the happy path and invest 3 to 4 hours to get Parquet working with D.

This is my personal opinion: every minute I had to invest additionally to get everything running in D was a success. I was bitten so many times by Python issues and invested many hours to solve them... Now I have smoothly running D code working hand in hand with Python code. And every few weeks something else breaks on the Python side, while D continues to run smoothly. You are right that, at the moment, you need to be enthusiastic about D to set up a scientific application, but it is totally worth it.

For me, the biggest blocker to getting a numpy-like library in D is the missing named arguments feature (the DIP is accepted).

Kind regards
Andre
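As a hypothetical illustration of why named arguments matter for a numpy-like API: with many defaulted parameters, call sites become unreadable without them. Assuming a compiler that implements the accepted DIP, a linspace-style helper (a made-up example, not an existing library) could be called like this:

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// A numpy.linspace-like helper: num evenly spaced samples from start to stop.
double[] linspace(double start = 0.0, double stop = 1.0,
                  size_t num = 50, bool endpoint = true)
{
    if (num == 1)
        return [start];
    immutable step = (stop - start) / (endpoint ? num - 1 : num);
    return iota(num).map!(i => start + i * step).array;
}

void main()
{
    // With named arguments, only the interesting defaults are overridden,
    // instead of spelling out every positional parameter up to `num`:
    auto xs = linspace(stop: 10.0, num: 11);
    assert(xs[0] == 0.0 && xs[$ - 1] == 10.0);
}
```

Without named arguments, overriding only `num` forces the caller to repeat `start` and `stop` positionally, which is exactly what makes numpy-style APIs pleasant in Python.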
Apr 29