digitalmars.D - Suboptimal array copy in druntime?

Guillaume Chatelet <chatelet.guillaume gmail.com> writes:
I was looking at the _d_arrayassign family functions in druntime:
https://github.com/dlang/druntime/blob/master/src/rt/arrayassign.d#L47
https://github.com/dlang/druntime/blob/master/src/rt/arrayassign.d#L139

The code seems suboptimal for several reasons:

1. memcpy is more efficient on big arrays than copying element by 
element, a few bytes at a time, because it can use MMX/SSE/AVX. I 
would naturally memcpy the whole array and postblit/destroy the 
individual elements separately.

2. ti.destroy and ti.postblit are always called even though they 
might do nothing; since the code is not templated, the compiler 
can't eliminate the calls. How about caching in TypeInfo whether 
the type has a non-empty destructor / postblit and doing:

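   if (ti.hasDestroy)       // proposed cached flag
     foreach (i; 0 .. dst_array.length)
       ti.destroy(dst_array.ptr + i * ti.tsize);
   memcpy(dst_array.ptr, src_array.ptr, dst_array.length * ti.tsize);
   if (ti.hasPostBlit)      // proposed cached flag
     foreach (i; 0 .. dst_array.length)
       ti.postblit(dst_array.ptr + i * ti.tsize);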

Granted, in the worst case we iterate over the array several 
times; we could fall back to the current implementation if both 
are set.

Did I miss something?
Apr 16 2017
Stefan Koch <uplink.coder googlemail.com> writes:
On Sunday, 16 April 2017 at 10:08:22 UTC, Guillaume Chatelet 
wrote:
 I was looking at the _d_arrayassign family functions in 
 druntime:
 https://github.com/dlang/druntime/blob/master/src/rt/arrayassign.d#L47
 https://github.com/dlang/druntime/blob/master/src/rt/arrayassign.d#L139

 [...]
Nope. Those are valid points. Templatizing the code is the way to go.
Apr 16 2017
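
A minimal sketch of what the templated approach could look like. The name arrayAssign is hypothetical (it is not a druntime symbol), and the sketch assumes equal-length, non-overlapping slices and a recent compiler that supports destroy!false. Because the element type is a template parameter, `static if` drops the destructor and postblit loops entirely for plain data, leaving only the memcpy:

```d
import core.stdc.string : memcpy;
import std.traits : hasElaborateDestructor;

// Hypothetical helper, not a druntime symbol. Assumes dst and src have the
// same length and do not overlap.
void arrayAssign(T)(T[] dst, const(T)[] src) @trusted
{
    assert(dst.length == src.length, "length mismatch");

    // Decided at compile time: no destructor loop is emitted for plain data.
    static if (hasElaborateDestructor!T)
        foreach (ref e; dst)
            destroy!false(e); // run the destructor, skip re-initialization

    // One bulk copy for the whole array.
    memcpy(dst.ptr, src.ptr, T.sizeof * dst.length);

    // Likewise, the postblit loop exists only for types that have one.
    static if (__traits(hasMember, T, "__xpostblit"))
        foreach (ref e; dst)
            e.__xpostblit();
}

// Illustrative element type that counts postblit and destructor calls.
struct Counted
{
    static int copies, dtors;
    int value;
    this(this) { ++copies; }
    ~this() { ++dtors; }
}

void main()
{
    int[4] a = [1, 2, 3, 4];
    int[4] b;
    arrayAssign(b[], a[]); // both static-if branches vanish for int
    assert(b == a);

    Counted[3] src, dst;
    src[2].value = 3;
    arrayAssign(dst[], src[]);
    assert(Counted.dtors == 3 && Counted.copies == 3);
    assert(dst[2].value == 3);
}
```

Like the proposal in the first post, this destroys the old elements before copying, so it does not reproduce the ordering or exception behaviour of the existing hook.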
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Sunday, 16 April 2017 at 10:33:01 UTC, Stefan Koch wrote:
 Nope. Those are valid points. Templatizing the code is the way to go.
Indeed. See also http://dconf.org/2017/talks/cojocaru.html
Apr 16 2017
Guillaume Chatelet <chatelet.guillaume gmail.com> writes:
On Sunday, 16 April 2017 at 11:25:15 UTC, Nicholas Wilson wrote:
 Indeed. See also http://dconf.org/2017/talks/cojocaru.html
Sweet! Glad to see this is being worked on :)
Apr 16 2017
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Sunday, 16 April 2017 at 11:58:11 UTC, Guillaume Chatelet 
wrote:
 Sweet! Glad to see this is being worked on :)
Specifically, see these pull requests as an example of how the rest 
of druntime can be turned into templates:

https://github.com/dlang/dmd/pull/6597
https://github.com/dlang/druntime/pull/1781
https://github.com/dlang/dmd/pull/6634
https://github.com/dlang/druntime/pull/1792

I tested the array comparison lowering a couple of weeks ago and 
the results looked promising with regards to reducing link-time 
dependencies:

main.cpp:

#include <array>
#include <cstdio>

int compareArrays(const int *p1, size_t len1, const int *p2, size_t len2);

int main()
{
    std::array<int, 3> arr1 = {1, 2, 3};
    std::array<int, 3> arr2 = {1, 2, 4};
    int res = compareArrays(arr1.begin(), arr1.size(), arr2.begin(), arr2.size());
    printf("%d\n", res);
}

compare.d:

extern(C++) pure nothrow @nogc
int compareArrays(scope const(int)* p1, size_t len1, scope const(int)* p2, size_t len2)
{
    return p1[0 .. len1] < p2[0 .. len2];
}

extern(C) void _d_dso_registry() {}

$ ~/dlang/install.sh install dmd-nightly
Downloading and unpacking http://nightlies.dlang.org/dmd-master-2017-03-28/dmd.master.linux.tar.xz
dub-1.2.1 already installed

Run `source ~/dlang/dmd-master-2017-03-28/activate` in your shell to use dmd-master-2017-03-28.
This will setup PATH, LIBRARY_PATH, LD_LIBRARY_PATH, DMD, DC, and PS1.
Run `deactivate` later on to restore your environment.

$ source ~/dlang/dmd-master-2017-03-28/activate
$ g++ -std=c++11 -c main.cpp && \
  dmd -O -betterC -c compare.d && \
  g++ main.o compare.o -o d_array_compare
$ ./d_array_compare
1

$ nm compare.o
0000000000000000 t
0000000000000000 W _D6object12__T5__cmpTiZ5__cmpFNaNbNiNexAixAiZi
0000000000000000 T _d_dso_registry
                 U _d_dso_registry
                 U _GLOBAL_OFFSET_TABLE_
                 U __start_minfo
                 U __stop_minfo
0000000000000000 T _Z13compareArraysPKimS0_m

So choose your favorite under-performing runtime hook 
(https://wiki.dlang.org/Runtime_Hooks; 
https://github.com/dlang/dmd/blob/v2.074.0/src/ddmd/backend/rtlsym.h#L42 
- definitive list) and turn it into a template :P
Apr 16 2017
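
For readers following the nm output: the weak symbol _D6object12__T5__cmpTiZ5__cmpFNaNbNiNexAixAiZi appears to be the int instantiation of the object.__cmp template, emitted into compare.o itself rather than resolved against a druntime function. The program prints 1 because `<` yields a bool ([1, 2, 3] orders before [1, 2, 4]) that is converted to the int return type. A rough sketch of the kind of template such a lowering can target is below. The name cmpArrays is hypothetical and this is not the actual object.__cmp source:

```d
// Hypothetical stand-in for a templated comparison hook; not the actual
// object.__cmp implementation.
int cmpArrays(T)(scope const T[] lhs, scope const T[] rhs)
{
    immutable len = lhs.length < rhs.length ? lhs.length : rhs.length;
    foreach (i; 0 .. len)
    {
        if (lhs[i] < rhs[i]) return -1;
        if (lhs[i] > rhs[i]) return 1;
    }
    // Equal common prefix: the shorter array orders first.
    if (lhs.length < rhs.length) return -1;
    return lhs.length > rhs.length ? 1 : 0;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = [1, 2, 4];
    assert(cmpArrays(a, b) < 0);         // same ordering as a < b
    assert(cmpArrays(b, a) > 0);
    assert(cmpArrays(a, a) == 0);
    assert(cmpArrays(a[0 .. 2], a) < 0); // a prefix orders before the whole
}
```

Since the body is instantiated per element type, the optimizer sees the concrete comparison and no TypeInfo lookup is needed at run time.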