digitalmars.D.learn - How to share an appender!string?

reply Thorsten Sommer <vektoren gmail.com> writes:
Dear community,

I tried to create a kind of collector module which collects 
strings by using a shared appender!string. Why? To collect KPIs 
in a large program across multiple classes and threads and store 
them later as e.g. a CSV file in order to analyse them with R.

But I failed...

Attempt #1: I tried what I would do in Java and C#: using a 
Singleton pattern. But it was not as easy to create in D, and I 
found out that today the Singleton is considered an 
anti-pattern. My conclusion was: Okay, that is right. Why should 
I force OOP in D to realize a Singleton? I can use a different 
paradigm with D -- so just create a module with a shared 
appender!string ... cf. attempt #2.


Attempt #2: https://dpaste.dzfl.pl/6035b6fdd4bd
My second attempt was to make my appender "sb" shared. But I got 
this compiler error:

std/array.d(3118): Error: 'auto' can only be used as part of 
'auto ref' for template function parameters

Did I use the wrong syntax or is this a bug?


Attempt #3: https://dpaste.dzfl.pl/1e5df9cf08c6
Okay, the third attempt. I searched around and found the 
Appender!(shared(string)) syntax. That version compiles and it 
runs fine. But the "sb" is not shared, because the "four" output 
is missing (cf. dpaste).


Can anyone help me? I am stuck...


Best regards,
Thorsten
May 19 2016
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
At this point I'd recommend you just ignore Appender. Write your own. After all, all it does is this:

    T[] data_;
    size_t realLen;

    void add(T v) {
        // Grow the backing array when it is full.
        if (realLen + 1 > data_.length) {
            data_.length += x; // x: a growth increment of your choice
        }
        data_[realLen] = v;
        realLen++;
    }

    T[] data() {
        // Only the part that has actually been filled.
        return data_[0 .. realLen];
    }

Okay okay, I simplified things quite a bit.
May 19 2016
parent reply Thorsten Sommer <vektoren gmail.com> writes:
On Thursday, 19 May 2016 at 10:13:21 UTC, rikki cattermole wrote:
 At this point I'd recommend you to just ignore Appender.
 Write your own.

Dear rikki,

Thanks for the proposal :) Here is the new attempt #4 as a simple test case:
https://dpaste.dzfl.pl/f6a9663320e5

It compiles & runs, but the array of strings is not shared across threads :( I am sure that I missed something about the shared() concept... Hmm...

Best regards,
Thorsten
May 19 2016
next sibling parent reply Rene Zwanenburg <renezwanenburg gmail.com> writes:
On Thursday, 19 May 2016 at 10:41:14 UTC, Thorsten Sommer wrote:
 Here is the new attempt #4 as simple test case:
 https://dpaste.dzfl.pl/f6a9663320e5

 It compiles & runs, but the array of strings is not shared
 across threads :(

Calling task() only creates a Task; you also have to start it somehow. The documentation contains an example:

https://dlang.org/phobos/std_parallelism.html#.task
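
The point about task() can be sketched as follows. This is only an illustration, not the code from the dpaste link: makeLine is a hypothetical worker function, and the two ways of starting a task (executeInNewThread and taskPool.put) are the ones std.parallelism documents.

```d
import std.parallelism : task, taskPool;
import std.stdio : writeln;

// Hypothetical worker: returns the line it would have collected.
string makeLine(int id)
{
    import std.conv : to;
    return "line " ~ id.to!string;
}

void main()
{
    // task() only builds the Task object; nothing runs yet.
    auto t1 = task!makeLine(1);
    auto t2 = task!makeLine(2);

    // Start one in a dedicated thread, the other on the global task pool.
    t1.executeInNewThread();
    taskPool.put(t2);

    // yieldForce blocks until the task has finished and returns its result.
    writeln(t1.yieldForce);
    writeln(t2.yieldForce);
}
```

Without the executeInNewThread() or taskPool.put() call the worker never runs, which matches the symptom of the array staying empty.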
May 19 2016
parent reply Rene Zwanenburg <renezwanenburg gmail.com> writes:
On Thursday, 19 May 2016 at 10:58:42 UTC, Rene Zwanenburg wrote:
 Calling task() only creates a Task, you also have to start it 
 somehow. The documentation contains an example:

 https://dlang.org/phobos/std_parallelism.html#.task
I should add that a single shared array will cause contention if the number of calls to addLine is large compared to the amount of work done. If performance is a problem, another way is to use a thread-local array and merge the arrays into a single global one on thread termination. Something like this:

    import core.sync.mutex : Mutex;

    // Thread-local. Can be appended to without locking.
    string[] lines;

    __gshared Mutex mutex;
    shared static this()
    {
        mutex = new Mutex();
    }

    __gshared string[] allLines;

    // Thread-local module destructor: runs once per thread at termination.
    static ~this()
    {
        synchronized(mutex)
        {
            allLines ~= lines;
        }
    }

This will not preserve order of course, and you'll have to make sure all your worker threads are terminated before using allLines. Both can be fixed if required but will make the code more complicated.

The upside is that this will avoid contention on the mutex. You'll still have to be careful with the GC though: just about every GC operation takes a global lock, so it doesn't play nice with high-performance multi-threaded code.
May 19 2016
parent reply Thorsten Sommer <vektoren gmail.com> writes:
Dear all,

I am done :) Thanks Kagamin, Rene and rikki for the help.

Short answers:
 Rene: You are right, I missed starting that task, i.e. the 
thread. Before, I used spawn(), where the thread starts 
immediately. But spawn() crashes dpaste.pl...

 rikki: Yes, I know what you mentioned ;) I just constructed a 
very simple and short test code on dpaste.pl and not the full 
implementation of your proposal. Just the basic idea, which is 
great. In my final solution, I still follow your advice: I 
dropped appender and use a simple data type instead.

 Rene: Thanks for the great idea with the destructor and the 
thread-local data :) I adapted that for my solution. The order of 
the entries does not matter in my case.

 Kagamin: I do not know why it crashes, but I know where: it 
comes from the spawn() call.


Issue analysis: My main issue was that main() did not wait 
for the new thread (I used spawn() before I opened this 
discussion). Thus, a simple thread_joinAll(); solved that.
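
The missing wait can be sketched like this. It is a minimal illustration, not the code from the final dpaste: the names collected, worker and runWorkers are hypothetical, and the global store is assumed to be a __gshared array guarded by a synchronized statement.

```d
import core.thread : thread_joinAll;
import std.concurrency : spawn;
import std.conv : to;

// Hypothetical global store. __gshared is not checked by the type system,
// so every concurrent append goes through the synchronized statement below.
__gshared string[] collected;

void worker(int id)
{
    synchronized
    {
        collected ~= "kpi from worker " ~ id.to!string;
    }
}

size_t runWorkers(int count)
{
    collected = null;
    foreach (id; 0 .. count)
        spawn(&worker, id);

    // Without this call the caller may read `collected` before
    // the spawned threads have appended anything.
    thread_joinAll();
    return collected.length;
}

void main()
{
    import std.stdio : writeln;
    writeln(runWorkers(4)); // prints 4
}
```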

For the archive -- my final solution with comments:
https://dpaste.dzfl.pl/3a34df24ed6c


Best regards,
Thorsten
May 19 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 19 May 2016 at 13:33:50 UTC, Thorsten Sommer wrote:
 Issue analysis: My main issue was that main() did not 
 wait for the new thread (I used spawn() before I opened this 
 discussion). Thus, a simple thread_joinAll(); solved that.

Since each thread can run at different times, having thread_joinAll() would be best at the end of a loop or before writing the output.

An alternative to writing a custom appender is simply to make the assignment atomic. Haven't tried this, but if you did 'shared string[] lines;' then you could build the string and then append it to lines. You could also avoid adding newlines, since they would be appended afterwards by a helper function:

    import std.stdio : File;

    void writeLines(File output) {
        foreach(ln; lines) {
            output.writeln(ln);
        }
    }

I actually wonder how much of it would have to be shared at that point. Since strings are immutable, returning/assigning strings is safe once set; the only thing that needs to be shared is the array itself, since it grows and may be reallocated.

I'll experiment with this and get back to you. Multi-threading isn't my strong suit either.
May 19 2016
parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Thursday, 19 May 2016 at 19:31:26 UTC, Era Scarecrow wrote:
 An alternate to writing a custom appender is simply to make the 
 assignment atomic. Haven't tried this but if you did 'shared 
 string[] lines;'

  I'll experiment with this and get back with you. 
 Multi-threading isn't my strong suit either.
Experimented and quickly got what looks like good clean results. Took your code, ripped out what I didn't want and added in what I did. Simple! https://dpaste.dzfl.pl/6952fdf463b66
May 19 2016
parent reply captaindet <2krnk gmx.net> writes:
On 2016-05-20 07:49, Era Scarecrow wrote:
   Experimented and quickly got what looks like good clean results. Took
 your code, ripped out what I didn't want and added in what I did. Simple!

   https://dpaste.dzfl.pl/6952fdf463b66
i am most curious about your solution.

why does printAll() have a synchronized block? in case you would call it before thread_joinAll() i.e. before all threads are terminated?
then again, why is there a synchronized block necessary in printAll() at all? it is only reading out data, not writing.

(i am still learning the subtleties of multithreading.)

/det
May 19 2016
parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Friday, 20 May 2016 at 02:04:56 UTC, captaindet wrote:
 i am most curious about your solution.

 why does printAll() have a synchronized block? in case you would 
 call it before thread_joinAll() i.e. before all threads are 
 terminated?
 then again, why is there a synchronized block necessary in 
 printAll() at all? it is only reading out data, not writing.

If a local copy/slice/range is made for foreach (which I think it is) then synchronized isn't needed. If reallocation worked differently, it could be quite annoying to have the memory reallocated while you were using it; but since it doesn't, I just threw it in for completeness' sake.

 (i am still learning the subtleties of multithreading.)

I tried to learn using C/C++ mutexes and semaphores and got hopelessly lost; never tried to really get into it. Found a whole new meaning to the process while just recently re-reading the D2 book. Regardless, let's learn this stuff together :)

Still, if you have any more complex commands than what you're doing here, you might wrap it into a class. It appears synchronized can work on classes (even Object) and not require a separate mutex for it to compile, so... I'm not sure what that means, or if it's even safe.
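
On the question of synchronized working on class objects: in D every class instance carries its own monitor, so synchronized (obj) can lock on the object itself without a separate Mutex. A minimal sketch of the "wrap it into a class" idea, where Collector, addLine, snapshot and fill are hypothetical names and not taken from the dpaste code:

```d
import core.thread : Thread;

// Hypothetical collector: all access to `lines` goes through the
// object's own monitor via synchronized (this).
final class Collector
{
    private string[] lines;

    void addLine(string line)
    {
        synchronized (this) { lines ~= line; }
    }

    // Returns a copy so callers never observe the array mid-append.
    string[] snapshot()
    {
        synchronized (this) { return lines.dup; }
    }
}

size_t fill(Collector c, int threadCount, int perThread)
{
    Thread[] threads;
    foreach (i; 0 .. threadCount)
    {
        auto t = new Thread({
            foreach (j; 0 .. perThread)
                c.addLine("entry");
        });
        t.start();
        threads ~= t;
    }
    // Wait for every writer before reading the result.
    foreach (t; threads)
        t.join();
    return c.snapshot.length;
}

void main()
{
    import std.stdio : writeln;
    writeln(fill(new Collector, 4, 100)); // prints 400
}
```

Locking in snapshot() is what makes a read safe while writers may still be running; after all threads have been joined it is no longer strictly needed.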
May 19 2016
prev sibling parent rikki cattermole <rikki cattermole.co.nz> writes:
On 19/05/2016 10:41 PM, Thorsten Sommer wrote:
 Here is the new attempt #4 as simple test case:
 https://dpaste.dzfl.pl/f6a9663320e5

 It compiles & runs, but the array of strings is not shared across
 threads :(

What I meant was for you to create e.g. a struct that you can control to meet your needs. Not to declare an empty class and make your data global.

    struct MyAppender(T) {
        T[] data_; // renamed so it does not clash with data() below
        size_t realLen;

        void add(T v) {...}

        T[] data() { return data_[0 .. realLen]; }
    }

    void main() {
        import std.stdio : writeln;

        MyAppender!char stuff;
        stuff.add('a');
        writeln(stuff.data);
    }

Although based upon your posts, I'd say you should focus more on learning the language and less on threading. I.e.

    immutable A obj = new A();

is probably not doing what you think it is.
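
One possible way to fill in the elided add() body, as a runnable sketch. The growth policy (start at 16 elements, then double) is an assumption made for illustration; the post does not specify one, and this version is not thread-safe on its own.

```d
struct MyAppender(T)
{
    private T[] data_;
    private size_t realLen;

    void add(T v)
    {
        // Assumed growth policy: start at 16 slots, then double.
        if (realLen == data_.length)
            data_.length = data_.length ? data_.length * 2 : 16;
        data_[realLen++] = v;
    }

    // Only the filled part of the backing array is exposed.
    T[] data() { return data_[0 .. realLen]; }
}

void main()
{
    import std.stdio : writeln;

    MyAppender!char stuff;
    stuff.add('a');
    stuff.add('b');
    writeln(stuff.data); // prints ab
}
```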
May 19 2016
prev sibling parent Kagamin <spam here.lot> writes:
I'd say do something like https://dpaste.dzfl.pl/e9a2327ff2a1
Any idea why it crashes?
May 19 2016