digitalmars.D.learn - missing data with parallel and stdin

moechofe <truc moechofe.com> writes:
Hi, I wrote a script that takes a list of files from STDIN, 
computes some stuff, and copies the files under new names.

I get 33k lines of input but only 3k-5k files in the destination 
folder.
This does not happen if I remove the .parallel() call.

What did I do wrong?

     void delegate(string source,string dest) handler;

     if(use_symlink) handler = delegate(string s,string d){
         symlink(s,d);
     }; else handler = delegate(string s,string d){
         copy(s,d);
     };

     foreach(entry; parallel(stdin.byLineCopy)) try
     {
         auto source = buildPath(static_path,entry);
         auto md5 = digest!MD5(File(source).byChunk(64*1024));
         auto hash = toHexString!(LetterCase.lower)(md5);
         auto file = text(hash,'_',baseName(entry));
         auto dest = buildPath(hashed_path,file);
         handler(source,dest);
         writeln(entry,' ',file);
     }
     catch(Exception e)
     {
         error("Couldn't read, hash or copy %s",entry);
     }
May 23 2016
Jack Stouffer <jack jackstouffer.com> writes:
On Monday, 23 May 2016 at 08:59:31 UTC, moechofe wrote:
     void delegate(string source,string dest) handler;

     if(use_symlink) handler = delegate(string s,string d){
         symlink(s,d);
     }; else handler = delegate(string s,string d){
         copy(s,d);
     };
Boy that's a confusing way to write that. Here's a clearer version:

     if(use_symlink)
         handler = delegate(string s,string d){ symlink(s,d); };
     else
         handler = delegate(string s,string d){ copy(s,d); };

 What did I do wrong?
Sounds like a data race problem. Use a lock on the file write operation and see if that helps.
May 23 2016
moechofe <truc moechofe.com> writes:
On Monday, 23 May 2016 at 14:16:13 UTC, Jack Stouffer wrote:
 Sounds like a data race problem. Use a lock on the file write 
 operation and see if that helps.
Like this?

     synchronized(mutex) copy(source,dest);

That didn't solve anything. What I observe is: when the process is slower, more files are copied.
May 23 2016
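
For reference, a minimal sketch of what locking the copy step could look like. The function name `copyAll` and its two array parameters are hypothetical and not part of the posted script. Only `copy` is guarded here, and the post above reports that a lock of this kind did not change the result, so this shows the mechanics of the lock rather than a fix.

```d
import core.sync.mutex : Mutex;
import std.file : copy;
import std.parallelism : parallel;

// Hypothetical helper: copies sources[i] to dests[i], one copy at a time.
void copyAll(string[] sources, string[] dests)
{
    // A local class reference is captured by the loop body, so all worker
    // threads lock the same object. A module-level `Mutex mutex;` would be
    // thread-local by default in D and would need __gshared or shared.
    auto mutex = new Mutex;

    foreach (i, source; parallel(sources))
    {
        synchronized (mutex)
        {
            copy(source, dests[i]);
        }
    }
}
```

Because the whole copy sits inside the lock, the copies themselves run one after another; only work placed outside the `synchronized` block would still run in parallel.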
Era Scarecrow <rtcvb32 yahoo.com> writes:
On Monday, 23 May 2016 at 15:53:23 UTC, moechofe wrote:
 On Monday, 23 May 2016 at 14:16:13 UTC, Jack Stouffer wrote:
 Sounds like a data race problem. Use a lock on the file write 
 operation and see if that helps.
That didn't solve anything. What I observe is: when the process is slower, more files are copied.
Last night I took the code sample and left copy out; everything else I got working. When I ran it, I noticed it only ran on one core, and it worked fine. However, when I put in a number for how many to work on at once (adding any number to parallel's call), it would crash the program quite often, generally because it couldn't close the files it was scanning. Looking over the documentation, you appear to be using parallel correctly, so I don't know why it isn't working.
May 23 2016
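
The thread does not settle on a cause. One way to narrow the problem down, offered as a diagnostic sketch rather than a confirmed fix, is to read all of stdin into an array before the parallel loop and count how many entries the loop body actually visits. The helper `processAll`, the `work` delegate, and the work unit size of 1 are assumptions made for illustration; the hash-and-copy body from the original post would go where the placeholder is.

```d
import core.atomic : atomicLoad, atomicOp;
import std.array : array;
import std.parallelism : parallel;
import std.stdio : stderr, stdin;

// Hypothetical helper: runs `work` on every entry in parallel and
// returns how many entries the loop body visited.
size_t processAll(string[] entries, void delegate(string) work)
{
    shared size_t visited = 0;

    // Second argument is the work unit size (1 is an assumed value).
    foreach (entry; parallel(entries, 1))
    {
        work(entry);
        atomicOp!"+="(visited, 1);
    }
    return atomicLoad(visited);
}

void main()
{
    // Read every line up front, so no worker thread touches stdin.
    string[] entries = stdin.byLineCopy.array;

    auto visited = processAll(entries, (string entry) {
        // Placeholder for the hash-and-copy body from the original post.
    });

    stderr.writefln("read %s lines, processed %s", entries.length, visited);
}
```

If the two numbers match here but files are still missing, the loss is more likely in the per-entry work (hashing, naming, copying) than in how the lines are handed to the worker threads.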