digitalmars.D.learn - Strange Bug in LDC vs DMD

FoxyBrown <Foxy Brown.IPT> writes:
I am using dirEntries to iterate over files in order to rename
them.

I am renaming them inside the loop (this will change, but I added
the code for testing).

In DMD the renaming works, but in LDC the renaming fails. It fails
in a way that I can't quite pin down, and I cannot debug it because
Visual D is not working properly for LDC.

The code essentially looks like the following:


	auto dFiles = dirEntries(loc, mask, _mode);

	foreach (d; dFiles)
	{
		// Recompute derives the new file name from the current one
		auto newName = Recompute(d.name);
		writeln(newName);
		rename(d.name, newName);
	}

but when I comment out rename, it works under LDC.

The funny thing is that newName is printed wrong, so Recompute is
affected by the rename.

This shouldn't occur.

Now, dirEntries is a range, so I'm curious whether the
recomputation is occurring after the rename (if it did, the
recomputation would be invalid and would produce the results I am
seeing)?

When I forcibly convert dirEntries into an array (manually,
unfortunately, as I can't seem to use array() on dirEntries),
everything works.


	struct file { public string name; }

	auto dFiles = dirEntries(loc, mask, _mode);

	// Copy every name out of the range before renaming anything
	file[] files;

	foreach (d; dFiles)
	{
		file f;
		f.name = d.name;
		files ~= f;
	}

	foreach (d; files)
	{
		auto newName = Recompute(d.name);
		writeln(newName);
		rename(d.name, newName);
	}

While it works, the main issue is that different behavior is
observed between DMD and LDC in the first case.

It would be nice to know how to simplify the code, though, so that
the "lazy" evaluation of dirEntries does not occur.
Jun 30 2017
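
The manual copy loop in the post above can probably be shortened with std.array.array, which consumes an input range up front and returns a plain array. The following is a minimal sketch under stated assumptions, not the poster's actual code: recompute, renameAll, the ".renamed" suffix, and the scratch directory name are hypothetical stand-ins, and SpanMode.shallow is assumed for the unspecified _mode.

```d
import std.array : array;
import std.file : dirEntries, exists, mkdirRecurse, rename, rmdirRecurse,
    SpanMode, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

// Hypothetical stand-in for the poster's Recompute(): it only appends a suffix.
string recompute(string name)
{
    return name ~ ".renamed";
}

void renameAll(string loc, string mask)
{
    // .array walks the lazy range to its end before any rename happens,
    // so the loop below iterates over a fixed snapshot of the directory.
    auto entries = dirEntries(loc, mask, SpanMode.shallow).array;

    foreach (d; entries)
    {
        auto newName = recompute(d.name);
        writeln(d.name, " -> ", newName);
        rename(d.name, newName);
    }
}

void main()
{
    // Hypothetical scratch directory, recreated on every run.
    auto dir = buildPath(tempDir, "dirEntries_snapshot_demo");
    if (dir.exists)
        rmdirRecurse(dir);
    mkdirRecurse(dir);
    scope (exit) rmdirRecurse(dir);

    foreach (n; ["a.txt", "b.txt"])
        write(buildPath(dir, n), "");

    renameAll(dir, "*.txt");
}
```

If array() does not compile on dirEntries, a missing import of std.array is one possible cause; that is a guess, since the post does not show the error.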
Murzistor <murzistor mail.ru> writes:
On Friday, 30 June 2017 at 12:50:24 UTC, FoxyBrown wrote:

 The funny thing is, newName is printed wrong so Recompute is 
 effected by the rename.
Does LDC use Unicode? Or, maybe, standard library under LDC does not support Unicode - then it is a serious bug. Do you use any non-ASCII symbols? Maybe, the Recompute() function returns a non-unicode string under LDC?
Jun 30 2017
FoxyBrown <Foxy Brown.IPT> writes:
On Friday, 30 June 2017 at 15:07:29 UTC, Murzistor wrote:
 On Friday, 30 June 2017 at 12:50:24 UTC, FoxyBrown wrote:

 The funny thing is, newName is printed wrong so Recompute is 
 effected by the rename.
Does LDC use Unicode? Or, maybe, standard library under LDC does not support Unicode - then it is a serious bug. Do you use any non-ASCII symbols? Maybe, the Recompute() function returns a non-unicode string under LDC?
None of these reasons make sense. They do not take into account that simply pre-evaluating dirEntries makes it work, nor the fact that I said that commenting out the rename also makes it work.
Jun 30 2017
"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Fri, Jun 30, 2017 at 12:50:24PM +0000, FoxyBrown via Digitalmars-d-learn
wrote:
 I am using dirEntries to iterate over files to rename them.
 
 I am renaming them in a loop(will change but added code for testing).
 
 
 In DMD the renaming works but in LDC the renaming fails. It fails in a
 way that I can't quite tell and I cannot debug because visual D is not
 working properly for LDC.
 
 The code essentially look like the following:
 
 
 	auto dFiles = dirEntries(loc, mask, _mode);
 	
 	foreach (d; dFiles)
 	{		
 
            auto newName = Recompute(d.name)
            writeln(newName);
            rename(d.name, newName);
         }
 
 but when I comment out rename, it works under LDC.
 
 The funny thing is, newName is printed wrong so Recompute is effected
 by the rename.
 
 This shouldn't occur.
[...]

This sounds very strange. What exactly do you mean by "newName is
printed wrong"? Do you mean that somehow it's getting affected by the
*subsequent* rename()? That would be truly strange.

Or do you mean that newName doesn't match what you expect Recompute to
do given d.name? Perhaps you should also print out d.name along with
newName just to be sure?

Do you have a reduced code example that's compilable/runnable? It's
rather hard to tell what's wrong based on your incomplete snippet.

T

-- 
An imaginary friend squared is a real enemy.
Jun 30 2017
FoxyBrown <Foxy Brown.IPT> writes:
On Friday, 30 June 2017 at 17:32:33 UTC, H. S. Teoh wrote:
 On Fri, Jun 30, 2017 at 12:50:24PM +0000, FoxyBrown via 
 Digitalmars-d-learn wrote:
 I am using dirEntries to iterate over files to rename them.
 
 I am renaming them in a loop(will change but added code for 
 testing).
 
 
 In DMD the renaming works but in LDC the renaming fails. It 
 fails in a way that I can't quite tell and I cannot debug 
 because visual D is not working properly for LDC.
 
 The code essentially look like the following:
 
 
 	auto dFiles = dirEntries(loc, mask, _mode);
 
 	foreach (d; dFiles)
 	{
 
            auto newName = Recompute(d.name)
            writeln(newName);
            rename(d.name, newName);
         }
 
 but when I comment out rename, it works under LDC.
 
 The funny thing is, newName is printed wrong so Recompute is 
 effected by the rename.
 
 This shouldn't occur.
[...] This sounds very strange. What exactly do you mean by "newName is printed wrong"? Do you mean that somehow it's getting affected by the *subsequent* rename()? That would be truly strange. Or do you mean that newName doesn't match what you expect Recompute to do given d.name? Perhaps you should also print out d.name along with newName just to be sure? Do you have a reduced code example that's compilable/runnable? It's rather hard to tell what's wrong based on your incomplete snippet. T
No, if I simply comment out the rename line, then the writeln output
changes. Simple as that. No other logic changes in the code. This
means that the rename is affecting the output.

The recompute code gets the filename, does a computation on it, then
returns it. The loop prints it out, then renames that file to the
newly computed file name.

The only way this can happen is if the rename command is somehow
feeding back into the algorithm. Since the problem goes away when I
pre-compute dirEntries, it suggests that dirEntries is being lazily
computed.

If that is the case, then the problem is easily understood: the file
gets renamed, dirEntries iterates over the renamed file again, and the
name gets recomputed again, but this time the result is bogus because
it is a double recompute, which is meaningless in this program.

I'm pretty sure that the analysis above is correct, that is,
dirEntries is lazy and ends up picking up the renamed file. This is
sort of like removing an element from an array while iterating over
the array.

The odd thing is that DMD does not produce the same result. I do not
know if there is a difference between the LDC and DMD dirEntries code
(or lazily evaluated code in general) or if it has to do with speed
(possibly the renames are cached and do not show up immediately to
dirEntries with the slower DMD?).

I do not have any simplified code and I'm moving on from here. It
should be easy to mock something up. The main thing to do is to rename
the files based on something in the file name.

E.g., suppose you have the files 1, 2, 3, 4, 5 (those are their names)
and your recompute function extracts each filename and multiplies it
by 10.

You should end up with 10, 20, 30, 40, 50.

But if the cause I'm describing is in fact the issue, one doesn't
necessarily get that, because some files will be iterated more than
once, e.g., maybe 10, 100, 1000, 20, 200, 30, 40, 50, 500.

I am doing it over a lot of files, by the way, but that is essentially
what is going on. The example above should be easy to do, since one
can simply to!int the filename, multiply it by 10, and then rename the
file to that.

I have moved on to avoid dirEntries completely and simply use the OS
directory listing function manually to extract the data, but this
should be investigated because, if the behavior is what I am
describing, a serious bug exists somewhere. (If someone could confirm
that dirEntries is a lazy range, then it would explain the problem,
but not necessarily why DMD and LDC differ, with DMD seeming to
function as expected.)
Jun 30 2017
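
The mock-up described in the post above (files named 1 to 5, each renamed to ten times its number) could be sketched as follows. This is an illustrative reduction, not code from the thread: timesTen, the scratch directory name, and the cap of 50 visits are hypothetical choices, and to!long is used in place of to!int so that repeated renames overflow later. The loop renames while iterating the lazy range, so the number of visits it reports may vary with the operating system and file system; no particular output is claimed here.

```d
import std.conv : to;
import std.file : dirEntries, exists, mkdirRecurse, rename, rmdirRecurse,
    SpanMode, tempDir, write;
import std.path : baseName, buildPath, dirName;
import std.stdio : writeln;

// The "recompute" from the post: read the file name as a number, multiply by 10.
string timesTen(string path)
{
    auto n = baseName(path).to!long * 10;
    return buildPath(dirName(path), n.to!string);
}

void main()
{
    // Hypothetical scratch directory, recreated on every run.
    auto dir = buildPath(tempDir, "dirEntries_rename_repro");
    if (dir.exists)
        rmdirRecurse(dir);
    mkdirRecurse(dir);
    scope (exit) rmdirRecurse(dir);

    foreach (i; 1 .. 6)
        write(buildPath(dir, i.to!string), "");

    size_t visits;
    // Renaming inside the lazy iteration is the pattern under suspicion.
    foreach (d; dirEntries(dir, SpanMode.shallow))
    {
        if (visits == 50)
            break; // arbitrary cap so a feedback loop cannot run away
        if (!d.name.exists)
            continue; // skip an entry whose file was already renamed away
        ++visits;

        auto newName = timesTen(d.name);
        writeln(d.name, " -> ", newName);
        rename(d.name, newName);
    }

    // More than 5 visits would mean some file was picked up again after its rename.
    writeln("entries visited: ", visits);
}
```

Building the same file with both DMD and LDC and comparing the reported visit counts would be one way to check whether the two compilers really differ.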
"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Fri, Jun 30, 2017 at 07:57:22PM +0000, FoxyBrown via Digitalmars-d-learn
wrote:
[...]
 The only way this can happen is if the rename command is somehow
 feeding back in to the algorithm. Since the problem goes away when I
 pre-compute dirEntries, it suggests that dirEntries is being lazily
 computed.
Um... the docs explicitly say that dirEntries is lazy, did you not see
that?

	https://dlang.org/phobos/std_file#dirEntries

	"Returns an input range of DirEntry that *lazily* iterates a given
	directory, ..." [emphasis mine]

[...]
 I'm pretty sure that the analysis above is correct, that is,
 dirEntries is lazy and ends up picking up the renamed file. This is
 sort of like removing an element in an array while iterating over the
 array.
This is certainly what it looks like; however, it doesn't explain a
couple of things:

1) Why the DMD version appears to be unaffected, since as far as I can
   tell from the code, it is also a lazy iteration;

2) On Linux at least, renaming a file does not move the location of
   the entry in the directory, so whether dirEntries is lazy or not
   shouldn't even matter in the first place.

Does Windows reorder the directory when you rename files? E.g., if you
set the folder to sort alphabetically, does it actually sort the
directory, or does it only sort the GUI output? My guess is that the
sort order only affects the GUI output, as it would be grossly
inefficient to actually sort the directory. But you never know with
Windows...

In any case, if Windows *does* physically sort the folder, that could
explain how rename() affects dirEntries. However, this still doesn't
explain the discrepancy between DMD and LDC.

Actually, now that I think of it... Linux may do the same thing if the
new filename is longer and doesn't fit in the old slot. So that could
explain (2). However, why the difference between DMD and LDC? It
doesn't make sense to me, if you tested both on the same OS.

Here's a way to rule out (2): instead of using the current working
directory, change the code to create a fresh copy of the directory
each time. Does DMD / LDC still show a difference?

The idea here is that it may have been a coincidence that you saw LDC
having the problem and DMD not, since whether or not a renamed file
gets moved depends on how big the current slot for its name is in the
directory, and if you've already done a bunch of operations on the
directory, some slots will be bigger and some will be smaller, so some
renames will happen in-place whereas others will cause a reordering.
It could be you just got unlucky with LDC and caused a reordering,
whereas you got lucky with DMD and the existing slots were already big
enough so the problem isn't visible.

OTOH, this still doesn't explain why calling the OS functions directly
fixes the problem. If there is a bug in LDC's version of dirEntries
somewhere, we'd like to know about it so that we can fix it.

T

-- 
Without outlines, life would be pointless.
Jun 30 2017
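
The "fresh copy of the directory each time" suggestion above could be approximated with a small helper that rebuilds a working directory from a pristine source before each run, so that every DMD or LDC run starts from the same newly written directory. This is only a sketch under assumptions: freshCopy and the directory names are hypothetical, only regular files directly inside the source are copied, and whether a new copy actually resets the on-disk slot layout depends on the file system.

```d
import std.array : array;
import std.file : copy, dirEntries, exists, mkdirRecurse, rmdirRecurse,
    SpanMode, tempDir, write;
import std.path : baseName, buildPath;
import std.stdio : writeln;

// Recreate `dst` from scratch and copy the regular files found directly in `src`.
void freshCopy(string src, string dst)
{
    if (dst.exists)
        rmdirRecurse(dst);
    mkdirRecurse(dst);

    // Snapshot the source listing first; the source itself is never modified.
    foreach (e; dirEntries(src, SpanMode.shallow).array)
    {
        if (e.isFile)
            copy(e.name, buildPath(dst, baseName(e.name)));
    }
}

void main()
{
    // Hypothetical directories used only for this demonstration.
    auto src = buildPath(tempDir, "fresh_copy_src");
    auto dst = buildPath(tempDir, "fresh_copy_dst");

    if (src.exists)
        rmdirRecurse(src);
    mkdirRecurse(src);
    scope (exit)
    {
        rmdirRecurse(src);
        if (dst.exists)
            rmdirRecurse(dst);
    }

    write(buildPath(src, "1"), "");
    write(buildPath(src, "2"), "");

    freshCopy(src, dst);
    writeln(dirEntries(dst, SpanMode.shallow).array.length, " files copied");
}
```

The rename loop under test would then be pointed at the copy rather than at the original directory.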
FoxyBrown <Foxy Brown.IPT> writes:
On Friday, 30 June 2017 at 20:13:37 UTC, H. S. Teoh wrote:
 On Fri, Jun 30, 2017 at 07:57:22PM +0000, FoxyBrown via 
 Digitalmars-d-learn wrote: [...]
 [...]
Um... the docs explicit say that dirEntries is lazy, did you not see that? [...]
It is possible that DMD has the same problem but I did not see it. What I did was develop the program using DMD; then, once it was working, I switched to LDC for the release build and immediately saw that the results were different, and wrong. Since I had been debugging it the whole time and it was working fine with DMD, I simply assumed DMD was working. Since dirEntries is a lazy range, I'm sure that is the problem.
Jun 30 2017