
digitalmars.D.learn - Performance Issue

reply Vino.B <vino.bheeman hotmail.com> writes:
Hi,

  The code below consumes a lot of memory and is slow. Can you 
suggest how to overcome these issues?

string[][] csizeDirList (string FFs, int SizeDir) {
    ulong subdirTotal = 0;
    ulong subdirTotalGB;
    auto Subdata = appender!(string[][]);
    auto dFiles = dirEntries(FFs, SpanMode.shallow)
        .filter!(a => a.isDir && !globMatch(a.baseName, "*DND*"))
        .map!(a => tuple(a.name, a.size)).array;
    foreach (d; dFiles) {
        auto SdFiles = dirEntries(join(["\\\\?\\", d[0]]), SpanMode.depth)
            .map!(a => tuple(a.size)).array;
        foreach (f; parallel(SdFiles,1))
            { subdirTotal += f[0]; }
        subdirTotalGB = (subdirTotal/1024/1024);
        if (subdirTotalGB > SizeDir) { Subdata ~= [d[0], to!string(subdirTotalGB)]; }
        subdirTotal = 0;
    }
    return Subdata.data;
}

From,
Vino.B
Sep 05
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 5 September 2017 at 09:44:09 UTC, Vino.B wrote:
 Hi,

  The below code is consume more memory and slower can you 
 provide your suggestion on how to over come these issues.

 [...]
Much slower than what?
Sep 05
parent reply Vino.B <vino.bheeman hotmail.com> writes:
On Tuesday, 5 September 2017 at 10:28:28 UTC, Stefan Koch wrote:
 On Tuesday, 5 September 2017 at 09:44:09 UTC, Vino.B wrote:
 Hi,

  The below code is consume more memory and slower can you 
 provide your suggestion on how to over come these issues.

 [...]
 Much slower than what?
Hi,

  This code is used to get the size of folders on a NetApp NAS 
filesystem. NetApp has its own tool for this task, which is faster 
than this code; the difference is about 15-20 minutes.

  While going through this website I found that we can use "fold" 
from std.algorithm.iteration, which would be faster than the normal 
"+=", so I tried replacing the line "{ subdirTotal += f[0]; }" with 
"{ subdirTotal = f[0].fold!( (a, b) => a + b); }". This produces the 
required output plus additional output: in the next line of the code 
I ask to list only folders that are greater than 10 MB, but it now 
lists all folders (folders whose size is less than 10 MB are also 
listed), not sure why.

Program:
string[][] coSizeDirList (string FFs, int SizeDir) {
    ulong subdirTotal = 0;
    ulong subdirTotalGB;
    auto Subdata = appender!(string[][]);
    Subdata.reserve(100);
    auto dFiles = dirEntries(FFs, SpanMode.shallow)
        .filter!(a => a.isDir && !globMatch(a.baseName, "*DND*"))
        .map!(a => tuple(a.name, a.size)).array;
    foreach (d; dFiles) {
        auto SdFiles = dirEntries(join(["\\\\?\\", d[0]]), SpanMode.depth)
            .map!(a => tuple(a.size)).array;
        foreach (f; parallel(SdFiles,1))
            { subdirTotal = f[0].fold!( (a, b) => a + b); }
        subdirTotalGB = (subdirTotal/1024/1024);
        if (subdirTotalGB > SizeDir) { Subdata ~= [d[0], to!string(subdirTotalGB)]; }
        subdirTotal = 0;
    }
    return Subdata.data;
}

Output:
C:\Temp\TEAM1\dir1 -> Size greater than 10MB
C:\Temp\TEAM1\dir2 -> Size lesser than 10MB.

From,
Vino.B
Sep 06
parent reply Azi Hassan <azi.hassan live.fr> writes:
On Wednesday, 6 September 2017 at 08:10:35 UTC, Vino.B wrote:
 in the next line of the code i say to list only folders that 
 are greater than 10 Mb but this now is listing all folder 
 (folder whose size is less than 10 MB are getting listed, not 
 sure why.
Is the size in GB ? If so, then subdirTotalGB = (subdirTotal/1024/1024); needs to become subdirTotalGB = (subdirTotal/1024/1024/1024); for it to take effect. But do correct me if I'm wrong, I still haven't had my morning coffee.
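
To make the unit question concrete, here is a minimal sketch of the two conversions. The helper names bytesToMB and bytesToGB are made up for illustration and do not appear in the original program; the only point is that two divisions by 1024 give megabytes and a third gives gigabytes.

```d
import std.stdio : writeln;

// Hypothetical helpers (not from the original code).
// Integer division truncates, so partial units are dropped.
ulong bytesToMB(ulong bytes) { return bytes / 1024 / 1024; }
ulong bytesToGB(ulong bytes) { return bytes / 1024 / 1024 / 1024; }

void main()
{
    // An assumed folder total: 15 MB expressed in bytes.
    ulong subdirTotal = 15UL * 1024 * 1024;

    writeln(bytesToMB(subdirTotal), " MB"); // two divisions: megabytes
    writeln(bytesToGB(subdirTotal), " GB"); // three divisions: gigabytes
}
```

With these inputs the first line prints 15 MB and the second prints 0 GB, so a variable named subdirTotalGB that is divided only twice actually holds megabytes.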
Sep 06
parent reply Vino.B <vino.bheeman hotmail.com> writes:
On Wednesday, 6 September 2017 at 10:58:25 UTC, Azi Hassan wrote:
 On Wednesday, 6 September 2017 at 08:10:35 UTC, Vino.B wrote:
 in the next line of the code i say to list only folders that 
 are greater than 10 Mb but this now is listing all folder 
 (folder whose size is less than 10 MB are getting listed, not 
 sure why.
Is the size in GB ? If so, then subdirTotalGB = (subdirTotal/1024/1024); needs to become subdirTotalGB = (subdirTotal/1024/1024/1024); for it to take effect. But do correct me if I'm wrong, I still haven't had my morning coffee.
Hi Azi,

  You are correct. I tried to implement the fold in a separate small 
program as below, but was not able to get the required output. When 
you execute the program below, the output you get is:

Output:
[31460]
[31460, 1344448]
[31460, 1344448, 2277663]
[31460, 1344448, 2277663, 2277663]
[31460, 1344448, 2277663, 2277663, 31460]

Setup:
C:\\Temp\\TEST1\\BACKUP : This has 2 folders and 2 files in each folder
C:\\Temp\\TEST2\\EXPORT : This has 2 folders, with 2 files in one folder 
and 1 file in the other
Total files : 5

Required output:
[31460, 1344448] - Array 1 for the FS C:\\Temp\\TEST1\\BACKUP
[2277663, 2277663, 31460] - Array 2 for the FS C:\\Temp\\TEST2\\EXPORT

import std.algorithm: filter, map, fold;
import std.parallelism: parallel;
import std.file: SpanMode, dirEntries, isDir;
import std.stdio: writeln;
import std.typecons: tuple;
import std.path: globMatch;
import std.array;

void main () {
    ulong[] Alternate;
    string[] Filesys = ["C:\\Temp\\TEST1\\BACKUP", "C:\\Temp\\TEST2\\EXPORT"];
    foreach(FFs; Filesys) {
        auto dFiles = dirEntries(FFs, SpanMode.shallow)
            .filter!(a => a.isDir)
            .map!(a => tuple(a.name, a.size)).array;
        foreach (d; dFiles) {
            auto SdFiles = dirEntries(join(["\\\\?\\", d[0]]), SpanMode.depth)
                .map!(a => tuple(a.size)).array;
            foreach (f; parallel(SdFiles,1)) {
                Alternate ~= f[0];
                writeln(Alternate);
            }
        }
    }
}

From,
Vino.B
Sep 06
parent reply Vino.B <vino.bheeman hotmail.com> writes:
On Wednesday, 6 September 2017 at 14:38:39 UTC, Vino.B wrote:
 On Wednesday, 6 September 2017 at 10:58:25 UTC, Azi Hassan 
 wrote:
 [...]
Hi Azi,

  The required output is like below:

[31460] - Array 1 for folder 1 (all files in Folder 1) of the 
FS C:\\Temp\\TEST1\\BACKUP
[1344448] - Array 2 for folder 2 (all files in Folder 2) of the 
FS C:\\Temp\\TEST1\\BACKUP
[2277663, 2277663] - Array 3 for folder 1 (all files in Folder 1) 
of the FS C:\\Temp\\TEST2\\EXPORT
[31460] - Array 4 for folder 2 (all files in Folder 2) of the FS 
C:\\Temp\\TEST2\\EXPORT
Sep 06
parent reply Azi Hassan <azi.hassan live.fr> writes:
On Wednesday, 6 September 2017 at 15:11:57 UTC, Vino.B wrote:
 On Wednesday, 6 September 2017 at 14:38:39 UTC, Vino.B wrote:
 Hi Azi,

   The required out is like below

 [31460]  - Array 1 for folder 1(all files in Folder 1) of the 
 FS C:\\Temp\\TEST1\\BACKUP
 [1344448]  - Array 2 for folder 2(all files in Folder 2) of the 
 FS C:\\Temp\\TEST1\\BACKUP
 [2277663, 2277663]  - Array 3 for folder 1(all files in Folder 
 1) of the FS C:\\Temp\\TEST2\\EXPOR
 [31460] - Array 4 for folder 2(all files in Folder 2) the FS 
 C:\\Temp\\TEST2\\EXPORT
I tried to create a similar file structure on my Linux machine. 
Here's the result of ls -R TEST1:

TEST1:
BACKUP

TEST1/BACKUP:
FOLDER1  FOLDER2

TEST1/BACKUP/FOLDER1:
file1  file2  file3

TEST1/BACKUP/FOLDER2:
b1  b2

And here's the output of ls -R TEST2:

TEST2:
EXPORT

TEST2/EXPORT:
FOLDER1  FOLDER2

TEST2/EXPORT/FOLDER1:
file2_1  file2_2  file2_3

TEST2/EXPORT/FOLDER2:
export1  export2  export3  export4

This code outputs the sizes in the format you described:

import std.algorithm: filter, map, fold, each;
import std.parallelism: parallel;
import std.file: SpanMode, dirEntries, DirEntry;
import std.stdio: writeln;
import std.typecons: tuple;
import std.path: globMatch;
import std.array;

void main () {
    auto Filesys = ["TEST1/BACKUP", "TEST2/EXPORT"];
    ulong[][] sizes;
    foreach(FFs; Filesys) {
        auto dFiles = dirEntries(FFs, SpanMode.shallow)
            .filter!(a => a.isDir)
            .map!(a => a.name);
        foreach (d; dFiles) {
            sizes ~= dirEntries(d, SpanMode.depth).map!(a => a.size).array;
        }
    }
    sizes.each!writeln;
}

It outputs the sizes:

[6, 6, 6]
[8, 8]
[8, 8, 8]
[9, 9, 9, 9]

Note that there's no need to store them in ulong[][] sizes; you can 
display them inside the loop by replacing 
`sizes ~= dirEntries(d, SpanMode.depth).map!(a => a.size).array;` 
with 
`dirEntries(d, SpanMode.depth).map!(a => a.size.to!string).joiner(", ").writeln;` 
(this needs joiner from std.algorithm and to from std.conv; joiner 
joins a range of strings, so each size is converted first).

To make sure that it calculates the correct sizes, I made it display 
the paths instead by making "sizes" string[][] instead of ulong[][] 
and by replacing map!(a => a.size) with map!(a => a.name) in the 
second foreach loop:

import std.algorithm: filter, map, each;
import std.file: SpanMode, dirEntries, DirEntry;
import std.stdio: writeln;
import std.array : array;

void main () {
    auto Filesys = ["TEST1/BACKUP", "TEST2/EXPORT"];
    string[][] sizes;
    foreach(FFs; Filesys) {
        auto dFiles = dirEntries(FFs, SpanMode.shallow)
            .filter!(a => a.isDir)
            .map!(a => a.name);
        foreach (d; dFiles) {
            sizes ~= dirEntries(d, SpanMode.depth).map!(a => a.name).array;
        }
    }
    sizes.each!writeln;
}

It outputs the paths as expected:

["TEST1/BACKUP/FOLDER1/file1", "TEST1/BACKUP/FOLDER1/file2", "TEST1/BACKUP/FOLDER1/file3"]
["TEST1/BACKUP/FOLDER2/b1", "TEST1/BACKUP/FOLDER2/b2"]
["TEST2/EXPORT/FOLDER1/file2_3", "TEST2/EXPORT/FOLDER1/file2_1", "TEST2/EXPORT/FOLDER1/file2_2"]
["TEST2/EXPORT/FOLDER2/export2", "TEST2/EXPORT/FOLDER2/export3", "TEST2/EXPORT/FOLDER2/export1", "TEST2/EXPORT/FOLDER2/export4"]
Sep 06
parent reply Azi Hassan <azi.hassan live.fr> writes:
On Wednesday, 6 September 2017 at 18:21:44 UTC, Azi Hassan wrote:
 I tried to create a similar file structure on my Linux machine. 
 Here's the result of ls -R TEST1:

 TEST1:
 BACKUP
...
Upon further inspection it looks like I messed up the output.
 [31460]  - Array 1 for folder 1(all files in Folder 1) of the 
 FS C:\\Temp\\TEST1\\BACKUP
 [1344448]  - Array 2 for folder 2(all files in Folder 2) of the 
 FS C:\\Temp\\TEST1\\BACKUP
 [2277663, 2277663]  - Array 3 for folder 1(all files in Folder 
 1) of the FS C:\\Temp\\TEST2\\EXPOR
 [31460] - Array 4 for folder 2(all files in Folder 2) the FS 
 C:\\Temp\\TEST2\\EXPORT
What files do these sizes correspond to ? Shouldn't there be two 
elements in the first array because C:\Temp\TEST1\BACKUP\FOLDER1 
contains two files ?
Sep 06
parent Vino.B <vino.bheeman hotmail.com> writes:
On Wednesday, 6 September 2017 at 18:44:26 UTC, Azi Hassan wrote:
 On Wednesday, 6 September 2017 at 18:21:44 UTC, Azi Hassan 
 wrote:
 I tried to create a similar file structure on my Linux 
 machine. Here's the result of ls -R TEST1:

 TEST1:
 BACKUP
...
Upon further inspection it looks like I messed up the output.
 [31460]  - Array 1 for folder 1(all files in Folder 1) of the 
 FS C:\\Temp\\TEST1\\BACKUP
 [1344448]  - Array 2 for folder 2(all files in Folder 2) of 
 the FS C:\\Temp\\TEST1\\BACKUP
 [2277663, 2277663]  - Array 3 for folder 1(all files in Folder 
 1) of the FS C:\\Temp\\TEST2\\EXPOR
 [31460] - Array 4 for folder 2(all files in Folder 2) the FS 
 C:\\Temp\\TEST2\\EXPORT
What files do these sizes correspond to ? Shouldn't there be 
two elements in the first array because 
C:\Temp\TEST1\BACKUP\FOLDER1 contains two files ?
Hi Azi,

  I was able to implement "fold"; the updated code is below. Regarding 
the container array: I have almost completed my program (Release 1), 
so it is not a good idea to convert the program from a standard array 
to a container array at this point. Starting tomorrow I will be 
working on Release 2, where I plan to make the above changes. I have 
not yet reached container arrays in my study, so can you help me with 
how to implement the container array for the code below?

Note : I have raised another thread, "Container Array", asking the same.

string[][] coSizeDirList (string FFs, int SizeDir) {
    ulong subdirTotal = 0;
    ulong subdirTotalGB;
    auto Subdata = appender!(string[][]);
    Subdata.reserve(100);
    auto dFiles = dirEntries(FFs, SpanMode.shallow)
        .filter!(a => a.isDir)
        .map!(a => tuple(a.name, a.size)).array;
    foreach (d; dFiles) {
        auto SdFiles = dirEntries(join(["\\\\?\\", d[0]]), SpanMode.depth)
            .map!(a => tuple(a.size)).array;
        foreach(f; parallel(SdFiles, 1))
            { subdirTotal += f.fold!((a, b) => a + b); }
        subdirTotalGB = (subdirTotal/1024/1024);
        if (subdirTotalGB > SizeDir) { Subdata ~= [d[0], to!string(subdirTotalGB)]; }
        subdirTotal = 0;
    }
    return Subdata.data;
}

Note To All : I am basically an admin guy. I started learning D a few 
months ago and found it very interesting, hence I raise so many 
questions, so I request you to bear with me for a while.

From,
Vino.B
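
For readers following the fold discussion, here is a minimal sketch of what fold does when it is given a whole list of sizes rather than a single value. The helper name totalSize is hypothetical and the sizes are the sample values quoted earlier in the thread; this is only an illustration of the library call, not the code used in the program above.

```d
import std.algorithm.iteration : fold;
import std.stdio : writeln;

// Hypothetical helper: adds up a list of file sizes with fold.
ulong totalSize(const(ulong)[] sizes)
{
    // The explicit seed (0UL) is the starting value of the running sum,
    // so an empty list yields 0 instead of being an error.
    return sizes.fold!((a, b) => a + b)(0UL);
}

void main()
{
    ulong[] folder1 = [31460, 1344448];
    writeln(totalSize(folder1)); // sum of the two sample sizes
}
```

fold reduces a range to one value, so it is the range of sizes as a whole that gets passed to it, once per folder.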
Sep 06
prev sibling next sibling parent Azi Hassan <azi.hassan live.fr> writes:
On Tuesday, 5 September 2017 at 09:44:09 UTC, Vino.B wrote:
 Hi,

  The below code is consume more memory and slower can you 
 provide your suggestion on how to over come these issues.
You can start by dropping the .array conversions after dirEntries. 
That way your algorithm will become lazy (as opposed to eager), 
meaning that it won't allocate an entire array of DirEntry. It will, 
instead, treat the entries one at a time, resulting in less memory 
consumption.

I didn't understand the join(["\\\\?\\", d[0]]) part, maybe you meant 
to write join("\\\\?\\", d[0])?

If appender is too slow, you can experiment with a dynamic array 
whose capacity was preallocated:

string[][] Subdata;
Subdata.reserve(10000);

In this case Subdata will hold enough space for 10000 string[]s, 
which will result in better performance.

Here's the updated code (sans .array) in case anyone wants to 
reproduce the issue:

import std.stdio;
import std.conv;
import std.typecons;
import std.array;
import std.path;
import std.container;
import std.file;
import std.parallelism;
import std.algorithm;

void main() {
    ".".csizeDirList(1024).each!writeln;
}

string[][] csizeDirList (string FFs, int SizeDir) {
    ulong subdirTotal = 0;
    ulong subdirTotalGB;
    auto Subdata = appender!(string[][]);
    auto dFiles = dirEntries(FFs, SpanMode.shallow)
        .filter!(a => a.isDir && !globMatch(a.baseName, "*DND*"))
        .map!(a => tuple(a.name, a.size));
    foreach (d; dFiles) {
        auto SdFiles = dirEntries(join(["\\\\?\\", d[0]]), SpanMode.depth)
            .map!(a => tuple(a.size));
        foreach (f; parallel(SdFiles,1)) {
            subdirTotal += f[0];
        }
        subdirTotalGB = (subdirTotal/1024/1024);
        if (subdirTotalGB > SizeDir) {
            Subdata ~= [d[0], to!string(subdirTotalGB)];
        }
        subdirTotal = 0;
    }
    return Subdata.data;
}
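
As a further illustration of the lazy approach, the per-folder total can be computed as a single pipeline with no intermediate array at all. This is a sketch under assumptions: the helper name dirSizeBytes is made up, the isFile filter is an addition so that only regular files are counted, and the parallel foreach is left out because adding to one shared variable from parallel loop bodies is not synchronized. Whether it is fast enough on a NAS has not been measured here.

```d
import std.algorithm.iteration : filter, map, sum;
import std.file : dirEntries, SpanMode;
import std.stdio : writeln;

// Hypothetical helper: total size in bytes of all regular files below dir.
ulong dirSizeBytes(string dir)
{
    return dirEntries(dir, SpanMode.depth)
        .filter!(e => e.isFile)  // assumption: count regular files only
        .map!(e => e.size)       // each entry is visited once, lazily
        .sum;                    // running total, no array is built
}

void main()
{
    writeln(dirSizeBytes("."), " bytes under the current directory");
}
```

Only one directory entry is held at a time, so memory use no longer grows with the number of files in the folder.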
Sep 05
prev sibling parent user1234 <user1234 12.hu> writes:
On Tuesday, 5 September 2017 at 09:44:09 UTC, Vino.B wrote:
 Hi,

  The below code is consume more memory and slower can you 
 provide your suggestion on how to over come these issues.

 [...]
Try dropping the globMatch. Given that glob, a ctRegex would do the job, or even simpler, `!a.canFind("DND")`.
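
A minimal sketch of the substring check next to the glob, for comparison. The helper name isExcluded and the sample paths are made up for illustration. One difference worth keeping in mind: canFind always compares case-sensitively, while globMatch follows the operating system's default for case sensitivity unless told otherwise.

```d
import std.algorithm.searching : canFind;
import std.path : baseName, globMatch;
import std.stdio : writeln;

// Hypothetical helper: true when the last path component contains "DND".
bool isExcluded(string path)
{
    return path.baseName.canFind("DND");
}

void main()
{
    // Assumed sample paths, not taken from the original thread.
    foreach (p; ["backup/team_DND_old", "backup/team_a"])
        writeln(p, " excluded: ", isExcluded(p),
                " (glob: ", globMatch(p.baseName, "*DND*"), ")");
}
```

For upper-case "DND" in the folder name both checks agree; the substring search simply skips the pattern matching.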
Sep 06