digitalmars.D.learn - using ntfs write_through option to create an efficient unzipped layout

reply "Jay Norwood" <jayn prismnet.com> writes:
Below are measured times for operations on an unzipped 2GB layout.
My observation is that using a slightly modified version of
std.file.write to create the unzipped files results in a folder
that is much more efficient for sequential file system operations.
In particular, the ntfs rmdir takes 6 sec to remove the layout, vs
161 sec for the layout created by 7zip.

I don't have a clear explanation for why this is faster, but the
article below mentions lazy writes by ntfs when write-through is
not used, and I suspect that is involved.

http://msdn.microsoft.com/en-us/library/windows/desktop/aa364218(v=vs.85).aspx

All times were measured on a Seagate 7200rpm hard drive on win7-64.

                 unzip      rmd2       cpd        rmdir (ntfs)  xcopy (ntfs)
uzp SEQ Par      86 secs    171 secs   21 secs    169 secs      91 secs
uzp NS WT        157 secs   12 secs    13 secs    6 secs        43 secs
uzp NS WT Par    87 secs    16 secs    17 secs    17 secs       48 secs
7zip unzip       127 secs   151 secs   135 secs   161 secs      68 secs
myDefrag         +15 min    3.5 secs   54.3 secs  4.3 secs      90 secs

Rows (how the layout was created):
uzp SEQ Par   is using the current std.file.write (SEQ is its
              FILE_FLAG_SEQUENTIAL_SCAN flag).  Parallel ops during
              decompress.
uzp NS WT     is using a modified version of std.file.write, no
              SEQ, added WRITE_THROUGH
uzp NS WT Par is the same as above, but write operations are
              parallel, 100 files per thread
7zip unzip    is the layout as unzipped by 7zip
myDefrag      is using sortByName to defrag the unzipped folder

Columns (operation timed on that layout):
rmd2  is a parallel rmdir, with 100 files per thread
cpd   is a parallel copy, with 100 files per thread, from the hard
      drive to an ssd
rmdir is the regular file system rmdir /q /s
xcopy is ntfs  xcopy /q /e /I  from the hard drive to an ssd
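
For readers wondering what "100 files per thread" might look like, here is a minimal sketch using std.parallelism with a work unit size of 100. It is not the code behind the measurements above: the function name writeFilesParallel, the file names and the 4 KB payload are hypothetical, and it calls the stock std.file.write rather than a write-through variant.

```d
import std.conv : to;
import std.file : dirEntries, exists, mkdirRecurse, rmdirRecurse,
                  SpanMode, tempDir, write;
import std.parallelism : parallel;
import std.path : buildPath;
import std.range : iota, walkLength;
import std.stdio : writeln;

// Hypothetical helper: writes `count` small files into `dir`, handing the
// worker threads `filesPerTask` files at a time, and returns the number of
// entries found in `dir` afterwards.
size_t writeFilesParallel(string dir, int count, size_t filesPerTask)
{
    mkdirRecurse(dir);
    auto payload = new ubyte[](4096);   // dummy file contents (assumption)

    // The second argument to parallel() is the work unit size, i.e. how
    // many consecutive iterations a worker thread takes at once.
    foreach (i; parallel(iota(count), filesPerTask))
    {
        write(buildPath(dir, "file" ~ to!string(i) ~ ".bin"), payload);
    }
    return dirEntries(dir, SpanMode.shallow).walkLength;
}

void main()
{
    // Scratch directory under the system temp folder (assumption).
    auto dir = buildPath(tempDir(), "wt_layout_demo");
    if (exists(dir)) rmdirRecurse(dir);
    scope(exit) rmdirRecurse(dir);

    writeln("files written: ", writeFilesParallel(dir, 1000, 100));
}
```

To try this against a write-through variant, the call to write would be swapped for a function like the one posted below.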

void writeThrough(in char[] name, const void[] buffer)
{
     version(Windows)
     {
         // Modified std.file.write arguments: no FILE_FLAG_SEQUENTIAL_SCAN,
         // added FILE_FLAG_WRITE_THROUGH.
         alias TypeTuple!(GENERIC_WRITE, 0, null, CREATE_ALWAYS,
                          FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                          HANDLE.init)
             defaults;
         auto h = useWfuncs
             ? CreateFileW(std.utf.toUTF16z(name), defaults)
             : CreateFileA(toMBSz(name), defaults);

         cenforce(h != INVALID_HANDLE_VALUE, name);
         scope(exit) cenforce(CloseHandle(h), name);
         DWORD numwritten;
         // Write the whole buffer in one call and check that all of it
         // was written.
         cenforce(WriteFile(h, buffer.ptr, to!DWORD(buffer.length),
                            &numwritten, null) == 1
                  && buffer.length == numwritten,
                  name);
     }
     else version(Posix)
         // No write-through flag is added on Posix.
         return writeImpl(name, buffer, O_CREAT | O_WRONLY | O_TRUNC);
}
Apr 21 2012
next sibling parent "Kagamin" <spam here.lot> writes:
Did you try without WRITE_THROUGH and without sequential scan?
Apr 23 2012
prev sibling parent "Kagamin" <spam here.lot> writes:
I think WRITE_THROUGH would cause problems with many small files.
Apr 23 2012