digitalmars.D.learn - MD5 hash on a file and rawRead

Andrej Mitrovic <a a.a> writes:
I'm trying to use the std.md5.sum method. It takes as an argument a digest to
output the hash to, and the second argument is plain data.

So I'm trying to read an entire file at once. I thought about using rawRead,
but I get a runtime exception:
        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }
        ubyte[] buffer;
        file.rawRead(buffer);
 
error: stdio.d:rawRead must take a non-empty buffer

There are no size methods for the File structure (why?). There's a getSize
function but it's in std.file, and I can't use it because:

        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }
        
        ubyte[] buffer = new ubyte[](getSize(filename));
        ubyte[16] digest;
        file.rawRead(buffer);
        std.md5.sum(digest, buffer);

Error: cannot implicitly convert expression
(getSize(cast(const(char[]))this._libFileName)) of type ulong to uint
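
(Editorial sketch, not part of the original post.) The error appears to come from getSize returning a ulong while an array length is a size_t, which is only 32 bits wide on a 32-bit target. One possible workaround, assuming the file comfortably fits in memory, is to convert with std.conv.to, which throws on overflow instead of truncating. The helper name readWhole is hypothetical, and opening with "rb" instead of "r" is an assumption made here to avoid text-mode translation on Windows.

```d
import std.conv : to;
import std.file : getSize;
import std.stdio : File;

/// Hypothetical helper: read a whole (small) file via rawRead.
ubyte[] readWhole(string filename)
{
    // getSize returns ulong; to!size_t throws ConvOverflowException
    // if the size does not fit, instead of silently truncating.
    auto size = to!size_t(getSize(filename));
    if (size == 0)
        return null; // rawRead rejects an empty buffer

    auto file = File(filename, "rb");
    auto buffer = new ubyte[](size);
    // rawRead returns the slice of the buffer that was actually filled.
    return file.rawRead(buffer);
}
```

The returned slice could then be passed to the hashing call in place of the buffer above.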

I can use the buffered version fine:
        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }
        
        ubyte[16] digest;
        MD5_CTX context;
        context.start();
        
        foreach (ubyte[] buffer; file.byChunk(4096 * 1024))
        {
            context.update(buffer);
        }
        
        context.finish(digest);
        writefln("MD5 (%s) = %s", filename, digestToString(digest));

But I'd prefer to write simpler code and use rawRead to read the entire file at
once. I'm reading really small files, so rawRead should be fine.

Also, why do we have file handling in two different modules? I'd expect to find
all file handling ops in std.file, not scattered around Phobos.

Let me know if I'm doing something obviously stupid. :)
Feb 09 2011
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Also disregard that the error shows "_libFileName", I was in the
middle of refactoring so the name stayed.
Feb 09 2011
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/10/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 Also disregard that the error shows "_libFileName", I was in the
 middle of refactoring so the name stayed.
*I mean disregard that it's called _libFileName, when it's really "filename".
Feb 09 2011
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Wed, 09 Feb 2011 23:01:47 -0500, Andrej Mitrovic wrote:

 But I'd prefer to write simpler code and use rawRead to read the entire
 file at once. I'm reading really small files, so rawRead should be fine.
To read an entire file at once, use std.file.read(), or std.file.readText() if it's a UTF-encoded text file.
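
(Editorial sketch, not part of the original reply.) The std.md5 module used in this thread has since been replaced in Phobos by std.digest.md, so the following assumes a current compiler. It shows roughly how the whole-file approach might look with std.file.read; the helper names md5Hex and md5HexOfFile are hypothetical.

```d
import std.digest : toHexString;
import std.digest.md : md5Of;
import std.file : read;

/// Hex MD5 of an in-memory buffer.
string md5Hex(const(ubyte)[] data)
{
    ubyte[16] digest = md5Of(data);
    auto hex = toHexString(digest); // char[32], upper-case by default
    return hex[].idup;
}

/// Hypothetical helper: hash a whole file read in one call.
string md5HexOfFile(string filename)
{
    // std.file.read returns void[]; the cast only reinterprets the bytes.
    return md5Hex(cast(const(ubyte)[]) read(filename));
}
```

A call such as writefln("MD5 (%s) = %s", filename, md5HexOfFile(filename)) would then print the result in the same shape as the buffered version.
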
 Also, why do we have file handling in two different modules? I'd expect
 to find all file handling ops in std.file, not scattered around Phobos.
 
 Let me know if I'm doing something obviously stupid. :)
There are actually three modules for file handling, but I think they are nicely separated:

   - std.file handles files as isolated units, i.e. it reads,
     writes and manipulates entire files.

   - std.path manipulates file/directory names as strings, and
     performs no disk I/O.

   - std.stdio is for more advanced file I/O, as it lets you
     open files and manipulate them through the File handle.
     (This includes reading, writing, seeking, etc.)

Hope this clears things up. :)

-Lars
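
(Editorial sketch, not part of the original reply.) The division of labour described above can be illustrated by touching each module once. The function name demo and the file name example.txt are hypothetical, and the sketch assumes the given directory is writable.

```d
import std.file : exists, readText, remove, write;
import std.path : baseName, buildPath, extension;
import std.stdio : File;

/// Hypothetical demo touching each of the three modules once.
string demo(string dir)
{
    // std.path: pure string manipulation, no disk access.
    auto path = buildPath(dir, "example.txt");
    assert(baseName(path) == "example.txt");
    assert(extension(path) == ".txt");

    // std.file: whole-file operations in a single call.
    write(path, "hello world");
    scope (exit) remove(path);
    assert(exists(path) && readText(path) == "hello world");

    // std.stdio: open a handle, seek, and read part of the file.
    auto file = File(path, "rb");
    file.seek(6);
    auto buf = new char[](5);
    auto tail = file.rawRead(buf).idup; // "world"
    file.close(); // close before the scope(exit) removal runs
    return tail;
}
```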
Feb 10 2011
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/10/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 To read an entire file at once, you should use std.file.read(), or
 std.file.readText() if it's an UTF encoded text file.
I missed that method while browsing through the docs. Thanks.
 There are actually three modules for file handling, but I think they are
 nicely separated:

   - std.file handles files as isolated units, i.e. it reads,
     writes and manipulates entire files.

   - std.path manipulates file/directory names as strings, and
     performs no disk I/O.

   - std.stdio is for more advanced file I/O, as it lets you
     open files and manipulate them through the File handle.
     (This includes reading, writing, seeking, etc.)

 Hope this clears things up. :)

 -Lars
Yeah, I know there are three modules, but I'd still prefer having one module for file manipulation and one for the path string functionality. Right now I have to keep switching between the stdio and file documentation, which is how I managed to miss the .read method. Thanks again though.
Feb 10 2011