digitalmars.D - [your code here]

reply "Roumen Roupski" <rroupski gmail.com> writes:
// Checks if two files have equal content using memory mapped files
import std.file;
import std.mmfile;
import std.stdio;

// Compares the content of two files
private bool equals(in string f1, in string f2)
{
	if (getSize(f1) != getSize(f2))
		return false;	// different file sizes

	if (getSize(f1) == 0)
		return true;	// zero-length files are equal
	
	MmFile m1, m2;
	try
	{
		m1 = new MmFile(f1);
		m2 = new MmFile(f2);
		return m1[] == m2[];
	}
	catch (Throwable ex)
	{
		writefln("File read error: %s", ex.msg);
		return false;  // cannot compare the files
	}
	finally
	{
		delete m1;
		delete m2;
	}
}

void main (string[] args)
{
	enum NotFound = "Cannot open file: %s";

	if (args.length == 3)
	{
		auto f1 = args[1], f2 = args[2];

		if (!(exists(f1) && isFile(f1)))
			writefln(NotFound, f1);
		else if (!(exists(f2) && isFile(f2)))
			writefln(NotFound, f2);
		else if (equals(f1, f2))
			writeln("Same files");
		else
			writeln("Different files");
	}
	else
	{
		writeln("Usage: filequals <file1> <file2>");
	}
}
Jan 31 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Roumen Roupski:

 private bool equals(in string f1, in string f2)
 {
 	if (getSize(f1) != getSize(f2))
 		return false;	// different file sizes

 	if (getSize(f1) == 0)
 		return true;	// zero-length files are equal

Making equals() private is not useful, but maybe it's possible to make equals() nothrow.
 	
 	MmFile m1, m2;
 	try
 	{
 		m1 = new MmFile(f1);
 		m2 = new MmFile(f2);
 		return m1[] == m2[];
 	}

I have tried your little program on two files about 500 MBytes long, and the memory usage is strange. Is it doing the right thing? Bye, bearophile
Jan 31 2013
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Thursday, 31 January 2013 at 08:42:48 UTC, Roumen Roupski 
wrote:
 	finally
 	{
 		delete m1;
 		delete m2;
 	}

D has a GC. No need for manual deletion. (In fact, I think this is deprecated? Or scheduled for removal? Or maybe neither. Who knows).
Jan 31 2013
parent reply FG <home fgda.pl> writes:
On 2013-01-31 12:38, Namespace wrote:
 If you want to do something, then take destroy.
 AFAIK delete destroy _and_ release the memory immediately. 'destroy' doesn't.

And that's why delete is valuable (at least on 32-bit windows). Especially when you are comparing 500 MB files in a loop. :)
Jan 31 2013
parent reply FG <home fgda.pl> writes:
On 2013-01-31 14:21, bearophile wrote:
 Especially when you are comparing 500 MB files in a loop. :)

I have had problems comparing with this program a single pair of files that large...

Strange. No problems here. Only had to switch from dmd32 to gdc64 with 1GB or bigger files. Tested on win7-64.
Jan 31 2013
parent FG <home fgda.pl> writes:
On 2013-01-31 15:17, bearophile wrote:
 FG:

 Strange. No problems here. Only had to switch from dmd32 to gdc64 with 1GB or
 bigger files. Tested on win7-64.

How much memory is it using? What's the performance compared to the diff tool?

Two identical files, 1069 MB each. Program compiled with GDC, 64-bit. Used 6272 kB private mem / 2144 MB working set, and took 13.5 seconds. Cygwin's diff took only 1.85 s.
Jan 31 2013
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 31 January 2013 at 10:58:48 UTC, Peter Alexander 
wrote:
 On Thursday, 31 January 2013 at 08:42:48 UTC, Roumen Roupski 
 wrote:
 	finally
 	{
 		delete m1;
 		delete m2;
 	}

D has a GC. No need for manual deletion. (In fact, I think this is deprecated? Or scheduled for removal? Or maybe neither. Who knows).

http://dlang.org/deprecate.html#delete

If you want to do something, then use destroy. AFAIK delete destroys _and_ releases the memory immediately. 'destroy' doesn't.
Jan 31 2013
prev sibling next sibling parent "simendsjo" <simendsjo gmail.com> writes:
On Thursday, 31 January 2013 at 12:28:43 UTC, FG wrote:
 On 2013-01-31 12:38, Namespace wrote:
 If you want to do something, then take destroy.
 AFAIK delete destroy _and_ release the memory immediately. 
 'destroy' doesn't.

And that's why delete is valuable (at least on 32-bit windows). Especially when you are comparing 500 MB files in a loop. :)

Run GC.collect() after destroy.
Jan 31 2013
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 31 January 2013 at 12:28:43 UTC, FG wrote:
 On 2013-01-31 12:38, Namespace wrote:
 If you want to do something, then take destroy.
 AFAIK delete destroy _and_ release the memory immediately. 
 'destroy' doesn't.

And that's why delete is valuable (at least on 32-bit windows). Especially when you are comparing 500 MB files in a loop. :)

I like and use it also. ;)
 Run GC.collect() After destroy.

collect? Maybe free, but collect seems a bit like taking a sledgehammer to crack a nut.
Jan 31 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 31 January 2013 at 12:28:43 UTC, FG wrote:
 On 2013-01-31 12:38, Namespace wrote:
 If you want to do something, then take destroy.
 AFAIK delete destroy _and_ release the memory immediately. 
 'destroy' doesn't.

And that's why delete is valuable (at least on 32-bit windows). Especially when you are comparing 500 MB files in a loop. :)

If we are talking about memory-mapped files, then destroying the MmFile objects will close the mappings, thus freeing the virtual memory.

Anyway, I'm not really impressed by this example: the kernel may be inclined to leave the pages we've already compared in RAM (might be fixed with madvise(MADV_SEQUENTIAL)), and it doesn't demonstrate many D strengths (memory-mapped files are an OS feature, for which D just provides a wrapper class). Maybe it would be more interesting if we used std.range.zip or .lockstep, or std.algorithm.equal, with File.byChunk?
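A minimal sketch of the File.byChunk idea, not code posted in the thread. The function name sameContent and the 64 KiB chunk size are assumptions chosen for illustration. It relies on std.algorithm.equal walking both chunk ranges in step, so only one buffer per file is held in memory at a time:

```d
import std.algorithm : equal;
import std.file : getSize;
import std.stdio : File;

// Hypothetical helper: compares two files chunk by chunk
// instead of mapping them into memory.
bool sameContent(string f1, string f2)
{
    // Cheap early exit: files of different sizes cannot be equal.
    if (getSize(f1) != getSize(f2))
        return false;

    enum chunkSize = 64 * 1024; // assumed buffer size, tune as needed

    // byChunk yields ubyte[] slices of a reused buffer; equal compares
    // the current chunk of each file with == before advancing both.
    return equal(File(f1).byChunk(chunkSize), File(f2).byChunk(chunkSize));
}
```

Unlike the original program, this version lets exceptions from getSize or File propagate to the caller. Whether it is faster than the memory-mapped version would have to be measured.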
Jan 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
FG:

 Especially when you are comparing 500 MB files in a loop. :)

I have had problems comparing with this program a single pair of files that large... Bye, bearophile
Jan 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
FG:

 Strange. No problems here. Only had to switch from dmd32 to 
 gdc64 with 1GB or bigger files. Tested on win7-64.

How much memory is it using? What's the performance compared to the diff tool? Bye, bearophile
Jan 31 2013
prev sibling next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 01/31/2013 12:42 AM, Roumen Roupski wrote:

 catch (Throwable ex)
 {
 writefln("File read error: %s", ex.msg);
 return false; // cannot compare the files

Throwable is a little too high in the exception hierarchy:

        Throwable
        /       \
     Error    Exception
     /  \      /  \

A program should catch only Exception and its subtypes. The Error sub-hierarchy represents errors about the state of the program. The program state may be so invalid that it is not guaranteed that even writefln() or 'return' will work.

For the same reason, if it is really an Error that has been thrown, even the destructors are not called during stack unwinding.

Ali
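A sketch of how the comparison function might look when it catches only Exception and uses destroy instead of delete, as suggested earlier in the thread. This is not the original poster's code: the name equalsMapped, the use of scope(exit), and the cast to const(ubyte)[] are assumptions made for illustration.

```d
import std.file : getSize;
import std.mmfile : MmFile;
import std.stdio : stderr;

// Hypothetical variant of equals(): catches Exception only, so that
// Errors (AssertError, RangeError, ...) still propagate.
bool equalsMapped(string f1, string f2)
{
    try
    {
        immutable size = getSize(f1);
        if (size != getSize(f2))
            return false;   // different file sizes
        if (size == 0)
            return true;    // zero-length files are equal

        auto m1 = new MmFile(f1);
        scope (exit) destroy(m1);   // close the mapping deterministically
        auto m2 = new MmFile(f2);
        scope (exit) destroy(m2);

        // Compare the mapped regions as raw bytes.
        return cast(const(ubyte)[]) m1[] == cast(const(ubyte)[]) m2[];
    }
    catch (Exception ex)
    {
        stderr.writefln("File read error: %s", ex.msg);
        return false;   // cannot compare the files
    }
}
```

The size checks sit inside the try block so that a missing file reported by getSize is handled the same way as a failed mapping.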
Jan 31 2013
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 01/31/2013 10:39 AM, Andrej Mitrovic wrote:
 On 1/31/13, Ali Çehreli<acehreli yahoo.com>  wrote:
 For the same reason, if it is really an Error that has been thrown, even
 the destructors are not called during stack unwinding.

Where are you extracting this information from?

I hope I haven't spread wrong information. I "learned" this from the discussions on this forum. Perhaps it was merely an idea and I remember it as truth. Others, is what I said correct? Why do I think that way? :) Ali
Jan 31 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/31/2013 09:43 PM, Ali Çehreli wrote:
 On 01/31/2013 10:39 AM, Andrej Mitrovic wrote:
  > On 1/31/13, Ali Çehreli<acehreli yahoo.com>  wrote:
  >> For the same reason, if it is really an Error that has been thrown,
 even
  >> the destructors are not called during stack unwinding.
  >
  > Where are you extracting this information from?

 I hope I haven't spread wrong information. I "learned" this from the
 discussions on this forum. Perhaps it was merely an idea and I remember
 it as truth.

 Others, is what I said correct? Why do I think that way? :)

 Ali

Destructors are not "guaranteed" to run. They actually do. I think this is mostly to allow segmentation faults on Linux.
Jan 31 2013
prev sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 01/31/2013 12:43 PM, Ali Çehreli wrote:
 On 01/31/2013 10:39 AM, Andrej Mitrovic wrote:
  > On 1/31/13, Ali Çehreli<acehreli yahoo.com> wrote:
  >> For the same reason, if it is really an Error that has been thrown,
 even
  >> the destructors are not called during stack unwinding.
  >
  > Where are you extracting this information from?

 I hope I haven't spread wrong information. I "learned" this from the
 discussions on this forum. Perhaps it was merely an idea and I remember
 it as truth.

 Others, is what I said correct? Why do I think that way? :)

I tested this with dmd. struct destructors do get called during stack unwinding. However, a relevant quote:

http://dlang.org/phobos/object.html#.Exception

"In principle, only thrown objects derived from [Exception] are safe to catch inside a catch block. Thrown objects not derived from Exception represent runtime errors that should not be caught, as certain runtime guarantees may not hold, making it unsafe to continue program execution."

TDPL talks about what happens (and does not happen) when a function is declared as nothrow. It also talks about why Throwable should not be caught. It doesn't say exactly the same things about Error, but the book draws a clear distinction between the Exception sub-hierarchy and the other exception classes. There is great information in Chapter 9 of TDPL, but it is too long to type here. Sections 9.2 and 9.4 are especially relevant.

The following are my thoughts...

Here is the logic behind why the destructors must not be executed when the thrown exception is an Error. AssertError is an Error, indicating that the program state is wrong. When the program state is wrong, there is no guarantee that any further operation in the program can safely be executed.

Assume that the AssertError is coming from the invariant block of a struct (or assume that any other assert about the state of an object has failed). In that case the object is in a bad state. Can the destructor be called on that object? Should it be? What can we expect to happen?

Ali
Jan 31 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/31/13, Ali Çehreli <acehreli yahoo.com> wrote:
 For the same reason, if it is really an Error that has been thrown, even
 the destructors are not called during stack unwinding.

Where are you extracting this information from?
Jan 31 2013
prev sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Thursday, 31 January 2013 at 20:43:37 UTC, Ali Çehreli wrote:
 On 01/31/2013 10:39 AM, Andrej Mitrovic wrote:
 On 1/31/13, Ali Çehreli<acehreli yahoo.com>  wrote:
 For the same reason, if it is really an Error that has been thrown, even
 the destructors are not called during stack unwinding.

Where are you extracting this information from?

I hope I haven't spread wrong information. I "learned" this from the discussions on this forum. Perhaps it was merely an idea and I remember it as truth. Others, is what I said correct? Why do I think that way? :) Ali

There is not much information about this topic, but I believe there are two separate issues here (technical and practical):

1) Errors do not always behave like exceptions. For example, most errors (those which are not thrown directly) are generated by D features: final switch throws SwitchError, the notorious activity inside class dtors which calls the GC causes InvalidMemoryOperationError, etc. These are typically raised through onXXError functions, which live in druntime (https://github.com/D-Programming-Language/druntime/blob/master/src/core/exception.d). Theoretically these functions may just terminate the application without throwing an exception, so the point here is that even trying to catch an Error can be useless. However, if an Error is thrown by the D exception mechanism, I think you can handle it just like other Throwables.

2) Although you can (sometimes) catch an Error, the state of the program is unpredictable. It may depend on the type of error and other factors.
Jan 31 2013