www.digitalmars.com         C & C++   DMDScript  

D - generic algorithms, resource mgmt, and other ramblings

reply Patrick Down <pat codemoon.com> writes:
Lately in some posts there has been talk about D objects 
and the fact that you can't automatically tie object 
lifetime to a stack frame or to the lifetime of other 
objects like you could do in C++.  I think what is most 
lamented is the fact that you can't use this feature to 
do automatic deterministic resource management.

The following ideas follow from languages like Smalltalk, 
Ruby, and now Python ( version 2.2 ).  These languages
provide some mechanisms that can be used as alternatives
to stack frame object lifetime management.

If we look at a very simple example with out error checking
of a C++ file helper class we might write it like this.  

class File
{
  FILE* f;
  public:
  File(char* name, char* mode)
  {
    f = fopen(name,mode);
  }
  ~File()
  {
    fclose(f);
  }
  operator FILE*() { return f; }
};

void foo()
{
  File fin("data.dat","r");
  
  // use fin to read and process data
  // from the file
  
  // destructor closes the file
}

If we look at this we can generalize this concept into a 
generic algorithm where some of the steps are predefined,
opening and closing the file for example, and other steps 
are user defined.

1. Open file (predefined)
2. Read/Write/Process (user defined)
3. Close file (predefined)

Languages like smalltalk, ruby, and python actually have
a way to encapsulate these generic algorithms.  In Ruby for
example you can do the same file C++ trick above like this...

File.open("data.data") do |aFile|
  # do io with aFile
end

Even better you can print all the lines in a text
file like this...

File.foreach("data.txt") do |aLine|
  print aLine
end

Each of the functions above, open and foreach, get an
invisible parameter which is a reference to the code
block that follows the function call.  In Ruby open 
might be written something like this...

def open(filename,mode)
  aFile = File.new(filename,mode)
  yield aFile
  aFile.close()
end

The yield statement passes control to the user defined
code block with the file object as a parameter.  When
that code block is done the file is closed.

As you might imagine this is usefull for all sorts of 
things.  Sorting for example.

# sort an array of objects by their name attribute
a.sort { | x,y | x.name < y.name }

The sort algorthm expects some code to do the comparison.
In C you could use qsort but you would have to go someplace
else in you code to define the criteria function.

It's easy to do this sort of thing with these interpreted 
dynamically bound languages.  Everything is an object to 
them, even code blocks. However, it has always seemed to 
me that there should be a way to at least get some of this 
functionality with statically bound languages, like C++... 
or maybe even D.

C++ actually does attempt to provide functionality like this
thru the STL and the algorithm templates.  The problem with
the STL approach comes from the user defined part of the
algorithm.  You ether need to use a fairly unreadable template
base functor language or define a separate class or function
that has at least some semantic distance from the place it 
is used.

I would propose a language level feature to support this
idea.  I've thought a long time about how to do this 
syntactically and this is the best that I've come up with so
far.

Heres the file open from above.  I'm just going to use the
stdio functions and leave out error checking to make the 
example simple.

// Define the function
void withfile(char[] name, char[] mode)
: void do(FILE*)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));
  
  do(f);
  
  fclose(f);
}

// Use the function
void foo()
{
  char buffer[1024];
  int size; 

  withfile("test.dat","r")
  :do(f)
  {
    size = fread(buffer,1,1024,f);
  }
}

The section after the function header defines a function prototype
to be matched up with a similarly named function block at the place
the function is called.  I defined it this way to allow more than
one user defined code block.

void Func(int a)
: void foo(int),void bar(int)
{
  //...
  foo(b);
  //...
  bar(c);
  //...
}

Now image the power of this when combined with generic types.

void sort(type$[] array)
: bit compare(type$ a, type$ b)
{
  // ...Do sorting stuff
  if(compare(array[i],array[j])
  {
    // whatever
  }
}

int[] arr = { 23, 13, 56, 23, 19 }

sort(arr): compare(a,b) { return a < b; }
 
class Foo
{
  char[] GetName() { ... }
  int GetAge() { ... }
}

Foo[] arr;

// sort by name
sort(arr) : compare(a,b) { return a.GetName() < b.GetName(); }
 
// sort by age
sort(arr) : compare(a,b) { return a.GetAge() > b.GetAge(); }



The next question is how do you implement this functionality
behind the scenes?   Simple functions are just inlined when 
possible.  When the function is complex or inlining is not 
possible then the code blocks are actually implemented as 
functions and passed as invisible parameters to the function 
that uses them.

Let's do some cfront like translations to see how this works.
Take the withfile function above.  It translate to this.

void withfile(char[] name, char[] mode, void deligate(FILE*) do)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));
  
  do(f);
  
  fclose(f);
}

The function call is translated like this.

// Create function for code block

void withfile_foo_do(FILE* f)
{
  // Opps, where do buffer and size come from?
  size = fread(buffer,1,1024,f);
}

void foo()
{
  char buffer[1024];
  int size; 

  withfile("data.dat","r",&withfile_foo_do);
}

As you can see there is a problem with the scope of the 
buffer and size variables.  We solve this by creating a
pointer that points into the callers stack to access
the local variables.  This pointer can be passed as another
invisible parameter.

struct foo_stack
{
  char buffer[1024];
  int size; // order depends on how the stack is layed out
}

void withfile(char[] name, char[] mode, foo_stack* pstk, void deligate
(FILE*) do)
{
  FILE* f = fopen(toStringz(name),toStringz(mode));
  
  do(f,pstk);
  
  fclose(f);
}

void withfile_foo_do(FILE* f,foo_stack* pstk)
{
  // Opps where do buffer and size come from?
  pstk.size = fread(pstk.buffer,1,1024,f);
}

void foo()
{
  char buffer[1024];
  int size; 

  withfile("data.dat","r",cast(foo_stack*)&size,&withfile_foo_do);
}


Well thats it in a nutshell.  I'm sorry the examples are so simple.
May 30 2002
parent reply "Sandor Hojtsy" <hojtsy index.hu> writes:
"Patrick Down" <pat codemoon.com> wrote in message
news:Xns921EBF196284Apatcodemooncom 63.105.9.61...
  void foo()
 {
   char buffer[1024];
   int size;

   withfile("test.dat","r")
   :do(f)
   {
     size = fread(buffer,1,1024,f);
   }
 }

void foo() { char buffer[1024]; int size; FILE* f = fopen(toStringz(name),toStringz(mode)); size = fread(buffer,1,1024,f); fclose(f); } seems shorter, and cleaner. But here comes the problem: Exceptions! If fread causes an exception the file will not be closed in either case. So modifiing your example: void withfile(char[] name, char[] mode) : void do(FILE*) { FILE* f = fopen(toStringz(name),toStringz(mode)); try { do(f); } finally { fclose(f); } } Seems better. From this point of view your syntax may decrease client code complexity. For the particular problem of stack lifetime, I propose a smaller syntax sugar. void fn() { int size; owned FileWrapper f = new FileWrapper(name, mode); size = f.fread(buffer,1,1024); f.whatever(); something(); } Where 'owned' is a keyword, which would actually mean: void fn() { int size; FileWrapper f = new FileWrapper(name, mode); try { size = f.fread(buffer,1,1024); f.whatever(); something(); } finally { delete f; } } And then the destructor/finalizer for FileWrapper could close the file. The finally is put to the end of the smallest block inside which f was declared. It doesn't seem to conflict with the garbage collector. Yours, Sandor
Jun 04 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:adi2hp$nce$1 digitaldaemon.com...

 seems shorter, and cleaner. But here comes the problem: Exceptions!
 If fread causes an exception the file will not be closed in either case.

How can C function fread() throw an exception???
Jun 04 2002
parent reply Patrick Down <pat codemoon.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in news:adiee6$132l$1
 digitaldaemon.com:

 "Sandor Hojtsy" <hojtsy index.hu> wrote in message
 news:adi2hp$nce$1 digitaldaemon.com...
 
 seems shorter, and cleaner. But here comes the problem: Exceptions!
 If fread causes an exception the file will not be closed in either case.

How can C function fread() throw an exception???

It can't. But these are really just overly simple examples. You would probaly use a file helper class instead that would throw exceptions.
Jun 04 2002
parent "Sandor Hojtsy" <hojtsy index.hu> writes:
"Patrick Down" <pat codemoon.com> wrote in message
news:Xns9223768DCBD69patcodemooncom 63.105.9.61...
 If fread causes an exception the file will not be closed in either case.


It can't. But these are really just overly simple examples. You would probaly use a file helper class instead that would throw exceptions.

Of course it can't, but I haven't told that it is the C standard fread! That's just the point : exceptions can hide everywhere. I thought it is obvious that I mean: If any of the hundred functions you would call, in a realistic situation, between opening and manually closing the file throws an exception, it causes the Wrong Thing. Yours, Sandor
Jun 04 2002
prev sibling parent Patrick Down <pat codemoon.com> writes:
"Sandor Hojtsy" <hojtsy index.hu> wrote in
news:adi2hp$nce$1 digitaldaemon.com: 
 If fread causes an exception the file will not be closed in either
 case. So modifiing your example:
 
 void withfile(char[] name, char[] mode)
: void do(FILE*)
 {
   FILE* f = fopen(toStringz(name),toStringz(mode));
   try {
     do(f);
   } finally {
     fclose(f);
   }
 }
 
 Seems better. From this point of view your syntax may decrease client
 code complexity.

Yes, this is better. You would probably wrap FILE* with a helper class too but I do think that it does produce more readable and less complex client code. The purpose of my post was really more to sell the idea of a generic algorithm syntax. The fact that it can be used to clean up stack lifetime resources is an incidental benefit. I think the idea becomes especially usefull when combined with generic types.
Jun 04 2002