www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - compile-time function evaluation consuming too much memory

reply Christian Kamm <kamm.incasoftware shift-at-left-and-remove-this.de> writes:
I have been experimenting with using ctfe and static foreach to generate
code from strings. While for small strings all is fine, larger strings make
DMD's memory usage go over the top. It came up when splitting a string into
an array of strings - something I'd think is relatively common. Simple
example that illustrates the problem:

char[] make_empty_string(int n)
{
  char[] result;

  for(int i = 0; i < n; ++i)
    result ~= " ";
  return result;
}

const char[] largestring1 = make_empty_string(10000);
const char[] largestring2 = make_empty_string(10000);
const char[] largestring3 = make_empty_string(10000);
...

Every call to make_empty_string(10000) will consume 50 MB of memory which it
seems will not be released until dmd terminates. That happens because every
intermediate value in 'result' is stored.

My question is if
- this is expected and actually intentional as ctfe merely extends constant
folding or if 
- this is a bug and dmd could just free the old memory on array
appends/assigns/... (would this ever be a problem?) or if
- D needs a compile-time garbage collector. ;)

Cheers,
Christian
Jun 28 2007
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christian Kamm schrieb am 2007-06-28:
 I have been experimenting with using ctfe and static foreach to generate
 code from strings. While for small strings all is fine, larger strings make
 DMD's memory usage go over the top. It came up when splitting a string into
 an array of strings - something I'd think is relatively common. Simple
 example that illustrates the problem:

 char[] make_empty_string(int n)
 {
   char[] result;

   for(int i = 0; i < n; ++i)
     result ~= " ";
   return result;
 }

 const char[] largestring1 = make_empty_string(10000);
 const char[] largestring2 = make_empty_string(10000);
 const char[] largestring3 = make_empty_string(10000);
 ...

 Every call to make_empty_string(10000) will consume 50 MB of memory which it
 seems will not be released until dmd terminates. That happens because every
 intermediate value in 'result' is stored.

 My question is if
 - this is expected and actually intentional as ctfe merely extends constant
 folding or if 
 - this is a bug and dmd could just free the old memory on array
 appends/assigns/... (would this ever be a problem?) or if
 - D needs a compile-time garbage collector. ;)

The memory performance of ctfe can definatly be improved, however your sample code too. Yes I know the above was just a sample code to demonstrate the issue but quite frequently that issue can be evaded by using another algoritm: # char[] make_empty_string(int n){ # char[] result; # # if(n){ # result = " "; # while(result.length * 2 <= n){ # result ~= result; # } # if(result.length < n){ # result ~= result[0 .. n - $]; # } # } # # return result; # } Thomas -----BEGIN PGP SIGNATURE----- iQIVAwUBRoVizrZlboUnBhRKAQKoPg/+IegyEg3TEpp6+HQehmiHAXTqQZCZpfxc w8w1WyB35pxhvMKrReCXUMuQytQunulRPn6ZSry6pqNypol/UX7pJ1bl48V8bSZN bpPHwietJgkM0b+8tfbVarTCWKmq0Tq2QT2v7Hi1rgB9Vu0Ge8iKLELA0RCLC18z B7eUXQu9L2uvfEePT6+ONTq4ehkkbY4MXQhevl/dhq0VgW0EjGbq8Sx+ebBHnWjb G2EM7UovI5Kxe7XmJKwfOhmziIx4B68M3P+KBNiMvuhR9rqi73KFb9VCAdWVn7PK qOPylEitRLYXqXZ5WD0aJCV/DoyPNbOF+k6VlZ98QmGJFtsG3abYpSTcihs5AgnS pf+4bG/gN1iWoScWWHGWW9cPl8eBvIjHCS9YwklxdSD+/Fs1E6+yA/oBs5TLk1K8 6u8IczMTmFOfecllMxJkKScpYum6Y6DRXl+0SQF9zsMxyiyQUychW38ZVZR1Mu4s D4ktm/++bvMlr9IiX32UB5nMYPDTDyWJI4MheMTCwhelCi0WpXtRKHx/9tU8BAv4 CqDvdmKkE2Jr4xlFDtQl4zbxxSwvwgF5yZAKexZsVe+0E0rUsYG6sXUDPL8PhJCN tHTHmGwKwxDU65GvLYZEOa99CNs+quhhbyG15RmMpvvaJhUz5UMhCkFvU2yPFKoP D9HIg9panV8= =rsjl -----END PGP SIGNATURE-----
Jun 29 2007
parent Christian Kamm <kamm.incasoftware shift-at-left-and-remove-this.de> writes:
 The memory performance of ctfe can definatly be improved,

Do you think this warrants being entered as a bug?
 Yes I know the above was just a sample code to
 demonstrate the issue but quite frequently that issue can be evaded by
 using another algoritm:

Yes, I use the same for tuples. First I was splitting strings directly into tuples in order to statically foreach over them. However, storing the tuples seemed to take a lot more memory than an array of strings, so nowadays I just have const char[][] splitstring = split(str, ' '); foreach(i, unused; MakeIterTuple!(splitstring.length)) { ... splitstring[i] ... } where MakeIterTuple is template MakeIterTuple(uint size) { static if(size == 0) alias Tuple!() MakeIterTuple; else static if(size % 2 == 0) alias Tuple!(MakeIterTuple!(size/2), MakeIterTuple!(size/2)) MakeIterTuple; else alias Tuple!(void, MakeIterTuple!((size-1)/2), MakeIterTuple!((size-1)/2)) MakeIterTuple; } And that works well. What's currently my greatest memory hog is the split function above. It's basically implemented like this char[][] split(char[] str, char by) { char[][] result; size_t c = 0; int pos; while((pos = find(str[c..$], by)) != -1) { result ~= str[c..c+pos]; c += pos+1; } result ~= str[c..$]; return result; } and I can't seem to find a way to make it more memory efficient. I've thought about finding the positions and lengths of the strings to be taken and then using a mixin to create one large array literal, but can't get around building the string somewhere. Cheers, Christian
Jun 29 2007