www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Lisp like lists and a problem with TANGO fromStringz

reply Bjoern <nanali nospam-wanadoo.fr> writes:
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

Undefined Symbol ... fromStringz
Hi
even after spending a few hours I was not able to figure out where I 
made an mistake.

The error msg as follows :

C:\dmd\projects\bsdutil>dmd lisplist
C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
OPTLINK (R) for Win32  Release 8.00.1
Copyright (C) Digital Mars 1989-2004  All rights reserved.
lisplist.obj(lisplist)
  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
--- errorlevel 1

somehow :) nagged, hope somebody is willing to help
Many thanks in advance :
Bjoern

attached the source
Mar 02 2008
next sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Bjoern" <nanali nospam-wanadoo.fr> wrote in message 
news:fqea2o$tnj$1 digitalmars.com...
 Undefined Symbol ... fromStringz
 Hi
 even after spending a few hours I was not able to figure out where I
 made an mistake.

 The error msg as follows :

 C:\dmd\projects\bsdutil>dmd lisplist
 C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
 OPTLINK (R) for Win32  Release 8.00.1
 Copyright (C) Digital Mars 1989-2004  All rights reserved.
 lisplist.obj(lisplist)
  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
 --- errorlevel 1

 somehow :) nagged, hope somebody is willing to help
 Many thanks in advance :
 Bjoern

If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z". They just forgot to rename it. If that doesn't work, well hell.
Mar 02 2008
next sibling parent Bjoern <nanali nospam-wanadoo.fr> writes:
Jarrett Billingsley schrieb:
 "Bjoern" <nanali nospam-wanadoo.fr> wrote in message 
 news:fqea2o$tnj$1 digitalmars.com...
 Undefined Symbol ... fromStringz
 Hi
 even after spending a few hours I was not able to figure out where I
 made an mistake.

 The error msg as follows :

 C:\dmd\projects\bsdutil>dmd lisplist
 C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
 OPTLINK (R) for Win32  Release 8.00.1
 Copyright (C) Digital Mars 1989-2004  All rights reserved.
 lisplist.obj(lisplist)
  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
 --- errorlevel 1

 somehow :) nagged, hope somebody is willing to help
 Many thanks in advance :
 Bjoern

If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z". They just forgot to rename it. If that doesn't work, well hell.

Ah, the joy of Sunday afternoon programming :) As workaround I use : alias bool function(char*, char*) CompFunc; bool string_compare(char* a, char* b) { return ( a[0 .. strlen(a)] == b[0 .. strlen(b)] ); //return (fromStringz(a) == fromStringz(b)); } Thanks Jarret, have a nice day B
Mar 02 2008
prev sibling parent Bjoern <nanali nospam-wanadoo.fr> writes:
Jarrett Billingsley schrieb:
 "Bjoern" <nanali nospam-wanadoo.fr> wrote in message 
 news:fqea2o$tnj$1 digitalmars.com...
 Undefined Symbol ... fromStringz
 Hi
 even after spending a few hours I was not able to figure out where I
 made an mistake.

 The error msg as follows :

 C:\dmd\projects\bsdutil>dmd lisplist
 C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
 OPTLINK (R) for Win32  Release 8.00.1
 Copyright (C) Digital Mars 1989-2004  All rights reserved.
 lisplist.obj(lisplist)
  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
 --- errorlevel 1

 somehow :) nagged, hope somebody is willing to help
 Many thanks in advance :
 Bjoern

If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z". They just forgot to rename it. If that doesn't work, well hell.

The undefined symbol error message is very confusing. Bjoern
Mar 02 2008
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Bjoern:
 z = cast(CONS *) malloc(CONS.sizeof);

Suggestion 1: all those malloc will slow your code down, and on 32 bit CPU they burn more than 8 bytes for each cons. So I suggest you to manage a free list of them, allocating them at blocks, for example of about 64 kb. This reduces memory used, speeds up the allocation/deallocation, and may increase cache locality of your cons cells pointers a little too. Suggestion 2: I don't fully understand yet how the GC heap interacts with the C heap, so the following may lead to GC problems. You can allocate those memory blocks from the C heap instead of the GC heap (then you may want to add a root for each externally given pointer, this may be a bit messy). The C malloc gives you an aligned memory block, so all cons structs have an address aligned to 8/16 bytes too. So you have one or more bits free into the pointers. So you can use your cons cells to represent far more than just pairs of pointers, like chars, symbols, not-too-much-short integers, almost-floats, short strings (just need their length, that is small. Small strings are very common), etc. Many Lisp implementations use such tagged data, this reduces pointer deferences and memory use. The problem is that this is dynamic typing, so such cons become a kind of box/variant (and this isn't too much bad, I think). [Unrelated question: why aren't D dynamic arrays triples of values? So they can store their actual length and the memory they have available, so they can support something like the reserve() of C++ STL. Is the GC already storing such third value? But then I haven't found a reliable way to perform a reserve-like operation.] Bye, bearophile
Mar 02 2008
next sibling parent Bjoern <nanali nospam-wanadoo.fr> writes:
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

bearophile schrieb:
 Bjoern:
 z = cast(CONS *) malloc(CONS.sizeof);

Suggestion 1: all those malloc will slow your code down, and on 32 bit CPU they burn more than 8 bytes for each cons. So I suggest you to manage a free list of them, allocating them at blocks, for example of about 64 kb. This reduces memory used, speeds up the allocation/deallocation, and may increase cache locality of your cons cells pointers a little too.

Thanks bearophile, well I thought 1) make it work then make it fast. I'll pick up your idea. I recall that Uwe Salomon uses this technic in his Indigo container library. I'll have a look.
 Suggestion 2: I don't fully understand yet how the GC heap interacts with the
C heap, so the following may lead to GC problems. You can allocate those memory
blocks from the C heap instead of the GC heap (then you may want to add a root
for each externally given pointer, this may be a bit messy). The C malloc gives
you an aligned memory block, so all cons structs have an address aligned to
8/16 bytes too. So you have one or more bits free into the pointers. So you can
use your cons cells to represent far more than just pairs of pointers, like
chars, symbols, not-too-much-short integers, almost-floats, short strings (just
need their length, that is small. Small strings are very common), etc. Many
Lisp implementations use such tagged data, this reduces pointer deferences and
memory use. The problem is that this is dynamic typing, so such cons become a
kind of box/variant (and this isn't too much bad, I think).
 

management article. HTH to understand. The source compiles now. using dmd 1.025 but the output is not as expected. Maybe this is a GC problem. Dunno yet. I've attached the source. Hope you have a look.
 [Unrelated question: why aren't D dynamic arrays triples of values? So they
can store their actual length and the memory they have available, so they can
support  something like the reserve() of C++ STL. Is the GC already storing
such third value? But then I haven't found a reliable way to perform a
reserve-like operation.]
 

If YOU don't know it .... :)
 Bye,
 bearophile

Ah, the joy of Sunday afternoon programming .. Bjoern
Mar 02 2008
prev sibling next sibling parent Bjoern <nanali nospam-wanadoo.fr> writes:
bearophile wrote:

quote :
So you have one or more bits free into the pointers. So you can use your 
cons cells to represent far more than just pairs of pointers, like 
chars, symbols, not-too-much-short integers, almost-floats, short 
strings (just need their length, that is small. Small strings are very 
common), etc. Many Lisp implementations use such tagged data, this 
reduces pointer deferences and memory use. The problem is that this is 
dynamic typing, so such cons become a kind of box/variant (and this 
isn't too much bad, I think).
end quote
---------
Yes, this is excactly what I want!
The idea is to implement a  rowset (database related) datatype based on 
lists.
Well, having a LISP list in D is, beside a possible rowset 
implementation, not that bad, I guess.
Thanks, Bjoern
Mar 02 2008
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
bearophile wrote:

 [Unrelated question: why aren't D dynamic arrays triples of values? So they
can store their actual length and the memory they have available, so they can
support  something like the reserve() of C++ STL. Is the GC already storing
such third value? But then I haven't found a reliable way to perform a
reserve-like operation.]

The gc is storing that third value. Each allocated chunk has a size. If an array starts at the beginning of a gc chunk, its capacity is defined as the size of that chunk. Here is a way to reserve space for arrays: // int rather than size_t due to IFTI sensitivity void reserve(T)(inout T[] arr, int len) { if (len > arr.length) { auto t = arr.length; arr.length = len; arr.length = t; } } -- Oskar
Mar 02 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
To: Newsgroups: digitalmars.D
Subject: Re: Lisp like lists and a problem with TANGO fromStringz

Oskar Linde>Here is a way to reserve space for arrays:<

I did try a function equal to that one, but have you timed it? My timing
results don't show any advantage of using such reserve.

------------------------------

D code:

// this D code is written to look as much as possible as the C++ version
import std.stdio: printf;
import std.c.time: clock, CLOCKS_PER_SEC, time_t;
import std.conv: toInt;

double myclock() {
    time_t t = clock();
    if (t == -1)
        return 0.0;
    else
        return t/cast(double)CLOCKS_PER_SEC;
}

void reserve(T)(ref T[] arr, int len) {
    // int rather than size_t due to IFTI sensitivity
    if (len > arr.length) {
        auto t = arr.length;
        arr.length = len;
        arr.length = t;
    }
}

int main(string[] argv) {
    int n = toInt(argv[1]);
    double t = myclock();

    switch (argv[2][0]) {
        case '1':
            int[] a;
            for (int i = 0; i < n; ++i)
                a ~= i;
            break;
        case '2':
            int[] a;
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a ~= i;
            break;
        case '3':
            auto a = new int[n];
            for (int i = 0; i < n; ++i)
                a[i] = i;
            break;
    }

    printf("%.3f s\n", myclock() - t);
    return 0;
}


------------------------------

C++ code:

#include <stdio.h>
#include <time.h>
#include <vector>

using namespace std;

double myclock() {
    time_t t = clock();
    if (t == -1)
        return 0.0;
    else
        return t/(double)CLOCKS_PER_SEC;
}

int main(int argc, char **argv) {
    int n = atoi(argv[1]);
    double t = myclock();

    switch (argv[2][0]) {
        case '1': {
            vector<int> a;
            for (int i = 0; i < n; ++i)
                a.push_back(i);
            break;
        }
        case '2': {
            vector<int> a;
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a.push_back(i);
            break;
        }
        case '3': {
            vector<int> a(n);
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a[i] = i;
            break;
        }
    }

    printf("%.3f s\n", myclock() - t);
    return 0;
}

-----------------------------------

OUTPUT TIMINGS:

DMD 1.026, N = 12_000_000:
  append:            3.3 s
  reserve + append:  3.3 s
  allocate + assign: 1.0 s

C++ MinGW 4.2.1, N = 12_000_000, best of 3:
  append:            1.4 s
  reserve + append:  0.5 s
  allocate + assign: 1.0 s

C++ compiled with -O3 -s
DMD compiled with -O -release -inline
Win on Pentium3 CPU.

Notes:
- All timings are best of 3, rounded to 2 significant digits.
- I don't understand how in C++ the append() can twice faster than the assign.

Bye,
bearophile
Mar 03 2008
next sibling parent reply kov_serg <kov_serg freemail.ru> writes:
bearophile Wrote:

...
 
 OUTPUT TIMINGS:
 
 DMD 1.026, N = 12_000_000:
   append:            3.3 s
   reserve + append:  3.3 s
   allocate + assign: 1.0 s
 
 C++ MinGW 4.2.1, N = 12_000_000, best of 3:
   append:            1.4 s
   reserve + append:  0.5 s
   allocate + assign: 1.0 s
 
 C++ compiled with -O3 -s
 DMD compiled with -O -release -inline
 Win on Pentium3 CPU.
 
 Notes:
 - All timings are best of 3, rounded to 2 significant digits.
 - I don't understand how in C++ the append() can twice faster than the assign.

It's simple. 1st do not use clock function -- it measure processor time for current thread. Simply use funtion that measure the time passed. 2nd reason is CPU cache behaviour. Turn it off and compare again. http://en.wikipedia.org/wiki/CPU_cache http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf ...
Mar 03 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Oskar Linde>DMD/GDC also seems unable to inline the _d_arrayappendcTp call
which would be one reason for the rather poor performance overall.<

I see.

--------------

kov_serg:
 1st do not use clock function -- it measure processor time for current thread.
Simply use funtion that measure the time passed.

Okay. I usually use wall clock timings (I haven't used it here to avoid initialization times). Note that I have used a Pentium3 and the system was well unloaded. So I think the actual wall clock time was close.
 http://en.wikipedia.org/wiki/CPU_cache

That's a very rich article! I'll read it.
 http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf

Very interesting, I have read it, but it's a bit too much maniac for me still :-) Compiler writers must help us for such things ;-)
 2nd reason is CPU cache behaviour. Turn it off and compare again.

I don't understand how switching off a part of the CPU can help my tests show more real results. (And I don't know how to do it.) In this forum there are lot of people that know much more than me, bye and thank you very much to both, bear hugs, bearophile
Mar 03 2008
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
bearophile wrote:
 To: Newsgroups: digitalmars.D
 Subject: Re: Lisp like lists and a problem with TANGO fromStringz
 
 Oskar Linde>Here is a way to reserve space for arrays:<
 
 I did try a function equal to that one, but have you timed it? My timing
results don't show any advantage of using such reserve.

There are more advantages to do it than speed. Memory fragmentation and wastage for example. <snip>
 OUTPUT TIMINGS:
 
 DMD 1.026, N = 12_000_000:
   append:            3.3 s
   reserve + append:  3.3 s
   allocate + assign: 1.0 s

It is hard to draw too many conclusions from such a simple test. For one thing, the DMD GC will on array appends try to grow the allocated GC chunk without reallocating. In an application such as this (with virtually no memory fragmentation), there is a good chance that it is able to do that almost all the time. DMD/GDC also seems unable to inline the _d_arrayappendcTp call which would be one reason for the rather poor performance overall. There are cases where the reallocation helps though, even though the performance overall is embarrassing compared to not appending. GDC 0.24, N = 120_000_000: append: 9.3 s reserve + append: 5.3 s allocate + assign: 1.3 s (Also note that append in this case wastes 168 MB memory compared to reserve+append (*)) It is also slightly faster for small arrays, where the GC bucket limitation prevents in place growing from happening: GDC 0.24, N = 512: append: 26 Ás reserve + append: 23 Ás allocate + assign: 1.7 Ás ---- *) There seems to be a bug (or it might be by design) that the GC on appending to an array sometimes allocates up to 3x the required memory (to me, it seems anything above 2x is seriously overkill). The line in question is in internal/gc/gc.d in Phobos, where the correct line is commented out: //long mult = 100 + (1000L * size) / (6 * log2plus1(newcap)); long mult = 100 + (1000L * size) / (log2plus1(newcap)); So, for example, appending one int to a 1MB int[] array, could result in an array with 3MB reserved space. Or appending a single int to a 74MB array results in a 184 MB array. Something to look out for. -- Oskar
Mar 03 2008
parent Bjoern <nanali nospam-wanadoo.fr> writes:
Oskar Linde schrieb:
 bearophile wrote:
 To: Newsgroups: digitalmars.D
 Subject: Re: Lisp like lists and a problem with TANGO fromStringz

 Oskar Linde>Here is a way to reserve space for arrays:<

 I did try a function equal to that one, but have you timed it? My 
 timing results don't show any advantage of using such reserve.

There are more advantages to do it than speed. Memory fragmentation and wastage for example. <snip>
 OUTPUT TIMINGS:

 DMD 1.026, N = 12_000_000:
   append:            3.3 s
   reserve + append:  3.3 s
   allocate + assign: 1.0 s

It is hard to draw too many conclusions from such a simple test. For one thing, the DMD GC will on array appends try to grow the allocated GC chunk without reallocating. In an application such as this (with virtually no memory fragmentation), there is a good chance that it is able to do that almost all the time. DMD/GDC also seems unable to inline the _d_arrayappendcTp call which would be one reason for the rather poor performance overall. There are cases where the reallocation helps though, even though the performance overall is embarrassing compared to not appending. GDC 0.24, N = 120_000_000: append: 9.3 s reserve + append: 5.3 s allocate + assign: 1.3 s (Also note that append in this case wastes 168 MB memory compared to reserve+append (*)) It is also slightly faster for small arrays, where the GC bucket limitation prevents in place growing from happening: GDC 0.24, N = 512: append: 26 Ás reserve + append: 23 Ás allocate + assign: 1.7 Ás ---- *) There seems to be a bug (or it might be by design) that the GC on appending to an array sometimes allocates up to 3x the required memory (to me, it seems anything above 2x is seriously overkill). The line in question is in internal/gc/gc.d in Phobos, where the correct line is commented out: //long mult = 100 + (1000L * size) / (6 * log2plus1(newcap)); long mult = 100 + (1000L * size) / (log2plus1(newcap)); So, for example, appending one int to a 1MB int[] array, could result in an array with 3MB reserved space. Or appending a single int to a 74MB array results in a 184 MB array. Something to look out for.

Gentleman, somebody out there remembering the topic ? ... : Bjoern -< OOPS wrong planet.
Mar 03 2008