www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Pure dynamic casts?

reply bearophile <bearophileHUGS lycos.com> writes:
I don't know much about this topic, but this post is mostly in question form.

Are dynamic casts pure? I have compiled the following small program with LDC:


version (Tango)
    import tango.stdc.stdio: printf;
else version (D1)
    import std.c.stdio: printf;
else
    import core.stdc.stdio: printf;

class A { int a; }

void main() {
    Object o = new A();
    A a = cast(A)o;
    a.a = 10;
    printf("%d %d\n", a.a, cast(A)o);
}


LDC based on DMD v1.045 and llvm 2.6 (Thu Sep 10 23:50:27 2009)
ldc -O5 -release -inline -output-s dyncast_test.d

The asm it produces shows two calls to _d_dynamic_cast:

_Dmain:
    pushl   %esi
    subl    $16, %esp
    movl    $_D5cast51A7__ClassZ, (%esp)
    call    _d_allocclass
    movl    %eax, %esi
    movl    $_D5cast51A6__vtblZ, (%esi)
    movl    $0, 4(%esi)
    movl    $0, 8(%esi)
    movl    %esi, (%esp)
    movl    $_D5cast51A7__ClassZ, 4(%esp)
    call    _d_dynamic_cast
    movl    $10, 8(%eax)
    movl    %esi, (%esp)
    movl    $_D5cast51A7__ClassZ, 4(%esp)
    call    _d_dynamic_cast
    movl    %eax, 8(%esp)
    movl    $10, 4(%esp)
    movl    $.str1, (%esp)
    call    printf
    xorl    %eax, %eax
    addl    $16, %esp
    popl    %esi
    ret $8


If the dynamic cast are pure the compiler can call it only once here (LLVM
already has two functions attributes for pure functions).

(Related: can the code that performs the dynamic casts be in some situations
inlined by LDC?)

Bye,
bearophile
Sep 21 2009
next sibling parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
bearophile wrote:
 I don't know much about this topic, but this post is mostly in question form.
 
 Are dynamic casts pure? I have compiled the following small program with LDC:
 
 
 version (Tango)
     import tango.stdc.stdio: printf;
 else version (D1)
     import std.c.stdio: printf;
 else
     import core.stdc.stdio: printf;
 
 class A { int a; }
 
 void main() {
     Object o = new A();
     A a = cast(A)o;
     a.a = 10;
     printf("%d %d\n", a.a, cast(A)o);
 }
 
 
 LDC based on DMD v1.045 and llvm 2.6 (Thu Sep 10 23:50:27 2009)
 ldc -O5 -release -inline -output-s dyncast_test.d
 
 The asm it produces shows two calls to _d_dynamic_cast:
 
 [snip]
 
 If the dynamic cast are pure the compiler can call it only once here (LLVM
already has two functions attributes for pure functions).
 
 (Related: can the code that performs the dynamic casts be in some situations
inlined by LDC?)
 
 Bye,
 bearophile

One cast for the assignment to a and one cast for the call to printf. I would say the casts are pure to a certain level, they don't always return the same value to the same input because they return references. They do however perform the exact same code path for the same inputs. So it should be possible to inline them. Take a look at the _d_dynamic_cast implementation of my runtime: Object _d_dynamic_cast(in Object p, in ClassInfo ci) { uint offset = void; return p && _d_isbaseof2(p.classinfo, ci, offset) ? cast(Object)(*cast(void**)&p + offset) : null; } You can clearly see all there is to a dynamic cast is simply adjusting the pointer to the object's virtual table. So a compiler knowing the source and destination offset of the vtable can easily inline the code for such a cast. An interface dynamic cast is similar but the interface reference has to be converted into an object reference first and then passed to _d_dynamic cast. Maybe you could open a ticket in bugzilla about this? Jeremie
Sep 21 2009
parent Christopher Wright <dhasenan gmail.com> writes:
Jeremie Pelletier wrote:
 You can clearly see all there is to a dynamic cast is simply adjusting 
 the pointer to the object's virtual table. So a compiler knowing the 
 source and destination offset of the vtable can easily inline the code 
 for such a cast.

That may be correct, but you're describing it in a way that confuses me, probably in part because it assumes a fair bit of knowledge on the part of the listener. Objects are laid out like this: class vtbl pointer object.Object fields superclass fields superclass interface1 vtbl pointer superclass interface2 vtbl pointer ... class fields class interface1 vtbl pointer class interface2 vtbl pointer ... (The relative order of fields and interface vtbl pointers doesn't matter.) You need a special vtbl for interfaces because a virtual function call works like: branch (object.vtbl + offset) The offset must be known at compile time. And it has to be the same offset for all possible objects implementing this interface. One implemented interface might require an entirely different vtbl layout than another. The solution is to use an entirely different vtbl.
Sep 22 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Dynamic casts are pure. They don't use global state, and have the same output
for the same reference as input. Interestingly, dynamic cast results are
independent of intervening mutable calls... So there's even greater opportunity
for optimization.

bearophile Wrote:

 I don't know much about this topic, but this post is mostly in question form.
 
 Are dynamic casts pure? I have compiled the following small program with LDC:
 
 
 version (Tango)
     import tango.stdc.stdio: printf;
 else version (D1)
     import std.c.stdio: printf;
 else
     import core.stdc.stdio: printf;
 
 class A { int a; }
 
 void main() {
     Object o = new A();
     A a = cast(A)o;
     a.a = 10;
     printf("%d %d\n", a.a, cast(A)o);
 }
 
 
 LDC based on DMD v1.045 and llvm 2.6 (Thu Sep 10 23:50:27 2009)
 ldc -O5 -release -inline -output-s dyncast_test.d
 
 The asm it produces shows two calls to _d_dynamic_cast:
 
 _Dmain:
     pushl   %esi
     subl    $16, %esp
     movl    $_D5cast51A7__ClassZ, (%esp)
     call    _d_allocclass
     movl    %eax, %esi
     movl    $_D5cast51A6__vtblZ, (%esi)
     movl    $0, 4(%esi)
     movl    $0, 8(%esi)
     movl    %esi, (%esp)
     movl    $_D5cast51A7__ClassZ, 4(%esp)
     call    _d_dynamic_cast
     movl    $10, 8(%eax)
     movl    %esi, (%esp)
     movl    $_D5cast51A7__ClassZ, 4(%esp)
     call    _d_dynamic_cast
     movl    %eax, 8(%esp)
     movl    $10, 4(%esp)
     movl    $.str1, (%esp)
     call    printf
     xorl    %eax, %eax
     addl    $16, %esp
     popl    %esi
     ret $8
 
 
 If the dynamic cast are pure the compiler can call it only once here (LLVM
already has two functions attributes for pure functions).
 
 (Related: can the code that performs the dynamic casts be in some situations
inlined by LDC?)
 
 Bye,
 bearophile

Sep 21 2009
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the same output
for the same reference as input. Interestingly, dynamic cast results are
independent of intervening mutable calls... So there's even greater opportunity
for optimization.

What if the GC just happens to re-use that address for a different object?
Sep 22 2009
next sibling parent Jeremie Pelletier <jeremiep gmail.com> writes:
Daniel Keep wrote:
 
 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the same output
for the same reference as input. Interestingly, dynamic cast results are
independent of intervening mutable calls... So there's even greater opportunity
for optimization.

What if the GC just happens to re-use that address for a different object?

I don't know about druntime's GC but I used to have this issue with my own GC. I would cast an object to an interface type which would make its only reference point to a different 16bytes block, remember that bit fields in the GC are aligned to 16bytes, and the allocation had therefore no more reachable pointers, or so the GC thought. What fixed it was to adjust the scan phase and align found pointers to the block size of its corresponding allocation. That way the GC won't reuse the address of reachable objects no matter what interface or subclass they are casted to.
Sep 22 2009
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Daniel Keep:
What if the GC just happens to re-use that address for a different object?<

I don't understand. Can you explain me an example where this problem may happen? Bye, bearophile
Sep 22 2009
prev sibling next sibling parent Jason House <jason.james.house gmail.com> writes:
Daniel Keep Wrote:

 
 
 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the same output
for the same reference as input. Interestingly, dynamic cast results are
independent of intervening mutable calls... So there's even greater opportunity
for optimization.

What if the GC just happens to re-use that address for a different object?

I was thinking of that and tried to be careful with my wording. I guess I failed. Memoization can't occur, exactly for the reason you indicate. I was trying to say that for the life of an object, the type and type info is immutable. Since a dynamic cast only uses that immutable state, the return value does not change over the lifetime of the object. This could be used for optimizations above and beyond simple purity.
Sep 22 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jarrett Billingsley:

 Or you could - I dunno - cache the result of the dynamic cast in a
 local, since doing multiple dynamic casts is terrible style anyway.
 Just saying. ;)

Yes, most of the optimizations done by the compiler can instead be done by the programmer (but some stupid code can be generated by macros, mixins, etc). Bye, bearophile
Sep 22 2009
parent Jeremie Pelletier <jeremiep gmail.com> writes:
bearophile wrote:
 Jarrett Billingsley:
 
 Or you could - I dunno - cache the result of the dynamic cast in a
 local, since doing multiple dynamic casts is terrible style anyway.
 Just saying. ;)

Yes, most of the optimizations done by the compiler can instead be done by the programmer (but some stupid code can be generated by macros, mixins, etc). Bye, bearophile

But the compiler has no means to determine that the two calls to _d_dynamic_cast will return the same pointer.
Sep 22 2009
prev sibling next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Steven Schveighoffer wrote:
 On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:
 
 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the
 same output for the same reference as input. Interestingly, dynamic
 cast results are independent of intervening mutable calls... So
 there's even greater opportunity for optimization.

What if the GC just happens to re-use that address for a different object?

That's only if memoization is used.

Well, if you're not looking at memoization, why is this discussion even happening?
 I think what Jason said is correct -- you can view dynamic cast as
 taking 2 arguments, one is the reference which is simply echoed as the
 return value, and one is the classinfo, which is an immutable argument
 that causes a decision to be made.

The reference *isn't* echoed as a return value. If it was, then casting wouldn't do anything.
 In fact, you could use memoization on a dynamic cast subfunction that
 memoizes on the target type and the classinfo of the source, regardless
 of the reference value, and returns an offset to add to the return value.

Yes, that would work, provided both references are immutable.
 It's pretty easy for the compiler to prove that o has not be reassigned,
 so even without memoization, the compiler can logically assume that the
 result from the first dynamic cast can be reused.  I think this is the
 major optimization for pure functions anyways, not memoization.

Pretty easy? So you're ignoring threads, then? The ONLY way I can see object casting being treated as a pure function is if all references passed in are immutable and even then only if the compiler can prove the reference cannot be changed (i.e. the reference is always kept on the stack AND you assume you never manually delete it).
 -Steve

-- Daniel
Sep 22 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Steven Schveighoffer wrote:
 On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:
 Well, if you're not looking at memoization, why is this discussion even
 happening?

There are other pure optimizations that can occur without memoization. In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization. How do you know memoization helps when you don't know how often the same arguments will be used?

What other optimisations? The only other one I can think of off the top of my head is common subexpression elimination. Except that's effectively caller-side memoisation.
 The ONLY way I can see object casting being treated as a pure function
 is if all references passed in are immutable and even then only if the
 compiler can prove the reference cannot be changed (i.e. the reference
 is always kept on the stack AND you assume you never manually delete it).

How is a thread going to come along and change a local variable in another thread?

I don't know if you've noticed, but objects tend to be allocated on the heap, not the stack. :P There's nothing to stop another thread from killing the object you're pointing to and then substituting in a new one. It's pathological, yes, but this is the sort of thing that, if it happens, you don't have a hope of debugging. This is why threads completely and utterly suck and I frequently wish they'd never been invented. It's also incidentally why I've personally come to regard reference semantics, in general, as a really, REALLY stupid idea.
 -Steve

-- Daniel

--- Steve

--------- Daniel
Sep 22 2009
parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Daniel Keep wrote:
 
 Steven Schveighoffer wrote:
 On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:
 Well, if you're not looking at memoization, why is this discussion even
 happening?

In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization. How do you know memoization helps when you don't know how often the same arguments will be used?

What other optimisations? The only other one I can think of off the top of my head is common subexpression elimination. Except that's effectively caller-side memoisation.
 The ONLY way I can see object casting being treated as a pure function
 is if all references passed in are immutable and even then only if the
 compiler can prove the reference cannot be changed (i.e. the reference
 is always kept on the stack AND you assume you never manually delete it).

another thread?

I don't know if you've noticed, but objects tend to be allocated on the heap, not the stack. :P There's nothing to stop another thread from killing the object you're pointing to and then substituting in a new one. It's pathological, yes, but this is the sort of thing that, if it happens, you don't have a hope of debugging. This is why threads completely and utterly suck and I frequently wish they'd never been invented. It's also incidentally why I've personally come to regard reference semantics, in general, as a really, REALLY stupid idea.

I myself couldn't live without threads or references, but I agree that mixing the two can lead to interesting scenarios to debug. Just try and code a responsive GUI without using threads, the application will "hang" everytime you're not going back to the main loop every few milliseconds. Or doing network I/O without bringing your program to a crawl, of course you could use async I/O but where is that I/O gonna run without a background thread to support it :) Besides, the upcoming shared qualifier is there to address those issues. You might want to look at this article, I came across it a few months ago and found it rather insightful, it just could change your view on references: http://blogs.msdn.com/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx
Sep 22 2009
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Jeremie Pelletier wrote:
 I myself couldn't live without threads or references, but I agree that
 mixing the two can lead to interesting scenarios to debug. Just try and
 code a responsive GUI without using threads, the application will "hang"
 everytime you're not going back to the main loop every few milliseconds.
 Or doing network I/O without bringing your program to a crawl, of course
 you could use async I/O but where is that I/O gonna run without a
 background thread to support it :)

<rant> See, people equate "parallel execution" with "threads" these days which is half the problem. Threads are a TERRIBLE abstraction, because they don't. There's no protection. Almost every time I express the opinion that threads are broken, people automatically assume I think we should all go back to single-core machines and cooperative multitasking. It's like if I said I don't like Coke that I'm somehow against all beverages. A while ago, I was playing with a toy language design which used value semantics for everything. The one exception was references, but they had two big differences from how, for example, D implements them: 1. They could only point to the heap, and could not point to any pre-existing value. If you, for example, took a ref to an array, it would actually copy the array to the heap. 2. Refs to mutable values could only be dereferenced by their "owning" thread; other threads couldn't even look at the contents. A thread could transfer ownership of a ref to another thread, or disown it entirely at which point it became globally immutable. Because of this, threads in the language couldn't even see each other's mutable data (there were no process-globals, only thread-global); they had to communicate via message passing. In something like that, you don't need to worry about threads interacting badly, because they can't interact except at specific points in the program. For something like a GUI, you'd have the GUI in one thread and do the processing in another. Instead of having to worry about locking and synchronising, you just send the other thread a "do this" message. Of course, something like that would never get into D. Oh well. :) Where was I? </rant>
 Besides, the upcoming shared qualifier is there to address those issues.
 
 You might want to look at this article, I came across it a few months
 ago and found it rather insightful, it just could change your view on
 references:
 http://blogs.msdn.com/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx

I've read it before. It doesn't apply because we're talking about D, not C#. Conceptually, it's correct, but D is a systems programming language. If you're going to deal with things like the definition of purity, then you HAVE to know what a reference really is and how it works. If you just use pure, you don't care. It's like how if you're working on the CLR itself, you'd need to know how references are implemented. If you're USING the CLR, it's immaterial in most cases. In C#, if you talk about "Beethoven", then it's always the same Beethoven. The only way the definition of Beethoven would change is if everyone in the universe forgot who he was first. But in D, that's not entirely true. You can hit the GC over the head with a "delete" and it'll forget. Sometimes it'll forget because you accidentally stored the only remaining reference in a malloc'ed struct RIGHT when some other thread triggered a collect and oh dear. I would like to clarify, however, that I think it is fairly reasonable to expect this optimisation to work. It's just that I've been bitten by threading issues so very many times that I work on the basis that given half a chance, threading will completely screw your program. That said, I think I'm getting too ranty, so I'll just leave it at that.
Sep 22 2009
parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Daniel Keep wrote:
 Jeremie Pelletier wrote:
 I myself couldn't live without threads or references, but I agree that
 mixing the two can lead to interesting scenarios to debug. Just try and
 code a responsive GUI without using threads, the application will "hang"
 everytime you're not going back to the main loop every few milliseconds.
 Or doing network I/O without bringing your program to a crawl, of course
 you could use async I/O but where is that I/O gonna run without a
 background thread to support it :)

<rant> See, people equate "parallel execution" with "threads" these days which is half the problem. Threads are a TERRIBLE abstraction, because they don't. There's no protection. Almost every time I express the opinion that threads are broken, people automatically assume I think we should all go back to single-core machines and cooperative multitasking. It's like if I said I don't like Coke that I'm somehow against all beverages. A while ago, I was playing with a toy language design which used value semantics for everything. The one exception was references, but they had two big differences from how, for example, D implements them: 1. They could only point to the heap, and could not point to any pre-existing value. If you, for example, took a ref to an array, it would actually copy the array to the heap. 2. Refs to mutable values could only be dereferenced by their "owning" thread; other threads couldn't even look at the contents. A thread could transfer ownership of a ref to another thread, or disown it entirely at which point it became globally immutable. Because of this, threads in the language couldn't even see each other's mutable data (there were no process-globals, only thread-global); they had to communicate via message passing. In something like that, you don't need to worry about threads interacting badly, because they can't interact except at specific points in the program. For something like a GUI, you'd have the GUI in one thread and do the processing in another. Instead of having to worry about locking and synchronising, you just send the other thread a "do this" message. Of course, something like that would never get into D. Oh well. :) Where was I? </rant>
 Besides, the upcoming shared qualifier is there to address those issues.

 You might want to look at this article, I came across it a few months
 ago and found it rather insightful, it just could change your view on
 references:
 http://blogs.msdn.com/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx

I've read it before. It doesn't apply because we're talking about D, not C#. Conceptually, it's correct, but D is a systems programming language. If you're going to deal with things like the definition of purity, then you HAVE to know what a reference really is and how it works. If you just use pure, you don't care. It's like how if you're working on the CLR itself, you'd need to know how references are implemented. If you're USING the CLR, it's immaterial in most cases. In C#, if you talk about "Beethoven", then it's always the same Beethoven. The only way the definition of Beethoven would change is if everyone in the universe forgot who he was first. But in D, that's not entirely true. You can hit the GC over the head with a "delete" and it'll forget. Sometimes it'll forget because you accidentally stored the only remaining reference in a malloc'ed struct RIGHT when some other thread triggered a collect and oh dear. I would like to clarify, however, that I think it is fairly reasonable to expect this optimisation to work. It's just that I've been bitten by threading issues so very many times that I work on the basis that given half a chance, threading will completely screw your program. That said, I think I'm getting too ranty, so I'll just leave it at that.

I dont believe there's such a thing as "too ranty", your post was quite insightful. I understand your views on threading, ownership I believe is a feature planned for D3, but it wouldn't be that hard to implement on a custom runtime (i myself am still unsure about whether to do it or not yet). I've read Bartosz's entry about ownership, and I've looked at libraries using it (such as Qt). I think its a neat idea in itself, but I would rather see shared than ownership, or only using ownership on shared objects to keep things fast. The thing is, you can share any piece of data, not just objects, therefore enforcing the ownership model would limit sharable data to objects, which is too limiting for many of us. You're right about concurrency being a different concept than threading, but I wouldn't give threading away for a pure concurrent model either. I believe D is aiming at giving programmers a choice of the tools they wish to use. I could see uses of both a concurrent model with message passing and a threading model with shared data used at once in a program. Shared data is always faster than message passing so you could implement real time code with it, and use messsage passing for other parts such as delegating GUI messages from the main loop. Shared data being harder to manage than message passing does not make it a bad thing, it just means you can have two different models for two different usages. I know the link I gave is for C# and this is D, but these are concepts not bound to any particular languages. I just wanted to point out that references in themselves aren't a bad idea, I find them much more convenient than pointers for objects and arrays, I just wish you had means to explicitly separate an object pointer from its data instead of only being able to use the reference which doesn't make the difference between the two syntactically. Jeremie
Sep 23 2009
next sibling parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
language_fan wrote:
 Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:
 
 You're right about concurrency being a different concept than threading,
   but I wouldn't give threading away for a pure concurrent model either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

The danger in too large a flexibility is that concurrency is not easy and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.

15% slower is an extreme performance hit. I agree that code safety is useful and I use this model all the time for initialization and other code which isn't real time, but 15% takes away a lot of the application's responsiveness, if you have 50 such applications running on your system you just spent $1000 more in hardware to get the performance of entry level hardware with faster code. If you wrote a real time renderer for example with that 15% hit, you get a very noticeable difference in framerate. Not to mention standard GUIs to be laggy on slower machines (just MSN messenger runs to a crawl on my old computer, yet it can run the first UT which does way more operations per second, because there's no performance hit by safer code bloat).
 Shared data is always faster than message passing so you could
 implement real time code with it, and use messsage passing for other
 parts such as delegating GUI messages from the main loop.

Shared data is maybe faster on shared memory machines, but this is not a fact in the general case.
 Shared data being harder to manage than message passing does not make it
 a bad thing, it just means you can have two different models for two
 different usages.

Shared data is not a single homogenous model. Things like transactional memory work in shared memory systems, but still they have different semantics.

Yes, I am well aware of that, I don't want to favor any particular design, I prefer to learn the semantics of a few of them and implement them all. Then I get the choice of the model to use depending on my needs. Take search algorithm for example, you can pick between a binary search, b-tree r-tree, quadtree, octree, bsp, and a bunch of others along with dozens of variants of the ones mentionned depending on the data you're working with. Its the same for concurrency, I can think of vector processing, functional calls, STM, message passing and shared memory off the top of my head. All being valid models with each their pros and cons, together forming a complete all-around solution. Jeremie
Sep 24 2009
next sibling parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier 
 <jeremiep gmail.com> wrote:
 [snip]
 Its the same for concurrency, I can think of vector processing, 
 functional calls, STM, message passing and shared memory off the top 
 of my head. All being valid models with each their pros and cons, 
 together forming a complete all-around solution.

 Jeremie


Oh yeah, thanks! Those were covered by Bartosz in his blog right? I think he used the term Actor for task based programming, I really enjoyed reading these articles.
Sep 24 2009
parent Jeremie Pelletier <jeremiep gmail.com> writes:
Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:59:23 -0400, Jeremie Pelletier 
 <jeremiep gmail.com> wrote:
 
 Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier 
 <jeremiep gmail.com> wrote:
 [snip]
 Its the same for concurrency, I can think of vector processing, 
 functional calls, STM, message passing and shared memory off the top 
 of my head. All being valid models with each their pros and cons, 
 together forming a complete all-around solution.

 Jeremie


Oh yeah, thanks! Those were covered by Bartosz in his blog right? I think he used the term Actor for task based programming, I really enjoyed reading these articles.

There are many things called Actors, but they generally boil down to a process/thread/object + message passing. It's a good model, particularly for the cloud. Task programming is generally intra-process and works on the level of a single work-task (aka function). Cilk, futures or Intel's Threading Building Blocks are the canonical examples. The nice thing (I think) about a good work-stealing runtime is that it can provide the back-end for CPU-vector processing, tasks/futures, intra-process actors and functional calls.

I think the best way to deal with actors is by making them objects. Binding them to heavyweight kernel structures such as threads and processes will greatly limit the number of concurrent actors you can execute at once. Tasks would be most likely implemented by async methods, since you don't care about knowing when they're done. On the other hand, futures are there when you need an async call and monitor its completion. A good implementation could therefore easily support tens of thousands of concurrent actors. I must admit its the first time I hear about work-stealing, I will need to do some research on that :)
Sep 24 2009
prev sibling next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Jeremie Pelletier wrote:
 language_fan wrote:
 Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:

 You're right about concurrency being a different concept than threading,
   but I wouldn't give threading away for a pure concurrent model either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

The danger in too large a flexibility is that concurrency is not easy and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.

15% slower is an extreme performance hit. I agree that code safety is useful and I use this model all the time for initialization and other code which isn't real time, but 15% takes away a lot of the application's responsiveness, if you have 50 such applications running on your system you just spent $1000 more in hardware to get the performance of entry level hardware with faster code.

What are most applications these days? MS Office, a web browser, maybe an email client, some odds and ends, and a whole bunch of web sites running on servers somewhere (with associated database servers and some other odds and ends). How much does it cost to get a slightly better machine? Fully hosted? Maybe $100/month. How much does it cost to develop it in a faster but more error-prone language? Maybe months more time to market, time you're not making money but are still paying developers. Those developers will need to be more skilled than the ones using a safer language, and thus will cost you more. New features will take more time to develop. It's a competitive advantage to use a slower, safer language on the web. Desktop applications are not quite so straightforward.
 If you wrote a real time renderer for example with that 15% hit, you get 
 a very noticeable difference in framerate. Not to mention standard GUIs 
 to be laggy on slower machines (just MSN messenger runs to a crawl on my 
 old computer, yet it can run the first UT which does way more operations 
 per second, because there's no performance hit by safer code bloat).

Games are not interesting in this regard. They require performance, and they're hideously expensive. A studio can throw developers at the problems and make them go away. Not all games are like this, of course. But most are.
Sep 24 2009
parent Jeremie Pelletier <jeremiep gmail.com> writes:
Christopher Wright wrote:
 Jeremie Pelletier wrote:
 language_fan wrote:
 Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:

 You're right about concurrency being a different concept than 
 threading,
   but I wouldn't give threading away for a pure concurrent model 
 either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

The danger in too large a flexibility is that concurrency is not easy and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.

15% slower is an extreme performance hit. I agree that code safety is useful and I use this model all the time for initialization and other code which isn't real time, but 15% takes away a lot of the application's responsiveness, if you have 50 such applications running on your system you just spent $1000 more in hardware to get the performance of entry level hardware with faster code.

What are most applications these days? MS Office, a web browser, maybe an email client, some odds and ends, and a whole bunch of web sites running on servers somewhere (with associated database servers and some other odds and ends). How much does it cost to get a slightly better machine? Fully hosted? Maybe $100/month. How much does it cost to develop it in a faster but more error-prone language? Maybe months more time to market, time you're not making money but are still paying developers. Those developers will need to be more skilled than the ones using a safer language, and thus will cost you more. New features will take more time to develop. It's a competitive advantage to use a slower, safer language on the web. Desktop applications are not quite so straightforward.

I agree with you about web applications, scripting languages are fast enough to serve web pages in most cases. However I did see some pretty ugly code which took over a minute to generate a page from a 1.5Gb database using only string fields without any indexes and PHP was throwing more notices and warnings than actual html. I was referring to desktop applications, where you can't just buy a second server to host your database and serve static content like images and whatnot. From the 2 years I worked in tech support the problem I saw the most often was computers slowed down to a crawl, many of them without viruses or spywares. There are so many people who upgrade their computer just so they can run new software that really doesn't have many new features, but just is written in a safer language. That 15% performance hit over all the programs they ran at once was just too much for their old processor, when 2 months ago before they upgraded the software it was working just fine. Live messenger should not take half a minute to load and perform poorly when the old msn messenger was running perfectly fine.
 If you wrote a real time renderer for example with that 15% hit, you 
 get a very noticeable difference in framerate. Not to mention standard 
 GUIs to be laggy on slower machines (just MSN messenger runs to a 
 crawl on my old computer, yet it can run the first UT which does way 
 more operations per second, because there's no performance hit by 
 safer code bloat).

Games are not interesting in this regard. They require performance, and they're hideously expensive. A studio can throw developers at the problems and make them go away. Not all games are like this, of course. But most are.

<rant> What happened to one man armies like John Carmack and Tim Sweeney? These were the guys who inspired me to get into coding in the first place. I just loved how they could put together entire engines on their own and carefully optimize them. What about Walter who wrote the first C++ compiler and optlink as well as dmd from which I learned so much about compiler internals, or Andrei and his extensive work on meta programming and generic programming, from which I also learned so much. There's also that one guy developing the game LOVE all on his own using only C, he wrote his tools, his engine, his model format, does his artwork, *everything*. And he is making some quick progress that would make teams of 50+ jealous. Point is, only companies have the resources to throw more people at the problem and forget about it. At the end of the day, if you multiply the number of people involved by the development time, the one man army still is winning by far, both in time and resources spent. Games were only an example, what about programs like flash, photoshop, autocad, 3dsm, and others. No matter what domain you're interested in you're gonna come across program that require every bit of performance they can get. Be it developed by teams of 50+ programmers or a single one. I'm not saying safer languages are bad, far from that, I'm just saying there's a way to get the productivity of these languages and the optimization power of unsafe languages together, and there's a target audience for that. Having a single language which provides enough features to allow either safe development or unsafe development will also get everything in between. This is what got me hooked on D in the first place, even when I first read about it this is what was going through my head, and I haven't looked back since. </rant> Jeremie.
Sep 24 2009
prev sibling next sibling parent Jeremie Pelletier <jeremiep gmail.com> writes:
language_fan wrote:
 Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier thusly wrote:
 
 language_fan wrote:
 Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:

 You're right about concurrency being a different concept than
 threading,
   but I wouldn't give threading away for a pure concurrent model
   either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.

useful and I use this model all the time for initialization and other code which isn't real time, but 15% takes away a lot of the application's responsiveness, if you have 50 such applications running on your system you just spent $1000 more in hardware to get the performance of entry level hardware with faster code.

The cost of e.g. doubling computing power depends on the domain. If you are building desktop end user applications, they usually should scale from single core atoms to 8-core high-end enthusiastic game computers. So the cpu requirements shouldn't usually be too large. Usually even most of the 1-3 previous generations' hardware runs them just nicely. Now doubling the cpu power of a low-end current generation PC does not cost $1000, but maybe $20-50. You can continue this until the cpu costs about $400-500. By then you've achieved at least tenfold speedup. On the gpu market the cheapest chips have very limited capabilities. You can buy 5 times faster graphics cards for $50-60. $150-160 will get you a 25 times faster gpu than the first one. 4..8 GB RAM is also really cheap these days, and so is a 1.5 TB hard drive. Hardly any desktop program requires that much from the hardware. The 15% or even 50% slower execution speed seems rather small a problem when you can avoid it by buying faster hardware. Hardly any program is cpu bound, not even the most demanding games are. On the server side many systems use php which is both unsafe and slow. If you have decent load balancing and cache systems, it does not even matter since the system may not be cpu bound, either. Even top10 popular sites like wikipedia run on slow php.

While I agree with what you say, here in Canada our computer parts are often twice the price they cost in the US, we kinda get screwed over transport and customs. A high end CPU is not $500 but well above $1000 CAD alone. And quite frankly, its annoying to change computers every 6 months, I myself buy the high end top of the line every 4-6 years just to avoid that and rarely buy upgrades in between since by then its a new CPU socket, new DDR pinout, new GPU slot, and then I need a more powerful PSU and all that requires a new mobo. Then bluray is out so I change my optical drives, and theres a new sata standard so I get new HDDs. About PHP, its a language that can generate the same webpage in either 0.002 seconds or 2 minutes. Load balancing helps when you cant optimize PHP anymore or when your database is eating all your system resources. Wikipedia most likely has load balancing, but it definitely has well written PHP code running over it too.
Sep 24 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
language_fan wrote:
 The cost of e.g. doubling computing power depends on the domain. If you 
 are building desktop end user applications, they usually should scale 
 from single core atoms to 8-core high-end enthusiastic game computers. So 
 the cpu requirements shouldn't usually be too large. Usually even most of 
 the 1-3 previous generations' hardware runs them just nicely.
 
 Now doubling the cpu power of a low-end current generation PC does not 
 cost $1000, but maybe $20-50.

You also need to consider how widely distributed your application is. If you force a million desktop PC users to upgrade their CPUs, you just wasted twenty to fifty million dollars of your customers' money (or, more likely, lost a million customers). -- Rainer Deyke - rainerd eldwood.com
Sep 24 2009
next sibling parent Jeremie Pelletier <jeremiep gmail.com> writes:
language_fan wrote:
 Thu, 24 Sep 2009 22:58:51 -0600, Rainer Deyke thusly wrote:
 
 language_fan wrote:
 The cost of e.g. doubling computing power depends on the domain. If you
 are building desktop end user applications, they usually should scale
 from single core atoms to 8-core high-end enthusiastic game computers.
 So the cpu requirements shouldn't usually be too large. Usually even
 most of the 1-3 previous generations' hardware runs them just nicely.

 Now doubling the cpu power of a low-end current generation PC does not
 cost $1000, but maybe $20-50.

you force a million desktop PC users to upgrade their CPUs, you just wasted twenty to fifty million dollars of your customers' money (or, more likely, lost a million customers).

I do not believe the market works this way. According to that logic popularity correlates with expenses. So if you have e.g. 1 billion users, even $1 in per-user hardware costs causes a billion dollar losses to customers. Is that unacceptable? On the other hand a program with a userbase of 10 might require a $100 hw upgrade and it is still ok? Usually it is the other way, more popular programs pretty much dictate what hardware is useful today. E.g. consider a G4 Mac or PIII PC, it is pretty much useless today since popular web applications like facebook, myspace, youtube, and others are way too slow on it. (not to mention the obligatory virus scanner updates which today require a 2 GHz PC). The HD videos in youtube may require a 1.2+ GHz PC. If you buy an office suite, the old Windows 98 will not even let you install it. Upgrading to Vista is not possible since it will not fit to the 10 GB hard drive. The stores will not sell PATA drives anymore so you either need to buy a PCI PATA controller (not supported by the OS, of course) or upgrade the case, psu, mobo, cpu, memory chips, graphics card, and the hard disk. A used 2.5 GHz Athlon XP with 1GB of RAM and 100GB of disk costs about $100. Anything below that is obsolete these days. Good luck selling anything to people who use older computers, they are probably broke anyways. Otherwise I just see it cheaper to build your apps slower and require hardware updates. Just imagine - a highly optimized $400 program is way too expensive for most users, a $50 program + $200 hw upgrade sounds just fine.

But a $40 optimized program will flush the competition of either $400 optimized equivalents or $40 slow equivalents, making you the winner in the end. People are so crazy about money they care more about their profits than the satisfaction of their customers.
Sep 25 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
language_fan wrote:
 I do not believe the market works this way. According to that logic 
 popularity correlates with expenses. So if you have e.g. 1 billion users, 
 even $1 in per-user hardware costs causes a billion dollar losses to 
 customers. Is that unacceptable? On the other hand a program with a 
 userbase of 10 might require a $100 hw upgrade and it is still ok?

If you manage to sell a billion copies of your program, you can certainly afford to spend some extra time on optimizations and reach maybe another million users (which would be only 0.1% of your total user base, but a huge amount of money in absolute sales). If you're serving a vertical market of only 10, then your program is probably so expensive that another $1000 for hardware upgrades is peanuts. So yes, that's the way it works.
 A used 2.5 GHz Athlon XP with 1GB of RAM and 100GB of disk costs about 
 $100. Anything below that is obsolete these days. Good luck selling 
 anything to people who use older computers, they are probably broke 
 anyways. Otherwise I just see it cheaper to build your apps slower and 
 require hardware updates. Just imagine - a highly optimized $400 program 
 is way too expensive for most users, a $50 program + $200 hw upgrade 
 sounds just fine.

If you just pull numbers of your ass, you can prove anything. Software is priced to optimize total income, which is net income per unit times number of units sold. Production costs are not factored in at all. So the real question is if your $50 software package sells enough additional units to make up for the increase in production costs. Competent programmers write reasonably efficient code from the start, and can take maybe another 10% of the development time to optimize the code to the point of diminishing returns. If you have competent programmers, your 8x price increase almost an order of magnitude off. Incompetent programmers don't just write slow code, they write buggy, non-reusable code. If you're using incompetent programmers, you're probably wasting more money on management than it would have cost to hire competent programmers in the first place. -- Rainer Deyke - rainerd eldwood.com
Sep 25 2009
parent reply Lutger <lutger.blijdestijn gmail.com> writes:
language_fan wrote:

 Fri, 25 Sep 2009 17:59:06 -0600, Rainer Deyke thusly wrote:
 
 Software is priced to optimize total income, which is net income per
 unit times number of units sold.  Production costs are not factored in
 at all.  So the real question is if your $50 software package sells
 enough additional units to make up for the increase in production costs.

Sure, that is the way it works. But still I think the main motivator for customers is the price. It does not really matter if the latest photoshop runs on a Pentium 233MMX or requires a dual-core with 4GB of RAM for e.g. scaling 5MPix images. The target audience upgrades their hardware anyways and the differences between user interfaces is so huge that competition is inexistant. The poorer customers first use a pirated version of photoshop, and only after gaining some popularity buy the licenses rather than use free software like gimp. Optimizing the software will not bring adobe more customers. You can see their optimizing policy in the famous flash plugin and pdf reader. Performance on both programs is just horrible and I could well imagine that a novice programmer built both of them.

Yet this poor performance annoys people to no end. I don't know a single person who isn't irritated by their PDF bloatware, and all my 'tech-savvy' friends have switched to other PDF readers. Same with IE, I know lots of people switched to firefox just because of performance, and then some switched again to Opera or Chrome because even Firefox is too slow.
Sep 26 2009
parent Jeremie Pelletier <jeremiep gmail.com> writes:
Lutger wrote:
 language_fan wrote:
 
 Fri, 25 Sep 2009 17:59:06 -0600, Rainer Deyke thusly wrote:

 Software is priced to optimize total income, which is net income per
 unit times number of units sold.  Production costs are not factored in
 at all.  So the real question is if your $50 software package sells
 enough additional units to make up for the increase in production costs.

customers is the price. It does not really matter if the latest photoshop runs on a Pentium 233MMX or requires a dual-core with 4GB of RAM for e.g. scaling 5MPix images. The target audience upgrades their hardware anyways and the differences between user interfaces is so huge that competition is inexistant. The poorer customers first use a pirated version of photoshop, and only after gaining some popularity buy the licenses rather than use free software like gimp. Optimizing the software will not bring adobe more customers. You can see their optimizing policy in the famous flash plugin and pdf reader. Performance on both programs is just horrible and I could well imagine that a novice programmer built both of them.

Yet this poor performance annoys people to no end. I don't know a single person who isn't irritated by their PDF bloatware, and all my 'tech-savvy' friends have switched to other PDF readers. Same with IE, I know lots of people switched to firefox just because of performance, and then some switched again to Opera or Chrome because even Firefox is too slow.

I agree with Lutger here, these companies just sit on their success and monopoly as excuses to not optimize their software. That and tight deadlines on new versions which prevents even the best of programmers to properly optimize code. Often when the programmer says "it works, but I can get it 10times faster given another week" to his manager, when that reaches the top of the corporate chain, the CEO only hears "it works". These companies are just shooting themselves in the foot in the long run. IE is losing the market share it fought so hard to steal. Acrobat is used less and less, I myself use it only to print files, I can almost always find a text version of a pdf document within seconds. Loading the pdf reader is way slower than going back to google and clicking the next link. While a lot of people targeted by these heavy softwares change hardware every now and then, they can do so because they make enough from their work. Any newcomer or novice just wanting to learn to someday get a job in the domain usually has no money to upgrade his old computer. And yeah, most people will download these programs to learn, and buy licenses when they turn professional but not all of them, I see that all the time. They would sell way more copies of photoshop if it was actually easy to buy.
Sep 26 2009
prev sibling parent Jeremie Pelletier <jeremiep gmail.com> writes:
language_fan wrote:
 Fri, 25 Sep 2009 00:10:55 +0000, language_fan thusly wrote:
 
 You
 may disagree, but I find it much more pleasant to find that the
 application does never crash even though it works 15% slower than an
 optimal C++ code would.

Imagine if a buggy C++ program was monitoring your health. If it crashed or corrupted data, you would die. Yes, the C++ code would be $1M cheaper to build and the hardware would also be $100K cheaper (overall $10M vs $8.9M), but people would die 95% more often. Or in banking business the systems would handle 4x as much customers and transactions, but unfortunately some transactions would just go to the bit heaven due to the 5-6 reboots the mainframe required daily.

Then that just isn't production ready code, no matter the language. You could get the exact same behaviors with even the safest language ever. A safe language does not prevent logic errors. Jeremie
Sep 24 2009
prev sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Fri, Sep 25, 2009 at 7:31 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 language_fan wrote:
 A used 2.5 GHz Athlon XP with 1GB of RAM and 100GB of disk costs about
 $100. Anything below that is obsolete these days. Good luck selling anything
 to people who use older computers, they are probably broke anyways.
 Otherwise I just see it cheaper to build your apps slower and require
 hardware updates. Just imagine - a highly optimized $400 program is way too
 expensive for most users, a $50 program + $200 hw upgrade sounds just fine.

But a $40 optimized program will flush the competition of either $400 optimized equivalents or $40 slow equivalents, making you the winner in the end. People are so crazy about money they care more about their profits than the satisfaction of their customers.

This. I don't have a problem with making money off software, but don't know why so many companies make it so expensive. Take Alias Sketchbook Pro. It's a fine program. It's got an intuitive interface, makes clean lines, is fast and resource-conscious. At one time it was priced at $200 [1]. For what? It has something like 3 drawing tools and a simple layering scheme. No custom brushes, no effects, no special integration with other programs, nothing. Compare that to Paint Tool Sai, which does everything it does, as well as having vectorized inks, custom brushes, patterns, complex layering and layer blending, accurate digital painting etc. for all of $53. Guess which one I bought. [1]It's now only $100, but Sai still beats it for half the price.
Sep 25 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Jarrett Billingsley wrote:
 Dynamic downcasts are not pure.

Actually they are. Given an object 'o', 'cast(T)(<reference to o>)' always returns the same value (either a reference to 'o' or 'null', depending on class of 'o' and the value of 'T'). If the reference to 'o' is changed to point to a different object, then the dynamic cast can return a different value. This does not make the dynamic cast impure, since you are passing in a different argument. If the object 'o' is destroyed and another object is created at the same physical memory location, 'cast(T)(<reference to o>)' can return a different value. This does not make the dynamic cast impure, since you are passing in a (logically) different argument (which happens to use the same physical representation). -- Rainer Deyke - rainerd eldwood.com
Sep 22 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Rainer Deyke wrote:
 Jarrett Billingsley wrote:
 Dynamic downcasts are not pure.

Actually they are. Given an object 'o', 'cast(T)(<reference to o>)' always returns the same value (either a reference to 'o' or 'null', depending on class of 'o' and the value of 'T'). If the reference to 'o' is changed to point to a different object, then the dynamic cast can return a different value. This does not make the dynamic cast impure, since you are passing in a different argument.

On the basis of how I understand pure to be implemented in D at the moment, it is impure. Purity only considers the bits passed on the stack. If the reference points to the same location in memory, it's considered the same argument. The only way to do what you expect would be to compare not just the arguments, but all memory pointed to by them. If you've got cyclic references, you're just plain screwed.
 If the object 'o' is destroyed and another object is created at the same
 physical memory location, 'cast(T)(<reference to o>)' can return a
 different value.  This does not make the dynamic cast impure, since you
 are passing in a (logically) different argument (which happens to use
 the same physical representation).

Logically yes, but in terms of how it's actually implemented, no.
Sep 22 2009
parent reply Rainer Deyke <rainerd eldwood.com> writes:
Daniel Keep wrote:
 On the basis of how I understand pure to be implemented in D at the
 moment, it is impure.  Purity only considers the bits passed on the
 stack.  If the reference points to the same location in memory, it's
 considered the same argument.

As I understand it, D doesn't attempt to do general memoization anyway (and indeed, shouldn't). Given a reference to an object, if the reference itself is not modified from one dynamic cast to another, then the result of the former cast can be reused for the latter. The existence of the reference prevents the object from being garbage-collected, and manual deletion results in undefined behavior. -- Rainer Deyke - rainerd eldwood.com
Sep 22 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Rainer Deyke wrote:
 Daniel Keep wrote:
 On the basis of how I understand pure to be implemented in D at the
 moment, it is impure.  Purity only considers the bits passed on the
 stack.  If the reference points to the same location in memory, it's
 considered the same argument.

As I understand it, D doesn't attempt to do general memoization anyway (and indeed, shouldn't). Given a reference to an object, if the reference itself is not modified from one dynamic cast to another, then the result of the former cast can be reused for the latter.

Which is memoisation, more or less.
 The
 existence of the reference prevents the object from being
 garbage-collected, and manual deletion results in undefined behavior.

http://digitalmars.com/d/2.0/expression.html#DeleteExpression Doesn't actually specify whether this case is kosher or not. I suppose that makes it undefined behaviour, although possibly not in quite the same way. :)
Sep 22 2009
parent Rainer Deyke <rainerd eldwood.com> writes:
Daniel Keep wrote:
 Rainer Deyke wrote:
 As I understand it, D doesn't attempt to do general memoization anyway
 (and indeed, shouldn't).  Given a reference to an object, if the
 reference itself is not modified from one dynamic cast to another, then
 the result of the former cast can be reused for the latter.

Which is memoisation, more or less.

Memoization that can safely is safe for dynamic casts.
 http://digitalmars.com/d/2.0/expression.html#DeleteExpression
 
 Doesn't actually specify whether this case is kosher or not.  I suppose
 that makes it undefined behaviour, although possibly not in quite the
 same way.  :)

So you're deleting one object, leaving a dangling reference. You're creating a new object that by pure coincidence happens to be stored at the same memory address. Then you're dereferencing the original dangling reference, which by pure coincidence now points to the new object. How can this possibly be defined behavior? You're still dereferencing a dangling reference. -- Rainer Deyke - rainerd eldwood.com
Sep 22 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Sep 22, 2009 at 8:40 AM, bearophile <bearophileHUGS lycos.com> wrote:
 Daniel Keep:
What if the GC just happens to re-use that address for a different object?<

I don't understand. Can you explain me an example where this problem may happen? Bye, bearophile

Object is allocated at 0x10001000, that object is collected, its memory is put back on the freelist, another object of the same size is allocated, and that block of memory is used again, allocating a new object at 0x10001000.
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep  
<daniel.keep.lists gmail.com> wrote:

 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the same  
 output for the same reference as input. Interestingly, dynamic cast  
 results are independent of intervening mutable calls... So there's even  
 greater opportunity for optimization.

What if the GC just happens to re-use that address for a different object?

That's only if memoization is used. I think what Jason said is correct -- you can view dynamic cast as taking 2 arguments, one is the reference which is simply echoed as the return value, and one is the classinfo, which is an immutable argument that causes a decision to be made. In fact, you could use memoization on a dynamic cast subfunction that memoizes on the target type and the classinfo of the source, regardless of the reference value, and returns an offset to add to the return value. It's pretty easy for the compiler to prove that o has not be reassigned, so even without memoization, the compiler can logically assume that the result from the first dynamic cast can be reused. I think this is the major optimization for pure functions anyways, not memoization. -Steve
Sep 22 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Sep 22, 2009 at 1:19 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:

 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the same
 output for the same reference as input. Interestingly, dynamic cast res=



 are independent of intervening mutable calls... So there's even greater
 opportunity for optimization.

What if the GC just happens to re-use that address for a different objec=


 That's only if memoization is used.

 I think what Jason said is correct -- you can view dynamic cast as taking=

 arguments, one is the reference which is simply echoed as the return valu=

 and one is the classinfo, which is an immutable argument that causes a
 decision to be made.

 In fact, you could use memoization on a dynamic cast subfunction that
 memoizes on the target type and the classinfo of the source, regardless o=

 the reference value, and returns an offset to add to the return value.

 It's pretty easy for the compiler to prove that o has not be reassigned, =

 even without memoization, the compiler can logically assume that the resu=

 from the first dynamic cast can be reused. =A0I think this is the major
 optimization for pure functions anyways, not memoization.

Or you could - I dunno - cache the result of the dynamic cast in a local, since doing multiple dynamic casts is terrible style anyway. Just saying. ;)
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 Or you could - I dunno - cache the result of the dynamic cast in a
 local, since doing multiple dynamic casts is terrible style anyway.

 Just saying. ;)

Yes, this is true of many optimizations :P Are you saying it's bad style because of the expense of dynamic casting or for some other reason? If dynamic cast were pure, that takes away that argument. -Steve
Sep 22 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Sep 22, 2009 at 3:35 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley
 <jarrett.billingsley gmail.com> wrote:

 Or you could - I dunno - cache the result of the dynamic cast in a
 local, since doing multiple dynamic casts is terrible style anyway.

 Just saying. ;)

Yes, this is true of many optimizations :P Are you saying it's bad style because of the expense of dynamic casting o=

 for some other reason? =A0If dynamic cast were pure, that takes away that
 argument.

Dynamic downcasts do usually indicate a weakness in the design. But an even more fundamental matter of style is to not repeat yourself. You don't use "x + y" if you need the sum of x and y in ten places; you do "sum =3D x + y" and then use "sum." The same applies here. You're not just working around a deficiency in the compiler, you're saving yourself work later if you need to change all those values, and you're giving it some kind of semantic attachment by putting it in a named location. Besides, you're probably going to be doing: if(auto d =3D cast(Derived)baseRef) { // .. } so you've already bound the result of the dynamic cast to a variable. *shru= g*
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 16:00:29 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Tue, Sep 22, 2009 at 3:35 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley
 <jarrett.billingsley gmail.com> wrote:

 Or you could - I dunno - cache the result of the dynamic cast in a
 local, since doing multiple dynamic casts is terrible style anyway.

 Just saying. ;)

Yes, this is true of many optimizations :P Are you saying it's bad style because of the expense of dynamic casting or for some other reason? If dynamic cast were pure, that takes away that argument.

Dynamic downcasts do usually indicate a weakness in the design. But an even more fundamental matter of style is to not repeat yourself. You don't use "x + y" if you need the sum of x and y in ten places; you do "sum = x + y" and then use "sum." The same applies here. You're not just working around a deficiency in the compiler, you're saving yourself work later if you need to change all those values, and you're giving it some kind of semantic attachment by putting it in a named location.

What if x and y are possibly changing in between calls to x + y? I'm thinking of a situation where o *might* be changing, but also might not, the compiler could do some optimization to avoid the dynamic-cast calls in the cases where it doesn't change. These would be hard to code manually. My understanding of pure function benefits (and it's not that great) is that you can express yourself how you want and the compiler fixes your code to be more optimized knowing it can avoid calls.
 Besides, you're probably going to be doing:

 if(auto d = cast(Derived)baseRef)
 {
     // ..
 }

 so you've already bound the result of the dynamic cast to a variable.  
 *shrug*

Probably. But even then, if this statement is in a loop that might not change baseRef, the compiler can avoid dynamic casting again. -Steve
Sep 22 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Sep 22, 2009 at 4:13 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 Dynamic downcasts do usually indicate a weakness in the design. But an
 even more fundamental matter of style is to not repeat yourself. You
 don't use "x + y" if you need the sum of x and y in ten places; you do
 "sum =3D x + y" and then use "sum." The same applies here. You're not
 just working around a deficiency in the compiler, you're saving
 yourself work later if you need to change all those values, and you're
 giving it some kind of semantic attachment by putting it in a named
 location.

What if x and y are possibly changing in between calls to x + y? I'm thinking of a situation where o *might* be changing, but also might n=

 the compiler could do some optimization to avoid the dynamic-cast calls i=

 the cases where it doesn't change. =A0These would be hard to code manuall=

 =A0My understanding of pure function benefits (and it's not that great) i=

 that you can express yourself how you want and the compiler fixes your co=

 to be more optimized knowing it can avoid calls.

Realistically, how often is this going to come up? Why the hell are we looking at what amounts to CSEE on a rarely-used construct when there are far more important performance issues? I understand wanting to solve this problem for pedagogical reasons, but practically, I don't see the benefit.
 Besides, you're probably going to be doing:

 if(auto d =3D cast(Derived)baseRef)
 {
 =A0 =A0// ..
 }

 so you've already bound the result of the dynamic cast to a variable.
 *shrug*

Probably. =A0But even then, if this statement is in a loop that might not change baseRef, the compiler can avoid dynamic casting again.

Then you'd hoist the conditional out of the loop. Again, not specific to dynamic downcasts.
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 17:56:46 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Tue, Sep 22, 2009 at 4:13 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Dynamic downcasts do usually indicate a weakness in the design. But an
 even more fundamental matter of style is to not repeat yourself. You
 don't use "x + y" if you need the sum of x and y in ten places; you do
 "sum = x + y" and then use "sum." The same applies here. You're not
 just working around a deficiency in the compiler, you're saving
 yourself work later if you need to change all those values, and you're
 giving it some kind of semantic attachment by putting it in a named
 location.

What if x and y are possibly changing in between calls to x + y? I'm thinking of a situation where o *might* be changing, but also might not, the compiler could do some optimization to avoid the dynamic-cast calls in the cases where it doesn't change. These would be hard to code manually. My understanding of pure function benefits (and it's not that great) is that you can express yourself how you want and the compiler fixes your code to be more optimized knowing it can avoid calls.

Realistically, how often is this going to come up? Why the hell are we looking at what amounts to CSEE on a rarely-used construct when there are far more important performance issues? I understand wanting to solve this problem for pedagogical reasons, but practically, I don't see the benefit.

Why have pure functions at all? Seriously, all pure function reorderings and reuse can be rewritten by human optimization. If we aren't going to look for places that pure functions can help optimize, why add them to the language, it seems more trouble than its worth? If all it takes to optimize dynamic casts is to put pure on the function signature, have we wasted that much time? -Steve
Sep 22 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Sep 22, 2009 at 8:48 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

 Why have pure functions at all? =A0Seriously, all pure function reorderin=

 and reuse can be rewritten by human optimization. =A0If we aren't going t=

 look for places that pure functions can help optimize, why add them to th=

 language, it seems more trouble than its worth?

 If all it takes to optimize dynamic casts is to put pure on the function
 signature, have we wasted that much time?

But dynamic downcasting *isn't* pure, unless you can prove that the reference that you're downcasting is unique. class Base {} class Derived : Base {} struct S { Object o; Derived get() { return cast(Derived)o; } } void main() { S s; s.o =3D new Base(); writeln(s.get()); s.o =3D new Derived(); writeln(s.get()); } Dynamic downcasts are not pure. Simply. That's why they're *dynamic*. Without some kind of uniqueness typing, you cannot prove anything about the validity of such casts until runtime.
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 21:21:13 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Tue, Sep 22, 2009 at 8:48 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 Why have pure functions at all?  Seriously, all pure function  
 reorderings
 and reuse can be rewritten by human optimization.  If we aren't going to
 look for places that pure functions can help optimize, why add them to  
 the
 language, it seems more trouble than its worth?

 If all it takes to optimize dynamic casts is to put pure on the function
 signature, have we wasted that much time?

But dynamic downcasting *isn't* pure, unless you can prove that the reference that you're downcasting is unique. class Base {} class Derived : Base {} struct S { Object o; Derived get() { return cast(Derived)o; } } void main() { S s; s.o = new Base(); writeln(s.get()); s.o = new Derived(); writeln(s.get()); } Dynamic downcasts are not pure. Simply. That's why they're *dynamic*. Without some kind of uniqueness typing, you cannot prove anything about the validity of such casts until runtime.

You cannot prove what they will return until runtime, but that doesn't mean they are not pure. In your example, you are calling dynamic cast with two different references, so you'll get two different results. The compiler has to be able to prove that the dynamic cast call is to the same object, which it can't by looking at the code for S. Imagine a pure function foo, and then a normal function bar: pure int foo(n) {...} bar(int n) { int x = foo(n); } Can any optimizations be made here? No, unless you count memoization. But we've already established that memoization isn't useful for dynamic cast, as it isn't useful for any function that takes a mutable reference. It's also difficult to say that memoization is worth the effort without profiling. But if you have the code: bar(int n) { int x = foo(n); x += foo(n); } Then you can damn sure optimize out one of those foos. n didn't change. Same goes for dynamic cast. Part of dynamic cast is easy to be proven pure. If you imagine dynamic cast to be the function: T dynamicCast(T)(U u) { int offset = getOffset!T(u.classinfo); if(offset == CANNOT_CAST) return null; return cast(T)((cast(void *)u) + offset); } The function getOffset returns a constant CANNOT_CAST if the object cannot be cast from a U to a T, and returns the offset to add to the reference to convert the U to a T otherwise. Then the function getOffset can be pure, since classinfo's are unique and immutable. That's the expensive part anyways, the rest can be inlined, and you then then have a pure dynamic cast function, because you are calling dynamicCast with the same parameter over and over. -Steve
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep  
<daniel.keep.lists gmail.com> wrote:

 Steven Schveighoffer wrote:
 On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:

 Jason House wrote:
 Dynamic casts are pure. They don't use global state, and have the
 same output for the same reference as input. Interestingly, dynamic
 cast results are independent of intervening mutable calls... So
 there's even greater opportunity for optimization.

What if the GC just happens to re-use that address for a different object?

That's only if memoization is used.

Well, if you're not looking at memoization, why is this discussion even happening?

There are other pure optimizations that can occur without memoization. In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization. How do you know memoization helps when you don't know how often the same arguments will be used?
 I think what Jason said is correct -- you can view dynamic cast as
 taking 2 arguments, one is the reference which is simply echoed as the
 return value, and one is the classinfo, which is an immutable argument
 that causes a decision to be made.

The reference *isn't* echoed as a return value. If it was, then casting wouldn't do anything.

Well, either the reference itself, an offset reference, or null is returned. Sorry I wasn't explicit.
 In fact, you could use memoization on a dynamic cast subfunction that
 memoizes on the target type and the classinfo of the source, regardless
 of the reference value, and returns an offset to add to the return  
 value.

Yes, that would work, provided both references are immutable.

The classinfo is immutable, it's always the same. Why does the data reference have to be immutable? It's not even dereferenced. Think of it as an integer.
 It's pretty easy for the compiler to prove that o has not be reassigned,
 so even without memoization, the compiler can logically assume that the
 result from the first dynamic cast can be reused.  I think this is the
 major optimization for pure functions anyways, not memoization.

Pretty easy? So you're ignoring threads, then? The ONLY way I can see object casting being treated as a pure function is if all references passed in are immutable and even then only if the compiler can prove the reference cannot be changed (i.e. the reference is always kept on the stack AND you assume you never manually delete it).

How is a thread going to come along and change a local variable in another thread?
 -Steve

-- Daniel

--- Steve
Sep 22 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Sep 2009 23:25:26 -0400, Daniel Keep  
<daniel.keep.lists gmail.com> wrote:

 Steven Schveighoffer wrote:
 On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:
 Well, if you're not looking at memoization, why is this discussion even
 happening?

There are other pure optimizations that can occur without memoization. In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization. How do you know memoization helps when you don't know how often the same arguments will be used?

What other optimisations? The only other one I can think of off the top of my head is common subexpression elimination. Except that's effectively caller-side memoisation.

Yes, those optimizations :) They are easy to do and very different than parameter memoization because you only optimize where you know it makes a difference (and you don't need to use the heap to do it).
 The ONLY way I can see object casting being treated as a pure function
 is if all references passed in are immutable and even then only if the
 compiler can prove the reference cannot be changed (i.e. the reference
 is always kept on the stack AND you assume you never manually delete  
 it).

How is a thread going to come along and change a local variable in another thread?

I don't know if you've noticed, but objects tend to be allocated on the heap, not the stack. :P

The pointer *value* (the actual reference, which is stored on the stack) is not being changed. That's all dynamic cast cares about. The referenced data is irrelevant (except for the classinfo, whose access can be factored out of the function, and which isn't possible to change without resorting to undefined behavior).
 There's nothing to stop another thread from killing the object you're
 pointing to and then substituting in a new one.  It's pathological, yes,
 but this is the sort of thing that, if it happens, you don't have a hope
 of debugging.  This is why threads completely and utterly suck and I
 frequently wish they'd never been invented.

It's not only pathological, but it's undefined. We don't care about undefined behavior. A thread can come through and trounce on any memory at any time, this is also not a case we need to concern ourselves with. -Steve
Sep 22 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Wed, 23 Sep 2009 15:09:59 +1000, Daniel Keep thusly wrote:

 See, people equate "parallel execution" with "threads" these days which
 is half the problem.  Threads are a TERRIBLE abstraction, because they
 don't.  There's no protection.  Almost every time I express the opinion
 that threads are broken, people automatically assume I think we should
 all go back to single-core machines and cooperative multitasking.

Threads come up every once in a while since they are one of the only ways to implement concurrency on the hardware level of shared memory machines. Another alternative is message passing. There just are no intelligent, dynamic low level models that provide e.g. abstractions and protection. Once you realize this, the only way to build safe concurrent systems is to provide a safe high level abstraction on top of the hardware and transform programs written using this abstraction to use low level threads and messages. As the amount of concurrent threads keeps increasing, you start to value models that prevent things like deadlocks, starvation, and accidental concurrent writes to the same location.
Sep 24 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:

 You're right about concurrency being a different concept than threading,
   but I wouldn't give threading away for a pure concurrent model either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

The danger in too large a flexibility is that concurrency is not easy and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.
 Shared data is always faster than message passing so you could
 implement real time code with it, and use messsage passing for other
 parts such as delegating GUI messages from the main loop.

Shared data is maybe faster on shared memory machines, but this is not a fact in the general case.
 
 Shared data being harder to manage than message passing does not make it
 a bad thing, it just means you can have two different models for two
 different usages.

Shared data is not a single homogenous model. Things like transactional memory work in shared memory systems, but still they have different semantics.
Sep 24 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Fri, 25 Sep 2009 00:10:55 +0000, language_fan thusly wrote:

 You
 may disagree, but I find it much more pleasant to find that the
 application does never crash even though it works 15% slower than an
 optimal C++ code would.

Imagine if a buggy C++ program was monitoring your health. If it crashed or corrupted data, you would die. Yes, the C++ code would be $1M cheaper to build and the hardware would also be $100K cheaper (overall $10M vs $8.9M), but people would die 95% more often. Or in banking business the systems would handle 4x as much customers and transactions, but unfortunately some transactions would just go to the bit heaven due to the 5-6 reboots the mainframe required daily.
Sep 24 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:
[snip]
 Its the same for concurrency, I can think of vector processing,  
 functional calls, STM, message passing and shared memory off the top of  
 my head. All being valid models with each their pros and cons, together  
 forming a complete all-around solution.

 Jeremie

Sep 24 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier thusly wrote:

 language_fan wrote:
 Wed, 23 Sep 2009 10:43:53 -0400, Jeremie Pelletier thusly wrote:
 
 You're right about concurrency being a different concept than
 threading,
   but I wouldn't give threading away for a pure concurrent model
   either.
 I believe D is aiming at giving programmers a choice of the tools they
 wish to use. I could see uses of both a concurrent model with message
 passing and a threading model with shared data used at once in a
 program.

The danger in too large a flexibility is that concurrency is not easy and it is getting incresingly complex. You need to be extraordinary good at manually managing all concurrent use of data. If I may predict something that is going to happen, it is that there will be high level models that avoid many low level pitfalls. These models will not provide 100% efficiency, but they are getting faster and faster, without compromizing the safety aspect. This already happened with memory allocation (manual vs garbage collection - in common applications, but not in special cases). Before that we gave some of the error detection capabilities to the compiler (e.g. we do not write array bounds checks ourselves anymore). And optimizations (e.g. register allocation). You may disagree, but I find it much more pleasant to find that the application does never crash even though it works 15% slower than an optimal C++ code would.

15% slower is an extreme performance hit. I agree that code safety is useful and I use this model all the time for initialization and other code which isn't real time, but 15% takes away a lot of the application's responsiveness, if you have 50 such applications running on your system you just spent $1000 more in hardware to get the performance of entry level hardware with faster code.

The cost of e.g. doubling computing power depends on the domain. If you are building desktop end user applications, they usually should scale from single core atoms to 8-core high-end enthusiastic game computers. So the cpu requirements shouldn't usually be too large. Usually even most of the 1-3 previous generations' hardware runs them just nicely. Now doubling the cpu power of a low-end current generation PC does not cost $1000, but maybe $20-50. You can continue this until the cpu costs about $400-500. By then you've achieved at least tenfold speedup. On the gpu market the cheapest chips have very limited capabilities. You can buy 5 times faster graphics cards for $50-60. $150-160 will get you a 25 times faster gpu than the first one. 4..8 GB RAM is also really cheap these days, and so is a 1.5 TB hard drive. Hardly any desktop program requires that much from the hardware. The 15% or even 50% slower execution speed seems rather small a problem when you can avoid it by buying faster hardware. Hardly any program is cpu bound, not even the most demanding games are. On the server side many systems use php which is both unsafe and slow. If you have decent load balancing and cache systems, it does not even matter since the system may not be cpu bound, either. Even top10 popular sites like wikipedia run on slow php.
Sep 24 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 24 Sep 2009 20:59:23 -0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:

 Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier  
 <jeremiep gmail.com> wrote:
 [snip]
 Its the same for concurrency, I can think of vector processing,  
 functional calls, STM, message passing and shared memory off the top  
 of my head. All being valid models with each their pros and cons,  
 together forming a complete all-around solution.

 Jeremie


Oh yeah, thanks! Those were covered by Bartosz in his blog right? I think he used the term Actor for task based programming, I really enjoyed reading these articles.

There are many things called Actors, but they generally boil down to a process/thread/object + message passing. It's a good model, particularly for the cloud. Task programming is generally intra-process and works on the level of a single work-task (aka function). Cilk, futures or Intel's Threading Building Blocks are the canonical examples. The nice thing (I think) about a good work-stealing runtime is that it can provide the back-end for CPU-vector processing, tasks/futures, intra-process actors and functional calls.
Sep 24 2009
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Fri, 25 Sep 2009 00:07:08 -0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:

 Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:59:23 -0400, Jeremie Pelletier  
 <jeremiep gmail.com> wrote:

 Robert Jacques wrote:
 On Thu, 24 Sep 2009 20:46:13 -0400, Jeremie Pelletier  
 <jeremiep gmail.com> wrote:
 [snip]
 Its the same for concurrency, I can think of vector processing,  
 functional calls, STM, message passing and shared memory off the top  
 of my head. All being valid models with each their pros and cons,  
 together forming a complete all-around solution.

 Jeremie


Oh yeah, thanks! Those were covered by Bartosz in his blog right? I think he used the term Actor for task based programming, I really enjoyed reading these articles.

process/thread/object + message passing. It's a good model, particularly for the cloud. Task programming is generally intra-process and works on the level of a single work-task (aka function). Cilk, futures or Intel's Threading Building Blocks are the canonical examples. The nice thing (I think) about a good work-stealing runtime is that it can provide the back-end for CPU-vector processing, tasks/futures, intra-process actors and functional calls.

I think the best way to deal with actors is by making them objects. Binding them to heavyweight kernel structures such as threads and processes will greatly limit the number of concurrent actors you can execute at once. Tasks would be most likely implemented by async methods, since you don't care about knowing when they're done. On the other hand, futures are there when you need an async call and monitor its completion. A good implementation could therefore easily support tens of thousands of concurrent actors. I must admit its the first time I hear about work-stealing, I will need to do some research on that :)

:) I prefer Actor's being objects with async methods too. Anyways here's the work-stealing concept in brief: Each thread/actor keeps a set of tasks to be done. This avoids having a single point of contention for task creation/assignment. When a worker thread runs out of tasks it then steals tasks from other actors/threads. Both Cilk and Intel's TTB are good places to start reading. For low latency, futures require a set with efficient random removal. I.e. the promise becomes due, so the current thread assigns the associated future to itself. Actors, on the other hand, should be limited to FIFO, to maintain consistency. Though mutable methods would require exclusive access, in theory const and immutable methods should be able to be run in parallel with each other.
Sep 24 2009
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Thu, 24 Sep 2009 22:58:51 -0600, Rainer Deyke thusly wrote:

 language_fan wrote:
 The cost of e.g. doubling computing power depends on the domain. If you
 are building desktop end user applications, they usually should scale
 from single core atoms to 8-core high-end enthusiastic game computers.
 So the cpu requirements shouldn't usually be too large. Usually even
 most of the 1-3 previous generations' hardware runs them just nicely.
 
 Now doubling the cpu power of a low-end current generation PC does not
 cost $1000, but maybe $20-50.

You also need to consider how widely distributed your application is. If you force a million desktop PC users to upgrade their CPUs, you just wasted twenty to fifty million dollars of your customers' money (or, more likely, lost a million customers).

I do not believe the market works this way. According to that logic popularity correlates with expenses. So if you have e.g. 1 billion users, even $1 in per-user hardware costs causes a billion dollar losses to customers. Is that unacceptable? On the other hand a program with a userbase of 10 might require a $100 hw upgrade and it is still ok? Usually it is the other way, more popular programs pretty much dictate what hardware is useful today. E.g. consider a G4 Mac or PIII PC, it is pretty much useless today since popular web applications like facebook, myspace, youtube, and others are way too slow on it. (not to mention the obligatory virus scanner updates which today require a 2 GHz PC). The HD videos in youtube may require a 1.2+ GHz PC. If you buy an office suite, the old Windows 98 will not even let you install it. Upgrading to Vista is not possible since it will not fit to the 10 GB hard drive. The stores will not sell PATA drives anymore so you either need to buy a PCI PATA controller (not supported by the OS, of course) or upgrade the case, psu, mobo, cpu, memory chips, graphics card, and the hard disk. A used 2.5 GHz Athlon XP with 1GB of RAM and 100GB of disk costs about $100. Anything below that is obsolete these days. Good luck selling anything to people who use older computers, they are probably broke anyways. Otherwise I just see it cheaper to build your apps slower and require hardware updates. Just imagine - a highly optimized $400 program is way too expensive for most users, a $50 program + $200 hw upgrade sounds just fine.
Sep 25 2009
prev sibling parent language_fan <foo bar.com.invalid> writes:
Fri, 25 Sep 2009 17:59:06 -0600, Rainer Deyke thusly wrote:

 Software is priced to optimize total income, which is net income per
 unit times number of units sold.  Production costs are not factored in
 at all.  So the real question is if your $50 software package sells
 enough additional units to make up for the increase in production costs.

Sure, that is the way it works. But still I think the main motivator for customers is the price. It does not really matter if the latest photoshop runs on a Pentium 233MMX or requires a dual-core with 4GB of RAM for e.g. scaling 5MPix images. The target audience upgrades their hardware anyways and the differences between user interfaces is so huge that competition is inexistant. The poorer customers first use a pirated version of photoshop, and only after gaining some popularity buy the licenses rather than use free software like gimp. Optimizing the software will not bring adobe more customers. You can see their optimizing policy in the famous flash plugin and pdf reader. Performance on both programs is just horrible and I could well imagine that a novice programmer built both of them.
Sep 26 2009