digitalmars.D.bugs - dmd 0.141: accessing associative array and -release compiler switch

Ivan Cibiri <Ivan_member pathlink.com> writes:

Hello,

When accessing an associative array, a different exception is thrown depending
on whether the -release compiler switch is used. See the attached source file.
Compile it without the -release switch and run it, then compile it with the
-release switch and run it again.

The problem is that when compiled with the -release switch, it generates an
Access Violation exception instead of an ArrayBoundsError exception. I expect
the same exception to be generated in both cases; otherwise it is necessary to
have different application logic to recover from the exception.

In my opinion, the correct behaviour is for the std.array.ArrayBoundsError
exception to be thrown in both cases.

Ivan.



begin 0644 array.d

M;&4Z(&1M9"!A<G)A>2YD("T <F5L96%S90T*4G5N.B!A<G)A>0T**B\-"FEM

M(&%R<F%Y.PT*"0T*"6%R<F%Y6R)A8B)=(#T ,3L-" EA<G)A>5LB8V1E9F<B

M72!K97ES(#T 6R`B86(B+"` (F-D969G(BP (G!Q<B(L(")I:B)=.PT*"0T*
M"69O<F5A8V H8VAA<EM=(&ME>3L :V5Y<RD >PT*"0ET<GD >PT*"0D):6YT

M=6YD(&%N9"!A<W-O8VEA=&5D('9A;'5E(&ES("5S(BP :V5Y+"!R97-U;'0I
M.PT*"0E](&-A=&-H*'-T9"YA<G)A>2Y!<G)A>4)O=6YD<T5R<F]R(&5X8RD 

M"7T 8V%T8V H17AC97!T:6]N(&5X8RD >R`-" D)"2\O4%)/0DQ%33H =VAE
M;B!C;VUP:6QE9"!U<VEN9R`M<F5L96%S92!S=VET8V L(&ET(&=E;F5R871E
M<R!!8V-E<W, 5FEO;&%T:6]N(&EN<W1E860 ;V8 07)R87E";W5N9'-%<G)O
M< T*"0D)=W)I=&5F;&XH(DME>2`E<R!N;W0 9F]U;F0 0E54(&5X<&5C=&5D
M('-T9"YA<G)A>2Y!<G)A>4)O=6YD<T5R<F]R(&5X8V5P=&EO;BP 3D]4("5S

`
end

Dec 29 2005

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Ivan Cibiri" <Ivan_member pathlink.com> wrote in message 
news:dp17ln$iic$1 digitaldaemon.com...
> Hello,
>
> When accessing an associative array, a different exception is thrown
> depending on whether the -release compiler switch is used. See the attached
> source file. Compile it without the -release switch and run it, then compile
> it with the -release switch and run it again.
>
> The problem is that when compiled with the -release switch, it generates an
> Access Violation exception instead of an ArrayBoundsError exception. I expect
> the same exception to be generated in both cases; otherwise it is necessary
> to have different application logic to recover from the exception.
>
> In my opinion, the correct behaviour is for the std.array.ArrayBoundsError
> exception to be thrown in both cases.
>
> Ivan.

Array bounds checking is turned off in release versions for performance, 
unlike certain other slow, paranoid languages ;)  Hence, you get an access 
violation in the release build.

Dec 29 2005

Chris Lajoie <ctlajoie___remove___this___ ___gmail.com> writes:

Jarrett Billingsley wrote:
> Array bounds checking is turned off in release versions for performance,
> unlike certain other slow, paranoid languages ;)  Hence, you get an access
> violation in the release build.

I don't like the idea that my debug and release builds might run 
differently. Should that behavior be acceptable? Maybe bounds checking 
should always be on unless there's a pragma(NoBoundsCheck) (or 
something).. or a compiler switch.

Chris

Dec 31 2005

Ivan Cibiri <Ivan_member pathlink.com> writes:

In article <dp5me6$v86$1 digitaldaemon.com>, Chris Lajoie says...
> Jarrett Billingsley wrote:
>> Array bounds checking is turned off in release versions for performance,
>> unlike certain other slow, paranoid languages ;)  Hence, you get an access
>> violation in the release build.
>
> I don't like the idea that my debug and release builds might run
> differently. Should that behavior be acceptable? Maybe bounds checking
> should always be on unless there's a pragma(NoBoundsCheck) (or
> something).. or a compiler switch.
>
> Chris

I agree with Chris, because I expect the same behaviour from debug and release
builds. I understand that switching off bounds checking improves performance;
however, it changes the behaviour of the application in some cases. Of course,
when turning on a release build you should understand what it does to your
application. Maybe more control over the level of optimization would be useful
(e.g. bounds checking off), but on the other hand, playing with many different
switches would be more complicated (and possibly a waste of time).
The current solution for debug and release builds is simple to use, but for me
it is a little bit annoying that in a debug build you can fine-tune recovery
from different exceptions, but in a release build it does not work, because
different exceptions are thrown as a result of bounds checking being turned off.
The situation is similar with asserts in debug and release builds. Both bounds
checking and asserts help you debug your code, but as a result of using these
features you have to expect possibly different behaviour of the application when
compiled with or without the release switch. It means that you have to run your
test suite against the debug build and then also against the release build.

Ivan.

Dec 31 2005

"Ameer Armaly" <ameer_armaly hotmail.com> writes:

"Ivan Cibiri" <Ivan_member pathlink.com> wrote in message 
news:dp5rrm$12uo$1 digitaldaemon.com...
> In article <dp5me6$v86$1 digitaldaemon.com>, Chris Lajoie says...
>> Jarrett Billingsley wrote:
>>> Array bounds checking is turned off in release versions for performance,
>>> unlike certain other slow, paranoid languages ;)  Hence, you get an
>>> access violation in the release build.
>>
>> I don't like the idea that my debug and release builds might run
>> differently. Should that behavior be acceptable? Maybe bounds checking
>> should always be on unless there's a pragma(NoBoundsCheck) (or
>> something).. or a compiler switch.
>>
>> Chris
>
> I agree with Chris, because I expect the same behaviour from debug and
> release builds. I understand that switching off bounds checking improves
> performance; however, it changes the behaviour of the application in some
> cases. Of course, when turning on a release build you should understand
> what it does to your application. Maybe more control over the level of
> optimization would be useful (e.g. bounds checking off), but on the other
> hand, playing with many different switches would be more complicated (and
> possibly a waste of time).
> The current solution for debug and release builds is simple to use, but for
> me it is a little bit annoying that in a debug build you can fine-tune
> recovery from different exceptions, but in a release build it does not
> work, because different exceptions are thrown as a result of bounds
> checking being turned off.
> The situation is similar with asserts in debug and release builds. Both
> bounds checking and asserts help you debug your code, but as a result of
> using these features you have to expect possibly different behaviour of
> the application when compiled with or without the release switch. It means
> that you have to run your test suite against the debug build and then also
> against the release build.
>
> Ivan.

Yes, since debug builds are just that.  They are designed to actively look 
for bugs and to inform you of what happened; release builds are designed for 
when you're 99 percent sure that everything's going to go just right, and 
don't necessarily have to do that kind of checking.  When in doubt, you can 
always just leave off both switches.

Dec 31 2005

Dave <Dave_member pathlink.com> writes:

In article <dp5me6$v86$1 digitaldaemon.com>, Chris Lajoie says...
> Jarrett Billingsley wrote:
>> Array bounds checking is turned off in release versions for performance,
>> unlike certain other slow, paranoid languages ;)  Hence, you get an access
>> violation in the release build.
>
> I don't like the idea that my debug and release builds might run
> differently. Should that behavior be acceptable? Maybe bounds checking
> should always be on unless there's a pragma(NoBoundsCheck) (or
> something).. or a compiler switch.
>
> Chris

But -release *is* the compiler switch <g>

Ok, I know that the OP example was just that (an example) and therefore
contrived, but D provides better ways to write that loop and most like it to
avoid even the possibility of an ArrayBoundsError:

int* result = key in array;
if(result)
    writefln("Key %s found and associated value is %s", key, *result);

or

if(key in array)
    writefln("Key %s found and associated value is %s", key, array[key]);

Since you could perhaps leave out the EH (exception handling), either of these
would use less code and probably be faster anyhow.

Same is true of other types of arrays by using the built-in foreach instead of
direct indexing.

Point is, there are expedient ways built right into the language to often work
around even the need for bounds checking, bounds checking can add a lot of
runtime overhead, and it can't always be reliably optimized away by the
compiler, so part of what -release does is remove it. The alternative would
logically conclude with a different switch for every type of runtime check or
contract statement.
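
For reference, a minimal self-contained sketch of the first idiom above. It
assumes a current D2 compiler rather than dmd 0.141, and the helper name
describe, the array contents and the key list are illustrative assumptions
(loosely modelled on the attached array.d), not code from the thread. Because a
missing key is never indexed, the result does not depend on whether bounds
checks are compiled in:

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: builds a message for one key without ever
// indexing a key that might be missing.
string describe(int[string] array, string key)
{
    // `key in array` yields a pointer to the value, or null when absent.
    if (auto result = key in array)
        return format("Key %s found and associated value is %s", key, *result);
    return format("Key %s not found", key);
}

void main()
{
    int[string] array = ["ab": 1, "cdefg": 2];   // sample data (assumed)

    foreach (key; ["ab", "cdefg", "pqr", "ij"])
        writeln(describe(array, key));
}
```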

Dec 31 2005

"Kris" <fu bar.com> writes:

"Dave" <Dave_member pathlink.com> wrote...
> Point is, there are expedient ways built right into the language to often
> work around even the need for bounds checking, bounds checking can add a
> lot of runtime overhead, and it can't always be reliably optimized away by
> the compiler, so part of what -release does is remove it. The alternative
> would logically conclude with a different switch for every type of runtime
> check or contract statement.

That's very true. But I think the problem here is actually the API? I can't 
find the thread right now, but there was one a few months back where we were 
asking Walter to revert back part of his prior change to the AA API, to take 
care of this issue and a couple of others. From what I recall, a consensus 
was reached in terms of how it should really operate <*gasp*> but I think by 
that time Walter had already had enough, after changing the API once :)

I'll try to locate the thread ~ since it would be good to eliminate this 
kind of concern (in the only place it apparently happens).

Dec 31 2005

Dave <Dave_member pathlink.com> writes:

In article <dp7bmj$212n$1 digitaldaemon.com>, Kris says...
> "Dave" <Dave_member pathlink.com> wrote...
>> Point is, there are expedient ways built right into the language to often
>> work around even the need for bounds checking, bounds checking can add a
>> lot of runtime overhead, and it can't always be reliably optimized away by
>> the compiler, so part of what -release does is remove it. The alternative
>> would logically conclude with a different switch for every type of runtime
>> check or contract statement.
>
> That's very true. But I think the problem here is actually the API? I can't
> find the thread right now, but there was one a few months back where we were
> asking Walter to revert back part of his prior change to the AA API, to take
> care of this issue and a couple of others. From what I recall, a consensus
> was reached in terms of how it should really operate <*gasp*> but I think by
> that time Walter had already had enough, after changing the API once :)
>
> I'll try to locate the thread ~ since it would be good to eliminate this
> kind of concern (in the only place it apparently happens).

Is this part of it:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/136
?

Just a quick glance at some of the posts suggests that the current API would be
Ok if double lookups (or pointer syntax to avoid them) could be avoided.

If so, what if the AA implementation (as opposed to the API or the compiler) was
changed so that code like the following would avoid a full-fledged double
lookup?

// if(key in array) {
//    writefln("Key %s found and associated value is %s", key, array[key]);
// }

*Maybe* this could be done in the current implementation w/o a huge amount of
work. The trick would be to make sure it was thread-safe and not add a bunch of
overhead elsewhere, I think.

Basically, if the last lookup is the same as the current lookup (for the same
AA), then why go through the entire lookup again?

Just a thought...

- Dave

Jan 02 2006

Oskar Linde <oskar.lindeREM OVEgmail.com> writes:

Dave wrote:

> Is this part of it:
> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/136
> ?
>
> Just a quick glance at some of the posts suggests that the current API would
> be Ok if double lookups (or pointer syntax to avoid them) could be avoided.
>
> If so, what if the AA implementation (as opposed to the API or the compiler)
> was changed so that code like the following would avoid a full-fledged double
> lookup?
>
> // if(key in array) {
> //    writefln("Key %s found and associated value is %s", key, array[key]);
> // }
>
> *Maybe* this could be done in the current implementation w/o a huge amount of
> work. The trick would be to make sure it was thread-safe and not add a bunch
> of overhead elsewhere, I think.
>
> Basically, if the last lookup is the same as the current lookup (for the same
> AA), then why go through the entire lookup again?

You could just keep a cache of the last lookup for each AA. No need to 
make it thread-safe as the AA isn't anyway. Such a single node cache 
would probably not affect the performance very much for the general 
usage case and might even improve it in some cases.

The current AA implementation already has a magic 1st bucket. Why not 
make the next N buckets contain caches?

(But using the pointer returned from key in array to optimise time 
critical code doesn't seem very ugly to me...)

/Oskar
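
To make the single-node cache idea concrete, here is a rough user-level sketch,
assuming a current D2 compiler. It is a wrapper around a built-in AA, not a
change to the runtime's AA implementation as discussed above, and the names
CachedAA, lookup, put and remove are hypothetical. The cache is dropped
conservatively on every modification, and no attempt is made at thread safety:

```d
import std.stdio : writefln;

// Hypothetical wrapper: a built-in AA plus a one-entry cache of the
// last successful lookup. Not thread-safe.
struct CachedAA(K, V)
{
    private V[K] data;
    private K lastKey;
    private V* lastValue;   // null while the cache is empty

    // Returns a pointer to the value, or null if the key is absent.
    V* lookup(K key)
    {
        if (lastValue !is null && lastKey == key)
            return lastValue;            // cache hit: no hashing needed

        auto p = key in data;            // full lookup
        if (p !is null)
        {
            lastKey = key;
            lastValue = p;
        }
        return p;
    }

    void put(K key, V value)
    {
        data[key] = value;
        lastValue = null;                // invalidate conservatively
    }

    void remove(K key)
    {
        data.remove(key);
        lastValue = null;                // cached node may be gone
    }
}

void main()
{
    CachedAA!(string, int) array;
    array.put("ab", 1);

    string key = "ab";
    if (array.lookup(key) !is null)
        // The second lookup of the same key is served from the cache.
        writefln("Key %s found and associated value is %s",
                 key, *array.lookup(key));
}
```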

Jan 03 2006

Dave <Dave_member pathlink.com> writes:

In article <dpdfhq$2jvj$1 digitaldaemon.com>, Oskar Linde says...
> Dave wrote:
>
>> Is this part of it:
>> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/136
>> ?
>>
>> Just a quick glance at some of the posts suggests that the current API
>> would be Ok if double lookups (or pointer syntax to avoid them) could be
>> avoided.
>>
>> If so, what if the AA implementation (as opposed to the API or the
>> compiler) was changed so that code like the following would avoid a
>> full-fledged double lookup?
>>
>> // if(key in array) {
>> //    writefln("Key %s found and associated value is %s", key, array[key]);
>> // }
>>
>> *Maybe* this could be done in the current implementation w/o a huge amount
>> of work. The trick would be to make sure it was thread-safe and not add a
>> bunch of overhead elsewhere, I think.
>>
>> Basically, if the last lookup is the same as the current lookup (for the
>> same AA), then why go through the entire lookup again?
>
> You could just keep a cache of the last lookup for each AA. No need to
> make it thread-safe as the AA isn't anyway. Such a single node cache
> would probably not affect the performance very much for the general
> usage case and might even improve it in some cases.
>
> The current AA implementation already has a magic 1st bucket. Why not
> make the next N buckets contain caches.
>
> (But using the pointer returned from key in array to optimise time
> critical code doesn't seem very ugly to me...)
>
> /Oskar

Exactly what I had in mind, except I'm not so sure the threading issue can be
dismissed so easily, because this is a cache separate from the userspace data. 

Whereas the AA implementation key/value pairs can always be thread synchronized
in userland, the cache can't, and shouldn't be seen as a userland responsibility
because the user doesn't have direct control over it.

You could always do something like turn the caching off if more than one thread
is active, and you are no worse off than now w.r.t. double lookups.

Or I suppose you could ensure that if the cached key is updated in the
hashtable, then the cached value is updated as well. Hmmm, that should take care
of the problem I guess.

- Dave

Jan 03 2006

"Kris" <fu bar.com> writes:

"Dave" <Dave_member pathlink.com> wrote ...
>> That's very true. But I think the problem here is actually the API? I
>> can't find the thread right now, but there was one a few months back where
>> we were asking Walter to revert back part of his prior change to the AA
>> API, to take care of this issue and a couple of others. From what I
>> recall, a consensus was reached in terms of how it should really operate
>> <*gasp*> but I think by that time Walter had already had enough, after
>> changing the API once :)
>>
>> I'll try to locate the thread ~ since it would be good to eliminate this
>> kind of concern (in the only place it apparently happens).
>
> Is this part of it:
> http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.dtl/136


No ~ I think there's been 4 or 5 threads since then :-)

The more recent concern was along these lines:

If you currently do an AA retrieval with a non-existing key, you'll get an 
exception. Thus you have to either use pointer syntax, trap the exception, 
or do a double-lookup via the use of "x in y" first. The problem is 
apparently compounded when different -release options are applied between a 
client-app and a library.

It's been suggested that a more capable lookup property be introduced for 
AAs (to support the [] syntax). This would look like

double [char[]] aa;
...
double x;
...
if (aa.get ("some key", x))
    // do something with x
else
    // do something else

or more concisely ~ add a get() property to AA's; and possibly a matching 
put() method for the sake of symmetry.

bool get (key, inout value);
void put (key, value);

These AA properties are simple, robust, intuitive, optimal, proven, 
succinct. No redundant lookups. No pointers anywhere to be seen. The get() 
method does not add an empty entry where one is not found, and does not need 
to throw exceptions.
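
As a rough illustration of the get() semantics described above, here is a
sketch written as a free function for a current D2 compiler. It is not part of
the language or of the proposal as posted: the name tryGet is hypothetical
(chosen so it does not clash with the default-value get that present-day D
provides for AAs), and an `out` parameter stands in for the `inout` of the
signature above:

```d
import std.stdio : writefln, writeln;

// Hypothetical helper: copies the value into `value` and returns true
// if `key` is present; otherwise returns false. It performs a single
// lookup, never inserts an entry, and never throws on a missing key.
bool tryGet(K, V)(V[K] aa, K key, out V value)
{
    if (auto p = key in aa)
    {
        value = *p;
        return true;
    }
    return false;
}

void main()
{
    double[string] aa;
    aa["some key"] = 1.5;   // sample data (assumed)

    double x;
    if (aa.tryGet("some key", x))
        writefln("found %s", x);
    else
        writeln("not found");

    // A failed lookup leaves the AA unchanged.
    assert(!aa.tryGet("missing", x));
    assert("missing" !in aa);
}
```

Note that with an `out` parameter, x is reset to its default initializer when
the key is not found.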

I recall it was Regan Heath who first noted this API, perhaps 2 years back? 
The problem with current AA[] syntax is that it limits the expressiveness of 
the API ~ resulting in these ongoing posts about AA issues.


> If so, what if the AA implementation (as opposed to the API or the
> compiler) was changed so that code like the following would avoid a
> full-fledged double lookup?
>
> // if(key in array) {
> //    writefln("Key %s found and associated value is %s", key, array[key]);
> // }
>
> *Maybe* this could be done in the current implementation w/o a huge amount
> of work. The trick would be to make sure it was thread-safe and not add a
> bunch of overhead elsewhere, I think.

That does seem to be adding additional (tricky) work to ensure 
thread-safety, whereas the proposed get/put methods don't require anything 
like that? Walter had also suggested that a compiler might "notice" a 
double-lookup and then attempt optimization ~ of course, the compiler does 
not currently do that. It would surely be simpler to instead expose an API 
capable of handling the need, in a concise and able manner? Is there some 
rule that states AAs cannot have properties or methods?

Would be great if you'd be interested in adding these to the front-end ... 
<g>

- Kris

Jan 03 2006
