digitalmars.D.bugs - dmd 0.141: accessing associative array and -release compiler switch - array.d

Ivan Cibiri <Ivan_member pathlink.com> writes:
Hello,

when accessing an associative array, a different exception is thrown depending on
whether the -release compiler switch is used. See the attached source file. Compile it
without the -release switch and run it. Then compile it with the -release switch and
run it again.

The problem is that when compiled with the -release switch, it generates an Access
Violation exception instead of an ArrayBoundsError exception. I expect the same
exception to be generated in both cases; otherwise it is necessary to have
different application logic to recover from the exception.

In my opinion, the correct behaviour is for std.array.ArrayBoundsError to be
thrown in both cases.

Ivan.
Dec 29 2005
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
Array bounds checking is turned off in release versions for performance, unlike certain other slow, paranoid languages ;) Hence, you get an access violation in the release build.
Dec 29 2005
Chris Lajoie <ctlajoie___remove___this___ ___gmail.com> writes:
I don't like the idea that my debug and release builds might run differently. Should that behavior be acceptable? Maybe bounds checking should always be on unless there's a pragma(NoBoundsCheck) (or something)... or a compiler switch.

Chris
Dec 31 2005
Ivan Cibiri <Ivan_member pathlink.com> writes:
... release builds. I understand that switching off bounds checking improves performance; however, it changes the behaviour of the application in some cases. Of course, when turning on a release build you should understand what it does to your application. Maybe more control over the level of optimization would be useful (e.g. bounds checking off), but on the other hand it would be more complicated (and possibly a waste of time) to play with many different switches.

The current solution for debug and release builds is simple to use, but for me it is a little bit annoying that in a debug build you can fine-tune recovery from different exceptions, but in a release build it does not work because different exceptions are thrown as a result of bounds checking being turned off. The situation is similar with asserts in debug and release builds. Both bounds checking and asserts help you debug your code, but as a result of using these features you have to expect possibly different behaviour of the application when it is compiled with or without the -release switch. It means that you have to run your test suite against the debug build and then also against the release build.

Ivan.
Dec 31 2005
"Ameer Armaly" <ameer_armaly hotmail.com> writes:
for bugs and to inform you of what happened; release builds are designed for when you're 99 percent sure that everything's going to go just right, and don't necessarily have to do that kind of checking. When in doubt, you can always just leave off both switches.
Dec 31 2005
Dave <Dave_member pathlink.com> writes:
In article <dp5me6$v86$1 digitaldaemon.com>, Chris Lajoie says...
 Maybe bounds checking should always be on unless there's a
 pragma(NoBoundsCheck) (or something).. or a compiler switch.

But -release *is* the compiler switch <g>

Ok, I know that the OP example was just that (an example) and therefore contrived, but D provides better ways to write that loop, and most like it, to avoid even the possibility of an ArrayBoundsError:

    int* result = key in array;
    if(result)
        writefln("Key %s found and associated value is %s", key, *result);

or

    if(key in array)
        writefln("Key %s found and associated value is %s", key, array[key]);

Since perhaps you could leave out the EH, either of these would use less code and probably be faster anyhow. The same is true of other types of arrays by using the built-in foreach instead of direct indexing.

The point is, there are expedient ways built right into the language to often work around even the need for bounds checking; bounds checking can add a lot of runtime overhead, and it can't always be reliably optimized away by the compiler, so part of what -release does is remove it. The alternative would logically conclude with a different switch for every type of runtime check or contract statement.
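
For anyone who wants to try the two idioms above in isolation, here is a minimal sketch. It assumes a current D compiler rather than dmd 0.141, and the helper names `lookupOr` and `sumAll` are made up for illustration; they are not library functions.

```d
import std.stdio;

/// Returns the value stored under `key`, or `fallback` if the key is absent.
/// Hypothetical helper, written only to illustrate the `in` idiom.
int lookupOr(int[string] table, string key, int fallback)
{
    // `key in table` yields a pointer to the stored value, or null.
    if (auto p = key in table)
        return *p;       // single lookup, no exception path involved
    return fallback;
}

/// Sums an array with foreach, so there is no index expression to check.
int sumAll(const int[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[string] ages = ["alice": 30, "bob": 25];
    writefln("alice -> %s", lookupOr(ages, "alice", -1));
    writefln("carol -> %s", lookupOr(ages, "carol", -1));
    writefln("sum   -> %s", sumAll([1, 2, 3]));
}
```

Because neither function ever indexes with a key or position that might be missing, the code takes the same path whether or not bounds checks are compiled in.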
Dec 31 2005
"Kris" <fu bar.com> writes:
That's very true. But I think the problem here is actually the API? I can't find the thread right now, but there was one a few months back where we were asking Walter to revert part of his prior change to the AA API, to take care of this issue and a couple of others. From what I recall, a consensus was reached in terms of how it should really operate <*gasp*> but I think by that time Walter had already had enough, after changing the API once :)

I'll try to locate the thread ~ since it would be good to eliminate this kind of concern (in the only place it apparently happens).
Dec 31 2005
Dave <Dave_member pathlink.com> writes:
Is this part of it: digitalmars.D.dtl/136 ?

Just a quick glance at some of the posts suggests that the current API would be OK if double lookups (or pointer syntax to avoid them) could be avoided.

If so, what if the AA implementation (as opposed to the API or the compiler) was changed so that code like the following would avoid a full-fledged double lookup?

// if(key in array) {
//    writefln("Key %s found and associated value is %s", key, array[key]);
// }

*Maybe* this could be done in the current implementation w/o a huge amount of work. The trick would be to make sure it was thread-safe and not add a bunch of overhead elsewhere, I think.

Basically, if the last lookup is the same as the current lookup (for the same AA), then why go through the entire lookup again?

Just a thought...

- Dave
Jan 02 2006
Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
You could just keep a cache of the last lookup for each AA. No need to make it thread-safe as the AA isn't anyway. Such a single-node cache would probably not affect the performance very much for the general usage case and might even improve it in some cases. The current AA implementation already has a magic 1st bucket. Why not make the next N buckets contain caches?

(But using the pointer returned from key in array to optimise time-critical code doesn't seem very ugly to me...)

/Oskar
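
To make the single-entry cache idea concrete, here is a rough user-level sketch. It is not how the runtime's AA is implemented; `CachedAA`, `find`, `put` and `erase` are hypothetical names, the code assumes a current D compiler, and it is deliberately not thread-safe, as discussed above.

```d
import std.stdio;

/// Hypothetical wrapper that remembers the most recent successful lookup.
/// Illustration only: not thread-safe, and not the built-in AA's design.
struct CachedAA(K, V)
{
    private V[K] data;
    private K lastKey;
    private V* lastValue;   // null while the cache is empty

    /// Returns a pointer to the value for `key`, or null if it is absent.
    V* find(K key)
    {
        if (lastValue !is null && key == lastKey)
            return lastValue;        // cache hit: skip the hash lookup
        auto p = key in data;        // full lookup
        if (p !is null)
        {
            lastKey = key;
            lastValue = p;
        }
        return p;
    }

    /// Inserts or overwrites an entry and conservatively drops the cache.
    void put(K key, V value)
    {
        data[key] = value;
        lastValue = null;
    }

    /// Removes an entry and drops the cache so no stale pointer survives.
    void erase(K key)
    {
        data.remove(key);
        lastValue = null;
    }
}

void main()
{
    CachedAA!(string, int) table;
    table.put("answer", 42);

    if (table.find("answer") !is null)                    // fills the cache
        writefln("answer = %s", *table.find("answer"));   // served from it
}
```

Dropping the cache on every write is the simplest way to keep it consistent with the table; the cost is that an insert between the test and the read forces a second full lookup.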
Jan 03 2006
Dave <Dave_member pathlink.com> writes:
Exactly what I had in mind, except I'm not so sure the threading issue can be dismissed so easily, because this is a cache separate from the userspace data. Whereas the AA implementation key/value pairs can always be thread synchronized in userland, the cache can't, and shouldn't be seen as a userland responsibility because the user doesn't have direct control over it.

You could always do something like turn the caching off if more than one thread is active, and you are no worse off than now w.r.t. double lookups. Or I suppose you could ensure that if the cached key is updated in the hashtable, then the cached value is updated as well. Hmmm, that should take care of the problem I guess.

- Dave
Jan 03 2006
"Kris" <fu bar.com> writes:
"Dave" <Dave_member pathlink.com> wrote ...
 Is this part of it: digitalmars.D.dtl/136

No ~ I think there's been 4 or 5 threads since then :-)

The more recent concern was along these lines: If you currently do an AA retrieval with a non-existing key, you'll get an exception. Thus you have to either use pointer syntax, trap the exception, or do a double-lookup via the use of "x in y" first. The problem is apparently compounded when different -release options are applied between a client-app and a library.

It's been suggested that a more capable lookup property be introduced for AAs (to support the [] syntax). This would look like

    double [char[]] aa;
    ...
    double x;
    ...
    if (aa.get ("some key", x))
        // do something with x
    else
        // do something else

or more concisely ~ add a get() property to AA's; and possibly a matching put() method for the sake of symmetry.

    bool get (key, inout value);
    void put (key, value);

These AA properties are simple, robust, intuitive, optimal, proven, succinct. No redundant lookups. No pointers anywhere to be seen. The get() method does not add an empty entry where one is not found, and does not need to throw exceptions. I recall it was Regan Heath who first noted this API, perhaps 2 years back?

The problem with current AA[] syntax is that it limits the expressiveness of the API ~ resulting in these ongoing posts about AA issues.

 If so, what if the AA implementation (as opposed to the API or the compiler) was
 changed so that code like the following would avoid a full-fledged double lookup?

 // if(key in array) {
 //    writefln("Key %s found and associated value is %s", key, array[key]);
 // }

 *Maybe* this could be done in the current implementation w/o a huge amount of
 work. The trick would be to make sure it was thread-safe and not add a bunch of
 overhead elsewhere, I think.

That does seem to be adding additional (tricky) work to ensure thread-safety, whereas the proposed get/put methods don't require anything like that? Walter had also suggested that a compiler might "notice" a double-lookup and then attempt optimization ~ of course, the compiler does not currently do that. It would surely be simpler to instead expose an API capable of handling the need, in a concise and able manner?

Is there some rule that states AAs cannot have properties or methods?

Would be great if you'd be interested in adding these to the front-end ... <g>

- Kris
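
The get() lookup proposed in this post can be approximated in user code on top of the `in` operator. What follows is only a sketch under a few assumptions: it targets a current D compiler, it uses `ref` where the proposal wrote `inout`, and it is named `tryGet` (a made-up name) so that it does not collide with the built-in `get(key, defaultValue)` property that current D provides for associative arrays, which has a different shape from the proposal.

```d
import std.stdio;

/// Copies the value stored under `key` into `value` and returns true;
/// returns false and leaves `value` untouched if the key is absent.
/// `tryGet` is a hypothetical free function, not part of the runtime.
bool tryGet(K, V)(V[K] aa, K key, ref V value)
{
    if (auto p = key in aa)   // one hash lookup
    {
        value = *p;
        return true;
    }
    return false;             // no entry is added and nothing is thrown
}

void main()
{
    double[string] aa = ["some key": 1.5];
    double x = 0;

    // Callable with member syntax through uniform function call syntax.
    if (aa.tryGet("some key", x))
        writefln("found %s", x);
    else
        writeln("not found");

    if (!aa.tryGet("missing", x))
        writeln("missing key reported without an exception");
}
```

A miss never goes through the indexing expression, so the result does not depend on whether bounds checks are compiled in, and no pointer appears at the call site.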
Jan 03 2006