www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - sorting associative array's keys by values

reply maarten van damme <maartenvd1994 gmail.com> writes:
Right now I have an associative array "int[string] aa" and stored the
keys in "string[] keys".
Now I want to sort keys[] so that "aa[keys[0]]>aa[keys[1]]"
I remember someone gave the answer to that question on stackoverflow
but after some googling I couldn't find the right answer.

maarten
Jun 16 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/16/2012 06:34 PM, maarten van damme wrote:
 Right now I have an associative array "int[string] aa" and stored the
 keys in "string[] keys".
 Now I want to sort keys[] so that "aa[keys[0]]>aa[keys[1]]"
 I remember someone gave the answer to that question on stackoverflow
 but after some googling I couldn't find the right answer.

 maarten

schwartzSort!(k=>aa[k])(keys);
Jun 16 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/16/2012 06:41 PM, Timon Gehr wrote:
 On 06/16/2012 06:34 PM, maarten van damme wrote:
 Right now I have an associative array "int[string] aa" and stored the
 keys in "string[] keys".
 Now I want to sort keys[] so that "aa[keys[0]]>aa[keys[1]]"
 I remember someone gave the answer to that question on stackoverflow
 but after some googling I couldn't find the right answer.

 maarten

schwartzSort!(k=>aa[k])(keys);

oops. Seems like you actually want schwartzSort!(k=>aa[k],"a > b")(keys);
Jun 16 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 06/16/2012 07:51 PM, maarten van damme wrote:
 For some crazy reason my program now crashes on seemingly random
 locations when parsing content of the form:
 	<div class="details">
 	<h1 class="unique">randomname</h1>
 I want to extract randomname but an xml parser would be overkill so I
 extract it using
 curLine=curLine[curLine.countUntil(">")+1..$];
 curLine=curLine[0..curLine.countUntil("</")];

 with curLine containing the contents of the second line.
 When it crashes, it does so on the second instruction. I tried
 wrapping try{}catch around it but to no avail.
 I have no problem with it crashing, could be a logic mistake, but I
 really need to be able to rely on try{}catch...

 It's really strange because every string involved in the process is
 completely cleared and it fails sometimes on cases where it has no
 problems the other times.

Possibly memory corruption. Do you use any unsafe constructs?
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
thank you, works perfectly.
I'm really having some troubles understanding the whole std.algorithm
documentation although it is a module I've had to use in nearly every
single program I wrote and it's always extremely powerful.
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
For some crazy reason my program now crashes on seemingly random
locations when parsing content of the form:
	                <div class="details">
	                    <h1 class="unique">randomname</h1>
I want to extract randomname but an xml parser would be overkill so I
extract it using
curLine=curLine[curLine.countUntil(">")+1..$];
curLine=curLine[0..curLine.countUntil("</")];

with curLine containing the contents of the second line.
When it crashes, it does so on the second instruction. I tried
wrapping try{}catch around it but to no avail.
I have no problem with it crashing, could be a logic mistake, but I
really need to be able to rely on try{}catch...

It's really strange because every string involved in the process is
completely cleared and it fails sometimes on cases where it has no
problems the other times.
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
Ok, It turns out that an array out of bound exception isn't an
exception but an error?
Try catch (Error e) worked fine.
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
Nothing unsafe. I use https://github.com/Bystroushaak/DHTTPClient to
download a webpage and other then that I only use slicing...
Jun 16 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, June 16, 2012 21:48:52 maarten van damme wrote:
 Ok, It turns out that an array out of bound exception isn't an
 exception but an error?
 Try catch (Error e) worked fine.

Yes, it's a RangeError. It's considered a programming bug if you access an array out of bounds - just like any assertion failure is an AssertError and considered a bug in your program in if it fails. And so they kill the program when they fail. Catching either of them is generally a very bad idea. - Jonathan M Davis
Jun 16 2012
prev sibling next sibling parent reply maarten van damme <maartenvd1994 gmail.com> writes:
I wanted to catch it because I could not for the life of me understand
how downloading the exact same page twice could alter it's contents in
such a way that it causes the program to crash.

There's something really strange going on (or maybe I'm just too tired
to see the obvious)
My code literally reads

				if(debugtemp){
					writeln(tradeDocument[0..100]);
					writeln(tradeDocument.countUntil("<div class=\""));
					stdout.flush();
				}

And the output I get is

P-BODY SHRUG</h1>
                            <span class="level">Level 1</span><br />(Uncraftable
)

150
<div class="item unique" id="7088388" data="620,20,6" search="level:1;craftable:
false;">

0

Now either I'm going crazy or std.algorithm sees ghosts apearing.

That isn't the worst part, that is

tradeDocument=tradeDocument[1..$];
				
tradeDocument=tradeDocument[tradeDocument.countUntil("<div
class=\"")..$];//go to the next element

crashing immediatly after the previous piece of code. So in reality
std.algorithm sees an ellement that isn't there, says it's at index 0
and a second later my program crashes on the statement x=x[0..$]
What on earth is going on?

I use methods from
import std.algorithm;
import std.array;
import std.stdio;
import std.string;
and no pointers, aren't they supposed to be a subset of safeD so we
can rule out memory corruption?
(I've confirmed this behaviour on two different machines)
Jun 16 2012
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 06/16/2012 03:28 PM, maarten van damme wrote:

 And the output I get is

It is possible that some part of the code is reusing a string buffer. For example, File.byLine does that.
 tradeDocument=tradeDocument[1..$];

For that to work, you must ensure that tradeDocument has at least 2 elements.
 tradeDocument=tradeDocument[tradeDocument.countUntil("<div
 class=\"")..$];//go to the next element

For that to work, you must have ensured that tradeDocument contains "<div class=\"". Otherwise countUntil() returns -1, an invalid index. If those have not been ensured, I see two bugs in the lines above. Ali -- D Programming Language Tutorial: http://ddili.org/ders/d.en/index.html
Jun 16 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, June 17, 2012 00:28:07 maarten van damme wrote:
 I wanted to catch it because I could not for the life of me understand
 how downloading the exact same page twice could alter it's contents in
 such a way that it causes the program to crash.
 
 There's something really strange going on (or maybe I'm just too tired
 to see the obvious)
 My code literally reads
 
 				if(debugtemp){
 					writeln(tradeDocument[0..100]);
 					writeln(tradeDocument.countUntil("<div class=\""));
 					stdout.flush();
 				}
 
 And the output I get is
 
 P-BODY SHRUG</h1>
                             <span class="level">Level 1</span><br
 />(Uncraftable )
 
 150
 <div class="item unique" id="7088388" data="620,20,6"
 search="level:1;craftable: false;">
 
 0
 
 Now either I'm going crazy or std.algorithm sees ghosts apearing.
 
 That isn't the worst part, that is
 
 tradeDocument=tradeDocument[1..$];
 
 tradeDocument=tradeDocument[tradeDocument.countUntil("<div
 class=\"")..$];//go to the next element
 
 crashing immediatly after the previous piece of code. So in reality
 std.algorithm sees an ellement that isn't there, says it's at index 0
 and a second later my program crashes on the statement x=x[0..$]
 What on earth is going on?
 
 I use methods from
 import std.algorithm;
 import std.array;
 import std.stdio;
 import std.string;
 and no pointers, aren't they supposed to be a subset of safeD so we
 can rule out memory corruption?
 (I've confirmed this behaviour on two different machines)

If anything in there is calling trusted code - either directly or indirectly - then it's possible that the trusted code which is being called is wrong (since in that case, it's the programmer guaranteeing that it's safe, not the compiler). It's also possible that you've run into a code generation bug. What's the shortest possible code that you have which reproduces the problem that you can give? It needs to be narrowed down before it can be determined what exactly is going wrong, and I don't think that you've given enough code in your posts for anyone else to compile it and see the failure on their machine. - Jonathan M Davis
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
Ah, wait a second. After playing a bit with the try catches and
actually writing some proper debug output I found out the problem was
with https://github.com/Bystroushaak/DHTTPClient/blob/master/dhttpclient.d.
It doesn't allways download the complete webpage.

I should've written better tests I guess...
Thank you for your time.
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
It should allways contain </html> so it has more then 2 elements and
there is a note section that starts with "<div class" where I stop
parsing and break out of the loop so those two should've been
statisfied. The problem was (I think) the downloader. Now I get waaay
less frequent crashes (yes, there is still a little bug somewhere but
I'll catch it)
Jun 16 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, June 17, 2012 01:35:56 maarten van damme wrote:
 It should allways contain </html> so it has more then 2 elements and
 there is a note section that starts with "<div class" where I stop
 parsing and break out of the loop so those two should've been
 statisfied. The problem was (I think) the downloader. Now I get waaay
 less frequent crashes (yes, there is still a little bug somewhere but
 I'll catch it)

It _should_ always have </html>, but that string is input from outside the program under which you have no control, so you need to check it or at least check that any find or similar function that you use which can return an invalid index when </html> isn't there doesn't return an invalid index. Otherwise, you're going to get RangeErrors when it isn't there. Array slicing assumes that you've checked beforehand that the indices are valid, and it's a bug in your program if they're not. - Jonathan M Davis
Jun 16 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
well, the page I parse is automatically generated and thus allways
contains </html>. (if the page completely downloaded which it didn't).
The second error I found (my mistake) is that newlines get scrambled
up severely when downloading the page causing the markers I try to
find to sometimes break down on two different lines. replacechars from
std.string crashed too with the error "invalid utf sequence". Now I
use std.array's replace and everything (appears) to be working.


If only I'd learned regex, I think that would've saved me quite some time.
Jun 17 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, June 17, 2012 13:07:24 maarten van damme wrote:
 well, the page I parse is automatically generated and thus allways
 contains </html>.

That may be true, but if your code assumes that </html> is there and it ever isn't for any reason, then you're going to get a RangeError thrown in non- release and undefined behavior in -release. It's generally a bad idea to assume that something is correct like that when you have zero control over it within your program, even if it's almost certainly correct. Something could go wrong, and it's better to throw an exception or do some other sort of error handling than it is to get an Error, or worse, undefined behavior. - Jonathan M Davis
Jun 17 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
Ok, everything worked (nearly perfect). Sometimes the webpage gets
completely messed up  ("<" changes to d134 or something like that) but
the tests handle it rather well.
That's why I decided to improve it a bit and use treads. I've always
been afraid of d treads because I never really got the grasp of that
message passing system but hey, there is a first time for everything.
Maybe I was a bit optimistic as I now already run into troubles.

I have two arrays of the form
int[string]

To pass them from the spawned function back to the spawner I wanted to use
tid.send( array1,array2);

This was illegal, they have to be either immutable or shared. I've
read somwhere that D tries to avoid shared where possible (with that
message passing system) so I went for immutable and changed it over to
tid.send(cast(immutable(int[string]))array1,...) (you get the idea)

Not only does this look ugly, it's looks like what I'm doing isn't
really correct.
But anyway, onto the next part, receiving it.

I used
auto msg=receiveOnly!(int[string],int[string]);

of course this fails because I send immutable items and want to receive mutables
I then changed it to
auto msg=receiveOnly!(immutable int[string],immutable int[string]);

Compilation now fails with the very informative
C:\D\dmd2\windows\bin\..\..\src\phobos\std\concurrency.d(612): Error: can only i
nitialize const member _field_field_0 inside constructor
C:\D\dmd2\windows\bin\..\..\src\phobos\std\concurrency.d(612): Error: can only i
nitialize const member _field_field_1 inside constructor

Why is treading made so damn hard?
what have I done wrong now?
Jun 17 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
another problem, when I do use shared.
My code is
    int amountTreads = 20;
	if(upperLimit-lowerLimit<amountTreads)
		amountTreads=upperLimit-lowerLimit;
    int perTrade     = (upperLimit - lowerLimit) / amountTreads;

    for (int x = 0; x < amountTreads; x++)
    {
        //we're at the last trade, parse all the rest
        if (amountTreads - x == 1)
        {
            spawn(&parse, lowerLimit + x * perTrade, upperLimit - x *
perTrade, thisTid);
        }
        else
        {
            spawn(&parse, lowerLimit + x * perTrade, lowerLimit + (x +
1) * perTrade - 1, thisTid);
        }
    }

    //wait for all those treads and process the response
    for (int x = 0; x < amountTreads; x++)
    {
		auto msg=receiveOnly!(shared int[string],shared int[string]);
        int[string] tempDemands = cast(int[string])msg[0];
        int[string] tempOffers  = cast(int[string])msg[1];
		
        foreach (string key; tempDemands.keys){
            demands[key] += tempDemands[key];
		}
        foreach (string key; tempOffers.keys){
            offers[key] += tempOffers[key];
		}
        writeln("ok, we're at ",x);
    }

but when every tread has stopped working , that for-loop has only been
executed 17 times.
Jun 17 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
Everything turned out to be problems with \r \n.
The treading system worked perfectly (although I still don't
understand how one can use immutable and receiveonly).
Jun 18 2012
prev sibling next sibling parent maarten van damme <maartenvd1994 gmail.com> writes:
and something I forgot to ask, is it a conscious decision to not print
out fired asserts in treads? Normally when an assert fails my whole
program crashes and I can see what went wrong. With treads however, it
quietly dies.
Jun 18 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 18 June 2012 at 13:22:24 UTC, maarten van damme wrote:
 and something I forgot to ask, is it a conscious decision to 
 not print out fired asserts in treads? Normally when an assert 
 fails my whole program crashes and I can see what went wrong. 
 With treads however, it quietly dies.

Be horrible if a kernel died due to a program's assert and throws wouldn't it? Since all programs are effectively threads or separate processes. Since a thread is just a branch of the main program, it can only go up the stack as much as it's stack is present (or that's how I see it working). I haven't done thread programming yet, but sometimes when using VisualD when it breaks due to an assert it wouldn't give any detailed information. In those cases I used a try/catch... void threadedFunction(){ //global try/catch try { //your function code } catch(Throwable o) { writeln(o); //still dies afterwards } }
Jun 18 2012
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, June 18, 2012 15:22:15 maarten van damme wrote:
 and something I forgot to ask, is it a conscious decision to not print
 out fired asserts in treads? Normally when an assert fails my whole
 program crashes and I can see what went wrong. With treads however, it
 quietly dies.

It could be a bug. I don't know. Certainly, having the thread die will not necesasrily take your whole program down, regardless of why it died, but having it not print out anything an assertion failure is certainly undesirable. At minimum, creating an enhancement request would probably be in order, and it may just outright be a bug. A message might be sent to the parent TID. I'd have to reread TDPL's concurrency chapter and mess around with std.concurrency a bit to remember/figure out some of those details though (and regardless of what TDPL says, the current implementation may not match it). - Jonathan M Davis
Jun 21 2012