digitalmars.D.learn - associative arrays: to sort or not to sort?
- Mario Kroeplin (15/15) Jul 18 2010 Have a look at the unittest of std.json: the only non-trivial test is co...
- BCS (16/23) Jul 18 2010 Unless JSON requiers that the keys be in some order, the correct solutio...
- Mario Kroeplin (15/27) Jul 20 2010 But this is lengthy and seems to lack additional assert checks.
- BCS (6/28) Jul 21 2010 It's a little wordy but it has one line per things it is checking. And a...
- Mario Kroeplin (75/94) Jul 21 2010 The order of name/value pairs in JSON objects is as unspecified as the o...
- BCS (8/19) Jul 21 2010 On the output side, why not? The unit test is for this implementation af...
Have a look at the unittest of std.json: the only non-trivial test is commented out as "currently broken". Smells like std.json has deep problems when it comes to real-world examples. But then: running the unittest shows that the actual result is {"goodbye":[true,"or",false,["test",42,{"nested":{"a":23.54,"b":0.0012}}]],"hello":{"json":"is great","array":[12,null,{}]}}. The two name/value pairs just appear in different order than expected. That's because "goodbye" < "hello", if you make an illegal assumption for associative arrays, as "the key/value slots are iterated in an unspecified order." [tdpl] Now the question: Shall the "currently broken" unittest be "fixed" by expecting the actual result, with just the order of "goodbye" and "hello" changed? That is, shall we silently make the illegal assumption, that foreach provides the keys in sorted order. Or, shall the foreach (key, value; aa) be replaced with a more clumsy foreach (key; aa.keys.sort)? That is, shall we produce "canonical" JSON text at the price of efficiency. Or, shall the perfect implementation of JSON objects as associative arrays be dropped? Or, what else? By the way: the example before the "currently broken" one is {"a":1,"b":null}; also broken in theory, but not (yet) in practice!
Jul 18 2010
Hello Mario, That is, shall we produce "canonical" JSON text at the price ofefficiency. Or, shall the perfect implementation of JSON objects as associative arrays be dropped? Or, what else?Unless JSON requiers that the keys be in some order, the correct solution is to make the check order independent. For example: string s = CallReturningJSON(); s = replace(s,`"a":23.54`, `X`); s = replace(s,`"b":0.0012`, `X`); s = replace(s,`{"nested":{X,X}}`, `X`); s = replace(s,`"goodbye":[true,"or",false,["test",42,X]]`, `X`); s = replace(s,`"array":[12,null,{}]`, `Y`); s = replace(s,"is\n\ngreat", `Z`); s = replace(s,`"json":Z`, `Y`); s = replace(s,`"hello":{Y,Y}`, `X`); assert(s == `{X,X}`);. -- ... <IXOYE><
Jul 18 2010
Unless JSON requiers that the keys be in some order,No, JSON does not require the names of an object to be in alphabetical order.the correct solution is to make the check order independent. For example: string s = CallReturningJSON(); s = replace(s,`"a":23.54`, `X`); s = replace(s,`"b":0.0012`, `X`); s = replace(s,`{"nested":{X,X}}`, `X`); s = replace(s,`"goodbye":[true,"or",false,["test",42,X]]`, `X`); s = replace(s,`"array":[12,null,{}]`, `Y`); s = replace(s,"is\n\ngreat", `Z`); s = replace(s,`"json":Z`, `Y`); s = replace(s,`"hello":{Y,Y}`, `X`); assert(s == `{X,X}`);.But this is lengthy and seems to lack additional assert checks. One way out could be to avoid real-world examples in the unittest. For the simple example of std.json that is only "broken in theory", the correct solution could be: assert(s == `{"a":1,"b":null}` || s == `{"b":null,"a":1}`) But now assume, I want to implement a JSON-RPC encoder that uses std.json. A JSON-RPC request object has up to four name/value pairs ("jsonrpc", "method", "params", "id"). Instead of only 2!, I now have 4! (too many) possible orders to check in the unittest of the JSON-RPC encoder. The unspecified order of the foreach iteration does not only affect the unittest of std.json, but also leaks out to far away implementations using std.json. The same is true for std.xml: assert(s == `<tag goodbye="1" hello="0"/>`); may pass today, but is broken in theory! You have to notice that XML attributes are stored in an associative array and that toString is implemented using foreach. Until then, broken unit tests pass... This is a problem, isn't it?
Jul 20 2010
Hello Mario,It's a little wordy but it has one line per things it is checking. And adjusting it to give more asserts is trivial.Unless JSON requiers that the keys be in some order,No, JSON does not require the names of an object to be in alphabetical order.the correct solution is to make the check order independent. For example: string s = CallReturningJSON(); s = replace(s,`"a":23.54`, `X`); s = replace(s,`"b":0.0012`, `X`); s = replace(s,`{"nested":{X,X}}`, `X`); s = replace(s,`"goodbye":[true,"or",false,["test",42,X]]`, `X`); s = replace(s,`"array":[12,null,{}]`, `Y`); s = replace(s,"is\n\ngreat", `Z`); s = replace(s,`"json":Z`, `Y`); s = replace(s,`"hello":{Y,Y}`, `X`); assert(s == `{X,X}`);.But this is lengthy and seems to lack additional assert checks.Until then, broken unit tests pass...Spot on. The unittest is broken. Not the thing it's testing and not foreach. -- ... <IXOYE><
Jul 21 2010
The order of name/value pairs in JSON objects is as unspecified as the order of foreach iteration for associative arrays. So, agreed, you would have to unittest a serialization against all permutations. But then, JSON has a jew more unspecified gaps like "whitespace can be inserted between any pair of tokens". Shall we rely on the fact that the implementation currently does not insert whitespace between tokens? This would work for the unittest of std.json, but what's with the unittest of an implementation using std.json? Or, shall we implement a tokenizer in the unittest to get rid of extra whitespace between tokens?It's a little wordy but it has one line per things it is checking. And adjusting it to give more asserts is trivial.the correct solution is to make the check order independent. For example: string s = CallReturningJSON(); s = replace(s,`"a":23.54`, `X`); s = replace(s,`"b":0.0012`, `X`); s = replace(s,`{"nested":{X,X}}`, `X`); s = replace(s,`"goodbye":[true,"or",false,["test",42,X]]`, `X`); s = replace(s,`"array":[12,null,{}]`, `Y`); s = replace(s,"is\n\ngreat", `Z`); s = replace(s,`"json":Z`, `Y`); s = replace(s,`"hello":{Y,Y}`, `X`); assert(s == `{X,X}`);.But this is lengthy and seems to lack additional assert checks.Maybe, the thing under test is not broken. But there seems to be something wrong with the thing when it forces you to examine the unspecified result of toString in a unittest. I mean, it would have been easy for the author of the code to provide toCanonicalJSON, and a lot easier for users of the code. And, of course, foreach is not broken. But D removes a lot of unspecified gaps that make life hard in C or C++. And you have to provide opCmp in order to put your own keys into an associative array, so why don't you get them out in order? However, let's get things done: can the attached unittest be an acceptatble replacement for the broken one of std.json? begin 644 json-unittest.d M=6YI='1E<W0 >PH ("` +R\ 06X ;W9E<FQY('-I;7!L92!T97-T('-U:71E M+"!I9B!I="!C86X <&%R<V4 82!S97)I86QI>F5D('-T<FEN9R!A;F0*("` M("\O('1H96X =7-E('1H92!R97-U;'1I;F< =F%L=65S('1R964 =&\ 9V5N M97)A=&4 86X :61E;G1I8V%L"B` ("`O+R!S97)I86QI>F%T:6]N+"!B;W1H M('1H92!D96-O9&5R(&%N9"!E;F-O9&5R('=O<FMS+ H*("` ($I33TY686QU M92!V86QU93L*("` ('-T<FEN9R!J<V]N.PH ("` "B` ("!V86QU92`]('!A M<G-E2E-/3BA ;G5L;&`I+"!A<W-E<G0H=&]*4T].*"9V86QU92D /3T 8&YU M;&Q *3L*("` ('9A;'5E(#T <&%R<V5*4T].*&!T<G5E8"DL(&%S<V5R="AT M;TI33TXH)G9A;'5E*2`]/2! =')U96`I.PH ("` =F%L=64 /2!P87)S94I3 M3TXH8&9A;'-E8"DL(&%S<V5R="AT;TI33TXH)G9A;'5E*2`]/2! 9F%L<V5 M*3L*("` ('9A;'5E(#T <&%R<V5*4T].*&`P8"DL(&%S<V5R="AT;TI33TXH M/2!P87)S94I33TXH8"TT,S(Q8"DL(&%S<V5R="AT;TI33TXH)G9A;'5E*2`] M/2! +30S,C% *3L*("` ('9A;'5E(#T <&%R<V5*4T].*&`P+C(S8"DL(&%S M<V5R="AT;TI33TXH)G9A;'5E*2`]/2! ,"XR,V`I.PH ("` =F%L=64 /2!P M87)S94I33TXH8"TP+C(S8"DL(&%S<V5R="AT;TI33TXH)G9A;'5E*2`]/2! M+3`N,C- *3L*("` ('9A;'5E(#T <&%R<V5*4T].*&!N=6QL8"DL(&%S<V5R M="AT;TI33TXH)G9A;'5E*2`]/2! ;G5L;&`I.PH ("` =F%L=64 /2!P87)S M94I33TXH8"(B8"DL(&%S<V5R="AT;TI33TXH)G9A;'5E*2`]/2! (B) *3L* M("` ('9A;'5E(#T <&%R<V5*4T].*&`Q+C(R,V4K,C1 *2P 87-S97)T*'1O M2E-/3B F=F%L=64I(#T](&`Q+C(R,V4K,C1 *3L*("` ('9A;'5E(#T <&%R M<V5*4T].*&`B:&5L;&]<;G=O<FQD(F`I+"!A<W-E<G0H=&]*4T].*"9V86QU M92D /3T 8")H96QL;UQN=V]R;&0B8"D["B` ("!V86QU92`]('!A<G-E2E-/ M3BA (EPB7%Q<+UQB7&9<;EQR7'0B8"DL(&%S<V5R="AT;TI33TXH)G9A;'5E M*2`]/2! (EPB7%Q<+UQB7&9<;EQR7'0B8"D["B` ("!V86QU92`]('!A<G-E M2E-/3BA 6UU *2P 87-S97)T*'1O2E-/3B F=F%L=64I(#T](&!;76`I.PH M("` =F%L=64 /2!P87)S94I33TXH8%LQ,BPB9F]O(BQT<G5E+&9A;'-E76`I M+"!A<W-E<G0H=&]*4T].*"9V86QU92D /3T 8%LQ,BPB9F]O(BQT<G5E+&9A M;'-E76`I.PH ("` =F%L=64 /2!P87)S94I33TXH8'M]8"DL(&%S<V5R="AT M;TI33TXH)G9A;'5E*2`]/2! >WU *3L*("` ('9A;'5E(#T <&%R<V5*4T]. M*&![(F$B.C$L(F(B.FYU;&Q]8"D[" H ("` :G-O;B`]('1O2E-/3B F=F%L M=64I.R` +R\ ;W)D97( ;V8 ;F%M92]V86QU92!P86ER<R!I;B!*4T].(&]B M:F5C=', :7, =6YS<&5C:69I960*("` (&%S<V5R="AJ<V]N(#T](&![(F$B M.C$L(F(B.FYU;&Q]8"!\?"!J<V]N(#T](&![(F(B.FYU;&PL(F$B.C%]8"D[ M" H ("` =F%L=64 /2!P87)S94I33TXH8'LB:&5L;&\B.GLB:G-O;B(Z(FES M(&=R96%T(BPB87)R87DB.ELQ,BQN=6QL+'M]77TL8`H ("` ("` ("` ("` M("` ("` ("` 8")G;V]D8GEE(CI;=')U92PB;W(B+&9A;'-E+%LB=&5S="(L M("!J<V]N(#T =&]*4T].*"9V86QU92D[("`O+R!O<F1E<B!O9B!N86UE+W9A M;'5E('!A:7)S(&EN($I33TX ;V)J96-T<R!I<R!U;G-P96-I9FEE9`H ("` M87-S97)T*&EN9&5X3V8H:G-O;BP 8'LB:G-O;B(Z(FES(&=R96%T(BPB87)R M87DB.ELQ,BQN=6QL+'M]77U *2`A/2`M,2!\?`H ("` ("` ("` (&EN9&5X M3V8H:G-O;BP 8'LB87)R87DB.ELQ,BQN=6QL+'M]72PB:G-O;B(Z(FES(&=R M96%T(GU *2`A/2`M,2D["B` ("!J<V]N(#T <F5P;&%C92AJ<V]N+"! >R)J M<V]N(CHB:7, 9W)E870B+")A<G)A>2(Z6S$R+&YU;&PL>WU=?6`L(&![(F%R M<F%Y(CI;,3(L;G5L;"Q[?5TL(FIS;VXB.B)I<R!G<F5A=")]8"D["B` ("!A M("$]("TQ('Q\"B` ("` ("` ("` :6YD97A/9BAJ<V]N+"! >R)B(CHP+C`P M,3(L(F$B.C(S+C4T?6`I("$]("TQ*3L*("` ("!J<V]N(#T <F5P;&%C92AJ M<V]N+"! >R)B(CHP+C`P,3(L(F$B.C(S+C4T?6`L(&![(F$B.C(S+C4T+")B M(CHP+C`P,3)]8"D["B` ("` 87-S97)T*&IS;VX /3U >R)G;V]D8GEE(CI; M;R(Z>R)A<G)A>2(Z6S$R+&YU;&PL>WU=+")J<V]N(CHB:7, 9W)E870B?7U M('Q\"B` ("` ("` ("` (&IS;VX /3T 8'LB:&5L;&\B.GLB87)R87DB.ELQ M,BQN=6QL+'M]72PB:G-O;B(Z(FES(&=R96%T(GTL8`H ("` ("` ("` ("` M+'LB;F5S=&5D(CI[(F$B.C(S+C4T+")B(CHP+C`P,3)]?5U=?6`I.PI]" `` ` endUntil then, broken unit tests pass...Spot on. The unittest is broken. Not the thing it's testing and not foreach.
Jul 21 2010
Hello Mario,But then, JSON has a jew more unspecified gaps like "whitespace can be inserted between any pair of tokens".That can be dealt with by just being consistent.Shall we rely on the fact that the implementation currently does not insert whitespace between tokens?On the output side, why not? The unit test is for this implementation after all.And you have to provide opCmp in order to put your own keys into an associative array, so why don't you get them out in order?That's because AA's use closed hashing with tree used to deal with col3sions.However, let's get things done: can the attached unittest be an acceptatble replacement for the broken one of std.json?Move line 35 and 38 up one place each and you can drop the check for permutations. -- ... <IXOYE><
Jul 21 2010