digitalmars.D - Associative array printing problems

bearophile <bearophileHUGS lycos.com> writes:
Printing associative arrays in a decent way is not a luxury, it's a basic skill
I expect D writeln to have.

This is reduced from a program related to the coding Kata Nineteen, Word
Chains. The problem asks to create the longest chain of words. This function
creates an associative array where the keys are the start chars of words, and
the values are sets of words. For simplicity in D2 I have implemented the
string sets as bool[string].


import std.stdio, std.string;

bool[string][char] foo(string[] names) {
    typeof(return) result;
    foreach (name; names)
        result[name[0]][name] = true;
    return result;
}

auto names = "mary patricia linda barbara elizabeth jennifer
    maria susan margaret dorothy lisa nancy karen betty helen
    sandra donna carol ruth sharon michelle laura sarah
    kimberly deborah jessica shirley cynthia angela melissa
    brenda amy anna rebecca virginia kathleen pamela";

void main() {
    writeln(foo(names.split()));
}


The original complete program didn't need to print this result, but I
introduced a bug there, so I had to print the result, as I do in this
reduced program. This is the printout:


p:patricia:true pamela:true l:linda:true lisa:true laura:true d:dorothy:true
donna:true deborah:true h:helen:true m:melissa:true margaret:true michelle:true
maria:true mary:true e:elizabeth:true a:angela:true anna:true amy:true
b:barbara:true betty:true brenda:true j:jessica:true jennifer:true n:nancy:true
r:ruth:true rebecca:true v:virginia:true s:sharon:true susan:true shirley:true
sandra:true sarah:true k:kathleen:true karen:true kimberly:true c:cynthia:true
carol:true


For me this is very bad: I can't read it well. Nested associative arrays
produce hard-to-read output, because nothing marks where each inner array
starts and ends.

To better show what I mean, this is a Python 2.6 translation (here I have used
true sets, which are built in, but the situation doesn't change much):

from collections import defaultdict

def foo(names):
    result = defaultdict(set)
    for name in names:
        result[name[0]].add(name)
    return result

names = """mary patricia linda barbara elizabeth jennifer
    maria susan margaret dorothy lisa nancy karen betty helen
    sandra donna carol ruth sharon michelle laura sarah
    kimberly deborah jessica shirley cynthia angela melissa
    brenda amy anna rebecca virginia kathleen pamela"""

print foo(names.split())


Its textual output allows me to tell apart sub-dictionaries:

defaultdict(<type 'set'>, {'a': set(['amy', 'anna', 'angela']), 'c':
set(['carol', 'cynthia']), 'b': set(['barbara', 'betty', 'brenda']), 'e':
set(['elizabeth']), 'd': set(['dorothy', 'donna', 'deborah']), 'h':
set(['helen']), 'k': set(['kathleen', 'kimberly', 'karen']), 'j':
set(['jessica', 'jennifer']), 'm': set(['margaret', 'melissa', 'michelle',
'mary', 'maria']), 'l': set(['laura', 'linda', 'lisa']), 'n': set(['nancy']),
'p': set(['pamela', 'patricia']), 's': set(['sarah', 'sharon', 'sandra',
'shirley', 'susan']), 'r': set(['ruth', 'rebecca']), 'v': set(['virginia'])})


Using pprint (pretty print) from the Python standard library improves it:

from pprint import pprint
pprint(dict(foo(names.split())))


{'a': set(['amy', 'angela', 'anna']),
 'b': set(['barbara', 'betty', 'brenda']),
 'c': set(['carol', 'cynthia']),
 'd': set(['deborah', 'donna', 'dorothy']),
 'e': set(['elizabeth']),
 'h': set(['helen']),
 'j': set(['jennifer', 'jessica']),
 'k': set(['karen', 'kathleen', 'kimberly']),
 'l': set(['laura', 'linda', 'lisa']),
 'm': set(['margaret', 'maria', 'mary', 'melissa', 'michelle']),
 'n': set(['nancy']),
 'p': set(['pamela', 'patricia']),
 'r': set(['rebecca', 'ruth']),
 's': set(['sandra', 'sarah', 'sharon', 'shirley', 'susan']),
 'v': set(['virginia'])}


This is even better:

{'a': {"amy", "angela", "anna"},
 'b': {"barbara", "betty", "brenda"},
 'c': {"carol", "cynthia"},
 'd': {"deborah", "donna", "dorothy"},
 'e': {"elizabeth"},
 'h': {"helen"},
 'j': {"jennifer", "jessica"},
 'k': {"karen", "kathleen", "kimberly"},
 'l': {"laura", "linda", "lisa"},
 'm': {"margaret", "maria", "mary", "melissa", "michelle"},
 'n': {"nancy"},
 'p': {"pamela", "patricia"},
 'r': {"rebecca", "ruth"},
 's': {"sandra", "sarah", "sharon", "shirley", "susan"},
 'v': {"virginia"}
}


If you want a more apples-to-apples comparison, this is Python code that uses
the same data structure as the D code:

from collections import defaultdict

def foo(names):
    result = defaultdict(dict)
    for name in names:
        result[name[0]][name] = True
    return result

names = """mary patricia linda barbara elizabeth jennifer
    maria susan margaret dorothy lisa nancy karen betty helen
    sandra donna carol ruth sharon michelle laura sarah
    kimberly deborah jessica shirley cynthia angela melissa
    brenda amy anna rebecca virginia kathleen pamela"""

print foo(names.split())


Its textual output:

defaultdict(<type 'dict'>, {'a': {'amy': True, 'anna': True, 'angela': True},
'c': {'carol': True, 'cynthia': True}, 'b': {'barbara': True, 'betty': True,
'brenda': True}, 'e': {'elizabeth': True}, 'd': {'dorothy': True, 'donna':
True, 'deborah': True}, 'h': {'helen': True}, 'k': {'kathleen': True,
'kimberly': True, 'karen': True}, 'j': {'jessica': True, 'jennifer': True},
'm': {'margaret': True, 'melissa': True, 'michelle': True, 'mary': True,
'maria': True}, 'l': {'laura': True, 'linda': True, 'lisa': True}, 'n':
{'nancy': True}, 'p': {'pamela': True, 'patricia': True}, 's': {'sarah': True,
'sharon': True, 'sandra': True, 'shirley': True, 'susan': True}, 'r': {'ruth':
True, 'rebecca': True}, 'v': {'virginia': True}})


Using pprint:

{'a': {'amy': True, 'angela': True, 'anna': True},
 'b': {'barbara': True, 'betty': True, 'brenda': True},
 'c': {'carol': True, 'cynthia': True},
 'd': {'deborah': True, 'donna': True, 'dorothy': True},
 'e': {'elizabeth': True},
 'h': {'helen': True},
 'j': {'jennifer': True, 'jessica': True},
 'k': {'karen': True, 'kathleen': True, 'kimberly': True},
 'l': {'laura': True, 'linda': True, 'lisa': True},
 'm': {'margaret': True,
       'maria': True,
       'mary': True,
       'melissa': True,
       'michelle': True},
 'n': {'nancy': True},
 'p': {'pamela': True, 'patricia': True},
 'r': {'rebecca': True, 'ruth': True},
 's': {'sandra': True,
       'sarah': True,
       'sharon': True,
       'shirley': True,
       'susan': True},
 'v': {'virginia': True}}


Even without pprint the printout of the default dict is usable for my debugging
because it allows me to tell apart the sub-dictionaries. Another help comes
from using "" and '' around chars and strings present inside collections.
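
A small sketch of why the quoting matters. It is written so it runs under Python 3 (the examples above are Python 2.6), and `unquoted` is a hypothetical helper made up only for this illustration; it imitates a printer that omits quotes. Without quotes, two different lists can print identically:

```python
def unquoted(items):
    # Hypothetical helper: join the items the way a printer without quoting would.
    return "[" + ", ".join(items) + "]"

a = ["x, y", "z"]     # two strings, the first one contains a comma
b = ["x", "y", "z"]   # three strings

print(unquoted(a))    # [x, y, z]
print(unquoted(b))    # [x, y, z]  (same text as the line above)
print(repr(a))        # ['x, y', 'z']
print(repr(b))        # ['x', 'y', 'z']
```

With the quotes in place the two printouts differ, so the element boundaries can be read back from the text.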


A prettyPrint() function in Phobos will help, but first in D2 I'd like
writeln() to print that D data structure more or less like this:

['a': ["amy": true, "anna": true, "angela": true], 'c': ["carol": true,
"cynthia": true], 'b': ["barbara": true, "betty": true, "brenda": true], 'e':
["elizabeth": true], 'd': ["dorothy": true, "donna": true, "deborah": true],
'h': ["helen": true], 'k': ["kathleen": true, "kimberly": true, "karen": true],
'j': ["jessica": true, "jennifer": true], 'm': ["margaret": true, "melissa":
true, "michelle": true, "mary": true, "maria": true], 'l': ["laura": true,
"linda": true, "lisa": true], 'n': ["nancy": true], 'p': ["pamela": true,
"patricia": true], 's': ["sarah": true, "sharon": true, "sandra": true,
"shirley": true, "susan": true], 'r': ["ruth": true, "rebecca": true], 'v':
["virginia": true]]


This allows me to use the printout for debugging, especially when I reduce
the number of names:

['a': ["amy": true, "anna": true], 'c': ["carol": true], 'b': ["barbara": true,
"betty": true], 'd': ["dorothy": true], 'k': ["kathleen": true, "karen": true],
's': ["sandra": true, "shirley": true], 'v': ["virginia": true]]
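
To make the wished-for format concrete, here is a rough sketch in Python (not D, and not how Phobos does or would do it) that renders a nested dict in the D-literal style shown above. The name `d_literal` is hypothetical. Treating one-character strings as D chars and sorting the keys are assumptions of this sketch; the printouts above are in hash order. It does not escape quotes inside strings.

```python
def d_literal(value):
    """Render nested dicts in a D-literal-like style (illustrative sketch only)."""
    if isinstance(value, bool):
        # bool is checked first so True/False print as D's true/false.
        return "true" if value else "false"
    if isinstance(value, dict):
        # Assumption: keys are sorted to get a stable, readable order.
        items = ", ".join(
            d_literal(k) + ": " + d_literal(v) for k, v in sorted(value.items())
        )
        return "[" + items + "]"
    if isinstance(value, str):
        # Assumption: a one-character string stands in for a D char.
        quote = "'" if len(value) == 1 else '"'
        return quote + value + quote
    return str(value)


def foo(names):
    # Same structure as the D code: first char -> {word: True}.
    result = {}
    for name in names:
        result.setdefault(name[0], {})[name] = True
    return result


print(d_literal(foo("amy anna carol barbara betty".split())))
# ['a': ["amy": true, "anna": true], 'b': ["barbara": true, "betty": true],
#  'c': ["carol": true]]   (printed on a single line)
```

The brackets and quotes are what make the nesting recoverable; the exact choice of delimiters matters less than having them at all.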

Bye,
bearophile
Jun 22 2011
KennyTM~ <kennytm gmail.com> writes:
On Jun 23, 11 02:45, bearophile wrote:
 Printing associative arrays in a decent way is not a luxury, it's a basic
skill I expect D writeln to have.

 This is reduced from a program related to the coding Kata Nineteen, Word
Chains. The problem asks to create the longest chain of words. This function
creates an associative array where the keys are the start chars of words, and
the values are sets of words. For simplicity in D2 I have implemented the
string sets as bool[string].

 Bye,
 bearophile

Basically this is a reiteration of http://d.puremagic.com/issues/show_bug.cgi?id=4605
Jun 22 2011
bearophile <bearophileHUGS lycos.com> writes:
KennyTM~:

 Basically this is a reiteration of 
 http://d.puremagic.com/issues/show_bug.cgi?id=4605

Thank you, I have added the note there. The notes show the problems when you
want to print nested associative arrays.

Bye,
bearophile
Jun 22 2011
KennyTM~ <kennytm gmail.com> writes:
On Jun 23, 11 02:45, bearophile wrote:
 Printing associative arrays in a decent way is not a luxury, it's a basic
skill I expect D writeln to have.

 This is reduced from a program related to the coding Kata Nineteen, Word
Chains. The problem asks to create the longest chain of words. This function
creates an associative array where the keys are the start chars of words, and
the values are sets of words. For simplicity in D2 I have implemented the
string sets as bool[string].


 import std.stdio, std.string;

 bool[string][char] foo(string[] names) {
      typeof(return) result;
      foreach (name; names)
          result[name[0]][name] = true;
      return result;
 }

 auto names = "mary patricia linda barbara elizabeth jennifer
      maria susan margaret dorothy lisa nancy karen betty helen
      sandra donna carol ruth sharon michelle laura sarah
      kimberly deborah jessica shirley cynthia angela melissa
      brenda amy anna rebecca virginia kathleen pamela";

 void main() {
      writeln(foo(names.split()));
 }


 The original complete program didn't have to print this result, but there I
have created a bug, so I have had to print result, as I have done in this
reduced program. This is the printout:


 p:patricia:true pamela:true l:linda:true lisa:true laura:true d:dorothy:true
donna:true deborah:true h:helen:true m:melissa:true margaret:true michelle:true
maria:true mary:true e:elizabeth:true a:angela:true anna:true amy:true
b:barbara:true betty:true brenda:true j:jessica:true jennifer:true n:nancy:true
r:ruth:true rebecca:true v:virginia:true s:sharon:true susan:true shirley:true
sandra:true sarah:true k:kathleen:true karen:true kimberly:true c:cynthia:true
carol:true

 Bye,
 bearophile

Workaround:

writeln(to!string(foo(names.split())));

this gives

[p:[patricia:true, pamela:true], l:[linda:true, lisa:true, laura:true],
d:[dorothy:true, donna:true, deborah:true], h:[helen:true], m:[melissa:true,
margaret:true, michelle:true, maria:true, mary:true], e:[elizabeth:true],
a:[angela:true, anna:true, amy:true], b:[barbara:true, betty:true,
brenda:true], j:[jessica:true, jennifer:true], n:[nancy:true], r:[ruth:true,
rebecca:true], v:[virginia:true], s:[sharon:true, susan:true, shirley:true,
sandra:true, sarah:true], k:[kathleen:true, karen:true, kimberly:true],
c:[cynthia:true, carol:true]]

Not the same as what you expect, but at least it's readable enough.

But a big problem is that std.stdio.writeln, std.format.formattedWrite and
std.conv.to all use different mechanisms to write a string, as illustrated in:

import std.stdio, std.string, std.conv;

void main() {
    int[int] x = [5:6,7:8];
    writeln(x);
    writeln(format("%s", x));
    writeln(to!string(x));
}

output:

5:6 7:8
[5:6,7:8]
[5:6, 7:8]
Jun 22 2011