digitalmars.D - dmd json file output

reply Walter Bright <newshound2 digitalmars.com> writes:
The current version is pretty verbose. For:

     int ***x;

it will emit as the type:

"type" : {
         "kind" : "pointer",
         "pretty" : "int***",
         "targetType" : {
                 "kind" : "pointer",
                 "pretty" : "int**",
                 "targetType" : {
                         "kind" : "pointer",
                         "pretty" : "int*",
                         "targetType" : {
                                 "kind" : "int",
                                 "pretty" : "int"
                         }
                 }
         }
}

I find this to be excessive, and it contributes to truly gigantic .json files.
I think it's better to just put out the deco for the type:

"type" : "PPPi"

But, you might say, that is not user friendly! Nope, it isn't. But the .json 
output is for a machine to read, not humans, and the deco types are very space 
efficient, and are trivial to convert to whatever data structure the reader 
needs. Much easier than the verbose thing.
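
As a rough illustration of how a reader might turn a deco back into a nested structure, here is a minimal sketch in Python (picked only because a consumer of the .json need not be written in D). It covers just the pointer prefix P and a few basic type codes from the type mangling table on the ABI page. The name decode and the output shape, which imitates the verbose form above, are assumptions made for this example and not something dmd provides.

```python
# Small subset of the basic type codes from the ABI page's type mangling table.
BASIC = {"v": "void", "b": "bool", "a": "char", "i": "int", "k": "uint",
         "l": "long", "m": "ulong", "f": "float", "d": "double"}


def decode(deco):
    """Turn a deco such as "PPPi" into a nested dict shaped like the verbose output."""
    if not deco:
        raise ValueError("empty deco")
    head, rest = deco[0], deco[1:]
    if head == "P":                      # P<type> means "pointer to <type>"
        target = decode(rest)
        return {"kind": "pointer",
                "pretty": target["pretty"] + "*",
                "targetType": target}
    if head in BASIC and not rest:       # a basic type code must end the string
        return {"kind": BASIC[head], "pretty": BASIC[head]}
    raise ValueError("unsupported deco: " + deco)


print(decode("PPPi")["pretty"])
```

A real reader would need the rest of the mangling grammar (arrays, functions, qualified names, modifiers), so this only shows the recursive shape of the conversion.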

What do you think?
Jan 20 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-01-21 08:27, Walter Bright wrote:
 The current version is pretty verbose. For:
 I find this to be excessive, and it helps to produce truly gigantic
 .json files. I think it's better to just put out the deco for the type:

 "type" : "PPPi"

 But, you might say, that is not user friendly! Nope, it isn't. But the
 .json output is for a machine to read, not humans, and the deco types
 are very space efficient, and are trivial to convert to whatever data
 structure the reader needs. Much easier than the verbose thing.

 What do you think?
Is there any documentation for these, or do we have to find it in the compiler sources?

--
/Jacob Carlborg
Jan 20 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/20/2013 11:42 PM, Jacob Carlborg wrote:
 Is there any documentation for these, or do we have to find it in the compiler
 sources?
The deco format (e.g. PPPi) is documented on the ABI page.
Jan 21 2013
prev sibling next sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
I think there is no problem.

Kenji Hara


Jan 20 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/20/2013 11:50 PM, kenji hara wrote:
 I think there is no problem.
No problem with which scheme?
Jan 21 2013
parent kenji hara <k.hara.pg gmail.com> writes:
Changing the output data to the mangled name is no problem. It provides enough
information for a machine reader.

Kenji Hara


Jan 21 2013
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Sun, 20 Jan 2013 23:27:57 -0800
schrieb Walter Bright <newshound2 digitalmars.com>:

 
 I find this to be excessive, and it helps to produce truly
 gigantic .json files. I think it's better to just put out the deco
 for the type:
 
 "type" : "PPPi"
 
 But, you might say, that is not user friendly! Nope, it isn't. But
 the .json output is for a machine to read, not humans, and the deco
 types are very space efficient, and are trivial to convert to
 whatever data structure the reader needs. Much easier than the
 verbose thing.
 
 What do you think?
How about compressing the json file (lzma)? It should be just as space efficient, can easily be translated to user-readable output (uncompress), and is also trivial for machines to read. It also compresses the whitespace characters and other text.

https://github.com/D-Programming-Deimos/liblzma
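
To get a feel for the compression idea, here is a small sketch using the lzma module from Python's standard library rather than the D binding linked above. The payload is a hypothetical stand-in for compiler output: the verbose entry for int ***x repeated under invented member names, so the sizes it prints say nothing about real dmd output.

```python
import json
import lzma

# Verbose type entry for `int ***x`, as shown at the start of the thread.
verbose_type = {
    "kind": "pointer", "pretty": "int***",
    "targetType": {
        "kind": "pointer", "pretty": "int**",
        "targetType": {
            "kind": "pointer", "pretty": "int*",
            "targetType": {"kind": "int", "pretty": "int"},
        },
    },
}

# Hypothetical file content: many declarations sharing the same verbose type.
members = [{"name": "x%d" % n, "kind": "variable", "type": verbose_type}
           for n in range(1000)]
raw = json.dumps(members, indent=4).encode("utf-8")

packed = lzma.compress(raw)          # writes an .xz stream by default
restored = lzma.decompress(packed)   # a reader gets the original bytes back

print("raw bytes:", len(raw), "compressed bytes:", len(packed))
```

Compression shrinks the file on disk, but a reader still has to parse the full verbose tree after decompressing it.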
Jan 21 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 1/21/13, Walter Bright <newshound2 digitalmars.com> wrote:
 I think it's better to just put out the deco for the type:

 "type" : "PPPi"
It seems the simplest to implement. And core.demangle can be used to get a string representation, which could eliminate the need for the 'pretty' field?

FWIW the way this is done for C++ typeinfo in .xml files is:

    <Variable id="_4" name="foo" type="_3">
    <PointerType id="_3" type="_2" size="32" align="32"/>
    <PointerType id="_2" type="_1" size="32" align="32"/>
    <PointerType id="_1" type="_0" size="32" align="32"/>
    <FundamentalType id="_0" name="int" size="32" align="32"/>

And then another variable such as PPi would have the type field set to _2. But it would probably be overkill to try to do this for Json right now, PPPi is a simple solution.
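
The id-reference scheme described here could be sketched as a small interning table. This is a Python illustration only: the function name intern_type, the "fundamental" and "pointer" kinds, and the restriction to pointer decos over three basic types are all assumptions made for the example.

```python
# Tiny subset of basic type codes from the ABI page, enough for the example.
BASIC = {"i": "int", "a": "char", "v": "void"}


def intern_type(deco, table):
    """Return the id for `deco`, registering it and its pointee types on first use."""
    if deco in table:
        return table[deco]["id"]
    if deco.startswith("P"):
        target_id = intern_type(deco[1:], table)   # register the pointee first
        entry = {"kind": "pointer", "type": target_id}
    elif deco in BASIC:
        entry = {"kind": "fundamental", "name": BASIC[deco]}
    else:
        raise ValueError("unsupported deco: " + deco)
    entry["id"] = "_%d" % len(table)               # ids are handed out in order
    table[deco] = entry
    return entry["id"]


table = {}
foo = {"name": "foo", "type": intern_type("PPPi", table)}   # int***
bar = {"name": "bar", "type": intern_type("PPi", table)}    # int**, already known
print(foo, bar, len(table))
```

The second variable adds nothing to the table, since int** was already registered while int*** was being interned.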
Jan 21 2013
prev sibling next sibling parent reply ric <negerns gmail.com> writes:
Would it be reasonable to put an option whether to produce the (too) 
verbose json output or the minimal one?
Jan 21 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too) verbose
 json output or the minimal one?
I'd rather we make a decision.
Jan 21 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/22/13 2:48 AM, Walter Bright wrote:
 On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too)
 verbose
 json output or the minimal one?
I'd rather we make a decision.
Verbose should probably be it.

Andrei
Jan 22 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:
 On 1/22/13 2:48 AM, Walter Bright wrote:
 On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too)
 verbose
 json output or the minimal one?
I'd rather we make a decision.
Verbose should probably be it.
Rationale?
Jan 22 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/22/13 3:36 PM, Walter Bright wrote:
 On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:
 On 1/22/13 2:48 AM, Walter Bright wrote:
 On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too)
 verbose
 json output or the minimal one?
I'd rather we make a decision.
Verbose should probably be it.
Rationale?
You can always filter out the verboseness with a simple program, but you can't add missing information. If the efficiency of generating json ever comes up, _then_ it's worth looking into an option that produces less verbose output directly. For now be verbose and let downstream tools filter it out.

Andrei
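
A downstream filter of the kind meant here might look like the following Python sketch. It assumes type descriptions sit under a "type" key and carry a "pretty" field, as in the example at the start of the thread; the function name compact and the surrounding entry for x are hypothetical.

```python
def compact(node):
    """Replace every verbose "type" object with just its "pretty" string."""
    if isinstance(node, list):
        return [compact(item) for item in node]
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "type" and isinstance(value, dict) and "pretty" in value:
                out[key] = value["pretty"]     # drop the nested description
            else:
                out[key] = compact(value)      # keep walking everything else
        return out
    return node                                # strings, numbers, etc. pass through


# Hypothetical member entry wrapping the verbose type for `int ***x`.
entry = {
    "name": "x",
    "kind": "variable",
    "type": {
        "kind": "pointer", "pretty": "int***",
        "targetType": {
            "kind": "pointer", "pretty": "int**",
            "targetType": {
                "kind": "pointer", "pretty": "int*",
                "targetType": {"kind": "int", "pretty": "int"},
            },
        },
    },
}

print(compact(entry))
```

The filter only discards data, which is the asymmetry the argument rests on: going from the compact form back to the verbose one needs a parser for the type string.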
Jan 22 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2013 9:42 PM, Andrei Alexandrescu wrote:
 On 1/22/13 3:36 PM, Walter Bright wrote:
 On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:
 On 1/22/13 2:48 AM, Walter Bright wrote:
 On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too)
 verbose
 json output or the minimal one?
I'd rather we make a decision.
Verbose should probably be it.
Rationale?
You can always filter out the verboseness with a simple program, but you can't add missing information. If the efficiency of generating json ever comes up, _then_ it's worth looking into an option that produces less verbose output directly. For now be verbose and let downstream tools filter it out.
Using the deco string is not missing information - and it's easier to parse it and manipulate it.
Jan 22 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-01-23 06:45, Walter Bright wrote:

 Using the deco string is not missing information - and it's easier to
 parse it and manipulate it.
I vote for the suggestion by Rainer:

 "type" : {
          "mangled" : "PPPi",
          "pretty" : "int***"
 }

--
/Jacob Carlborg
Jan 22 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 23.01.2013 06:42, Andrei Alexandrescu wrote:
 On 1/22/13 3:36 PM, Walter Bright wrote:
 On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:
 On 1/22/13 2:48 AM, Walter Bright wrote:
 On 1/21/2013 10:56 PM, ric wrote:
 Would it be reasonable to put an option whether to produce the (too)
 verbose
 json output or the minimal one?
I'd rather we make a decision.
Verbose should probably be it.
Rationale?
You can always filter out the verboseness with a simple program, but you can't add missing information. If the efficiency of generating json ever comes up, _then_ it's worth looking into an option that produces less verbose output directly. For now be verbose and let downstream tools filter it out. Andrei
I updated dmd from github and had a look at the current json output: it's horrible. Below is a random example of a simple function.

- the function parameters are listed three times with different type information
- "originalType" seems to be always shown, even though it was probably meant to appear only if it differs from "type"
- if the parameter identifiers are listed separately anyway, they should not be part of the type, while the types do not have to be repeated in the actual parameter list
- package and module are specified inconsistently, sometimes as an array of strings, sometimes in dot-notation, sometimes not at all
- types are sometimes shown expanded, sometimes not (e.g. "string")
- template instantiations from imported source files are listed
- functions and template instantiations that are only used at compile time are listed
- I appreciate that some missing information has been added, like imports and storage class
- renamed imports don't show the original module name
- functions implemented through template mixins are not listed
- surprisingly the average output has only become about 10 times larger for a medium sized project like Visual D (73 MB instead of 8 MB). Having only std.json available for reading it, I suspect it will definitely have an impact on IDE performance, though.

I understand that most of these issues are QOI issues, but it also seems that there is a shift in the target usage of the JSON output. It was a means for source code browsing with output similar to generated di files, while it is now showing everything written into object files, similar to debug info. Some of this can easily be filtered out (e.g. "template instance") but not all (e.g. functions from other modules only used in CTFE).

So I think that we should remove excessive bloat (e.g. always specify package and module lists in dot notation), make output more consistent and avoid listing the same type again and again.

If a type is specified by its mangled name in declarations, add it to a dictionary at the end of the json file in its full verbosity. (I agree core.demangle does not help you if you want to do anything more than just getting the pretty type string). Please be aware that you will have to document the JSON type format in addition to the existing name mangling, though.

Rainer

--------
JSON output for

    void setAttribute(Element elem, string attr, string val);

dmd 2.061:

{
    "name" : "setAttribute",
    "kind" : "function",
    "protection" : "public",
    "type" : "void(Element elem, string attr, string val)",
    "line" : 37
},

dmd 2.062alpha:

{
    "name" : "setAttribute",
    "kind" : "function",
    "loc" : { "line" : 37 },
    "module" : {
        "name" : "xmlwrap",
        "kind" : "module",
        "package" : [ "visuald" ],
        "prettyName" : "visuald.xmlwrap"
    },
    "type" : {
        "kind" : "function",
        "pretty" : "void(Element elem, string attr, string val)",
        "returnType" : { "kind" : "void", "pretty" : "void" },
        "parameters" : [
            {
                "name" : "elem",
                "type" : { "kind" : "class", "pretty" : "std.xml.Element" }
            },
            {
                "name" : "attr",
                "type" : {
                    "kind" : "darray",
                    "pretty" : "string",
                    "elementType" : {
                        "kind" : "char",
                        "pretty" : "immutable(char)",
                        "modifiers" : " immutable"
                    }
                }
            },
            {
                "name" : "val",
                "type" : {
                    "kind" : "darray",
                    "pretty" : "string",
                    "elementType" : {
                        "kind" : "char",
                        "pretty" : "immutable(char)",
                        "modifiers" : " immutable"
                    }
                }
            }
        ]
    },
    "originalType" : {
        "kind" : "function",
        "pretty" : "void(Element elem, string attr, string val)",
        "returnType" : { "kind" : "void", "pretty" : "void" },
        "parameters" : [
            {
                "name" : "elem",
                "type" : {
                    "kind" : "identifier",
                    "pretty" : "Element",
                    "idents" : [],
                    "rawIdentifier" : "Element",
                    "identifier" : "Element"
                }
            },
            {
                "name" : "attr",
                "type" : {
                    "kind" : "identifier",
                    "pretty" : "string",
                    "idents" : [],
                    "rawIdentifier" : "string",
                    "identifier" : "string"
                }
            },
            {
                "name" : "val",
                "type" : {
                    "kind" : "identifier",
                    "pretty" : "string",
                    "idents" : [],
                    "rawIdentifier" : "string",
                    "identifier" : "string"
                }
            }
        ]
    },
    "parameters" : [
        {
            "name" : "elem",
            "kind" : "variable",
            "loc" : { "line" : 37 },
            "module" : {
                "name" : "xmlwrap",
                "kind" : "module",
                "package" : [ "visuald" ],
                "prettyName" : "visuald.xmlwrap"
            },
            "type" : { "kind" : "class", "pretty" : "std.xml.Element" }
        },
        {
            "name" : "attr",
            "kind" : "variable",
            "loc" : { "line" : 37 },
            "module" : {
                "name" : "xmlwrap",
                "kind" : "module",
                "package" : [ "visuald" ],
                "prettyName" : "visuald.xmlwrap"
            },
            "type" : {
                "kind" : "darray",
                "pretty" : "string",
                "elementType" : {
                    "kind" : "char",
                    "pretty" : "immutable(char)",
                    "modifiers" : " immutable"
                }
            }
        },
        {
            "name" : "val",
            "kind" : "variable",
            "loc" : { "line" : 37 },
            "module" : {
                "name" : "xmlwrap",
                "kind" : "module",
                "package" : [ "visuald" ],
                "prettyName" : "visuald.xmlwrap"
            },
            "type" : {
                "kind" : "darray",
                "pretty" : "string",
                "elementType" : {
                    "kind" : "char",
                    "pretty" : "immutable(char)",
                    "modifiers" : " immutable"
                }
            }
        }
    ],
    "endloc" : { "line" : 40 }
},
Jan 26 2013
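
Rainer's suggestion of referring to types by mangled name and listing each full description once in a dictionary could be sketched as below. This is a Python illustration under several assumptions: the input entries carry a hypothetical "deco" field next to the verbose type, the output keys "members" and "types" are invented, and the decos shown (Aya for string, C3std3xml7Element for the class) follow the mangling rules on the ABI page rather than being copied from dmd output.

```python
import json


def hoist_types(declarations):
    """Swap each verbose type for its mangled name and collect the verbose forms once."""
    types = {}
    slim = []
    for decl in declarations:
        deco = decl["deco"]
        types.setdefault(deco, decl["type"])   # keep only the first verbose copy
        slim.append({"name": decl["name"], "type": deco})
    return {"members": slim, "types": types}


# Verbose description of `string`, as in the 2.062alpha sample.
string_type = {
    "kind": "darray", "pretty": "string",
    "elementType": {"kind": "char", "pretty": "immutable(char)",
                    "modifiers": " immutable"},
}

params = [
    {"name": "elem", "deco": "C3std3xml7Element",
     "type": {"kind": "class", "pretty": "std.xml.Element"}},
    {"name": "attr", "deco": "Aya", "type": string_type},
    {"name": "val", "deco": "Aya", "type": string_type},
]

result = hoist_types(params)
print(json.dumps(result, indent=4))
```

With this layout the description of string appears once however many parameters use it, and a reader that only needs the mangled name can ignore the dictionary entirely.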
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/26/2013 2:25 AM, Rainer Schuetze wrote:
 I updated dmd from github and had a look at the current json output: it's
 horrible. Below is a random example of a simple function.
Yeah, it's pretty bad.
Jan 26 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 21.01.2013 08:27, Walter Bright wrote:
 The current version is pretty verbose. For:

      int ***x;

 it will emit as the type:

 "type" : {
          "kind" : "pointer",
          "pretty" : "int***",
          "targetType" : {
                  "kind" : "pointer",
                  "pretty" : "int**",
                  "targetType" : {
                          "kind" : "pointer",
                          "pretty" : "int*",
                          "targetType" : {
                                  "kind" : "int",
                                  "pretty" : "int"
                          }
                  }
          }
 }

 I find this to be excessive, and it helps to produce truly gigantic
 .json files. I think it's better to just put out the deco for the type:

 "type" : "PPPi"

 But, you might say, that is not user friendly! Nope, it isn't. But the
 .json output is for a machine to read, not humans, and the deco types
 are very space efficient, and are trivial to convert to whatever data
 structure the reader needs. Much easier than the verbose thing.

 What do you think?
I agree the verbose output is overkill.

Considering that the demangling in druntime still has a number of open issues (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=3034, http://d.puremagic.com/issues/show_bug.cgi?id=6045) and that there are ambiguities in the name mangling (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=5957, http://d.puremagic.com/issues/show_bug.cgi?id=4268), my first reaction was that it might be better to provide a function to parse the pretty type. It is not too difficult and would be a nice start for the lexer/parser topic, but might be burdened with new bugs.

Considering function types, the deco does not contain any function argument identifiers anymore, but these are very useful for tooltips in an IDE like Visual D.

As a compromise, the type could just contain the mangled and the pretty name:

 "type" : {
          "mangled" : "PPPi",
          "pretty" : "int***"
 }
Jan 22 2013
next sibling parent Sönke Ludwig <sludwig outerproduct.org> writes:
Am 22.01.2013 09:02, schrieb Rainer Schuetze:
 
 Considering function types, the deco does not contain any function argument
identifiers anymore, but
 these are very useful for tooltips in an IDE like Visual D.
 
I thought so, too. But considering that types are always subject to this problem:

---
alias StateCallback = void function(int state);
static assert(StateCallback.stringof == "void function(int state)");

alias IndexCallback = void function(int index);
static assert(IndexCallback.stringof == "void function(int state)"); // still "state"
---

... it may be better to not even make it possible to fall into this trap by excluding them. Except if I'm wrong and the JSON output happens at an earlier stage where the parameter name information is still tagged to the declaration, of course.
Jan 22 2013
prev sibling parent reply "Tove" <tove fransson.se> writes:
On Tuesday, 22 January 2013 at 08:02:26 UTC, Rainer Schuetze 
wrote:
 "type" : {
          "mangled" : "PPPi",
          "pretty" : "int***",
 }
I would favour plain "type" : "int***".

Tools written in Java and other languages, and generic tools in particular, may handle json from multiple languages, and in that context they have no reason to deal with a different mangling scheme for each language. "int***" is both compact and easy enough to parse anyway.

Even for pure D-based tools, it could be useful for unit tests to have the pretty name to compare against, so Rainer's proposal is a reasonable compromise.
Jan 22 2013
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
Am 22.01.2013 18:05, schrieb Tove:
 (...)
 
 "int***" is both compact and easy enough to parse anyway.
 
Consider "int[4u] delegate(scope float*[void function(scope int)] p1, Rebindable!(const(C))*[]* b)"

There is actually quite a lot to parse in human-readable type strings; I even remember some expressions. And parsing this is at least as language specific as the mangled name. But I agree that having both should be a good compromise.
Jan 22 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-01-22 20:53, Sönke Ludwig wrote:

 Consider "int[4u] delegate(scope float*[void function(scope int)] p1,
Rebindable!(const(C))*[]* b)"

 There are actually quite some things to parse in human readable type strings,
I even remember some
 expressions. And parsing this is at least as language specific as the mangled
name. But I agree that
 having both should be a good compromise.
This wouldn't be fun to parse. It basically requires a front end.

--
/Jacob Carlborg
Jan 22 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/23/13 2:41 AM, Jacob Carlborg wrote:
 On 2013-01-22 20:53, Sönke Ludwig wrote:

 Consider "int[4u] delegate(scope float*[void function(scope int)] p1,
 Rebindable!(const(C))*[]* b)"

 There are actually quite some things to parse in human readable type
 strings, I even remember some
 expressions. And parsing this is at least as language specific as the
 mangled name. But I agree that
 having both should be a good compromise.
This wouldn't be fun to parse. It basically requires a front end.
If we need a secondary parser to slice and dice the json output, we failed to produce good json output.

Andrei
Jan 22 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-01-23 08:58, Andrei Alexandrescu wrote:

 If we need a secondary parser to slice and dice the json output, we
 failed producing good json output.
That's what I'm saying. Just use what Rainer suggested:

 "type" : {
          "mangled" : "PPPi",
          "pretty" : "int***"
 }

--
/Jacob Carlborg
Jan 23 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/23/13 11:07 AM, Jacob Carlborg wrote:
 On 2013-01-23 08:58, Andrei Alexandrescu wrote:

 If we need a secondary parser to slice and dice the json output, we
 failed producing good json output.
That's what I'm saying. Just use what Rainer suggested: "type" : { "mangled" : "PPPi", "pretty" : "int***", }
Yes please. Err on the side of verboseness as long as filtering out the unnecessary output is easy.

Andrei
Jan 23 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/23/2013 05:07 PM, Jacob Carlborg wrote:
 On 2013-01-23 08:58, Andrei Alexandrescu wrote:

 If we need a secondary parser to slice and dice the json output, we
 failed producing good json output.
That's what I'm saying. Just use what Rainer suggested: "type" : { "mangled" : "PPPi", "pretty" : "int***", }
That still requires at least one of two secondary parsers.
Jan 23 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-01-23 18:09, Timon Gehr wrote:

 That still requires at least one of two secondary parsers.
Technically yes, but there's already a demangler available in Phobos/druntime.

--
/Jacob Carlborg
Jan 23 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2013 12:02 PM, Jacob Carlborg wrote:
 On 2013-01-23 18:09, Timon Gehr wrote:

 That still requires at least one of two secondary parsers.
Technically yes, but there's already a demangler available in Phobos/druntime.
Yup. The "pretty" attribute is completely redundant.
Jan 23 2013
prev sibling parent reply "Nathan M. Swan" <nathanmswan gmail.com> writes:
On Wednesday, 23 January 2013 at 20:02:36 UTC, Jacob Carlborg 
wrote:
 On 2013-01-23 18:09, Timon Gehr wrote:

 That still requires at least one of two secondary parsers.
Technically yes, but there's already a demangler available in Phobos/druntime.
Not every program using the json output will be in D (especially IDEs).

NMS
Jan 23 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-01-23 22:38, Nathan M. Swan wrote:

 Not every program using the json output will be in D (especially IDEs).
That's a good point. I don't think it hurts to have both. It's still a lot less code/text than the current format.

--
/Jacob Carlborg
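
For a rough sense of the sizes being compared in this thread, the three candidate encodings of the type of int ***x can be measured directly. This Python sketch serializes each form without whitespace, so the counts cover only the type value itself, not a real dmd output file; the labels are made up for the example.

```python
import json

# The three forms discussed in the thread for the type of `int ***x`.
verbose = {
    "kind": "pointer", "pretty": "int***",
    "targetType": {
        "kind": "pointer", "pretty": "int**",
        "targetType": {
            "kind": "pointer", "pretty": "int*",
            "targetType": {"kind": "int", "pretty": "int"},
        },
    },
}
both = {"mangled": "PPPi", "pretty": "int***"}
deco_only = "PPPi"

# Character count of each form when written as compact JSON.
sizes = {
    label: len(json.dumps(value, separators=(",", ":")))
    for label, value in (("verbose", verbose),
                         ("mangled + pretty", both),
                         ("deco only", deco_only))
}

for label, size in sizes.items():
    print(label, size)
```

The gap between the verbose form and the other two widens with each level of indirection, because every level repeats the pretty string of everything beneath it.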
Jan 23 2013