
digitalmars.D.announce - Fastest JSON parser in the world is a D project

Marco Leise <Marco.Leise gmx.de> writes:
JSON parsing in D has come a long way, especially when you
look at it from the efficiency angle, as a popular benchmark
does that has been forked by well-known D contributors like
Martin Nowak or Sönke Ludwig.

The test is pretty simple: Parse a JSON object containing an
array of 1_000_000 3D coordinates in the range [0..1) and
average them.
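
In code, the task boils down to something like this (a hedged
sketch using std.json and an assumed input file name "1.json";
the actual benchmark sources live in the repository linked
further down):

    import std.file : readText;
    import std.json;
    import std.stdio : writefln;

    void main()
    {
        auto root = parseJSON(readText("1.json"));
        double x = 0, y = 0, z = 0;
        // .floating assumes the values are stored as JSON floats,
        // as they are in the benchmark's generated data.
        auto coords = root["coordinates"].array;
        foreach (c; coords)
        {
            x += c["x"].floating;
            y += c["y"].floating;
            z += c["z"].floating;
        }
        writefln("%s %s %s", x / coords.length, y / coords.length,
            z / coords.length);
    }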

The performance of std.json in parsing those was still horrible
in the DMD 2.066 days*:

DMD     : 41.44s,  934.9Mb
Gdc     : 29.64s,  929.7Mb
Python  : 12.30s, 1410.2Mb
Ruby    : 13.80s, 2101.2Mb

Then with 2.067 std.json got a major 3x speed improvement and
rivaled the popular dynamic languages Ruby and Python:

DMD     : 13.02s, 1324.2Mb

In the meantime several other D JSON libraries appeared with
varying focus on performance or API:

Medea         : 56.75s, 1753.6Mb  (GDC)
libdjson      : 24.47s, 1060.7Mb  (GDC)
stdx.data.json:  2.76s,  207.1Mb  (LDC)

Yep, that's right. stdx.data.json's pull parser finally beats
the dynamic languages with native efficiency. (I used the
default options here that provide you with an Exception and
line number on errors.)

A few days ago I decided to get some practical use out of my
pet project 'fast' by implementing a JSON parser myself, one
that could rival even the then-fastest JSON parser, RapidJSON.
The result can be seen in the benchmark right now:

https://github.com/kostya/benchmarks#json

fast:	   0.34s, 226.7Mb (GDC)
RapidJSON: 0.79s, 687.1Mb (GCC)

(* Timings from my computer, Haswell CPU, Linux amd64.)

--
Marco
Oct 14 2015
Marco Leise <Marco.Leise gmx.de> writes:
fast.json usage:

UTF-8 and JSON validation of used portions by default:

    auto json = parseJSONFile("data.json");

Known good file input:

    auto json = parseTrustedJSONFile("data.json");
    auto json = parseTrustedJSON(`{"x":123}`);

Work with a single key from an object:

    json.singleKey!"someKey"
    json.someKey

Iteration:

    foreach (key; json.byKey)  // object by key
    foreach (idx; json)        // array by index

Remap member names:

    @JsonRemap(["clazz", "class"])
    struct S { string clazz; }

    @JsonRemap(["clazz", "class"])
    enum E { clazz }

Example:

    double x = 0, y = 0, z = 0;
    auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);

    foreach (idx; json.coordinates)
    {
        // Provide one function for each key you are interested in
        json.keySwitch!("x", "y", "z")(
                { x += json.read!double; },
                { y += json.read!double; },
                { z += json.read!double; }
            );
    }

Features:
  - Loads double values in compliance with IEEE round-to-nearest
    (no precision loss in serialization->deserialization round trips)
  - UTF-8 validation of non-string input (file, ubyte[])
  - Currently fastest JSON parser money can buy
  - Reads strings, enums, integral types, double, bool, POD
    structs consisting of those and pointers to such structs
    (see the sketch below)
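
For illustration, here is a hedged sketch of that last item
(assuming read!T also accepts such a struct type, analogous to
the read!double calls above; Vec3 is a made-up name, not taken
from the documented API):

    struct Vec3 { double x, y, z; }

    auto json = parseTrustedJSON(`{"x": 0.1, "y": 0.2, "z": 0.3}`);
    auto v = json.read!Vec3;  // hypothetical: fills all three fields at once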

Shortcomings:
  - Rejects numbers with exponents of huge magnitude (>= 10^28)
  - Only works on Posix x86/amd64 systems
  - No write capabilities
  - Data size limited by available contiguous virtual memory

--
Marco
Oct 14 2015
Idan Arye <GenericNPC gmail.com> writes:
On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, 
 "y": 2, "z": 3 }, … ] }`);
I assume parseTrustedJSON is not validating? Did you use it in the benchmark? And were the competitors non-validating as well?
Oct 14 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 14 Oct 2015 07:55:18 +0000, Idan Arye <GenericNPC gmail.com> wrote:

 On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1,
 "y": 2, "z": 3 }, … ] }`);
 I assume parseTrustedJSON is not validating? Did you use it in
 the benchmark? And were the competitors non-validating as well?

That is correct. For the benchmark parseJSONFile was used
though, which validates UTF-8 and JSON in the used portions.
That probably renders your third question superfluous. I
wouldn't know anyway, but am inclined to think they all
validate the entire JSON, and some may skip UTF-8 validation,
which is a low-cost operation in this ASCII file anyway.

--
Marco
Oct 14 2015
Rory McGuire via Digitalmars-d-announce writes:
On Wed, Oct 14, 2015 at 9:35 AM, Marco Leise via Digitalmars-d-announce
<digitalmars-d-announce puremagic.com> wrote:

 Features:
   - Loads double values in compliance with IEEE round-to-nearest
     (no precision loss in serialization->deserialization round trips)
   - UTF-8 validation of non-string input (file, ubyte[])
   - Currently fastest JSON parser money can buy
   - Reads strings, enums, integral types, double, bool, POD
     structs consisting of those and pointers to such structs
Does this version handle real-world JSON?

I keep getting problems with vibe and JSON because web browsers
will automatically make a "1" into a 1, which then causes
exceptions in vibe.

Does yours do lossless conversions automatically?
Oct 14 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 14 Oct 2015 10:22:37 +0200, Rory McGuire via
Digitalmars-d-announce <digitalmars-d-announce puremagic.com> wrote:

 Does this version handle real-world JSON?

 I keep getting problems with vibe and JSON because web browsers will
 automatically make a "1" into a 1, which then causes exceptions in vibe.

 Does yours do lossless conversions automatically?

No, I don't read numbers as strings. Could the client JavaScript
be fixed? I fail to see why the conversion would happen
automatically when the code could explicitly check for strings
before doing math with the value "1". What do I miss?

--
Marco
Oct 14 2015
Rory McGuire via Digitalmars-d-announce writes:
In the browser, JSON.serialize is the usual way to serialize JSON
values. The problem is on the D side, if one does deserialization
into an object or struct: if the types inside the JSON don't match
exactly, then vibe freaks out.

Another problem with most D JSON implementations is that they don't
support proper JSON, e.g. outputting nan as though it were a valid
value etc... browsers don't like that stuff.

For me the best D JSON implementation at the moment is actually jsvar
by Adam, and it's not even a JSON parser, it just has that as a needed
feature. It feels slow, though I haven't benchmarked it, but if I run
it over a couple of gigs of data (paged by 1000) it takes a long while.



On Thu, Oct 15, 2015 at 3:42 AM, Marco Leise via Digitalmars-d-announce
<digitalmars-d-announce puremagic.com> wrote:

 On Wed, 14 Oct 2015 10:22:37 +0200, Rory McGuire via
 Digitalmars-d-announce <digitalmars-d-announce puremagic.com> wrote:

 Does this version handle real-world JSON?

 I keep getting problems with vibe and JSON because web browsers will
 automatically make a "1" into a 1, which then causes exceptions in vibe.

 Does yours do lossless conversions automatically?

 No, I don't read numbers as strings. Could the client JavaScript be
 fixed? I fail to see why the conversion would happen automatically
 when the code could explicitly check for strings before doing math
 with the value "1". What do I miss?

 --
 Marco
Oct 15 2015
Sönke Ludwig <sludwig rejectedsoftware.com> writes:
On 15.10.2015 at 13:06, Rory McGuire via Digitalmars-d-announce wrote:
 In the browser, JSON.serialize is the usual way to serialize JSON
 values. The problem is on the D side, if one does deserialization
 into an object or struct: if the types inside the JSON don't match
 exactly, then vibe freaks out.

For float and double fields, the serialization code should actually
accept both floating point and integer numbers:

https://github.com/rejectedsoftware/vibe.d/blob/2fffd94d8516cd6f81c75d45a54c655626d36c6b/source/vibe/data/json.d#L1603
https://github.com/rejectedsoftware/vibe.d/blob/2fffd94d8516cd6f81c75d45a54c655626d36c6b/source/vibe/data/json.d#L1804

Do you have a test case for your error?
Oct 15 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 15 Oct 2015 18:17:07 +0200, Sönke Ludwig
<sludwig rejectedsoftware.com> wrote:

 On 15.10.2015 at 13:06, Rory McGuire via Digitalmars-d-announce wrote:
 In the browser, JSON.serialize is the usual way to serialize JSON
 values. The problem is on the D side, if one does deserialization
 into an object or struct: if the types inside the JSON don't match
 exactly, then vibe freaks out.

 For float and double fields, the serialization code should actually
 accept both floating point and integer numbers:

 https://github.com/rejectedsoftware/vibe.d/blob/2fffd94d8516cd6f81c75d45a54c655626d36c6b/source/vibe/data/json.d#L1603
 https://github.com/rejectedsoftware/vibe.d/blob/2fffd94d8516cd6f81c75d45a54c655626d36c6b/source/vibe/data/json.d#L1804

 Do you have a test case for your error?

Well, it is not an error. Rory originally wrote about conversions
between "1" and 1 happening on the browser side. That would mean
adding a quirks mode to any well-behaving JSON parser. In this
case: "read numbers as strings". Hence I was asking if the data on
the client could be fixed, e.g. the JSON number be turned into a
string first before serialization.

--
Marco
Oct 16 2015
Sönke Ludwig <sludwig rejectedsoftware.com> writes:
On 16.10.2015 at 17:11, Marco Leise wrote:
 On Thu, 15 Oct 2015 18:17:07 +0200, Sönke Ludwig
 <sludwig rejectedsoftware.com> wrote:

 (...)
 Do you have a test case for your error?

 Well, it is not an error. Rory originally wrote about conversions
 between "1" and 1 happening on the browser side. That would mean
 adding a quirks mode to any well-behaving JSON parser. In this
 case: "read numbers as strings". Hence I was asking if the data on
 the client could be fixed, e.g. the JSON number be turned into a
 string first before serialization.

Okay, I obviously misread that as a once familiar issue. Maybe it
indeed makes sense to add a "JavaScript" quirks mode that behaves
exactly like a JavaScript interpreter would.
Oct 17 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 17 Oct 2015 09:27:46 +0200, Sönke Ludwig
<sludwig rejectedsoftware.com> wrote:

 On 16.10.2015 at 17:11, Marco Leise wrote:
 On Thu, 15 Oct 2015 18:17:07 +0200, Sönke Ludwig
 <sludwig rejectedsoftware.com> wrote:

 (...)
 Do you have a test case for your error?

 Well, it is not an error. Rory originally wrote about conversions
 between "1" and 1 happening on the browser side. That would mean
 adding a quirks mode to any well-behaving JSON parser. In this
 case: "read numbers as strings". Hence I was asking if the data on
 the client could be fixed, e.g. the JSON number be turned into a
 string first before serialization.

 Okay, I obviously misread that as a once familiar issue. Maybe it
 indeed makes sense to add a "JavaScript" quirks mode that behaves
 exactly like a JavaScript interpreter would.

Ok, but remember:
https://www.youtube.com/watch?v=20BySC_6HyY
And then think again. :D

--
Marco
Oct 17 2015
Sönke Ludwig <sludwig rejectedsoftware.com> writes:
On 17.10.2015 at 13:16, Marco Leise wrote:
 Am Sat, 17 Oct 2015 09:27:46 +0200
 schrieb Sönke Ludwig <sludwig rejectedsoftware.com>:
 Okay, I obviously misread that as a once familiar issue. Maybe it indeed
 makes sense to add a "JavaScript" quirks mode that behaves exactly like
 a JavaScript interpreter would.
 Ok, but remember:
 https://www.youtube.com/watch?v=20BySC_6HyY
 And then think again. :D
What about just naming it SerializationMode.WAT?
Oct 17 2015
Brad Anderson <eco gnuk.net> writes:
On Saturday, 17 October 2015 at 09:35:47 UTC, Sönke Ludwig wrote:
 Am 17.10.2015 um 13:16 schrieb Marco Leise:
 Am Sat, 17 Oct 2015 09:27:46 +0200
 schrieb Sönke Ludwig <sludwig rejectedsoftware.com>:
 Okay, I obviously misread that as a once familiar issue. 
 Maybe it indeed
 makes sense to add a "JavaScript" quirks mode that behaves 
 exactly like
 a JavaScript interpreter would.
Ok, but remember: https://www.youtube.com/watch?v=20BySC_6HyY And then think again. :D
What about just naming it SerializationMode.WAT?
At the very least that needs to be an undocumented alias easter egg. :)
Oct 17 2015
Martin Nowak <code dawg.eu> writes:
On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
   - Data size limited by available contiguous virtual memory
Mmapping files for sequential reading is a very debatable choice,
b/c the common use case is to read a file once. You should at
least compare the numbers w/ drop_caches between each run.

https://github.com/mleise/fast/blob/69923d5a69f67c21a37e5e2469fc34d60c9ec3e1/source/fast/json.d#L1441
Oct 17 2015
Daniel N <ufo orbiting.us> writes:
On Saturday, 17 October 2015 at 08:07:57 UTC, Martin Nowak wrote:
 On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise 
 wrote:
   - Data size limited by available contiguous virtual memory
 Mmapping files for sequential reading is a very debatable
 choice, b/c the common use case is to read a file once. You
 should at least compare the numbers w/ drop_caches between
 each run.
It's a sensible choice together with appropriate madvise().
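
A sketch of that combination on Posix (using druntime's
core.sys.posix bindings; error handling omitted, and not taken
from fast.json's actual sources):

    import core.sys.posix.fcntl : open, O_RDONLY;
    import core.sys.posix.sys.mman;
    import core.sys.posix.sys.stat : fstat, stat_t;
    import core.sys.posix.unistd : close;

    const(ubyte)[] mapFile(const(char)* path)
    {
        int fd = open(path, O_RDONLY);
        stat_t st;
        fstat(fd, &st);
        auto len = cast(size_t) st.st_size;
        void* p = mmap(null, len, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping stays valid after the fd is closed
        // Hint that we read front to back, so the kernel can read
        // ahead aggressively and evict pages behind the cursor.
        posix_madvise(p, len, POSIX_MADV_SEQUENTIAL);
        return (cast(const(ubyte)*) p)[0 .. len];
    }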
Oct 17 2015
Ola Fosheim Grøstad writes:
On Saturday, 17 October 2015 at 08:20:33 UTC, Daniel N wrote:
 On Saturday, 17 October 2015 at 08:07:57 UTC, Martin Nowak 
 wrote:
 On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise 
 wrote:
   - Data size limited by available contiguous virtual memory
 Mmapping files for sequential reading is a very debatable
 choice, b/c the common use case is to read a file once. You
 should at least compare the numbers w/ drop_caches between
 each run.
It's a sensible choice together with appropriate madvise().
Mmap is very expensive, as it affects all cores; you need a
realistic multithreaded async benchmark on smaller files to see
the effect.
Oct 17 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 17 Oct 2015 08:29:24 +0000, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 On Saturday, 17 October 2015 at 08:20:33 UTC, Daniel N wrote:
 On Saturday, 17 October 2015 at 08:07:57 UTC, Martin Nowak wrote:
 On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
   - Data size limited by available contiguous virtual memory

 Mmapping files for sequential reading is a very debatable
 choice, b/c the common use case is to read a file once. You
 should at least compare the numbers w/ drop_caches between
 each run.

The results are:
* The memory usage is then fixed at slightly more than the file
  size. (While it often stays below when using the disk cache.)
* It would still be faster than copying the whole thing to a
  separate memory block.
* Depending on whether the benchmark system uses an HDD or SSD,
  the numbers may be rendered meaningless by a 2 second wait on
  I/O.
* Common case yes, but it is possible that you read JSON that
  had just been saved.

 It's a sensible choice together with appropriate madvise().

Obviously agreed :). Just that in practice (on my HDD system) it
never made a difference in I/O-bound sequential reads. So I
removed posix_madvise.

 Mmap is very expensive, as it affects all cores; you need a
 realistic multithreaded async benchmark on smaller files to see
 the effect.

That's valuable information. It is trivial to read into an
allocated block when the file size is below a threshold. I would
just need a rough file size. Are you talking about 4K pages or
megabytes? 64 KiB maybe?

--
Marco
Oct 17 2015
Ola Fosheim Grøstad writes:
On Saturday, 17 October 2015 at 09:30:47 UTC, Marco Leise wrote:
 It is trivial to read into an allocated block when the file
 size is below a threshold. I would just need a rough file size.
 Are you talking about 4K pages or megabytes? 64 KiB maybe?

Maybe. I guess you could just focus on what you think are the
primary usage patterns for your library and benchmark those for
different parameters.

If you want to test processing of many small files combined with
computationally/memory intensive tasks, then you could try to
construct a simple benchmark where you iterate over memory
(M * L3 cache size) using a "realistic" pattern like Brownian
motion in N threads, and also repeatedly/concurrently load JSON
for different file sizes so that the CPU's page table mechanisms
are stressed by mmap, cache misses and (possibly) page faults.
Oct 17 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 17 Oct 2015 11:12:08 +0000, Ola Fosheim Grøstad
<ola.fosheim.grostad+dlang gmail.com> wrote:

 […] you could try to construct a simple benchmark where you
 iterate over memory (M * L3 cache size) using a "realistic"
 pattern like Brownian motion in N threads, and also
 repeatedly/concurrently load JSON for different file sizes so
 that the CPU's page table mechanisms are stressed by mmap,
 cache misses and (possibly) page faults.

O.O Are you kidding me? Just give me the correct value already.

--
Marco
Oct 17 2015
Ola Fosheim Grøstad writes:
On Saturday, 17 October 2015 at 13:09:45 UTC, Marco Leise wrote:
 Am Sat, 17 Oct 2015 11:12:08 +0000
 schrieb Ola Fosheim Grøstad
 <ola.fosheim.grostad+dlang gmail.com>:

 […] you could try to construct a simple benchmark where you
 iterate over memory (M * L3 cache size) using a "realistic"
 pattern like Brownian motion in N threads, and also
 repeatedly/concurrently load JSON for different file sizes so
 that the CPU's page table mechanisms are stressed by mmap,
 cache misses and (possibly) page faults.
O.O Are you kidding me? Just give me the correct value already.
:-P
Oct 17 2015
Nordlöw <per.nordlow gmail.com> writes:
On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
 Example:

     double x = 0, y = 0, z = 0;
     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, 
 "y": 2, "z": 3 }, … ] }`);

     foreach (idx; json.coordinates)
     {
         // Provide one function for each key you are interested 
 in
         json.keySwitch!("x", "y", "z")(
                 { x += json.read!double; },
                 { y += json.read!double; },
                 { z += json.read!double; }
             );
     }
How can `coordinates` member be known at compile-time when the input argument is a run-time string?
Oct 26 2015
wobbles <grogan.colin gmail.com> writes:
On Monday, 26 October 2015 at 20:04:33 UTC, Nordlöw wrote:
 On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise 
 wrote:
 Example:

     double x = 0, y = 0, z = 0;
     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, 
 "y": 2, "z": 3 }, … ] }`);

     foreach (idx; json.coordinates)
     {
         // Provide one function for each key you are 
 interested in
         json.keySwitch!("x", "y", "z")(
                 { x += json.read!double; },
                 { y += json.read!double; },
                 { z += json.read!double; }
             );
     }
How can `coordinates` member be known at compile-time when the input argument is a run-time string?
I suspect through the opDispatch operator overload. http://dlang.org/operatoroverloading.html#dispatch
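
A toy sketch of the mechanism (not fast.json's actual code): any
unknown member access is rewritten by the compiler into an
opDispatch instantiation, so the member name is a compile-time
string even though the JSON text itself is a run-time value.

    import std.stdio;

    struct Json
    {
        string text;  // run-time data

        auto opDispatch(string key)()
        {
            // `key` is a template argument, known at compile time;
            // the actual lookup in `text` would happen at run time.
            writeln("seeking key: ", key);
            return this;
        }
    }

    void main()
    {
        auto json = Json(`{"coordinates": []}`);
        json.coordinates;  // lowered to json.opDispatch!"coordinates"()
    }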
Oct 27 2015
Martin Nowak <code dawg.eu> writes:
On Tuesday, 27 October 2015 at 13:14:36 UTC, wobbles wrote:
 How can `coordinates` member be known at compile-time when the 
 input argument is a run-time string?
I suspect through the opDispatch operator overload. http://dlang.org/operatoroverloading.html#dispatch
Yikes, this is such an anti-pattern. https://github.com/rejectedsoftware/vibe.d/issues/634
Oct 27 2015
wobbles <grogan.colin gmail.com> writes:
On Tuesday, 27 October 2015 at 14:00:07 UTC, Martin Nowak wrote:
 On Tuesday, 27 October 2015 at 13:14:36 UTC, wobbles wrote:
 How can `coordinates` member be known at compile-time when 
 the input argument is a run-time string?
I suspect through the opDispatch operator overload. http://dlang.org/operatoroverloading.html#dispatch
Yikes, this is such an anti-pattern. https://github.com/rejectedsoftware/vibe.d/issues/634
Heh - yeah it is quite problematic. The only time I've needed to
use it was when I was reading in JSON with some structure like

    {
      [
        {
          "timestamp" : { ... timestamp info ... },
          "info1"  : { ... info ... },
          "info2"  : { ... info ... },
          ...
          "info23" : { ... info ... }
        },
        { < more of the above > }
      ]
    }

and I wanted to be able to get a Json[timestamp] map, where the
Json is either an info1, info2, etc. I didn't want to write 23
different functions "hash_info1", "hash_info2" etc. So,
opDispatch! Basically I wanted to hash the timestamp and some
data. My opDispatch became:

    @ignore
    auto opDispatch(string name)()
    {
        import std.algorithm.searching : startsWith;

        static assert(name.startsWith("hash_"),
            "Error, use StatHosts.hash_XYZ to gather XYZ[timestamp] info");
        static assert(name.length > 5);
        enum dataName = name[5 .. $];
        typeof(mixin("StatDetail." ~ dataName))[StatTimestampDetail] data;
        foreach (stat; statistics)
            data[stat.timestamp] = mixin("stat." ~ dataName);
        return data;
    }

23 functions merged into 1... The static assert reduces the
number of places it can break things at least; still some weird
things can happen, but for the most part it's ok.

So yes - opDispatch is cool but should be used VERY sparingly.
Oct 28 2015
wobbles <grogan.colin gmail.com> writes:
On Wednesday, 28 October 2015 at 11:26:59 UTC, wobbles wrote:
 So yes - opDispatch is cool but should be used VERY sparingly.
I just had a thought: I could check if dataName is in
[__traits(allMembers ... )]. That would at least ensure I'm
referencing something that exists.

Maybe that'd be useful in vibe's Bson/Json code. (Except the
opposite: you want to check that you're referencing something
that DOESN'T exist, so you can be sure it's not 'remove', for
example.)
Oct 28 2015
Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 28 October 2015 at 11:32:22 UTC, wobbles wrote:
 I just had a thought, I could check if dataName is in 
 [__traits(allMembers ... )]. That would at least ensure I'm 
 referencing something that exists.
If the body doesn't compile, the opDispatch acts as if it
doesn't exist anyway. (This makes debugging it a bit of a pain,
but also means you don't strictly need constraints on bodies
that don't work.)
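
A minimal sketch of that behavior (hypothetical names, just for
illustration):

    struct S
    {
        static struct Inner { int foo; }
        Inner inner;

        // Forwards member accesses to `inner`. If the mixin fails
        // to compile for a given name, the compiler acts as if
        // opDispatch doesn't exist for that name, instead of
        // showing the error from inside the body.
        auto opDispatch(string name)()
        {
            return mixin("inner." ~ name);
        }
    }

    void main()
    {
        S s;
        auto a = s.foo;     // fine: Inner.foo exists
        // auto b = s.bar;  // "no property bar for type S", not a mixin error
    }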
Oct 28 2015
Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 27 October 2015 at 14:00:07 UTC, Martin Nowak wrote:
 Yikes, this is such an anti-pattern.
 https://github.com/rejectedsoftware/vibe.d/issues/634
Every time I use opDispatch, I add an if(name != "popFront")
constraint, at least (unless it is supposed to be forwarding).
It helps with this a lot, and I think everyone should do it.
Oct 28 2015
Meta <jared771 gmail.com> writes:
On Wednesday, 28 October 2015 at 13:56:27 UTC, Adam D. Ruppe 
wrote:
 On Tuesday, 27 October 2015 at 14:00:07 UTC, Martin Nowak wrote:
 Yikes, this is such an anti-pattern.
 https://github.com/rejectedsoftware/vibe.d/issues/634
Every time I use opDispatch, I add an if(name != "popFront") constraint, at least (unless it is supposed to be forwarding). It helps with this a lot and think everyone should do it.
I would go even further and say that one should never define
opDispatch without a template constraint limiting which members
can be dispatched on. It may be a bit radical, but we could even
go as far as outright deprecating unconstrained opDispatch.

    // Okay
    auto opDispatch(string member)()
        if (member == "get" || isVectorSwizzle!member)
    {
        //...
    }

    // Deprecated
    auto opDispatch(string member)()
    {
        //...
    }
Oct 28 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Tue, 27 Oct 2015 14:00:06 +0000, Martin Nowak <code dawg.eu> wrote:

 On Tuesday, 27 October 2015 at 13:14:36 UTC, wobbles wrote:
 How can `coordinates` member be known at compile-time when the
 input argument is a run-time string?

 I suspect through the opDispatch operator overload.
 http://dlang.org/operatoroverloading.html#dispatch

 Yikes, this is such an anti-pattern.
 https://github.com/rejectedsoftware/vibe.d/issues/634

In my defense I can say that the JSON parser is not a range and
thus less likely to be used in UFCS chains. It can be replaced
with .singleKey!"coordinates"()

--
Marco
Oct 28 2015
mw <mingwu gmail.com> writes:
On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
 fast.json usage:

     auto json = parseTrustedJSON(`{"x":123}`);

 Work with a single key from an object:

     json.singleKey!"someKey"
     json.someKey
Newbie Q: how to get the value?

---------------------------
import std.stdio;
import fast.json;

void main()
{
    auto json = parseTrustedJSON(`{"x":123}`);
    writeln(json.x); // shall I get 123 here?
}
---------------------------

core.exception.AssertError /home/xxx/.dub/packages/fast-0.3.5/fast/source/fast/json.d(1208): Assertion failure

dmd v2.092.0
May 16 2020
Per Nordlöw <per.nordlow gmail.com> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 https://github.com/kostya/benchmarks#json
I can't find fast.json here. Where is it?
Oct 14 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 14 Oct 2015 08:19:52 +0000, Per Nordlöw
<per.nordlow gmail.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 https://github.com/kostya/benchmarks#json

 I can't find fast.json here. Where is it?

»»» D Gdc Fast    0.34    226.7
««« C++ Rapid     0.79    687.1

Granted, if he wrote "D fast.json" it would have been easier to
identify.

--
Marco
Oct 14 2015
Gary Willoughby <dev nomad.so> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:	   0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)

 (* Timings from my computer, Haswell CPU, Linux amd64.)
Where's the code?
Oct 15 2015
Daniel Kozak via Digitalmars-d-announce writes:
On Thu, Oct 15, 2015 at 10:08, Gary Willoughby via Digitalmars-d-announce
<digitalmars-d-announce puremagic.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:      0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)

 (* Timings from my computer, Haswell CPU, Linux amd64.)

 Where's the code?

code.dlang.org
Oct 15 2015
Daniel Kozak <kozzi dlang.cz> writes:
On Thu, 15 Oct 2015 at 11:07 +0200, Daniel Kozak via
Digitalmars-d-announce wrote:

 On Thu, Oct 15, 2015 at 10:08, Gary Willoughby via
 Digitalmars-d-announce <digitalmars-d-announce puremagic.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:      0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)

 (* Timings from my computer, Haswell CPU, Linux amd64.)

 Where's the code?

 code.dlang.org

https://github.com/mleise/fast
Oct 15 2015
Johannes Pfau <nospam example.com> writes:
On Thu, 15 Oct 2015 11:09:01 +0200, Daniel Kozak <kozzi dlang.cz> wrote:

 On Thu, 15 Oct 2015 at 11:07 +0200, Daniel Kozak via
 Digitalmars-d-announce wrote:

 On Thu, Oct 15, 2015 at 10:08, Gary Willoughby via
 Digitalmars-d-announce <digitalmars-d-announce puremagic.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:      0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)

 (* Timings from my computer, Haswell CPU, Linux amd64.)

 Where's the code?

 code.dlang.org

 https://github.com/mleise/fast

BTW: Is there a reason why the code is GPL licensed? I understand that people might want to use more restrictive licenses, but isn't LGPL a better replacement for GPL when writing library code? Doesn't the GPL force everybody _using_ fast.json to also use the GPL license? See: http://stackoverflow.com/a/10179181/471401
Oct 15 2015
Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 15 October 2015 at 12:51:58 UTC, Johannes Pfau wrote:
 BTW: Is there a reason why the code is GPL licensed? I 
 understand that people might want to use more restrictive 
 licenses, but isn't LGPL a better replacement for GPL when 
 writing library code? Doesn't the GPL force everybody _using_ 
 fast.json to also use the GPL license?

 See: http://stackoverflow.com/a/10179181/471401
It also precludes any of this code being used in Phobos :/
Oct 15 2015
Jacob Carlborg <doob me.com> writes:
On 2015-10-15 14:51, Johannes Pfau wrote:

 Doesn't the GPL force everybody _using_ fast.json to also use the GPL license?
Yes, it does have that enforcement.

--
/Jacob Carlborg
Oct 15 2015
Jack Stouffer <jack jackstouffer.com> writes:
On Thursday, 15 October 2015 at 19:40:16 UTC, Jacob Carlborg 
wrote:
 On 2015-10-15 14:51, Johannes Pfau wrote:

 Doesn't the GPL force everybody _using_ fast.json to also use 
 the GPL license?
Yes, it does have that enforcement.
Then this is practically useless for the vast majority of programmers.
Oct 15 2015
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/15 10:40 PM, Jacob Carlborg wrote:
 On 2015-10-15 14:51, Johannes Pfau wrote:

 Doesn't the GPL force everybody _using_ fast.json to also use the GPL
 license?
Yes, it does have that enforcement.
Then we'd need to ask Marco if he's willing to relicense the code with Boost. -- Andrei
Oct 16 2015
Piotrek <nodata nodata.pl> writes:
On Friday, 16 October 2015 at 10:08:06 UTC, Andrei Alexandrescu 
wrote:
 On 10/15/15 10:40 PM, Jacob Carlborg wrote:
 On 2015-10-15 14:51, Johannes Pfau wrote:

 Doesn't the GPL force everybody _using_ fast.json to also use 
 the GPL
 license?
Yes, it does have that enforcement.
 Then we'd need to ask Marco if he's willing to relicense the
 code with Boost. -- Andrei

I've just crossed my fingers.

Piotrek
I've just crossed my fingers. Piotrek
Oct 17 2015
Jonathan M Davis via Digitalmars-d-announce writes:
On Thursday, October 15, 2015 14:51:58 Johannes Pfau via Digitalmars-d-announce
wrote:
 BTW: Is there a reason why the code is GPL licensed? I understand that
 people might want to use more restrictive licenses, but isn't LGPL a
 better replacement for GPL when writing library code? Doesn't the GPL
 force everybody _using_ fast.json to also use the GPL license?

 See: http://stackoverflow.com/a/10179181/471401
I think that you might be able to link code with various other compatible, open source licenses against it, but you definitely can't link any proprietary code aganist it. GPL really makes more sense for programs than for libraries for precisely that reason. And most D libraries are likely to be Boost licensed, since that's the license used by Phobos and generally favored by the D community. There's nothing wrong with releasing a library under the GPL if you really want to, but it seriously limits its usefulness. - Jonathan M Davis
Oct 15 2015
Jacob Carlborg <doob me.com> writes:
On 2015-10-16 00:12, Jonathan M Davis via Digitalmars-d-announce wrote:

 I think that you might be able to link code with various other compatible,
 open source licenses against it, but you definitely can't link any
 proprietary code aganist it. GPL really makes more sense for programs than
 for libraries for precisely that reason. And most D libraries are likely to
 be Boost licensed, since that's the license used by Phobos and generally
 favored by the D community. There's nothing wrong with releasing a library
 under the GPL if you really want to, but it seriously limits its usefulness.
Yes, that's correct. It would be fine if everything used GPL,
but that's not the world we live in. Which makes the license not
very practical.

--
/Jacob Carlborg
Oct 15 2015
Ola Fosheim Grøstad writes:
On Thursday, 15 October 2015 at 22:13:07 UTC, Jonathan M Davis 
wrote:
 On Thursday, October 15, 2015 14:51:58 Johannes Pfau via 
 Digitalmars-d-announce wrote:
 BTW: Is there a reason why the code is GPL licensed? I 
 understand that people might want to use more restrictive 
 licenses, but isn't LGPL a better replacement for GPL when 
 writing library code? Doesn't the GPL force everybody _using_ 
 fast.json to also use the GPL license?

 See: http://stackoverflow.com/a/10179181/471401
I think that you might be able to link code with various other compatible, open source licenses against it, but you definitely can't link any proprietary code aganist it.
Yes, you can. GPL only affects distribution of executables to third party, it doesn't affect services. Maybe you are thinking of AGPL, which also affects services. But even AGPL allows internal usage.
Oct 16 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/16/15 6:20 AM, Ola Fosheim Grøstad wrote:
 On Thursday, 15 October 2015 at 22:13:07 UTC, Jonathan M Davis wrote:
 On Thursday, October 15, 2015 14:51:58 Johannes Pfau via
 Digitalmars-d-announce wrote:
 BTW: Is there a reason why the code is GPL licensed? I understand
 that people might want to use more restrictive licenses, but isn't
 LGPL a better replacement for GPL when writing library code? Doesn't
 the GPL force everybody _using_ fast.json to also use the GPL license?

 See: http://stackoverflow.com/a/10179181/471401
I think that you might be able to link code with various other compatible, open source licenses against it, but you definitely can't link any proprietary code aganist it.
Yes, you can. GPL only affects distribution of executables to third party, it doesn't affect services. Maybe you are thinking of AGPL, which also affects services. But even AGPL allows internal usage.
No, you cannot link against a GPL library without making your
code also GPL. "Services" I don't think have anything to do with
this; we are talking about binary linking.

-Steve
Oct 16 2015
Mike Parker <aldacron gmail.com> writes:
On Friday, 16 October 2015 at 12:53:09 UTC, Steven Schveighoffer 
wrote:

 Yes, you can. GPL only affects distribution of executables to 
 third
 party, it doesn't affect services. Maybe you are thinking of 
 AGPL, which
 also affects services. But even AGPL allows internal usage.
No, you cannot link against GPL library without making your code also GPL. "Services" I don't think have anything to do with this, we are talking about binary linking.
There was something called the "server loophole." The language of GPLv2 only requires source to be distributed if the binaries are distributed. The Affero GPL was created to close the loophole, requiring source to be made available even if the binaries aren't distributed. IIRC, GPLv3 requires the same.
Oct 16 2015
Mike Parker <aldacron gmail.com> writes:
On Friday, 16 October 2015 at 14:05:50 UTC, Mike Parker wrote:
 On Friday, 16 October 2015 at 12:53:09 UTC, Steven 
 Schveighoffer wrote:

 Yes, you can. GPL only affects distribution of executables to 
 third
 party, it doesn't affect services. Maybe you are thinking of 
 AGPL, which
 also affects services. But even AGPL allows internal usage.
No, you cannot link against GPL library without making your code also GPL. "Services" I don't think have anything to do with this, we are talking about binary linking.
There was something called the "server loophole." The language of GPLv2 only requires source to be distributed if the binaries are distributed. The Affero GPL was created to close the loophole, requiring source to be made available even if the binaries aren't distributed. IIRC, GPLv3 requires the same.
Looks like I got my versions wrong. AGPL is a modified GPLv3. http://www.gnu.org/licenses/why-affero-gpl.en.html
Oct 16 2015
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 10/16/2015 08:53 AM, Steven Schveighoffer wrote:
 On 10/16/15 6:20 AM, Ola Fosheim Grøstad wrote:
 On Thursday, 15 October 2015 at 22:13:07 UTC, Jonathan M Davis wrote:
 On Thursday, October 15, 2015 14:51:58 Johannes Pfau via
 Digitalmars-d-announce wrote:
 BTW: Is there a reason why the code is GPL licensed? I understand
 that people might want to use more restrictive licenses, but isn't
 LGPL a better replacement for GPL when writing library code? Doesn't
 the GPL force everybody _using_ fast.json to also use the GPL license?

 See: http://stackoverflow.com/a/10179181/471401
I think that you might be able to link code with various other compatible, open source licenses against it, but you definitely can't link any proprietary code aganist it.
Yes, you can. GPL only affects distribution of executables to third party, it doesn't affect services. Maybe you are thinking of AGPL, which also affects services. But even AGPL allows internal usage.
No, you cannot link against GPL library without making your code also GPL. "Services" I don't think have anything to do with this, we are talking about binary linking.
This is the real reason I'm not a huge fan of *GPL. Nobody can understand it!
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 14:09:13 UTC, Nick Sabalausky wrote:
 This is the real reason I'm not a huge fan of *GPL. Nobody can 
 understand it!
It is really simple!! The basic idea is that people shouldn't
have to reverse engineer software they use in order to fix
it/modify it, so when you receive software you should get the
right to the means to modify it (the source code). In addition
GPL also gives you the right to distribute copies (if you want
to) so that you can let others enjoy your improved version of
the program.

It doesn't give the public the right to demand source code to be
made available, only owners of legally obtained copies get the
right to demand the full source to be available for them. It
also does not forbid linking against anything, it requires the
copyright holder to grant rights to the receiver of the copy
(access to source code and making copies to distribute under the
same terms). As long as you keep your modifications/derived
works for yourself, the only party that has been granted GPL for
the derived work is yourself.

One dilemma here is that a company with a million employees is
treated like a single entity legally. So big companies can
embrace the GPL freely for internal use and services without the
redistribution GPL clauses coming into effect, whereas smaller
companies that exchange software between them cannot restrict
redistribution...
Oct 16 2015
prev sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Friday, 16 October 2015 at 12:53:09 UTC, Steven Schveighoffer 
wrote:
 No, you cannot link against GPL library without making your 
 code also GPL. "Services" I don't think have anything to do 
 with this, we are talking about binary linking.
Yes, you can. GPL is a copyright license which says that if you
legally obtain a copy of an executable then you also have the
right to the source code and the right to make copies. If you
don't hand out an executable then there are no obligations at
all, for obvious reasons.

GPL, on the other hand, gives the same right to users of a
service.

https://en.wikipedia.org/wiki/Affero_General_Public_License
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 15:07:17 UTC, Ola Fosheim Grøstad 
wrote:
 GPL, on the other hand, gives the same right to users of a 
 service.
Typo, "AGPL", not "GPL"...
Oct 16 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/16/15 11:07 AM, Ola Fosheim Grøstad wrote:
 On Friday, 16 October 2015 at 12:53:09 UTC, Steven Schveighoffer wrote:
 No, you cannot link against GPL library without making your code also
 GPL. "Services" I don't think have anything to do with this, we are
 talking about binary linking.
Yes, you can. GPL is a copyright license which says that if you legally obtain a copy of an executable then you also have the right to the source code and the right to make copies. If you don't hand out an executable then there are no obligations at all, for obvious reasons.
You certainly can link with it, and then your code becomes GPL.
You don't have to distribute the binary, but it's still now GPL.

The question is: can you link a proprietary licensed piece of
software against GPL while having the proprietary software
retain its license? And the answer is no.

If you want to say GPL is fine if you only want to provide your
software as a service, then that is not an answer to the
question. Your software is effectively GPL, but you don't have
to distribute the source because you didn't distribute the
binary. However, you would have no recourse if you mistakenly
provided the binary to someone. Ever. This is a poison pill risk
that most companies will not swallow.

Please give me a json library that's 10% slower and won't ruin
my entire business, thanks.

-Steve
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 15:36:26 UTC, Steven Schveighoffer 
wrote:
 You certainly can link with it, and then your code becomes GPL.
No, the code is code. It is an artifact. The GPL is a legal
document. The legal document says what rights you have to the
copy you received and what requirements follow it. You are
allowed to modify it and do anything you want with it that is
covered under fair use. This varies between jurisdictions.

The license primarily comes into effect when you _distribute_ or
_publish_, because the legal precedent for putting restrictions
on distribution and publishing is much stronger. And WIPO is
much more clear there.

So, if you build websites for a third party you can use GPL
without redistribution by writing the contract in such a way
that the third party is using your service. Meaning, you run the
software. So circumventing the GPL isn't all that hard if you
want to.

The AGPL also affects publishing as a service, so it makes such
arrangements much more difficult.
Oct 16 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/16/15 11:56 AM, Ola Fosheim Grøstad wrote:
 On Friday, 16 October 2015 at 15:36:26 UTC, Steven Schveighoffer wrote:
 You certainly can link with it, and then your code becomes GPL.
No, the code is code. It is an artifact. The GPL is a legal document. The legal document says what rights you have to the copy you received and what requirements that follows it. You are allowed to modify it and do anything you want with it that is covered under fair use. This varies between jurisdictions. The license primarily comes into effect when you _distribute_ or _publish_, because the legal precedent for putting restrictions on distribution and publishing is much stronger. And WIPO is much more clear there.
Right, so what happens when you accidentally distribute it? What
license is it under?

For example, let's say you have a product that doesn't use JSON.
It's proprietary, and you distribute it under a proprietary
license. You want to include JSON parsing, so you incorporate
this GPL'd library. Then you distribute it under your
proprietary license.

Recipient says "Wait, you used fast.json! That means this is now
GPL, I want the source". Then what?
 So, if you build websites for a third party you can use GPL without
 redistribution by writing the contract in such a way that the third
 party is using your service. Meaning, you run the software. So
 circumventing the GPL isn't all that hard if you want to.
Being able to use GPL on SAAS doesn't satisfy the use case here. This is a compiled library, it can be used in any piece of software. -Steve
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 17:38:01 UTC, Steven Schveighoffer 
wrote:
 For example, let's say you have a product that doesn't use 
 JSON. It's proprietary, and you distribute it under a 
 proprietary license. You want to include JSON parsing, so you 
 incorporate this GPL'd library. Then you distribute it under 
 your proprietary license.

 Recipient says "Wait, you used fast.json! That means this is 
 now GPL, I want the source". Then what?
The recipient has no say in this, but the original author can demand that you either stop distribution or purchase a compatible license.
 Being able to use GPL on SAAS doesn't satisfy the use case 
 here. This is a compiled library, it can be used in any piece 
 of software.
My point was that you can use GPLed code in a proprietary
service. But you can also ship proprietary code separately that
the end user links with the GPLed code. It is only when you
bundle the two that you get a derived work.
Oct 16 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/16/15 2:24 PM, Ola Fosheim Grøstad wrote:
 On Friday, 16 October 2015 at 17:38:01 UTC, Steven Schveighoffer wrote:
 For example, let's say you have a product that doesn't use JSON. It's
 proprietary, and you distribute it under a proprietary license. You
 want to include JSON parsing, so you incorporate this GPL'd library.
 Then you distribute it under your proprietary license.

 Recipient says "Wait, you used fast.json! That means this is now GPL,
 I want the source". Then what?
The recipient has no say in this, but the original author can demand that you either stop distribution or purchase a compatible license.
Exactly my point.
 Being able to use GPL on SAAS doesn't satisfy the use case here. This
 is a compiled library, it can be used in any piece of software.
 My point was that you can use GPLed code in a proprietary
 service. But you can also ship proprietary code separately that
 the end user links with the GPLed code. It is only when you
 bundle the two that you get a derived work.
And I don't disagree with your point, just that it was not a correct response to "but you definitely can't link any proprietary code aganist [sic] it." -Steve
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 18:53:39 UTC, Steven Schveighoffer 
wrote:
 And I don't disagree with your point, just that it was not a 
 correct response to "but you definitely can't link any 
 proprietary code aganist [sic] it."
That I don't understand. You can indeed build your executable
from a mix of proprietary third party libraries and GPL code;
that means you definitively can link. You cannot distribute them
together to another third party, but your employer can use it
and run a service with it.

People attribute way too many limitations to GPL codebases. For
many organizations the GPL would be perfectly ok for their
software stack.
Oct 16 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/16/15 3:36 PM, Ola Fosheim Grøstad wrote:
 On Friday, 16 October 2015 at 18:53:39 UTC, Steven Schveighoffer wrote:
 And I don't disagree with your point, just that it was not a correct
 response to "but you definitely can't link any proprietary code
 aganist [sic] it."
 That I don't understand. You can indeed build your executable from a
 mix of proprietary third party libraries and GPL code; that means you
 definitively can link. You cannot distribute them together to another
 third party, but your employer can use it and run a service with it.
The distribution is implied in the comment. If there isn't distribution, the license taint isn't important, why bring it up? In any case, having a GPL license for a library diminishes its usefulness to proprietary software houses.
 People attribute way too many limitations to GPL codebases. For many
 organizations the GPL would be perfectly ok for their software stack.
It depends on what you do. Sure, if you are a pure SAAS house, GPL is perfectly fine, but if one day you want to license that as an installable server, you need to re-develop that GPL piece, and make sure your developers never looked at the code. It's not something to take lightly. -Steve
Oct 16 2015
Ola Fosheim Grøstad writes:
On Friday, 16 October 2015 at 21:01:02 UTC, Steven Schveighoffer 
wrote:
 The distribution is implied in the comment. If there isn't 
 distribution, the license taint isn't important, why bring it 
 up?
That was not implied. You can have a license which is much more limiting, the GPL is fairly liberal. Most software that is written is not for redistribution!
 In any case, having a GPL license for a library diminishes its 
 usefulness to proprietary software houses.
If that's what you you mean, then be explicit about it.
 It depends on what you do. Sure, if you are a pure SAAS house, 
 GPL is perfectly fine, but if one day you want to license that 
 as an installable server, you need to re-develop that GPL piece,
It isn't obvious, you should be able to lease a server without that being considered obtaining a copy? To figure that out you'll need a legal interpretation of the GPL for your jurisdiction.
 and make sure your developers never looked at the code. It's 
 not something to take lightly.
No, having read code in the past does not affect copyright. If
you don't translate the GPL code while writing, then it isn't a
derived work. What you are thinking about is clean-room
implementation of reverse-engineered APIs to hardware, where the
code is only tiny stubs of machine code that have to be written
in a certain way to be compatible.
Oct 16 2015
Per Nordlöw <per.nordlow gmail.com> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:	   0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)
Why not add this to std.experimental?
Oct 15 2015
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/15/15 12:40 PM, Per Nordlöw wrote:
 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:       0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)
Why not add this to std.experimental?
Sure seems like a good question! At the least, a more generic
version (more character and range types etc) should start from
Marco's core implementation. -- Andrei
Oct 15 2015
wobbles <grogan.colin gmail.com> writes:
On Thursday, 15 October 2015 at 10:34:16 UTC, Andrei Alexandrescu 
wrote:
 On 10/15/15 12:40 PM, Per Nordlöw wrote:
 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise 
 wrote:
 fast:       0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)
Why not add this to std.experimental?
Sure seems like a good question! At the least a more generic generalization (more character and range types etc) should start from Marco's core implementation. -- Andrei
Would it not be a better use of effort to attempt to merge the
efforts here over to Sönke's new stdx.json? I didn't look at
either codebase, so I don't know how difficult that'll be.
Oct 15 2015
Jonathan M Davis via Digitalmars-d-announce writes:
On Thursday, October 15, 2015 09:40:05 Per Nordlöw via Digitalmars-d-announce
wrote:
 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 fast:      0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)
Why not add this to std.experimental?
I thought that http://code.dlang.org/packages/std_data_json was the json implementation we were looking at adding to Phobos. Or did that fall through? I haven't paid much attention to the discussion on that, though I have used it in one of my own projects. - Jonathan M Davis
Oct 15 2015
Jacob Carlborg <doob me.com> writes:
On 2015-10-16 00:14, Jonathan M Davis via Digitalmars-d-announce wrote:

 I thought that http://code.dlang.org/packages/std_data_json was the json
 implementation we were looking at adding to Phobos. Or did that fall
 through? I haven't paid much attention to the discussion on that, though I
 have used it in one of my own projects.
Yes, that was the plan. But if a better alternative shows up,
should we look at that as well?

--
/Jacob Carlborg
Oct 15 2015
Jonathan M Davis via Digitalmars-d-announce writes:
On Friday, October 16, 2015 08:21:32 Jacob Carlborg via Digitalmars-d-announce
wrote:
 On 2015-10-16 00:14, Jonathan M Davis via Digitalmars-d-announce wrote:

 I thought that http://code.dlang.org/packages/std_data_json was the json
 implementation we were looking at adding to Phobos. Or did that fall
 through? I haven't paid much attention to the discussion on that, though I
 have used it in one of my own projects.
Yes, that was the plan. But if a better alternative shows up, should we look at that as well?
Sure, but going from std_data_json as being the candidate to talking about putting this other one in std.experimental seems a bit much. It needs to go through the review process first, and if we're doing that, it doesn't make sense to have two winners. They'll have to duke it out (or be merged), and then the one that wins can go in std.experimental. - Jonathan M Davis
Oct 16 2015
Sönke Ludwig <sludwig rejectedsoftware.com> writes:
On 14.10.2015 at 09:01, Marco Leise wrote:
 JSON parsing in D has come a long way, especially when you
 look at it from the efficiency angle, as a popular benchmark
 does that has been forked by well-known D contributors like
 Martin Nowak or Sönke Ludwig.

 The test is pretty simple: Parse a JSON object, containing an
 array of 1_000_000 3D coordinates in the range [0..1) and
 average them.

 The performance of std.json in parsing those was horrible
 still in the DMD 2.066 days*:

 DMD     : 41.44s,  934.9Mb
 Gdc     : 29.64s,  929.7Mb
 Python  : 12.30s, 1410.2Mb
 Ruby    : 13.80s, 2101.2Mb

 Then with 2.067 std.json got a major 3x speed improvement and
 rivaled the popular dynamic languages Ruby and Python:

 DMD     : 13.02s, 1324.2Mb

 In the mean time several other D JSON libraries appeared with
 varying focus on performance or API:

 Medea         : 56.75s, 1753.6Mb  (GDC)
 libdjson      : 24.47s, 1060.7Mb  (GDC)
 stdx.data.json:  2.76s,  207.1Mb  (LDC)

 Yep, that's right. stdx.data.json's pull parser finally beats
 the dynamic languages with native efficiency. (I used the
 default options here that provide you with an Exception and
 line number on errors.)
From when are the numbers for stdx.data.json? The latest results
for the pull parser that I know of were faster than RapidJSON:

http://forum.dlang.org/post/wlczkjcawyteowjbbcpo forum.dlang.org

Judging by the relative numbers, your "fast" result is still a
bit faster, but if that doesn't validate, it's hard to make an
objective comparison.
Oct 15 2015
Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 15 Oct 2015 18:46:12 +0200, Sönke Ludwig
<sludwig rejectedsoftware.com> wrote:

 On 14.10.2015 at 09:01, Marco Leise wrote:
 […]
 stdx.data.json:  2.76s,  207.1Mb  (LDC)

 Yep, that's right. stdx.data.json's pull parser finally beats
 the dynamic languages with native efficiency. (I used the
 default options here that provide you with an Exception and
 line number on errors.)

 From when are the numbers for stdx.data.json? The latest results for
 the pull parser that I know of were faster than RapidJSON:
 http://forum.dlang.org/post/wlczkjcawyteowjbbcpo forum.dlang.org

You know, I'm not surprised at the "D new lazy Ldc" result,
which is in the ballpark of what I measured without exceptions
& line numbers, but the Rapid C++ result seems way off compared
to kostya's listing. Or maybe that Core i7 doesn't work well
with RapidJSON.

I used your fork of the benchmark and made some modifications
like adding taggedalgebraic and what else was needed to make it
compile with vanilla ldc2 0.16.0. Then I removed the flags that
disable exceptions and line numbers. Compilation options are the
same as for the existing gdc and ldc2 entries. I did not add
"-partial-inliner -boundscheck=off -singleobj".

 Judging by the relative numbers, your "fast" result is still a bit
 faster, but if that doesn't validate, it's hard to make an objective
 comparison.

Every value that is read (as opposed to skipped) is validated
according to RFC 7159. That includes UTF-8 validation. Full
validation (i.e. readJSONFile!validateAll(…);) may add up to
14% overhead here.

--
Marco
Oct 16 2015
parent reply =?UTF-8?Q?S=c3=b6nke_Ludwig?= <sludwig rejectedsoftware.com> writes:
On 16.10.2015 at 18:04, Marco Leise wrote:
 Every value that is read (as opposed to skipped) is validated
 according to RFC 7159. That includes UTF-8 validation. Full
 validation (i.e. readJSONFile!validateAll(…);) may add up to
 14% overhead here.
Nice! I see you are using bitmasking trickery in multiple places. stdx.data.json is mostly just the plain lexing algorithm, with the exception of whitespace skipping. It was already very encouraging to get those benchmark numbers that way. Good to see that it pays off to go further.
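For anyone curious what such bitmasking looks like, here is a minimal SWAR-style sketch in D of one such trick, whitespace skipping with byte masks. This is my own illustration, not the actual code of either library, and it assumes the input is padded with at least eight non-whitespace bytes (e.g. zeros) past the end:

    import core.bitop : bsf;

    enum ulong ones = 0x0101010101010101;
    enum ulong high = 0x8080808080808080;

    // High bit set in every byte of x that is non-zero.
    ulong nonzeroBytes(ulong x)
    {
        return (((x & ~high) + ~high) | x) & high;
    }

    // High bit set in every byte of w that equals ch.
    ulong equalBytes(ulong w, ubyte ch)
    {
        return ~nonzeroBytes(w ^ (ones * ch)) & high;
    }

    // Index of the first non-whitespace byte, classifying 8 bytes per step.
    size_t skipWhitespace(const(ubyte)* p)
    {
        size_t i = 0;
        while (true)
        {
            ulong w = *cast(const(ulong)*) (p + i);
            ulong ws = equalBytes(w, ' ')  | equalBytes(w, '\t')
                     | equalBytes(w, '\n') | equalBytes(w, '\r');
            ulong nonWs = high & ~ws;
            if (nonWs)
                return i + (bsf(nonWs) >> 3); // little-endian byte index
            i += 8;
        }
    }

The point is that eight bytes (or sixteen with SSE) get classified per iteration instead of one.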
Oct 19 2015
parent reply Suliman <evermind live.ru> writes:
On Monday, 19 October 2015 at 07:48:16 UTC, Sönke Ludwig wrote:
 On 16.10.2015 at 18:04, Marco Leise wrote:
 Every value that is read (as opposed to skipped) is validated
 according to RFC 7159. That includes UTF-8 validation. Full
 validation (i.e. readJSONFile!validateAll(…);) may add up to
 14% overhead here.
Nice! I see you are using bitmasking trickery in multiple places. stdx.data.json is mostly just the plain lexing algorithm, with the exception of whitespace skipping. It was already very encouraging to get those benchmark numbers that way. Good to see that it pays off to go further.
Is there any chance that the new JSON parser can be included in the next versions of vibe.d? And what is needed to include it in Phobos?
Oct 20 2015
parent reply Jonathan M Davis via Digitalmars-d-announce writes:
On Wednesday, October 21, 2015 06:36:31 Suliman via Digitalmars-d-announce
wrote:
 On Monday, 19 October 2015 at 07:48:16 UTC, Sönke Ludwig wrote:
 On 16.10.2015 at 18:04, Marco Leise wrote:
 Every value that is read (as opposed to skipped) is validated
 according to RFC 7159. That includes UTF-8 validation. Full
 validation (i.e. readJSONFile!validateAll(…);) may add up to
 14% overhead here.
Nice! I see you are using bitmasking trickery in multiple places. stdx.data.json is mostly just the plain lexing algorithm, with the exception of whitespace skipping. It was already very encouraging to get those benchmark numbers that way. Good to see that it pays off to go further.
 Is there any chance that the new JSON parser can be included in the next versions of vibe.d? And what is needed to include it in Phobos?
It's already available on code.dlang.org: http://code.dlang.org/packages/std_data_json

For it to get into Phobos, it has to get through the review process and be voted in. It was put up for formal review two or three months ago, but that didn't get to the point of a vote (I assume that there was more work that needed to be done on it first; I haven't really read through that thread though, so I don't know - I was too busy when the review started to get involved in it).

So, whatever needs to be done for it to be ready for a formal vote needs to be done, and then it can be voted in, but all of that takes time. If you want to use it soon, you might as well just grab it from code.dlang.org - that also puts you in a better position to give feedback on it, so that it will be that much better if/when it makes it into Phobos.

- Jonathan M Davis
Oct 21 2015
parent reply Suliman <evermind live.ru> writes:
 Nice! I see you are using bitmasking trickery in multiple 
 places. stdx.data.json is mostly just the plain lexing 
 algorithm, with the exception of whitespace skipping. It was 
 already very encouraging to get those benchmark numbers that 
 way. Good to see that it pays off to go further.
 Is there any chance that the new JSON parser can be included in the next versions of vibe.d? And what is needed to include it in Phobos?
It's already available on code.dlang.org: http://code.dlang.org/packages/std_data_json
Jonathan, I mean https://github.com/mleise/fast :)
Oct 21 2015
parent Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 21 Oct 2015 17:00:39 +0000, Suliman <evermind live.ru> wrote:

 Nice! I see you are using bitmasking trickery in multiple 
 places. stdx.data.json is mostly just the plain lexing 
 algorithm, with the exception of whitespace skipping. It was 
 already very encouraging to get those benchmark numbers that 
 way. Good to see that it pays off to go further.
 Is there any chance that the new JSON parser can be included in the next versions of vibe.d? And what is needed to include it in Phobos?
It's already available on code.dlang.org: http://code.dlang.org/packages/std_data_json
 Jonathan, I mean https://github.com/mleise/fast :)
That's nice, but it has a different license, and I don't think the Phobos devs would be happy to see all the inline assembly I used, the duplicated functionality like number parsing and UTF-8 validation, or the missing range support. -- Marco
Oct 21 2015
prev sibling next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 https://github.com/kostya/benchmarks#json
Does fast.json use any non-standard memory allocation patterns, or plain simple GC usage?
Oct 16 2015
parent Marco Leise <Marco.Leise gmx.de> writes:
On Fri, 16 Oct 2015 11:09:37 +0000, Per Nordlöw <per.nordlow gmail.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 https://github.com/kostya/benchmarks#json
 Does fast.json use any non-standard memory allocation patterns, or plain simple GC usage?
Plain GC. I found no way in D to express something as "borrowed", but the GC removes the ownership question from the picture. That said, the only thing that I allocate in this benchmark is the dynamic array of coordinates (1_000_000 * 3 * double.sizeof), which can be replaced by lazily iterating over the array to remove all allocations. Using these lazy techniques for objects too and calling .borrow() instead of .read!string() (which .idups) should allow GC-free usage. (Well, except for the one allocation in parseJSON, where I append 16 zero bytes to the JSON string to make it SSE safe.)
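To make the borrow-vs-copy distinction concrete outside of the parser, here is a tiny self-contained sketch of why an .idup'ed string survives buffer reuse while a borrowed slice does not (a generic illustration, not fast.json's code):

    void main()
    {
        char[] buffer = "hello".dup;     // stands in for a reused parse buffer

        const(char)[] borrowed = buffer; // view into the buffer: no allocation
        string owned = buffer.idup;      // copy: one GC allocation, independent

        buffer[0] = 'y';                 // the buffer gets reused
        assert(borrowed == "yello");     // the borrowed view changed underneath
        assert(owned == "hello");        // the owned copy is unaffected
    }

-- Marco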
Oct 16 2015
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
If this is the benchmark I'm remembering, the bulk of the time is 
spent parsing the floating point numbers. So it isn't a test of 
JSON parsing in general so much as the speed of scanf.
Oct 17 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/17/15 6:43 PM, Sean Kelly wrote:
 If this is the benchmark I'm remembering, the bulk of the time is spent
 parsing the floating point numbers. So it isn't a test of JSON parsing
 in general so much as the speed of scanf.
In many cases the use of scanf can be replaced with drastically faster methods, as I discuss in my talks on optimization (including Brasov recently). I hope they'll release the videos soon.
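For illustration, here is a toy sketch of the kind of replacement I mean (my own example, not the exact code from the talks): a hand-rolled reader for plain [-]digits[.digits] numbers that skips scanf's format-string and locale machinery entirely.

    import std.ascii : isDigit;

    // Illustrative only: no exponents, no overflow guard past ~19
    // significant digits, and the final division may round differently
    // than a correctly-rounded strtod. Assumes a delimited buffer.
    double parseSimpleDouble(ref const(char)* p)
    {
        bool neg = *p == '-';
        if (neg) p++;

        ulong mantissa = 0;
        double scale = 1;

        while (isDigit(*p))
            mantissa = mantissa * 10 + (*p++ - '0');

        if (*p == '.')
        {
            p++;
            while (isDigit(*p))
            {
                mantissa = mantissa * 10 + (*p++ - '0');
                scale *= 10;
            }
        }
        double result = mantissa / scale;
        return neg ? -result : result;
    }

It trades generality for speed, which is the right trade for a known input format like this benchmark's coordinates. -- Andrei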
Oct 17 2015
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Saturday, 17 October 2015 at 16:14:01 UTC, Andrei Alexandrescu 
wrote:
 On 10/17/15 6:43 PM, Sean Kelly wrote:
 If this is the benchmark I'm remembering, the bulk of the time 
 is spent
 parsing the floating point numbers. So it isn't a test of JSON 
 parsing
 in general so much as the speed of scanf.
In many cases the use of scanf can be replaced with drastically faster methods, as I discuss in my talks on optimization (including Brasov recently). I hope they'll release the videos soon. -- Andrei
Oh absolutely. My issue with the benchmark is just that it claims to be a JSON parser benchmark but the bulk of CPU time is actually spent parsing floats. I'm on my phone though so perhaps this is a different benchmark--I can't easily check. The one I recall came up a year or so ago and was discussed on D.general.
Oct 17 2015
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 17 Oct 2015 16:27:06 +0000, Sean Kelly <sean invisibleduck.org> wrote:

 On Saturday, 17 October 2015 at 16:14:01 UTC, Andrei Alexandrescu 
 wrote:
 On 10/17/15 6:43 PM, Sean Kelly wrote:
 If this is the benchmark I'm remembering, the bulk of the time 
 is spent
 parsing the floating point numbers. So it isn't a test of JSON 
 parsing
 in general so much as the speed of scanf.
In many cases the use of scanf can be replaced with drastically faster methods, as I discuss in my talks on optimization (including Brasov recently). I hope they'll release the videos soon. -- Andrei
Oh absolutely. My issue with the benchmark is just that it claims to be a JSON parser benchmark but the bulk of CPU time is actually spent parsing floats. I'm on my phone though so perhaps this is a different benchmark--I can't easily check. The one I recall came up a year or so ago and was discussed on D.general.
1/4 to 1/3 of the time is spent parsing numbers in highly optimized code. In a profiler the number parsing shows up on top, but the benchmark also exercises the structural parsing a lot. It is not a very broad benchmark though, lacking serialization, UTF-8 decoding, validation of results etc. I believe the author didn't realize how, over time, it became the go-to performance test.

The author of RapidJSON has a very in-depth benchmark suite, but it would be a bit of work to get something non-C++ integrated: https://github.com/miloyip/nativejson-benchmark It includes conformance tests as well. -- Marco
Oct 17 2015
prev sibling parent Ola Fosheim Grøstad writes:
On Saturday, 17 October 2015 at 16:27:08 UTC, Sean Kelly wrote:
 Oh absolutely. My issue with the benchmark is just that it 
 claims to be a JSON parser benchmark but the bulk of CPU time 
 is actually spent parsing floats.
Well, most such language-comparison benchmarks are just for fun/marketing. In the real world big JSON files would be compressed and most likely retrieved over a network connection (like a blob from a database). Pull-parsing of mmap'ed memory is a rather unusual scenario for JSON.
Oct 19 2015
prev sibling next sibling parent reply rsw0x <anonymous anonymous.com> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 JSON parsing in D has come a long way, especially when you look 
 at it from the efficiency angle as a popular benchmark does 
 that has been forked by well known D contributors like Martin 
 Nowak or Sönke Ludwig.

 [...]
Slightly OT: You have a std.simd file in your repo; was this written by you, or is there a current std.simd proposal that I'm unaware of?
Oct 17 2015
parent Marco Leise <Marco.Leise gmx.de> writes:
On Sun, 18 Oct 2015 03:40:52 +0000, rsw0x <anonymous anonymous.com> wrote:

 On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
 JSON parsing in D has come a long way, especially when you look
 at it from the efficiency angle as a popular benchmark does
 that has been forked by well known D contributors like Martin
 Nowak or Sönke Ludwig.

 [...]
 Slightly OT: You have a std.simd file in your repo; was this written by you, or is there a current std.simd proposal that I'm unaware of?
Manu wrote that back in the day with the idea that it would help writing portable SIMD code on many architectures: https://github.com/TurkeyMan/simd Working in the 3D visualization business and having held at least one talk about SIMD, it was no coincidence that he was interested in better vector math support. Inclusion into Phobos was planned.

DMD needs some upgrading of the somewhat ad hoc SIMD intrinsic implementation though: https://issues.dlang.org/buglist.cgi?keywords=SIMD&resolution=--- Many instructions cannot be expressed outside of inline assembly, which doesn't inline.
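To be fair, the basic vector arithmetic that std.simd builds on does work today. A trivial sketch, assuming an x86_64 target where core.simd exposes float4:

    import core.simd;

    // Adds four floats in a single SIMD operation.
    float4 addFour(float4 a, float4 b)
    {
        return a + b;
    }

-- Marco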
Oct 18 2015
prev sibling next sibling parent reply Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> writes:
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:

 The test is pretty simple: Parse a JSON object, containing an 
 array of 1_000_000 3D coordinates in the range [0..1) and 
 average them.

 The performance of std.json in parsing those was horrible still 
 in the DMD 2.066 days*:

 DMD     : 41.44s,  934.9Mb
 Gdc     : 29.64s,  929.7Mb
 Python  : 12.30s, 1410.2Mb
 Ruby    : 13.80s, 2101.2Mb

 Then with 2.067 std.json got a major 3x speed improvement and 
 rivaled the popular dynamic languages Ruby and Python:

 DMD     : 13.02s, 1324.2Mb

 In the mean time several other D JSON libraries appeared with 
 varying focus on performance or API:

 Medea         : 56.75s, 1753.6Mb  (GDC)
 libdjson      : 24.47s, 1060.7Mb  (GDC)
 stdx.data.json:  2.76s,  207.1Mb  (LDC)

 Yep, that's right. stdx.data.json's pull parser finally beats 
 the dynamic languages with native efficiency. (I used the 
 default options here that provide you with an Exception and 
 line number on errors.)

 A few days ago I decided to get some practical use out of my 
 pet project 'fast' by implementing a JSON parser myself, that 
 could rival even the by then fastest JSON parser, RapidJSON. 
 The result can be seen in the benchmark results right now:

 https://github.com/kostya/benchmarks#json

 fast:	   0.34s, 226.7Mb (GDC)
 RapidJSON: 0.79s, 687.1Mb (GCC)

 (* Timings from my computer, Haswell CPU, Linux amd64.)
Very impressive.

Is this not quite interesting? Such a basic web back end operation, and yet it's a very different picture from those who say that one is I/O or network bound. I already have JSON files of a couple of gig, and they're only going to be bigger over time, and this is a more generally interesting question.

Seems like you now get 2.1 gigabytes/sec sequential read from a cheap consumer SSD today...
Oct 20 2015
next sibling parent reply Kapps <opantm2+spam gmail.com> writes:
On Wednesday, 21 October 2015 at 04:17:19 UTC, Laeeth Isharc 
wrote:
 Seems like you now get 2.1 gigabytes/sec sequential read from a 
 cheap consumer SSD today...
Not many consumer drives give more than 500-600 MB/s (SATA3 limit) yet. There are only a couple that I know of that reach 2000 MB/s, like Samsung's SM951, and they're generally a fair bit more expensive than what most consumers tend to buy (but at about $1 / GB, still affordable for businesses certainly).
Oct 21 2015
parent reply Laeeth Isharc <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 21 October 2015 at 09:59:09 UTC, Kapps wrote:
 On Wednesday, 21 October 2015 at 04:17:19 UTC, Laeeth Isharc 
 wrote:
 Seems like you now get 2.1 gigabytes/sec sequential read from a 
 cheap consumer SSD today...
Not many consumer drives give more than 500-600 MB/s (SATA3 limit) yet. There are only a couple that I know of that reach 2000 MB/s, like Samsung's SM951, and they're generally a fair bit more expensive than what most consumers tend to buy (but at about $1 / GB, still affordable for businesses certainly).
Yes - that's the one I had in mind. It's not dirt cheap, but at GBP 280, if you have some money and want speed, the price is hardly an important factor. I should have said consumer grade rather than consumer, but anyway you get my point. That's today, in 2015. Maybe one can do even better than that by striping data, although it sounds like it's not that easy, but still. "The future is here already; just unevenly distributed."

Seems like if you're processing JSON, which is not the most difficult task one might reasonably want to be doing, then CPU+memory is the bottleneck more than the SSD. I don't know what the outlook is for drive speeds (except they probably won't go down), but data sets are certainly not shrinking. So I am intrigued by the difference between what people say is typical and what seems to be the case, certainly in what I want to do.
Oct 21 2015
parent reply Suliman <evermind live.ru> writes:
Could anybody reddit this benchmark?
Oct 21 2015
parent reply Laeeth Isharc <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 21 October 2015 at 19:03:56 UTC, Suliman wrote:
 Could anybody reddit this benchmark?
done https://www.reddit.com/r/programming/comments/3pojrz/the_fastest_json_parser_in_the_world/
Oct 21 2015
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 04:38 PM, Laeeth Isharc wrote:
 On Wednesday, 21 October 2015 at 19:03:56 UTC, Suliman wrote:
 Could anybody reddit this benchmark?
done https://www.reddit.com/r/programming/comments/3pojrz/the_fastest_json_parser_in_the_world/
Getting good press. Congratulations! -- Andrei
Oct 21 2015
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/21/2015 1:38 PM, Laeeth Isharc wrote:
 On Wednesday, 21 October 2015 at 19:03:56 UTC, Suliman wrote:
 Could anybody reddit this benchmark?
done https://www.reddit.com/r/programming/comments/3pojrz/the_fastest_json_parser_in_the_world/
It's item 9 on the front page of https://news.ycombinator.com/ too! Link to actual article (don't click on this link, or your upvote will not be counted): https://news.ycombinator.com/item?id=10430951
Oct 22 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/22/2015 09:08 AM, Walter Bright wrote:
 On 10/21/2015 1:38 PM, Laeeth Isharc wrote:
 On Wednesday, 21 October 2015 at 19:03:56 UTC, Suliman wrote:
 Could anybody reddit this benchmark?
done https://www.reddit.com/r/programming/comments/3pojrz/the_fastest_json_parser_in_the_world/
It's item 9 on the front page of https://news.ycombinator.com/ too!
This has been a homerun. Congratulations for this work and also for publicizing it! (Consider it might have remained just one forum discussion read by all of 80 persons...) -- Andrei
Oct 22 2015
parent reply Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> writes:
On Thursday, 22 October 2015 at 18:23:08 UTC, Andrei Alexandrescu 
wrote:
 On 10/22/2015 09:08 AM, Walter Bright wrote:
 On 10/21/2015 1:38 PM, Laeeth Isharc wrote:
 On Wednesday, 21 October 2015 at 19:03:56 UTC, Suliman wrote:
 Could anybody reddit this benchmark?
done https://www.reddit.com/r/programming/comments/3pojrz/the_fastest_json_parser_in_the_world/
It's item 9 on the front page of https://news.ycombinator.com/ too!
This has been a homerun. Congratulations for this work and also for publicizing it! (Consider it might have remained just one forum discussion read by all of 80 persons...) -- Andrei
We really do need to stop hiding our light under a bushel. Thinking in marketing terms doesn't always come easy to technically minded people, and I understand why, but ultimately the community benefits a great deal from people becoming aware of the very real benefits D has to offer (alas people won't just get it, even if you think they should), and there are personal career benefits too from helping communicate how you have applied D to do useful work. It's hard to find great programmers and showing what you can do will pay off over time.
Oct 22 2015
next sibling parent Meta <jared771 gmail.com> writes:
On Thursday, 22 October 2015 at 19:16:00 UTC, Laeeth Isharc wrote:
 We really do need to stop hiding our light under a bushel.  
 Thinking in marketing terms doesn't always come easy to 
 technically minded people, and I understand why, but ultimately 
 the community benefits a great deal from people becoming aware 
 of the very real benefits D has to offer (alas people won't 
 just get it, even if you think they should), and there are 
 personal career benefits too from helping communicate how you 
 have applied D to do useful work.  It's hard to find great 
 programmers and showing what you can do will pay off over time.
Yeah, we don't want to repeat Lisp's mistake.
Oct 22 2015
prev sibling parent reply rsw0x <anonymous anonymous.com> writes:
On Thursday, 22 October 2015 at 19:16:00 UTC, Laeeth Isharc wrote:
 On Thursday, 22 October 2015 at 18:23:08 UTC, Andrei 
 Alexandrescu wrote:
 On 10/22/2015 09:08 AM, Walter Bright wrote:
 [...]
This has been a homerun. Congratulations for this work and also for publicizing it! (Consider it might have remained just one forum discussion read by all of 80 persons...) -- Andrei
We really do need to stop hiding our light under a bushel. Thinking in marketing terms doesn't always come easy to technically minded people, and I understand why, but ultimately the community benefits a great deal from people becoming aware of the very real benefits D has to offer (alas people won't just get it, even if you think they should), and there are personal career benefits too from helping communicate how you have applied D to do useful work. It's hard to find great programmers and showing what you can do will pay off over time.
D has no well defined area to be used in. Everyone knows D, when written in a very specific C-mimicking way, is performant. But
Oct 22 2015
parent Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> writes:
On Thursday, 22 October 2015 at 20:10:36 UTC, rsw0x wrote:
 On Thursday, 22 October 2015 at 19:16:00 UTC, Laeeth Isharc 
 wrote:
 On Thursday, 22 October 2015 at 18:23:08 UTC, Andrei 
 Alexandrescu wrote:
 On 10/22/2015 09:08 AM, Walter Bright wrote:
 [...]
This has been a homerun. Congratulations for this work and also for publicizing it! (Consider it might have remained just one forum discussion read by all of 80 persons...) -- Andrei
We really do need to stop hiding our light under a bushel. Thinking in marketing terms doesn't always come easy to technically minded people, and I understand why, but ultimately the community benefits a great deal from people becoming aware of the very real benefits D has to offer (alas people won't just get it, even if you think they should), and there are personal career benefits too from helping communicate how you have applied D to do useful work. It's hard to find great programmers and showing what you can do will pay off over time.
D has no well defined area to be used in. Everyone knows D, when written in a very specific C-mimicking way, is performant.
You reply to my post, but I don't entirely see how it relates. D is very flexible, and that's its virtue, because splitting a codebase across multiple languages does have a cost, even if it's often worth paying that cost to use the right tool for the job when those tools are by their nature specialised.

I don't think everyone knows D is performant, and I wouldn't say fast JSON is written in a C-mimicking way, taken as a whole. Choices are based on making trade-offs, and the relevant data are not static, but constantly shifting. When an SSD in 2015 that isn't especially pricey gives 2.1 gig a second of throughput, and one has many terabytes of text data a month to get through, and datasets keep growing, and what I write today may be in use for years, then the right decision will be a very different one to that of five years ago. That's not just my perception, but also that of people in other fields where the problems are similar - bioinformatics and advertising data being two of many. AdRoll is known for their Python work, but their data scientists use D.

And my point, which you didn't really reply to, is that as a community we should do a bit more to share our experiences on how D can be useful in doing real work. As Walter observes, that's also something that pays off personally too.
Oct 24 2015
prev sibling next sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 21 Oct 2015 04:17:16 +0000, Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> wrote:

 Very impressive.
 
 Is this not quite interesting ?  Such a basic web back end 
 operation, and yet it's a very different picture from those who 
 say that one is I/O or network bound.  I already have JSON files 
 of a couple of gig, and they're only going to be bigger over 
 time, and this is a more generally interesting question.
 
 Seems like you now get 2.1 gigabytes/sec sequential read from a 
 cheap consumer SSD today...
You have this huge amount of Reddit API JSON, right? I wonder if your processing could benefit from the fast skipping routines or even reading it as "trusted JSON". -- Marco
Oct 21 2015
parent reply Laeeth Isharc <laeethnospam nospamlaeeth.com> writes:
On Wednesday, 21 October 2015 at 22:24:30 UTC, Marco Leise wrote:
 On Wed, 21 Oct 2015 04:17:16 +0000, Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> wrote:

 Very impressive.
 
 Is this not quite interesting? Such a basic web back end 
 operation, and yet it's a very different picture from those 
 who say that one is I/O or network bound.  I already have JSON 
 files of a couple of gig, and they're only going to be bigger 
 over time, and this is a more generally interesting question.
 
 Seems like you now get 2.1 gigabytes/sec sequential read from a 
 cheap consumer SSD today...
You have this huge amount of Reddit API JSON, right? I wonder if your processing could benefit from the fast skipping routines or even reading it as "trusted JSON".
The couple of gig were just Quandl metadata for one provider, but you're right, I have that Reddit data too. And that's just a beginning. What some have been doing for a while, I'm beginning to do now, and many others will be doing in the next few years - just as soon as they have finished having meetings about what to do... I don't suppose they'll be using Python, at least not for long.

I am sure it could benefit - I kind of need to get some other parts going first. (For once it truly is a case of Knuth's 97%.) But I'll be coming back to look at the best way to do this, for JSON but for text files more generally.

Have you thought about writing up your experience with writing fast json? A bit like Walter's Dr. Dobb's article on wielding a profiler to speed up dmd. And actually, if you have time, would you mind dropping me an email? laeeth at .... kaledicassociates.com

Thanks.

Laeeth.
Oct 21 2015
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/21/2015 3:40 PM, Laeeth Isharc wrote:
 Have you thought about writing up your experience with writing fast json? A bit
 like Walter's Dr. Dobb's article on wielding a profiler to speed up dmd.
Yes, Marco, please. This would make an awesome article, and we need articles like that! You've already got this: https://github.com/kostya/benchmarks/pull/46#issuecomment-147932489 so most of it is already written.
Oct 22 2015
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 22 Oct 2015 06:10:56 -0700, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/21/2015 3:40 PM, Laeeth Isharc wrote:
 Have you thought about writing up your experience with writing fast json? A bit
 like Walter's Dr. Dobb's article on wielding a profiler to speed up dmd.
Yes, Marco, please. This would make an awesome article, and we need articles like that! You've already got this: https://github.com/kostya/benchmarks/pull/46#issuecomment-147932489 so most of it is already written.
There is at least one hurdle. I don't have a place to publish articles, no personal blog or site I contribute articles to and I don't feel like creating a one-shot one right now. :) -- Marco
Oct 22 2015
next sibling parent bachmeier <no spam.net> writes:
On Thursday, 22 October 2015 at 20:54:01 UTC, Marco Leise wrote:
 On Thu, 22 Oct 2015 06:10:56 -0700, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/21/2015 3:40 PM, Laeeth Isharc wrote:
 Have you thought about writing up your experience with 
 writing fast json?  A bit like Walter's Dr. Dobb's article 
 on wielding a profiler to speed up dmd.
Yes, Marco, please. This would make an awesome article, and we need articles like that! You've already got this: https://github.com/kostya/benchmarks/pull/46#issuecomment-147932489 so most of it is already written.
There is at least one hurdle. I don't have a place to publish articles, no personal blog or site I contribute articles to and I don't feel like creating a one-shot one right now. :)
That's why we need an official D blog. Perhaps you could publish it in TWID.
Oct 22 2015
prev sibling next sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Thursday, 22 October 2015 at 20:54:01 UTC, Marco Leise wrote:
 On Thu, 22 Oct 2015 06:10:56 -0700, Walter Bright <newshound2 digitalmars.com> wrote:

 On 10/21/2015 3:40 PM, Laeeth Isharc wrote:
 Have you thought about writing up your experience with 
 writing fast json?  A bit like Walter's Dr. Dobb's article 
 on wielding a profiler to speed up dmd.
Yes, Marco, please. This would make an awesome article, and we need articles like that! You've already got this: https://github.com/kostya/benchmarks/pull/46#issuecomment-147932489 so most of it is already written.
There is at least one hurdle. I don't have a place to publish articles, no personal blog or site I contribute articles to and I don't feel like creating a one-shot one right now. :)
The main D forum is as good a place as any. Just start a thread there.
Oct 22 2015
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/22/2015 9:29 PM, Joakim wrote:
 The main D forum is as good a place as any.  Just start a thread there.
No, articles should be more than postings.
Oct 23 2015
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/22/2015 1:53 PM, Marco Leise wrote:
 There is at least one hurdle. I don't have a place to publish
 articles, no personal blog or site I contribute articles to
 and I don't feel like creating a one-shot one right now. :)
You can publish it on my site: http://digitalmars.com/articles/index.html But I highly recommend that you create your own web site. It's great for your professional career.
Oct 23 2015
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2015-10-22 22:53, Marco Leise wrote:

 There is at least one hurdle. I don't have a place to publish
 articles, no personal blog or site I contribute articles to
 and I don't feel like creating a one-shot one right now. :)
You could have a look at this blog implementation by Dicebot [1]. You still need to host it though. [1] https://github.com/Dicebot/mood -- /Jacob Carlborg
Oct 23 2015
parent Laeeth Isharc <laeethnospam nospamlaeeth.com> writes:
On Friday, 23 October 2015 at 19:48:31 UTC, Jacob Carlborg wrote:
 On 2015-10-22 22:53, Marco Leise wrote:

 There is at least one hurdle. I don't have a place to publish
 articles, no personal blog or site I contribute articles to
 and I don't feel like creating a one-shot one right now. :)
You could have a look at this blog implementation by Dicebot [1]. You still need to host it though. [1] https://github.com/Dicebot/mood
Mood is very nice, and I plan on using it in the medium term (made a pull request so it would compile using gdc or ldc). But you might want to wait a little while as you want a blog to be stable, and I think there is a problem with segfaulting right now - perhaps to do with the caching of posts, although it shouldn't be hard either to fix that or rewrite it your own way (as I started doing). It's worth setting one up though - what you use doesn't matter (look at Nikola or one of the other static site generators) - and Walter is right.
Oct 23 2015
prev sibling parent reply Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 10/21/2015 12:17 AM, Laeeth Isharc wrote:
 I already have JSON files of a couple of gig, and
 they're only going to be bigger over time,
Geez, if they're that big, is JSON really the best format to be using?
Oct 22 2015
parent Laeeth Isharc <Laeeth.nospam nospam-laeeth.com> writes:
On Thursday, 22 October 2015 at 17:35:48 UTC, Nick Sabalausky 
wrote:
 On 10/21/2015 12:17 AM, Laeeth Isharc wrote:
 I already have JSON files of a couple of gig, and
 they're only going to be bigger over time,
Geez, if they're that big, is JSON really the best format to be using?
Of course not. I don't much like JSON myself anyway. But I am not in control of the format it arrives in. Obviously I will eventually pick something better for the existing dump. But that two gig will be updated at least every week, and that's just one provider out of 20-plus (mostly smaller). I have 2 TB of Reddit JSON too. A one-off conversion job, but there will be others.
Oct 22 2015
prev sibling next sibling parent reply Suliman <evermind live.ru> writes:
Marco, could you submit your lib for review, or take whatever steps
would help get it included in Phobos? I think I'm not the only one
interested in a good base JSON lib in the standard distribution.
Oct 29 2015
parent Jack Applegame <japplegame gmail.com> writes:
On Thursday, 29 October 2015 at 12:11:54 UTC, Suliman wrote:
 Marco, could you submit your lib for review, or take whatever steps
 would help get it included in Phobos? I think I'm not the only one
 interested in a good base JSON lib in the standard distribution.
Marco's JSON library doesn't meet the requirements for inclusion in Phobos and should stay separate in the DUB registry. Phobos needs a much more generic library with support for streaming and ranges. I believe that at the moment the best candidate is std.data.json by Sönke Ludwig.
Oct 29 2015
prev sibling next sibling parent Suliman <evermind live.ru> writes:
What about data validation? Does 'fast' do complete validation of
the data? And what about the other parsers - do they do full
validation?
Nov 16 2015
prev sibling next sibling parent Mir Al Monsor <mirmonsor gmail.com> writes:
You could check it out on http://jsontuneup.com for a tree view of
your JSON object and to find what's wrong inside your JSON.
Apr 25 2017
prev sibling parent reply iris <iris.panabaker gmail.com> writes:
Any idea about the performance of this json parser? 
https://jsonformatter.org/json-parser ?
Jul 13 2018
parent Marco Leise <Marco.Leise gmx.de> writes:
On Fri, 13 Jul 2018 18:14:35 +0000, iris <iris.panabaker gmail.com> wrote:

 Any idea about the performance of this json parser? 
 https://jsonformatter.org/json-parser ?
That one is implemented in client-side JavaScript. I didn't measure it, but the closest match in Kostya's benchmark could be the Node.js entry, which is an order of magnitude slower. -- Marco
Jul 31 2018