From: Joseph Adams on 5 Apr 2010 23:50

Another JSON strictness issue: the JSON standard ( http://www.ietf.org/rfc/rfc4627.txt ) states that a JSON text can only be an array or object. However, my implementation currently accepts any valid value. Thus, '3', '"hello"', 'true', 'false', and 'null' are all accepted by my implementation, but are not strictly JSON text. The question is: should the JSON datatype accept atomic values (those that aren't arrays or objects) as valid JSON?

I tried a few other JSON implementations to see where they stand regarding atomic types as input:

JSON_checker (C) does not accept them.
JSON.parse() (JavaScript) accepts them.
json_decode() (PHP) accepts them. However, support is currently buggy (e.g. '1' is accepted, but '1 ' is not).
cJSON (C) accepts them.
JSON.pm (Perl) accepts them if you specify the allow_nonref option. Otherwise, it accepts 'true' and 'false', but not 'null', a number, or a string by itself.

In my opinion, we should accept an atomic value as valid JSON content. I suppose we could get away with calling it a "content" fragment, as is done with XML without a doctype.

Accepting atomic values as valid JSON would also be more orthogonal, as it would make a function like this possible:

json_values(object_or_array JSON) RETURNS SETOF JSON
-- extracts values from an object or members from an array, returning them as JSON fragments.

Also, should we go even further and accept key:value pairs by themselves?

'"key":"value"'::JSON

I don't think we should, because doing so would be rather zany. It would mean JSON content could be invalid in value context, as in:

// JavaScript
var content = "key" : "value";

I improved my JSON library. It now only accepts strict, UTF-8 encoded JSON values (that is, objects, arrays, strings, numbers, true, false, and null). It also has a json_decode_liberal() function that accepts a string, cleans it up, and passes it through the stricter json_decode().
json_decode_liberal() filters out comments, allows single-quoted strings, and accepts a lax number format compared to strict JSON. I may add Unicode repair to it later on, but implementing that well really depends on what types of Unicode errors appear in real life, I think.

http://constellationmedia.com/~funsite/static/json-0.0.2.tar.bz2

My json.c is now 1161 lines long, so I can't quite call it "small" anymore.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
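To make the strictness question above concrete, here is a minimal C sketch (not taken from Joseph's json.c; the function name is illustrative) of the distinction being discussed: RFC 4627 "JSON text" must begin with an object or array, while an atomic value begins with anything else. A real parser would of course still validate the whole input; this only classifies the top level.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if the text could be strict RFC 4627
 * "JSON text" (top-level object or array), 0 if it starts like an
 * atomic value (string, number, true, false, or null). Only the first
 * significant character is inspected; full validation is left to the
 * actual parser. */
static int
json_is_strict_toplevel(const char *s)
{
    while (*s && isspace((unsigned char) *s))
        s++;
    return *s == '{' || *s == '[';
}
```

Under this classification, all five of Joseph's examples ('3', '"hello"', 'true', 'false', 'null') are atomic fragments, while '{"key":"value"}' and '[1,2]' are strict JSON texts.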
From: Robert Haas on 6 Apr 2010 00:45

On Mon, Apr 5, 2010 at 11:50 PM, Joseph Adams <joeyadams3.14159(a)gmail.com> wrote:
> In my opinion, we should accept an atomic value as valid JSON content.

That seems right to me.

> Also, should we go even further and accept key:value pairs by themselves?
>
> '"key":"value"'::JSON

Definitely not.

...Robert
From: Petr Jelinek on 6 Apr 2010 01:00

On 6.4.2010 5:50, Joseph Adams wrote:
> Another JSON strictness issue: the JSON standard (
> http://www.ietf.org/rfc/rfc4627.txt ) states that JSON text can only
> be an array or object. However, my implementation currently accepts
> any valid value. Thus, '3', '"hello"', 'true', 'false', and 'null'
> are all accepted by my implementation, but are not strictly JSON text.
> The question is: should the JSON datatype accept atomic values (those
> that aren't arrays or objects) as valid JSON?

Not really sure about this myself, but keep in mind that NULL has a special meaning in SQL.

> Also, should we go even further and accept key:value pairs by themselves?
>
> '"key":"value"'::JSON

No, especially considering that '{"key":"value"}' is a valid JSON value.

> I improved my JSON library. It now only accepts strict, UTF-8 encoded
> JSON values (that is, objects, arrays, strings, numbers, true, false,
> and null).

Just a note, but PostgreSQL has some UTF-8 validation code; you might want to look at it, at least once you start the actual integration into core, so that you are not reinventing too many wheels. I can see how your own code is a good thing for the general library this can (and I am sure will) be used as, but for the datatype itself, it might be a better idea to use what's already there, unless it's somehow incompatible, of course.

--
Regards
Petr Jelinek (PJMODOS)
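For readers following along, this is roughly the kind of wheel Petr suggests not reinventing (PostgreSQL keeps its own encoding validation in the backend's multibyte support code). A minimal, illustrative UTF-8 well-formedness check might look like the sketch below; note it deliberately skips the harder checks a production validator also needs, such as rejecting overlong encodings and surrogate code points.

```c
#include <stddef.h>

/* Illustrative only: checks UTF-8 sequence structure. Rejects bad lead
 * bytes, truncated sequences, and bare continuation bytes; it does NOT
 * reject overlong encodings or UTF-16 surrogates, which a real
 * validator (like PostgreSQL's) must also handle. */
static int
utf8_is_wellformed(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t      n;

        if (c < 0x80)
            n = 1;                      /* ASCII */
        else if ((c & 0xE0) == 0xC0)
            n = 2;                      /* 2-byte sequence */
        else if ((c & 0xF0) == 0xE0)
            n = 3;                      /* 3-byte sequence */
        else if ((c & 0xF8) == 0xF0)
            n = 4;                      /* 4-byte sequence */
        else
            return 0;                   /* bad lead byte */

        if (i + n > len)
            return 0;                   /* truncated sequence */

        for (size_t j = 1; j < n; j++)
            if ((s[i + j] & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */

        i += n;
    }
    return 1;
}
```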
From: Petr Jelinek on 6 Apr 2010 02:15

On 6.4.2010 7:57, Joseph Adams wrote:
> On Tue, Apr 6, 2010 at 1:00 AM, Petr Jelinek <pjmodos(a)pjmodos.net> wrote:
>> Not really sure about this myself, but keep in mind that NULL has special
>> meaning in SQL.
>
> To me, the most logical approach is to do the obvious thing: make
> JSON's 'null' be SQL's NULL. For instance, SELECTing on a table with
> NULLs in it and converting the result set to JSON would yield a
> structure with 'null's in it. 'null'::JSON would yield NULL. I'm not
> sure what startling results would come of this approach, but I'm
> guessing this would be most intuitive and useful.

+1

>> Just a note, but PostgreSQL has some UTF-8 validation code, you might want
>> to look at it maybe, at least once you start the actual integration into
>> core, so that you are not reinventing too many wheels. I can see how your
>> own code is good thing for general library which this can (and I am sure
>> will be) used as, but for the datatype itself, it might be better idea to
>> use what's already there, unless it's somehow incompatible of course.
>
> Indeed. My plan is to first get a strong standalone JSON library
> written and tested so it can be used as a general-purpose library. As
> the JSON code is merged into PostgreSQL, it can be adapted. Part of
> this adaptation would most likely be removing the UTF-8 validation
> function I wrote and using PostgreSQL's Unicode support code instead.
>
> There are probably other bits that could be PostgreSQLified as well.
> I wonder if I should consider leveraging PostgreSQL's regex support or
> if it would be a bad fit/waste of time/slower/not worth it.

Regex? What for? You certainly don't need it for parsing; you have a good parser IMHO, and regex would probably be all of the above.
--
Regards
Petr Jelinek (PJMODOS)
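The 'null'::JSON behavior Joseph proposes can be sketched in C. This is a hypothetical illustration, not code from the patch: the input function inspects the (trimmed) text, and when it is exactly the JSON literal null, flags the result as SQL NULL instead of storing a JSON value.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result of a datatype input function: either SQL NULL,
 * or the JSON text itself. Names are illustrative only. */
typedef struct
{
    int         is_null;    /* nonzero => SQL NULL */
    const char *text;       /* otherwise, the JSON text */
} json_datum;

/* Sketch of the proposed mapping: the bare JSON literal 'null' maps to
 * SQL NULL; anything else (including "null" the string, or [null])
 * stays a JSON value. */
static json_datum
json_in(const char *s)
{
    json_datum  d;
    const char *p = s;
    const char *end;

    while (*p && isspace((unsigned char) *p))
        p++;
    end = p + strlen(p);
    while (end > p && isspace((unsigned char) end[-1]))
        end--;

    d.is_null = (end - p == 4 && strncmp(p, "null", 4) == 0);
    d.text = s;
    return d;
}
```

Note that under this mapping only the top-level literal becomes SQL NULL; 'null's nested inside arrays and objects remain ordinary JSON, which matches Joseph's result-set example.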
From: Mike Rylander on 6 Apr 2010 10:53
On Mon, Apr 5, 2010 at 11:50 PM, Joseph Adams <joeyadams3.14159(a)gmail.com> wrote:
> Another JSON strictness issue: the JSON standard (
> http://www.ietf.org/rfc/rfc4627.txt ) states that JSON text can only
> be an array or object. However, my implementation currently accepts
> any valid value. Thus, '3', '"hello"', 'true', 'false', and 'null'
> are all accepted by my implementation, but are not strictly JSON text.
> The question is: should the JSON datatype accept atomic values (those
> that aren't arrays or objects) as valid JSON?
>
> I tried a few other JSON implementations to see where they stand
> regarding atomic types as input:
>
> JSON_checker (C) does not accept them.
> JSON.parse() (JavaScript) accepts them.
> json_decode() (PHP) accepts them. However, support is currently buggy
> (e.g. '1' is accepted, but '1 ' is not).
> cJSON (C) accepts them.
> JSON.pm (Perl) accepts them if you specify the allow_nonref option.
> Otherwise, it accepts 'true' and 'false', but not 'null', a number, or
> a string by itself.
>
> In my opinion, we should accept an atomic value as valid JSON content.
> I suppose we could get away with calling it a "content" fragment as
> is done with XML without a doctype.
>
> Accepting atomic values as valid JSON would be more orthogonal, as it
> would be possible to have a function like this:
>
> json_values(object_or_array JSON) RETURNS SETOF JSON
> -- extracts values from an object or members from an array, returning
> them as JSON fragments.

For these reasons, and the fact that my project uses atomic values ;), I think yes, we should support them. IIUC, the reason for requiring an array or object is that the O part of JSON means "some sort of a collection of atomic values". But in ECMAScript (JavaScript), instances of strings, numbers, bools, and null are, indeed, objects. IOW, I think JSON is using a faulty definition of "object" in the spec. It's the one part of the spec that doesn't make sense to me at all.
> Also, should we go even further and accept key:value pairs by themselves?
>
> '"key":"value"'::JSON

This, though, is probably a step too far. It violates the JS part of JSON...

> I don't think we should because doing so would be rather zany. It
> would mean JSON content could be invalid in value context, as in:
>
> // JavaScript
> var content = "key" : "value";

Right. Thanks, Joseph. I think this will be a great addition!

--
Mike Rylander
| VP, Research and Design
| Equinox Software, Inc. / The Evergreen Experts
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker(a)esilibrary.com
| web: http://www.esilibrary.com