From: Robert Haas on
On Fri, Jul 23, 2010 at 2:34 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> - elog() must be used except for "can't happen" situations. �Compare
> the way numeric_in() throws an error message versus json_in().

Er... that should have said "elog() must NOT be used except for can't
happen situations".

Also, I was just looking at json_delete(). While the existing coding
there is kind of fun, I think this can be written more
straightforwardly by doing something like this (not tested):

while (1)
{
while (is_internal(node) && node->v.children.head)
node = node->v.children.head;
if (node->next)
next = node->next;
else if (node->parent)
next = node->parent;
else
break;
free_node(node);
node = next;
}

That gets rid of all of the gotos and one of the local variables and,
at least IMO, is easier to understand... though it would be even
better still if you also added a comment saying something like "We do
a depth-first, left-to-right traversal of the tree, freeing nodes as
we go. We need not bother clearing any of the pointers, because the
traversal order is such that we're never in danger of revisiting a
node we've already freed."

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Fri, Jul 23, 2010 at 2:18 AM, Joseph Adams
<joeyadams3.14159(a)gmail.com> wrote:
> This is a work-in-progress patch of my GSoC project: Add JSON datatype
> to PostgreSQL. �It provides the following:
>
> �* JSON datatype: A TEXT-like datatype for holding JSON-formatted
> text. �Although the JSON RFC decrees that a JSON text be an "object or
> array" (meaning '"hello"' is considered invalid JSON text), this
> datatype lets you store any JSON "value" (meaning '"hello"'::JSON is
> allowed).
> �* Validation: Content is validated when a JSON datum is constructed,
> but JSON validation can also be done programmatically with the
> json_validate() function.
> �* Conversion to/from JSON for basic types. �Conversion functions are
> needed because casting will not unwrap JSON-encoded values. �For
> instance, json('"string"')::text is '"string"', while
> from_json('"string"') is 'string'. �Also, to_json can convert
> PostgreSQL arrays to JSON arrays, providing a nice option for dealing
> with arrays client-side. �from_json currently can't handle JSON
> arrays/objects yet (how they should act is rather unclear to me,
> except when array dimensions and element type are consistent).
> �* Retrieving/setting values in a JSON node (via selectors very
> similar to, but not 100% like, JSONPath as described at
> http://goessner.net/articles/JsonPath/ ).
> �* Miscellaneous functions json_condense and json_type.
>
> This is a patch against CVS HEAD. �This module compiles, installs, and
> passes all 8 tests successfully on my Ubuntu 9.10 system. �It is
> covered pretty decently with regression tests. �It also has SGML
> documentation (the generated HTML is attached for convenience).
>
> Although I am aware of many problems in this patch, I'd like to put it
> out sooner rather than later so it can get plenty of peer review.
> Problems I'm aware of include:
> �* Probably won't work properly when the encoding (client or server?)
> is not UTF-8. �When encoding (e.g. with json_condense), it should (but
> doesn't) use \uXXXX escapes for characters the target encoding doesn't
> support.
> �* json.c is rather autarkic. �It has its own string buffer system
> (rather than using StringInfo) and UTF-8 validator (rather than using
> pg_verify_mbstr_len(?) ).
> �* Some functions/structures are named suggestively, as if they belong
> to (and would be nice to have in) PostgreSQL's utility libraries.
> They are:
> � - TypeInfo, initTypeInfo, and getTypeInfo: A less cumbersome
> wrapper around get_type_io_data.
> � - FN_EXTRA and FN_EXTRA_SZ: Macros to make working with
> fcinfo->flinfo->fn_extra easier.
> � - enumLabelToOid: Look up the Oid of an enum label; needed to
> return an enum that isn't built-in.
> � - utf8_substring: Extract a range of UTF-8 characters out of a UTF-8 string.
> �* Capitalization and function arrangement are rather inconsistent.
> Braces are K&R-style.
> �* json_cleanup and company aren't even used.
> �* The sql/json.sql test case should be broken into more files.
>
> P.S. The patch is gzipped because it expands to 2.6 megabytes.

Some technical points about the submission:

- If you want your coded to be included in PostgreSQL, you need to put
the same license and attribution on it that we use elsewhere.
Generally that looks sorta like this:

/*-------------------------------------------------------------------------
*
* tablecmds.c
* Commands for creating and altering table structures and settings
*
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
* IDENTIFICATION
* $PostgreSQL$
*
*-------------------------------------------------------------------------
*/

You should have this header on each file, both .c and .h.

- The reason why this patch is 2.6MB is because it has 2.4MB of tests.
I think you should probably pick out the most useful 10% or so, and
drop the rest.

- I was under the impression that we wanted EXPLAIN (FORMAT JSON) to
return type json, but that's obviously not going to be possible if all
of this is contrib. We could (a) move it all into core, (b) move the
type itself and its input and output functions into core and leave the
rest in contrib [or something along those lines], or (c) give up using
it as the return type for EXPLAIN (FORMAT JSON).

- You need to comply with the project coding standards. Thus:

static void
foo()
{
exit(1);
}

Not:

static void foo()
{
exit(1);
}

You should have at least one blank line between every pair of
functions. You should use uncuddled curly braces. Your commenting
style doesn't match the project standard. We prefer explicit NULL
tests over if (!foo). Basically, you should run pgindent on your
code, and also read this:

http://www.postgresql.org/docs/8.4/static/source.html

I don't think this is going to fly either:

/* Declare and initialize a String with the given name. */
#define String(name) String name = NewString()
#define NewString() {{NULL, 0, 0}}

That's just too much magic. We want people to be able to read this
code and easily tell what's going on... spelling a few things out
long hand is OK, good, even.

- You have boatloads of functions in here with no comments whose
functions is entirely non-self-evident. You may or may not need to
rename some of them, but you definitely need to write some comments.

- elog() must be used except for "can't happen" situations. Compare
the way numeric_in() throws an error message versus json_in().

- You have a number of very short convenience functions which don't
seem warranted. For example, build_array_string() is a 2-line
function that is called once. And all the string_append,
string_append_range, string_append_char stuff is duplicating
functionality we already have in appendStringInfoStr,
appendBinaryStringInfo, appendStringInfoChar.

In short, my first impression of this patch is that there's a lot of
good stuff in here, but you need to stop adding features NOW and put A
LOT of work into getting this into a form that the community can
accept. Otherwise, this is likely going to die on the vine, which
would be a shame because a lot of it seems pretty cool.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Joseph Adams on
Update: I'm in the middle of cleaning up the JSON code (
http://git.postgresql.org/gitweb?p=json-datatype.git;a=summary if you
want to see the very latest ), so I haven't addressed all of the major
problems with it yet.

On Fri, Jul 23, 2010 at 2:34 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> - I was under the impression that we wanted EXPLAIN (FORMAT JSON) to
> return type json, but that's obviously not going to be possible if all
> of this is contrib. �We could (a) move it all into core, (b) move the
> type itself and its input and output functions into core and leave the
> rest in contrib [or something along those lines], or (c) give up using
> it as the return type for EXPLAIN (FORMAT JSON).

I've been developing it as a contrib module because:
* I'd imagine it's easier than developing it as a built-in datatype
right away (e.g. editing a .sql.in file versus editing pg_type.h ).
* As a module, it has PGXS support, so people can try it out right
away rather than having to recompile PostgreSQL.

I, for one, think it would be great if the JSON datatype were all in
core :-) However, if and how much JSON code should go into core is up
for discussion. Thoughts, anyone?

A particularly useful aspect of the JSON support is the ability to
convert PostgreSQL arrays to JSON arrays (using to_json ), as there
currently isn't any streamlined way to parse arrays in the PostgreSQL
format client-side (that I know of).


Joey Adams

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andres Freund on
On Sat, Jul 24, 2010 at 06:57:18PM -0400, Joseph Adams wrote:
> A particularly useful aspect of the JSON support is the ability to
> convert PostgreSQL arrays to JSON arrays (using to_json ), as there
> currently isn't any streamlined way to parse arrays in the PostgreSQL
> format client-side (that I know of).
I really would like to address the latter issue some day. I don't know
how many broken implementations I have seen in my, not that long, time
using pg, but it sure is 10+. I also knowingly have written dumbed
down versions.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Massa, Harald Armin" on
> I, for one, think it would be great if the JSON datatype were all in
> core :-)  However, if and how much JSON code should go into core >is up for discussion.  Thoughts, anyone?
>
in my opinion: As soon as possible. Spinning PostgreSQL as the
Ajax-enabled-database has many great uses.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
Using PostgreSQL is mostly about sleeping well at night.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers