exporting raw parser [PgSql]


From: Takahiro Itagaki on 26 May 2010 22:00

Tatsuo Ishii <ishii(a)postgresql.org> wrote:

> I'm thinking about exporting the raw parser and related modules as a C
> library. Though this will not be an immediate benefit to PostgreSQL
> itself, it will be a huge benefit for any PostgreSQL
> applications/middleware that need to parse SQL statements.

As I read it, your proposal means that "postgres.exe" will link to
"libSQL.dll", and "pgpool.exe" will also link to the DLL, right?

I think it is reasonable, but I'm not sure which parts of postgres
should be in the DLL. Obviously we should avoid code duplication
between the DLL and "postgres.exe".

> - create an exportable version of the memory manager
> - create exportable exception handling routines (i.e. elog)

Are there any other issues? For example,
- How do we split the headers for raw parser nodes?
- In which module do we define the T_xxx enumerations and their support
functions (outfuncs, readfuncs, copyfuncs, and equalfuncs)?
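To illustrate why the node tags and their support functions are hard to split across modules, here is a minimal Python sketch. It is not PostgreSQL code: the two tags, the simplified node layouts, and the helper names copy_node and equal_node are hypothetical stand-ins for the T_xxx enumeration and for the roles that copyfuncs and equalfuncs play.

```python
import copy
from dataclasses import dataclass, field
from enum import Enum, auto


class NodeTag(Enum):
    # Hypothetical stand-ins for T_xxx values, not the real list.
    T_SelectStmt = auto()
    T_ColumnRef = auto()


@dataclass
class ColumnRef:
    name: str
    tag: NodeTag = NodeTag.T_ColumnRef


@dataclass
class SelectStmt:
    targets: list = field(default_factory=list)
    tag: NodeTag = NodeTag.T_SelectStmt


def copy_node(node):
    """Deep-copy a tree (the role copyfuncs plays in this sketch)."""
    return copy.deepcopy(node)


def equal_node(a, b):
    """Structural equality (the role equalfuncs plays in this sketch).

    Dataclass equality compares all fields, including nested lists.
    """
    return a == b


tree = SelectStmt(targets=[ColumnRef("a"), ColumnRef("b")])
clone = copy_node(tree)
```

Every support function has to know every node type, so in this sketch the tag list, the node definitions, and the helpers naturally live together. That is the coupling the questions above are pointing at.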

The proposal will be acceptable only when all of the technical issues
are solved. libSQL should also be usable stand-alone.
It should not be a collection of half-baked functions.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tatsuo Ishii on 26 May 2010 23:01

> By "stand-alone" I mean that libSQL can be used from many modules
> without duplicated code. For example, copy routines for raw
> parse trees should be in the DLL rather than in postgres.exe.
>
> Then, we need to consider products other than pgpool. Who will
> use the DLL? If pgpool is the only user, we might not be allowed to
> modify core code for only one use case. More research beyond
> pgpool is required to decide the interface routines for libSQL.

If pgpool-II were the only user of the new API, I would not have made
the proposal in the first place. It would be a waste of time and I would
rather keep on borrowing the parser code. I thought there were several
people at the cluster meeting who needed the API as well. If somebody
who voted that way in the meeting is on the list, please express your
opinion on the API.

I'm not in a position to speak for other products.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Jan Wieck on 1 Jun 2010 12:27

On 5/26/2010 10:16 PM, Tatsuo Ishii wrote:
>> As was already discussed, I don't believe that premise. None of the
>> applications you cite would be able to make use of the raw parser
>> output, because it doesn't contain the semantic information they need.
>> If what you actually meant was the analyzed parse tree, that *might*
>> serve the need depending on just what is wanted (in particular,
>> properties that could be affected by the expansion of views or
>> inlineable functions could still not be determined reliably).
>> But you can't have that without access to the current system catalog
>> contents.
>
> No, what pgpool-II needs is a raw parse tree. When it needs info in the
> system catalog, it sends SELECT to PostgreSQL. So that would be no
> problem.

But doesn't it need that parse tree BEFORE it makes the decision about
which node to execute the query on?

The parser needs the system catalog in order to create a parse tree.
Where would that stand-alone library version of the parser get the
catalog information from? Don't you need to know which user-defined
functions in the query are volatile?
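The two positions in this exchange can be made concrete with a minimal Python sketch, under stated assumptions. It assumes the function names have already been pulled out of a raw parse tree, and that volatility is then fetched by a separate catalog lookup. The provolatile column of pg_proc and its codes ('i', 's', 'v') are real; choose_target, fake_lookup, the routing labels "primary" and "any", and the stub catalog are hypothetical and do not describe how pgpool-II actually works.

```python
# What a real lookup might send to the server. proname is not unique
# because functions can be overloaded, so this is a simplification.
CATALOG_QUERY = "SELECT provolatile FROM pg_proc WHERE proname = %s"

# Stub standing in for the live catalog: 'i' immutable, 's' stable,
# 'v' volatile (the codes pg_proc.provolatile uses).
FAKE_CATALOG = {"lower": "i", "now": "s", "random": "v"}


def fake_lookup(name):
    """Return the volatility code; unknown functions count as volatile."""
    return FAKE_CATALOG.get(name, "v")


def choose_target(function_names, lookup_volatility):
    """Route to 'primary' if any called function is volatile, else 'any'."""
    for name in function_names:
        if lookup_volatility(name) == "v":
            return "primary"
    return "any"


print(choose_target(["lower", "now"], fake_lookup))
print(choose_target(["random"], fake_lookup))
```

The sketch shows the ordering concern: the routing decision cannot be made from the raw tree alone, so at least one catalog round trip (or a cached copy of that information) has to happen before the query is dispatched.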

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Daniel Farina on 7 Jun 2010 03:17

On Wed, May 26, 2010 at 6:02 PM, Tatsuo Ishii <ishii(a)postgresql.org> wrote:
> I'm thinking about exporting the raw parser and related modules as a C
> library. Though this will not be an immediate benefit to PostgreSQL
> itself, it will be a huge benefit for any PostgreSQL
> applications/middleware that need to parse SQL statements.

In the past I and people I have known/worked with have made strategic
use of UDFs running on a live server that return the parse tree,
semantically analyzed tree, and planned tree (I think) outNode textual
representation for various projects, and found them highly useful.
Syntactic, semantic, and operational meaning of a query was useful for
our projects.

Some of this code was linked with the server, and so reading the node
using Postgres' parser was easy. Otherwise, a small parser needed to be
written for external projects. Perhaps a slightly more ideal state of
affairs would be:

* These hooks to acquire the syntactic/semantic/planned trees would be
bundled "for free"
* When writing code not linked against the server, a more common
serialization format, ala JSON or whatnot
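As a rough illustration of the second bullet, here is a minimal Python sketch of dumping a node tree to JSON so that external code does not need its own parser for the outNode text format. The node classes are heavily simplified stand-ins (the names SelectStmt, ColumnRef, and RangeVar are borrowed from PostgreSQL, the field layouts are not), and to_plain and tree_to_json are hypothetical helpers, not an existing API.

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class RangeVar:
    relname: str


@dataclass
class ColumnRef:
    name: str


@dataclass
class SelectStmt:
    target_list: list = field(default_factory=list)
    from_clause: list = field(default_factory=list)


def to_plain(node):
    """Convert a node tree into dicts and lists that json can encode."""
    if is_dataclass(node):
        out = {"node": type(node).__name__}  # keep the node type in the output
        for f in fields(node):
            out[f.name] = to_plain(getattr(node, f.name))
        return out
    if isinstance(node, list):
        return [to_plain(n) for n in node]
    return node  # scalars pass through unchanged


def tree_to_json(node):
    return json.dumps(to_plain(node), sort_keys=True)


tree = SelectStmt(target_list=[ColumnRef("a")], from_clause=[RangeVar("t")])
print(tree_to_json(tree))
```

Any client with a JSON library could then walk the result by looking at the "node" key, which is the convenience being asked for here.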

A more ambitious project that I don't think is in the scope of any
initial implementation would be to allow for cross referencing of
these compilation passes, similar to how GNU Bison allows you to
interrogate for the position of a lexeme when reporting errors. In my
experience, code that mangles one layer (say, semantic, or harder yet,
plan) has a hard time producing the best error message because getting
from a node at the "bottom" to the right lexeme(s) at the "top" is
very cumbersome. One could imagine this being useful for other
purposes too, but that is how I felt it firsthand. Feels a lot harder,
though.

fdr


From: Dimitri Fontaine on 7 Jun 2010 04:14

Daniel Farina <drfarina(a)acm.org> writes:
> Some of this code was linked with the server, and so reading the node
> using Postgres' parser was easy. Otherwise, a small parser needed to be
> written for external projects. Perhaps a slightly more ideal state of
> affairs would be:
>
> * These hooks to acquire the syntactic/semantic/planned trees would be
> bundled "for free"
> * When writing code not linked against the server, a more common
> serialization format, ala JSON or whatnot

Access to that data has been discussed with respect to DDL
triggers too. You want to be able to know exactly what is being
executed, and against what objects.

And you want to be able to abuse this information from either a C-coded
server function or a PLpgSQL trigger. I guess the WIP JSON datatype
would help a lot even when working from within the server, as that does
not mean working in C.

Regards,
--
dim

