From: Craig Ringer on
On 21/05/10 23:55, Josh Berkus wrote:
> So, here's a working definition:
>
> 1) cannot directly read or write files on the server.

It must also prevent PL-user-level access to file descriptors already
open by the backend. That's implicitly covered in the above, but should
probably be explicit.

> 2) cannot bind network ports
> 3) uses only the SPI interface to interact with postgresql tables etc.
> 4) does any logging only using elog to the postgres log

5) Cannot dynamically load shared libraries from user-supplied locations

(e.g. in Python, 'import' of a module that had a .so component would be
blocked unless it was in the core module path)
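
A minimal sketch of the check described in 5), assuming the PL handler
can intercept the path of a shared library before loading it. The names
CORE_MODULE_DIRS and is_core_library are hypothetical and the directory
is only a placeholder; nothing here is taken from an existing PL.

```python
import os

# Hypothetical list of trusted directories. A real PL would derive this
# from the interpreter's own installation, never from user input.
CORE_MODULE_DIRS = ["/usr/lib/python3/lib-dynload"]


def is_core_library(so_path, core_dirs=CORE_MODULE_DIRS):
    """Return True only if so_path resolves to a location inside a core dir."""
    # Resolve symlinks and ".." first so the comparison uses the real target.
    real = os.path.realpath(so_path)
    for core in core_dirs:
        core_real = os.path.realpath(core)
        # commonpath equals core_real only when real lies at or below it,
        # so a sibling such as "/opt/pl/core_evil" does not match "/opt/pl/core".
        if os.path.commonpath([real, core_real]) == core_real:
            return True
    return False
```

Resolving the path before comparing matters: a plain string-prefix test
would accept both "../" traversal and look-alike directory names.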

> a) it seems like there should be some kind of restriction on access to
> memory, but I'm not clear on how that would be defined.

Like:

6) Has no way to directly access backend memory, i.e. doesn't have
PL-user-accessible pointers or user access to any C-level calls that
take/return them. Data structures containing pointers must be opaque to
the PL user.

The idea being that if you have no access to C APIs that work with
pointers to memory, and you can't use files (/dev/mem, /proc/self/mem,
etc), you can't work with backend memory directly.

--
Craig Ringer

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jan Wieck on
On 5/23/2010 10:04 PM, Andrew Dunstan wrote:
>
> Jan Wieck wrote:
>> On 5/23/2010 6:14 PM, Ron Mayer wrote:
>>> Tom Lane wrote:
>>>> Robert Haas <robertmhaas(a)gmail.com> writes:
>>>>> So... can we get back to coming up with a reasonable
>>>>> definition,
>>>>
>>>> (1) no access to system calls (including file and network I/O)
>>>
>>> If a PL has file access to its own sandbox (similar to what
>>> flash seems to do in web browsers), could that be considered
>>> trusted?
>>
>> That is a good question.
>>
>> Currently, the first of all TRUSTED languages, PL/Tcl, would allow the
>> function of a lesser privileged user to access the "global" objects
>> every other database user created within the same session.
>>
>> These are per-backend in-memory objects, but nonetheless, an evil
>> function could just scan the per backend Tcl namespace and look for
>> compromising data, and that's not exactly what TRUSTED is all about.
>>
>> In the case of Tcl it is possible to create a separate "safe"
>> interpreter per DB role to fix this. I actually think this would be
>> the right thing to do.
>>
>
> I think that would probably be serious overkill. Maybe a data stash per
> role rather than an interpreter per role would be doable. It would
> certainly be more lightweight.
>
> ISTM we are in danger of confusing several different things. A user that
> doesn't want data to be shared should not stash it in global objects.
> But to me, trusting a language is not about making data private, but
> about not allowing the user to do things that are dangerous, such as
> referencing memory, or the file system, or the operating system, or
> network connections, or loading code which might do any of those things.

How is "loading code which might do any of those things" different from
writing a stored procedure that accesses data a careless "superuser"
left in a global variable? Remember, the code of a PL function is "open"
source, as in "everyone can select from pg_proc". You really don't
expect anyone to scan for your global variables just because they can
write functions in the same language?


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Craig Ringer on
On 22/05/10 02:12, Robert Haas wrote:
> On Fri, May 21, 2010 at 1:58 PM, David Fetter<david(a)fetter.org> wrote:
>> On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote:
>>> * David Fetter (david(a)fetter.org) wrote:
>>>> That is *precisely* the business we need to be in, at least for the
>>>> languages we ship, and it would behoove us to test languages we don't
>>>> ship so we can warn people when they don't pass.
>>>
>>> k, let's start with something simpler first tho- I'm sure we can pull in
>>> the glibc regression tests and run them too. You know, just in case
>>> there's a bug there, somewhere.
>>
>> That's a pretty pure straw man argument. I expect much higher quality
>> trolling. D-.
>
> I'm sorely tempted to try to provide some higher-quality trolling, but
> in all seriousness I think that (1) we could certainly use much better
> regression tests in many areas of which this is one and (2) it will
> never be possible to catch all security bugs - in particular - via
> regression testing because they typically stem from cases people
> didn't consider. So... can we get back to coming up with a reasonable
> definition, and if somebody wants to write some regression tests, all
> the better?

Personally, I don't think a PL should be trusted unless it _does_ define
a whitelist of operations. Experience in the wider world has shown that
this is the only approach that works. Regression testing to make sure
all possible approaches to access unsafe features are blocked is doomed
to have holes where there's another approach that hasn't been thought of
yet.

Perl's new approach is whitelist based. Python restricted mode failed
not least because it was a blacklist and people kept on finding ways
around it. Lua and JavaScript are great examples of whitelist
approaches, where the language just doesn't expose features that are
dangerous - in fact, the core language doesn't even *have* those
features. PL/PgSQL is the same, and works well as a trusted language for
that reason.

Java's SecurityManager is whitelist based (allowed classes, allowed
operations), and has proved very secure.
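
To illustrate the whitelist idea in miniature, here is a sketch of a
default-deny evaluator for integer arithmetic. It is only an analogy for
how a trusted PL might admit operations; eval_whitelisted and
ALLOWED_BINOPS are hypothetical names, and this is not how any of the
languages mentioned above is implemented.

```python
import ast
import operator

# Whitelist: only these operator node types are ever evaluated.
ALLOWED_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def eval_whitelisted(source):
    """Evaluate integer arithmetic; reject anything not explicitly listed."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept plain int literals only (bool, str, float are rejected).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINOPS:
            func = ALLOWED_BINOPS[type(node.op)]
            return func(walk(node.left), walk(node.right))
        # Default-deny: names, calls, attribute access and everything else.
        raise ValueError("operation not on whitelist: " + type(node).__name__)

    return walk(ast.parse(source, mode="eval"))
```

The point is the final raise: a construct nobody thought about is
rejected by default, whereas a blacklist would let it through until
someone noticed.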

--
Craig Ringer


From: Andrew Dunstan on


Jan Wieck wrote:
>>
>> ISTM we are in danger of confusing several different things. A user
>> that doesn't want data to be shared should not stash it in global
>> objects. But to me, trusting a language is not about making data
>> private, but about not allowing the user to do things that are
>> dangerous, such as referencing memory, or the file system, or the
>> operating system, or network connections, or loading code which might
>> do any of those things.
>
> How is "loading code which might do any of those things" different
> from writing a stored procedure that accesses data a careless
> "superuser" left in a global variable? Remember, the code of a PL
> function is "open" source, as in "everyone can select from
> pg_proc". You really don't expect anyone to scan for your global
> variables just because they can write functions in the same language?
>

Well, that threat arises from the unsafe actions of the careless
superuser. And we could at least ameliorate it by providing a per-role
data stash, at very little cost, as I mentioned. It's not like we don't
know about such threats, and I'm certainly not pretending they don't
exist. The 9.0 PL/Perl docs say:

The %_SHARED variable and other global state within the language is
public data, available to all PL/Perl functions within a session.
Use with care, especially in situations that involve use of multiple
roles or SECURITY DEFINER functions.
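
A rough sketch of what a per-role data stash could look like, in
contrast to a single shared variable. RoleStash and for_role are
hypothetical names, and the role strings are made up. The sketch only
shows the partitioning; actual isolation would depend on the PL handler
giving each function nothing but the dict for its own effective role.

```python
from collections import defaultdict


class RoleStash:
    """Session-level storage partitioned by role name (hypothetical)."""

    def __init__(self):
        # One independent dict per role, created on first use.
        self._by_role = defaultdict(dict)

    def for_role(self, role):
        # The handler would call this with the current effective role and
        # expose only the returned dict to the function body.
        return self._by_role[role]


stash = RoleStash()
stash.for_role("admin")["token"] = "secret"
print("token" in stash.for_role("app_user"))  # False
```

This is the lightweight option: one interpreter, with only the shared
data keyed by role.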


But the threats I was referring to arise if the language allows them to,
without any requirement for unsafe actions by another user. Protecting
against those is the essence of trustedness in my mind at least.

cheers

andrew


From: Jan Wieck on
On 5/23/2010 11:19 PM, Andrew Dunstan wrote:
>
> Jan Wieck wrote:
>>>
>>> ISTM we are in danger of confusing several different things. A user
>>> that doesn't want data to be shared should not stash it in global
>>> objects. But to me, trusting a language is not about making data
>>> private, but about not allowing the user to do things that are
>>> dangerous, such as referencing memory, or the file system, or the
>>> operating system, or network connections, or loading code which might
>>> do any of those things.
>>
>> How is "loading code which might do any of those things" different
>> from writing a stored procedure that accesses data a careless
>> "superuser" left in a global variable? Remember, the code of a PL
>> function is "open" source, as in "everyone can select from
>> pg_proc". You really don't expect anyone to scan for your global
>> variables just because they can write functions in the same language?
>>
>
> Well, that threat arises from the unsafe actions of the careless
> superuser. And we could at least ameliorate it by providing a per-role
> data stash, at very little cost, as I mentioned. It's not like we don't
> know about such threats, and I'm certainly not pretending they don't
> exist. The 9.0 PL/Perl docs say:
>
> The %_SHARED variable and other global state within the language is
> public data, available to all PL/Perl functions within a session.
> Use with care, especially in situations that involve use of multiple
> roles or SECURITY DEFINER functions.
>
>
> But the threats I was referring to arise if the language allows them to,
> without any requirement for unsafe actions by another user. Protecting
> against those is the essence of trustedness in my mind at least.

I can agree with that.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
