[PATCH] Fix leaky VIEWs for RLS [PgSql]

From: Stephen Frost on 7 Jun 2010 07:06

Heikki,

* Heikki Linnakangas (heikki.linnakangas(a)enterprisedb.com) wrote:
> The big difference is what information can be obtained, not how fast it
> can be obtained.

Actually, I disagree. Time required to acquire the data does matter.

> Imagine a table that holds username/passwords for users. Each user is
> allowed to see his own row, including password, but not anyone else's.
> EXPLAIN side-channel might give pretty accurate information of how many
> rows there are in the table, and via clever EXPLAIN+statistics probing
> you might be able to find out what the top-10 passwords are, for
> example. But if you wanted to know what your neighbor's password is, the
> side-channels would not help you much, but an error message would reveal
> it easily.

Using only built-ins, could you elaborate on how one could pick exactly
what row was revealed using an error case? That strikes me as
difficult, but perhaps I'm not thinking creatively enough.

Thanks,

Stephen

From: Heikki Linnakangas on 7 Jun 2010 07:53

On 07/06/10 14:06, Stephen Frost wrote:
> * Heikki Linnakangas (heikki.linnakangas(a)enterprisedb.com) wrote:
>> The big difference is what information can be obtained, not how fast it
>> can be obtained.
>
> Actually, I disagree. Time required to acquire the data does matter.

Depends on the magnitude, of course. If it takes 1 year per row, that's
probably acceptable. If it takes 1 second, that's extremely slow
compared to normal queries, but most likely still disastrous from a
security point of view.

>> Imagine a table that holds username/passwords for users. Each user is
>> allowed to see his own row, including password, but not anyone else's.
>> EXPLAIN side-channel might give pretty accurate information of how many
>> rows there are in the table, and via clever EXPLAIN+statistics probing
>> you might be able to find out what the top-10 passwords are, for
>> example. But if you wanted to know what your neighbor's password is, the
>> side-channels would not help you much, but an error message would reveal
>> it easily.
>
> Using only built-ins, could you elaborate on how one could pick exactly
> what row was revealed using an error case? That strikes me as
> difficult, but perhaps I'm not thinking creatively enough.

WHERE should do it:

SELECT * FROM secrets_view WHERE username = 'neighbor' AND
password::integer = 1234;
ERROR: invalid input syntax for integer: "neighborssecretpassword"

Assuming that username = 'neighbor' is evaluated before the cast.
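
The dependence on evaluation order can be seen in a toy model. The sketch below is not PostgreSQL code: ROWS, view_qual, user_qual, to_integer and scan are hypothetical names, and the view is assumed to be defined as "each user sees only his own row". It only illustrates that the error fires when the user's qual is evaluated before the view's own filter.

```python
# Toy model of the leak; not PostgreSQL code. All names are hypothetical.

ROWS = [
    {"username": "me", "password": "1234"},
    {"username": "neighbor", "password": "neighborssecretpassword"},
]


def view_qual(row, current_user):
    # The view's own filter: each user may see only his own row.
    return row["username"] == current_user


def to_integer(text):
    # Stand-in for the integer input function, which echoes its
    # argument in the error message when parsing fails.
    try:
        return int(text)
    except ValueError:
        raise ValueError(f'invalid input syntax for integer: "{text}"') from None


def user_qual(row):
    # The attacker's WHERE clause. `and` short-circuits, so the cast
    # runs only for rows whose username is 'neighbor'.
    return row["username"] == "neighbor" and to_integer(row["password"]) == 1234


def scan(current_user, user_qual_first):
    # Evaluate both quals per row, in the order the "planner" chose.
    quals = [user_qual, lambda row: view_qual(row, current_user)]
    if not user_qual_first:
        quals.reverse()
    # all() stops at the first qual that rejects the row.
    return [row for row in ROWS if all(q(row) for q in quals)]
```

With user_qual_first=False the view filter rejects the neighbor's row before the cast is attempted, and the scan returns no rows. With user_qual_first=True the cast runs on the hidden row and the raised error carries the password.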

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Stephen Frost on 7 Jun 2010 08:56

* Heikki Linnakangas (heikki.linnakangas(a)enterprisedb.com) wrote:
> WHERE should do it:
>
> SELECT * FROM secrets_view WHERE username = 'neighbor' AND
> password::integer = 1234;
> ERROR: invalid input syntax for integer: "neighborssecretpassword"
>
> Assuming that username = 'neighbor' is evaluated before the cast.

Fair enough, so we can't allow built-ins either, except perhaps in very
specific/limited situations. Still, if we track that the above WHERE
and password::integer calls *should* be run as "role X", while the view
should run as "role Y", maybe we can at least identify the case where
we've ended up in a situation where we are going to expose unintended
data. I don't know enough about the optimizer or the planner to have
any clue how we might teach them to actually avoid doing such, though I
certainly believe it could end up being a disaster on performance based
on comments from others who know better. :)
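
As a rough illustration of the "identify the case" idea only, the sketch below tags each qual with the role it belongs to and reports quals that a plan would evaluate before the view's own quals have all run. The names exposed_quals and plan_order are hypothetical, and nothing here reflects how the planner actually represents quals.

```python
from typing import List, Tuple

# (role the qual should run as, textual description of the qual)
Qual = Tuple[str, str]


def exposed_quals(plan_order: List[Qual], view_role: str) -> List[str]:
    """Return quals of other roles scheduled before the last view_role qual.

    Such quals would see rows that the view has not yet filtered out.
    """
    view_positions = [i for i, (role, _) in enumerate(plan_order) if role == view_role]
    last_view = max(view_positions, default=-1)
    return [
        desc
        for i, (role, desc) in enumerate(plan_order)
        if role != view_role and i < last_view
    ]


# Role X is the user querying the view, role Y owns the view.
risky_plan = [("X", "password::integer = 1234"), ("Y", "username = current_user")]
safe_plan = [("Y", "username = current_user"), ("X", "password::integer = 1234")]
```

Here exposed_quals(risky_plan, "Y") reports the cast, while exposed_quals(safe_plan, "Y") reports nothing. Detecting the situation is the easy half; what the planner should do about it, and at what performance cost, is left open.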

Thanks,

Stephen

From: KaiGai Kohei on 7 Jun 2010 20:10

(2010/06/07 20:53), Heikki Linnakangas wrote:
> On 07/06/10 14:06, Stephen Frost wrote:
>> * Heikki Linnakangas (heikki.linnakangas(a)enterprisedb.com) wrote:
>>> The big difference is what information can be obtained, not how fast it
>>> can be obtained.
>>
>> Actually, I disagree. Time required to acquire the data does matter.
>
> Depends on the magnitude, of course. If it takes 1 year per row, that's
> probably acceptable. If it takes 1 second, that's extremely slow
> compared to normal queries, but most likely still disastrous from a
> security point of view.
>
FYI, the classic reference also discusses the bandwidth of covert
channels, although it is now obsolete.

See page 80 of:
http://csrc.nist.gov/publications/history/dod85.pdf

It said that 1 bit/sec was acceptable to the DoD 25 years ago.
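
To put that figure in perspective, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that the secret is leaked as raw bits at 8 bits per byte; seconds_to_leak is a hypothetical helper.

```python
def seconds_to_leak(secret: str, bits_per_second: float) -> float:
    # Assumption: the channel leaks raw bits, 8 per UTF-8 byte.
    bits = len(secret.encode("utf-8")) * 8
    return bits / bits_per_second


# The 23-character password from the example is 184 bits.
duration = seconds_to_leak("neighborssecretpassword", 1.0)
```

At 1 bit/sec the example password leaks in 184 seconds, about three minutes, so a bandwidth limit of that size does little to protect a single short secret.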

>>> Imagine a table that holds username/passwords for users. Each user is
>>> allowed to see his own row, including password, but not anyone else's.
>>> EXPLAIN side-channel might give pretty accurate information of how many
>>> rows there are in the table, and via clever EXPLAIN+statistics probing
>>> you might be able to find out what the top-10 passwords are, for
>>> example. But if you wanted to know what your neighbor's password is, the
>>> side-channels would not help you much, but an error message would reveal
>>> it easily.
>>
>> Using only built-ins, could you elaborate on how one could pick exactly
>> what row was revealed using an error case? That strikes me as
>> difficult, but perhaps I'm not thinking creatively enough.
>
> WHERE should do it:
>
> SELECT * FROM secrets_view WHERE username = 'neighbor' AND
> password::integer = 1234;
> ERROR: invalid input syntax for integer: "neighborssecretpassword"
>
> Assuming that username = 'neighbor' is evaluated before the cast.
>

In this case, it is unnecessary (from a security perspective) to expose
the given argument in the error message, isn't it?
Because this is basically a matter of the integer input handler,
it seems to me that what we should fix is int4in(), not the optimizer.
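
A minimal sketch of the difference, written as a toy model rather than the real int4in() (which is C code in the backend). The names int4in_verbose and int4in_quiet are hypothetical.

```python
def int4in_verbose(text: str) -> int:
    # Models the current behaviour: the bad input is echoed back.
    try:
        return int(text)
    except ValueError:
        raise ValueError(f'invalid input syntax for integer: "{text}"') from None


def int4in_quiet(text: str) -> int:
    # Models the proposed behaviour: same error, argument not exposed.
    try:
        return int(text)
    except ValueError:
        raise ValueError("invalid input syntax for integer") from None
```

Note that even the quiet variant still reveals that the hidden value does not parse as an integer, so the channel becomes much narrower rather than disappearing.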

Perhaps we should categorize the functionality at issue based on
the level of threat it poses when abused.

* High-threat
Functions with side effects that allow the given arguments to be moved
into other tables or other high-bandwidth channels.
E.g. lowrite(), pg_write_file()

-> It should be fixed soon.

* Middle-threat
Functions with side effects that allow the given arguments to be moved
using error messages or other low-bandwidth channels.
E.g. int4in()

-> It should be fixed in the long term.

* Low-threat
Functions that can imply the existence of invisible tuples, but do not
expose the values themselves.
E.g. EXPLAIN statement, PK/FK constraints

-> It should not be fixed in PostgreSQL.

Currently we allow all of them.
But wouldn't it be valuable to fix the high-threat cases first?
Then we can revise the error messages in built-in functions, I think.

Thanks,
--
KaiGai Kohei <kaigai(a)ak.jp.nec.com>


From: KaiGai Kohei on 7 Jun 2010 20:23

(2010/06/07 21:56), Stephen Frost wrote:
> * Heikki Linnakangas (heikki.linnakangas(a)enterprisedb.com) wrote:
>> WHERE should do it:
>>
>> SELECT * FROM secrets_view WHERE username = 'neighbor' AND
>> password::integer = 1234;
>> ERROR: invalid input syntax for integer: "neighborssecretpassword"
>>
>> Assuming that username = 'neighbor' is evaluated before the cast.
>
> Fair enough, so we can't allow built-ins either, except perhaps in very
> specific/limited situations. Still, if we track that the above WHERE
> and password::integer calls *should* be run as "role X", while the view
> should run as "role Y", maybe we can at least identify the case where
> we've ended up in a situation where we are going to expose unintended
> data. I don't know enough about the optimizer or the planner to have
> any clue how we might teach them to actually avoid doing such, though I
> certainly believe it could end up being a disaster on performance based
> on comments from others who know better. :)
>

My opinion is that this is a matter for individual functions, not the
optimizer. Basically, built-in functions *should* be trusted, because our
security mechanism is not designed to protect against malicious internal
binary modules.

Historically, we were unaware for a long time of the risk of leaking
invisible information through error messages, so most internal functions
were not designed to avoid returning unnecessary information to users.
If so, the error messages need to be revised, but that will not be
completed in a single commit.

Thanks,
--
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

