From: Bart Samwel on
On Fri, Feb 12, 2010 at 02:31, Mark Mielke <mark(a)mark.mielke.cc> wrote:

> But once there, it seems clear that packing hostnames or netmasks onto one
> line is just ugly and hard to manage. I'd like to see this extended to any
> of the many ways to allow hostnames to be specified one per line. For
> example:
>
> set tool_servers {
> 127.0.0.1/32
> ::1/128
> 1.2.3.4/32
> 1.2.3.5/32
> }
>
> host DATABASE USER $tool_servers md5
>
> The above is also easy to parse.
>
> Of course, then I'll ask for the ability to simplify specifying multiple
> databases:
>
> set databases {
> db1
> db2
> }
>
> set users {
> user1
> user2
> }
>
> host $databases $users $tool_servers md5
>
> Sorry... :-)
>

Definitely sounds useful! But I do now see that this is entirely orthogonal
to what I was trying to do -- which means I don't have to do anything about
it. :-)


> I think wildcards are interesting, but I have yet to see an actual use
> case other than "it's cool and very generalized". In my mind (tell me if I'm
> wrong), the most common type of PostgreSQL authentication setup is within a
> local network within an organization. There, you either authorize an entire
> subnet ("the entire server park" or "all client PCs") or you authorize
> specific hosts (single IP address). The wildcard case is for replacing the
> first case, but for that case, subnets are usually just fine. I'm trying to
> target the second case here.
>
>
> The use case would be an organization with nodes all over the IP space,
> that wants to manage configuration from a single place. DNS would be that
> single place of choice. It moves trust from "trust the netmasks to be kept
> up-to-date" to "trust that DNS will be kept up-to-date". Since DNS has
> important reasons to be up-to-date, it's a pretty safe bet that DNS is equal
> or more up-to-date than pg_hba.conf hard coded netmasks. It makes sense, but
> it can be a later use case. It doesn't have to be in version 1.
>

DNS is preferred to subnets in that regard, definitely. But again, that
points to the per-hostname route, and it's not a use case for the wildcard
route (unless people explicitly choose to organize their DNS hierarchy so
that they can use it for PostgreSQL authorization -- doubtful.)

Cheers,
Bart
From: Mark Mielke on
On 02/11/2010 05:12 PM, Bart Samwel wrote:
> On Thu, Feb 11, 2010 at 23:01, Mark Mielke <mark(a)mark.mielke.cc> wrote:
>
> On 02/11/2010 04:54 PM, Bart Samwel wrote:
>>
>>> ISSUE #3: Multiple hostnames?
>>>
>>> Currently, a pg_hba entry lists an IP / netmask combination.
>>> I would suggest allowing lists of hostnames in the entries,
>>> so that you can at least mimic the "match multiple hosts by
>>> a single rule". Any reason not to do this?
>>
>> I'm mixed. In some situations, I've wanted to put multiple
>> IP/netmask. I would say that if multiple names are supported,
>> then multiple IP/netmask should be supported. But, this does
>> make the lines unwieldy beyond two or three. This direction
>> leans towards the capability to define "host classes", where
>> the rule allows the host class, and the host class can have
>> a list of hostnames.
>>
>>
>> Yes, but before you know it people will ask for being able to
>> specify multiple host classes. :-) Quite simply put, with a
>> single subnet you can allow multiple hosts in. Allowing only a
>> single hostname is a step backward from that, so adding support
>> for multiple hostnames could be useful if somebody is replacing
>> subnets with hostname-based configuration.
>
> This implies two aspects which may not be true:
>
> 1) All hosts that I want to allow belong to the same subnet.
> 2) If I trust one host on the subnet, then I trust all hosts
> on the subnet.
>
> While the above two points are often true, they are not
> universally true.
>
>
> I don't think we're talking about the same thing here. I wasn't
> suggesting doing hostname-plus-netmask. NO! I was suggesting that
> where a lazy sysadmin would previously configure by subnet, they might
> switch to more fine-grained hostname-based configuration ONLY IF it
> doesn't require duplicating every line in pg_hba.conf for every host
> in the subnet.

Ah yes. You are focusing on allowing a netmask to expand to hostnames.
I'm focusing on how netmasks were never that great on their own.

You want to allow multiple hosts - I want you to allow multiple
netmasks. I think the requirement is the same. I also think that "same
line" has always been an annoying restriction. I have many duplicated
lines today just for:

host DATABASE USER 127.0.0.1/32 md5
host DATABASE USER ::1/128 md5

Isn't that a bit silly? If you think it's acceptable to allow multiple
hostnames, I'm pointing out that your requirement is not limited to
hostnames only. Why not?

host DATABASE USER 127.0.0.1/32,::1/128 md5

Same requirements, same syntax (assuming you were suggesting ','), same
documentation. Why not?
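
As a rough illustration of what a comma-separated address field would
mean for matching, here is a minimal Python sketch. The function name
matches_address_list is hypothetical, and the ',' separator is only the
syntax assumed above; nothing here reflects how pg_hba.conf is actually
parsed.

```python
import ipaddress


def matches_address_list(client_ip, address_field):
    """Return True if client_ip falls inside any CIDR block in address_field.

    address_field is a comma-separated list such as "127.0.0.1/32,::1/128"
    (a hypothetical syntax, not something pg_hba.conf accepts today).
    """
    client = ipaddress.ip_address(client_ip)
    for item in address_field.split(","):
        network = ipaddress.ip_network(item.strip(), strict=False)
        # An IPv4 client can only match an IPv4 block, and likewise for IPv6.
        if client.version == network.version and client in network:
            return True
    return False
```

With this, the two duplicated loopback lines collapse into one rule
whose address field is checked item by item.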

But once there, it seems clear that packing hostnames or netmasks onto
one line is just ugly and hard to manage. I'd like to see this extended
to any of the many ways to allow hostnames to be specified one per line.
For example:

set tool_servers {
127.0.0.1/32
::1/128
1.2.3.4/32
1.2.3.5/32
}

host DATABASE USER $tool_servers md5

The above is also easy to parse.

Of course, then I'll ask for the ability to simplify specifying multiple
databases:

set databases {
db1
db2
}

set users {
user1
user2
}

host $databases $users $tool_servers md5

Sorry... :-)
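
To show that the "set" blocks above really are easy to parse, here is a
minimal Python sketch that expands them into plain one-address rules.
The "set NAME { ... }" and "$NAME" syntax is only the proposal from this
message, and expand_host_classes is a hypothetical name. A reference to
an undefined set raises KeyError in this sketch.

```python
import itertools


def expand_host_classes(text):
    """Expand 'set NAME { ... }' blocks and $NAME references into plain rules."""
    sets = {}       # set name -> list of members, one per line
    rules = []      # expanded rule lines, in file order
    current = None  # name of the set block currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if current is not None:
            if line == "}":
                current = None
            else:
                sets[current].append(line)
            continue
        fields = line.split()
        if len(fields) == 3 and fields[0] == "set" and fields[2] == "{":
            current = fields[1]
            sets[current] = []
            continue
        # Each $NAME field becomes its member list; other fields stay as-is.
        choices = [sets[f[1:]] if f.startswith("$") else [f] for f in fields]
        rules.extend(" ".join(combo) for combo in itertools.product(*choices))
    return rules
```

Expanding to the cross product keeps the existing top-to-bottom,
first-match semantics: each generated line is an ordinary rule in the
position where the original line stood.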

>> 2) What will you do if they specify a hostname and a netmask?
>> This seems like a convenient way of saying "everybody on the same
>> subnet as NAME."
>>
>>
>> Not supported. Either an IP address / netmask combo, or a hostname,
>> but not both. I wouldn't want to recommend hardcoding something such
>> as netmasks (which are definitely subnet dependent) in combination
>> with something as volatile as a host name -- move it to a different
>> subnet, and you might allow a whole bigger subnet than you intended.
>> If they want to specify a netmask, then they should just use
>> hardcoded IPs as well.
>
> Ah yes, I recall this from a previous thread. I think I also
> disagreed on the other thread. :-)
>
> I thought of a use for reverse lookup - it would allow wild card
> hostnames. Still, that's an advanced feature that might be for
> later... :-)
>
>
> I think wildcards are interesting, but I have yet to see an actual use
> case other than "it's cool and very generalized". In my mind (tell me
> if I'm wrong), the most common type of PostgreSQL authentication setup
> is within a local network within an organization. There, you either
> authorize an entire subnet ("the entire server park" or "all client
> PCs") or you authorize specific hosts (single IP address). The
> wildcard case is for replacing the first case, but for that case,
> subnets are usually just fine. I'm trying to target the second case here.

The use case would be an organization with nodes all over the IP space,
that wants to manage configuration from a single place. DNS would be
that single place of choice. It moves trust from "trust the netmasks to
be kept up-to-date" to "trust that DNS will be kept up-to-date". Since
DNS has important reasons to be up-to-date, it's a pretty safe bet that
DNS is equal or more up-to-date than pg_hba.conf hard coded netmasks. It
makes sense, but it can be a later use case. It doesn't have to be in
version 1.

Cheers,
mark

--
Mark Mielke<mark(a)mielke.cc>

From: Euler Taveira de Oliveira on
Mark Mielke escreveu:
> Of course, then I'll ask for the ability to simplify specifying multiple
> databases:
>
We already support multiple users and/or databases for a single pg_hba.conf
line ...


--
Euler Taveira de Oliveira
http://www.timbira.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Mark Mielke on
On 02/11/2010 08:13 AM, Bart Samwel wrote:
> ISSUE #1: Performance / caching
>
> At present, I've simply not added caching. The reasoning for this is
> as follows:
> (a) getaddrinfo doesn't tell us about expiry, so when do you refresh?
> (b) If you put the cache in the postmaster, it will not work for
> exec-based backends as opposed to fork-based backends, since those
> read pg_hba.conf every time they are exec'ed.
> (c) If you put this in the postmaster, the postmaster will have to
> update the cache every once in a while, which may be slow and which
> may prevent new connections while the cache update takes place.
> (d) Outdated cache entries may inexplicably and without any logging
> choose the wrong rule for some clients. Big aargh: people will start
> using this to specify 'deny' rules based on host names.
>
> If you COULD get expiry info out of getaddrinfo you could potentially
> store this info in a table or something like that, and have it updated
> by the backends? But that's way over my head for now. ISTM that this
> stuff may better be handled by a locally-running caching DNS server,
> if people have performance issues with the lack of caching. These
> local caching DNS servers can also handle expiry correctly, etcetera.
>
> We should of course still take care to look up a given hostname only
> once for each connection request.

You should cache for some minimal amount of time or some minimal number
of records - even if it's just one minute, and even if it's a fixed
length LRU list. This would deal with situations where new connections
are opened several times a second (some types of load). For connections
opened once a minute or less, the benefit of caching is far smaller.
But this can be a feature added later if necessary and doesn't need to
gate the feature.

Many UNIX/Linux boxes have some sort of built-in cache, sometimes
persistent, sometimes shared. On my Linux box, I have nscd - "name
server caching daemon" - which should be able to cache these sorts of
lookups. I believe it is used for things as common as mapping uid to
username in output of "/bin/ls -l", so it does need to be pretty fast.

The difference between in process cache and something like "nscd" is the
inter-process communication required to use "nscd".
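
For concreteness, a fixed-length LRU cache with a one-minute expiry
could look roughly like the Python sketch below. The class name
LookupCache, its parameters and the injectable resolve/clock hooks are
all assumptions made for illustration; the real thing would live in C
inside the backend, and this sketch does not cache failed lookups.

```python
import socket
import time
from collections import OrderedDict


class LookupCache:
    """Fixed-size LRU cache whose entries expire after ttl seconds."""

    def __init__(self, resolve=None, ttl=60.0, max_entries=128,
                 clock=time.monotonic):
        self._resolve = resolve or self._system_resolve
        self._ttl = ttl
        self._max = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # hostname -> (expiry time, addresses)

    @staticmethod
    def _system_resolve(hostname):
        # Forward lookup only; getaddrinfo gives no expiry information.
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
        return frozenset(info[4][0] for info in infos)

    def lookup(self, hostname):
        now = self._clock()
        hit = self._entries.get(hostname)
        if hit is not None and hit[0] > now:
            self._entries.move_to_end(hostname)  # mark as most recently used
            return hit[1]
        # Missing or expired: resolve again and store with a fresh expiry.
        addresses = self._resolve(hostname)
        self._entries[hostname] = (now + self._ttl, addresses)
        self._entries.move_to_end(hostname)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return addresses
```

A cache like this is per process, so it avoids the inter-process round
trip to nscd but is not shared between backends.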


> ISSUE #2: Reverse lookup?
>
> There was a suggestion on the TODO list on the wiki, which basically
> said that maybe we could use reverse lookup to find "the" hostname and
> then check for that hostname in the list. I think that won't work,
> since IPs can go by many names and may not support reverse lookup for
> some hostnames (/etc/hosts anybody?). Furthermore, due to the
> top-to-bottom processing of pg_hba.conf, you CANNOT SKIP entries that
> might possibly match. For instance, if the third line is for host
> "foo.example.com" and the fifth line is for "bar.example.com", both
> lines may apply to the same IP, and you still HAVE to check the first
> one, even if reverse lookup turns up the second host name. So it
> doesn't save you any lookups, it just costs an extra one.

I don't see a need to do a reverse lookup. Reverse lookups are sometimes
done as a verification check, in the sense that it's cheap to get a map
from NAME -> IP, but sometimes it is much harder to get the reverse map
from IP -> NAME. However, it's not a reliable check as many legitimate
users have trouble getting a reverse map from IP -> NAME. It also
doesn't save anything, as IP -> NAME lookups use a completely different
set of name servers, and these name servers are not always optimized for
speed as IP -> NAME lookups are less common than NAME -> IP. Finally, if
one finds a map from IP -> NAME, that doesn't prove that a map from NAME
-> IP exists, so using *any* results from IP -> NAME is questionable.

I think reverse lookups are unnecessary and undesirable.
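
A forward-only check can be sketched in Python as below: walk the rules
top to bottom, resolve each hostname NAME -> IP, and stop at the first
rule whose addresses include the client. The names resolve_forward and
first_matching_rule and the (hostname, method) rule shape are
assumptions for illustration only. Addresses are compared as strings,
which assumes both sides use the same textual form.

```python
import socket


def resolve_forward(hostname):
    """NAME -> set of IP address strings, using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def first_matching_rule(client_ip, rules, resolve=resolve_forward):
    """Return the first (hostname, method) rule that matches client_ip.

    rules is a list of (hostname, method) pairs in file order. Each
    hostname is resolved at most once per call, i.e. once per connection.
    """
    seen = {}  # per-connection memo: hostname -> resolved addresses
    for hostname, method in rules:
        if hostname not in seen:
            seen[hostname] = resolve(hostname)
        if client_ip in seen[hostname]:
            return hostname, method
    return None
```

If foo.example.com and bar.example.com both resolve to the client's
address, the earlier rule wins, and no IP -> NAME lookup is ever made.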

> ISSUE #3: Multiple hostnames?
>
> Currently, a pg_hba entry lists an IP / netmask combination. I would
> suggest allowing lists of hostnames in the entries, so that you can at
> least mimic the "match multiple hosts by a single rule". Any reason
> not to do this?

I'm mixed. In some situations, I've wanted to put multiple IP/netmask. I
would say that if multiple names are supported, then multiple IP/netmask
should be supported. But, this does make the lines unwieldy beyond two
or three. This direction leans towards the capability to define "host
classes", where the rule allows the host class, and the host class can
have a list of hostnames.

Two other aspects I don't see mentioned:

1) What will you do for hostnames that have multiple IP addresses? Will
you accept all IP addresses as being valid?
2) What will you do if they specify a hostname and a netmask? This seems
like a convenient way of saying "everybody on the same subnet as NAME."
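
To make the two questions concrete, here is a small Python sketch of
one possible answer to each. It assumes that every address a name
resolves to is accepted, and that a netmask on a hostname would mean
"the subnet around each resolved address". The function name
host_entry_matches and the prefix_len parameter are hypothetical, and
the caller is assumed to have done the NAME -> IP lookup already.

```python
import ipaddress


def host_entry_matches(client_ip, resolved_addresses, prefix_len=None):
    """Check a client against the addresses a hostname resolved to.

    Without prefix_len, any one of the resolved addresses may match exactly.
    With prefix_len, the client only has to be in the same subnet as one of
    them ("everybody on the same subnet as NAME").
    """
    client = ipaddress.ip_address(client_ip)
    for text in resolved_addresses:
        addr = ipaddress.ip_address(text)
        if addr.version != client.version:
            continue  # never compare IPv4 with IPv6
        if prefix_len is None:
            if addr == client:
                return True
        else:
            # strict=False lets us build the network from a host address.
            network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
            if client in network:
                return True
    return False
```

The second form shows why it is risky: if NAME moves to another subnet,
the set of hosts let in changes without anyone touching pg_hba.conf.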

Cheers,
mark

--
Mark Mielke<mark(a)mielke.cc>

From: "Kevin Grittner" on
Bart Samwel <bart(a)samwel.tk> wrote:

> I've been working on a patch to add hostname support to
> pg_hba.conf.

> At present, I've simply not added caching.

Perhaps you could just recommend using nscd (or similar).

> There was a suggestion on the TODO list on the wiki, which
> basically said that maybe we could use reverse lookup to find
> "the" hostname and then check for that hostname in the list. I
> think that won't work, since IPs can go by many names and may not
> support reverse lookup for some hostnames (/etc/hosts anybody?).

Right. Any reverse lookup should be, at best, for display in error
messages or logs. There can be zero to many names for an IP
address.

> Currently, a pg_hba entry lists an IP / netmask combination. I
> would suggest allowing lists of hostnames in the entries, so that
> you can at least mimic the "match multiple hosts by a single
> rule". Any reason not to do this?

I can't see any reason other than code complexity.

-Kevin
