From: Cong Wang on 17 Feb 2010 11:20

Eric Dumazet wrote:
> On Tuesday 16 February 2010 at 21:06 +0800, Cong Wang wrote:
>> Octavian Purdila wrote:
>>> On Tuesday 16 February 2010 11:37:04 you wrote:
>>>>> 	BUILD_BUG_ON(sizeof(struct inet_skb_parm) > sizeof(dummy_skb->cb));
>>>>>
>>>>> +	sysctl_local_reserved_ports = kzalloc(65536 / 8, GFP_KERNEL);
>>>>> +	if (!sysctl_local_reserved_ports)
>>>>> +		goto out;
>>>>> +
>>>> I think we should also consider the ports in ip_local_port_range,
>>>> since we can only reserve the ports in that range.
>>>>
>>> That is subject to changes at runtime, which means we will have to readjust
>>> the bitmap at runtime which introduces the need for additional synchronization
>>> operations which I would rather avoid.
>> Why? As long as the bitmap is global, this will not be hard.
>>
>> Consider that if one user writes a port number which is beyond
>> the ip_local_port_range into ip_local_reserved_ports, we should
>> not accept this, because it doesn't make any sense. But with your
>> patch, we do.
>
> I disagree with you. This is perfectly OK.
>
> A port not being flagged in ip_local_reserved_ports doesn't mean it can
> be used for allocation.
>
> If you really want to block ports from being used at boot, you could for
> example:
>
> # temporarily reduce the ip_local_port_range
> echo "61000 61001" >/proc/sys/net/ipv4/ip_local_port_range
> # Build our bitmap (could be slow, if a remote database is read)
> for port in $LIST_RESERVED_PORT
> do
> 	echo $port >/proc/sys/net/ipv4/ip_local_reserved_ports
> done
> echo "10000 61000" >/proc/sys/net/ipv4/ip_local_port_range
>

I don't think so. If you want to avoid a race condition, you just need to
write the reserved ports before any networking application starts, in other
words, as early as possible during boot.

Thanks.
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Cong Wang on 20 Feb 2010 03:00

Eric Dumazet wrote:
> On Thursday 18 February 2010 at 00:13 +0800, Cong Wang wrote:
>
>> I don't think so, if you want to avoid race condition, you just need to
>> write the reserved ports before any networking application starts, IOW,
>> as early as possible during boot.
>>
>
> Sure, but I was thinking of retrieving the list of reserved ports by a
> database query, using the network :)
>
> Anyway, I just feel your argument is not applicable.
>
> Our kernel is capable of doing an intersection for us, we don't need
> to forbid the user to mark a port as 'reserved' if this port is already
> blacklisted by another mechanism (for example, if this port is already
> in use)
>

Oh, I see your point. But this could still confuse people, as it did me.
I think we had better document this.

Thanks.
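
The "intersection" Eric describes can be illustrated with a small user-space sketch. This is not the kernel code: `struct port_policy`, `reserve_port()`, `is_reserved()` and `auto_assignable()` are hypothetical names chosen for illustration, and the bitmap is a plain array rather than the kzalloc'ed `sysctl_local_reserved_ports`. It only shows that a port is a candidate for automatic assignment when it lies inside the local port range and is not flagged as reserved, so reserving a port outside the range is harmless.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define PORT_COUNT 65536
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((PORT_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the two independent sysctl settings. */
struct port_policy {
	int low;                              /* like ip_local_port_range, lower bound */
	int high;                             /* like ip_local_port_range, upper bound */
	unsigned long reserved[BITMAP_WORDS]; /* like ip_local_reserved_ports */
};

/* Flag a port as reserved; any port 0..65535 is accepted, in or out of range. */
void reserve_port(struct port_policy *p, int port)
{
	assert(port >= 0 && port < PORT_COUNT);
	size_t bit = (size_t)port;
	p->reserved[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Non-atomic user-space analogue of a bitmap test. */
bool is_reserved(const struct port_policy *p, int port)
{
	assert(port >= 0 && port < PORT_COUNT);
	size_t bit = (size_t)port;
	return (p->reserved[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* A port may be picked automatically only if both settings allow it. */
bool auto_assignable(const struct port_policy *p, int port)
{
	return port >= p->low && port <= p->high && !is_reserved(p, port);
}
```

With a range of 32000-61000 and port 8080 reserved, `auto_assignable()` rejects 8080 both before and after the range is widened to include it, which is the case the later documentation text calls out.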
From: Cong Wang on 20 Feb 2010 03:20

Octavian Purdila wrote:
> This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
> allows users to reserve ports for third-party applications.
>
> The reserved ports will not be used by automatic port assignments
> (e.g. when calling connect() or bind() with port number 0). Explicit
> port allocation behavior is unchanged.
>
> Signed-off-by: Octavian Purdila <opurdila(a)ixiacom.com>
> Signed-off-by: WANG Cong <amwang(a)redhat.com>
> Cc: Neil Horman <nhorman(a)tuxdriver.com>
> Cc: Eric Dumazet <eric.dumazet(a)gmail.com>
> Cc: Eric W. Biederman <ebiederm(a)xmission.com>
> ---
>  Documentation/networking/ip-sysctl.txt |   14 ++++++++++++++
>  drivers/infiniband/core/cma.c          |    7 ++++++-
>  include/net/ip.h                       |    6 ++++++
>  net/ipv4/af_inet.c                     |    8 +++++++-
>  net/ipv4/inet_connection_sock.c        |    6 ++++++
>  net/ipv4/inet_hashtables.c             |    2 ++
>  net/ipv4/sysctl_net_ipv4.c             |   17 +++++++++++++++++
>  net/ipv4/udp.c                         |    3 ++-
>  net/sctp/socket.c                      |    2 ++
>  9 files changed, 62 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 2dc7a1d..6534ee7 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -564,6 +564,20 @@ ip_local_port_range - 2 INTEGERS
>  	(i.e. by default) range 1024-4999 is enough to issue up to
>  	2000 connections per second to systems supporting timestamps.
>
> +ip_local_reserved_ports - list of comma separated ranges
> +	Specify the ports which are reserved for known third-party
> +	applications. These ports will not be used by automatic port
> +	assignments (e.g. when calling connect() or bind() with port
> +	number 0). Explicit port allocation behavior is unchanged.
> +
> +	The format used for both input and output is a comma separated
> +	list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
> +	10). Writing to the file will clear all previously reserved
> +	ports and update the current list with the one given in the
> +	input.
> +
> +	Default: Empty
> +
>  ip_nonlocal_bind - BOOLEAN
>  	If set, allows processes to bind() to non-local IP addresses,
>  	which can be quite useful - but may break some applications.
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 875e34e..06c9fa5 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1979,6 +1979,8 @@ retry:
>  	/* FIXME: add proper port randomization per like inet_csk_get_port */
>  	do {
>  		ret = idr_get_new_above(ps, bind_list, next_port, &port);
> +		if (inet_is_reserved_local_port(port))
> +			ret = -EAGAIN;
>  	} while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
>
>  	if (ret)
> @@ -2995,10 +2997,13 @@ static int __init cma_init(void)
>  {
>  	int ret, low, high, remaining;
>
> -	get_random_bytes(&next_port, sizeof next_port);
>  	inet_get_local_port_range(&low, &high);
> +again:
> +	get_random_bytes(&next_port, sizeof next_port);
>  	remaining = (high - low) + 1;
>  	next_port = ((unsigned int) next_port % remaining) + low;
> +	if (inet_is_reserved_local_port(next_port))
> +		goto again;
>
>  	cma_wq = create_singlethread_workqueue("rdma_cm");
>  	if (!cma_wq)
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 503994a..3da9004 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -184,6 +184,12 @@ extern struct local_ports {
>  } sysctl_local_ports;
>  extern void inet_get_local_port_range(int *low, int *high);
>
> +extern unsigned long *sysctl_local_reserved_ports;
> +static inline int inet_is_reserved_local_port(int port)
> +{
> +	return test_bit(port, sysctl_local_reserved_ports);
> +}
> +
>  extern int sysctl_ip_default_ttl;
>  extern int sysctl_ip_nonlocal_bind;
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 33b7dff..e283fbe 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1546,9 +1546,13 @@ static int __init inet_init(void)
>
>  	BUILD_BUG_ON(sizeof(struct inet_skb_parm) > sizeof(dummy_skb->cb));
>
> +	sysctl_local_reserved_ports = kzalloc(65536 / 8, GFP_KERNEL);
> +	if (!sysctl_local_reserved_ports)
> +		goto out;
> +
>  	rc = proto_register(&tcp_prot, 1);
>  	if (rc)
> -		goto out;
> +		goto out_free_reserved_ports;
>
>  	rc = proto_register(&udp_prot, 1);
>  	if (rc)
> @@ -1647,6 +1651,8 @@ out_unregister_udp_proto:
>  	proto_unregister(&udp_prot);
>  out_unregister_tcp_proto:
>  	proto_unregister(&tcp_prot);
> +out_free_reserved_ports:
> +	kfree(sysctl_local_reserved_ports);
>  	goto out;
>  }
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 8da6429..1acb462 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -37,6 +37,9 @@ struct local_ports sysctl_local_ports __read_mostly = {
>  	.range = { 32768, 61000 },
>  };
>
> +unsigned long *sysctl_local_reserved_ports;
> +EXPORT_SYMBOL(sysctl_local_reserved_ports);
> +

Sorry, this looks somewhat weird. Why not just export
inet_is_reserved_local_port()?
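
The retry logic that the quoted patch adds to cma_init() (pick a random port in the local range, try again if it is reserved) can be sketched in user space as follows. This is only an illustration under stated assumptions: `mark_reserved()`, `port_is_reserved()` and `pick_start_port()` are hypothetical names, rand() stands in for get_random_bytes(), and the `max_tries` bound is an addition of this sketch; the quoted patch loops with `goto again` and has no such bound.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PORT_COUNT 65536
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((PORT_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Set the bit for one port in a caller-provided bitmap of BITMAP_WORDS words. */
void mark_reserved(unsigned long *reserved, int port)
{
	assert(port >= 0 && port < PORT_COUNT);
	size_t bit = (size_t)port;
	reserved[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* User-space analogue of the bitmap test done by inet_is_reserved_local_port(). */
bool port_is_reserved(const unsigned long *reserved, int port)
{
	assert(port >= 0 && port < PORT_COUNT);
	size_t bit = (size_t)port;
	return (reserved[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/*
 * Pick a random non-reserved port in [low, high].
 * Returns the port, or -1 if none was found within max_tries attempts
 * (the bound is an assumption of this sketch, not part of the patch).
 */
int pick_start_port(int low, int high, const unsigned long *reserved,
		    int max_tries)
{
	assert(low >= 0 && high < PORT_COUNT && low <= high);
	int remaining = (high - low) + 1;

	for (int i = 0; i < max_tries; i++) {
		int port = low + rand() % remaining;

		if (!port_is_reserved(reserved, port))
			return port;
	}
	return -1;
}
```

Bounding the attempts is one way to keep the loop from spinning when every port in the range happens to be reserved; whether that case matters in practice is a judgment call for the patch author.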
From: Cong Wang on 20 Feb 2010 03:20

Octavian Purdila wrote:
> This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
> allows users to reserve ports for third-party applications.
>
> The reserved ports will not be used by automatic port assignments
> (e.g. when calling connect() or bind() with port number 0). Explicit
> port allocation behavior is unchanged.
>
> Changes from the previous version:
> - switch the /proc entry format to a comma-separated list of port ranges
> - treat -EFAULT just like any other error and acknowledge written values
> - use isdigit() in proc_get_ulong
>
> Octavian Purdila (3):
>   sysctl: refactor integer handling proc code
>   sysctl: add proc_do_large_bitmap
>   net: reserve ports for applications using fixed port numbers

Hi,

This version looks fine to me, but I need to test it, and I will send
feedback ASAP. Thanks for your work!

Still, two things:

1) Bitops are always atomic on every arch, right? If yes, then OK.

2) I hope you could add some documentation showing the relationship
   between ip_local_port_range and ip_local_reserved_ports.

Thanks!
From: Cong Wang on 20 Feb 2010 21:00

Octavian Purdila wrote:
> On Saturday 20 February 2010 10:11:40 you wrote:
>
>> 2) I hope you could add some documentation to show the relations
>> between ip_local_port_range and ip_local_reserved_ports.
>>
>
> How does this sound:
>
> ip_local_reserved_ports - list of comma separated ranges
> 	Specify the ports which are reserved for known third-party
> 	applications. These ports will not be used by automatic port
> 	assignments (e.g. when calling connect() or bind() with port
> 	number 0). Explicit port allocation behavior is unchanged.
>
> 	The format used for both input and output is a comma separated
> 	list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
> 	10). Writing to the file will clear all previously reserved
> 	ports and update the current list with the one given in the
> 	input.
>
> 	Note that ip_local_port_range and ip_local_reserved_ports settings
> 	are independent and both are considered by the kernel when
> 	determining which ports are available for automatic port
> 	assignments.
>
> 	You can reserve ports which are not in the current
> 	ip_local_port_range, e.g.:
>
> 	$ cat /proc/sys/net/ipv4/ip_local_port_range
> 	32000 61000
> 	$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
> 	8080,9148
>
> 	although this is redundant. However such a setting is useful
> 	if later the port range is changed to a value that will
> 	include the reserved ports.

This looks fine to me.

Thanks.
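
The list format described in the proposed documentation ("1,2-4,10-10"), together with the rule that a write replaces the previous list, can be sketched as a small user-space parser. This is not the kernel's proc_do_large_bitmap; `parse_port_list()` and `port_is_set()` are hypothetical names, and two behaviours are assumptions of this sketch rather than anything stated in the thread: malformed input is rejected as a whole and leaves the old list untouched, and an empty string produces an empty list.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PORT_COUNT 65536
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((PORT_COUNT + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Test one port in a bitmap of BITMAP_WORDS words. */
bool port_is_set(const unsigned long *bitmap, int port)
{
	assert(port >= 0 && port < PORT_COUNT);
	size_t bit = (size_t)port;
	return (bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/*
 * Parse a comma separated list of ports and ranges into bitmap.
 * On success the previous contents are replaced and 0 is returned.
 * On malformed input -1 is returned and bitmap is left unchanged
 * (an assumption of this sketch).
 */
int parse_port_list(const char *s, unsigned long *bitmap)
{
	unsigned long tmp[BITMAP_WORDS] = {0};

	assert(s != NULL && bitmap != NULL);

	while (*s != '\0') {
		char *end;
		long first, last;

		if (!isdigit((unsigned char)*s))
			return -1;
		first = strtol(s, &end, 10);
		last = first;

		if (*end == '-') {              /* "first-last" range */
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			last = strtol(s, &end, 10);
		}
		if (first > last || last >= PORT_COUNT)
			return -1;

		for (long port = first; port <= last; port++) {
			size_t bit = (size_t)port;
			tmp[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
		}

		if (*end == ',') {
			end++;
			if (*end == '\0')       /* trailing comma */
				return -1;
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}

	memcpy(bitmap, tmp, sizeof(tmp));       /* replace, do not merge */
	return 0;
}
```

Parsing into a temporary bitmap and copying it over at the end is what gives the "clear and replace" behaviour: after parsing "8080,9148", ports set by an earlier "1,2-4,10-10" are no longer flagged.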