From: karthikbalaguru on
On Feb 22, 5:23 am, rickman <gnu...(a)gmail.com> wrote:
> On Feb 21, 10:54 am, karthikbalaguru <karthikbalagur...(a)gmail.com>
> wrote:
>
>
>
>
>
> > On Feb 21, 2:55 pm, David Schwartz <dav...(a)webmaster.com> wrote:
>
> > > On Feb 20, 5:10 am, karthikbalaguru <karthikbalagur...(a)gmail.com>
> > > wrote:
>
> > > > How is a TCP server able to handle
> > > > a large number of very rapid, near-simultaneous connections ?
>
> > > Process-per-connection servers tend to do this very poorly.
>
> > It overloads the server.
>
> > > But one
> > > trick is to create the processes before you need them rather than
> > > after.
>
> > But how many processes should be created at the server ?
> > How will the server know how many processes it
> > has to create ? Any ideas ?
>
> If it is really just the creation time that is an issue for running
> many processes, one could always be waiting as a hot spare.  When it
> is turned loose on a new connection, a replacement can be
> created as a background task.
>

But even while it is waiting, it still consumes resources.

Karthik Balaguru
From: David Schwartz on
On Feb 21, 6:22 pm, karthikbalaguru <karthikbalagur...(a)gmail.com>
wrote:

> > Note that this is a key weakness of the 'process-per-connection'
> > model, and I recommend just not using that model unless it's mandated
> > by other concerns (such as cases where security is more important
> > than performance).

> But how is the 'process-per-connection' technique helpful
> for security ?

Processes are isolated from each other by the operating system and can
have their own security context. Threads share pretty much everything.

> > But there are two techniques, and they are typically used in
> > combination. One is static configuration. This is key on initial
> > server startup. For example, versions of Apache that were process per
> > connection let you set the number of processes to be started up
> > initially. They also let you set the target number of 'spare' servers
> > waiting for connections.

> In the case of static configuration, wouldn't that target number
> of servers started initially load the server ? There seems to be a
> drawback in this approach, as the 'spare' servers/processes might
> be created unnecessarily even if there are only a few clients.

So what? Who cares about performance when there's no load?

> That is, if there are fewer clients, then those servers will be waiting
> for connections unnecessarily. This in turn would consume
> system resources.

So what? If there are fewer clients, you have system resources to
spare.

> > The other technique is dynamic tuning. You monitor the maximum number
> > of servers you've ever needed at once, and you keep close to that many
> > around unless you've had a long period of inactivity.

> Dynamic tuning appears to overcome the drawbacks of
> static configuration, but the scenario of a 'long period of inactivity'
> requires some thought. During that time, we might need to
> terminate and later restart a number of processes unnecessarily.
> And since we cannot be completely sure when the maximum
> traffic will arrive, we might end up having all those
> servers running unnecessarily for a long time :-( . Any thoughts ?

They won't be "running". They'll be waiting.

> The process of termination and recreation also consumes
> system resources.

Again, so what? Why are you trying to optimize the case where the
server has little work to do?

DS
From: Jorgen Grahn on
["Followup-To:" header set to comp.protocols.tcp-ip despite your
suggestion comp.arch.embedded, which I don't read. I think it's pretty
clear by now that he has no particular interest in embedded systems.]

On Mon, 2010-02-22, Tim Watts wrote:
> karthikbalaguru <karthikbalaguru79(a)gmail.com>
> wibbled on Monday 22 February 2010 04:19
>
>
>>
>> But even while it is waiting, it still consumes resources.
>>
>> Karthik Balaguru
>
> You need to tell us more about your system (hardware spec, purpose of
> server, expected load).

In general, he needs to tell us what his goal with this discussion is.
The questions jump all over the place, and every answer immediately
spawns N new questions -- with no clue what (if anything) he's trying
to accomplish, other than perhaps a focus on extremely high accept()
load.

Perhaps the OP would be better served by a book. You cannot learn all
you need to know about socket programming from a Usenet thread. I
recommend Stevens' "Unix Network Programming" vol 1, and "TCP/IP
Illustrated" vol 1 in order to make sense of the
former.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
From: karthikbalaguru on
On Feb 22, 1:35 am, Tim Watts <t...(a)dionic.net> wrote:
> karthikbalaguru <karthikbalagur...(a)gmail.com>
>   wibbled on Sunday 21 February 2010 08:33
>
>
>
> > Linux !
>
> OK - Stop worrying until it becomes a problem then tune the kernel. Make
> reasonable efforts to make the server app efficient but don't kill yourself
> unless you are writing something so outlandishly loaded that special
> considerations are needed. BTW - this is quite a rare requirement in
> practice.
>
>
>
> > Great. But is there a C language equivalent that can make
> > plain sailing of a multiplexed server ?
>
> Honestly don't know. C++ might have something?
>
> But you have the freedom now to implement any of the strategies. How about a
> worker pool of processes with a master doing the listen() and handing off
> each new connection to an idle worker child?
> This would be good if the
> connections have little interaction with each other. If the interaction is
> heavy, you could either:
>
> a) Use processes with IPC;
> b) Use threads;
> c) Do multiplexed;
>
> or
> (sometimes overlooked - depends on the relative weight of the server's non
> mutually interactive code with the interactive bit)
>
> d) Make the server processes totally non interactive and move the
> interactive logic to another server process which communicates with the
> first server over unix domain sockets (these are quite efficient in linux).
>
> Just another idea.

In case of a heavy flow of data/messages, I think
a queue mechanism will be required, so that
the master puts work into the queue and a worker
child reads from it whenever data is available.
It looks like a queue mechanism cannot be
ruled out in the case of heavy interactions.

>
>
>
> > I searched the internet to find the features available with perl's
> > IO::Multiplex .
> > It seems that IO::Multiplex is designed to take the effort out of
> > managing
> > multiple file handles. It is essentially a really fancy front end to
> > the select
> > system call. In addition to maintaining the select loop, it buffers
> > all input
> > and output to/from the file handles. It can also accept incoming
> > connections
> > on one or more listen sockets.
> > It is object oriented in design, and will notify you of significant
> > events
> > by calling methods on an object that you supply. If you are not using
> > objects,
> > you can simply supply __PACKAGE__ instead of an object reference.
> > You may have one callback object registered for each file handle, or
> > one
> > global one. Possibly both -- the per-file handle callback object will
> > be
> > used instead of the global one. Each file handle may also have a
> > timer
> > associated with it. A callback function is called when the timer
> > expires.
>
> > Any equivalent C language package available ?
>
> You'll have to wait for someone else or google a bit more.
>
> If you can read perl (even if you don't really use it) you could rip off
> IO::Multiplex and make a C or C++ library that implements exactly the same
> API. It's fairly easy to do a line by line translation of perl to C or C++.
> If you're feeling nice you could even opensource your library.
>
> License issues may exist if you do a direct line by line rip off without
> opensourcing it - I'm not a lawyer, just be aware.
>
> But either way, you should be on reasonably safe ground taking inspiration
> from it.
>
> Multiplexing servers work well for servers that process connection data fast
> and deterministically without blocking for long (or not at all). If your
> process may block talking to a database (for example) then you will face the
> problem of blocking the whole server if you're not careful. In which case
> there is much to be said for forking or threaded servers as at least the
> kernel will help you out.
>

Multiplexing servers appear very interesting. Any link
that discusses this in detail ?

Thanks in advance,
Karthik Balaguru
From: karthikbalaguru on
On Feb 22, 10:57 am, David Schwartz <dav...(a)webmaster.com> wrote:
> On Feb 21, 6:22 pm, karthikbalaguru <karthikbalagur...(a)gmail.com>
> wrote:
>
> > > Note that this is a key weakness of the 'process-per-connection'
> > > model, and I recommend just not using that model unless it's mandated
> > > by other concerns (such as cases where security is more important
> > > than performance).
> > But how is the 'process-per-connection' technique helpful
> > for security ?
>
> Processes are isolated from each other by the operating system and can
> have their own security context. Threads share pretty much everything.
>
> > > But there are two techniques, and they are typically used in
> > > combination. One is static configuration. This is key on initial
> > > server startup. For example, versions of Apache that were process per
> > > connection let you set the number of processes to be started up
> > > initially. They also let you set the target number of 'spare' servers
> > > waiting for connections.
> > In the case of static configuration, wouldn't that target number
> > of servers started initially load the server ? There seems to be a
> > drawback in this approach, as the 'spare' servers/processes might
> > be created unnecessarily even if there are only a few clients.
>
> So what? Who cares about performance when there's no load?
>
> > That is, if there are fewer clients, then those servers will be waiting
> > for connections unnecessarily. This in turn would consume
> > system resources.
>
> So what? If there are fewer clients, you have system resources to
> spare.
>
> > > The other technique is dynamic tuning. You monitor the maximum number
> > > of servers you've ever needed at once, and you keep close to that many
> > > around unless you've had a long period of inactivity.
> > Dynamic tuning appears to overcome the drawbacks of
> > static configuration, but the scenario of a 'long period of inactivity'
> > requires some thought. During that time, we might need to
> > terminate and later restart a number of processes unnecessarily.
> > And since we cannot be completely sure when the maximum
> > traffic will arrive, we might end up having all those
> > servers running unnecessarily for a long time :-( . Any thoughts ?
>
> They won't be "running". They'll be waiting.

:-)

>
> > The process of termination and recreation also consumes
> > system resources.
>
> Again, so what? Why are you trying to optimize the case where the
> server has little work to do?
>

:-)
Just exploring all possible optimization ways.

Thx,
Karthik Balaguru