From: Lee Revell on 30 Nov 2006 12:00

On Thu, 2006-11-30 at 09:33 +0000, Christoph Hellwig wrote:
> On Wed, Nov 29, 2006 at 07:56:58PM -0600, Wenji Wu wrote:
> > Yes, when CONFIG_PREEMPT is disabled, the "problem" won't happen. That
> > is why I put "for 2.6 desktop, low-latency desktop" in the uploaded
> > paper. This "problem" happens in the 2.6 Desktop and Low-latency
> > Desktop.
>
> CONFIG_PREEMPT is only for people that are in for the feeling. There is
> no real-world advantage to it and we should probably remove it again.

There certainly is a real-world advantage for many applications. Of
course it would be better if the latency requirements could be met
without kernel preemption, but that's not the case now.

Lee
From: Wenji Wu on 30 Nov 2006 12:10

> The solution is really simple and needs no kernel change at all: if you
> want the TCP receiver to get a larger share of timeslices then either
> renice it to -20 or renice the other tasks to +19.

Simply giving a larger share of timeslices to the TCP receiver won't
solve the problem. No matter what the timeslice is, if the TCP receiving
process has packets within the backlog and the process is expired and
moved to the expired array, an RTO might happen in the TCP sender. The
solution does not look that simple.

wenji
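For readers unfamiliar with the backlog mechanism being argued about:
while a process holds the socket lock inside tcp_recvmsg(), the softirq
input path cannot process arriving segments and queues them on a
per-socket backlog instead; they are only processed (and ACKed) when the
lock is released. Below is a minimal user-space sketch of that
mechanism; all names in it are simplified stand-ins, not the actual
definitions from net/core/sock.c.

#include <stddef.h>

struct skb_sketch {
	struct skb_sketch *next;	/* packet payload elided */
};

struct sk_sketch {
	int owned_by_user;		/* set while tcp_recvmsg() holds the socket */
	struct skb_sketch *backlog_head;
	struct skb_sketch *backlog_tail;
};

/* Stands in for tcp_v4_do_rcv(): process a segment and emit an ACK. */
static void do_rcv_sketch(struct sk_sketch *sk, struct skb_sketch *skb)
{
	(void)sk;
	(void)skb;	/* TCP processing and ACK generation elided */
}

/* Softirq input path: if a process owns the socket, defer the packet. */
static void receive_sketch(struct sk_sketch *sk, struct skb_sketch *skb)
{
	if (sk->owned_by_user) {
		/* No TCP processing, and crucially no ACK, happens here;
		 * the packet just waits for the lock to be released. */
		skb->next = NULL;
		if (sk->backlog_tail)
			sk->backlog_tail->next = skb;
		else
			sk->backlog_head = skb;
		sk->backlog_tail = skb;
	} else {
		do_rcv_sketch(sk, skb);
	}
}

/*
 * release_sock()-equivalent: drain the backlog.  If the receiving
 * process sits in the scheduler's expired array while holding the
 * socket, this runs only after every active task has burned its
 * timeslice -- easily long enough for the sender's RTO to fire.
 */
static void release_sketch(struct sk_sketch *sk)
{
	struct skb_sketch *skb;

	while ((skb = sk->backlog_head) != NULL) {
		sk->backlog_head = skb->next;
		do_rcv_sketch(sk, skb);
	}
	sk->backlog_tail = NULL;
	sk->owned_by_user = 0;
}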
From: David Miller on 30 Nov 2006 15:10

From: Wenji Wu <wenji(a)fnal.gov>
Date: Thu, 30 Nov 2006 10:08:22 -0600

> If higher-priority processes become runnable (e.g., an interactive
> process), you had better yield the CPU instead of continuing this
> process. If it is the case that the process within tcp_recvmsg() is
> expiring, then you can let the process go ahead and process the
> backlog.

Yes, I understand this, and I made that point in one of my replies to
Ingo Molnar last night.

The only seemingly remaining possibility is to find a way to allow input
packet processing, at least enough to emit ACKs, during tcp_recvmsg()
processing.
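One possible reading of that suggestion, as a sketch under assumptions
rather than an actual patch from this thread: have the copy loop in a
tcp_recvmsg()-style function periodically drop and re-take the socket
lock, so the deferred backlog is drained and ACKs go out even while the
receiver is mid-copy. This reuses the sk_sketch types from the earlier
sketch; ACK_BATCH and the stub helpers are hypothetical.

static void lock_sketch(struct sk_sketch *sk)
{
	sk->owned_by_user = 1;
}

/* Stubs standing in for the real receive-queue machinery. */
static int more_data_wanted_sketch(struct sk_sketch *sk)
{
	(void)sk;
	return 0;
}

static void copy_segment_to_user_sketch(struct sk_sketch *sk)
{
	(void)sk;
}

#define ACK_BATCH 16	/* hypothetical tunable: segments copied per lock hold */

static void recvmsg_sketch(struct sk_sketch *sk)
{
	int copied = 0;

	lock_sketch(sk);
	while (more_data_wanted_sketch(sk)) {
		copy_segment_to_user_sketch(sk);

		if (++copied % ACK_BATCH == 0) {
			/* Drop the socket so the backlog is drained and
			 * ACKs go out, then re-take it and continue. */
			release_sketch(sk);
			lock_sketch(sk);
		}
	}
	release_sketch(sk);
}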
From: David Miller on 30 Nov 2006 15:30

From: Ingo Molnar <mingo(a)elte.hu>
Date: Thu, 30 Nov 2006 11:32:40 +0100

> Note that even without the change the TCP receiving task is already
> getting a disproportionate share of cycles due to softirq processing!
> Under a load of 10.0 it went from 500 mbits to 74 mbits, while the
> 'fair' share would be 50 mbits. So the TCP receiver /already/ has an
> unfair advantage. The patch only deepens that unfairness.

I want to point out something which is slightly misleading about this
kind of analysis.

Your disk I/O speed doesn't go down by a factor of 10 just because 9
other non-disk-I/O tasks are running, yet for TCP that's seemingly OK :-)

Not looking at input TCP packets enough to send out the ACKs is the same
as "forgetting" to queue some I/O requests that can go to the controller
right now.

That's the problem: TCP performance is intimately tied to ACK feedback.
So we should find a way to make sure ACK feedback goes out, in
preference to other tcp_recvmsg() processing.

What really should pace the TCP sender in this kind of situation is the
advertised window, not the lack of ACKs. Lack of an ACK means the packet
didn't get there, which is the wrong signal in this kind of situation,
whereas a closing window means "the application can't keep up with the
data rate, hold on..." and is the proper flow-control signal in this
high-load scenario.

If you don't send ACKs, packets are retransmitted when there is no
reason for it, and that borders on illegal. :-)
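To make the flow-control contrast concrete, here is a toy illustration
with made-up numbers: the advertised window is derived from free
receive-buffer space, so a slow reader throttles the sender smoothly
instead of provoking an RTO. The real computation lives in
tcp_select_window() and is considerably more involved than this.

#include <stdio.h>

int main(void)
{
	int rcvbuf = 256 * 1024;	/* receive buffer size (assumed)      */
	int queued = 0;			/* bytes the app hasn't read yet      */
	int step   = 48 * 1024;		/* arrival per tick minus app reading */

	/* A slow application: the queue grows, the advertised window
	 * shrinks, and the sender slows down smoothly -- no RTO, no
	 * spurious retransmission. */
	for (int tick = 0; tick < 5; tick++) {
		int window = rcvbuf - queued;	/* free space = window */
		printf("tick %d: advertised window %d bytes\n",
		       tick, window > 0 ? window : 0);
		queued += step;
	}
	return 0;
}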
From: Ingo Molnar on 30 Nov 2006 15:30
* Wenji Wu <wenji(a)fnal.gov> wrote:

> > The solution is really simple and needs no kernel change at all: if
> > you want the TCP receiver to get a larger share of timeslices then
> > either renice it to -20 or renice the other tasks to +19.
>
> Simply giving a larger share of timeslices to the TCP receiver won't
> solve the problem. No matter what the timeslice is, if the TCP
> receiving process has packets within the backlog and the process is
> expired and moved to the expired array, an RTO might happen in the TCP
> sender.

if you still have the test-setup, could you nevertheless try setting the
priority of the receiving TCP task to nice -20 and see what kind of
performance you get?

	Ingo
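For anyone reproducing the experiment, the renicing Ingo asks for can be
done from the shell (nice -n -20 ./receiver, or renice -n -20 -p <pid>
on a running process) or programmatically. A minimal example using
setpriority(2) follows; lowering the nice value requires root or
CAP_SYS_NICE.

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(void)
{
	/* PRIO_PROCESS with pid 0 means "the calling process". */
	if (setpriority(PRIO_PROCESS, 0, -20) != 0) {
		perror("setpriority");
		return 1;
	}
	/* ... run the TCP receive loop here ... */
	printf("now running at nice -20\n");
	return 0;
}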