Next: [PATCH 0/2] padata: Separate cpumasks for cb_cpus and parallel workers
From: Dan Kruchinin on 29 Jun 2010 12:40

Hello.

The main point of my patches is to introduce two separate cpumasks: one for parallel workers and another for serial workers (callback CPUs). This makes it possible to bind non-intersecting groups of CPUs to serial and parallel workers and to tune the padata subsystem more finely.

My tests show that a proper configuration of the serial and parallel cpumasks gives somewhat better performance. For example (aes-asm, sha1-generic, two 16-core machines):

1) 1 point-to-point connection:
   Non-modified padata gives ~650Mbit of TCP and ~780Mbit of UDP.
   When I exclude callback CPUs from the parallel cpumask, padata gives ~750Mbit of TCP and ~900Mbit of UDP.

2) 2 IPSEC tunnels between 16-core machines and 4 clients communicating with each other via the tunnels:
   Non-modified padata gives ~1.5Gbit of UDP.
   padata with non-intersecting cpumasks for parallel and serial workers gives ~1.8Gbit.

Besides the performance gain, there may be situations where a serial job takes a lot of time. For example, if I add several dozen firewall rules, the serial worker will run slower, while padata_do_parallel will continue to enqueue requests into the queue of the CPU the serial worker executes on. This may significantly slow down parallelization and reordering, because one CPU (the one shared by both parallel and serial workers) will always have more requests in its parallel queue than the other CPUs (because serialization takes a lot of time). In such cases the user may exclude callback CPUs from the cpumask for parallel workers.

--
W.B.R.
Dan Kruchinin
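To illustrate the idea of excluding callback CPUs from the parallel cpumask, here is a minimal userspace sketch. It is not the padata code and does not use the kernel cpumask API: the type cpu_mask_t and the helpers exclude_cpus, masks_intersect, mask_weight and pick_cpu are hypothetical names made up for this example, and the round-robin choice of a target CPU is an assumption used only to show that a CPU removed from the parallel mask never receives parallel work.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: bit i set means CPU i is in the mask. */
typedef uint32_t cpu_mask_t;

/* Remove every callback (serial) CPU from the parallel mask. */
static cpu_mask_t exclude_cpus(cpu_mask_t parallel, cpu_mask_t serial)
{
    return parallel & ~serial;
}

/* Return 1 if the two masks share at least one CPU, 0 otherwise. */
static int masks_intersect(cpu_mask_t a, cpu_mask_t b)
{
    return (a & b) != 0;
}

/* Count the CPUs present in a mask. */
static unsigned int mask_weight(cpu_mask_t mask)
{
    unsigned int n = 0;

    for (; mask; mask &= mask - 1)  /* clear the lowest set bit */
        n++;
    return n;
}

/*
 * Assumed round-robin policy: choose the (seq mod weight)-th CPU
 * that is set in the mask. The mask must not be empty.
 */
static int pick_cpu(cpu_mask_t mask, unsigned int seq)
{
    unsigned int idx;
    int cpu;

    assert(mask != 0);
    idx = seq % mask_weight(mask);
    for (cpu = 0; cpu < 32; cpu++) {
        if (mask & ((cpu_mask_t)1 << cpu)) {
            if (idx == 0)
                return cpu;
            idx--;
        }
    }
    return -1;  /* not reached for a non-empty mask */
}
```

With 16 CPUs (mask 0xFFFF) and CPUs 0 and 1 reserved for callbacks (mask 0x0003), exclude_cpus leaves 14 CPUs in the parallel mask, and pick_cpu only ever returns CPUs 2 to 15, so the queues of the callback CPUs are never filled with parallel requests.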