From: Linus Torvalds on 11 Jul 2010 22:50

On Sun, Jul 11, 2010 at 7:19 PM, Rusty Russell <rusty(a)rustcorp.com.au> wrote:
>
> PS. When did we start top-commenting and quoting the whole patch?

Sorry, my bad. I've been using the gmail web interface for a while now (that's how I tracked my email on my cellphone while I was on vacation, which helped a lot when I got back). I like many of the features, but the email posting takes some getting used to. Partly because gmail seems to actively encourage some bad behavior (like top posting and obviously not having working tabs), but mostly because I'm just a klutz.

(The big upside of the gmail web interface being that searching works across folders. So I think I'll stick with it despite the downsides. And I'll try to be less klutzy)

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on 12 Jul 2010 01:20

On Sunday, 11 July 2010 at 18:18 -0700, Linus Torvalds wrote:
> On Sun, Jul 11, 2010 at 3:03 PM, Steven Rostedt <rostedt(a)goodmis.org> wrote:
> >
> > I have seen some hits with cli-sti. I was considering swapping all
> > preempt_disable() with local_irq_save() in ftrace, but hackbench showed
> > a 30% performance degradation when I did that.
>
> Yeah, but in that case you almost certainly keep the per-cpu cacheline
> hot in the D$ L1 cache, and the stack tracer is presumably also not
> taking any extra I$ L1 misses. So you're not seeing any of the
> downsides. The upside of plain cli/sti is that they're small, and have
> no D$ footprint.
>
> And it's possible that the interrupt flag - at least if/when
> positioned right - wouldn't have any additional D$ footprint under
> normal load either. IOW, if there is an existing per-cpu cacheline
> that is effectively always already dirty and in the cache,
>
> But that's something that really needs macro-benchmarks - exactly
> because microbenchmarks don't show those effects since they are always
> basically hot-cache.

Some kernel devs incorrectly assume they own the cpu caches...

This discussion reminds me that I noticed a performance problem, under a network load, with the placement of cpu_online_bits and cpu_online_mask in separate sections (and thus on separate cache lines):

static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly;
const struct cpumask *const cpu_online_mask = to_cpumask(cpu_online_bits);

Two changes are possible:

1) Get rid of cpu_online_mask (it's a const pointer to a known target). I can't see why it's actually needed...

2) Don't use the second const qualifier but __read_mostly, to move cpu_online_mask into the same section.

Rusty, could you comment on one way or the other before I submit a patch?
(Of course, possible/present/active have the same problem.)
From: Tejun Heo on 12 Jul 2010 03:40

Hello,

On 07/11/2010 10:29 PM, Linus Torvalds wrote:
> You need to show some real improvement on real hardware.
>
> I can't really care less about qemu behavior. If the emulator is bad
> at emulating cli/sti, that's a qemu problem.

Yeap, qemu is just nice when developing things like this, and I mentioned it mainly to point out how immature the patch is: it behaves well (correctness-wise) only there so far, probably because qemu doesn't use one of the fancier idles.

> But if it actually helps on real hardware (which is possible), that
> would be interesting. However, quite frankly, I doubt you can really
> measure it on any bigger load. cli-sti do not tend to be all that
> expensive any more (on a P4 it's probably noticeable, I doubt it shows
> up very much anywhere else).

I'm not very convinced either. Nehalems are said to be able to do cli-sti sequences every 13 cycles or so, which sounds pretty good, and managing it asynchronously might not buy anything. But what they quoted was cli-sti bandwidth, probably meaning that if you do cli-sti's in succession or in a tight loop, each iteration will take 13 cycles. So there could still be cost related to instruction scheduling.

Another thing is the cost difference of cli/sti's on different archs/machines. This is the reason Rusty suggested it in the first place, I think (please correct me if I'm wrong). It means we're forced to assume that cli/sti's are relatively expensive when writing generic code. This, for example, affects how the generic percpu access operations are defined. Their semantics are defined as preemption-safe but not IRQ-safe, i.e. an IRQ handler may run in the middle of percpu_add(), although on many archs including x86 these operations are atomic w.r.t. IRQ. If the cost of interrupt masking can be brought down to that of preemption masking across major architectures, those restrictions can be removed.
x86 might not be the architecture that would benefit the most from such a change, but it's the most widely tested architecture, so I think it would be better to have it applied on x86 if it helps a bit while not being too invasive, even if this ends up being done on multiple platforms. (Plus, it's the architecture I'm most familiar with. :-)

It only took me a couple of days to get it working and the changes are pretty localized, so I think it's worthwhile to see whether it actually helps anything on x86. I'm thinking about doing raw IOs on SSDs, which isn't too unrealistic and is heavy on both IRQ masking and IRQ handling, although the actual hardware access cost might just drown out any difference; workloads heavy on memory allocation and such might be a better fit. If you have any better ideas on testing, please let me know.

Thanks.

--
tejun
From: Tejun Heo on 12 Jul 2010 03:50

Hello, Rusty.

On 07/12/2010 04:19 AM, Rusty Russell wrote:
> Also, is it worth trying to implement this soft disable generically?
> I know at least ppc64 does it today...
>
> (Y'know, because your initial patch wasn't ambitious enough...)

We can evolve things so that common parts are factored into generic code, but with the most important parts being heavily dependent on the specific architecture, I don't think there will be too much of it (calling the irq handler on a separate stack if necessary, generic IRQ masking flag management maybe merged into the preemption flag, and so on).

Thanks.

--
tejun
From: Rusty Russell on 12 Jul 2010 04:10
On Mon, 12 Jul 2010 02:41:33 pm Eric Dumazet wrote:
> Two changes are possible:
>
> 1) Get rid of cpu_online_mask (it's a const pointer to a known
> target). I can't see why it's actually needed...

There was a reason, but I'm trying to remember it.

ISTR it was to catch direct frobbing of the masks. That was important: we were converting code everywhere to hand around cpumasks by ptr rather than by copy. But that semantic change meant that a function which previously harmlessly frobbed a copy would now frob (say) cpu_online_mask.

However, ((const struct cpumask *)cpu_online_bits) would work for that too. (Well, renaming cpu_online_bits to __cpu_online_bits would be better, since it's now non-static.)

Ideally, those masks would be dynamically allocated too. But the boot changes required for that are best left until someone really needs > 64k CPUs.

> 2) Don't use the second const qualifier but __read_mostly, to move
> cpu_online_mask into the same section.
>
> Rusty, could you comment on one way or the other before I submit a patch?
>
> (Of course, possible/present/active have the same problem)

Yep. Might want to do a patch to get rid of the remaining 100 references to cpu_online_map (etc) as well, if you're feeling enthusiastic. :)

Thanks!
Rusty.