Prev: [RFC] oom-kill: give the dying task a higher priority
From: Balbir Singh on 28 May 2010 10:30

* MinChan Kim <minchan.kim(a)gmail.com> [2010-05-28 23:06:23]:

> > I confess I failed to distinguish memcg OOM and system OOM and used "in
> > case of OOM kill the selected task the faster you can" as the guideline.
> > If the exit code path is short that shouldn't be a problem.
> >
> > Maybe the right way to go would be giving the dying task the biggest
> > priority inside that memcg to be sure that it will be the next process from
> > that memcg to be scheduled. Would that be reasonable?
>
> Hmm. I can't understand your point.
> What do you mean failing distinguish memcg and system OOM?
>
> We already have been distinguish it by mem_cgroup_out_of_memory.
> (but we have to enable CONFIG_CGROUP_MEM_RES_CTLR).
> So task selected in select_bad_process is one out of memcg's tasks when
> memcg have a memory pressure.
>

We have a routine to help figure out if the task belongs to the memory
cgroup that caused the OOM. The OOM entry from the memory cgroup is
different from a regular one.

--
    Three Cheers,
    Balbir
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
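[Editor's sketch] The distinction Balbir and Minchan describe, where a memcg OOM only considers tasks from the cgroup that hit its limit, can be illustrated with a small userspace model. This is not the kernel's code: `struct toy_task`, `toy_select_bad_process` and the `badness` field are hypothetical names invented for this illustration, and the real select_bad_process works on task_struct with a much more involved scoring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified task record (not the kernel's task_struct). */
struct toy_task {
    int pid;
    int memcg_id;          /* memory cgroup the task is charged to */
    unsigned long badness; /* stand-in for the OOM score */
};

/*
 * Return the candidate with the highest badness, or NULL if none qualifies.
 * memcg_id <  0 models a system-wide OOM: every task is a candidate.
 * memcg_id >= 0 models a memcg OOM: tasks outside that cgroup are skipped.
 */
const struct toy_task *toy_select_bad_process(const struct toy_task *tasks,
                                              size_t n, int memcg_id)
{
    const struct toy_task *victim = NULL;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* The membership test is what separates the two OOM flavours. */
        if (memcg_id >= 0 && tasks[i].memcg_id != memcg_id)
            continue;
        if (victim == NULL || tasks[i].badness > victim->badness)
            victim = &tasks[i];
    }
    return victim;
}
```

With this model, a task with a large score in another cgroup is never picked for a memcg OOM, although it would be the victim of a system-wide one.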
From: Luis Claudio R. Goncalves on 28 May 2010 10:40

On Fri, May 28, 2010 at 11:06:23PM +0900, Minchan Kim wrote:
| On Fri, May 28, 2010 at 09:53:05AM -0300, Luis Claudio R. Goncalves wrote:
| > On Fri, May 28, 2010 at 02:59:02PM +0900, KOSAKI Motohiro wrote:
....
| > | As far as my observation, RT-function always have some syscall. because pure
| > | calculation doesn't need deterministic guarantee. But _if_ you are really
| > | using such priority design. I'm ok maximum NonRT priority instead maximum
| > | RT priority too.
| >
| > I confess I failed to distinguish memcg OOM and system OOM and used "in
| > case of OOM kill the selected task the faster you can" as the guideline.
| > If the exit code path is short that shouldn't be a problem.
| >
| > Maybe the right way to go would be giving the dying task the biggest
| > priority inside that memcg to be sure that it will be the next process from
| > that memcg to be scheduled. Would that be reasonable?
|
| Hmm. I can't understand your point.
| What do you mean failing distinguish memcg and system OOM?
|
| We already have been distinguish it by mem_cgroup_out_of_memory.
| (but we have to enable CONFIG_CGROUP_MEM_RES_CTLR).
| So task selected in select_bad_process is one out of memcg's tasks when
| memcg have a memory pressure.

The approach of giving the highest priority to the dying task makes sense
in a system wide OOM situation. I thought that would also be good for the
memcg OOM case.

After Balbir Singh's comment, I understand that in a memcg OOM the dying
task should have a priority just above the priority of the main task of
that memcg, in order to avoid interfering in the rest of the system.

That is the point where I failed to distinguish between memcg and system OOM.

Should I pursue that new idea of looking for the right priority inside the
memcg or is it overkill? I really don't have a clear view of the impact of
a memcg OOM on system performance - don't know if it is better to solve the
issue sooner (highest RT priority) or leave it to be solved later (highest
prio on the memcg). I have the impression the general case points to the
simpler solution.

Luis
--
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]
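[Editor's sketch] The two options Luis weighs can be written down as a small decision function. This is only a model of the idea in the message above, not code from any posted patch. It assumes the kernel's internal priority scale, where a numerically lower value is more urgent, and it reads "main task of that memcg" as the most urgent task currently in the cgroup. `toy_dying_task_prio`, `TOY_MAX_PRIO` and `TOY_TOP_PRIO` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PRIO 140 /* assumed size of the internal prio range */
#define TOY_TOP_PRIO 0   /* most urgent value on that scale */

/*
 * Choose an internal prio for the task selected by the OOM killer.
 * memcg_oom        false for a system-wide OOM, true for a memcg OOM.
 * memcg_best_prio  most urgent prio among the tasks of that memcg
 *                  (ignored for a system-wide OOM).
 */
int toy_dying_task_prio(bool memcg_oom, int memcg_best_prio)
{
    /* System-wide OOM: let the victim exit as soon as possible. */
    if (!memcg_oom)
        return TOY_TOP_PRIO;

    assert(memcg_best_prio >= 0 && memcg_best_prio < TOY_MAX_PRIO);

    /*
     * memcg OOM: step just ahead of the cgroup's own tasks so the victim
     * is the next task from that memcg to run, without outranking
     * unrelated tasks elsewhere in the system.
     */
    if (memcg_best_prio > TOY_TOP_PRIO)
        return memcg_best_prio - 1;
    return TOY_TOP_PRIO;
}
```

The extra parameter is the cost of the second option: something has to find the most urgent task in the memcg before the victim can be boosted, which is the "overkill" question raised in the message.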
From: Minchan Kim on 28 May 2010 11:20

On Fri, May 28, 2010 at 07:50:48PM +0530, Balbir Singh wrote:
> * MinChan Kim <minchan.kim(a)gmail.com> [2010-05-28 23:06:23]:
>
> > > I confess I failed to distinguish memcg OOM and system OOM and used "in
> > > case of OOM kill the selected task the faster you can" as the guideline.
> > > If the exit code path is short that shouldn't be a problem.
> > >
> > > Maybe the right way to go would be giving the dying task the biggest
> > > priority inside that memcg to be sure that it will be the next process from
> > > that memcg to be scheduled. Would that be reasonable?
> >
> > Hmm. I can't understand your point.
> > What do you mean failing distinguish memcg and system OOM?
> >
> > We already have been distinguish it by mem_cgroup_out_of_memory.
> > (but we have to enable CONFIG_CGROUP_MEM_RES_CTLR).
> > So task selected in select_bad_process is one out of memcg's tasks when
> > memcg have a memory pressure.
> >
>
> We have a routine to help figure out if the task belongs to the memory
> cgroup that cause the OOM. The OOM entry from memory cgroup is
> different from a regular one.

That is what I meant. My English is poor; "out of" isn't the proper phrase.

>
> --
> Three Cheers,
> Balbir

--
Kind regards,
Minchan Kim
From: Peter Zijlstra on 28 May 2010 11:30

On Sat, 2010-05-29 at 00:12 +0900, Minchan Kim wrote:
> I think highest RT priority isn't good solution.
> As I mentioned, Some RT functions don't want to be preempted by other processes
> which cause memory pressure. It makes RT task broken.

All the patches I've seen use MAX_RT_PRIO-1, which is actually FIFO-1,
which is the lowest RT priority.
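[Editor's sketch] Part of the disagreement over "highest" versus "lowest" comes from there being two priority scales. The sketch below assumes the convention used by 2.6-era kernels, as far as the editor is aware: user-visible RT priorities run from 1 to MAX_RT_PRIO-1 with higher meaning more urgent, the scheduler's internal prio is lower-is-more-urgent, and the two are related by prio = MAX_RT_PRIO-1 - rt_priority. `TOY_MAX_RT_PRIO` and `toy_rt_to_internal_prio` are hypothetical names, and the value 100 is an assumption.

```c
#include <assert.h>

#define TOY_MAX_RT_PRIO 100 /* assumed to mirror MAX_RT_PRIO */

/*
 * Map a user-visible RT priority (1..TOY_MAX_RT_PRIO-1, higher is more
 * urgent) to the internal prio scale (lower is more urgent).
 */
int toy_rt_to_internal_prio(int rt_priority)
{
    assert(rt_priority >= 1 && rt_priority <= TOY_MAX_RT_PRIO - 1);
    return TOY_MAX_RT_PRIO - 1 - rt_priority;
}
```

Under these assumptions SCHED_FIFO priority 1 lands at internal prio 98, the bottom of the RT band, and priority 99 lands at 0, the top. So which end of the band MAX_RT_PRIO-1 ends up on depends on which of the two scales a patch writes it to.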
From: Luis Claudio R. Goncalves on 28 May 2010 11:30

On Sat, May 29, 2010 at 12:12:49AM +0900, Minchan Kim wrote:
| On Fri, May 28, 2010 at 11:36:17AM -0300, Luis Claudio R. Goncalves wrote:
| > On Fri, May 28, 2010 at 11:06:23PM +0900, Minchan Kim wrote:
| > | On Fri, May 28, 2010 at 09:53:05AM -0300, Luis Claudio R. Goncalves wrote:
| > | > On Fri, May 28, 2010 at 02:59:02PM +0900, KOSAKI Motohiro wrote:
| > ...
| > | > | As far as my observation, RT-function always have some syscall. because pure
| > | > | calculation doesn't need deterministic guarantee. But _if_ you are really
| > | > | using such priority design. I'm ok maximum NonRT priority instead maximum
| > | > | RT priority too.
| > | >
| > | > I confess I failed to distinguish memcg OOM and system OOM and used "in
| > | > case of OOM kill the selected task the faster you can" as the guideline.
| > | > If the exit code path is short that shouldn't be a problem.
| > | >
| > | > Maybe the right way to go would be giving the dying task the biggest
| > | > priority inside that memcg to be sure that it will be the next process from
| > | > that memcg to be scheduled. Would that be reasonable?
| > |
| > | Hmm. I can't understand your point.
| > | What do you mean failing distinguish memcg and system OOM?
| > |
| > | We already have been distinguish it by mem_cgroup_out_of_memory.
| > | (but we have to enable CONFIG_CGROUP_MEM_RES_CTLR).
| > | So task selected in select_bad_process is one out of memcg's tasks when
| > | memcg have a memory pressure.
| >
| > The approach of giving the highest priority to the dying task makes sense
| > in a system wide OOM situation. I thought that would also be good for the
| > memcg OOM case.
| >
| > After Balbir Singh's comment, I understand that in a memcg OOM the dying
| > task should have a priority just above the priority of the main task of
| > that memcg, in order to avoid interfering in the rest of the system.
| >
| > That is the point where I failed to distinguish between memcg and system OOM.
| >
| > Should I pursue that new idea of looking for the right priority inside the
| > memcg or is it overkill? I really don't have a clear view of the impact of
| > a memcg OOM on system performance - don't know if it is better to solve the
| > issue sooner (highest RT priority) or leave it to be solved later (highest
| > prio on the memcg). I have the impression the general case points to the
| > simpler solution.
|
| I think highest RT priority isn't good solution.
| As I mentioned, Some RT functions don't want to be preempted by other processes
| which cause memory pressure. It makes RT task broken.

For the RT case, if you have reached a system OOM situation, your determinism
has already been hurt. If the memcg OOM happens in the same memcg your RT task
is in - which will probably be the case most of the time - again, the
determinism has deteriorated. In both these cases, giving the dying task
SCHED_FIFO MAX_RT_PRIO-1 means a faster recovery.

I don't know what the system-wide latency effect of a memcg OOM is, if any,
or whether it would affect an RT task running in another memcg. That is the
case where a more careful priority selection could be necessary.

| On the other hand, normal processes don't have a requirement of RT.
| But it isn't a big problem that it lost little time slice, I think.
| So how about raising max normal priority?
| but I am not sure this is right solution.
| Let's listen other's opinion.
| I believe Peter have a good idea.

Thanks again for helping to discuss this idea.

Luis
--
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]