From: Wu Fengguang
On Tue, Aug 03, 2010 at 06:55:20PM +0800, Jan Kara wrote:
> On Tue 03-08-10 11:01:25, Wu Fengguang wrote:
> > On Tue, Aug 03, 2010 at 04:51:52AM +0800, Jan Kara wrote:
> > > On Fri 30-07-10 12:03:06, Wu Fengguang wrote:
> > > > On Fri, Jul 30, 2010 at 12:20:27AM +0800, Jan Kara wrote:
> > > > > On Thu 29-07-10 19:51:44, Wu Fengguang wrote:
> > > > > > The periodic/background writeback can run forever. So when any
> > > > > > sync work is enqueued, increase bdi->sync_works to notify the
> > > > > > active non-sync works to exit. Non-sync works queued after sync
> > > > > > works won't be affected.
> > > > > Hmm, wouldn't it be simpler logic to just make for_kupdate and
> > > > > for_background work always yield when there's some other work to do (as
> > > > > they are livelockable from the definition of the target they have) and
> > > > > make sure any other work isn't livelockable?
> > > >
> > > > Good idea!
> > > >
> > > > > The only downside is that
> > > > > non-livelockable work cannot be "fair" in the sense that we cannot switch
> > > > > inodes after writing MAX_WRITEBACK_PAGES.
> > > >
> > > > Cannot switch inodes _before_ finishing the current
> > > > MAX_WRITEBACK_PAGES batch?
> > > Well, even after writing all those MAX_WRITEBACK_PAGES. Because what you
> > > want to do in a non-livelockable work is: take inode, write it, never look at
> > > it again for this work. Because if you later return to the inode, it can
> > > have newer dirty pages and thus you cannot really avoid livelock. Of
> > > course, this all assumes .nr_to_write isn't set to something small. That
> > > avoids the livelock as well.
> >
> > I do have a poor man's solution that can handle this case.
> > https://kerneltrap.org/mailarchive/linux-fsdevel/2009/10/7/6476473/thread
> > It may do some extra work, but should stop livelock in theory.
> So I don't think sync work on its own is a problem. There we can just
> give up any fairness and just go inode by inode. IMHO it's much simpler that
> way.

I would like to reserve my opinion here. IMHO small files had
better get synced first :)

> The remaining types of work we have are "for_reclaim" and then ones
> triggered by filesystems to get rid of delayed allocated data. These cases
> can easily have well defined and low nr_to_write so they wouldn't be
> livelockable either. What do you think?

Right. for_reclaim work won't livelock by itself, since it is
bounded by either nr_to_write or some range. It may be delayed for a
while by large sync or nr_pages works, though.
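
To illustrate why a bounded nr_to_write rules out livelock, here is a
minimal userspace sketch (a toy model, not kernel code). The names
toy_inode, toy_bounded_writeback, redirty_rate and batch are made up for
illustration; only nr_to_write comes from the discussion. The loop revisits
inodes that keep getting redirtied, yet it still terminates because every
page written counts against the fixed budget.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with dirty pages. */
struct toy_inode {
	long dirty_pages;   /* pages currently dirty */
	long redirty_rate;  /* pages a writer redirties after each visit */
};

/*
 * Write at most nr_to_write pages in total, visiting inodes round-robin
 * and writing at most 'batch' pages per visit. Returns pages written.
 * Termination does not depend on redirty_rate: each visit that writes
 * anything consumes part of the fixed nr_to_write budget.
 */
long toy_bounded_writeback(struct toy_inode *inodes, size_t n,
			   long nr_to_write, long batch)
{
	long written = 0;

	assert(batch > 0);

	while (written < nr_to_write) {
		int progress = 0;

		for (size_t i = 0; i < n && written < nr_to_write; i++) {
			long chunk = inodes[i].dirty_pages;

			if (chunk > batch)
				chunk = batch;
			if (chunk > nr_to_write - written)
				chunk = nr_to_write - written;
			if (chunk <= 0)
				continue;

			inodes[i].dirty_pages -= chunk;
			written += chunk;
			progress = 1;

			/* Someone dirties more pages behind our back. */
			inodes[i].dirty_pages += inodes[i].redirty_rate;
		}

		if (!progress)
			break;  /* nothing left to write */
	}

	return written;
}
```

In this model an inode with a redirty_rate larger than batch never becomes
clean, but the work still ends once the budget is spent. An unbounded work
(think nr_to_write = LONG_MAX) over the same inode would never finish,
which is the case the "never look at the inode again" rule addresses.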

> > A related question is, what if some for_reclaim works get enqueued?
> > Shall we postpone the sync work as well? The global sync is not likely
> > to hit the dirty pages in a small memcg, or may take a long time. It
> > seems not a high priority task though.
> I see some incentive to do this but the simple thing with for_background
> and for_kupdate work is that they are essentially state-less and so they
> can be easily (and automatically) restarted. It would be really hard to
> implement something like this for sync and still avoid livelocks.

So let's ignore the issue for now.
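
For reference, the yielding rule discussed earlier in this thread
(for_kupdate and for_background work gives way whenever other work is
queued, everything else runs to completion) could be sketched roughly as
below. This is only an illustration under assumptions: toy_work and
toy_should_yield are hypothetical names, and other_works_queued stands in
for whatever check on the bdi work list the real code would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the writeback work descriptor. */
struct toy_work {
	bool for_kupdate;     /* periodic writeback */
	bool for_background;  /* background writeback */
};

/*
 * Return true if 'work' should stop now so that queued work can run.
 * Only kupdate/background work yields: it is stateless and will be
 * restarted automatically. Any other work is assumed to be bounded
 * (e.g. by nr_to_write or inode-by-inode sync) and is never interrupted.
 */
bool toy_should_yield(const struct toy_work *work, int other_works_queued)
{
	assert(work != NULL);

	if (!work->for_kupdate && !work->for_background)
		return false;

	return other_works_queued > 0;
}
```

A writeback loop would call such a check between batches and break out
when it returns true. Sync work never yields in this sketch, which matches
the decision above to leave the for_reclaim vs. sync ordering alone for
now.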

Thanks,
Fengguang
