From: Vivek Goyal on 7 Jul 2010 11:50

On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
> without further checks.
> RQ_NOIDLE can be used to mark the last request of a sequence for which
> - we want to idle between the requests of the sequence, to keep locality
> - we don't want to idle after the sequence, because we know that no new
>   nearby requests will follow, so we should switch servicing other
>   queues.

Corrado, in higher layers any WRITE_SYNC request is currently marked
as RQ_NOIDLE. At that point it is just not known whether there will be
another request after this or not. So I would not think of RQ_NOIDLE
as conclusively telling us that this is the last request in the
sequence.

I think that, with a request being WRITE_SYNC, we just don't know if the
application is going to write more immediately or not. fsync, O_SYNC etc.
fall in this category.

But in general I like the idea of getting rid of idling in as many cases
as possible. Jeff's recent posting to fix the fsync issue depends on idling
even on WRITE_SYNC queues, so your patch and his patchset are
fundamentally incompatible.

Whether to idle on WRITE_SYNC or not, I will leave to Jens (I just
don't know the answer to that question. :-)). But in general I want to
get rid of idling as much as possible; otherwise it becomes a serious
bottleneck in any kind of performance testing on higher-end storage.

At the same time, not idling runs the risk of a process doing WRITE_SYNC
not getting its fair share in the presence of sequential readers if the
writer does not keep the queue busy.

I will do some testing with this patchset a little later.

Thanks
Vivek

> This patch fixes this behaviour, making it similar to how it behaved
> before 8e55063, but still fixing the corner cases that were the
> motivation for it.
>
> Signed-off-by: Corrado Zoccolo <czoccolo(a)gmail.com>
> ---
>  block/cfq-iosched.c |   15 ++++++++++-----
>  1 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 5ef9a5d..cac3afb 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>  			cfqd->noidle_tree_requires_idle |= bitmask;
>
>  			/*
> -			 * Idling is enabled for SYNC_WORKLOAD.
> -			 * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
> -			 * only if we processed at least one !rq_noidle request
> +			 * Idling is enabled for:
> +			 * - the last sync queue of a group
> +			 * - SYNC_WORKLOAD queues, for !rq_noidle requests
> +			 * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
> +			 *   if at least one queue sent !rq_noidle requests
> +			 *   not followed by at least one rq_noidle request.
>  			 */
> -			if (cfqd->serving_type == SYNC_WORKLOAD
> -			    || cfqd->noidle_tree_requires_idle
> +			if ((cfqd->serving_type == SYNC_WORKLOAD
> +			     && !rq_noidle(rq))
> +			    || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
> +				&& cfqd->noidle_tree_requires_idle)
>  			    || cfqq->cfqg->nr_cfqq == 1)
>  				cfq_arm_slice_timer(cfqd);
>  		}
> --
> 1.6.4.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

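The effect of the quoted hunk on the idle decision can be modelled in user space. What follows is a minimal sketch, not kernel code: the two-value enum, the helper names `idle_old()` and `idle_new()`, and the plain `bool`/`int` parameters are assumptions made for illustration. Only the boolean structure of the two `if` conditions is taken from the diff.

```c
#include <stdbool.h>

/* Stand-ins for the two workload types named in the patch (illustrative). */
enum wl_type { SYNC_NOIDLE_WORKLOAD, SYNC_WORKLOAD };

/*
 * Condition before the patch: a SYNC_WORKLOAD queue always arms the
 * idle timer, and the tree-wide flag is honoured for any workload type.
 */
static bool idle_old(enum wl_type serving, bool rq_noidle,
                     bool tree_requires_idle, int nr_cfqq)
{
    (void)rq_noidle; /* the old condition never tests the request flag */
    return serving == SYNC_WORKLOAD
        || tree_requires_idle
        || nr_cfqq == 1;
}

/*
 * Condition after the patch: SYNC_WORKLOAD idles only for requests
 * without the no-idle marker, and the tree-wide flag only counts
 * while a SYNC_NOIDLE_WORKLOAD tree is being served.
 */
static bool idle_new(enum wl_type serving, bool rq_noidle,
                     bool tree_requires_idle, int nr_cfqq)
{
    return (serving == SYNC_WORKLOAD && !rq_noidle)
        || (serving == SYNC_NOIDLE_WORKLOAD && tree_requires_idle)
        || nr_cfqq == 1;
}
```

In this model, a SYNC_WORKLOAD queue whose completed request carries the no-idle marker idles under `idle_old()` but not under `idle_new()`, unless it is the only queue in its group (`nr_cfqq == 1`).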
From: Vivek Goyal on 7 Jul 2010 12:00

On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> I will do some testing with this patchset a little later.

Hmm..., noticed that you are still using Jens's old mail id. Fixing it.

Thanks
Vivek

From: Corrado Zoccolo on 7 Jul 2010 12:10

On Wed, Jul 7, 2010 at 5:46 PM, Vivek Goyal <vgoyal(a)redhat.com> wrote:
> On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
>> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
>> without further checks.
>> RQ_NOIDLE can be used to mark the last request of a sequence for which
>> - we want to idle between the requests of the sequence, to keep locality
>> - we don't want to idle after the sequence, because we know that no new
>>   nearby requests will follow, so we should switch servicing other
>>   queues.
>
> Corrado, in higher layers any WRITE_SYNC request is currently marked
> as RQ_NOIDLE. At that point it is just not known whether there will be
> another request after this or not. So I would not think of RQ_NOIDLE
> as conclusively telling us that this is the last request in the
> sequence.

WRITE_SYNC requests are probably marked as RQ_NOIDLE because the
application can always send several write requests together (while for
reads, you usually need the result of one read to send the next). This
means that cfq will actually care only about the RQ_NOIDLE on the last
request in the queue (since, until the queue is empty, we don't even
consider idling).

> I think that, with a request being WRITE_SYNC, we just don't know if the
> application is going to write more immediately or not. fsync, O_SYNC etc.
> fall in this category.
>
> But in general I like the idea of getting rid of idling in as many cases
> as possible. Jeff's recent posting to fix the fsync issue depends on idling
> even on WRITE_SYNC queues, so your patch and his patchset are
> fundamentally incompatible.

I think this can be easily fixed by removing RQ_NOIDLE from those
requests on which Jeff wants to idle. Once no more requests can ever be
marked RQ_NOIDLE, we can remove this code completely.

> Whether to idle on WRITE_SYNC or not, I will leave to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible; otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher-end storage.
>
> At the same time, not idling runs the risk of a process doing WRITE_SYNC
> not getting its fair share in the presence of sequential readers if the
> writer does not keep the queue busy.
>
> I will do some testing with this patchset a little later.

Thanks,
I've resent the patches for 2.6.36 (this version was based on 2.6.34).

Corrado

--
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo(a)gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda

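Corrado's point that only the marker on the last queued request matters can be illustrated with a toy model. This is a sketch under stated assumptions, not CFQ code: `struct req` and `would_idle_after_batch()` are hypothetical names, all requests are assumed to be queued before the first one completes, and the queue is assumed to be served as SYNC_WORKLOAD with the patched condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request: only the no-idle marker is modelled. */
struct req {
    bool noidle;
};

/*
 * Complete the n queued requests in order. Idling is considered only
 * when a completion leaves the queue empty, so only the marker on the
 * final request can influence the outcome.
 * Returns true if the modelled scheduler would idle after the batch.
 */
static bool would_idle_after_batch(const struct req *q, size_t n)
{
    bool idle = false;

    for (size_t i = 0; i < n; i++) {
        bool queue_empty = (i + 1 == n);

        if (queue_empty)
            idle = !q[i].noidle;
        /* otherwise: more requests are pending, idling is not considered */
    }
    return idle;
}
```

Markers on the earlier requests of the batch have no effect on the result. That is why tagging every WRITE_SYNC request is harmless while the queue stays busy, and why it only becomes decisive for the request that empties the queue.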
From: Christoph Hellwig on 7 Jul 2010 12:20

On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> Whether to idle on WRITE_SYNC or not, I will leave to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible; otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher-end storage.

After thinking about this for a while, I think the major problem is
that we use WRITE_SYNC for two very different I/O patterns.

One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
high-level synchronous workload in the sense that someone waits for the
I/O to finish, the I/O can still be batched as we're doing relatively
large numbers of bios.

The other one is synchronous writeout of metadata or the journal. Here
we typically wait on that single I/O we're just submitting (or at most a
handful), and there is absolutely no point in idling.

We already have the REQ_NOIDLE flag to distinguish between the two, so
instead of second-guessing we should actually make use of it.

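The split Christoph proposes could look roughly like the following. This is only a sketch of the idea: the `X_*` flag bits, the `sync_write_kind` enum and both helper functions are hypothetical and do not reproduce the kernel's actual flag values or submission paths. It also shows the proposed use of the marker, not the behaviour Vivek describes earlier in the thread, where every WRITE_SYNC request carries it.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the kernel's real names and values may differ. */
#define X_WRITE  (1u << 0)
#define X_SYNC   (1u << 1)
#define X_NOIDLE (1u << 2)

enum sync_write_kind {
    SYNC_DATA,     /* O_SYNC / O_DIRECT / fsync data: more bios likely follow */
    SYNC_METADATA  /* metadata or journal: one I/O, or at most a handful */
};

/* Submitter side: only the second pattern is tagged as not worth idling for. */
static unsigned int flags_for(enum sync_write_kind kind)
{
    unsigned int flags = X_WRITE | X_SYNC;

    if (kind == SYNC_METADATA)
        flags |= X_NOIDLE;
    return flags;
}

/* Scheduler side: trust the tag instead of guessing from the workload. */
static bool worth_idling(unsigned int flags)
{
    return (flags & X_SYNC) && !(flags & X_NOIDLE);
}
```

The design choice is that the submitter, which knows whether more I/O is coming, makes the decision once, and the scheduler only reads the result.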
From: Vivek Goyal on 7 Jul 2010 13:20

On Wed, Jul 07, 2010 at 12:13:08PM -0400, Christoph Hellwig wrote:
> On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> > Whether to idle on WRITE_SYNC or not, I will leave to Jens (I just
> > don't know the answer to that question. :-)). But in general I want to
> > get rid of idling as much as possible; otherwise it becomes a serious
> > bottleneck in any kind of performance testing on higher-end storage.
>
> After thinking about this for a while, I think the major problem is
> that we use WRITE_SYNC for two very different I/O patterns.
>
> One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
> high-level synchronous workload in the sense that someone waits for the
> I/O to finish, the I/O can still be batched as we're doing relatively
> large numbers of bios.
>
> The other one is synchronous writeout of metadata or the journal.

Jeff Moyer had mentioned that, in his testing, journal writes from jbd
threads were appearing as asynchronous (WRITES) in CFQ, and we don't do
any kind of idling in CFQ on asynchronous WRITES. So this is probably
already a non-issue.

Thanks
Vivek

> Here
> we typically wait on that single I/O we're just submitting (or at most a
> handful), and there is absolutely no point in idling.
>
> We already have the REQ_NOIDLE flag to distinguish between the two, so
> instead of second-guessing we should actually make use of it.