From: Mike Snitzer on 1 Jul 2010 16:20

On Thu, Jul 01 2010 at 9:03am -0400,
Mike Snitzer <snitzer(a)redhat.com> wrote:

> On Thu, Jul 01 2010 at 6:49am -0400,
> FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
>
> > This fixes discard page leak by using q->unprep_rq_fn facility.
> >
> > q->unprep_rq_fn is called when all the data buffer (req->bio and
> > scsi_data_buffer) in the request is freed.
> >
> > sd_unprep() uses rq->buffer to free discard page allocated in
> > sd_prepare_discard().
> >
> > Signed-off-by: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
>
> Thanks for sorting this out Tomo, all 3 patches work great!
>
> BTW, there is one remaining (rare) leak in the allocation path.
>
> The following patch serves to fix it but I'm not sure if there is a more
> elegant way to address this.

I've continued to look at this to arrive at an alternative implementation.
Here is a summary of the problem:

A 'scsi_setup_discard_cmnd' return other than BLKPREP_OK will not cause
a discard request to get completely stripped down ('blk_finish_request'
isn't calling 'blk_unprep_request' because REQ_DONTPREP is not set by
'scsi_prep_return' for a non-BLKPREP_OK return). Therefore the discard
request's page will _not_ get cleaned up.

Aside from code inspection, I confirmed this by adding some test code to
force a one-time initial BLKPREP_DEFER return from
'scsi_setup_discard_cmnd'.

> An alternative would be to check if the page is already allocated
> (before allocating the page in scsi_setup_discard_cmnd)?

Unfortunately this "alternative" won't work because it completely ignores
the case where BLKPREP_KILL is returned from 'scsi_setup_discard_cmnd'.

> Please advise, thanks.

In short, I'm not too happy that the following patch doesn't allow for
centralized cleanup of the discard request's page (via sd_unprep_fn).
But in order to do that we'd likely have to:
1) relax blk_finish_request's REQ_DONTPREP constraint
2) add other weird conditionals within blk_unprep_request because
   the discard request wasn't _really_ prepared?

So given this I'm inclined to stick with the following patch.

Jens and/or James, what do you think?

Mike

> From: Mike Snitzer <snitzer(a)redhat.com>
> Subject: scsi: address leak in the error path of discard page allocation
>
> Be sure to free the discard page if scsi_setup_blk_pc_cmnd fails.
> E.g. Returning BLKPREP_DEFER from scsi_setup_blk_pc_cmnd will not cause
> the request to be processed by sd_unprep_fn before the request is
> retried (preparation included).
>
> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
>
> ---
>  block/blk-core.c       |   23 +++++++++++++++++++++++
>  drivers/scsi/sd.c      |    6 +++++-
>  include/linux/blkdev.h |    1 +
>  3 files changed, 29 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/drivers/scsi/sd.c
> ===================================================================
> --- linux-2.6.orig/drivers/scsi/sd.c
> +++ linux-2.6/drivers/scsi/sd.c
> @@ -466,7 +466,11 @@ static int scsi_setup_discard_cmnd(struc
>
>  	blk_add_request_payload(rq, page, len);
>  	ret = scsi_setup_blk_pc_cmnd(sdp, rq);
> -	rq->buffer = page_address(page);
> +	if (ret != BLKPREP_OK) {
> +		blk_clear_request_payload(rq);
> +		__free_page(page);
> +	} else
> +		rq->buffer = page_address(page);
>  	return ret;
>  }
>
> Index: linux-2.6/block/blk-core.c
> ===================================================================
> --- linux-2.6.orig/block/blk-core.c
> +++ linux-2.6/block/blk-core.c
> @@ -1164,6 +1164,29 @@ void blk_add_request_payload(struct requ
>  }
>  EXPORT_SYMBOL_GPL(blk_add_request_payload);
>
> +/**
> + * blk_clear_request_payload - clear a request's payload
> + * @rq: request to update
> + *
> + * The driver needs to take care of freeing the payload itself.
> + */
> +void blk_clear_request_payload(struct request *rq)
> +{
> +	struct bio *bio = rq->bio;
> +
> +	rq->__data_len = rq->resid_len = 0;
> +	rq->nr_phys_segments = 0;
> +	rq->buffer = NULL;
> +
> +	bio->bi_size = 0;
> +	bio->bi_vcnt = 0;
> +	bio->bi_phys_segments = 0;
> +
> +	bio->bi_io_vec->bv_page = NULL;
> +	bio->bi_io_vec->bv_len = 0;
> +}
> +EXPORT_SYMBOL_GPL(blk_clear_request_payload);
> +
>  void init_request_from_bio(struct request *req, struct bio *bio)
>  {
>  	req->cpu = bio->bi_comp_cpu;
> Index: linux-2.6/include/linux/blkdev.h
> ===================================================================
> --- linux-2.6.orig/include/linux/blkdev.h
> +++ linux-2.6/include/linux/blkdev.h
> @@ -781,6 +781,7 @@ extern void blk_insert_request(struct re
>  extern void blk_requeue_request(struct request_queue *, struct request *);
>  extern void blk_add_request_payload(struct request *rq, struct page *page,
>  				    unsigned int len);
> +extern void blk_clear_request_payload(struct request *rq);
>  extern int blk_rq_check_limits(struct request_queue *q, struct request *rq);
>  extern int blk_lld_busy(struct request_queue *q);
>  extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
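
The leak summarized in the message above can be reproduced in miniature outside the kernel. What follows is a minimal userspace sketch, not kernel code: every `toy_*` name is hypothetical, `malloc` stands in for the page allocation, `dontprep` stands in for REQ_DONTPREP, and the one-shot `toy_force_defer` switch imitates the test code that forced an initial BLKPREP_DEFER.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for BLKPREP_OK / BLKPREP_DEFER. */
enum toy_prep { TOY_PREP_OK, TOY_PREP_DEFER };

struct toy_request {
	void *buffer;   /* plays the role of rq->buffer */
	int dontprep;   /* plays the role of the REQ_DONTPREP flag */
};

int toy_live_pages;   /* allocations not yet freed */
int toy_force_defer;  /* one-shot fault-injection switch */

/* Prep without the fix: the page exists before the setup result is known. */
enum toy_prep toy_prep_leaky(struct toy_request *rq)
{
	void *page = malloc(4096);

	if (!page)
		return TOY_PREP_DEFER;
	toy_live_pages++;

	if (toy_force_defer) {
		toy_force_defer = 0;    /* fail only the first attempt */
		return TOY_PREP_DEFER;  /* nothing references the page: leaked */
	}
	rq->buffer = page;
	rq->dontprep = 1;
	return TOY_PREP_OK;
}

/* Completion: teardown runs only for requests that were marked prepped. */
void toy_finish(struct toy_request *rq)
{
	if (!rq->dontprep)
		return;
	free(rq->buffer);
	toy_live_pages--;
	rq->buffer = NULL;
	rq->dontprep = 0;
}

/* One deferred attempt, a retry, then completion; returns pages left over. */
int toy_run_once(void)
{
	struct toy_request rq = { 0 };

	toy_live_pages = 0;
	toy_force_defer = 1;
	while (toy_prep_leaky(&rq) != TOY_PREP_OK)
		;  /* retry, as a requeued request would be prepped again */
	toy_finish(&rq);
	return toy_live_pages;
}
```

Calling `toy_run_once()` leaves one page outstanding: the retry allocates a second page, and completion only ever sees that second one.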
From: James Bottomley on 1 Jul 2010 16:20

On Thu, 2010-07-01 at 16:15 -0400, Mike Snitzer wrote:
> On Thu, Jul 01 2010 at 9:03am -0400,
> Mike Snitzer <snitzer(a)redhat.com> wrote:
>
> > On Thu, Jul 01 2010 at 6:49am -0400,
> > FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
> >
> > > This fixes discard page leak by using q->unprep_rq_fn facility.
> > >
> > > q->unprep_rq_fn is called when all the data buffer (req->bio and
> > > scsi_data_buffer) in the request is freed.
> > >
> > > sd_unprep() uses rq->buffer to free discard page allocated in
> > > sd_prepare_discard().
> > >
> > > Signed-off-by: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> >
> > Thanks for sorting this out Tomo, all 3 patches work great!
> >
> > BTW, there is one remaining (rare) leak in the allocation path.
> >
> > The following patch serves to fix it but I'm not sure if there is a more
> > elegant way to address this.
>
> I've continued to look at this to arrive at an alternative implementation.
> Here is a summary of the problem:
>
> A 'scsi_setup_discard_cmnd' return other than BLKPREP_OK will not cause
> a discard request to get completely stripped down ('blk_finish_request'
> isn't calling 'blk_unprep_request' because REQ_DONTPREP is not set by
> 'scsi_prep_return' for a non-BLKPREP_OK return). Therefore the discard
> request's page will _not_ get cleaned up.
>
> Aside from code inspection, I confirmed this by adding some test code to
> force a one-time initial BLKPREP_DEFER return from
> 'scsi_setup_discard_cmnd'.
>
> > An alternative would be to check if the page is already allocated
> > (before allocating the page in scsi_setup_discard_cmnd)?
>
> Unfortunately this "alternative" won't work because it completely ignores
> the case where BLKPREP_KILL is returned from 'scsi_setup_discard_cmnd'.
>
> > Please advise, thanks.
>
> In short, I'm not too happy that the following patch doesn't allow for
> centralized cleanup of the discard request's page (via sd_unprep_fn).
> But in order to do that we'd likely have to:
> 1) relax blk_finish_request's REQ_DONTPREP constraint
> 2) add other weird conditionals within blk_unprep_request because
>    the discard request wasn't _really_ prepared?
>
> So given this I'm inclined to stick with the following patch.
>
> Jens and/or James, what do you think?

The rules are pretty clear: Unprep is only called if the request gets
prepped ... that means you have to return BLKPREP_OK. Defer or kill
assume there's no teardown to do, so the allocation (if it took place)
must be reversed before returning them.

James
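
The rule James states can be sketched as a small userspace model. This is an illustration under stated assumptions rather than the sd.c code: the `toy_*` names are hypothetical, `malloc`/`free` stand in for the page allocator, and `setup_result` stands in for whatever the command setup step returned.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for BLKPREP_OK / BLKPREP_KILL / BLKPREP_DEFER. */
enum toy_prep { TOY_PREP_OK, TOY_PREP_KILL, TOY_PREP_DEFER };

struct toy_request {
	void *buffer;   /* plays the role of rq->buffer */
	int dontprep;   /* plays the role of the REQ_DONTPREP flag */
};

int toy_live_pages;   /* allocations not yet freed */

/* Prep that follows the rule: any non-OK return leaves nothing allocated. */
enum toy_prep toy_prep_strict(struct toy_request *rq, enum toy_prep setup_result)
{
	void *page = malloc(4096);

	if (!page)
		return TOY_PREP_DEFER;
	toy_live_pages++;

	if (setup_result != TOY_PREP_OK) {
		/* No unprep will follow, so reverse the allocation here. */
		free(page);
		toy_live_pages--;
		return setup_result;
	}
	rq->buffer = page;
	rq->dontprep = 1;
	return TOY_PREP_OK;
}

/* Unprep: only has work to do for a request that was really prepped. */
void toy_unprep(struct toy_request *rq)
{
	if (!rq->dontprep)
		return;
	free(rq->buffer);
	toy_live_pages--;
	rq->buffer = NULL;
	rq->dontprep = 0;
}
```

With this shape, the defer and kill paths each return with `toy_live_pages` unchanged, and only a successful prep leaves a page for `toy_unprep()` to release.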
From: Mike Snitzer on 1 Jul 2010 17:10

On Thu, Jul 01 2010 at 4:19pm -0400,
James Bottomley <James.Bottomley(a)suse.de> wrote:

> On Thu, 2010-07-01 at 16:15 -0400, Mike Snitzer wrote:
> > On Thu, Jul 01 2010 at 9:03am -0400,
> > Mike Snitzer <snitzer(a)redhat.com> wrote:
> >
> > > On Thu, Jul 01 2010 at 6:49am -0400,
> > > FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
> > >
> > > > This fixes discard page leak by using q->unprep_rq_fn facility.
> > > >
> > > > q->unprep_rq_fn is called when all the data buffer (req->bio and
> > > > scsi_data_buffer) in the request is freed.
> > > >
> > > > sd_unprep() uses rq->buffer to free discard page allocated in
> > > > sd_prepare_discard().
> > > >
> > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> > >
> > > Thanks for sorting this out Tomo, all 3 patches work great!
> > >
> > > BTW, there is one remaining (rare) leak in the allocation path.
> > >
> > > The following patch serves to fix it but I'm not sure if there is a more
> > > elegant way to address this.
> >
> > I've continued to look at this to arrive at an alternative implementation.
> > Here is a summary of the problem:
> >
> > A 'scsi_setup_discard_cmnd' return other than BLKPREP_OK will not cause
> > a discard request to get completely stripped down ('blk_finish_request'
> > isn't calling 'blk_unprep_request' because REQ_DONTPREP is not set by
> > 'scsi_prep_return' for a non-BLKPREP_OK return). Therefore the discard
> > request's page will _not_ get cleaned up.
> >
> > Aside from code inspection, I confirmed this by adding some test code to
> > force a one-time initial BLKPREP_DEFER return from
> > 'scsi_setup_discard_cmnd'.
> >
> > > An alternative would be to check if the page is already allocated
> > > (before allocating the page in scsi_setup_discard_cmnd)?
> >
> > Unfortunately this "alternative" won't work because it completely ignores
> > the case where BLKPREP_KILL is returned from 'scsi_setup_discard_cmnd'.
> >
> > > Please advise, thanks.
> >
> > In short, I'm not too happy that the following patch doesn't allow for
> > centralized cleanup of the discard request's page (via sd_unprep_fn).
> > But in order to do that we'd likely have to:
> > 1) relax blk_finish_request's REQ_DONTPREP constraint
> > 2) add other weird conditionals within blk_unprep_request because
> >    the discard request wasn't _really_ prepared?
> >
> > So given this I'm inclined to stick with the following patch.
> >
> > Jens and/or James, what do you think?
>
> The rules are pretty clear: Unprep is only called if the request gets
> prepped ... that means you have to return BLKPREP_OK. Defer or kill
> assume there's no teardown to do, so the allocation (if it took place)
> must be reversed before returning them

OK, thanks for clarifying. This confirms that the general approach I
took in this patch is correct.

It remains to be seen if Jens is agreeable with
blk_clear_request_payload. I know Christoph thought my introduction and
use of blk_clear_request_payload was reasonable.

Christoph, please feel free to add your Ack to this patch if you
approve. I look forward to feedback from Tomo and Jens now too.
Hopefully Jens will pick this patch up.

regards,
Mike
From: FUJITA Tomonori on 2 Jul 2010 01:00

On Thu, 01 Jul 2010 15:19:08 -0500
James Bottomley <James.Bottomley(a)suse.de> wrote:

> On Thu, 2010-07-01 at 16:15 -0400, Mike Snitzer wrote:
> > On Thu, Jul 01 2010 at 9:03am -0400,
> > Mike Snitzer <snitzer(a)redhat.com> wrote:
> >
> > > On Thu, Jul 01 2010 at 6:49am -0400,
> > > FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
> > >
> > > > This fixes discard page leak by using q->unprep_rq_fn facility.
> > > >
> > > > q->unprep_rq_fn is called when all the data buffer (req->bio and
> > > > scsi_data_buffer) in the request is freed.
> > > >
> > > > sd_unprep() uses rq->buffer to free discard page allocated in
> > > > sd_prepare_discard().
> > > >
> > > > Signed-off-by: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> > >
> > > Thanks for sorting this out Tomo, all 3 patches work great!
> > >
> > > BTW, there is one remaining (rare) leak in the allocation path.
> > >
> > > The following patch serves to fix it but I'm not sure if there is a more
> > > elegant way to address this.
> >
> > I've continued to look at this to arrive at an alternative implementation.
> > Here is a summary of the problem:
> >
> > A 'scsi_setup_discard_cmnd' return other than BLKPREP_OK will not cause
> > a discard request to get completely stripped down ('blk_finish_request'
> > isn't calling 'blk_unprep_request' because REQ_DONTPREP is not set by
> > 'scsi_prep_return' for a non-BLKPREP_OK return). Therefore the discard
> > request's page will _not_ get cleaned up.
> >
> > Aside from code inspection, I confirmed this by adding some test code to
> > force a one-time initial BLKPREP_DEFER return from
> > 'scsi_setup_discard_cmnd'.
> >
> > > An alternative would be to check if the page is already allocated
> > > (before allocating the page in scsi_setup_discard_cmnd)?
> >
> > Unfortunately this "alternative" won't work because it completely ignores
> > the case where BLKPREP_KILL is returned from 'scsi_setup_discard_cmnd'.
> >
> > > Please advise, thanks.
> >
> > In short, I'm not too happy that the following patch doesn't allow for
> > centralized cleanup of the discard request's page (via sd_unprep_fn).
> > But in order to do that we'd likely have to:
> > 1) relax blk_finish_request's REQ_DONTPREP constraint
> > 2) add other weird conditionals within blk_unprep_request because
> >    the discard request wasn't _really_ prepared?
> >
> > So given this I'm inclined to stick with the following patch.
> >
> > Jens and/or James, what do you think?
>
> The rules are pretty clear: Unprep is only called if the request gets
> prepped ... that means you have to return BLKPREP_OK. Defer or kill
> assume there's no teardown to do, so the allocation (if it took place)
> must be reversed before returning them

It seems that scsi-ml calls scsi_unprep_request() for not-prepped
requests in the scsi_init_io error path. So we could move that
scsi_unprep_request() call to the error path in scsi_prep_return().
Then we can free the discard page in a single place.

Applying the rule strictly is fine by me too; we remove
scsi_unprep_request() in the scsi_init_io error path and clean up things
in each prep function's error path.

Btw, is blk_clear_request_payload() necessary? Making sure that a
request is clean is not a bad idea, but if we hit BLKPREP_KILL or
BLKPREP_DEFER, we call blk_end_request(). Can't blk_end_request() free a
request properly even if we don't do something like
blk_clear_request_payload()?
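
The first option in the message above, a single cleanup point on the prep return path, can be sketched in userspace as follows. This is only a model of the idea: the `toy_*` names are hypothetical, `malloc`/`free` stand in for the page allocator, and it assumes the page is attached to the request before the setup result is known, so that the common teardown can find it (the patch earlier in the thread assigns rq->buffer only on success).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for BLKPREP_OK / BLKPREP_KILL / BLKPREP_DEFER. */
enum toy_prep { TOY_PREP_OK, TOY_PREP_KILL, TOY_PREP_DEFER };

struct toy_request {
	void *buffer;   /* plays the role of rq->buffer */
};

int toy_live_pages;   /* allocations not yet freed */

/* Single teardown routine; harmless on a request with nothing attached. */
void toy_unprep(struct toy_request *rq)
{
	if (!rq->buffer)
		return;
	free(rq->buffer);
	toy_live_pages--;
	rq->buffer = NULL;
}

/* Setup attaches the page up front and does no cleanup of its own. */
enum toy_prep toy_setup(struct toy_request *rq, enum toy_prep result)
{
	void *page = malloc(4096);

	if (!page)
		return TOY_PREP_DEFER;
	toy_live_pages++;
	rq->buffer = page;
	return result;   /* 'result' imitates the outcome of command setup */
}

/* Common exit point: every non-OK result is torn down here, once. */
enum toy_prep toy_prep_return(struct toy_request *rq, enum toy_prep ret)
{
	if (ret != TOY_PREP_OK)
		toy_unprep(rq);
	return ret;
}
```

The trade-off against the strict rule is that the teardown routine must tolerate partially prepared requests, which is the kind of extra conditional Mike was wary of.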
From: Christoph Hellwig on 2 Jul 2010 06:50

On Thu, Jul 01, 2010 at 09:03:28AM -0400, Mike Snitzer wrote:
> On Thu, Jul 01 2010 at 6:49am -0400,
> FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
>
> > This fixes discard page leak by using q->unprep_rq_fn facility.
> >
> > q->unprep_rq_fn is called when all the data buffer (req->bio and
> > scsi_data_buffer) in the request is freed.
> >
> > sd_unprep() uses rq->buffer to free discard page allocated in
> > sd_prepare_discard().
> >
> > Signed-off-by: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
>
> Thanks for sorting this out Tomo, all 3 patches work great!
>
> BTW, there is one remaining (rare) leak in the allocation path.
>
> The following patch serves to fix it but I'm not sure if there is a more
> elegant way to address this.
>
> An alternative would be to check if the page is already allocated
> (before allocating the page in scsi_setup_discard_cmnd)?

Ah, should have read your mail first, sorry..