Prev: [PATCH 08/13] ARM: add PrimeCell generic DMA to PL011 v8
Next: ib/ehca: bitmask handling for lock_hcalls
From: Eric Sandeen on 11 Jun 2010 12:00 Tao Ma wrote: > In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want > to return fi_extent_max extents, but actually it won't work for > a sparse file. The reason is that in xfs_getbmap we will > calculate holes and set it in 'out', while out is malloced by > bmv_count(fi_extent_max+1) which didn't consider holes. So in the > worst case, if 'out' vector looks like > [hole, extent, hole, extent, hole, ... hole, extent, hole], > we will only return half of fi_extent_max extents. > > So in xfs_vn_fiemap, we should consider this worst case. If the > user wants fi_extent_max extents, we need a 'out' with size of > 2 *fi_extent_max + 1. This all seems right to me, though your commit message above (+1) doesn't match the comment and code in the patch (+2) -Eric > Cc: Alex Elder <aelder(a)sgi.com> > Cc: Christoph Hellwig <hch(a)lst.de> > Cc: Dave Chinner <david(a)fromorbit.com> > Signed-off-by: Tao Ma <tao.ma(a)oracle.com> > --- > fs/xfs/linux-2.6/xfs_iops.c | 16 ++++++++++++++-- > 1 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c > index 9c8019c..1db92e3 100644 > --- a/fs/xfs/linux-2.6/xfs_iops.c > +++ b/fs/xfs/linux-2.6/xfs_iops.c > @@ -672,9 +672,21 @@ xfs_vn_fiemap( > else > bm.bmv_length = BTOBB(length); > > - /* We add one because in getbmap world count includes the header */ > + /* > + * It is a bit tricky for us to calculate the bmv_count from > + * fi_extent_max. > + * If we support to return fi_extent_max extents to the user, > + * we need at most 2 * fi_extent_max + 1 for bmv_count since > + * in xfs_getbmap we will calculate holes while fi_extent_max > + * don't have them. So in the worst case, bmv can looks like > + * [hole, extent, hole, extent, hole, ... hole, extent, hole]. > + * So there will be 2 *fi_extent_max + 1. > + * What's more, in getbmap world count have to include the > + * header, so we need another bmv. So the total number will > + * be 2 * fieinfo->fi_extents_max + 2. > + */ > bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM : > - fieinfo->fi_extents_max + 1; > + 2 * fieinfo->fi_extents_max + 2; > bm.bmv_count = min_t(__s32, bm.bmv_count, > (PAGE_SIZE * 16 / sizeof(struct getbmapx))); > bm.bmv_iflags = BMV_IF_PREALLOC; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Tao Ma on 11 Jun 2010 19:40 Eric Sandeen wrote: > Tao Ma wrote: > >> In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want >> to return fi_extent_max extents, but actually it won't work for >> a sparse file. The reason is that in xfs_getbmap we will >> calculate holes and set it in 'out', while out is malloced by >> bmv_count(fi_extent_max+1) which didn't consider holes. So in the >> worst case, if 'out' vector looks like >> [hole, extent, hole, extent, hole, ... hole, extent, hole], >> we will only return half of fi_extent_max extents. >> >> So in xfs_vn_fiemap, we should consider this worst case. If the >> user wants fi_extent_max extents, we need a 'out' with size of >> 2 *fi_extent_max + 1. >> > > This all seems right to me, though your commit message above (+1) > doesn't match the comment and code in the patch (+2) > oh, yes, I will change the commit log and send a v2. and can I add your ack for this patch? Regards, Tao > -Eric > > >> Cc: Alex Elder <aelder(a)sgi.com> >> Cc: Christoph Hellwig <hch(a)lst.de> >> Cc: Dave Chinner <david(a)fromorbit.com> >> Signed-off-by: Tao Ma <tao.ma(a)oracle.com> >> --- >> fs/xfs/linux-2.6/xfs_iops.c | 16 ++++++++++++++-- >> 1 files changed, 14 insertions(+), 2 deletions(-) >> >> diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c >> index 9c8019c..1db92e3 100644 >> --- a/fs/xfs/linux-2.6/xfs_iops.c >> +++ b/fs/xfs/linux-2.6/xfs_iops.c >> @@ -672,9 +672,21 @@ xfs_vn_fiemap( >> else >> bm.bmv_length = BTOBB(length); >> >> - /* We add one because in getbmap world count includes the header */ >> + /* >> + * It is a bit tricky for us to calculate the bmv_count from >> + * fi_extent_max. >> + * If we support to return fi_extent_max extents to the user, >> + * we need at most 2 * fi_extent_max + 1 for bmv_count since >> + * in xfs_getbmap we will calculate holes while fi_extent_max >> + * don't have them. So in the worst case, bmv can looks like >> + * [hole, extent, hole, extent, hole, ... hole, extent, hole]. >> + * So there will be 2 *fi_extent_max + 1. >> + * What's more, in getbmap world count have to include the >> + * header, so we need another bmv. So the total number will >> + * be 2 * fieinfo->fi_extents_max + 2. >> + */ >> bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM : >> - fieinfo->fi_extents_max + 1; >> + 2 * fieinfo->fi_extents_max + 2; >> bm.bmv_count = min_t(__s32, bm.bmv_count, >> (PAGE_SIZE * 16 / sizeof(struct getbmapx))); >> bm.bmv_iflags = BMV_IF_PREALLOC; >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo(a)vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Eric Sandeen on 11 Jun 2010 20:00 Tao Ma wrote: > Eric Sandeen wrote: >> Tao Ma wrote: >> >>> In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want >>> to return fi_extent_max extents, but actually it won't work for >>> a sparse file. The reason is that in xfs_getbmap we will >>> calculate holes and set it in 'out', while out is malloced by >>> bmv_count(fi_extent_max+1) which didn't consider holes. So in the >>> worst case, if 'out' vector looks like >>> [hole, extent, hole, extent, hole, ... hole, extent, hole], >>> we will only return half of fi_extent_max extents. >>> >>> So in xfs_vn_fiemap, we should consider this worst case. If the >>> user wants fi_extent_max extents, we need a 'out' with size of >>> 2 *fi_extent_max + 1. >>> >> >> This all seems right to me, though your commit message above (+1) >> doesn't match the comment and code in the patch (+2) >> > oh, yes, I will change the commit log and send a v2. > and can I add your ack for this patch? Sure, it seems right to me. Thanks, -Eric > Regards, > Tao >> -Eric >> >> >>> Cc: Alex Elder <aelder(a)sgi.com> >>> Cc: Christoph Hellwig <hch(a)lst.de> >>> Cc: Dave Chinner <david(a)fromorbit.com> >>> Signed-off-by: Tao Ma <tao.ma(a)oracle.com> >>> --- >>> fs/xfs/linux-2.6/xfs_iops.c | 16 ++++++++++++++-- >>> 1 files changed, 14 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c >>> index 9c8019c..1db92e3 100644 >>> --- a/fs/xfs/linux-2.6/xfs_iops.c >>> +++ b/fs/xfs/linux-2.6/xfs_iops.c >>> @@ -672,9 +672,21 @@ xfs_vn_fiemap( >>> else >>> bm.bmv_length = BTOBB(length); >>> >>> - /* We add one because in getbmap world count includes the header */ >>> + /* >>> + * It is a bit tricky for us to calculate the bmv_count from >>> + * fi_extent_max. >>> + * If we support to return fi_extent_max extents to the user, >>> + * we need at most 2 * fi_extent_max + 1 for bmv_count since >>> + * in xfs_getbmap we will calculate holes while fi_extent_max >>> + * don't have them. So in the worst case, bmv can looks like >>> + * [hole, extent, hole, extent, hole, ... hole, extent, hole]. >>> + * So there will be 2 *fi_extent_max + 1. >>> + * What's more, in getbmap world count have to include the >>> + * header, so we need another bmv. So the total number will >>> + * be 2 * fieinfo->fi_extents_max + 2. >>> + */ >>> bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM : >>> - fieinfo->fi_extents_max + 1; >>> + 2 * fieinfo->fi_extents_max + 2; >>> bm.bmv_count = min_t(__s32, bm.bmv_count, >>> (PAGE_SIZE * 16 / sizeof(struct getbmapx))); >>> bm.bmv_iflags = BMV_IF_PREALLOC; >>> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe >> linux-kernel" in >> the body of a message to majordomo(a)vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ >> > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Dave Chinner on 13 Jun 2010 20:30 On Sat, Jun 12, 2010 at 10:08:15AM +0800, Tao Ma wrote: > In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want > to return fi_extent_max extents, but actually it won't work for > a sparse file. Define "won't work". i.e. what's the test case? I just created a sparse file and checked it, and it reported all the extents in it: # xfs_bmap -vp testfile testfile: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS 0: [0..7]: hole 8 1: [8..15]: 96..103 0 (96..103) 8 00000 2: [16..23]: hole 8 3: [24..31]: 112..119 0 (112..119) 8 00000 4: [32..39]: hole 8 5: [40..47]: 128..135 0 (128..135) 8 00000 6: [48..55]: hole 8 7: [56..63]: 144..151 0 (144..151) 8 00000 8: [64..71]: hole 8 9: [72..79]: 160..167 0 (160..167) 8 00000 10: [80..87]: hole 8 11: [88..95]: 176..183 0 (176..183) 8 00000 12: [96..103]: hole 8 13: [104..111]: 192..199 0 (192..199) 8 00000 14: [112..119]: hole 8 15: [120..127]: 208..215 0 (208..215) 8 00000 # filefrag -v testfile Filesystem type is: 58465342 File size of testfile is 65536 (16 blocks, blocksize 4096) ext logical physical expected length flags 0 1 12 1 1 3 14 12 1 2 5 16 14 1 3 7 18 16 1 4 9 20 18 1 5 11 22 20 1 6 13 24 22 1 7 15 26 24 1 eof testfile: 9 extents found # FWIW, filefrag seems busted - the file has 8 extents, not 9. For a more fragmented sparse file (25,000 extents): # for i in `seq 1 2 50000`; do dd if=/dev/zero of=testfile bs=4k count=1 seek=$i; done ..... # xfs_bmap -vp testfile | grep -v hole | wc -l 25002 # filefrag -v testfile |tail -1 testfile: 25001 extents found So taking away the 2 header lines from xfs_bmap output we have 25000 extents, and filefrag has over-counted by one again. However, we are we are definitely finding all the extents through fiemap... > The reason is that in xfs_getbmap we will > calculate holes and set it in 'out', while out is malloced by > bmv_count(fi_extent_max+1) which didn't consider holes. So in the > worst case, if 'out' vector looks like > [hole, extent, hole, extent, hole, ... hole, extent, hole], > we will only return half of fi_extent_max extents. Right, it's not broken, we simply return less than fi_extent_mex extents when there are holes. I don't see that as a problem as applications have to handle that case anyway, and.... > So in xfs_vn_fiemap, we should consider this worst case. If the > user wants fi_extent_max extents, we need a 'out' with size of > 2 *fi_extent_max + 2(one more the header). That's rather dangerous, I think. It relies on other code to catch the buffer overrun that this sets up for fragmented, non-sparse files. Personally I'd much prefer to return fewer extents for sparse files than to add a landmine like this into the kernel code.... Cheers, Dave. -- Dave Chinner david(a)fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Tao Ma on 14 Jun 2010 02:00 On 06/14/2010 08:27 AM, Dave Chinner wrote: > On Sat, Jun 12, 2010 at 10:08:15AM +0800, Tao Ma wrote: >> In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want >> to return fi_extent_max extents, but actually it won't work for >> a sparse file. > > Define "won't work". i.e. what's the test case? I just created a > sparse file and checked it, and it reported all the extents in it: > > # xfs_bmap -vp testfile > testfile: > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS > 0: [0..7]: hole 8 > 1: [8..15]: 96..103 0 (96..103) 8 00000 > 2: [16..23]: hole 8 > 3: [24..31]: 112..119 0 (112..119) 8 00000 > 4: [32..39]: hole 8 > 5: [40..47]: 128..135 0 (128..135) 8 00000 > 6: [48..55]: hole 8 > 7: [56..63]: 144..151 0 (144..151) 8 00000 > 8: [64..71]: hole 8 > 9: [72..79]: 160..167 0 (160..167) 8 00000 > 10: [80..87]: hole 8 > 11: [88..95]: 176..183 0 (176..183) 8 00000 > 12: [96..103]: hole 8 > 13: [104..111]: 192..199 0 (192..199) 8 00000 > 14: [112..119]: hole 8 > 15: [120..127]: 208..215 0 (208..215) 8 00000 ok, so let me explain it. In commit 2d1ff3c75a4642062d314634290be6d8da4ffb03, I add the mode for extent query of fiemap for xfs. So with your test file, it will return that we have 8 extents(because in xfs_fiemap_format we don't return holes). So normally and naturally, a user begin to iterate all the extents by doing fiemap = malloc(sizeof(fiemap) + 8 * sizeof(struct fiemap_extent)); fiemap->fm_extent_count = 8 But what will happen? He will only get 4 extent. So do you think it is acceptable for a user? We told him that we have 8 extents, he has allocated enough space, but he can't get what he wanted. And he need to fiemap = malloc(sizeof(fiemap) + 16 * sizeof(struct fiemap_extent)); fiemap->fm_extent_count = 16 to get 8 extent for your test file. > # filefrag -v testfile > Filesystem type is: 58465342 > File size of testfile is 65536 (16 blocks, blocksize 4096) > ext logical physical expected length flags > 0 1 12 1 > 1 3 14 12 1 > 2 5 16 14 1 > 3 7 18 16 1 > 4 9 20 18 1 > 5 11 22 20 1 > 6 13 24 22 1 > 7 15 26 24 1 eof > testfile: 9 extents found > # > > FWIW, filefrag seems busted - the file has 8 extents, not 9. yeah, filefrag is really broken. > >> The reason is that in xfs_getbmap we will >> calculate holes and set it in 'out', while out is malloced by >> bmv_count(fi_extent_max+1) which didn't consider holes. So in the >> worst case, if 'out' vector looks like >> [hole, extent, hole, extent, hole, ... hole, extent, hole], >> we will only return half of fi_extent_max extents. > > Right, it's not broken, we simply return less than fi_extent_mex > extents when there are holes. I don't see that as a problem as > applications have to handle that case anyway, and.... see my above test case. I guess we really don't want a userspace user to allocate num_extents * 2 + 1 fiemap_extent to get them. > >> So in xfs_vn_fiemap, we should consider this worst case. If the >> user wants fi_extent_max extents, we need a 'out' with size of >> 2 *fi_extent_max + 2(one more the header). > > That's rather dangerous, I think. It relies on other code to catch > the buffer overrun that this sets up for fragmented, non-sparse > files. Personally I'd much prefer to return fewer extents for sparse > files than to add a landmine like this into the kernel code.... We just change the size of our 'out', we don't change fi_extent_max or anything related to the fiemap. So I think what we care is how to keep our 'out' in good shape and fiemap should handle and check their fi_extent_max if we pass it more extents. btw, maybe there is a better solution for the problem I described above. If there is a good one, I am happy to accept it. Regards, Tao -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Next
|
Last
Pages: 1 2 3 Prev: [PATCH 08/13] ARM: add PrimeCell generic DMA to PL011 v8 Next: ib/ehca: bitmask handling for lock_hcalls |