Prev: [PATCH 0/5] blkdev: discard optimizations v2 RESEND
Next: [PATCH 4/5] powernow-k8: Fix frequency reporting
From: Boaz Harrosh on 24 Mar 2010 13:40

On 03/24/2010 07:15 PM, Boaz Harrosh wrote:
> On 03/24/2010 06:39 PM, Al Viro wrote:
>> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
>>> On 03/24/2010 06:07 PM, Al Viro wrote:
>>>> On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
>>>>>> Bloody impressive... Does that happen to underlying fs or to what you
>>>>>> are seeing via NFS?
>>>>>
>>>>> Only via NFS. All local access is fine.
>>>>>
>>>>> After the corruption above I can cd to the local mount cp a fresh copy
>>>>> of .git/index file and play around just fine.
>>>>> Once I return to the NFS mounted directory, a git status will do it.
>>>>> It does not matter if caches are cold (Takes a long time) or hot it happens
>>>>> every time.
>>>>>
>>>>> Weird I know, I'm playing some more with it as we speak
>>>>
>>>> What happens if you export to box running older kernel *or* from box
>>>> running older kernel? IOW, is that nfsd or nfs client getting unhappy?
>>>> I'd suspect the latter, but...
>>>
>>>
>>> Good question, I'm just getting to that because currently it's all
>>> over localhost (same kernel, BTW inside a UML)
>>>
>>> I will try what you said. Please throw any other tests at me, if needed.
>>
>
> As you suspected old-server+new-client fails. any-thing+old-client is
> fine. (two separate machines this time)
>
>> Very interesting... Just to see which path we are hitting: add
>> 	if (IS_ERR(nd->intent.open.file))
>> 		printk("foo: %s", pathname);
>> right after
>> 	error = do_lookup(nd, &nd->last, path);
>> 	if (error)
>> 		goto exit;
>> in fs/namei.c:do_last() and see whether we are hitting it or not on objects
>> that get corrupted.
>
> Sorry was busy shifting setups, didn't see your mail, will do that next ...
>
> Thanks
> Boaz

Below is what I changed. (I hope it's what you meant)
It does not get hit, just that git corruption as before but I don't see the prints.

I'll try running with nfs dbg-prints on to see what it does around the time git complains

Boaz
---
diff --git a/fs/namei.c b/fs/namei.c
index 1c0fca6..d1c96f0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1650,6 +1650,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	error = do_lookup(nd, &nd->last, path);
 	if (error)
 		goto exit;
+
+	if (IS_ERR(nd->intent.open.file)) {
+		printk(KERN_ERR "foo: %s", pathname);
+		WARN_ON(1);
+	}
+
 	error = -ENOENT;
 	if (!path->dentry->d_inode)
 		goto exit_dput;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
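For readers following the debugging step above: the added check relies on the kernel convention of storing a small negative errno value in a pointer. The following is a minimal user-space sketch of that convention, modelled on include/linux/err.h but not copied from it. `fake_open_intent()` is a hypothetical stand-in for whatever fills in `nd->intent.open.file`; it is not a kernel function.

```c
#include <errno.h>
#include <stdint.h>

/* User-space imitation of the kernel's error-pointer convention.
 * Errno values occupy the top 4095 addresses, which are never valid
 * pointers to real objects. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr falls in the reserved error range. NULL is not an error. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a lookup step that may leave an error
 * pointer where a valid file pointer is expected. */
static void *fake_open_intent(int fail)
{
    static int dummy_file;
    return fail ? ERR_PTR(-ENOENT) : (void *)&dummy_file;
}
```

With these helpers, `IS_ERR(fake_open_intent(1))` is true and `PTR_ERR()` on the same pointer yields `-ENOENT`, while a valid pointer or NULL is not treated as an error. That is why the absence of the "foo:" print suggests the open intent did not carry an error at that point.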
From: Boaz Harrosh on 24 Mar 2010 13:50

On 03/24/2010 07:32 PM, Boaz Harrosh wrote:
> On 03/24/2010 07:15 PM, Boaz Harrosh wrote:
>> On 03/24/2010 06:39 PM, Al Viro wrote:
>>> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
>>>> On 03/24/2010 06:07 PM, Al Viro wrote:
>>>>> On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
>>>>>>> Bloody impressive... Does that happen to underlying fs or to what you
>>>>>>> are seeing via NFS?
>>>>>>
>>>>>> Only via NFS. All local access is fine.
>>>>>>
>>>>>> After the corruption above I can cd to the local mount cp a fresh copy
>>>>>> of .git/index file and play around just fine.
>>>>>> Once I return to the NFS mounted directory, a git status will do it.
>>>>>> It does not matter if caches are cold (Takes a long time) or hot it happens
>>>>>> every time.
>>>>>>
>>>>>> Weird I know, I'm playing some more with it as we speak
>>>>>
>>>>> What happens if you export to box running older kernel *or* from box
>>>>> running older kernel? IOW, is that nfsd or nfs client getting unhappy?
>>>>> I'd suspect the latter, but...
>>>>
>>>>
>>>> Good question, I'm just getting to that because currently it's all
>>>> over localhost (same kernel, BTW inside a UML)
>>>>
>>>> I will try what you said. Please throw any other tests at me, if needed.
>>>
>>
>> As you suspected old-server+new-client fails. any-thing+old-client is
>> fine. (two separate machines this time)
>>
>>> Very interesting... Just to see which path we are hitting: add
>>> 	if (IS_ERR(nd->intent.open.file))
>>> 		printk("foo: %s", pathname);
>>> right after
>>> 	error = do_lookup(nd, &nd->last, path);
>>> 	if (error)
>>> 		goto exit;
>>> in fs/namei.c:do_last() and see whether we are hitting it or not on objects
>>> that get corrupted.
>>
>> Sorry was busy shifting setups, didn't see your mail, will do that next ...
>>
>> Thanks
>> Boaz
>
>
> Below is what I changed. (I hope it's what you meant)
> It does not get hit, just that git corruption as before but I don't see the prints.
> I'll try running with nfs dbg-prints on to see what it does around the time git complains
>
> Boaz
>

Attached is an output of when I:
$ echo $((0x7fff)) > /proc/sys/sunrpc/nfs_debug
and then run git status. (On a new client)

We can see the complaints after things got broken, but what broke it
is hard for me to see.

(If the file is too big I'll put it on the web somewhere, see if it arrives)

Boaz

> ---
> diff --git a/fs/namei.c b/fs/namei.c
> index 1c0fca6..d1c96f0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1650,6 +1650,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
>  	error = do_lookup(nd, &nd->last, path);
>  	if (error)
>  		goto exit;
> +
> +	if (IS_ERR(nd->intent.open.file)) {
> +		printk(KERN_ERR "foo: %s", pathname);
> +		WARN_ON(1);
> +	}
> +
>  	error = -ENOENT;
>  	if (!path->dentry->d_inode)
>  		goto exit_dput;
>
>
> _______________________________________________
> pNFS mailing list
> pNFS(a)linux-nfs.org
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
From: Boaz Harrosh on 24 Mar 2010 14:00

On 03/24/2010 07:47 PM, Boaz Harrosh wrote:
>>> On 03/24/2010 06:39 PM, Al Viro wrote:
>>>> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
>>>>> On 03/24/2010 06:07 PM, Al Viro wrote:
>>>>>>>> Bloody impressive... Does that happen to underlying fs or to what you
>>>>>>>> are seeing via NFS?
>>>>>>>
>>>>>>> Only via NFS. All local access is fine.
>>>>>>>
<snip>

Al hi

Would you like to attempt a revert of this patch (or group of patches)
Just to get rid of the thought that git bisect was just picking the
wrong guy. Maybe it's just something else? Can you understand the
relevance of all this?

(I'll try other setups as well but tomorrow, it's getting late out here)

Thanks for your help
Boaz
From: Al Viro on 24 Mar 2010 14:10

On Wed, Mar 24, 2010 at 07:58:00PM +0200, Boaz Harrosh wrote:
> Al hi
>
> Would you like to attempt a revert of this patch (or group of patches)
> Just to get rid of the thought that git bisect was just picking the
> wrong guy. Maybe it's just something else? Can you understand the
> relevance of all this?

If you see breakage at that commit and do not see it on its parent, we do
have the right guy... As for reverting, try reverting
781b16775ba0bb55fac0e1757bf0bd87c8879632 first, then this commit.

How consistent are the effects you are seeing from test to test on the
same kernel? This one was very interesting, since it seemed to fail with
-EISDIR while opening .git/objects/pack. Which is a directory and which
should fail with -EISDIR if and only if we pass O_CREAT to open(). And
passing O_CREAT on that one is probably not an intended behaviour of git...

Does anybody else see NFS breakage starting at that commit, BTW? Other
testcases would be useful...
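The -EISDIR behaviour Al describes can be checked from user space on a local filesystem. The sketch below assumes Linux semantics for open(2): a read-only open of a directory succeeds, while adding O_CREAT to the same call fails with EISDIR. `try_open()` is a hypothetical helper written for this illustration; it is not taken from git or the kernel.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` with the given flags and close it again.
 * Returns 0 on success, or the errno value reported by open(). */
static int try_open(const char *path, int flags)
{
    int fd = open(path, flags, 0644);   /* mode is only used with O_CREAT */
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

On a healthy mount, `try_open(".git/objects/pack", O_RDONLY)` should return 0, and `try_open(".git/objects/pack", O_RDONLY | O_CREAT)` should return EISDIR. Seeing EISDIR from a plain read-only open over NFS, as in the report, would therefore point at the open path in the client, not at git.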
From: Trond Myklebust on 24 Mar 2010 14:10

On Wed, 2010-03-24 at 19:47 +0200, Boaz Harrosh wrote:
> On 03/24/2010 07:32 PM, Boaz Harrosh wrote:
> > On 03/24/2010 07:15 PM, Boaz Harrosh wrote:
> >> On 03/24/2010 06:39 PM, Al Viro wrote:
> >>> On Wed, Mar 24, 2010 at 06:10:52PM +0200, Boaz Harrosh wrote:
> >>>> On 03/24/2010 06:07 PM, Al Viro wrote:
> >>>>> On Wed, Mar 24, 2010 at 06:04:56PM +0200, Boaz Harrosh wrote:
> >>>>>>> Bloody impressive... Does that happen to underlying fs or to what you
> >>>>>>> are seeing via NFS?
> >>>>>>
> >>>>>> Only via NFS. All local access is fine.
> >>>>>>
> >>>>>> After the corruption above I can cd to the local mount cp a fresh copy
> >>>>>> of .git/index file and play around just fine.
> >>>>>> Once I return to the NFS mounted directory, a git status will do it.
> >>>>>> It does not matter if caches are cold (Takes a long time) or hot it happens
> >>>>>> every time.
> >>>>>>
> >>>>>> Weird I know, I'm playing some more with it as we speak
> >>>>>
> >>>>> What happens if you export to box running older kernel *or* from box
> >>>>> running older kernel? IOW, is that nfsd or nfs client getting unhappy?
> >>>>> I'd suspect the latter, but...
> >>>>
> >>>>
> >>>> Good question, I'm just getting to that because currently it's all
> >>>> over localhost (same kernel, BTW inside a UML)
> >>>>
> >>>> I will try what you said. Please throw any other tests at me, if needed.
> >>>
> >>
> >> As you suspected old-server+new-client fails. any-thing+old-client is
> >> fine. (two separate machines this time)
> >>
> >>> Very interesting... Just to see which path we are hitting: add
> >>> 	if (IS_ERR(nd->intent.open.file))
> >>> 		printk("foo: %s", pathname);
> >>> right after
> >>> 	error = do_lookup(nd, &nd->last, path);
> >>> 	if (error)
> >>> 		goto exit;
> >>> in fs/namei.c:do_last() and see whether we are hitting it or not on objects
> >>> that get corrupted.
> >>
> >> Sorry was busy shifting setups, didn't see your mail, will do that next ...
> >>
> >> Thanks
> >> Boaz
> >
> >
> > Below is what I changed. (I hope it's what you meant)
> > It does not get hit, just that git corruption as before but I don't see the prints.
> > I'll try running with nfs dbg-prints on to see what it does around the time git complains
> >
> > Boaz
> >
>
> Attached is an output of when I:
> $ echo $((0x7fff)) > /proc/sys/sunrpc/nfs_debug
> and then run git status. (On a new client)
>
> We can see the complaints after things got broken, but what broke it
> is hard for me to see.
>
> (If the file is too big I'll put it on the web somewhere, see if it arrives)
>
> Boaz

Something weird is going on in your trace:

NFS: open file(5b/46ff70a61cf4e159a0339df0e02113bf35f805)
NFS: permission(0:12/323044), mask=0x24, res=0
NFS: revalidating (0:12/323044)
--> nfs4_setup_sequence clp 00000000791f3000 session (null) sr_slotid 128
<-- nfs4_setup_sequence status=0
encode_compound: tag=
decode_attr_type: type=00
decode_attr_change: change attribute=10077553255782547456
decode_attr_size: file size=921
decode_attr_fsid: fsid=(0x0/0x0)
decode_attr_fileid: fileid=0
decode_attr_fs_locations: fs_locations done, error = 0
decode_attr_mode: file mode=00
decode_attr_nlink: nlink=1
decode_attr_owner: uid=-2
decode_attr_group: gid=-2
decode_attr_rdev: rdev=(0x0:0x0)
decode_attr_space_used: space used=0
decode_attr_time_access: atime=0
decode_attr_time_metadata: ctime=1269422731
decode_attr_time_modify: mtime=1269422731
decode_attr_mounted_on_fileid: fileid=0
decode_getfattr: xdr returned 0

A file type of '0' in the above trace is just wrong, and probably
indicates that the server didn't even return that attribute. I'd say you
have a corruption issue either on the server side or on your client.

Trond
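To make Trond's point about `type=00` concrete: the NFSv4 protocol numbers its file types starting at 1, so 0 is not a value a server can legitimately send for the type attribute. The sketch below lists the `nfs_ftype4` values as defined in the NFSv4 specification (RFC 3530) and adds a hypothetical range check, `nfs4_type_is_valid()`. It illustrates the reasoning only and is not the kernel's decoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* File types defined by the NFSv4 protocol (nfs_ftype4). */
enum nfs_ftype4 {
    NF4REG       = 1,   /* regular file */
    NF4DIR       = 2,   /* directory */
    NF4BLK       = 3,   /* block device */
    NF4CHR       = 4,   /* character device */
    NF4LNK       = 5,   /* symbolic link */
    NF4SOCK      = 6,   /* socket */
    NF4FIFO      = 7,   /* named pipe */
    NF4ATTRDIR   = 8,   /* named attribute directory */
    NF4NAMEDATTR = 9    /* named attribute */
};

/* Hypothetical helper: true if `type` is one of the protocol's file types.
 * A decoded value of 0 fails this check, which matches the reading that
 * the attribute was never filled in from the reply. */
static bool nfs4_type_is_valid(uint32_t type)
{
    return type >= NF4REG && type <= NF4NAMEDATTR;
}
```

Applied to the trace above, `nfs4_type_is_valid(0)` is false, while a regular file such as the 921-byte git object would be expected to decode as `NF4REG`.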