illumos-joyent icon indicating copy to clipboard operation
illumos-joyent copied to clipboard

panic in fop_getattr()

Open tmwkh opened this issue 9 years ago • 12 comments

The following will leak anonymous memory causing the machine to eventually become unresponsive:

include <stdio.h>

include <sys/inotify.h>

int main() { while (close(inotify_init()) == 0) ; }

while this test-code is running, "vmstat 1" shows a steady loss of free memory. Even after terminating the test, we see no recovery of free memory.

My colleague Youzhong noted that the vmem_create() on line 1081 of https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/common/io/inotify.c needs a complementary vmem_destroy() in the inotify_detach() routine.

We’re in the process of testing and verifying this.

Thanks, -Ken & Youzhong

tmwkh avatar Jul 30 '15 20:07 tmwkh

I think you meant to say that it needs a corresponding destroy in inotify_close, right? Otherwise, yes, this looks like the right analysis to me. Thanks for bringing this to our attention. If you end up with something you've tested, I'll be happy to integrate it, otherwise I'm happy to do the rest of the work on this.

rmustacc avatar Jul 30 '15 20:07 rmustacc

Hi Robert: Yes – sorry, inotify_close() – actually, if you could take care of this, that would be very much appreciated. We ran into this issue while trying to reproduce a crash involving a NULL pointer fop_getattr() – so we can get back to chasing that! Thank you VERY MUCH! -Ken (& Youzhong)

From: Robert Mustacchi [mailto:[email protected]] Sent: Thursday, July 30, 2015 4:37 PM To: joyent/illumos-joyent Cc: Ken Harris Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

I think you meant to say that it needs a corresponding destroy in inotify_close, right? Otherwise, yes, this looks like the right analysis to me. Thanks for bringing this to our attention. If you end up with something you've tested, I'll be happy to integrate it, otherwise I'm happy to do the rest of the work on this.

— Reply to this email directly or view it on GitHubhttps://github.com/joyent/illumos-joyent/issues/65#issuecomment-126477045.

tmwkh avatar Jul 30 '15 20:07 tmwkh

On 7/30/15 13:47 , tmwkh wrote:

Hi Robert: Yes – sorry, inotify_close() – actually, if you could take care of this, that would be very much appreciated. We ran into this issue while trying to reproduce a crash involving a NULL pointer fop_getattr() – so we can get back to chasing that! Thank you VERY MUCH! -Ken (& Youzhong)

Okay, I'll take care off this one then and if you have any other details regarding the other issue that you're seeing, please let us know and we'll help out as we can.

Robert

rmustacc avatar Jul 30 '15 20:07 rmustacc

Ken and Youzhong,

As with the other issues you've found in the past, thank you so much for finding this! I'll get this fixed, verified and pushed today. Meanwhile, like Robert, I am (naturally) very curious about the fop_getattr() panic; do you have any details on that? (Or, better yet, a dump?)

      - Bryan

On Thu, Jul 30, 2015 at 1:47 PM, tmwkh [email protected] wrote:

Hi Robert: Yes – sorry, inotify_close() – actually, if you could take care of this, that would be very much appreciated. We ran into this issue while trying to reproduce a crash involving a NULL pointer fop_getattr() – so we can get back to chasing that! Thank you VERY MUCH! -Ken (& Youzhong)

From: Robert Mustacchi [mailto:[email protected]] Sent: Thursday, July 30, 2015 4:37 PM To: joyent/illumos-joyent Cc: Ken Harris Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

I think you meant to say that it needs a corresponding destroy in inotify_close, right? Otherwise, yes, this looks like the right analysis to me. Thanks for bringing this to our attention. If you end up with something you've tested, I'll be happy to integrate it, otherwise I'm happy to do the rest of the work on this.

— Reply to this email directly or view it on GitHub< https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126477045>.

— Reply to this email directly or view it on GitHub https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126481246 .

bcantrill avatar Jul 30 '15 22:07 bcantrill

Hi Bryan and Robert:

  1.   THANK YOU for all the help you’ve given us with these issues!
    
  2.   Attached, is a text file Youzhong put together capturing the crash details and relevant disassembly. As you can see, RAX was ZERO when it was referenced as a pointer. I’ve added the fem_list and fem_nodes memory observations at the bottom. We can’t explain how/why RAX was zero at the moment of the crash. This has only happened once on one server that’s been up for about 30 days…so it’s not currently hurting us but we wanted to get to the bottom of it _before_ it gets a chance to become a pattern of pain.
    

Again, THANK you for all your help! -Ken and Youzhong

From: Bryan Cantrill [mailto:[email protected]] Sent: Thursday, July 30, 2015 6:29 PM To: joyent/illumos-joyent [email protected] Cc: Ken Harris [email protected] Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

Ken and Youzhong,

As with the other issues you've found in the past, thank you so much for finding this! I'll get this fixed, verified and pushed today. Meanwhile, like Robert, I am (naturally) very curious about the fop_getattr() panic; do you have any details on that? (Or, better yet, a dump?)

  • Bryan

On Thu, Jul 30, 2015 at 1:47 PM, tmwkh <[email protected]mailto:[email protected]> wrote:

Hi Robert: Yes – sorry, inotify_close() – actually, if you could take care of this, that would be very much appreciated. We ran into this issue while trying to reproduce a crash involving a NULL pointer fop_getattr() – so we can get back to chasing that! Thank you VERY MUCH! -Ken (& Youzhong)

From: Robert Mustacchi [mailto:[email protected]] Sent: Thursday, July 30, 2015 4:37 PM To: joyent/illumos-joyent Cc: Ken Harris Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

I think you meant to say that it needs a corresponding destroy in inotify_close, right? Otherwise, yes, this looks like the right analysis to me. Thanks for bringing this to our attention. If you end up with something you've tested, I'll be happy to integrate it, otherwise I'm happy to do the rest of the work on this.

— Reply to this email directly or view it on GitHub< https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126477045>.

— Reply to this email directly or view it on GitHub https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126481246 .

— Reply to this email directly or view it on GitHubhttps://github.com/joyent/illumos-joyent/issues/65#issuecomment-126509549.

Jul 28 06:45:53 batfs0552 svc.startd[8]: [ID 694882 daemon.notice] instance svc:/site/rsyncjobs:default exited with status 1 Jul 28 06:46:15 batfs0552 unix: [ID 836849 kern.notice] Jul 28 06:46:15 batfs0552 ^Mpanic[cpu17]/thread=ffffff4dd18507a0: Jul 28 06:46:15 batfs0552 genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff01ee5f25e0 addr=38 occurred in module "genunix" due to a NULL pointer dereference Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 unix: [ID 839527 kern.notice] nfsd: Jul 28 06:46:15 batfs0552 unix: [ID 753105 kern.notice] #pf Page fault Jul 28 06:46:15 batfs0552 unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x38 Jul 28 06:46:15 batfs0552 unix: [ID 243837 kern.notice] pid=4104, pc=0xfffffffffba00c87, sp=0xffffff01ee5f26d0, eflags=0x10246 Jul 28 06:46:15 batfs0552 unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 426f8<osxsav,vmxe,xmme,fxsr,pge,mce,pae,pse,de> Jul 28 06:46:15 batfs0552 unix: [ID 624947 kern.notice] cr2: 38 Jul 28 06:46:15 batfs0552 unix: [ID 625075 kern.notice] cr3: 1ff4316000 Jul 28 06:46:15 batfs0552 unix: [ID 625715 kern.notice] cr8: 0 Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rdi: ffffff4440936c40 rsi: 1 rdx: b0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rcx: ffffff484eab7010 r8: 0 r9: ffffff42e110f380 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rax: 0 rbx: ffffff4440936c40 rbp: ffffff01ee5f2750 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] r10: 2a0 r11: ffffffffc00b00f0 r12: ffffff55723a7cc0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] r13: ffffff01ee5f2810 r14: 0 r15: ffffff484eab7010 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] fsb: fffffd7ffe182a40 gsb: ffffff42eb7d4a80 ds: 4b Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffffba00c87 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] cs: 30 rfl: 10246 rsp: ffffff01ee5f26d0 Jul 28 06:46:15 batfs0552 unix: [ID 266532 kern.notice] ss: 38 Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f24c0 unix:die+df () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f25d0 unix:trap+db3 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f25e0 unix:cmntrap+e6 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2750 genunix:vhead_getattr+f7 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f27d0 genunix:fop_getattr+a8 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2800 nfssrv:rfs4_delegated_getattr+20 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2900 nfssrv:rfs3_getattr+85 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2c20 nfssrv:common_dispatch+539 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2c40 nfssrv:rfs_dispatch+2d () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2d20 rpcmod:svc_getreq+1c1 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2d90 rpcmod:svc_run+146 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2dd0 rpcmod:svc_do_run+8e () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2ec0 nfs:nfssys+f1 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2f10 unix:brand_sys_sysenter+1c9 () Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 genunix: [ID 672855 kern.notice] syncing file systems... Jul 28 06:46:17 batfs0552 genunix: [ID 733762 kern.notice] 1 Jul 28 06:46:15 batfs0552 genunix: [ID 904073 kern.notice] done Jul 28 06:46:16 batfs0552 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/zones/dump, offset 65536, content: kernel Jul 28 07:11:41 batfs0552 genunix: [ID 100000 kern.notice] Jul 28 07:11:41 batfs0552 genunix: [ID 665016 kern.notice] ^M100% done: 26742052 pages dumped,

struct fem_node { void fn_available; union { fem_t *fem; vnodeops_t *vnode; fsem_t *fsem; vfsops_t *vfs; void *anon; / anonymous, for updates _/ } fn_op; void (_fn_av_hold)(void ); / Hold for "fn_available" _/ void (_fn_av_rele)(void ); / Release for "fn_available" */ };

struct fem_arg { union { vnode_t _vp, *_vpp; vfs_t *vfsp; void *anon; } fa_vnode; struct fem_node *fa_fnode; };

struct fem_list { uint_t feml_refc; /* reference count / int feml_tos; / top of stack pointer(index) / int feml_ssize; / stack size / int feml_pad; / alignment / struct fem_node feml_nodes[1]; / variable bounds */ };

static int vhead_getattr(vnode_t _vp, vattr_t *vap, int flags, cred_t *cr, caller_context_t *ct) { femarg_t farg; struct fem_list *femsp; int (_func)(); void *arg0; int errc;

if ((femsp = fem_lock(vp->v_femhead)) == NULL) {
    func = (int (*)()) (vp->v_op->vop_getattr);
    arg0 = vp;
    fem_unlock(vp->v_femhead);
    errc = (*func)(arg0, vap, flags, cr, ct);
} else {
    fem_addref(femsp);
    fem_unlock(vp->v_femhead);

    farg.fa_vnode.vp = vp;
    farg.fa_fnode = femsp->feml_nodes + femsp->feml_tos;

    for (;;) { 
        if ((&farg)->fa_fnode->fn_available == NULL) { 
            *(&func) = (int (*)())((&farg)->fa_fnode->fn_op.vnode->vop_getattr); 
            *(&arg0) = (void *)(&farg)->fa_vnode.vp; 
            break;  
        } else if ((*(&func) = (int (*)())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) { 
            *(&arg0) = (void *) (&farg); 
            break;  
        } else { 
            (&farg)->fa_fnode--; 
        } 
    } 

    //vsop_find(&farg, &func, int, &arg0, vop_getattr, femop_getattr);

    errc = (*func)(arg0, vap, flags, cr, ct);
    fem_release(femsp);
}
return (errc);

}

vhead_getattr: pushq %rbp vhead_getattr+1: movq %rsp,%rbp vhead_getattr+4: addq $-0x80,%rsp <0xffffffffffffff80> vhead_getattr+8: movq %rbx,-0x58(%rbp) save %rbx vhead_getattr+0xc: movq %r12,-0x50(%rbp) save %r12 vhead_getattr+0x10: movq %rdi,%rbx vp -> %rbx vhead_getattr+0x13: movq %r13,-0x48(%rbp) save %r13 vhead_getattr+0x17: movq %r14,-0x40(%rbp) save %r14 vhead_getattr+0x1b: movq %rsi,%r13 vap -> %r13 vhead_getattr+0x1e: movq %r15,-0x38(%rbp) save %r15 vhead_getattr+0x22: movq %r8,-0x28(%rbp) ct -> stack vhead_getattr+0x26: movl %edx,%r14d flags -> %r14d vhead_getattr+0x29: movq %rcx,-0x20(%rbp) cr -> stack vhead_getattr+0x2d: movq %rdx,-0x18(%rbp) flags -> stack vhead_getattr+0x31: movq %rcx,%r15 cr -> %r15 vhead_getattr+0x34: movq %rsi,-0x10(%rbp) vap -> stack vhead_getattr+0x38: movq %rdi,-0x8(%rbp) vp -> stack vhead_getattr+0x3c: movq 0x78(%rdi),%rdi vp->v_femhead vhead_getattr+0x40: movq %r8,-0x78(%rbp) ct -> stack vhead_getattr+0x44: call -0x919 <fem_lock> vhead_getattr+0x49: testq %rax,%rax vhead_getattr+0x4c: movq %rax,%r12 femsp -> %r12 vhead_getattr+0x4f: movq -0x78(%rbp),%r8 restore %r8 vhead_getattr+0x53: je +0xa7 <vhead_getattr+0x100>

                            fem_addref(femsp);
                            fem_unlock(vp->v_femhead);

vhead_getattr+0x59: movq %rax,%rdi vhead_getattr+0x5c: movq %r8,-0x78(%rbp) ct -> stack vhead_getattr+0x60: call -0x8e5 <fem_addref> vhead_getattr+0x65: movq 0x78(%rbx),%rdi vp->v_femhead vhead_getattr+0x69: call -0x90e <fem_unlock>

    farg.fa_vnode.vp = vp;
    farg.fa_fnode = femsp->feml_nodes + femsp->feml_tos;

    for (;;) { 
        if ((&farg)->fa_fnode->fn_available == NULL) { 
            *(&func) = (int (*)())((&farg)->fa_fnode->fn_op.vnode->vop_getattr); 
            *(&arg0) = (void *)(&farg)->fa_vnode.vp; 
            break;  
        } else if ((*(&func) = (int (*)())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) { 
            *(&arg0) = (void *) (&farg); 
            break;  
        } else { 
            (&farg)->fa_fnode--; 
        } 
    } 

vhead_getattr+0x6e: movq %rbx,-0x70(%rbp) !!! vp -> farg.fa_vnode.vp vhead_getattr+0x72: movslq 0x4(%r12),%rax femsp->feml_tos vhead_getattr+0x77: movq -0x78(%rbp),%r8 vhead_getattr+0x7b: leaq 0x1(%rax,%rax),%rdx femsp->feml_tos * 2 + 1 vhead_getattr+0x80: shlq $0x4,%rdx (femsp->feml_tos * 2 + 1) * 16 vhead_getattr+0x84: leaq (%r12,%rdx),%rax vhead_getattr+0x88: movq %rax,-0x68(%rbp) !!! farg.fa_fnode vhead_getattr+0x8c: cmpq $0x0,(%rdx,%r12) (femsp->feml_tos * 2 + 1) * 16 + femsp: femsp+16 is the offset of feml_nodes vhead_getattr+0x91: jne +0x13 <vhead_getattr+0xa6> vhead_getattr+0x93: jmp +0x5b <vhead_getattr+0xf0>

vhead_getattr+0x95: nopl (%rax)

vhead_getattr+0x98: subq $0x20,%rax (&farg)->fa_fnode--; # size of 'struct fem_node' is 32 = 0x20 vhead_getattr+0x9c: movq %rax,-0x68(%rbp) !!! farg.fa_fnode vhead_getattr+0xa0: cmpq $0x0,(%rax) if ((&farg)->fa_fnode->fn_available == NULL) vhead_getattr+0xa4: je +0x4a <vhead_getattr+0xf0>

vhead_getattr+0xa6: movq 0x8(%rax),%rdx (&farg)->fa_fnode->fn_op.fem vhead_getattr+0xaa: movq 0x40(%rdx),%r9 (&farg)->fa_fnode->fn_op.fem->femop_getattr vhead_getattr+0xae: testq %r9,%r9 if (((&func) = (int ()())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) vhead_getattr+0xb1: je -0x1b <vhead_getattr+0x98> vhead_getattr+0xb3: leaq -0x70(%rbp),%rdi *(&arg0) = (void *) (&farg);

                    errc = (*func)(arg0, vap, flags, cr, ct);
                    fem_release(femsp);

vhead_getattr+0xb7: movq %r15,%rcx cr -> %rcx vhead_getattr+0xba: movl %r14d,%edx flags -> %edx vhead_getattr+0xbd: movq %r13,%rsi vap -> %rsi vhead_getattr+0xc0: xorl %eax,%eax vhead_getattr+0xc2: call _%r9 errc = (_func)(arg0, vap, flags, cr, ct);

vhead_getattr+0xc5: movq %r12,%rdi vhead_getattr+0xc8: movl %eax,-0x78(%rbp) vhead_getattr+0xcb: call -0x8c0 <fem_release> fem_release(femsp); vhead_getattr+0xd0: movl -0x78(%rbp),%eax

vhead_getattr+0xd3: movq -0x58(%rbp),%rbx vhead_getattr+0xd7: movq -0x50(%rbp),%r12 vhead_getattr+0xdb: movq -0x48(%rbp),%r13 vhead_getattr+0xdf: movq -0x40(%rbp),%r14 vhead_getattr+0xe3: movq -0x38(%rbp),%r15 vhead_getattr+0xe7: leave vhead_getattr+0xe8: ret

vhead_getattr+0xe9: nopl 0x0(%rax)

vhead_getattr+0xf0: movq 0x8(%rax),%rax (&farg)->fa_fnode->fn_op.vnode vhead_getattr+0xf4: movq %rbx,%rdi vp -> %rdi vhead_getattr+0xf7: movq 0x38(%rax),%r9 (&farg)->fa_fnode->fn_op.vnode->vop_getattr <------ CRASH vhead_getattr+0xfb: jmp -0x46 <vhead_getattr+0xb7>

vhead_getattr+0xfd: nopl (%rax)

    func = (int (*)()) (vp->v_op->vop_getattr);
    arg0 = vp;
    fem_unlock(vp->v_femhead);
    errc = (*func)(arg0, vap, flags, cr, ct);

vhead_getattr+0x100: movq 0x40(%rbx),%rax vp->v_op -> %rax vhead_getattr+0x104: movq 0x78(%rbx),%rdi vp->v_femhead -> %rdi vhead_getattr+0x108: movq 0x38(%rax),%r12 vp->v_op->vop_getattr -> %r12 vhead_getattr+0x10c: call -0x9b1 <fem_unlock> vhead_getattr+0x111: movq -0x78(%rbp),%r8 ct vhead_getattr+0x115: movq %r15,%rcx cr vhead_getattr+0x118: movl %r14d,%edx flags vhead_getattr+0x11b: movq %r13,%rsi vap vhead_getattr+0x11e: movq %rbx,%rdi vp vhead_getattr+0x121: xorl %eax,%eax vhead_getattr+0x123: call *%r12 vhead_getattr+0x126: jmp -0x55 <vhead_getattr+0xd3>

ffffff01ee5f2750 vhead_getattr+0xf7(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010, 0) ffffff01ee5f27d0 fop_getattr+0xa8(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010, 0) ffffff01ee5f2800 rfs4_delegated_getattr+0x20(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010) ffffff01ee5f2900 rfs3_getattr+0x85(ffffff01ee5f2aa0, ffffff01ee5f2960, ffffff53d9418780, ffffff01ee5f2ca0, ffffff484eab7010) ffffff01ee5f2c20 common_dispatch+0x539(ffffff01ee5f2ca0, ffffff49b3ce0200, 2, 4, fffffffff83101c6, ffffffffc00b1060) ffffff01ee5f2c40 rfs_dispatch+0x2d(ffffff01ee5f2ca0, ffffff49b3ce0200) ffffff01ee5f2d20 svc_getreq+0x1c1(ffffff49b3ce0200, ffffff49a1e61b00) ffffff01ee5f2d90 svc_run+0x146(ffffff4352feac70) ffffff01ee5f2dd0 svc_do_run+0x8e(1) ffffff01ee5f2ec0 nfssys+0xf1(e, fed00fbc) ffffff01ee5f2f10 _sys_sysenter_post_swapgs+0x149()

%rdi, %rsi, %rdx, %rcx, %r8, and %r9 %rax Function return value Two registers are temporaries (%r10 and %r11). Five registers are callee-saved (%r12, %r13, %r14, %r15, and %rbx)


MEMORY OBSERVATIONS:

  1. R12 contains "struct fem_list *femsp" which is ffffff55723a7cc0, so:

ffffff55723a7cc0::print -a -t 'struct fem_list' ffffff55723a7cc0 struct fem_list { ffffff55723a7cc0 uint_t feml_refc = 0x2 ffffff55723a7cc4 int feml_tos = 0x5 ffffff55723a7cc8 int feml_ssize = 0x8 ffffff55723a7ccc int feml_pad = 0xffffff42 ffffff55723a7cd0 struct fem_node [1] feml_nodes = [ ffffff55723a7cd0 struct fem_node { ffffff55723a7cd0 void fn_available = femtype+0x68 ffffff55723a7cd8 union fn_op = { ffffff55723a7cd8 fem_t *fem = 0xffffff42e110f200 ffffff55723a7cd8 vnodeops_t *vnode = 0xffffff42e110f200 ffffff55723a7cd8 fsem_t *fsem = 0xffffff42e110f200 ffffff55723a7cd8 vfsops_t *vfs = 0xffffff42e110f200 ffffff55723a7cd8 void *anon = 0xffffff42e110f200 } ffffff55723a7ce0 void ()() fn_av_hold = 0 ffffff55723a7ce8 void (*)() fn_av_rele = 0 }, ] }

  1. The feml_ssize tells us we have memory allocated for 8 feml_node elements (each with a size of 0x20) and that the current "tos" is 5

ffffff55723a7cd0::print -a -t 'struct fem_node' ffffff55723a7cd0 struct fem_node { ffffff55723a7cd0 void fn_available = femtype+0x68 ffffff55723a7cd8 union fn_op = { ffffff55723a7cd8 fem_t *fem = 0xffffff42e110f200 ffffff55723a7cd8 vnodeops_t *vnode = 0xffffff42e110f200 ffffff55723a7cd8 fsem_t *fsem = 0xffffff42e110f200 ffffff55723a7cd8 vfsops_t *vfs = 0xffffff42e110f200 ffffff55723a7cd8 void *anon = 0xffffff42e110f200 } ffffff55723a7ce0 void ()() fn_av_hold = 0 ffffff55723a7ce8 void ()() fn_av_rele = 0 } ffffff55723a7cd0+20::print -a -t 'struct fem_node' ffffff55723a7cf0 struct fem_node { ffffff55723a7cf0 void *fn_available = 0 ffffff55723a7cf8 union fn_op = { ffffff55723a7cf8 fem_t *fem = 0xffffff42e07cc1c0 ffffff55723a7cf8 vnodeops_t *vnode = 0xffffff42e07cc1c0 ffffff55723a7cf8 fsem_t *fsem = 0xffffff42e07cc1c0 ffffff55723a7cf8 vfsops_t *vfs = 0xffffff42e07cc1c0 ffffff55723a7cf8 void *anon = 0xffffff42e07cc1c0 } ffffff55723a7d00 void ()() fn_av_hold = 0 ffffff55723a7d08 void ()() fn_av_rele = 0 } ffffff55723a7cd0+40::print -a -t 'struct fem_node' ffffff55723a7d10 struct fem_node { ffffff55723a7d10 void *fn_available = 0xffffff440ae9e838 ffffff55723a7d18 union fn_op = { ffffff55723a7d18 fem_t *fem = 0xffffff4310149200 ffffff55723a7d18 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d18 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d18 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d18 void *anon = 0xffffff4310149200 } ffffff55723a7d20 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d28 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+60::print -a -t 'struct fem_node' ffffff55723a7d30 struct fem_node { ffffff55723a7d30 void *fn_available = 0xffffff43a4024e70 ffffff55723a7d38 union fn_op = { ffffff55723a7d38 fem_t *fem = 0xffffff4310149200 ffffff55723a7d38 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d38 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d38 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d38 void *anon = 0xffffff4310149200 } ffffff55723a7d40 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d48 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+80::print -a -t 'struct fem_node' ffffff55723a7d50 struct fem_node { ffffff55723a7d50 void *fn_available = 0xffffff43ab895f08 ffffff55723a7d58 union fn_op = { ffffff55723a7d58 fem_t *fem = 0xffffff4310149200 ffffff55723a7d58 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d58 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d58 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d58 void *anon = 0xffffff4310149200 } ffffff55723a7d60 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d68 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+A0::print -a -t 'struct fem_node' ffffff55723a7d70 struct fem_node { ffffff55723a7d70 void *fn_available = 0xffffffd53784b008 ffffff55723a7d78 union fn_op = { ffffff55723a7d78 fem_t *fem = 0xffffff4310149200 ffffff55723a7d78 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d78 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d78 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d78 void *anon = 0xffffff4310149200 } ffffff55723a7d80 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d88 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+C0::print -a -t 'struct fem_node' ffffff55723a7d90 struct fem_node { ffffff55723a7d90 void *fn_available = 0 ffffff55723a7d98 union fn_op = { ffffff55723a7d98 fem_t *fem = 0xffffff43962dae00 ffffff55723a7d98 vnodeops_t *vnode = 0xffffff43962dae00 ffffff55723a7d98 fsem_t *fsem = 0xffffff43962dae00 ffffff55723a7d98 vfsops_t *vfs = 0xffffff43962dae00 ffffff55723a7d98 void *anon = 0xffffff43962dae00 } ffffff55723a7da0 void ()() fn_av_hold = 0 ffffff55723a7da8 void ()() fn_av_rele = 0 } ffffff55723a7cd0+E0::print -a -t 'struct fem_node' ffffff55723a7db0 struct fem_node { ffffff55723a7db0 void *fn_available = 0 ffffff55723a7db8 union fn_op = { ffffff55723a7db8 fem_t *fem = 0 ffffff55723a7db8 vnodeops_t *vnode = 0 ffffff55723a7db8 fsem_t *fsem = 0 ffffff55723a7db8 vfsops_t *vfs = 0 ffffff55723a7db8 void *anon = 0 } ffffff55723a7dc0 void ()() fn_av_hold = 0 ffffff55723a7dc8 void ()() fn_av_rele = 0xb0e53d00002a } ffffff55723a7cd0+100::print -a -t 'struct fem_node' ffffff55723a7dd0 struct fem_node { ffffff55723a7dd0 void *fn_available = 8 ffffff55723a7dd8 union fn_op = { ffffff55723a7dd8 fem_t *fem = 0x1100000000000000 ffffff55723a7dd8 vnodeops_t *vnode = 0x1100000000000000 ffffff55723a7dd8 fsem_t *fsem = 0x1100000000000000 ffffff55723a7dd8 vfsops_t *vfs = 0x1100000000000000 ffffff55723a7dd8 void *anon = 0x1100000000000000 } ffffff55723a7de0 void ()() fn_av_hold = 0x230000010e000001 ffffff55723a7de8 void (*)() fn_av_rele = 0 }

  1. The element 5th element looks fine, now. Possibly, at the exact time of the crash, ttos was 7 - that would lead us to point at ffffff55723a7cd0+E0 which contains an NULL fn_available, and a NULL fn_op->vnode. The other possibility is that at the exact moment of the crash, the fifth element had a NULL fn_available and NULL fn_op->vnode but was overwritten with apparently valid values since the access violation and panic declaration.

We've looked at other running threads (actually pegged on a CPU) and none look like they were in the neighborhood of fem* code at the time...but our lead theory at this point is that somebody, somewhere made changes to the fem_list out from underneath us, so we're looking to try to drive this code with a test program that hammers on list changes under an environment of plenty of getattrs. So, far - no luck.

tmwkh avatar Jul 31 '15 14:07 tmwkh

Interesting analysis. If at all possible, I would love to get access to the dump. Might you be able to upload it to Manta and/or put it somewhere else where I can take a look? With my thanks again...

      - Bryan

On Fri, Jul 31, 2015 at 7:29 AM, tmwkh [email protected] wrote:

Hi Bryan and Robert:

  1. THANK YOU for all the help you’ve given us with these issues!
  2. Attached, is a text file Youzhong put together capturing the crash details and relevant disassembly. As you can see, RAX was ZERO when it was referenced as a pointer. I’ve added the fem_list and fem_nodes memory observations at the bottom. We can’t explain how/why RAX was zero at the moment of the crash. This has only happened once on one server that’s been up for about 30 days…so it’s not currently hurting us but we wanted to get to the bottom of it before it gets a chance to become a pattern of pain.

Again, THANK you for all your help! -Ken and Youzhong

From: Bryan Cantrill [mailto:[email protected]] Sent: Thursday, July 30, 2015 6:29 PM To: joyent/illumos-joyent [email protected] Cc: Ken Harris [email protected] Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

Ken and Youzhong,

As with the other issues you've found in the past, thank you so much for finding this! I'll get this fixed, verified and pushed today. Meanwhile, like Robert, I am (naturally) very curious about the fop_getattr() panic; do you have any details on that? (Or, better yet, a dump?)

  • Bryan

On Thu, Jul 30, 2015 at 1:47 PM, tmwkh <[email protected]<mailto: [email protected]>> wrote:

Hi Robert: Yes – sorry, inotify_close() – actually, if you could take care of this, that would be very much appreciated. We ran into this issue while trying to reproduce a crash involving a NULL pointer fop_getattr() – so we can get back to chasing that! Thank you VERY MUCH! -Ken (& Youzhong)

From: Robert Mustacchi [mailto:[email protected]] Sent: Thursday, July 30, 2015 4:37 PM To: joyent/illumos-joyent Cc: Ken Harris Subject: Re: [illumos-joyent] inotify_init leaks memory (#65)

I think you meant to say that it needs a corresponding destroy in inotify_close, right? Otherwise, yes, this looks like the right analysis to me. Thanks for bringing this to our attention. If you end up with something you've tested, I'll be happy to integrate it, otherwise I'm happy to do the rest of the work on this.

— Reply to this email directly or view it on GitHub<

https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126477045>.

— Reply to this email directly or view it on GitHub < https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126481246>

.

— Reply to this email directly or view it on GitHub< https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126509549>.

Jul 28 06:45:53 batfs0552 svc.startd[8]: [ID 694882 daemon.notice] instance svc:/site/rsyncjobs:default exited with status 1 Jul 28 06:46:15 batfs0552 unix: [ID 836849 kern.notice] Jul 28 06:46:15 batfs0552 ^Mpanic[cpu17]/thread=ffffff4dd18507a0: Jul 28 06:46:15 batfs0552 genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff01ee5f25e0 addr=38 occurred in module "genunix" due to a NULL pointer dereference Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 unix: [ID 839527 kern.notice] nfsd: Jul 28 06:46:15 batfs0552 unix: [ID 753105 kern.notice] #pf Page fault Jul 28 06:46:15 batfs0552 unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x38 Jul 28 06:46:15 batfs0552 unix: [ID 243837 kern.notice] pid=4104, pc=0xfffffffffba00c87, sp=0xffffff01ee5f26d0, eflags=0x10246 Jul 28 06:46:15 batfs0552 unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 426f8<osxsav,vmxe,xmme,fxsr,pge,mce,pae,pse,de> Jul 28 06:46:15 batfs0552 unix: [ID 624947 kern.notice] cr2: 38 Jul 28 06:46:15 batfs0552 unix: [ID 625075 kern.notice] cr3: 1ff4316000 Jul 28 06:46:15 batfs0552 unix: [ID 625715 kern.notice] cr8: 0 Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rdi: ffffff4440936c40 rsi: 1 rdx: b0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rcx: ffffff484eab7010 r8: 0 r9: ffffff42e110f380 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] rax: 0 rbx: ffffff4440936c40 rbp: ffffff01ee5f2750 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] r10: 2a0 r11: ffffffffc00b00f0 r12: ffffff55723a7cc0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] r13: ffffff01ee5f2810 r14: 0 r15: ffffff484eab7010 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] fsb: fffffd7ffe182a40 gsb: ffffff42eb7d4a80 ds: 4b Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 0 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffffba00c87 Jul 28 06:46:15 batfs0552 unix: [ID 592667 kern.notice] cs: 30 rfl: 10246 rsp: ffffff01ee5f26d0 Jul 28 06:46:15 batfs0552 unix: [ID 266532 kern.notice] ss: 38 Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f24c0 unix:die+df () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f25d0 unix:trap+db3 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f25e0 unix:cmntrap+e6 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2750 genunix:vhead_getattr+f7 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f27d0 genunix:fop_getattr+a8 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2800 nfssrv:rfs4_delegated_getattr+20 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2900 nfssrv:rfs3_getattr+85 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2c20 nfssrv:common_dispatch+539 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2c40 nfssrv:rfs_dispatch+2d () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2d20 rpcmod:svc_getreq+1c1 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2d90 rpcmod:svc_run+146 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2dd0 rpcmod:svc_do_run+8e () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2ec0 nfs:nfssys+f1 () Jul 28 06:46:15 batfs0552 genunix: [ID 655072 kern.notice] ffffff01ee5f2f10 unix:brand_sys_sysenter+1c9 () Jul 28 06:46:15 batfs0552 unix: [ID 100000 kern.notice] Jul 28 06:46:15 batfs0552 genunix: [ID 672855 kern.notice] syncing file systems... Jul 28 06:46:17 batfs0552 genunix: [ID 733762 kern.notice] 1 Jul 28 06:46:15 batfs0552 genunix: [ID 904073 kern.notice] done Jul 28 06:46:16 batfs0552 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/zones/dump, offset 65536, content: kernel Jul 28 07:11:41 batfs0552 genunix: [ID 100000 kern.notice] Jul 28 07:11:41 batfs0552 genunix: [ID 665016 kern.notice] ^M100% done: 26742052 pages dumped,

struct fem_node { void fn_available; union { fem_t *fem; vnodeops_t *vnode; fsem_t *fsem; vfsops_t *vfs; void *anon; / anonymous, for updates _/ } fn_op; void (_fn_av_hold)(void ); / Hold for "fn_available" _/ void (_fn_av_rele)(void ); / Release for "fn_available" */ };

struct fem_arg { union { vnode_t _vp, *_vpp; vfs_t *vfsp; void *anon; } fa_vnode; struct fem_node *fa_fnode; };

struct fem_list { uint_t feml_refc; /* reference count / int feml_tos; / top of stack pointer(index) / int feml_ssize; / stack size / int feml_pad; / alignment / struct fem_node feml_nodes[1]; / variable bounds */ };

static int vhead_getattr(vnode_t _vp, vattr_t *vap, int flags, cred_t *cr, caller_context_t *ct) { femarg_t farg; struct fem_list *femsp; int (_func)(); void *arg0; int errc;

if ((femsp = fem_lock(vp->v_femhead)) == NULL) { func = (int (_)()) (vp->v_op->vop_getattr); arg0 = vp; fem_unlock(vp->v_femhead); errc = (_func)(arg0, vap, flags, cr, ct); } else { fem_addref(femsp); fem_unlock(vp->v_femhead);

farg.fa_vnode.vp = vp; farg.fa_fnode = femsp->feml_nodes + femsp->feml_tos;

for (;;) { if ((&farg)->fa_fnode->fn_available == NULL) { (&func) = (int ()())((&farg)->fa_fnode->fn_op.vnode->vop_getattr); (&arg0) = (void *)(&farg)->fa_vnode.vp; break; } else if (((&func) = (int (*)())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) { *(&arg0) = (void *) (&farg); break; } else { (&farg)->fa_fnode--; } }

//vsop_find(&farg, &func, int, &arg0, vop_getattr, femop_getattr);

errc = (*func)(arg0, vap, flags, cr, ct); fem_release(femsp); } return (errc); }

vhead_getattr: pushq %rbp vhead_getattr+1: movq %rsp,%rbp vhead_getattr+4: addq $-0x80,%rsp <0xffffffffffffff80> vhead_getattr+8: movq %rbx,-0x58(%rbp) save %rbx vhead_getattr+0xc: movq %r12,-0x50(%rbp) save %r12 vhead_getattr+0x10: movq %rdi,%rbx vp -> %rbx vhead_getattr+0x13: movq %r13,-0x48(%rbp) save %r13 vhead_getattr+0x17: movq %r14,-0x40(%rbp) save %r14 vhead_getattr+0x1b: movq %rsi,%r13 vap -> %r13 vhead_getattr+0x1e: movq %r15,-0x38(%rbp) save %r15 vhead_getattr+0x22: movq %r8,-0x28(%rbp) ct -> stack vhead_getattr+0x26: movl %edx,%r14d flags -> %r14d vhead_getattr+0x29: movq %rcx,-0x20(%rbp) cr -> stack vhead_getattr+0x2d: movq %rdx,-0x18(%rbp) flags -> stack vhead_getattr+0x31: movq %rcx,%r15 cr -> %r15 vhead_getattr+0x34: movq %rsi,-0x10(%rbp) vap -> stack vhead_getattr+0x38: movq %rdi,-0x8(%rbp) vp -> stack vhead_getattr+0x3c: movq 0x78(%rdi),%rdi vp->v_femhead vhead_getattr+0x40: movq %r8,-0x78(%rbp) ct -> stack vhead_getattr+0x44: call -0x919 <fem_lock> vhead_getattr+0x49: testq %rax,%rax vhead_getattr+0x4c: movq %rax,%r12 femsp -> %r12 vhead_getattr+0x4f: movq -0x78(%rbp),%r8 restore %r8 vhead_getattr+0x53: je +0xa7 <vhead_getattr+0x100>

fem_addref(femsp); fem_unlock(vp->v_femhead);

vhead_getattr+0x59: movq %rax,%rdi vhead_getattr+0x5c: movq %r8,-0x78(%rbp) ct -> stack vhead_getattr+0x60: call -0x8e5 <fem_addref> vhead_getattr+0x65: movq 0x78(%rbx),%rdi vp->v_femhead vhead_getattr+0x69: call -0x90e <fem_unlock>

farg.fa_vnode.vp = vp; farg.fa_fnode = femsp->feml_nodes + femsp->feml_tos;

for (;;) { if ((&farg)->fa_fnode->fn_available == NULL) { (&func) = (int ()())((&farg)->fa_fnode->fn_op.vnode->vop_getattr); (&arg0) = (void *)(&farg)->fa_vnode.vp; break; } else if (((&func) = (int (*)())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) { *(&arg0) = (void *) (&farg); break; } else { (&farg)->fa_fnode--; } }

vhead_getattr+0x6e: movq %rbx,-0x70(%rbp) !!! vp -> farg.fa_vnode.vp vhead_getattr+0x72: movslq 0x4(%r12),%rax femsp->feml_tos vhead_getattr+0x77: movq -0x78(%rbp),%r8 vhead_getattr+0x7b: leaq 0x1(%rax,%rax),%rdx femsp->feml_tos * 2 + 1 vhead_getattr+0x80: shlq $0x4,%rdx (femsp->feml_tos * 2 + 1) * 16 vhead_getattr+0x84: leaq (%r12,%rdx),%rax vhead_getattr+0x88: movq %rax,-0x68(%rbp) !!! farg.fa_fnode vhead_getattr+0x8c: cmpq $0x0,(%rdx,%r12) (femsp->feml_tos * 2 + 1) * 16 + femsp: femsp+16 is the offset of feml_nodes vhead_getattr+0x91: jne +0x13 <vhead_getattr+0xa6> vhead_getattr+0x93: jmp +0x5b <vhead_getattr+0xf0>

vhead_getattr+0x95: nopl (%rax)

vhead_getattr+0x98: subq $0x20,%rax (&farg)->fa_fnode--; # size of 'struct fem_node' is 32 = 0x20 vhead_getattr+0x9c: movq %rax,-0x68(%rbp) !!! farg.fa_fnode vhead_getattr+0xa0: cmpq $0x0,(%rax) if ((&farg)->fa_fnode->fn_available == NULL) vhead_getattr+0xa4: je +0x4a <vhead_getattr+0xf0>

vhead_getattr+0xa6: movq 0x8(%rax),%rdx (&farg)->fa_fnode->fn_op.fem vhead_getattr+0xaa: movq 0x40(%rdx),%r9 (&farg)->fa_fnode->fn_op.fem->femop_getattr vhead_getattr+0xae: testq %r9,%r9 if (((&func) = (int ()())((&farg)->fa_fnode->fn_op.fem->femop_getattr)) != NULL) vhead_getattr+0xb1: je -0x1b <vhead_getattr+0x98> vhead_getattr+0xb3: leaq -0x70(%rbp),%rdi *(&arg0) = (void *) (&farg);

errc = (*func)(arg0, vap, flags, cr, ct); fem_release(femsp);

vhead_getattr+0xb7: movq %r15,%rcx cr -> %rcx vhead_getattr+0xba: movl %r14d,%edx flags -> %edx vhead_getattr+0xbd: movq %r13,%rsi vap -> %rsi vhead_getattr+0xc0: xorl %eax,%eax vhead_getattr+0xc2: call _%r9 errc = (_func)(arg0, vap, flags, cr, ct);

vhead_getattr+0xc5: movq %r12,%rdi vhead_getattr+0xc8: movl %eax,-0x78(%rbp) vhead_getattr+0xcb: call -0x8c0 <fem_release> fem_release(femsp); vhead_getattr+0xd0: movl -0x78(%rbp),%eax

vhead_getattr+0xd3: movq -0x58(%rbp),%rbx vhead_getattr+0xd7: movq -0x50(%rbp),%r12 vhead_getattr+0xdb: movq -0x48(%rbp),%r13 vhead_getattr+0xdf: movq -0x40(%rbp),%r14 vhead_getattr+0xe3: movq -0x38(%rbp),%r15 vhead_getattr+0xe7: leave vhead_getattr+0xe8: ret

vhead_getattr+0xe9: nopl 0x0(%rax)

vhead_getattr+0xf0: movq 0x8(%rax),%rax (&farg)->fa_fnode->fn_op.vnode vhead_getattr+0xf4: movq %rbx,%rdi vp -> %rdi vhead_getattr+0xf7: movq 0x38(%rax),%r9 (&farg)->fa_fnode->fn_op.vnode->vop_getattr <------ CRASH vhead_getattr+0xfb: jmp -0x46 <vhead_getattr+0xb7>

vhead_getattr+0xfd: nopl (%rax)

func = (int (_)()) (vp->v_op->vop_getattr); arg0 = vp; fem_unlock(vp->v_femhead); errc = (_func)(arg0, vap, flags, cr, ct);

vhead_getattr+0x100: movq 0x40(%rbx),%rax vp->v_op -> %rax vhead_getattr+0x104: movq 0x78(%rbx),%rdi vp->v_femhead -> %rdi vhead_getattr+0x108: movq 0x38(%rax),%r12 vp->v_op->vop_getattr -> %r12 vhead_getattr+0x10c: call -0x9b1 <fem_unlock> vhead_getattr+0x111: movq -0x78(%rbp),%r8 ct vhead_getattr+0x115: movq %r15,%rcx cr vhead_getattr+0x118: movl %r14d,%edx flags vhead_getattr+0x11b: movq %r13,%rsi vap vhead_getattr+0x11e: movq %rbx,%rdi vp vhead_getattr+0x121: xorl %eax,%eax vhead_getattr+0x123: call *%r12 vhead_getattr+0x126: jmp -0x55 <vhead_getattr+0xd3>

ffffff01ee5f2750 vhead_getattr+0xf7(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010, 0) ffffff01ee5f27d0 fop_getattr+0xa8(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010, 0) ffffff01ee5f2800 rfs4_delegated_getattr+0x20(ffffff4440936c40, ffffff01ee5f2810, 0, ffffff484eab7010) ffffff01ee5f2900 rfs3_getattr+0x85(ffffff01ee5f2aa0, ffffff01ee5f2960, ffffff53d9418780, ffffff01ee5f2ca0, ffffff484eab7010) ffffff01ee5f2c20 common_dispatch+0x539(ffffff01ee5f2ca0, ffffff49b3ce0200, 2, 4, fffffffff83101c6, ffffffffc00b1060) ffffff01ee5f2c40 rfs_dispatch+0x2d(ffffff01ee5f2ca0, ffffff49b3ce0200) ffffff01ee5f2d20 svc_getreq+0x1c1(ffffff49b3ce0200, ffffff49a1e61b00) ffffff01ee5f2d90 svc_run+0x146(ffffff4352feac70) ffffff01ee5f2dd0 svc_do_run+0x8e(1) ffffff01ee5f2ec0 nfssys+0xf1(e, fed00fbc) ffffff01ee5f2f10 _sys_sysenter_post_swapgs+0x149()

%rdi, %rsi, %rdx, %rcx, %r8, and %r9 %rax Function return value Two registers are temporaries (%r10 and %r11). Five registers are callee-saved (%r12, %r13, %r14, %r15, and %rbx)


MEMORY OBSERVATIONS:

  1. R12 contains "struct fem_list *femsp" which is ffffff55723a7cc0, so:

ffffff55723a7cc0::print -a -t 'struct fem_list' ffffff55723a7cc0 struct fem_list { ffffff55723a7cc0 uint_t feml_refc = 0x2 ffffff55723a7cc4 int feml_tos = 0x5 ffffff55723a7cc8 int feml_ssize = 0x8 ffffff55723a7ccc int feml_pad = 0xffffff42 ffffff55723a7cd0 struct fem_node [1] feml_nodes = [ ffffff55723a7cd0 struct fem_node { ffffff55723a7cd0 void fn_available = femtype+0x68 ffffff55723a7cd8 union fn_op = { ffffff55723a7cd8 fem_t *fem = 0xffffff42e110f200 ffffff55723a7cd8 vnodeops_t *vnode = 0xffffff42e110f200 ffffff55723a7cd8 fsem_t *fsem = 0xffffff42e110f200 ffffff55723a7cd8 vfsops_t *vfs = 0xffffff42e110f200 ffffff55723a7cd8 void *anon = 0xffffff42e110f200 } ffffff55723a7ce0 void ()() fn_av_hold = 0 ffffff55723a7ce8 void (*)() fn_av_rele = 0 }, ] }

  1. The feml_ssize tells us we have memory allocated for 8 feml_node elements (each with a size of 0x20) and that the current "tos" is 5

ffffff55723a7cd0::print -a -t 'struct fem_node' ffffff55723a7cd0 struct fem_node { ffffff55723a7cd0 void fn_available = femtype+0x68 ffffff55723a7cd8 union fn_op = { ffffff55723a7cd8 fem_t *fem = 0xffffff42e110f200 ffffff55723a7cd8 vnodeops_t *vnode = 0xffffff42e110f200 ffffff55723a7cd8 fsem_t *fsem = 0xffffff42e110f200 ffffff55723a7cd8 vfsops_t *vfs = 0xffffff42e110f200 ffffff55723a7cd8 void *anon = 0xffffff42e110f200 } ffffff55723a7ce0 void ()() fn_av_hold = 0 ffffff55723a7ce8 void ()() fn_av_rele = 0 } ffffff55723a7cd0+20::print -a -t 'struct fem_node' ffffff55723a7cf0 struct fem_node { ffffff55723a7cf0 void *fn_available = 0 ffffff55723a7cf8 union fn_op = { ffffff55723a7cf8 fem_t *fem = 0xffffff42e07cc1c0 ffffff55723a7cf8 vnodeops_t *vnode = 0xffffff42e07cc1c0 ffffff55723a7cf8 fsem_t *fsem = 0xffffff42e07cc1c0 ffffff55723a7cf8 vfsops_t *vfs = 0xffffff42e07cc1c0 ffffff55723a7cf8 void *anon = 0xffffff42e07cc1c0 } ffffff55723a7d00 void ()() fn_av_hold = 0 ffffff55723a7d08 void ()() fn_av_rele = 0 } ffffff55723a7cd0+40::print -a -t 'struct fem_node' ffffff55723a7d10 struct fem_node { ffffff55723a7d10 void *fn_available = 0xffffff440ae9e838 ffffff55723a7d18 union fn_op = { ffffff55723a7d18 fem_t *fem = 0xffffff4310149200 ffffff55723a7d18 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d18 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d18 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d18 void *anon = 0xffffff4310149200 } ffffff55723a7d20 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d28 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+60::print -a -t 'struct fem_node' ffffff55723a7d30 struct fem_node { ffffff55723a7d30 void *fn_available = 0xffffff43a4024e70 ffffff55723a7d38 union fn_op = { ffffff55723a7d38 fem_t *fem = 0xffffff4310149200 ffffff55723a7d38 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d38 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d38 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d38 void *anon = 0xffffff4310149200 } ffffff55723a7d40 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d48 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+80::print -a -t 'struct fem_node' ffffff55723a7d50 struct fem_node { ffffff55723a7d50 void *fn_available = 0xffffff43ab895f08 ffffff55723a7d58 union fn_op = { ffffff55723a7d58 fem_t *fem = 0xffffff4310149200 ffffff55723a7d58 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d58 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d58 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d58 void *anon = 0xffffff4310149200 } ffffff55723a7d60 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d68 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+A0::print -a -t 'struct fem_node' ffffff55723a7d70 struct fem_node { ffffff55723a7d70 void *fn_available = 0xffffffd53784b008 ffffff55723a7d78 union fn_op = { ffffff55723a7d78 fem_t *fem = 0xffffff4310149200 ffffff55723a7d78 vnodeops_t *vnode = 0xffffff4310149200 ffffff55723a7d78 fsem_t *fsem = 0xffffff4310149200 ffffff55723a7d78 vfsops_t *vfs = 0xffffff4310149200 ffffff55723a7d78 void *anon = 0xffffff4310149200 } ffffff55723a7d80 void ()() fn_av_hold = inotify_watch_hold ffffff55723a7d88 void ()() fn_av_rele = inotify_watch_release } ffffff55723a7cd0+C0::print -a -t 'struct fem_node' ffffff55723a7d90 struct fem_node { ffffff55723a7d90 void *fn_available = 0 ffffff55723a7d98 union fn_op = { ffffff55723a7d98 fem_t *fem = 0xffffff43962dae00 ffffff55723a7d98 vnodeops_t *vnode = 0xffffff43962dae00 ffffff55723a7d98 fsem_t *fsem = 0xffffff43962dae00 ffffff55723a7d98 vfsops_t *vfs = 0xffffff43962dae00 ffffff55723a7d98 void *anon = 0xffffff43962dae00 } ffffff55723a7da0 void ()() fn_av_hold = 0 ffffff55723a7da8 void ()() fn_av_rele = 0 } ffffff55723a7cd0+E0::print -a -t 'struct fem_node' ffffff55723a7db0 struct fem_node { ffffff55723a7db0 void *fn_available = 0 ffffff55723a7db8 union fn_op = { ffffff55723a7db8 fem_t *fem = 0 ffffff55723a7db8 vnodeops_t *vnode = 0 ffffff55723a7db8 fsem_t *fsem = 0 ffffff55723a7db8 vfsops_t *vfs = 0 ffffff55723a7db8 void *anon = 0 } ffffff55723a7dc0 void ()() fn_av_hold = 0 ffffff55723a7dc8 void ()() fn_av_rele = 0xb0e53d00002a } ffffff55723a7cd0+100::print -a -t 'struct fem_node' ffffff55723a7dd0 struct fem_node { ffffff55723a7dd0 void *fn_available = 8 ffffff55723a7dd8 union fn_op = { ffffff55723a7dd8 fem_t *fem = 0x1100000000000000 ffffff55723a7dd8 vnodeops_t *vnode = 0x1100000000000000 ffffff55723a7dd8 fsem_t *fsem = 0x1100000000000000 ffffff55723a7dd8 vfsops_t *vfs = 0x1100000000000000 ffffff55723a7dd8 void *anon = 0x1100000000000000 } ffffff55723a7de0 void ()() fn_av_hold = 0x230000010e000001 ffffff55723a7de8 void (*)() fn_av_rele = 0 }

  1. The element 5th element looks fine, now. Possibly, at the exact time of the crash, ttos was 7 - that would lead us to point at ffffff55723a7cd0+E0 which contains an NULL fn_available, and a NULL fn_op->vnode. The other possibility is that at the exact moment of the crash, the fifth element had a NULL fn_available and NULL fn_op->vnode but was overwritten with apparently valid values since the access violation and panic declaration.

We've looked at other running threads (actually pegged on a CPU) and none look like they were in the neighborhood of fem* code at the time...but our lead theory at this point is that somebody, somewhere made changes to the fem_list out from underneath us, so we're looking to try to drive this code with a test program that hammers on list changes under an environment of plenty of getattrs. So, far - no luck.

— Reply to this email directly or view it on GitHub https://github.com/joyent/illumos-joyent/issues/65#issuecomment-126708843 .

bcantrill avatar Jul 31 '15 16:07 bcantrill

Any update on this? I would love to get access to the dump, if possible.

bcantrill avatar Mar 18 '16 04:03 bcantrill

I don’t think we’ve seen this since.

tmwkh avatar Mar 18 '16 14:03 tmwkh

Would it be possible to get access to the dump? It's going to be hard to move forward on this without it. (Note that the dump needn't be made public -- we just need to be able to get access to it.)

bcantrill avatar Mar 18 '16 15:03 bcantrill

Any update on potentially getting me private access to the dump? I want to keep this issue open because I don't have any reason to believe that we've resolved the underlying issue -- but I also can't debug it until I have access to the dump.

bcantrill avatar Jul 19 '16 17:07 bcantrill

Hi Bryan:

  1.   We think we might have at least one dump from a very similar crash in May (see below) and
    
  2.   We’re trying to find out what it would take to get you some kind of access...also,
    
  3.   Here is Youzhong’s analysis of the 2nd event of this kind (there is a slight different in the offset of vhead_getattr() but most neighboring frames are same).
    

    Given Youzhong’s idea, we’re wondering if we now might be able to reproduce (can we get the list to change out from underneath an NFS op?) Thanks! Ken & Youzhong.

---------- Forwarded message ---------- From: Youzhong Yang <[email protected]mailto:[email protected]> Date: Wed, May 25, 2016 at 2:08 PM Subject: vsop_find(...) not protected by fem_lock() To: illumos-dev <[email protected]mailto:[email protected]>

Hi all,

In two years we had two crashes with the following stack trace:

ffffff01ead274c0 unix:die+df ()

ffffff01ead275d0 unix:trap+db3 ()

ffffff01ead275e0 unix:cmntrap+e6 ()

ffffff01ead27750 genunix:vhead_getattr+aa ()

ffffff01ead277d0 genunix:fop_getattr+a8 ()

ffffff01ead27800 nfssrv:rfs4_delegated_getattr+20 ()

ffffff01ead27900 nfssrv:rfs3_getattr+85 ()

ffffff01ead27c20 nfssrv:common_dispatch+539 ()

ffffff01ead27c40 nfssrv:rfs_dispatch+2d ()

ffffff01ead27d20 rpcmod:svc_getreq+1c1 ()

ffffff01ead27d90 rpcmod:svc_run+e0 ()

ffffff01ead27dd0 rpcmod:svc_do_run+8e ()

ffffff01ead27ec0 nfs:nfssys+f1 ()

ffffff01ead27f10 unix:brand_sys_sysenter+1c9 ()

Each time I observed the crash dump, it crashed when doing vsop_find(....) (a macro), but actually looking at the vnode and its v_femhead and fem_list, nothing seems wrong, which suggests that when vsop_find() traverses the fem_nodes, it's being updated/modified.

As you can see in the following code block, vsop_find() is not protected by any lock.

static int

vhead_getattrhttp://src.illumos.org/source/s?refs=vhead_getattr&project=illumos-gate(vnode_thttp://src.illumos.org/source/s?defs=vnode_t&project=illumos-gate *vphttp://src.illumos.org/source/s?refs=vp&project=illumos-gate, vattr_thttp://src.illumos.org/source/s?defs=vattr_t&project=illumos-gate *vaphttp://src.illumos.org/source/s?refs=vap&project=illumos-gate, int flagshttp://src.illumos.org/source/s?refs=flags&project=illumos-gate, cred_thttp://src.illumos.org/source/s?defs=cred_t&project=illumos-gate *crhttp://src.illumos.org/source/s?refs=cr&project=illumos-gate,

    caller_context_t<http://src.illumos.org/source/s?defs=caller_context_t&project=illumos-gate> *ct<http://src.illumos.org/source/s?defs=ct&project=illumos-gate>)

{

    femarg_t<http://src.illumos.org/source/s?defs=femarg_t&project=illumos-gate>       farg<http://src.illumos.org/source/s?refs=farg&project=illumos-gate>;

    struct fem_list<http://src.illumos.org/source/s?defs=fem_list&project=illumos-gate> *femsp<http://src.illumos.org/source/s?refs=femsp&project=illumos-gate>;

    int            (*func<http://src.illumos.org/source/s?defs=func&project=illumos-gate>)();

    void           *arg0<http://src.illumos.org/source/s?refs=arg0&project=illumos-gate>;

    int            errc<http://src.illumos.org/source/s?refs=errc&project=illumos-gate>;



    if ((femsp<http://src.illumos.org/source/s?defs=femsp&project=illumos-gate> = fem_lock<http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#fem_lock>(vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>->v_femhead<http://src.illumos.org/source/s?defs=v_femhead&project=illumos-gate>)) == NULL<http://src.illumos.org/source/s?defs=NULL&project=illumos-gate>) {

           func<http://src.illumos.org/source/s?defs=func&project=illumos-gate> = (int (*)()) (vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>->v_op<http://src.illumos.org/source/s?defs=v_op&project=illumos-gate>->vop_getattr<http://src.illumos.org/source/s?defs=vop_getattr&project=illumos-gate>);

           arg0<http://src.illumos.org/source/s?defs=arg0&project=illumos-gate> = vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>;

           fem_unlock<http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#fem_unlock>(vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>->v_femhead<http://src.illumos.org/source/s?defs=v_femhead&project=illumos-gate>);

           errc<http://src.illumos.org/source/s?defs=errc&project=illumos-gate> = (*func<http://src.illumos.org/source/s?defs=func&project=illumos-gate>)(arg0<http://src.illumos.org/source/s?defs=arg0&project=illumos-gate>, vap<http://src.illumos.org/source/s?defs=vap&project=illumos-gate>, flags<http://src.illumos.org/source/s?defs=flags&project=illumos-gate>, cr<http://src.illumos.org/source/s?defs=cr&project=illumos-gate>, ct<http://src.illumos.org/source/s?defs=ct&project=illumos-gate>);

    } else {

           fem_addref<http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#fem_addref>(femsp<http://src.illumos.org/source/s?defs=femsp&project=illumos-gate>);

           fem_unlock<http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#fem_unlock>(vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>->v_femhead<http://src.illumos.org/source/s?defs=v_femhead&project=illumos-gate>);

           farg<http://src.illumos.org/source/s?defs=farg&project=illumos-gate>.fa_vnode<http://src.illumos.org/source/s?defs=fa_vnode&project=illumos-gate>.vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate> = vp<http://src.illumos.org/source/s?defs=vp&project=illumos-gate>;

           farg<http://src.illumos.org/source/s?defs=farg&project=illumos-gate>.fa_fnode<http://src.illumos.org/source/s?defs=fa_fnode&project=illumos-gate> = femsp<http://src.illumos.org/source/s?defs=femsp&project=illumos-gate>->feml_nodes<http://src.illumos.org/source/s?defs=feml_nodes&project=illumos-gate> + femsp<http://src.illumos.org/source/s?defs=femsp&project=illumos-gate>->feml_tos<http://src.illumos.org/source/s?defs=feml_tos&project=illumos-gate>;

           vsop_find<http://src.illumos.org/source/s?defs=vsop_find&project=illumos-gate>(&farg<http://src.illumos.org/source/s?defs=farg&project=illumos-gate>, &func<http://src.illumos.org/source/s?defs=func&project=illumos-gate>, int, &arg0<http://src.illumos.org/source/s?defs=arg0&project=illumos-gate>, vop_getattr<http://src.illumos.org/source/s?defs=vop_getattr&project=illumos-gate>,

               femop_getattr<http://src.illumos.org/source/s?defs=femop_getattr&project=illumos-gate>);

           errc<http://src.illumos.org/source/s?defs=errc&project=illumos-gate> = (*func<http://src.illumos.org/source/s?defs=func&project=illumos-gate>)(arg0<http://src.illumos.org/source/s?defs=arg0&project=illumos-gate>, vap<http://src.illumos.org/source/s?defs=vap&project=illumos-gate>, flags<http://src.illumos.org/source/s?defs=flags&project=illumos-gate>, cr<http://src.illumos.org/source/s?defs=cr&project=illumos-gate>, ct<http://src.illumos.org/source/s?defs=ct&project=illumos-gate>);

           fem_release<http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#fem_release>(femsp<http://src.illumos.org/source/s?defs=femsp&project=illumos-gate>);

    }

    return (errc<http://src.illumos.org/source/s?defs=errc&project=illumos-gate>);

}

Does my analysis sound correct?

Thanks,

Youzhong

[snip]

tmwkh avatar Jul 19 '16 20:07 tmwkh

Sorry for the code borkage – here’s a link to the routine: http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/fem.c#595

tmwkh avatar Jul 19 '16 22:07 tmwkh