    netns : fix kernel panic in timewait socket destruction · d315492b
    Daniel Lezcano authored
    
    
    How to reproduce:
     - create a network namespace
     - use the TCP protocol and get a timewait socket
     - exit the network namespace
     - after a moment (when the timewait socket is destroyed), the kernel
       panics.
    
    # BUG: unable to handle kernel NULL pointer dereference at
    0000000000000007
    IP: [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
    PGD 119985067 PUD 11c5c0067 PMD 0
    Oops: 0000 [1] SMP
    CPU 1
    Modules linked in: ipv6 button battery ac loop dm_mod tg3 libphy ext3 jbd
    edd fan thermal processor thermal_sys sg sata_svw libata dock serverworks
    sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
    Pid: 0, comm: swapper Not tainted 2.6.27-rc2 #3
    RIP: 0010:[<ffffffff821e394d>] [<ffffffff821e394d>]
    inet_twdr_do_twkill_work+0x6e/0xb8
    RSP: 0018:ffff88011ff7fed0 EFLAGS: 00010246
    RAX: ffffffffffffffff RBX: ffffffff82339420 RCX: ffff88011ff7ff30
    RDX: 0000000000000001 RSI: ffff88011a4d03c0 RDI: ffff88011ac2fc00
    RBP: ffffffff823392e0 R08: 0000000000000000 R09: ffff88002802a200
    R10: ffff8800a5c4b000 R11: ffffffff823e4080 R12: ffff88011ac2fc00
    R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
    FS: 0000000041cbd940(0000) GS:ffff8800bff839c0(0000)
    knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000000007 CR3: 00000000bd87c000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process swapper (pid: 0, threadinfo ffff8800bff9e000, task
    ffff88011ff76690)
    Stack: ffffffff823392e0 0000000000000100 ffffffff821e3a3a
    0000000000000008
    0000000000000000 ffffffff821e3a61 ffff8800bff7c000 ffffffff8203c7e7
    ffff88011ff7ff10 ffff88011ff7ff10 0000000000000021 ffffffff82351108
    Call Trace:
    <IRQ> [<ffffffff821e3a3a>] ? inet_twdr_hangman+0x0/0x9e
    [<ffffffff821e3a61>] ? inet_twdr_hangman+0x27/0x9e
    [<ffffffff8203c7e7>] ? run_timer_softirq+0x12c/0x193
    [<ffffffff820390d1>] ? __do_softirq+0x5e/0xcd
    [<ffffffff8200d08c>] ? call_softirq+0x1c/0x28
    [<ffffffff8200e611>] ? do_softirq+0x2c/0x68
    [<ffffffff8201a055>] ? smp_apic_timer_interrupt+0x8e/0xa9
    [<ffffffff8200cad6>] ? apic_timer_interrupt+0x66/0x70
    <EOI> [<ffffffff82011f4c>] ? default_idle+0x27/0x3b
    [<ffffffff8200abbd>] ? cpu_idle+0x5f/0x7d
    
    
    Code: e8 01 00 00 4c 89 e7 41 ff c5 e8 8d fd ff ff 49 8b 44 24 38 4c 89 e7
    65 8b 14 25 24 00 00 00 89 d2 48 8b 80 e8 00 00 00 48 f7 d0 <48> 8b 04 d0
    48 ff 40 58 e8 fc fc ff ff 48 89 df e8 c0 5f 04 00
    RIP [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
    RSP <ffff88011ff7fed0>
    CR2: 0000000000000007
    
    This patch provides a function to purge all timewait sockets related
    to a network namespace. The timewait socket life cycle is not tied to
    the network namespace, which means the timewait sockets stay alive
    after the network namespace dies. Timewait sockets exist to avoid
    accepting duplicate packets from the network. Once the network
    namespace is freed, its network stack is removed, so there is no chance
    of receiving any packets from the outside world. Furthermore, having a
    pending destruction timer on these sockets after the network namespace
    has been freed is not safe: it leads to an oops when the timer callback
    tries to access data belonging to the namespace, for example in:
    	inet_twdr_do_twkill_work
    		-> NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);
    
    Purging the timewait sockets when the network namespace is destroyed will:
     1) speed up memory freeing for the namespace
     2) fix kernel panic on asynchronous timewait destruction
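
    The purge idea can be illustrated with a small userspace model. This is
    only a sketch under stated assumptions, not the code added by this patch:
    the names net_model, tw_model, tw_table, tw_add, tw_purge and tw_count
    are hypothetical, and the real kernel hash tables, locking and timers
    are left out. It shows the core step: walk every hash bucket and unlink
    each timewait entry that belongs to the dying namespace, so that no
    later timer callback can dereference the freed namespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a network namespace. */
struct net_model {
    int id;
};

/* Hypothetical stand-in for a timewait socket: it only remembers
 * which namespace it belongs to and its hash chain link. */
struct tw_model {
    struct net_model *net;
    struct tw_model *next;
};

#define TW_BUCKETS 8

/* Simplified hash table of timewait entries (no locking in this model). */
struct tw_table {
    struct tw_model *bucket[TW_BUCKETS];
};

/* Insert a timewait entry for 'net' into the chain selected by 'hash'.
 * Returns 0 on success, -1 if allocation fails. */
int tw_add(struct tw_table *t, struct net_model *net, unsigned int hash)
{
    struct tw_model *tw;

    assert(t != NULL && net != NULL);
    tw = malloc(sizeof(*tw));
    if (tw == NULL)
        return -1;
    tw->net = net;
    tw->next = t->bucket[hash % TW_BUCKETS];
    t->bucket[hash % TW_BUCKETS] = tw;
    return 0;
}

/* Remove and free every entry that belongs to 'net'.
 * Entries of other namespaces are left untouched.
 * Returns the number of entries purged. */
size_t tw_purge(struct tw_table *t, const struct net_model *net)
{
    size_t purged = 0;
    size_t i;

    for (i = 0; i < TW_BUCKETS; i++) {
        struct tw_model **pp = &t->bucket[i];

        while (*pp != NULL) {
            struct tw_model *tw = *pp;

            if (tw->net == net) {
                *pp = tw->next;   /* unlink from the chain */
                free(tw);
                purged++;
            } else {
                pp = &tw->next;   /* keep it, advance */
            }
        }
    }
    return purged;
}

/* Count all entries currently in the table. */
size_t tw_count(const struct tw_table *t)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i < TW_BUCKETS; i++) {
        const struct tw_model *tw;

        for (tw = t->bucket[i]; tw != NULL; tw = tw->next)
            n++;
    }
    return n;
}
```

    In this model, calling tw_purge() for a namespace before freeing it
    guarantees that no remaining entry still points at that namespace,
    which is the property the deferred timewait destruction relies on.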
    
    Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
    Acked-by: Denis V. Lunev <den@openvz.org>
    Acked-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>