Prev: [PATCH] drivers/mfd: kzalloc doesn't return ERR_PTR
Next: Quickstart Button ACPI driver to serve PNP0C32 ACPI devices
From: Jay Vosburgh on 28 May 2010 17:10 Flavio Leitner <fbl(a)sysclose.org> wrote: >On Fri, May 28, 2010 at 04:16:34PM +0800, Cong Wang wrote: >> On 05/28/10 02:05, Flavio Leitner wrote: >> > >> >Hi guys! >> > >> >I finally could test this to see if an old problem reported on bugzilla[1] was >> >fixed now, but unfortunately it is still there. >> > >> >The ticket is private I guess, but basically the problem happens when bonding >> >driver tries to print something after it had taken the write_lock (monitor >> >functions, enslave/de-enslave), so the printk() will pass through netpoll, then >> >on bonding again which no matter what mode you use, it will try to read_lock() >> >the lock again. The result is a deadlock and the entire system hangs. >> > >> >> Does the attached patch fix this hang? > >I got another issue now: > >[ 89.523062] bonding: bond0: enslaving eth0 as a backup interface with a down link. >[ 89.580746] bonding: bond0: enslaving eth2 as a backup interface with a down link. >[ 91.198527] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None >[ 91.238245] bonding: bond0: link status definitely up for interface eth2. > >[ 91.245381] BUG: scheduling while atomic: bond0/2716/0x10000100 >[ 91.251565] 5 locks held by bond0/2716: >[ 91.255663] #0: ((bond_dev->name)){+.+.+.}, at: [<ffffffff81045fb4>] worker_thread+0x19a/0x2e2 >[ 91.265179] #1: ((&(&bond->mii_work)->work)){+.+.+.}, at: [<ffffffff81045fb4>] worker_thread+0x19a/0x2e2 >[ 91.275554] #2: (rtnl_mutex){+.+.+.}, at: [<ffffffff812daf38>] rtnl_lock+0x12/0x14 >[ 91.284018] #3: (&bond->lock){++.+.+}, at: [<ffffffffa029e06a>] bond_mii_monitor+0x2a2/0x4ed [bonding] >[ 91.294230] #4: (&bond->curr_slave_lock){+...+.}, at: [<ffffffffa029e239>] bond_mii_monitor+0x471/0x4ed [bonding] >[ 91.305387] Modules linked in: bonding sunrpc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log dm_multipath uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm ppdev parport_pc parport rtc_cmos snd_timer tg3 snd ide_cd_mod i5000_edac i2c_i801 libphy rtc_core rtc_lib edac_core pcspkr e1000 dcdbas uhci_hcd tulip shpchp i2c_core cdrom serio_raw soundcore sg snd_page_alloc raid0 sd_mod button [last unloaded: mperf] >[ 91.357735] Pid: 2716, comm: bond0 Not tainted 2.6.34-04700-gd938a70-dirty #36 >[ 91.371112] Call Trace: >[ 91.373825] [<ffffffff81056002>] ? __debug_show_held_locks+0x22/0x24 >[ 91.380530] [<ffffffff8102e4a2>] __schedule_bug+0x6d/0x72 >[ 91.386284] [<ffffffff81363f6e>] schedule+0xc9/0x791 >[ 91.391600] [<ffffffff81032540>] __cond_resched+0x25/0x30 >[ 91.397350] [<ffffffff81364757>] _cond_resched+0x27/0x32 >[ 91.403013] [<ffffffff810ab243>] kmem_cache_alloc+0x2b/0xac >[ 91.408936] [<ffffffff812c61fd>] skb_clone+0x42/0x5d >[ 91.414253] [<ffffffff812ec696>] netlink_broadcast+0x192/0x369 >[ 91.420436] [<ffffffff812ecdc3>] nlmsg_notify+0x43/0x89 >[ 91.426012] [<ffffffff812dabc7>] rtnl_notify+0x2b/0x2d >[ 91.431501] [<ffffffff812dacbc>] rtmsg_ifinfo+0xf3/0x118 >[ 91.437165] [<ffffffff812dad0c>] rtnetlink_event+0x2b/0x2f >[ 91.443003] [<ffffffff81369fe4>] notifier_call_chain+0x32/0x5e >[ 91.449188] [<ffffffff8104d618>] raw_notifier_call_chain+0xf/0x11 >[ 91.455634] [<ffffffff812cfc73>] call_netdevice_notifiers+0x45/0x4a >[ 91.462253] [<ffffffff812d04f7>] netdev_bonding_change+0x12/0x14 This warning is because the notifier call is happening with spin locks held. >[ 91.468614] [<ffffffffa029d589>] bond_select_active_slave+0xe8/0x123 [bonding] >[ 91.476408] [<ffffffffa029e241>] bond_mii_monitor+0x479/0x4ed [bonding] >[ 91.483375] [<ffffffff81046009>] worker_thread+0x1ef/0x2e2 >[ 91.489212] [<ffffffff81045fb4>] ? worker_thread+0x19a/0x2e2 >[ 91.495227] [<ffffffffa029ddc8>] ? bond_mii_monitor+0x0/0x4ed [bonding] >[ 91.502192] [<ffffffff81049c71>] ? autoremove_wake_function+0x0/0x34 >[ 91.508897] [<ffffffff81045e1a>] ? worker_thread+0x0/0x2e2 >[ 91.514734] [<ffffffff810498bb>] kthread+0x7a/0x82 >[ 91.519878] [<ffffffff81003714>] kernel_thread_helper+0x4/0x10 >[ 91.526060] [<ffffffff81366ffc>] ? restore_args+0x0/0x30 >[ 91.531723] [<ffffffff81049841>] ? kthread+0x0/0x82 >[ 91.536953] [<ffffffff81003710>] ? kernel_thread_helper+0x0/0x10 >[ 91.543343] bonding: bond0: making interface eth2 the new active one. >[ 91.550554] bonding: bond0: first active interface up! >[ 91.556859] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready > > >No other patch applied. Just started netconsole over bonding, so no need >to pull the cable from slaves. Reproduced twice, one I got the >backtrace above, and on the other one the system hangs completely >after the BUG: scheduling message. > >fbl > > >> >> Thanks! >> >> -----------------------> >> >> We should notify netconsole that bond is changing its slaves >> when we use active-backup mode. >> >> Signed-off-by: WANG Cong <amwang(a)redhat.com> >> >> ---- >> > >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index 5e12462..9494c02 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -1199,6 +1199,7 @@ void bond_select_active_slave(struct bonding *bond) >> >> best_slave = bond_find_best_slave(bond); >> if (best_slave != bond->curr_active_slave) { >> + netdev_bonding_change(bond->dev, NETDEV_BONDING_DESLAVE); >> bond_change_active_slave(bond, best_slave); >> rv = bond_set_carrier(bond); >> if (!rv) You can't do this here; the driver is holding various spin locks, and notifier calls can sleep (hence the warning). If you look at the bond_change_active_slave function, it drops all locks other than RTNL before making a notifier call, e.g., void bond_change_active_slave(struct bonding *bond, struct slave *new_active) { [...] if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { [...] write_unlock_bh(&bond->curr_slave_lock); read_unlock(&bond->lock); netdev_bonding_change(bond->dev, NETDEV_BONDING_FAILOVER); read_lock(&bond->lock); write_lock_bh(&bond->curr_slave_lock); } You may be able to add your notifier to this case, or change your handler to notice the _FAILOVER notifier. >> @@ -2154,6 +2155,7 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi >> (old_active) && >> (new_active->link == BOND_LINK_UP) && >> IS_UP(new_active->dev)) { >> + netdev_bonding_change(bond->dev, NETDEV_BONDING_DESLAVE); >> write_lock_bh(&bond->curr_slave_lock); >> bond_change_active_slave(bond, new_active); >> write_unlock_bh(&bond->curr_slave_lock); This case will have the same problem, but will only be hit if a user does a manual "ifenslave -c bond0 ethX". You also probably wanted to do the sysfs path, but if the notifier goes into the change_active_slave function itself, then I don't think additional notifications would be necessary. -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar(a)us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Jay Vosburgh on 1 Jun 2010 14:50
Cong Wang <amwang(a)redhat.com> wrote: >On 06/01/10 03:08, Flavio Leitner wrote: >> On Mon, May 31, 2010 at 01:56:52PM +0800, Cong Wang wrote: >>> Hi, Flavio, >>> >>> Please use the attached patch instead, try to see if it solves >>> all your problems. >> >> I tried and it hangs. No backtraces this time. >> The bond_change_active_slave() prints before NETDEV_BONDING_FAILOVER >> notification, so I think it won't work. > >Ah, I thought the same. > >> >> Please, correct if I'm wrong, but when a failover happens with your >> patch applied, the netconsole would be disabled forever even with >> another healthy slave, right? >> > >Yes, this is an easy solution, because bonding has several modes, >it is complex to make netpoll work in different modes. If I understand correctly, the root cause of the problem with netconsole and bonding is that bonding is, ultimately, performing printks with a write lock held, and when netconsole recursively calls into bonding to send the printk over the netconsole, there is a deadlock (when the bonding xmit function attempts to acquire the same lock for read). You're trying to avoid the deadlock by shutting off netconsole (permanently, it looks like) for one problem case: a failover, which does some printks with a write lock held. This doesn't look to me like a complete solution, there are other cases in bonding that will do printk with write locks held. I suspect those will also hang netconsole as things exist today, and won't be affected by your patch below. For example: The sysfs functions to set the primary (bonding_store_primary) or active (bonding_store_active_slave) options: a pr_info is called to provide a log message of the results. These could be tested by setting the primary or active options via sysfs, e.g., echo eth0 > /sys/class/net/bond0/bonding/primary echo eth0 > /sys/class/net/bond0/bonding/active If the kernel is defined with DEBUG, there are a few pr_debug calls within write_locks (bond_del_vlan, for example). If the slave's underlying device driver's ndo_vlan_rx_register or ndo_vlan_rx_kill_vid functions call printk (and it looks like some do for error cases, e.g., igbvf, ehea, enic), those would also presumably deadlock (because bonding holds its write_lock when calling the ndo_ vlan functions). It also appears that (with the patch below) some nominally normal usage patterns will immediately disable netconsole. The one that comes to mind is if the primary= option is set (to "eth1" for this example), but that slave not enslaved first (the slaves are added, say, eth0 then eth1). In that situation, when the primary slave (eth1 here) is added, the first thing that will happen is a failover, and that will disable netconsole. Thoughts? -J >Would you like to test the following patch? > >Thanks much! > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index 5e12462..59ade92 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -1109,6 +1109,14 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active) > if (old_active == new_active) > return; > >+ write_unlock_bh(&bond->curr_slave_lock); >+ read_unlock(&bond->lock); >+ >+ netdev_bonding_change(bond->dev, NETDEV_BONDING_DESLAVE); >+ >+ read_lock(&bond->lock); >+ write_lock_bh(&bond->curr_slave_lock); >+ > if (new_active) { > new_active->jiffies = jiffies; > --- -Jay Vosburgh, IBM Linux Technology Center, fubar(a)us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |