From: Michael Breuer on 6 Jan 2010 23:10 On 1/6/2010 9:42 PM, Michael Breuer wrote: > On 1/6/2010 6:26 PM, Michael Breuer wrote: >> On 1/6/2010 4:10 PM, Stephen Hemminger wrote: >>> On Wed, 06 Jan 2010 14:49:38 -0500 >>> Michael Breuer<mbreuer(a)majjas.com> wrote: >>> >>>> This patch at first behaved similarly to the previous one - seemed >>>> to be >>>> running a bit better... until the adapter went down :( >>>> >>>> This is the syslog output at the time the network failed: >>>> Jan 6 14:11:01 mail kernel: sky2 0000:06:00.0: error interrupt >>>> status=0x40000008 >>>> Jan 6 14:11:01 mail kernel: sky2 software interrupt status 0x40000008 >>> Could you go back to baseline sky2 driver. The display code might >>> be buggy. >>> These bits indicate an error in the MAC. The interrupt source enabled >>> is Transmit FIFO underrun. >>> >>> Looking at how vendor driver handles this. >>> It looks like the Yukon EC_U chip doesn't really do Jumbo frames >>> correctly. >>> Maybe not enough internal buffering to ensure that the whole packet >>> is in the chip. Of course, none of this is in the chip manual. 
>>> >>> Does this help >>> -------------- >>> --- a/drivers/net/sky2.c 2010-01-06 12:48:43.012318966 -0800 >>> +++ b/drivers/net/sky2.c 2010-01-06 13:05:31.273987255 -0800 >>> @@ -792,33 +792,21 @@ static void sky2_set_tx_stfwd(struct sky >>> { >>> struct net_device *dev = hw->dev[port]; >>> >>> - if ( (hw->chip_id == CHIP_ID_YUKON_EX&& >>> - hw->chip_rev != CHIP_REV_YU_EX_A0) || >>> - hw->chip_id>= CHIP_ID_YUKON_FE_P) { >>> - /* Yukon-Extreme B0 and further Extreme devices */ >>> - /* enable Store& Forward mode for TX */ >>> - >>> - if (dev->mtu<= ETH_DATA_LEN) >>> - sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), >>> - TX_JUMBO_DIS | TX_STFW_ENA); >>> - >>> - else >>> - sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), >>> - TX_JUMBO_ENA| TX_STFW_ENA); >>> - } else { >>> - if (dev->mtu<= ETH_DATA_LEN) >>> - sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), >>> TX_STFW_ENA); >>> - else { >>> - /* set Tx GMAC FIFO Almost Empty Threshold */ >>> - sky2_write32(hw, SK_REG(port, TX_GMF_AE_THR), >>> - (ECU_JUMBO_WM<< 16) | ECU_AE_THR); >>> - >>> - sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), >>> TX_STFW_DIS); >>> - >>> - /* Can't do offload because of lack of store/forward */ >>> - dev->features&= ~(NETIF_F_TSO | NETIF_F_SG | >>> NETIF_F_ALL_CSUM); >>> - } >>> - } >>> + if ( (hw->chip_id == CHIP_ID_YUKON_EX&& hw->chip_rev != >>> CHIP_REV_YU_EX_A0) || >>> + hw->chip_id>= CHIP_ID_YUKON_FE_P) { >>> + /* Yukon-Extreme B0 and further Extreme devices */ >>> + /* enable Store& Forward mode for TX */ >>> + sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), TX_STFW_ENA); >>> + } else if (dev->mtu> ETH_DATA_LEN) { >>> + /* set Tx GMAC FIFO Almost Empty Threshold */ >>> + sky2_write32(hw, SK_REG(port, TX_GMF_AE_THR), >>> + (ECU_JUMBO_WM<< 16) | ECU_AE_THR); >>> + /* disable Store& Forward mode for TX */ >>> + sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), TX_STFW_DIS); >>> + } else { >>> + /* enable Store& Forward mode for TX */ >>> + sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), TX_STFW_ENA); 
>>> + } >>> } >>> >>> static void sky2_mac_init(struct sky2_hw *hw, unsigned port) >>> @@ -2185,11 +2173,16 @@ static int sky2_change_mtu(struct net_de >>> if (new_mtu< ETH_ZLEN || new_mtu> ETH_JUMBO_MTU) >>> return -EINVAL; >>> >>> + /* MTU> 1500 on yukon FE and FE+ not allowed */ >>> if (new_mtu> ETH_DATA_LEN&& >>> (hw->chip_id == CHIP_ID_YUKON_FE || >>> hw->chip_id == CHIP_ID_YUKON_FE_P)) >>> return -EINVAL; >>> >>> + /* TSO on Yukon Ultra and MTU> 1500 not supported */ >>> + if (new_mtu> ETH_DATA_LEN&& hw->chip_id == CHIP_ID_YUKON_EC_U) >>> + dev->features&= ~NETIF_F_TSO; >>> + >>> if (!netif_running(dev)) { >>> dev->mtu = new_mtu; >>> return 0; >>> @@ -2233,6 +2226,15 @@ static int sky2_change_mtu(struct net_de >>> if (err) >>> dev_close(dev); >>> else { >>> + /* WA for dev. #4.209 */ >>> + if (hw->chip_id == CHIP_ID_YUKON_EC_U&& >>> + hw->chip_rev == CHIP_REV_YU_EC_U_A1) { >>> + /* enable/disable Store& Forward mode for TX */ >>> + sky2_write32(hw, SK_REG(port, TX_GMF_CTRL_T), >>> + sky2->speed != SPEED_1000 >>> + ? TX_STFW_ENA : TX_STFW_DIS); >>> + } >>> + >>> gma_write16(hw, port, GM_GP_CTRL, ctl); >>> >>> netif_wake_queue(dev); >>> --- a/drivers/net/sky2.h 2010-01-06 12:48:48.632247424 -0800 >>> +++ b/drivers/net/sky2.h 2010-01-06 12:59:57.322078964 -0800 >>> @@ -1901,8 +1901,8 @@ enum { >>> TX_VLAN_TAG_ON = 1<<25,/* enable VLAN tagging */ >>> TX_VLAN_TAG_OFF = 1<<24,/* disable VLAN tagging */ >>> >>> - TX_JUMBO_ENA = 1<<23,/* PCI Jumbo Mode enable (Yukon-EC >>> Ultra) */ >>> - TX_JUMBO_DIS = 1<<22,/* PCI Jumbo Mode enable (Yukon-EC >>> Ultra) */ >>> + TX_PCI_JUM_ENA = 1<<23,/* Enable PCI Jumbo Mode (Yukon-EC >>> Ultra) */ >>> + TX_PCI_JUM_DIS = 1<<22,/* Disable PCI Jumbo Mode (Yukon-EC >>> Ultra) */ >>> >>> GMF_WSP_TST_ON = 1<<18,/* Write Shadow Pointer Test On */ >>> GMF_WSP_TST_OFF = 1<<17,/* Write Shadow Pointer Test Off */ >> Ok ... results - and maybe some more clues... 
>> >> Running with this patch; Jarek's "alternative 1", and the patch from >> the other thread. Not so good. >> >> No reported errors (sky2, etc.) - however with mtu=9000, lots of >> stuff broke: XDMCP; http via MASQ/netfilter, ssh connections >> intermittently (when large frames involved perhaps), etc. Tried to >> change mtu to 1500 on the fly, got a bunch of errors - and network >> watchdog kicked in. Have now rebooted with the same patches and >> mtu=1500. >> ... with mtu=1500, Everything is again working (i.e., XDMCP, >> netfilter, etc.) >> Load test with mtu=1500 went well for a while - high throughput >> sustained for a few minutes - then similar crash as before... but no >> interrup error messages this time until after the oops: >> <nothing of note before this> >> Jan 6 18:17:54 mail kernel: DRHD: handling fault status reg 2 >> Jan 6 18:17:54 mail kernel: DMAR:[DMA Read] Request device [06:00.0] >> fault addr 1bbfe000 >> Jan 6 18:17:54 mail kernel: DMAR:[fault reason 06] PTE Read access >> is not set >> Jan 6 18:17:54 mail kernel: sky2 0000:06:00.0: error interrupt >> status=0x80000000 >> Jan 6 18:17:54 mail kernel: sky2 0000:06:00.0: PCI hardware error >> (0x2010) >> Jan 6 18:18:04 mail kernel: ------------[ cut here ]------------ >> Jan 6 18:18:04 mail kernel: WARNING: at net/sched/sch_generic.c:261 >> dev_watchdog+0xf3/0x164() >> Jan 6 18:18:04 mail kernel: Hardware name: System Product Name >> Jan 6 18:18:04 mail kernel: NETDEV WATCHDOG: eth0 (sky2): transmit >> queue 0 timed out >> Jan 6 18:18:04 mail kernel: Modules linked in: ip6table_filter >> ip6table_mangle ip6_tables ipt_MASQUERADE iptable_nat nf_nat >> iptable_mangle iptable_raw bridge stp appletalk psnap llc nfsd lockd >> nfs_acl auth_rpcgss exportfs hwmon_vid coretemp sunrpc acpi_cpufreq >> sit tunnel4 ipt_LOG nf_conntrack_netbios_ns nf_conntrack_ftp xt_DSCP >> xt_dscp xt_MARK nf_conntrack_ipv6 xt_multiport ipv6 dm_multipath >> kvm_intel kvm snd_hda_codec_analog snd_ens1371 gameport snd_rawmidi 
>> snd_ac97_codec snd_hda_intel snd_hda_codec ac97_bus snd_hwdep snd_seq >> snd_seq_device gspca_spca505 gspca_main videodev v4l1_compat snd_pcm >> v4l2_compat_ioctl32 pcspkr asus_atk0110 hwmon i2c_i801 iTCO_wdt >> firewire_ohci iTCO_vendor_support firewire_core crc_itu_t snd_timer >> snd sky2 soundcore wmi snd_page_alloc fbcon tileblit font bitblit >> softcursor raid456 async_raid6_recov async_pq raid6_pq async_xor xor >> async_memcpy async_tx raid1 ata_generic pata_acpi pata_marvell >> nouveau ttm drm_kms_helper drm agpgart fb i2c_algo_bit cfbcopyarea >> i2c_core cfbimgblt cfbfil >> Jan 6 18:18:04 mail kernel: lrect [last unloaded: microcode] >> Jan 6 18:18:04 mail kernel: Pid: 0, comm: swapper Tainted: G >> W 2.6.32-00840-gec8257c-dirty #41 >> Jan 6 18:18:04 mail kernel: Call Trace: >> Jan 6 18:18:04 mail kernel: <IRQ> [<ffffffff8105365a>] >> warn_slowpath_common+0x7c/0x94 >> Jan 6 18:18:04 mail kernel: [<ffffffff810536c9>] >> warn_slowpath_fmt+0x41/0x43 >> Jan 6 18:18:04 mail kernel: [<ffffffff813e12bf>] ? >> netif_tx_lock+0x44/0x6c >> Jan 6 18:18:04 mail kernel: [<ffffffff813e1427>] >> dev_watchdog+0xf3/0x164 >> Jan 6 18:18:04 mail kernel: [<ffffffff81077696>] ? >> sched_clock_cpu+0x47/0xd1 >> Jan 6 18:18:04 mail kernel: [<ffffffff8106316b>] >> run_timer_softirq+0x1c8/0x270 >> Jan 6 18:18:04 mail kernel: [<ffffffff8105ae3b>] >> __do_softirq+0xf8/0x1cd >> Jan 6 18:18:04 mail kernel: [<ffffffff8107ef33>] ? >> tick_program_event+0x2a/0x2c >> Jan 6 18:18:04 mail kernel: [<ffffffff81012e1c>] call_softirq+0x1c/0x30 >> Jan 6 18:18:04 mail kernel: [<ffffffff810143a3>] do_softirq+0x4b/0xa6 >> Jan 6 18:18:04 mail kernel: [<ffffffff8105aa1b>] irq_exit+0x4a/0x8c >> Jan 6 18:18:04 mail kernel: [<ffffffff8146dd32>] >> smp_apic_timer_interrupt+0x86/0x94 >> Jan 6 18:18:04 mail kernel: [<ffffffff810127e3>] >> apic_timer_interrupt+0x13/0x20 >> Jan 6 18:18:04 mail kernel: <EOI> [<ffffffff812c4a06>] ? 
>> acpi_idle_enter_c1+0xb2/0xd0 >> Jan 6 18:18:04 mail kernel: [<ffffffff812c49ff>] ? >> acpi_idle_enter_c1+0xab/0xd0 >> Jan 6 18:18:04 mail kernel: [<ffffffff813a43b8>] ? >> cpuidle_idle_call+0x9e/0xfa >> Jan 6 18:18:04 mail kernel: [<ffffffff81010c90>] ? cpu_idle+0xb4/0xf6 >> Jan 6 18:18:04 mail kernel: [<ffffffff81463312>] ? >> start_secondary+0x201/0x242 >> Jan 6 18:18:04 mail kernel: ---[ end trace 57f7151f6a5def07 ]--- >> Jan 6 18:18:04 mail kernel: sky2 eth0: tx timeout >> Jan 6 18:18:04 mail kernel: sky2 eth0: transmit ring 21 .. 108 >> report=21 done=21 >> Jan 6 18:18:04 mail kernel: sky2 eth0: disabling interface >> Jan 6 18:18:04 mail kernel: sky2 eth0: enabling interface >> <eth0 dead after this> > Walked through the code based on Jarek's patches... came upon > NET_CLS_ACT. At least in some cases (sch_cbq.c for example), the net > transmit error could be returned from here... after releasing the skb. > A quick scan of the various files in net/sched suggests that with > NET_CLS_ACT the skb may or may not have been freed in the event of an > error. If I have time later I'll see if I can bypass NET_CLS_ACT and > see whether this is even relevant. Ok - so rerunning with Jarek's alternative #2 patch (the one that doesn't re-free the skb after the net_dev_enqueue error) and all else as above (your mtu patch on a clean 2.6.32.2 sky2.c, plus the devtpts inode patch) with an MTU of 1500 I can run without errors (including the interrupt errors previously reported). Changing MTU to 9000, everything basically breaks - Can't use X11 (local or remote - get X11 screen after gdm login locally, but then goes back to greeter; remote gets no greeter); ssh sessions hang; etc. This time I was able to reset the MTU back to 1500 without a reboot - but I did have to ifconfig eth0 down and then up. Looking at the sk98lin code, it looks to me like they do a bit more work with existing buffers before completing the MTU switch. 
Note that even after doing this, X11 did not work (it did with the old MTU change code). I tried changing the MTU to 4500, with the same effect as 9000, but when I switched back to 1500, ksoftirqd started spinning, using 100% of one core.

Running with these patches and an MTU of 1500 seems stable, but the network is running rather slowly. Latency is OK, but peak transfer rates seem to be about 20% of what I saw while receiving the interrupt errors with the earlier patches. I'll leave this running overnight to confirm that at least the errors and hangs are resolved with this patch set.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Stephen Hemminger on 7 Jan 2010 00:00

On Wed, 06 Jan 2010 23:00:34 -0500
Michael Breuer <mbreuer(a)majjas.com> wrote:

> Changing MTU to 9000, everything basically breaks - Can't use X11 (local
> or remote - get X11 screen after gdm login locally, but then goes back
> to greeter; remote gets no greeter); ssh sessions hang; etc. This time I
> was able to reset the MTU back to 1500 without a reboot - but I did have
> to ifconfig eth0 down and then up. Looking at the sk98lin code, it looks
> to me like they do a bit more work with existing buffers before
> completing the MTU switch. Note that even doing this, X11 did not work
> (it did with the old mtu change code). Tried changing to mtu 4500 - same
> effect as 9000... but when I switched back to 1500, ksoftirqd started
> spinning using 100% of one core.

The problem is that patch was enabling scatter-gather and checksum offload
that won't work on EC_U hardware with 9K MTU. At least, it never worked
for me when I tested it. So because of that it really doesn't change anything
for the better on that chip version.

What version chip is on that motherboard? Mine is:
    Yukon-2 EC Ultra chip revision 3
which corresponds to B0 step.

Another possibility is the PHY register which controls number of ticks
of buffering. The default is zero, which gives the most buffering (good),
but the firmware could be reprogramming it (bad). In general, the driver
doesn't fiddle with bits that are already set correctly, because sometimes
vendors need to tweak PCI timing in firmware/BIOS. It seems the firmware on this
chip is just a bunch of register setups done on power on.
From: Michael Breuer on 7 Jan 2010 00:20

On 1/6/2010 11:53 PM, Stephen Hemminger wrote:
> On Wed, 06 Jan 2010 23:00:34 -0500
> Michael Breuer<mbreuer(a)majjas.com> wrote:
>
>> Changing MTU to 9000, everything basically breaks - Can't use X11 (local
>> or remote - get X11 screen after gdm login locally, but then goes back
>> to greeter; remote gets no greeter); ssh sessions hang; etc. This time I
>> was able to reset the MTU back to 1500 without a reboot - but I did have
>> to ifconfig eth0 down and then up. Looking at the sk98lin code, it looks
>> to me like they do a bit more work with existing buffers before
>> completing the MTU switch. Note that even doing this, X11 did not work
>> (it did with the old mtu change code). Tried changing to mtu 4500 - same
>> effect as 9000... but when I switched back to 1500, ksoftirqd started
>> spinning using 100% of one core.
>>
> The problem is that patch was enabling scatter-gather and checksum offload
> that won't work on EC_U hardware with 9K MTU. At least, it never worked
> for me when I tested it. So because of that it really doesn't change anything
> for the better on that chip version.
>
> What version chip is on that motherboard? Mine is:
> Yukon-2 EC Ultra chip revision 3
> which corresponds to B0 step.
>
> Another possibility is the PHY register which controls number of ticks
> of buffering. The default is zero, which gives the most buffering (good),
> but the firmware could be reprogramming it (bad). In general, the driver
> doesn't fiddle with bits that are already set correctly, because sometimes
> vendors need to tweak PCI timing in firmware/BIOS. It seems the firmware on this
> chip is just a bunch of register setups done on power on.
>
So at this point, things are working as mentioned - but really slow... at least an order of magnitude slower than with the other set of patches. The other set also generated errors and was not stable :( But, that set worked with mtu=9000, this set doesn't seem to work with anything over 1500. The slowdown may also be (based on earlier testing) attributable to Jarek's alternative #2 patch.

As to the chip, I *think* we have the same chip - I'm including lspci -vv - perhaps there is something different.

06:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 14)
    Subsystem: Giga-byte Technology Device e000
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 256 bytes
    Interrupt: pin A routed to IRQ 58
    Region 0: Memory at fbdfc000 (64-bit, non-prefetchable) [size=16K]
    Region 2: I/O ports at d800 [size=256]
    Expansion ROM at fbdc0000 [disabled] [size=128K]
    Capabilities: [48] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [50] Vital Product Data
        Product Name: Marvell Yukon 88E8056 Gigabit Ethernet Controller
        Read-only fields:
            [PN] Part number: Yukon 88E8056
            [EC] Engineering changes: Rev. 1.4
            [MN] Manufacture ID: 4d 61 72 76 65 6c 6c
            [SN] Serial number: AbCdEfG001C3B
            [CP] Extended capability: 01 10 cc 03
            [RV] Reserved: checksum good, 9 byte(s) reserved
        Read/write fields:
            [RW] Read-write area: 121 byte(s) free
        End
    Capabilities: [5c] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee00458 Data: 0000
    Capabilities: [e0] Express (v1) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 unlimited
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 1f, GenCap- CGenEn- ChkCap- ChkEn-
    Kernel driver in use: sky2
    Kernel modules: sky2
From: Michael Breuer on 7 Jan 2010 00:40

On 1/6/2010 11:53 PM, Stephen Hemminger wrote:
> On Wed, 06 Jan 2010 23:00:34 -0500
> Michael Breuer<mbreuer(a)majjas.com> wrote:
>
>> Changing MTU to 9000, everything basically breaks - Can't use X11 (local
>> or remote - get X11 screen after gdm login locally, but then goes back
>> to greeter; remote gets no greeter); ssh sessions hang; etc. This time I
>> was able to reset the MTU back to 1500 without a reboot - but I did have
>> to ifconfig eth0 down and then up. Looking at the sk98lin code, it looks
>> to me like they do a bit more work with existing buffers before
>> completing the MTU switch. Note that even doing this, X11 did not work
>> (it did with the old mtu change code). Tried changing to mtu 4500 - same
>> effect as 9000... but when I switched back to 1500, ksoftirqd started
>> spinning using 100% of one core.
>>
> The problem is that patch was enabling scatter-gather and checksum offload
> that won't work on EC_U hardware with 9K MTU. At least, it never worked
> for me when I tested it. So because of that it really doesn't change anything
> for the better on that chip version.
>
> What version chip is on that motherboard? Mine is:
> Yukon-2 EC Ultra chip revision 3
> which corresponds to B0 step.
>
> Another possibility is the PHY register which controls number of ticks
> of buffering. The default is zero, which gives the most buffering (good),
> but the firmware could be reprogramming it (bad). In general, the driver
> doesn't fiddle with bits that are already set correctly, because sometimes
> vendors need to tweak PCI timing in firmware/BIOS. It seems the firmware on this
> chip is just a bunch of register setups done on power on.
>
Also - I'm seeing a huge number of dropped packets (RX) 200-300/second. Probably why this is so slow.

Current ifconfig:
eth0      Link encap:Ethernet HWaddr 00:26:18:00:1C:3B
          inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0
          inet6 addr: fe80::226:18ff:fe00:1c3b/64 Scope:Link
          UP BROADCAST RUNNING ALLMULTI MULTICAST MTU:1500 Metric:1
          RX packets:26647536 errors:0 dropped:517884 overruns:0 frame:0
          TX packets:12112780 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:38960063319 (36.2 GiB) TX bytes:1889879762 (1.7 GiB)
          Interrupt:18
From: Michael Breuer on 7 Jan 2010 01:00
On 1/7/2010 12:32 AM, Michael Breuer wrote: > On 1/6/2010 11:53 PM, Stephen Hemminger wrote: >> On Wed, 06 Jan 2010 23:00:34 -0500 >> Michael Breuer<mbreuer(a)majjas.com> wrote: >> >>> Changing MTU to 9000, everything basically breaks - Can't use X11 >>> (local >>> or remote - get X11 screen after gdm login locally, but then goes back >>> to greeter; remote gets no greeter); ssh sessions hang; etc. This >>> time I >>> was able to reset the MTU back to 1500 without a reboot - but I did >>> have >>> to ifconfig eth0 down and then up. Looking at the sk98lin code, it >>> looks >>> to me like they do a bit more work with existing buffers before >>> completing the MTU switch. Note that even doing this, X11 did not work >>> (it did with the old mtu change code). Tried changing to mtu 4500 - >>> same >>> effect as 9000... but when I switched back to 1500, ksoftirqd started >>> spinning using 100% of one core. >> The problem is that patch was enabling scatter-gather and checksum >> offload >> that won't work on EC_U hardware with 9K MTU. At least, it never worked >> for me when I tested it. So because of that it really doesn't change >> anything >> for the better on that chip version. >> >> What version chip is on that motherboard? Mine is: >> Yukon-2 EC Ultra chip revision 3 >> which corresponds to B0 step. >> >> Another possibility is the PHY register which controls number of ticks >> of buffering. The default is zero, which gives the most buffering >> (good), >> but the firmware could be reprogramming it (bad). In general, the >> driver >> doesn't fiddle with bits that are already set correctly, because >> sometimes >> vendors need to tweak PCI timing in firmware/BIOS. It seems the >> firmware on this >> chip is just a bunch of register setups done on power on. > Also - I'm seeing a huge number of dropped packets (RX) > 200-300/second. Probably why this is so slow. 
> > Current ifconfig: > eth0 Link encap:Ethernet HWaddr 00:26:18:00:1C:3B > inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0 > inet6 addr: fe80::226:18ff:fe00:1c3b/64 Scope:Link > UP BROADCAST RUNNING ALLMULTI MULTICAST MTU:1500 Metric:1 > RX packets:26647536 errors:0 dropped:517884 overruns:0 frame:0 > TX packets:12112780 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:1000 > RX bytes:38960063319 (36.2 GiB) TX bytes:1889879762 (1.7 GiB) > Interrupt:18 > > > Never mind... spoke too soon. Crashed again. Just took longer: Jan 7 00:37:39 mail kernel: DRHD: handling fault status reg 2 Jan 7 00:37:39 mail kernel: DMAR:[DMA Read] Request device [06:00.0] fault addr fff1401fe000 Jan 7 00:37:39 mail kernel: DMAR:[fault reason 06] PTE Read access is not set Jan 7 00:37:39 mail kernel: sky2 0000:06:00.0: error interrupt status=0x80000000 Jan 7 00:37:39 mail kernel: sky2 0000:06:00.0: PCI hardware error (0x2010) Jan 7 00:37:40 mail smbd[4729]: [2010/01/07 00:37:40, 0] lib/util_sock.c:539(read_fd_with_timeout) Jan 7 00:37:40 mail smbd[4729]: [2010/01/07 00:37:40, 0] lib/util_sock.c:1491(get_peer_addr_internal) Jan 7 00:37:40 mail smbd[4729]: getpeername failed. Error was Transport endpoint is not connected Jan 7 00:37:40 mail smbd[4729]: read_fd_with_timeout: client 0.0.0.0 read error = Connection timed out. 
Jan 7 00:37:40 mail dhcpd: DHCPREQUEST for 10.0.0.32 from 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:40 mail dhcpd: DHCPACK on 10.0.0.32 to 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:41 mail dhcpd: DHCPREQUEST for 10.0.0.32 from 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:41 mail dhcpd: DHCPACK on 10.0.0.32 to 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:44 mail dhcpd: DHCPREQUEST for 10.0.0.32 from 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:44 mail dhcpd: DHCPACK on 10.0.0.32 to 00:26:bb:aa:15:10 (mbitouch) via eth0 Jan 7 00:37:47 mail kernel: ------------[ cut here ]------------ Jan 7 00:37:47 mail kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0xf3/0x164() Jan 7 00:37:47 mail kernel: Hardware name: System Product Name Jan 7 00:37:47 mail kernel: NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out Jan 7 00:37:47 mail kernel: Modules linked in: ip6table_filter ip6table_mangle ip6_tables ipt_MASQUERADE iptable_nat nf_nat iptable_mangle iptable_raw bridge stp appletalk psnap llc nfsd lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp sunrpc acpi_cpufreq sit tunnel4 ipt_LOG nf_conntrack_netbios_ns nf_conntrack_ftp xt_DSCP xt_dscp xt_MARK nf_conntrack_ipv6 xt_multiport ipv6 dm_multipath kvm_intel kvm snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_ens1371 gameport snd_rawmidi gspca_spca505 gspca_main pcspkr snd_ac97_codec snd_hwdep i2c_i801 snd_seq firewire_ohci videodev v4l1_compat snd_seq_device v4l2_compat_ioctl32 ac97_bus firewire_core crc_itu_t iTCO_wdt snd_pcm iTCO_vendor_support wmi snd_timer snd sky2 asus_atk0110 hwmon soundcore snd_page_alloc fbcon tileblit font bitblit softcursor raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 ata_generic pata_acpi pata_marvell nouveau ttm drm_kms_helper drm agpgart fb i2c_algo_bit cfbcopyarea i2c_core cfbimgblt cfbfil Jan 7 00:37:47 mail kernel: lrect [last unloaded: microcode] Jan 7 00:37:47 mail kernel: Pid: 0, comm: swapper 
Tainted: G W 2.6.32-00840-gec8257c-dirty #44
Jan 7 00:37:47 mail kernel: Call Trace:
Jan 7 00:37:47 mail kernel: <IRQ> [<ffffffff8105365a>] warn_slowpath_common+0x7c/0x94
Jan 7 00:37:47 mail kernel: [<ffffffff810536c9>] warn_slowpath_fmt+0x41/0x43
Jan 7 00:37:47 mail kernel: [<ffffffff813e12bf>] ? netif_tx_lock+0x44/0x6c
Jan 7 00:37:47 mail kernel: [<ffffffff813e1427>] dev_watchdog+0xf3/0x164
Jan 7 00:37:47 mail kernel: [<ffffffff8105f320>] ? cascade+0x6a/0x84
Jan 7 00:37:47 mail kernel: [<ffffffff8106316b>] run_timer_softirq+0x1c8/0x270
Jan 7 00:37:47 mail kernel: [<ffffffff8105ae3b>] __do_softirq+0xf8/0x1cd
Jan 7 00:37:47 mail kernel: [<ffffffff8107ef33>] ? tick_program_event+0x2a/0x2c
Jan 7 00:37:47 mail kernel: [<ffffffff81012e1c>] call_softirq+0x1c/0x30
Jan 7 00:37:47 mail kernel: [<ffffffff810143a3>] do_softirq+0x4b/0xa6
Jan 7 00:37:47 mail kernel: [<ffffffff8105aa1b>] irq_exit+0x4a/0x8c
Jan 7 00:37:47 mail kernel: [<ffffffff8146dd32>] smp_apic_timer_interrupt+0x86/0x94
Jan 7 00:37:47 mail kernel: [<ffffffff810127e3>] apic_timer_interrupt+0x13/0x20
Jan 7 00:37:47 mail kernel: <EOI> [<ffffffff812c4c7a>] ? acpi_idle_enter_bm+0x256/0x28a
Jan 7 00:37:47 mail kernel: [<ffffffff812c4c73>] ? acpi_idle_enter_bm+0x24f/0x28a
Jan 7 00:37:47 mail kernel: [<ffffffff813a43b8>] ? cpuidle_idle_call+0x9e/0xfa
Jan 7 00:37:47 mail kernel: [<ffffffff81010c90>] ? cpu_idle+0xb4/0xf6
Jan 7 00:37:47 mail kernel: [<ffffffff81463312>] ? start_secondary+0x201/0x242
Jan 7 00:37:47 mail kernel: ---[ end trace 57f7151f6a5def07 ]---
Jan 7 00:37:47 mail kernel: sky2 eth0: tx timeout
Jan 7 00:37:47 mail kernel: sky2 eth0: transmit ring 79 .. 39 report=79 done=79
Jan 7 00:37:47 mail kernel: sky2 eth0: disabling interface
Jan 7 00:37:47 mail kernel: sky2 eth0: enabling interface
Jan 7 00:37:51 mail kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
Jan 7 00:38:07 mail kernel: sky2 eth0: tx timeout
Jan 7 00:38:07 mail kernel: sky2 eth0: transmit ring 3 .. 90 report=3 done=3
Jan 7 00:38:07 mail kernel: sky2 eth0: disabling interface
Jan 7 00:38:07 mail kernel: sky2 eth0: enabling interface

and so on.