From: Martin Mokrejs on 23 Jul 2010 10:10

Hi,
I bought an external hard drive with firewire and USB interfaces (IcyBOX IB-250StUE-B). If I connect it to a desktop computer A I get a kernel crash during boot (see both attached dmesg-*.txt files).

Further, a laptop computer B is connected to A via firewire as well, through the firewire-net module. I do not understand why, but on computer B I see complaints in dmesg from firewire_sbp2 about the external drive physically connected to computer A! Is that a bug or a feature? Nevertheless, host B cannot really talk to the drive (see the snippet from the 2.6.34.1 kernel on the laptop below in the body of this email).

Sorry for mixing the two issues into a single email. Maybe this is because of similar underlying issues? The desktop has 2 firewire ports and the laptop also has 2 ports. Given that both have firewire_net inserted into the running kernel, and that on both machines I see only a firewire0 interface and no additional firewire1 interface, I wonder whether the kernel realizes there are two physical ports on each computer; maybe it mixes together some data or takes an action on the wrong port. You may think of yesterday's email, under the subject "2.6.31.14: firewire_net issue in generic_sync_sb_inodes", as yet another kernel crash and bug in the JuJu firewire stack.

Thanks for any clues,
Martin

firewire_core: created device fw0: GUID 00e018000305e5fc, S400
firewire_core: created device fw1: GUID 0011d80001762a80, S400
firewire_core: created device fw2: GUID 001b8c8000000105, S400
firewire_core: refreshed device fw0
firewire_net: firewire0: IPv4 over FireWire on device 00e018000305e5fc
usb 1-1: new low speed USB device using uhci_hcd and address 2
scsi2 : SBP-2 IEEE-1394
usb 1-1: New USB device found, idVendor=0458, idProduct=0036
usb 1-1: New USB device strings: Mfr=2, Product=1, SerialNumber=0
usb 1-1: Product: NetScroll + Mini Traveler
usb 1-1: Manufacturer: Genius
firewire_sbp2: fw2.0: logged in to LUN 0000 (0 retries)
scsi 2:0:0:0: Direct-Access-RBC JMicron HDD PQ: 0 ANSI: 4
sd 2:0:0:0: Attached scsi generic sg2 type 14
sd 2:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 10 00 00 00
sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1
sd 2:0:0:0: [sdb] Attached SCSI disk
firewire_sbp2: fw2.0: sbp2_scsi_abort
firewire_sbp2: fw2.0: sbp2_scsi_abort
sd 2:0:0:0: Device offlined - not ready after error recovery
sd 2:0:0:0: [sdb] Unhandled error code
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
sd 2:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 20 00
end_request: I/O error, dev sdb, sector 0
Buffer I/O error on device sdb, logical block 0
Buffer I/O error on device sdb, logical block 1
Buffer I/O error on device sdb, logical block 2
Buffer I/O error on device sdb, logical block 3
firewire_ohci: isochronous cycle inconsistent
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: refreshed device fw1
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_ohci: isochronous cycle inconsistent
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
sd 2:0:0:0: [sdb] Synchronizing SCSI cache
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
sd 2:0:0:0: [sdb] Stopping disk
sd 2:0:0:0: [sdb] START_STOP FAILED
sd 2:0:0:0: [sdb] Result: hostbyte=0x02 driverbyte=0x00
firewire_sbp2: released fw2.0, target 2:0:0
firewire_ohci: isochronous cycle inconsistent
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
firewire_ohci: isochronous cycle inconsistent
firewire_core: created device fw1: GUID 0011d80001762a80, S400
firewire_core: refreshed device fw1

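A log like the one above is easier to judge once the recurring events are counted. The following is only a minimal sketch: the event labels and the `tally_events` helper are made up for illustration, and the sample lines are copied from the dmesg excerpt above. It does a plain substring match and does not interpret what the kernel messages mean.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that occur in the log excerpt.
PATTERNS = {
    "sbp2_abort": "sbp2_scsi_abort",
    "sbp2_reconnect": "reconnected to LUN",
    "phy_config": "phy config:",
    "cycle_inconsistent": "isochronous cycle inconsistent",
    "io_error": "I/O error",
}


def tally_events(log_text):
    """Count how many log lines contain each known substring."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


# A few lines taken from the excerpt, used here as sample input.
SAMPLE = """\
firewire_sbp2: fw2.0: sbp2_scsi_abort
firewire_sbp2: fw2.0: sbp2_scsi_abort
end_request: I/O error, dev sdb, sector 0
Buffer I/O error on device sdb, logical block 0
firewire_ohci: isochronous cycle inconsistent
firewire_sbp2: fw2.0: reconnected to LUN 0000 (0 retries)
firewire_core: phy config: card 0, new root=ffc2, gap_count=7
"""

if __name__ == "__main__":
    for label, count in sorted(tally_events(SAMPLE).items()):
        print(label, count)
```

To use it on a full log, pass the text of a saved dmesg file to `tally_events` instead of `SAMPLE`.
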
From: Martin Mokrejs on 23 Jul 2010 14:40

Hi Jay,
thank you for your thorough explanation. Let me just briefly rephrase what I have. The topology is as of now:

A                                                  B

VT6306                                          R5C552
| | |                                              |
|   ------------- firewire-net+sbp2--------------- |
| --- unused port
|
------ external drive enclosure (2 FW ports, 1 USB port, one PWR port)

In other words, I did not plug two firewire cables into the two sockets on the external drive enclosure, each coming from a different computer. I am not that desperate a user. ;) I suspect you thought I have the external drive in between both computers. No, I don't.

Computer A (desktop) has a VT6306 Fire II IEEE 1394 chip with 3 ports, one connected to the external hard drive, another to computer B (laptop) used for TCP/IP networking.

Computer B has a Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist the firewire_sbp2 driver so that the laptop does not try to access the external hard drive.

Yes, I have realized that the old firewire modules take precedence over the new JuJu stuff. I used only the JuJu driver, but after experiencing problems I decided to also compile the old drivers as modules. I will reproduce this with the JuJu drivers alone once again. (I have given that up meanwhile and I use the USB port to transfer the data now, but will retry and re-post.)

Thanks,
Martin

Jay Fenlason wrote:
> On Fri, Jul 23, 2010 at 04:09:21PM +0200, Martin Mokrejs wrote:
>> Hi,
>> I bought an external hard drive with firewire and USB interfaces (IcyBOX IB-250StUE-B). If I connect it to a desktop computer A I get a kernel crash during boot (see both attached dmesg-*.txt files).
>>
>> Further, a laptop computer B is connected to A via firewire as well, through the firewire-net module. I do not understand why, but on computer B I see complaints in dmesg from firewire_sbp2 about the external drive physically connected to computer A! Is that a bug or a feature? Nevertheless, host B cannot really talk to the drive (see the snippet from the 2.6.34.1 kernel on the laptop below in the body of this email).
>>
>> Sorry for mixing the two issues into a single email. Maybe this is because of similar underlying issues? The desktop has 2 firewire ports and the laptop also has 2 ports. Given that both have firewire_net inserted into the running kernel, and that on both machines I see only a firewire0 interface and no additional firewire1 interface, I wonder whether the kernel realizes there are two physical ports on each computer; maybe it mixes together some data or takes an action on the wrong port. You may think of yesterday's email, under the subject "2.6.31.14: firewire_net issue in generic_sync_sb_inodes", as yet another kernel crash and bug in the JuJu firewire stack.
>
> I think you are confused about how firewire works. Firewire is a bus, not a point-to-point technology. Any device on a firewire bus may talk to any other device on the same bus, whether they are directly physically connected or not. Otherwise you would not be able to daisy-chain disks, cameras, audio devices, etc. The only way you can have multiple firewire busses on a device is to have multiple firewire controllers. (You can do this by putting two firewire PCI cards in a computer, or by putting a FireWire CardBus card in a laptop with an on-board firewire controller, but I don't know of any machines that ship with multiple firewire busses.) Each controller can have any number (up to 63, with 1-3 being the most common) of ports on it.
>
> From what you've said above, each of your computers has a single firewire controller in it (lspci will tell you for sure). One of the computers has two ports on its controller, and the other has three. (This is not uncommon on many firewire based systems because the commonly used PHY chips support up to three ports.)
>
> Hard disks (and things that emulate them) generally allow only a single host to control them at a time. (Ignoring for the moment specialized "multi-initiator" capable hardware used for shared storage in clustering applications.) This is because if two machines mount the same (non clustering-aware) filesystem at the same time, they will write over each other's changes to the filesystem and eventually trash the filesystem's data structures beyond repair. So when you have created a single bus with two computers and a single hard disk on it, it's unsurprising that only one of the computers can successfully talk to it.
>
> I see in your dmesg that your 2.6.32.16-default computer is using the old ieee1394 stack, and not the firewire stack, so it should not have loaded firewire-net. It should have loaded eth1394 instead. I'm troubled by the traceback in nodemgr, but since the old stack is unmaintained and buggy, your first step should be to completely eliminate ieee1394, ohci1394, sbp2 and eth1394 from it and replace them with firewire-core, firewire-ohci, firewire-sbp2, and firewire-net on it. Nobody is going to bother to debug the old stack at this point.
>
> You should then either blacklist firewire-sbp2 on the computer that you do not want to use the external disk from, or tell firewire-sbp2 not to try to attach to it (I believe Stefan Richter wrote directions on how to do that a year or two ago. Check the linux1394-devel archives). Otherwise both machines will race to connect to it, one of them will win, and the other will get errors.
>
> -- JF

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Martin Mokrejs on 23 Jul 2010 16:30

Hi Jay,
I removed the old firewire modules from the kernel .config and recompiled and reinstalled the kernel and modules on computer A, and disabled firewire_sbp2 on host B. I have power problems with the drive when on firewire, though. It seems the desktop PC (ASUS P5K WS 1.0004) is not able to feed the WD 1.0TB 2.5" 5200rpm drive. I tried the firewire ports on the front of the box as well as those on the motherboard. I even plugged in the USB power "jack"; no luck. If I use the USB port + USB power it works fine. I just unplugged the device after not being able to even run "fdisk /dev/sdh". At the moment I have a screwed-up superblock on the filesystem and will have to restart from scratch. The attached dmesg talks about "Device offlined - not ready after error recovery" but I hope this is a temporary issue, and I just disconnected the device at the very end.
BTW, some driver is not ACPI compliant according to dmesg.

The chip in the external IcyBox IB-250StUE-B is a JMicron JMB 353 doing the USB+FireWire+SATA work for the 2.5" WD drive.
Martin

Jay Fenlason wrote:
> On Fri, Jul 23, 2010 at 08:38:26PM +0200, Martin Mokrejs wrote:
>> Hi Jay,
>> thank you for your thorough explanation. Let me just briefly rephrase what I have. The topology is as of now:
>>
>> A                                                  B
>>
>> VT6306                                          R5C552
>> | | |                                              |
>> |   ------------- firewire-net+sbp2--------------- |
>> | --- unused port
>> |
>> ------ external drive enclosure (2 FW ports, 1 USB port, one PWR port)
>>
>> In other words, I did not plug two firewire cables into the two sockets on the external drive enclosure, each coming from a different computer. I am not that desperate a user. ;) I suspect you thought I have the external drive in between both computers. No, I don't.
>
> The firewire bus is very egalitarian (unlike USB). All devices on the bus (disk, camera, computer, etc) are devices on the bus. The bus doesn't care which device is in the middle of a three-node configuration. (Well, unless some of the devices are capable of speeds that the other devices can't do, but that's a special case.)
>
>> Computer A (desktop) has a VT6306 Fire II IEEE 1394 chip with 3 ports, one connected to the external hard drive, another to computer B (laptop) used for TCP/IP networking.
>>
>> Computer B has a Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist the firewire_sbp2 driver so that the laptop does not try to access the external hard drive.
>
> Yes.
>
>> Yes, I have realized that the old firewire modules take precedence over the new JuJu stuff. I used only the JuJu driver, but after experiencing problems I decided to also compile the old drivers as modules. I will reproduce this with the JuJu drivers alone once again. (I have given that up meanwhile and I use the USB port to transfer the data now, but will retry and re-post.)
>
> I'm curious about how firewire-net is doing. I know eth1394 can be taken down with a simple ping flood, so I hope it is more resilient than that.

From: Stefan Richter on 23 Jul 2010 17:20

Martin Mokrejs wrote at LKML:
> Hi Jay,
> Jay Fenlason wrote:
>> On Fri, Jul 23, 2010 at 04:09:21PM +0200, Martin Mokrejs wrote:
>>> Hi,
>>> I bought an external hard drive with firewire and USB interfaces (IcyBOX IB-250StUE-B). If I connect it to a desktop computer A I get a kernel crash during boot (see both attached dmesg-*.txt files).

The crash which you reported is in sbp2 (of the old ieee1394 stack alias linux1394), not in firewire-sbp2 (of the new firewire stack alias juju).

>>> Further, a laptop computer B is connected to A via firewire as well, through the firewire-net module. I do not understand why, but on computer B I see complaints in dmesg from firewire_sbp2 about the external drive physically connected to computer A! Is that a bug or a feature? Nevertheless, host B cannot really talk to the drive (see the snippet from the 2.6.34.1 kernel on the laptop below in the body of this email).

I comment on this further below.

>>> Sorry for mixing the two issues into a single email. Maybe this is because of similar underlying issues? The desktop has 2 firewire ports and the laptop also has 2 ports. Given that both have firewire_net inserted into the running kernel, and that on both machines I see only a firewire0 interface and no additional firewire1 interface, I wonder whether the kernel realizes there are two physical ports on each computer; maybe it mixes together some data or takes an action on the wrong port. You may think of yesterday's email, under the subject "2.6.31.14: firewire_net issue in generic_sync_sb_inodes", as yet another kernel crash and bug in the JuJu firewire stack.

I missed that thread, and almost missed this one. You could have Cc'd linux1394-devel. Chances to get help on specific driver issues on LKML are slim.

The crash log from "2.6.31.14: firewire_net issue in generic_sync_sb_inodes" does not point to firewire-net directly. But perhaps firewire-net corrupted some memory before that crash. There was a bugfix for firewire-net in 2.6.33. But I believe that fix is only necessary on SMP/multicore machines; your notebook seems to be a single-core machine.

>> I think you are confused about how firewire works. Firewire is a bus, not a point-to-point technology. Any device on a firewire bus may talk to any other device on the same bus, whether they are directly physically connected or not. Otherwise you would not be able to daisy-chain disks, cameras, audio devices, etc. The only way you can have multiple firewire busses on a device is to have multiple firewire controllers. (You can do this by putting two firewire PCI cards in a computer, or by putting a FireWire CardBus card in a laptop with an on-board firewire controller, but I don't know of any machines that ship with multiple firewire busses.) Each controller can have any number (up to 63, with 1-3 being the most common) of ports on it.
>>
>> From what you've said above, each of your computers has a single firewire controller in it (lspci will tell you for sure). One of the computers has two ports on its controller, and the other has three. (This is not uncommon on many firewire based systems because the commonly used PHY chips support up to three ports.)

Absolutely; FireWire devices (including PCs/laptops) almost always have only a single FireWire link-layer interface, even if they have multiple FireWire physical interfaces. A FireWire device with several ports repeats all traffic between these ports. (Except in case of speed capability differences of different bus segments.)

Furthermore, unlike the host-centric USB, FireWire is a peer-to-peer bus or network. All nodes that are present on one bus see each other and can communicate with each other regardless of the particular topology.

>> Hard disks (and things that emulate them) generally allow only a single host to control them at a time. (Ignoring for the moment specialized "multi-initiator" capable hardware used for shared storage in clustering applications.) This is because if two machines mount the same (non clustering-aware) filesystem at the same time, they will write over each other's changes to the filesystem and eventually trash the filesystem's data structures beyond repair. So when you have created a single bus with two computers and a single hard disk on it, it's unsurprising that only one of the computers can successfully talk to it.
>>
>> I see in your dmesg that your 2.6.32.16-default computer is using the old ieee1394 stack, and not the firewire stack, so it should not have loaded firewire-net. It should have loaded eth1394 instead.

On Gentoo Linux and many other distributions, eth1394 is blacklisted (i.e. never automatically loaded). This is because distributors don't like it when eth1394 messes up the "eth%d" networking interface namespace. firewire-net on the other hand is not blacklisted (but also won't intermix with the names of Ethernet interfaces). Hence, if a Linux PC which has firewire-net installed is plugged into a bus with an IPv4-over-1394 capable node present, firewire-net will be auto-loaded regardless of whether the FireWire controller is driven by ohci1394 or firewire-ohci at that time. If ohci1394 is at the helm at that moment, firewire-net will of course do nothing but take up space.

>> I'm troubled by the traceback in nodemgr, but since the old stack is unmaintained and buggy, your first step should be to completely eliminate ieee1394, ohci1394, sbp2 and eth1394 from it and replace them with firewire-core, firewire-ohci, firewire-sbp2, and firewire-net on it. Nobody is going to bother to debug the old stack at this point.

Exactly. ieee1394, sbp2, ohci1394 etc. are planned to be deleted in 2.6.37(rc-1), which will apparently be in less than 3 months. While a crash bug is something pretty severe, there are simply no resources to chase them anymore.

>> You should then either blacklist firewire-sbp2 on the computer that you do not want to use the external disk from, or tell firewire-sbp2 not to try to attach to it (I believe Stefan Richter wrote directions on how to do that a year or two ago. Check the linux1394-devel archives).

Did I? Right now I would say, just blacklist firewire-sbp2 (and sbp2) on the machine that is not supposed to log into the disk.

>> Otherwise both machines will race to connect to it, one of them will win, and the other will get errors.
>>
>> -- JF

(Which is harmless except for the fact that the initiator which wins the login might not be the one that you wanted.)

> thank you for your thorough explanation. Let me just briefly rephrase what I have. The topology is as of now:
>
> A                                                  B
>
> VT6306                                          R5C552
> | | |                                              |
> |   ------------- firewire-net+sbp2--------------- |
> | --- unused port
> |
> ------ external drive enclosure (2 FW ports, 1 USB port, one PWR port)
>
> In other words, I did not plug two firewire cables into the two sockets on the external drive enclosure, each coming from a different computer. I am not that desperate a user. ;) I suspect you thought I have the external drive in between both computers. No, I don't.
>
> Computer A (desktop) has a VT6306 Fire II IEEE 1394 chip with 3 ports, one connected to the external hard drive, another to computer B (laptop) used for TCP/IP networking.

For IPv4 over 1394 as well as for SBP-2 it does not matter whether the physical order is disk--A--B or A--disk--B or A--B--disk.

> Computer B has a Ricoh Co Ltd R5C552 IEEE 1394 chip. I should blacklist the firewire_sbp2 driver so that the laptop does not try to access the external hard drive.
>
> Yes, I have realized that the old firewire modules take precedence over the new JuJu stuff. I used only the JuJu driver, but after experiencing problems I decided to also compile the old drivers as modules. I will reproduce this with the JuJu drivers alone once again. (I have given that up meanwhile and I use the USB port to transfer the data now, but will retry and re-post.)

Older dual 1394a + USB 2.0 IcyBoxes were based on the infamous Prolific PL3507 chip. That one's FireWire part is extremely unreliable under any OS. Some PL3507 based disks could be made to work /somewhat/ better by installing the latest firmware from Prolific on it. Have a look at https://ieee1394.wiki.kernel.org/index.php/Firmware_Downloads . Prolific's FireWire firmware updater utility works via USB 2.0 and unsurprisingly only runs on Windows.
--
Stefan Richter
-=====-==-=- -=== =-===
http://arcgraph.de/sr/

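The remark that the physical order (disk--A--B, A--disk--B or A--B--disk) does not matter can be illustrated with a toy model. This is only a sketch under a simplifying assumption: the bus is modelled as an undirected graph in which every node repeats traffic to its neighbours, and speed differences between segments are ignored. The `reachable` helper and the node names are invented for the example and are not part of any FireWire driver.

```python
from collections import deque


def reachable(links, start):
    """Return the set of nodes that `start` can see over the given cable links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# The three cabling orders named in the message above.
ORDERINGS = {
    "disk--A--B": [("disk", "A"), ("A", "B")],
    "A--disk--B": [("A", "disk"), ("disk", "B")],
    "A--B--disk": [("A", "B"), ("B", "disk")],
}

if __name__ == "__main__":
    for name, links in ORDERINGS.items():
        # In every ordering, laptop B ends up on the same bus as the disk.
        print(name, sorted(reachable(links, "B")))
```

In this model B sees the disk in all three cases, which matches the observation that firewire_sbp2 on the laptop tried to log in to a drive that was cabled to the desktop.
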
From: Stefan Richter on 23 Jul 2010 17:50
Martin Mokrejs wrote at LKML:
> Hi Jay,
> I removed the old firewire modules from the kernel .config and recompiled and reinstalled the kernel and modules on computer A, and disabled firewire_sbp2 on host B.
> I have power problems with the drive when on firewire, though. It seems the desktop PC (ASUS P5K WS 1.0004) is not able to feed the WD 1.0TB 2.5" 5200rpm drive.

It should be able to, unless the ASUS motherboard is miswired. A standards compliant bus power provider must be able to supply at least 1.5 A. Most PCs wire their FireWire ports up to 12 Volts, which gives you at least 18 Watts. You can drive several of these WD drives off that.

> I tried the firewire ports on the front of the box as well as those on the motherboard.

Front panel connectors may be unreliable. Often the connections from motherboard to front panel are cheap and outside the IEEE 1394 electrical specification.

> I even plugged in the USB power "jack"; no luck. If I use the USB port + USB power it works fine. I just unplugged the device after not being able to even run "fdisk /dev/sdh". At the moment I have a screwed-up superblock on the filesystem and will have to restart from scratch. The attached dmesg talks about "Device offlined - not ready after error recovery" but I hope this is a temporary issue, and I just disconnected the device at the very end.
> BTW, some driver is not ACPI compliant according to dmesg.
>
> The chip in the external IcyBox IB-250StUE-B is a JMicron JMB 353 doing the USB+FireWire+SATA work for the 2.5" WD drive.
> Martin

Oh, I haven't heard of that chip before. So far I was only aware of PCIe--FireWire chips from JMicron. PCIe--FireWire chips are of course entirely different beasts than SATA--FireWire chips, but those other FireWire offerings from JMicron do not inspire confidence.

Your dmesg shows multiple bus resets and one of the three nodes disappearing from the bus from time to time. This points to a problem at the physical layer (i.e. highly unreliable hardware) which fundamentally cannot be solved by software.
--
Stefan Richter
-=====-==-=- -=== =-===
http://arcgraph.de/sr/
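The power figure quoted above is simple arithmetic: 1.5 A at 12 V is 18 W. The sketch below spells that out and estimates how many drives fit into that budget. The per-drive figure of 5 W is purely an assumption for illustration (the thread does not give the WD drive's actual draw), and the calculation ignores losses in the enclosure's bridge chip and voltage conversion.

```python
def bus_power_watts(volts, amps):
    """Electrical power available on the bus: P = U * I."""
    return volts * amps


def drives_supported(budget_watts, drive_watts):
    """Whole number of drives that fit into the power budget."""
    return int(budget_watts // drive_watts)


# Figures from the message: at least 1.5 A, typically wired to 12 V.
BUS_VOLTS = 12.0
BUS_AMPS = 1.5

# Assumption, not from the thread: hypothetical peak draw of one 2.5" drive.
ASSUMED_DRIVE_WATTS = 5.0

if __name__ == "__main__":
    budget = bus_power_watts(BUS_VOLTS, BUS_AMPS)
    print("bus budget in watts:", budget)
    print("drives at assumed draw:", drives_supported(budget, ASSUMED_DRIVE_WATTS))
```

With a measured or datasheet value in place of `ASSUMED_DRIVE_WATTS`, the same two functions give a rough check of whether bus power alone should be enough.
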