From: Frank Miles on
.... ok, started...

>> [snip]
>>
>>>I fail to see what it's doing, but I cannot see any reference to "eth1",
>>>it's like only one interface is being recognized :-?
>>>
>>>What is the output of "dmesg | grep eth"?
>>
>> [ 6.317161] eth1: RTL8168d/8111d at 0xffffc90000c4e000,xx:xx:xx:xx:xx:xx, XID 083000c0 IRQ 32
>> [ 6.384830] eth1: unable to apply firmware patch
>> [ 7.190453] udev: renamed network interface eth1 to eth0
>> [ 7.229390] udev: renamed network interface eth0_rename to eth1
>> [ 11.276999] r8169: eth0: link up
>> [ 11.277005] r8169: eth0: link up
>> [ 12.215716] eth1: setting full-duplex.
>> [ 21.531029] eth0: no IPv6 routers present
>> [ 22.599867] eth1: no IPv6 routers present
>>
>> Again, eth1 is working fine; eth0 seems completely
>> blocked/nonfunctional, despite all the configuration files and netstats
>> looking fine.
>
>Errr, sir... something goes wrong here.
>
>As per your "/etc/udev/rules.d/70-persistent-net.rules":
>
>eth0 -> realtek
>eth1 -> 3com
>
>But that is not what dmesg says above.

That's the reason for my earlier fact-free speculation, based on a kern.log entry:
> Perhaps the kernel brings eth1 into existence by first establishing it as
> eth0, then renaming it to eth1; then bringing the "real" eth0 into existence.
The kern.log entries don't appear in the dmesg output.
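
For what it's worth, the rename lines in the dmesg excerpt above can be followed mechanically. The sketch below is only a rough aid: the two regular expressions are assumptions fitted to this one excerpt, and the 3com card has no probe line in it, so it shows up under a placeholder description.

```python
import re

# Lines copied from the dmesg output quoted above.
DMESG = """\
[ 6.317161] eth1: RTL8168d/8111d at 0xffffc90000c4e000,xx:xx:xx:xx:xx:xx, XID 083000c0 IRQ 32
[ 7.190453] udev: renamed network interface eth1 to eth0
[ 7.229390] udev: renamed network interface eth0_rename to eth1
"""

# Assumed patterns; they only need to fit the lines in this excerpt.
RENAME = re.compile(r"udev: renamed network interface (\S+) to (\S+)")
PROBE = re.compile(r"\]\s+(eth\d+): (RTL\S+)")


def final_names(log):
    """Map each final interface name to the chip that ended up behind it."""
    current = {}  # current interface name -> chip description
    for line in log.splitlines():
        probe = PROBE.search(line)
        if probe:
            current[probe.group(1)] = probe.group(2)
            continue
        rename = RENAME.search(line)
        if rename:
            old, new = rename.groups()
            # A device with no probe line in the excerpt gets a placeholder.
            current[new] = current.pop(old, "unidentified (was %s)" % old)
    return current


for name, chip in sorted(final_names(DMESG).items()):
    print(name, "->", chip)
```

Read this way, the Realtek is probed as eth1 and then renamed to eth0, which would agree with the eth0 -> realtek line in 70-persistent-net.rules rather than contradict it.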

>Also, there is no "link up" or "link down" for eth1, but *two* "link up"
>messages for eth0. Not sure how to interpret that.

I don't know how to interpret that either, but the message is completely
unchanged other than time - why assume it is referring to the 3com card
in either case? And in any event, the 3com card is functioning - it's
the realtek that isn't.

>> I made a minor effort earlier to suppress the IPv6 modules, but [a]
>> didn't succeed; and [b] hadn't suppressed them earlier with the
>> one-interface system so wasn't convinced it was worth trying - why
>> shouldn't this cause eth1 to quit as well as eth0? Also the previous
>> system showed some indications of IPv6 in its reports, and it worked
>> fine.
>
>I don't think this issue can have any relation with ipv6 :-?.
>
>How about your "/etc/network/interfaces"?
>
>Besides, you can make a quick probe by disabling "eth1" and test if the
>network works as expected ("ping" et al) and then disable "eth0" and
>perform the same test. I mean, test the network adapters "separately".
>
>Greetings,
>
>--
>Camaleón

Tom H:

Thanks for your queries also. I agree that the ipv6 warning is probably
not an issue. What follows may help answer the questions that both of
you have raised:

-----------------------------------------
....and diverted and continued...

I decided to go back and re-establish the system when it only had a single NIC.
Unfortunately this is the r8169, which has the possible firmware issue. I was
_unable_ to get that working, though it had worked at one time. In the hopes
of satisfying your curiosity, here are some reports from that experiment:

puffin:~# netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0

puffin:~# arp -a
grebe (192.168.0.4) at xx:xx:xx:xx:xx:xx [ether] on eth0

puffin:# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
0.0.0.0 192.168.0.10 0.0.0.0 UG 0 0 0 eth0

puffin:# ping 192.168.0.4
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
^C
--- 192.168.0.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2016ms

puffin:~# iptables -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

puffin:~# /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:192.168.0.10 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::6ef0:49ff:fe08:a40/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:18 errors:0 dropped:0 overruns:0 frame:0
TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1080 (1.0 KiB) TX bytes:594 (594.0 B)
Interrupt:32 Base address:0x6000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:69 errors:0 dropped:0 overruns:0 frame:0
TX packets:69 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:18577 (18.1 KiB) TX bytes:18577 (18.1 KiB)

puffin:~# dmesg | fgrep eth
[ 0.534289] Driver 'rtc_cmos' needs updating - please use bus_type methods
[ 6.658237] eth0: RTL8168d/8111d at 0xffffc90000c56000, xx:xx:xx:xx:xx:xx, XID 083000c0 IRQ 32
[ 6.808657] eth0: unable to apply firmware patch
[ 11.283432] r8169: eth0: link up
[ 11.283438] r8169: eth0: link up
[ 21.431334] eth0: no IPv6 routers present

puffin:~# ping 192.168.0.4
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
^C

puffin:# route add default gw puffin
puffin:# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
0.0.0.0 192.168.0.10 0.0.0.0 UG 0 0 0 eth0

puffin:# ping 192.168.0.4
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
^C
--- 192.168.0.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2016ms
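
A side observation on the routing tables above: the default gateway shown (192.168.0.10) is the same address ifconfig reports for eth0 on puffin itself. The sketch below, using only Python's standard ipaddress module, checks that and also whether the ping target is on the local subnet. The function names are made up for illustration.

```python
import ipaddress


def gateway_is_self(local_addr, gateway):
    """True when the default gateway is the host's own address."""
    return ipaddress.ip_address(local_addr) == ipaddress.ip_address(gateway)


def on_link(local_addr, netmask, target):
    """True when target lies inside the subnet configured on the interface."""
    net = ipaddress.ip_network("%s/%s" % (local_addr, netmask), strict=False)
    return ipaddress.ip_address(target) in net


# Values taken from the ifconfig and route -n output above.
print(gateway_is_self("192.168.0.10", "192.168.0.10"))
print(on_link("192.168.0.10", "255.255.255.0", "192.168.0.4"))
```

Both checks come out true for these values. Since 192.168.0.4 is on-link, the self-referencing default route should not by itself explain the lost pings, but it would need correcting before anything off-subnet could work.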

-----------------------------------------
Briefly, my thinking now is that I somehow managed to mess up the NIC firmware
situation that had originally been set up, probably when I updated the system.
Before this change, I'd loaded the whole system using eth0 - tens of GB - with
no problem!

There are several bug reports about the r8169 and its (missing) firmware
under the 2.6.32 kernel [see, for example, #561309].
I'll have to see if I can scrounge another NIC as a temporary work-around.

Thanks again for your diagnostic tips; they have prodded my thinking. Any
other ideas welcome!

Frank


--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
From: Frank Miles on
Thanks so much to Stan, Tom H, and Camaleón!

It seems that the consensus is that it's a NIC problem. In case
it wasn't previously clear, the RealTek 8169 is part of the Gigabyte
motherboard.

I thought that I'd escaped non-free-firmware hell by getting a MB
with the graphics based on an Intel chip. Never had a problem before,
but then I usually stand far back from the bleeding edge.

Stan, like you I usually use my own-build kernels. But I'd had
problems getting my own kernels to run with full graphics
capabilities, so had fallen back on the Debian 2.6.32-trunk.

What I'm going to be doing in the short term is turning the
RealTek off (BIOS setting), and installing another NIC. I should
be able to get things running this way. I will post again once
I've done this. Longer term, I'll try to get the RealTek running.
All this flailing about has put me behind on other things, so that
may not be right away.

Thanks again to you all... it's been real educational.

-Frank


From: Stan Hoeppner on
Frank Miles put forth on 2/8/2010 10:32 AM:
> Thanks so much to Stan, Tom H, and Camaleón!
>
> It seems that the consensus is that it's a NIC problem. In case
> it wasn't previously clear, the RealTek 8169 is part of the Gigabyte
> motherboard.
>
> I thought that I'd escaped non-free-firmware hell by getting a MB
> with the graphics based on an Intel chip. Never had a problem before,
> but then I usually stand far back from the bleeding edge.
>
> Stan, like you I usually use my own-build kernels. But I'd had
> problems getting my own kernels to run with full graphics
> capabilities, so had fallen back on the Debian 2.6.32-trunk.
>
> What I'm going to be doing in the short term is turning the
> RealTek off (BIOS setting), and installing another NIC. I should
> be able to get things running this way. I will post again once
> I've done this. Longer term, I'll try to get the RealTek running. All
> this flailing about has put me behind on other things, so that
> may not be right away.
>
> Thanks again to you all... it's been real educational.

You're welcome, Frank. I'm sure the Debian kernel team will get the 8169
driver/firmware issue worked out within a point release or two.

Worth a look:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561309
http://www.linux-archive.org/debian-kernel/283643-2-6-32-experimental.html
http://www.newegg.com/Product/Product.aspx?Item=N82E16833180026
http://www.newegg.com/Product/Product.aspx?Item=N82E16833106123
http://www.newegg.com/Product/Product.aspx?Item=N82E16833106033

--
Stan


From: Andrei Popescu on
On Mon,08.Feb.10, 01:15:43, Stan Hoeppner wrote:

> > Perhaps the kernel brings eth1 into existence by first establishing it as
> > eth0, then renaming it to eth1; then bringing the "real" eth0 into
> > existence.
>
> The above can happen when you add NICs to the system. I hate UDEV for this, and
> it took me the better part of a day to figure this out a few months ago. UDEV
> names the devices based on PCI bus slot number order. If you add a new PCI NIC
> into an empty slot with a lower number than that of the NIC already in the
> system, UDEV makes the lowest slot number eth0 and the higher slot number eth1.

I seem to recall such issues in the (quite distant) past.

> The solution is to change the PCI slot order or create a UDEV static naming
> rule based on MAC address that overrides the slot number ordering.

This is already done in /etc/udev/rules.d/70-persistent-net.rules (which
is actually generated by another rule).
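
As an illustration of what such a MAC-keyed entry looks like, here is a small sketch that builds one rule line. The match keys follow the layout I recall the generator writing on Debian systems of this era, so treat them as an assumption and compare against your own file. The MAC addresses are placeholders, and the helper name is hypothetical.

```python
def persistent_net_rule(mac, name):
    """Build one MAC-keyed line in the style of 70-persistent-net.rules."""
    mac = mac.strip().lower()
    parts = mac.split(":")
    hex_digits = "0123456789abcdef"
    valid = len(parts) == 6 and all(
        len(p) == 2 and all(c in hex_digits for c in p) for p in parts
    )
    if not valid:
        raise ValueError("expected a MAC like 00:11:22:33:44:55, got %r" % mac)
    # Assumed key layout; verify against the file udev generated on your system.
    return (
        'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{address}=="%s", ATTR{dev_id}=="0x0", ATTR{type}=="1", '
        'KERNEL=="eth*", NAME="%s"' % (mac, name)
    )


# Placeholder MACs, not taken from this thread.
print(persistent_net_rule("00:11:22:33:44:55", "eth0"))
print(persistent_net_rule("00:11:22:33:44:66", "eth1"))
```

Because each line matches on the hardware address, the name assignment does not depend on the order in which the cards are detected.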

Regards,
Andrei
--
Offtopic discussions among Debian users and developers:
http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic
From: Stan Hoeppner on
Andrei Popescu put forth on 2/8/2010 2:29 PM:
> On Mon,08.Feb.10, 01:15:43, Stan Hoeppner wrote:
>
>>> Perhaps the kernel brings eth1 into existence by first establishing it as
>>> eth0, then renaming it to eth1; then bringing the "real" eth0 into
>>> existence.
>>
>> The above can happen when you add NICs to the system. I hate UDEV for this, and
>> it took me the better part of a day to figure this out a few months ago. UDEV
>> names the devices based on PCI bus slot number order. If you add a new PCI NIC
>> into an empty slot with a lower number than that of the NIC already in the
>> system, UDEV makes the lowest slot number eth0 and the higher slot number eth1.
>
> I seem to recall such issues in the (quite distant) past.
>
>> The solution is to change the PCI slot order or create a UDEV static naming
>> rule based on MAC address that overrides the slot number ordering.
>
> This is already done in /etc/udev/rules.d/70-persistent-net.rules (which
> is actually generated by another rule).

So, are you saying it didn't happen? Couldn't have happened? Shouldn't have
happened? I'm imagining things? Are you kidding?

It broke. I fixed it by manually editing the precise file you list above.
Maybe it happened because I have ACPI disabled on this old (1998) 440BX MB due
to its ACPI implementation being buggy. Maybe it's because I have power
management disabled. Maybe it's a BIOS bug. Maybe it happened because both
cards use the 8255x chip (though one was an NC3121 with 82558 and the other an
actual Intel Pro 100 Server Adapter with an 82559). The cause could have been
any number of things.

Regardless, it happened. I fixed it manually. It did not properly
auto-reconfigure.

--
Stan

