From: lrhorer on
>>>> That's a kernel routing problem - which is why it's a kernel
>>>> parameter that is altered above.
>>
>>>I lost you, there. How is the fact the ISP does not try to maintain
>>>its client's IP address in its DHCP server a kernel routing problem?
>>
>> DHCP is an Ethernet service, not PPP. The kernel routing problem
>> occurs when the system _NORMALLY_ has a default route using a ppp
>> interface - as either booting to a configuration where pppd is run by
>> default and the link may yo-yo, or in a 'demand' mode.
>
> ppp is a peer to peer service. This means that there need be no IP
> addresses on the line--- you send stuff to only one place, your peer.

Well, yes and no. When the application layer hands off a payload to
layer 4, which then passes it to layer 3, layer 3 won't know where to
send the stream unless there is either a default layer 3 address in the
routing table or a specific gateway address in the routing table, and
the routing table can't have an address in it unless it has been put
in. The layer 3 addresses of course must also be bound to layer 2
addresses, at least one of which would be the ppp address in question.
Otherwise, the network layer would just discard the payload. Any
protocol which dealt directly with layer 2 could carry on a
conversation, of course, but TCP/IP would be dead as a doornail.
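
To make the routing-table point concrete, here is a minimal sketch of a layer 3 lookup in Python. It is only an illustration of longest-prefix matching, not how the kernel implements it; the table contents and the 203.0.113.9 destination are made up for the example.

```python
import ipaddress

def lookup(routes, destination):
    """Return the (network, gateway, iface) route with the longest
    matching prefix, or None when nothing in the table matches."""
    dst = ipaddress.ip_address(destination)
    matches = [r for r in routes if dst in r[0]]
    if not matches:
        return None  # no route: layer 3 has nowhere to send the packet
    return max(matches, key=lambda r: r[0].prefixlen)

# Hypothetical table: one local subnet and, at first, no default route.
routes = [(ipaddress.ip_network("192.168.1.0/24"), None, "eth0")]
print(lookup(routes, "203.0.113.9"))     # None

# Once a default route has been put in, the same lookup succeeds.
routes.append((ipaddress.ip_network("0.0.0.0/0"), None, "ppp0"))
print(lookup(routes, "203.0.113.9")[2])  # ppp0
```

Until something (pppd, in this thread) puts the default route in, the lookup has nothing to return, which is the "dead as a doornail" case.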

> It is also not necessary that the address delivered on the ppp line is
> the same as the address that either side uses.

Well, yeah, that's partly true. Once layer 3 has handed off to layer
2, the next-hop information is discarded, so as long as the IP layer
can properly resolve the correct layer 2 stack to which to hand the
payload, everything else would work, UNLESS the conversation is
supposed to be between the two hosts on the opposite sides of the ppp
link.

> However, in normal operation, the important addresses are the address
> the ISP sends you to refer to him ( but again he does not care,
> because there is only one ppp connection) and for him the important
> address is yours.

Again, that's true unless the two hosts are trying to talk to one
another. If host A has a different address for itself than the address
host B has, then when host B sends a packet to host A with a destination
address that does not exist on host A, host A is either going to
forward it out the appropriate gateway if the address is not on a local
subnet, or just drop the packet if it is on a local subnet. Either way,
Host A never passes the payload to layer 3 and beyond. Pass-through
packets would work OK, because they have destinations other than the
local host anyway.

> Now the usual situation is that the ISP is asked both for his IP
> address and your IP address. the ISP has a certain limited number of
> addresses,

This is no less true for a dial-up ISP than a broadband ISP. It's true
the dial-up ISP can probably get by with somewhat fewer IP addresses,
since there is never going to be a time when every user is online. This
is not the case for a broadband provider. It's probably not a factor of
three, though.

> and since on the telephone systems, many users will call
> up, there is a huge disincentive to assigning the same address to the
> same caller ( that would mean you would have to know who the caller
> is).

Of course.

> Thus the ppp addresses on each connection tend to be different.
> Now, CDMA broadband may be different, but again there is no incentive
> for them to go to the trouble of giving you the same address, and it
> is hard to do ( how do you identify the remote caller).

By his login and password, if nothing else. Caller ID would also work.
The OS can obtain the caller ID before the link is even established.
Now I am not arguing with you that it is particularly worth the trouble
for a dial-up ISP to try to provide its users with fixed IPs, but it is
certainly possible. This is supposed to be a broadband provider,
though, and for a broadband provider, the etiquette is a bit different.
From: Moe Trin on
On Fri, 15 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, in
article <slrnhl1jr2.kg5.unruh(a)wormhole.physics.ubc.ca>, unruh wrote:

>Moe Trin <ibuprofin(a)painkiller.example.tld.invalid> wrote:

>> The kernel routing problem occurs when the system _NORMALLY_ has a
>> default route using a ppp interface - as either booting to a
>> configuration where pppd is run by default and the link may yo-yo,
>> or in a 'demand' mode.

>ppp is a peer to peer service. This means that there need be no IP
>addresses on the line--- you send stuff to only one place, your peer.

Well, that's true, but in a convoluted way. If the peer lacks an IP
address, your routing table must be set showing a default route but
with no gateway - i.e.

Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 0.0.0.0 0.0.0.0 U 0 0 1234 ppp0

which will cause the kernel to assume that the whole world is directly
reachable on the ppp0 interface. The peer sees no difference in packet
headers, and forwards as per usual. You can't talk directly to the
peer, but that's often the case for procedural/policy reasons.
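
As a rough sketch of how one might spot that situation in a script, the Python below parses 'route -n' style output like the table above and picks out default routes that have no gateway. The function names are made up for the example, and it assumes the eight-column layout shown here.

```python
SAMPLE = """\
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 0.0.0.0 0.0.0.0 U 0 0 1234 ppp0
"""

def parse_route_table(text):
    """Turn 'route -n' style text into a list of dicts keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Destination":
            header = fields          # the column-name line
            continue
        if header and len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows

def gatewayless_defaults(rows):
    """Default routes (0.0.0.0/0) with no gateway and no 'G' flag."""
    return [r for r in rows
            if r["Destination"] == "0.0.0.0"
            and r["Genmask"] == "0.0.0.0"
            and r["Gateway"] == "0.0.0.0"
            and "G" not in r["Flags"]]

rows = parse_route_table(SAMPLE)
print([r["Iface"] for r in gatewayless_defaults(rows)])  # ['ppp0']
```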

>However, in normal operation, the important addresses are the address
>the ISP sends you to refer to him ( but again he does not care, because
>there is only one ppp connection) and for him the important address is
>yours.

The important address is yours, but that's so the kernel networking
stack knows what address to assume for the ppp0 interface. The
'/proc/sys/net/ipv4/ip_dynaddr' setting exists to avoid problems when
the interface and/or peer IP address changes - a common problem in
'demand' mode, where pppd assumes the peer is at 10.112.112.112
plus the interface number (see Changes-2.3 in the ppp source, under
2.3.10). The other thing is that the kernel isn't really prepared
to have its ppp0 interface address be 0.0.0.0 (not legal per RFC1122
section 3.2.1.3(a) and RFC0791 section 3.2).
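
The "10.112.112.112 plus the interface number" arithmetic can be written out as a small Python sketch. The helper names are invented for the example, and the reader function simply returns whatever integer is in the ip_dynaddr file (or None if the file is not there).

```python
import ipaddress

# Placeholder peer address used in 'demand' mode, per the text above.
DEMAND_PEER_BASE = ipaddress.IPv4Address("10.112.112.112")

def demand_peer_address(unit):
    """Placeholder peer address for interface ppp<unit>."""
    return DEMAND_PEER_BASE + unit

def read_ip_dynaddr(path="/proc/sys/net/ipv4/ip_dynaddr"):
    """Return the current ip_dynaddr value, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None

print(demand_peer_address(0))  # 10.112.112.112
print(demand_peer_address(1))  # 10.112.112.113
```

When the real link comes up the peer address changes from this placeholder to whatever was negotiated, which is the address change ip_dynaddr is there to cope with.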

>there is no incentive for them to go to the trouble of giving you the
>same address, and it is hard to do ( how do you identify the remote
>caller).

It's done with some regularity - USUALLY by having your system ask
for a specific address via the colon option to pppd. Not foolproof,
but done.

Old guy
From: lrhorer on

>>Not yet. I just got the router back this evening, and I haven't had
>>any time to look into the specifics of any of the failures. I'm
>>still running down some coding issues on one of the control modules.
>>One of the header files (not mine) has some issues, and so the code
>>won't compile. I probably won't get back to looking into the failure
>>modes until this weekend, if then. I did turn up yet another failure
>>mode, though. Oddly, the device got registered (non switched), but
>>wouldn't respond to ordinary system calls, so udev couldn't even
>>produce the device targets and the switch code could not flip it.
>
> It's looking a lot more as if this is the problem, rather than pppd.

I agree, although despite what unruh seems to believe, I haven't
fixated on this as a root cause, and I certainly have not even remotely
ruled out ppp problems or something bizarre that wvdial is doing as
being a proximate cause.

>>There's no question I am going to have to shut down and restore power
>>to the device when some of these failure modes are encountered. It
>>just shouldn't be the default action.
>
> Agree. As a guess, the software in the modem isn't being set/reset to
> the "right" mode, and needs the power cycle to reset to sane values.

That, or one of 10,000 other things. I'll probably figure out what,
eventually. If I come up with a solution which eliminates the symptoms
without knowing what is really happening underneath I probably won't
bother, though.

> From a serial data-link point of view, this probably means those
> init-strings aren't correct. For an analog modem, 'ATZ' generally
> means to reset to a _user_ specified stored configuration (that were
> saved using the 'AT&W0' command). This differs from 'AT&Fn' which
> resets the modem to ``factory'' settings. So if the user accidentally
> set up some weird configuration and then ran the 'AT&W0' command, the
> modem powers up to sane values, but gets reset to the strange
> condition by that 'ATZ'. That's why it's generally not a good
> init-string.

Yes, but gracefully exiting the session and then dialing back in
doesn't produce a problem. If resetting the modem kills it in one
case, one might expect that resetting it in the other case would have
the same effect.
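
For checking an init-string against the 'ATZ' versus 'AT&Fn' distinction quoted above, something like the following Python sketch would do. It only understands basic single-letter commands, ignores anything after a '+' (extended commands), and the function name is made up for the example.

```python
import re

# A basic AT command: optional '&', one letter, optional digits.
BASIC_CMD = re.compile(r"&?[A-Z]\d*")

def reset_commands(init_string):
    """List the reset-type commands in an init string and what each restores."""
    body = init_string.upper().replace(" ", "")
    if body.startswith("AT"):
        body = body[2:]
    body = body.split("+", 1)[0]   # skip extended (+...) commands
    found = []
    for token in BASIC_CMD.findall(body):
        if token.startswith("&F"):
            found.append((token, "factory defaults"))
        elif token.startswith("Z"):
            found.append((token, "stored user profile"))
    return found

print(reset_commands("ATZ"))    # [('Z', 'stored user profile')]
print(reset_commands("AT&F1"))  # [('&F1', 'factory defaults')]
print(reset_commands("AT&W0"))  # []
```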

>>'My point exactly. I have to address such eventualities, however.
>>That's one of the common issues with unattended devices which must
>>recover from unexpected issues autonomously.
>
> The power on reset does that, but I don't consider it to be the best
> solution.

Me, either.


>>Well, OK. I only briefly skimmed the man page for chat, but it
>>certainly holds promise. If I can make some headway with the device
>>control (or get totally stumped) this weekend, I'll have to look into
>>creating a chat script to be called by the startup script.
>
> Actually, 'chat' is called by pppd using the 'connect' option. I've
> shown two examples in this discussion.

Yeah, I see that. I'll dig into the details when I start to write the
script.

>>It's a rather cheesy, or at least obnoxiously restrictive thing to
>>do. It certainly prevents the subscriber from running any sort of
>>internet-facing server.
>
> I'm not going to defend the ISP, but that's just another fact of
> business. Usually, it's a method of separating the cheap - 'client
> only' customer from the expensive 'allowed to run servers' customer.

Well, dial-up service typically costs about $10 a month or so, so by
comparison this service, at $40 a month, is not cheap. Indeed, it is
about the same as most landline based broadband services, cost-wise.
It is less expensive than other wireless or satellite services, but not
THAT much less expensive.

>
>>Wvdial is calling pppd with the usepeerdns option, which by every
>>reading I have done in the pppd man page tells me it is pppd which is
>>updating /etc/resolv.conf.
> ^^^^^^^^^^^^^^^^
>>The addresses supplied by the peer (if any) are passed to the
>>/etc/ppp/ip-up script in the environment variables DNS1 and DNS2,
>>and the environment variable USEPEERDNS will be set to 1. In
>>addition, pppd will create an /etc/ppp/resolv.conf file
> ^^^^^^^^^^^^^^^^^^^^
> Different file.

OK, I see, and I see now what you were saying. 'My mistake. When I
read over that in the man page, I missed the /ppp/ and so I was
thinking pppd was writing the file.
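
For what it's worth, here is a minimal Python sketch of what an ip-up helper could do with the variables the man page describes (USEPEERDNS, DNS1, DNS2). The function name and the addresses in the example call are made up, and what to do with the resulting lines (which file, merge or replace) is left open.

```python
import os

def peer_dns_lines(env):
    """Build resolv.conf 'nameserver' lines from the variables that
    pppd passes to /etc/ppp/ip-up when usepeerdns is in effect."""
    if env.get("USEPEERDNS") != "1":
        return []
    return [f"nameserver {env[key]}" for key in ("DNS1", "DNS2") if env.get(key)]

# Example with made-up addresses:
print(peer_dns_lines({"USEPEERDNS": "1", "DNS1": "192.0.2.1", "DNS2": "192.0.2.2"}))
# ['nameserver 192.0.2.1', 'nameserver 192.0.2.2']

# Called from a real ip-up hook, one would pass os.environ instead:
# lines = peer_dns_lines(os.environ)
```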

From: lrhorer on
lrhorer wrote:

> No, I am insistent on tackling the more important issues before moving
> on to less important ones. Working out ppp issues, if any, is
> trivial. No matter how well or poorly ppp is working, however, the
> modem will never dial out if the computer can't issue commands to it.
> Some of these failure modes are encountered long before the router
> gets to the point of starting to dial out. I need to resolve those
> sorts of issues before I try to dig in to problems with ppp. I also
> have not yet had any time at all to work on any of the issues, no
> matter how large or how small.
>
>>> It's not a "router". It's a router, specifically a Linux router.
>>> It's running Debian "Lenny" under kernel 2.6.26-2-686. The version
>>> of pppd is 2.4.4 Rel 10.1, and of course it is from the Debian
>>> Stable repository.
>>
>> Thanks. Please enable logging for pppd.
>
> It's already enabled. Next time I get a failure in a dialing session,
> I'll let you know what it says. It will probably be several days,
> unless I get lucky, so to speak.

OK, I was doing some testing and I had one of the lockups when I was
sitting at the console. This was not one of the 12 hour lockups, as
you can see from the logs. Here are the log outputs:

Cricket:~# grep "Jan 16 00:" /var/log/messages
Jan 16 00:19:37 Cricket rsyslogd: [origin software="rsyslogd"
swVersion="3.18.6" x-pid="2245" x-info="http://www.rsyslog.com"]
restart
Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
bytes.
Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.
Jan 16 00:41:25 Cricket kernel: [ 1694.440107] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jan 16 00:41:25 Cricket kernel: [ 1694.440319] usb 1-1: USB disconnect,
address 3
Jan 16 00:41:25 Cricket pppd[2682]: Exit.
Jan 16 00:41:25 Cricket kernel: [ 1694.552196] usb 1-1: new full speed
USB device using uhci_hcd and address 4
Jan 16 00:41:25 Cricket kernel: [ 1695.116015] usb 1-1: new full speed
USB device using uhci_hcd and address 5
Jan 16 00:41:26 Cricket kernel: [ 1695.676005] usb 1-1: new full speed
USB device using uhci_hcd and address 6
Jan 16 00:41:26 Cricket kernel: [ 1696.196015] usb 1-1: new full speed
USB device using uhci_hcd and address 7
Jan 16 00:43:32 Cricket kernel: [ 1821.624006] usb 1-1: new full speed
USB device using uhci_hcd and address 8
Jan 16 00:43:32 Cricket kernel: [ 1822.188016] usb 1-1: new full speed
USB device using uhci_hcd and address 9
Jan 16 00:43:33 Cricket kernel: [ 1822.748041] usb 1-1: new full speed
USB device using uhci_hcd and address 10
Jan 16 00:43:33 Cricket kernel: [ 1823.268022] usb 1-1: new full speed
USB device using uhci_hcd and address 11
Jan 16 00:48:25 Cricket kernel: [ 2115.128018] usb 1-1: new full speed
USB device using uhci_hcd and address 12
Jan 16 00:48:26 Cricket kernel: [ 2115.292074] usb 1-1: configuration #1
chosen from 1 choice
Jan 16 00:48:26 Cricket kernel: [ 2115.304881] scsi1 : SCSI emulation
for USB Mass Storage devices
Jan 16 00:48:26 Cricket kernel: [ 2115.305242] usb 1-1: New USB device
found, idVendor=1f28, idProduct=0021
Jan 16 00:48:26 Cricket kernel: [ 2115.305248] usb 1-1: New USB device
strings: Mfr=1, Product=2, SerialNumber=3
Jan 16 00:48:26 Cricket kernel: [ 2115.305253] usb 1-1: Product: USB
Micro SD Storage
Jan 16 00:48:26 Cricket kernel: [ 2115.305256] usb 1-1: Manufacturer:
Cal-comp E&CC Limited
Jan 16 00:48:26 Cricket kernel: [ 2115.305259] usb 1-1: SerialNumber:
214939913900
Jan 16 00:48:31 Cricket kernel: [ 2120.308859] scsi 1:0:0:0:
Direct-Access Cricket T-Flash Disk 2.31 PQ: 0 ANSI: 2
Jan 16 00:48:31 Cricket kernel: [ 2120.308859] scsi 1:0:0:1: CD-ROM
Cal-Comp CD INSTALLER 2.31 PQ: 0 ANSI: 0
Jan 16 00:48:31 Cricket kernel: [ 2120.329118] sd 1:0:0:0: [sda]
Attached SCSI removable disk
Jan 16 00:48:31 Cricket kernel: [ 2120.556881] Driver 'sr' needs
updating - please use bus_type methods
<snip>


Cricket:~# grep "Jan 16 00:" /var/log/syslog
<snip>
Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.508017] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:23 Cricket kernel: [ 1692.732031] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:23 Cricket kernel: [ 1692.965250] cdc_acm: acm_ctrl_irq -
usb_submit_urb failed with result -19
Jan 16 00:41:23 Cricket kernel: [ 1693.068019] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:24 Cricket kernel: [ 1693.292024] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:24 Cricket kernel: [ 1693.916028] usb 1-1: device not
accepting address 3, error -71
Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
USB device using uhci_hcd and address 3
Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
bytes.
Jan 16 00:41:25 Cricket pppd[2682]: Script /etc/ppp/ip-down started (pid
4457)
Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.
Jan 16 00:41:25 Cricket kernel: [ 1694.440021] usb 1-1: device not
accepting address 3, error -71
Jan 16 00:41:25 Cricket kernel: [ 1694.440107] sd 0:0:0:0: Device
offlined - not ready after error recovery
Jan 16 00:41:25 Cricket kernel: [ 1694.440319] usb 1-1: USB disconnect,
address 3
Jan 16 00:41:25 Cricket pppd[2682]: Waiting for 1 child processes...
Jan 16 00:41:25 Cricket pppd[2682]: script /etc/ppp/ip-down, pid 4457
Jan 16 00:41:25 Cricket pppd[2682]: Script /etc/ppp/ip-down finished
(pid 4457), status = 0x0
Jan 16 00:41:25 Cricket pppd[2682]: Exit.
Jan 16 00:41:25 Cricket kernel: [ 1694.552196] usb 1-1: new full speed
USB device using uhci_hcd and address 4
Jan 16 00:41:25 Cricket kernel: [ 1694.676007] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:25 Cricket kernel: [ 1694.900051] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:25 Cricket kernel: [ 1695.116015] usb 1-1: new full speed
USB device using uhci_hcd and address 5
Jan 16 00:41:25 Cricket kernel: [ 1695.236042] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:26 Cricket kernel: [ 1695.460007] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:26 Cricket kernel: [ 1695.676005] usb 1-1: new full speed
USB device using uhci_hcd and address 6
Jan 16 00:41:26 Cricket kernel: [ 1696.084006] usb 1-1: device not
accepting address 6, error -71
Jan 16 00:41:26 Cricket kernel: [ 1696.196015] usb 1-1: new full speed
USB device using uhci_hcd and address 7
Jan 16 00:41:27 Cricket kernel: [ 1696.604019] usb 1-1: device not
accepting address 7, error -71
Jan 16 00:41:27 Cricket kernel: [ 1696.604049] hub 1-0:1.0: unable to
enumerate USB device on port 1
<snip>

Now will you believe me when I tell you pppd isn't issuing any errors
and it is unlikely the networking layer is the culprit in most of these
failures? Can we move on? I'll investigate possible networking issues
later, once I have these far more serious and obvious problems
ironed out.
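
For anyone who wants to tally this sort of log rather than eyeball it, here is a rough Python sketch. It rejoins entries that got wrapped onto a second line (as in the paste above), then counts USB error codes and pppd hangups. The two errno names are the standard Linux values (71 is EPROTO, 19 is ENODEV); the function names and the short sample are just for illustration.

```python
import re
from collections import Counter

STAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
LINUX_ERRNO = {71: "EPROTO", 19: "ENODEV"}  # Linux errno numbers

def unwrap(lines):
    """Rejoin log entries that were wrapped onto a continuation line."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.strip() == "<snip>":
            continue
        if STAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries

def summarize(entries):
    """Count kernel error codes and pppd modem hangups."""
    counts = Counter()
    for entry in entries:
        m = re.search(r"(?:error|result) -(\d+)", entry)
        if m:
            code = int(m.group(1))
            counts[LINUX_ERRNO.get(code, str(code))] += 1
        if "pppd[" in entry and "Modem hangup" in entry:
            counts["pppd modem hangup"] += 1
    return counts

sample = """\
Jan 16 00:41:23 Cricket kernel: [ 1692.508017] usb 1-1: device
descriptor read/64, error -71
Jan 16 00:41:23 Cricket kernel: [ 1692.965250] cdc_acm: acm_ctrl_irq -
usb_submit_urb failed with result -19
Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
""".splitlines()

print(dict(summarize(unwrap(sample))))
# {'EPROTO': 1, 'ENODEV': 1, 'pppd modem hangup': 1}
```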
From: Moe Trin on
On Fri, 15 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, in
article <ja6dnf02SrRBoszWnZ2dnUVZ_tdi4p2d(a)giganews.com>, lrhorer wrote:

>> It's looking a lot more as if this is the problem, rather than pppd.

> I agree, although despite what unruh seems to believe, I haven't
>fixated on this as a root cause, and I certainly have not even remotely
>ruled out ppp problems or something bizarre that wvdial is doing as
>being a proximate cause.

Mentioned else-thread, I see just two possible ppp related issues:
the disconnects after some amount of time, and the authentication
failure. Everything else seems to be related to strange actions
by the GSM modem or the USB interface.

> Yes, but gracefully exiting the session and then dialing back in
>doesn't produce a problem. If resetting the modem kills it in one
>case, one might expect that resetting it in the other case would have
>the same effect.

Else-thread, you showed:

>Jan 16 00:41:23 Cricket kernel: [ 1692.388023] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:23 Cricket kernel: [ 1692.948014] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:24 Cricket kernel: [ 1693.508017] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:24 Cricket kernel: [ 1694.028016] usb 1-1: reset full speed
>USB device using uhci_hcd and address 3
>Jan 16 00:41:25 Cricket pppd[2682]: Modem hangup
>Jan 16 00:41:25 Cricket pppd[2682]: Connect time 27.6 minutes.
>Jan 16 00:41:25 Cricket pppd[2682]: Sent 144558 bytes, received 188005
>bytes.
>Jan 16 00:41:25 Cricket pppd[2682]: Connection terminated.

Now, I don't know if the kernel messages (the word 'reset' scares me)
are the reason or not, but the "Modem hangup" means that pppd detected
the modem going "on-hook". USB modems are totally serial, and lack
the RS-232 status wires, which the USB software emulates as far as
the serial interface is concerned. With an RS-232 modem, the modem
going on hook _uncommanded_by_pppd_ nearly always means that the peer
disconnected. This _could_ also be an activity timeout function in
the modem itself (an S-register setting, not standardized by
manufacturer).

If a peer hangs up, RFC1661 says that the peer should initiate this
by sending a TermReq message, and the other system responds with a
TermAck. (These packets should show in a debug level log.) Then both
can shut down "cleanly". However, many ISPs fail to follow this, and
essentially yank the plug - much the same concept as a -SIGTERM versus a
-SIGKILL. All pppd can then say is that there was a modem hangup,
and try to clean up as best it can. How this would occur on a USB
modem as opposed to an RS-232 modem is something else, but given that
word 'reset' in the immediate preceding message, I'd be looking in
that direction.
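
To show what the clean shutdown looks like on the wire, here is a small Python sketch of the two LCP packets involved. The layout (code, identifier, two-byte length, data) and the code numbers 5 and 6 are from RFC1661; the function names are made up, and this is not how pppd itself is written.

```python
import struct

TERMINATE_REQUEST = 5  # LCP code numbers from RFC1661
TERMINATE_ACK = 6

def lcp_packet(code, identifier, data=b""):
    """Code (1 byte), Identifier (1 byte), Length (2 bytes), then data."""
    length = 4 + len(data)
    return struct.pack("!BBH", code, identifier, length) + data

def answer_terminate(packet):
    """Build the TermAck for a TermReq, reusing its identifier.
    Returns None if the packet is not a TermReq."""
    code, identifier, _length = struct.unpack("!BBH", packet[:4])
    if code != TERMINATE_REQUEST:
        return None
    return lcp_packet(TERMINATE_ACK, identifier)

req = lcp_packet(TERMINATE_REQUEST, 7, b"bye")
print(req.hex())                    # 0507000762796500 minus padding: see note
print(answer_terminate(req).hex())  # 06070004
```

Note that the first print shows the four header bytes 05 07 00 07 followed by the three data bytes, so ignore the comment's trailing digits if they differ; the point is only that when the peer "yanks the plug", neither of these packets ever appears in the debug log.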

You may also want to inquire if the ISP has a connect time limitation.
If they're hanging up on you because you've been connected too long,
the USB <-> RS-232 emulation may cause problems that don't occur under
similar circumstances with a straight RS-232 interface.

Old guy