From: lrhorer on
Moe Trin wrote:

> On Mon, 18 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <JOydnYNd4P63u8jWnZ2dnUVZ_vRi4p2d(a)giganews.com>, lrhorer
> wrote:
>
>>There was a lock-up just an hour ago, although this time without a USB
>>bus reset. I've enabled he persist option in pppd, at least for the
>>time being. It does allow the system to recover from an ordinary idle
>>time-out. This event did not involve a USB reset, /dev/ACM0 stayed
>>online, and the modem was able to respond to AT commands. It would
>>not dial, however, either using pppd or minicom.
>
> Minor confusion - your log shows pppd unable to communicate with the
> modem between (at least) 18:29:54 and 20:11:24.

Right. I was checking some things. When I shut down pppd eventually,
the first attempt to talk to the modem (AT^M) caused it to respond
with, "ERROR". A subsequent AT^M produced an, "OK".

>>Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
>>Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
>>disconnected.
>>Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
>>Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608
>>bytes.
>>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started
>>(pid 10015)
>>Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not
>>responding"]
>>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
>>(pid 10015), status = 0x0
>>Jan 18 18:29:57 Cricket pppd[5836]: sent [LCP TermReq id=0x3 "Peer not
>>responding"]
>>Jan 18 18:30:00 Cricket pppd[5836]: Connection terminated.
>
> LCP Echo is used to detect link failure where the link goes down, but
> the modem (which means the central office) doesn't recognize this and
> remains off-hook. pppd is here shown sending two TermReq, but the
> peer isn't responding. The link may have existed at the physical
> level, but was unable to pass packets.

Exactly. This is one of the failure modes. I removed wvdial from the
system entirely, and this behavior seems to have stopped. 'Fingers
crossed - again.

>>Jan 18 18:31:57 Cricket pppd[5836]: Hangup (SIGHUP)
>>Jan 18 18:31:57 Cricket pppd[5836]: Modem hangup
>
> Nearly two minutes - why? pppd hangs up the modem using the DTR
> "wire"
> which is emulated in a USB modem. "Module problems"?

I was skeptical the "modem" and "crtscts" options were of any value,
and I was wondering if having them specified in the options file for a
modem without signal lines was causing any of the issues. I suppose
it's possible they could have been at least partly responsible for some
of the odd behavior, although by no means all of it. There is a small
whiff of this being the case, here. I've commented them out.


>>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP)
>>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters:
>>Input/output error
>
> On an attempted restart, the serial port wasn't a serial port.

'Except that minicom is still able to talk to it. I was able to
reproduce this behavior a number of times. Sometimes it gives this
response. Other times it is just silent.

>>That's it. Pppd did not shut down, did not release the loackfile, and
>>did not release /dev/ttyACM0. I shut down pppd with `kill -9`, and
>>re-started it.
>
> Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM
> (-15) should be enough.

`Not when it locks up like this. Kill -9 is the only thing which seems
to be able to shut down pppd when this happens. The problem is, it
won't restart when I do. If you have any other ideas for cleanly
shutting down and restarting the ppp session, I'm all ears.

>>After restart, it produced only a single line in syslog:
>>
>>Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0
>>
>> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it
>>did nit run the chat script or anything else.
>
> I'm assuming you used 'fuser' or 'lsof' to determine that pppd had the
> device.

Yes. I also looked in /var/lock/, and even after removing it manually
(yes, I know the lockfile is old school), subsequently opening pppd
again or running minicom reports the locking in place. I had another
failure just a few minutes ago. Lsof again reports /dev/ACM0 in use by
pppd. Sending a SIGTERM or a SIGHUP has no apparent effect. After
issuing SIGKILL, upon restart, pppd hangs. Minicom can still talk to
the modem, get it to dial, get a carrier, and receive the ppp
negotiation packet.

> Assuming your file system does update access time (some people
> are mounting -noatime)

No, I'm not mounting the partition with noatime, just with
rw,errors=remount-ro.

> , did 'ls -lu /usr/sbin/chat' show current
> access? What about the script that contains the dialing commands?

The chat script, or the call file? What about them? Do you mean were
they showing to be in-use? No, they aren't. Lsof shows quite a few
files in use by pppd, but not them. They aren't in use by anything.

>>Jan 18 20:11:24 Cricket pppd[3062]: pppd 2.4.4 started by root, uid 0
>>Jan 18 20:11:25 Cricket pppd[3062]: Connect script failed
>>Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established.
>
> Below
>
>>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap
>>0x0> <auth chap MD5> <magic 0xf27b8c09> <pcomp> <accomp>]
>>Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfNak id=0x1 <auth
>>pap>] Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfAck id=0x1
>><asyncmap 0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>]
>>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap
>>0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>]
>
> The peer requested you authenticate by CHAP-MD5 (RFC1994), and you
> refused, suggesting PAP (RFC1334) instead. This time, the peer
> accepted it. On 'Jan 9 09:49:00', you showed CHAP authentication
> working - what changed?

I was wondering that, myself. Under wvdial, the systems were
authenticating using CHAP. When I moved over to chat, I copied the
pppd command line directly from the wvdial command line. I used the
same calling file and had not yet made any changes to the options file,
yet they started authenticating using PAP. I was puzzled why, but it
did seem to be working well enough not to need my immediate attention.
I made several other changes, during which time ppd continued to
authenticate using PAP. When I removed wvdial, it instantly reverted
to CHAP authentication. I know it sounds a bit fishy, but I swear to
you I made no other configuration changes when I removed wvdial, and it
was authenticating using PAP immediately prior to removal and CHAP
immediately after removing wvdial.

> I'd suggest having identical pap-secrets
> and chap-secrets files, and letting pppd use which-ever it needed.

That's the way it is set up, and I never changed it.

>>Jan 18 20:12:10 Cricket pppd[3062]: sent [CCP ConfReq id=0x1 <deflate
>>15> <deflate(old#) 15> <bsd v1 15>]
>>Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfReq id=0x1
>><compress VJ 0f 01> <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3
>>0.0.0.0>]
>
>>Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01
>>01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
>>Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
>>Control Protocol' (0x80fd) received
>>Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1
>><compress VJ 0f 01>]
>
> This peer doesn't do CCP (RFC1962), or Van Jacobson header compression
> (RFC1144). No big deal, but you _could_ add 'noccp' and 'novj' if it
> did become a problem. It's _not_ a problem now.

Yeah, I spotted that, too. I figured I would leave it.

>>Jan 18 20:11:25 Cricket chat[3106]: abort on (BUSY)
>>Jan 18 20:11:25 Cricket chat[3106]: send (AT^M)
>>Jan 18 20:11:25 Cricket chat[3106]: expect (OK)
>>Jan 18 20:11:25 Cricket chat[3106]: AT^M^M
>>Jan 18 20:11:25 Cricket chat[3106]: OK
>
> An empty 'Hayes' command prefix - not very reliable at initializing
> the modem.

To the best of my ability to tell, neither ATZ nor AT&F do anything,
although the modem responds with, "OK". The manual I have lists
AT+WRST as the wireless reset command. When I try that, it returns an
error.

>>Jan 18 20:11:25 Cricket chat[3106]: send (ATDT#777^M)
>>Jan 18 20:11:25 Cricket chat[3106]: expect (CONNECT)
>>Jan 18 20:11:25 Cricket chat[3106]: ^M
>>Jan 18 20:11:25 Cricket chat[3106]: ATDT#777^M^M
>>Jan 18 20:11:25 Cricket chat[3106]: NO CARRIER
>>Jan 18 20:11:25 Cricket chat[3106]: -- failed
>
> You told the modem to dial - it responded with 'NO CARRIER' rather
> than the expected 'CONNECT blah, blah, blah'. That's a modem issue.

That depends on what you mean by "modem". There are a finite number of
carriers available to each antenna on a cellular base station. It's
not terribly unusual to be denied a channel. I'm sure you have
encountered a "Network busy - try again" response from your cell phone
from time to time even though you have plenty of bars. My guess would
be a full antenna face, although of course it could be a problem with
the modem itself.

>>Jan 18 20:11:57 Cricket chat[3652]: send (ATDT#777^M)
>>Jan 18 20:11:57 Cricket chat[3652]: expect (CONNECT)
>>Jan 18 20:11:57 Cricket chat[3652]: ^M
>>Jan 18 20:12:03 Cricket chat[3652]: ATDT#777^M^M
>>Jan 18 20:12:03 Cricket chat[3652]: CONNECT
>>Jan 18 20:12:03 Cricket chat[3652]: -- got it

> You tried again, and the modem took six seconds, but this time
> reported it found another modem to talk to.
>
>>Finally, sometimes the modem responds and dials out, but the
>>authentication fails.
>
> Above - '/bin/cp /etc/ppp/pap-secrets /etc/ppp/chap-secrets'

They are already identical in the uncommented sections.

Here is the pppd log from the latest lockup. Nothing I tried short of
a reboot would get pppd to invoke the chat script and move forward.
I'm open to suggestions. On the bright side, a soft reboot does allow
the system to come back up, and although it would be rather draconian
to reboot every time just to clear such an issue, it is an option. So
far the really nasty hardware lockups have not returned.

Below is the log from the latest lock-up, experienced when the
following command was issued:

`ps -ef | grep -q pppd` && kill -1 `pidof pppd`

I am issuing the command every couple of hours, and the third time I
issued it in this session, pppd hung.

Jan 19 19:35:29 Cricket pppd[2783]: Hangup (SIGHUP)
Jan 19 19:35:29 Cricket pppd[2783]: Connect time 152.1 minutes.
Jan 19 19:35:29 Cricket pppd[2783]: Sent 1551485 bytes, received 1648658
bytes.
Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down started (pid
10026)
Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User
request"]
Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down finished
(pid 10026), status = 0x0
Jan 19 19:35:32 Cricket pppd[2783]: sent [LCP TermReq id=0xf "User
request"]
Jan 19 19:35:35 Cricket pppd[2783]: Connection terminated.
Jan 19 19:39:00 Cricket pppd[2783]: ioctl(TIOCSETD, N_TTY): Interrupted
system call (line 571)
Jan 19 19:39:01 Cricket pppd[2783]: tcsetattr: Interrupted system call
(line 1010)
Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP)
Jan 19 19:39:01 Cricket pppd[2783]: Modem hangup
Jan 19 19:39:02 Cricket pppd[2783]: Hangup (SIGHUP)

When I kill and restart pppd, I get:

Jan 19 20:05:37 Cricket pppd[13614]: pppd 2.4.4 started by root, uid 0

and then nothing.
From: lrhorer on
unruh wrote:

> On 2010-01-19, lrhorer <lrhorer(a)satx.rr.com> wrote:
>>
>>>>> You actually do not have debugging switched on for your pppd
>>>>> (option debug to pppd)
>>>>
>>>> Yes, I do. The debug directive is in the options file.
>>>
>>> Good. But what you display is NOT the pppd debug logs.
>>
>> Then pppd isn't creating them, unless it is someplace other
>> than /var/log/. The debug option is in the options file, as I
>> mentioned.
>>
>>> Did you put the line
>>> daemon.*;local2.* /var/log/daemonlog
>>
>> I already responded to that. I did not have to. It was
>> already there.
>
> Then you have a non-standard version of pppd.
> With the debug option, pppd writes detailed reports to the daemon
> syslog facility.

Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
asyncmap 0
usehostname
noipdefault
usepeerdns
defaultroute
auth
local
lock
hide-password
debug
lcp-echo-interval 30
lcp-echo-failure 4
noipx

Cricket:/etc# egrep -v '#|^ *$' rsyslog.conf
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$FileOwner root
$FileGroup adm
$FileCreateMode 0640
$DirCreateMode 0755
$IncludeConfig /etc/rsyslog.d/*.conf
auth,authpriv.* /var/log/auth.log
*.*;auth,authpriv.none -/var/log/syslog
daemon.*; local2.* -/var/log/daemon.log
kern.* -/var/log/kern.log
lpr.* -/var/log/lpr.log
mail.* -/var/log/mail.log
user.* -/var/log/user.log
<snip>

What more can I say?
>>
>>> (or whereever you want to put it) into /etc/syslog.conf, and restart
>>> syslog.conf
>>> killall -1 syslogd
>>
>> Again, as I already mentioned, this isn't necessary. The
>> router has
>> been both cold and warm booted many, many times, and the logging
>> parameters were already there when the system was created.
>>
>> Oh, and just BTW, this distro does not use syslog.d. It's
>> using
>> rsyslog.d. not that ity really matters.
>
> I do not know rsyslogd. Are you sure it uses /etc/syslog.conf as its
> configuration file?

No, it does not:

Cricket:/etc# find / -name "*syslogd*"
/usr/sbin/rsyslogd
Cricket:/etc# find /etc -name "*syslog*"
/etc/init.d/rsyslog
/etc/rsyslog.d
/etc/rc2.d/S10rsyslog
/etc/rc1.d/K90rsyslog
/etc/rc6.d/K90rsyslog
/etc/default/rsyslog
/etc/rc3.d/S10rsyslog
/etc/logrotate.d/rsyslog
/etc/rsyslog.conf
/etc/rc0.d/K90rsyslog
/etc/rc5.d/S10rsyslog
/etc/rc4.d/S10rsyslog

>> There was a lock-up just an hour ago, although this time
>> without a USB
>> bus reset. I've enabled he persist option in pppd, at least for the
>> time being. It does allow the system to recover from an ordinary
>> idle
>> time-out. This event did not involve a USB reset, /dev/ACM0 stayed
>> online, and the modem was able to respond to AT commands. It would
>> not
>> dial, however, either using pppd or minicom. The first response to a
>> simgle AT command was "ERROR" after I shut down pppd and ran minicom
>> to do some manual testing.
>>
>> Here is the log from the event:
>>
>> Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests
>> Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be
>> disconnected.


> OK, pppd is using echos to test if the link is up. It is failing. Ie,
> the connection has gone down before pppd goes down.
>
> This looks more and more like a hardware flaw, either a bug or a bad
> design. There is NOTHING that software can do to fix it. Only

Nonsense. Since the modem can still dial out, still connect to the
carrier, and still get a ppp session request, clearly the hardware is
still working in this failure mode. Worst case, for this failure mode,
I can reboot the router. It's a Draconian approach, but it works, and
it is pure software. Someone suggested I unbind the ACM driver and
re-bind it. I may try that, although the fact the modem still works
under minicom suggests it isn't ACM.

>> Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes.
>> Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received
>> 588608 bytes.
>> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started
>> (pid 10015)
>> Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer
>> not responding"]
>> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished
>> (pid 10015), status = 0x0
>
> pppd has shut itself down.
>
>
>> Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid
>> 0
>>
>> Again, it produced a lockfile and grabbed /dev/ttyACM0, but
>> it did nit
>> run the chat script or anything else. I killed pppd again, and ran
>> minicom to see if the modem was responding on /dev/ttyACM0. It was.
>> I pulled the modem and reseated it, running pppd, again:
>
> ????

You'll have to make this question a bit more specific before I can
answer it.

> OK, now we are seeing valid pppd debug output.
>
> Where did you get this from?

/var/log/syslog. As I explained in another message, pppd started
spitting out a greater breadth of messages into /var/log/syslog when I
quit calling it from wvdial.

>> Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01
>> 01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f]
>> Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression
>> Control Protocol' (0x80fd) received
>
> PUt in a noccp option.

I don't suppose it will hurt. I seriously doubt it will help, but I
suppose one never knows.

>> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1
>> <compress VJ 0f 01>]
>
> And novj

Ditto.

>> fails. In others, the modem can be switched, but udev doesn't
>> recognize it. In yet others, udev recognizes the modem and creates
>> the /dev/ACM0 device, but the modem doesn't respond. Finally,
>> sometimes the modem responds and dials out, but the authentication
>> fails.
>
> Again hardware.

I seem to remember hearing someone else pointing that out. There
haven't been any hardware faults since the USB port was replaced. It's
too soon to tell for certain if this is meaningful, but a new USB card
was $24, and I can return it if it doesn't fix the problem.
From: Moe Trin on
On Tue, 19 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, in
article <M6OdnQl2lLMi5svWnZ2dnUVZ_g1i4p2d(a)giganews.com>, lrhorer wrote:

>Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
>asyncmap 0

Not needed - that's the default

>usehostname

Are you sure you need this?

>noipdefault
>usepeerdns
>defaultroute
>auth

That usually prevents dialin, as most ISPs will _not_ authenticate to
you.

>local

I realize your USB device lacks modem control lines, but most
emulations I've seen work without this.

>noipx

I haven't seen an ISP offering Novell IPX connections in over 15 years.

Old guy
From: lrhorer on
Moe Trin wrote:

> On Tue, 19 Jan 2010, in the Usenet newsgroup comp.os.linux.networking,
> in article <M6OdnQl2lLMi5svWnZ2dnUVZ_g1i4p2d(a)giganews.com>, lrhorer
> wrote:
>
>>Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
>>asyncmap 0
>
> Not needed - that's the default

I didn't put it in there, but it doesn't seem to cause problems, so I
left it.

>>usehostname
>
> Are you sure you need this?

I'm fairly certain I don't, but again, it was the default, and doesn't
seem to cause any issues.

>>noipdefault
>>usepeerdns
>>defaultroute
>>auth
>
> That usually prevents dialin, as most ISPs will _not_ authenticate to
> you.

Not when it is contradicted in the calling file. See above.

>>local
>
> I realize your USB device lacks modem control lines, but most
> emulations I've seen work without this.

It seems to. I removed the "modem" line and uncommented this one for
testing. It made no difference.

>>noipx
>
> I haven't seen an ISP offering Novell IPX connections in over 15
> years.

It was the default, so I left it. I agree it's unlikely an ISP would
ever support IPX. TCP/IP is definitely the playing field for WAN, and
indeed for more than 90% of LAN. Heck, I feel like an orphan child
because a large amount of our gear runs 7 layer OSI.

From: buck on
lrhorer <lrhorer(a)satx.rr.com> wrote in
news:3_GdndQN--INCPbWnZ2dnUVZ_jhi4p2d(a)giganews.com:

irtualization of DTR, or by +++ATH. I would expect the former.
>
>> Perhaps you may want to
>> experiment with the 'disconnect' option to pppd.
>
> Nope. The problem was a kernel bug. Upgrading the kernel
> fixed it.

So what kernel fixed the problem?
--
buck