Prev: Debian installed - Now what?
Next: Very strange, multiple IP addresses on the same Ethernet card donot route properly
From: lrhorer on 19 Jan 2010 22:04 Moe Trin wrote: > On Mon, 18 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, > in article <JOydnYNd4P63u8jWnZ2dnUVZ_vRi4p2d(a)giganews.com>, lrhorer > wrote: > >>There was a lock-up just an hour ago, although this time without a USB >>bus reset. I've enabled he persist option in pppd, at least for the >>time being. It does allow the system to recover from an ordinary idle >>time-out. This event did not involve a USB reset, /dev/ACM0 stayed >>online, and the modem was able to respond to AT commands. It would >>not dial, however, either using pppd or minicom. > > Minor confusion - your log shows pppd unable to communicate with the > modem between (at least) 18:29:54 and 20:11:24. Right. I was checking some things. When I shut down pppd eventually, the first attempt to talk to the modem (AT^M) caused it to respond with, "ERROR". A subsequent AT^M produced an, "OK". >>Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests >>Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be >>disconnected. >>Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes. >>Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received 588608 >>bytes. >>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started >>(pid 10015) >>Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer not >>responding"] >>Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished >>(pid 10015), status = 0x0 >>Jan 18 18:29:57 Cricket pppd[5836]: sent [LCP TermReq id=0x3 "Peer not >>responding"] >>Jan 18 18:30:00 Cricket pppd[5836]: Connection terminated. > > LCP Echo is used to detect link failure where the link goes down, but > the modem (which means the central office) doesn't recognize this and > remains off-hook. pppd is here shown sending two TermReq, but the > peer isn't responding. The link may have existed at the physical > level, but was unable to pass packets. Exactly. This is one of the failure modes. I removed wvdial from the system entirely, and this behavior seems to have stopped. 'Fingers crossed - again. >>Jan 18 18:31:57 Cricket pppd[5836]: Hangup (SIGHUP) >>Jan 18 18:31:57 Cricket pppd[5836]: Modem hangup > > Nearly two minutes - why? pppd hangs up the modem using the DTR > "wire" > which is emulated in a USB modem. "Module problems"? I was skeptical the "modem" and "crtscts" options were of any value, and I was wondering if having them specified in the options file for a modem without signal lines was causing any of the issues. I suppose it's possible they could have been at least partly responsible for some of the odd behavior, although by no means all of it. There is a small whiff of this being the case, here. I've commented them out. >>Jan 18 18:31:58 Cricket pppd[5836]: Hangup (SIGHUP) >>Jan 18 18:32:04 Cricket chat[10941]: Can't get terminal parameters: >>Input/output error > > On an attempted restart, the serial port wasn't a serial port. 'Except that minicom is still able to talk to it. I was able to reproduce this behavior a number of times. Sometimes it gives this response. Other times it is just silent. >>That's it. Pppd did not shut down, did not release the loackfile, and >>did not release /dev/ttyACM0. I shut down pppd with `kill -9`, and >>re-started it. > > Try to avoid using -SIGKILL (-9). The results are messy. -SIGTERM > (-15) should be enough. `Not when it locks up like this. Kill -9 is the only thing which seems to be able to shut down pppd when this happens. The problem is, it won't restart when I do. If you have any other ideas for cleanly shutting down and restarting the ppp session, I'm all ears. >>After restart, it produced only a single line in syslog: >> >>Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid 0 >> >> Again, it produced a lockfile and grabbed /dev/ttyACM0, but it >>did nit run the chat script or anything else. > > I'm assuming you used 'fuser' or 'lsof' to determine that pppd had the > device. Yes. I also looked in /var/lock/, and even after removing it manually (yes, I know the lockfile is old school), subsequently opening pppd again or running minicom reports the locking in place. I had another failure just a few minutes ago. Lsof again reports /dev/ACM0 in use by pppd. Sending a SIGTERM or a SIGHUP has no apparent effect. After issuing SIGKILL, upon restart, pppd hangs. Minicom can still talk to the modem, get it to dial, get a carrier, and receive the ppp negotiation packet. > Assuming your file system does update access time (some people > are mounting -noatime) No, I'm not mounting the partition with noatime, just with rw,errors=remount-ro. > , did 'ls -lu /usr/sbin/chat' show current > access? What about the script that contains the dialing commands? The chat script, or the call file? What about them? Do you mean were they showing to be in-use? No, they aren't. Lsof shows quite a few files in use by pppd, but not them. They aren't in use by anything. >>Jan 18 20:11:24 Cricket pppd[3062]: pppd 2.4.4 started by root, uid 0 >>Jan 18 20:11:25 Cricket pppd[3062]: Connect script failed >>Jan 18 20:12:03 Cricket pppd[3062]: Serial connection established. > > Below > >>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x1 <asyncmap >>0x0> <auth chap MD5> <magic 0xf27b8c09> <pcomp> <accomp>] >>Jan 18 20:12:09 Cricket pppd[3062]: sent [LCP ConfNak id=0x1 <auth >>pap>] Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfAck id=0x1 >><asyncmap 0x0> <magic 0xeaf2ceb9> <pcomp> <accomp>] >>Jan 18 20:12:09 Cricket pppd[3062]: rcvd [LCP ConfReq id=0x2 <asyncmap >>0x0> <auth pap> <magic 0xf27b8c09> <pcomp> <accomp>] > > The peer requested you authenticate by CHAP-MD5 (RFC1994), and you > refused, suggesting PAP (RFC1334) instead. This time, the peer > accepted it. On 'Jan 9 09:49:00', you showed CHAP authentication > working - what changed? I was wondering that, myself. Under wvdial, the systems were authenticating using CHAP. When I moved over to chat, I copied the pppd command line directly from the wvdial command line. I used the same calling file and had not yet made any changes to the options file, yet they started authenticating using PAP. I was puzzled why, but it did seem to be working well enough not to need my immediate attention. I made several other changes, during which time ppd continued to authenticate using PAP. When I removed wvdial, it instantly reverted to CHAP authentication. I know it sounds a bit fishy, but I swear to you I made no other configuration changes when I removed wvdial, and it was authenticating using PAP immediately prior to removal and CHAP immediately after removing wvdial. > I'd suggest having identical pap-secrets > and chap-secrets files, and letting pppd use which-ever it needed. That's the way it is set up, and I never changed it. >>Jan 18 20:12:10 Cricket pppd[3062]: sent [CCP ConfReq id=0x1 <deflate >>15> <deflate(old#) 15> <bsd v1 15>] >>Jan 18 20:12:10 Cricket pppd[3062]: sent [IPCP ConfReq id=0x1 >><compress VJ 0f 01> <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns3 >>0.0.0.0>] > >>Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01 >>01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f] >>Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression >>Control Protocol' (0x80fd) received >>Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 >><compress VJ 0f 01>] > > This peer doesn't do CCP (RFC1962), or Van Jacobson header compression > (RFC1144). No big deal, but you _could_ add 'noccp' and 'novj' if it > did become a problem. It's _not_ a problem now. Yeah, I spotted that, too. I figured I would leave it. >>Jan 18 20:11:25 Cricket chat[3106]: abort on (BUSY) >>Jan 18 20:11:25 Cricket chat[3106]: send (AT^M) >>Jan 18 20:11:25 Cricket chat[3106]: expect (OK) >>Jan 18 20:11:25 Cricket chat[3106]: AT^M^M >>Jan 18 20:11:25 Cricket chat[3106]: OK > > An empty 'Hayes' command prefix - not very reliable at initializing > the modem. To the best of my ability to tell, neither ATZ nor AT&F do anything, although the modem responds with, "OK". The manual I have lists AT+WRST as the wireless reset command. When I try that, it returns an error. >>Jan 18 20:11:25 Cricket chat[3106]: send (ATDT#777^M) >>Jan 18 20:11:25 Cricket chat[3106]: expect (CONNECT) >>Jan 18 20:11:25 Cricket chat[3106]: ^M >>Jan 18 20:11:25 Cricket chat[3106]: ATDT#777^M^M >>Jan 18 20:11:25 Cricket chat[3106]: NO CARRIER >>Jan 18 20:11:25 Cricket chat[3106]: -- failed > > You told the modem to dial - it responded with 'NO CARRIER' rather > than the expected 'CONNECT blah, blah, blah'. That's a modem issue. That depends on what you mean by "modem". There are a finite number of carriers available to each antenna on a cellular base station. It's not terribly unusual to be denied a channel. I'm sure you have encountered a "Network busy - try again" response from your cell phone from time to time even though you have plenty of bars. My guess would be a full antenna face, although of course it could be a problem with the modem itself. >>Jan 18 20:11:57 Cricket chat[3652]: send (ATDT#777^M) >>Jan 18 20:11:57 Cricket chat[3652]: expect (CONNECT) >>Jan 18 20:11:57 Cricket chat[3652]: ^M >>Jan 18 20:12:03 Cricket chat[3652]: ATDT#777^M^M >>Jan 18 20:12:03 Cricket chat[3652]: CONNECT >>Jan 18 20:12:03 Cricket chat[3652]: -- got it > You tried again, and the modem took six seconds, but this time > reported it found another modem to talk to. > >>Finally, sometimes the modem responds and dials out, but the >>authentication fails. > > Above - '/bin/cp /etc/ppp/pap-secrets /etc/ppp/chap-secrets' They are already identical in the uncommented sections. Here is the pppd log from the latest lockup. Nothing I tried short of a reboot would get pppd to invoke the chat script and move forward. I'm open to suggestions. On the bright side, a soft reboot does allow the system to come back up, and although it would be rather draconian to reboot every time just to clear such an issue, it is an option. So far the really nasty hardware lockups have not returned. Below is the log from the latest lock-up, experienced when the following command was issued: `ps -ef | grep -q pppd` && kill -1 `pidof pppd` I am issuing the command every couple of hours, and the third time I issued it in this session, pppd hung. Jan 19 19:35:29 Cricket pppd[2783]: Hangup (SIGHUP) Jan 19 19:35:29 Cricket pppd[2783]: Connect time 152.1 minutes. Jan 19 19:35:29 Cricket pppd[2783]: Sent 1551485 bytes, received 1648658 bytes. Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down started (pid 10026) Jan 19 19:35:29 Cricket pppd[2783]: sent [LCP TermReq id=0xe "User request"] Jan 19 19:35:29 Cricket pppd[2783]: Script /etc/ppp/ip-down finished (pid 10026), status = 0x0 Jan 19 19:35:32 Cricket pppd[2783]: sent [LCP TermReq id=0xf "User request"] Jan 19 19:35:35 Cricket pppd[2783]: Connection terminated. Jan 19 19:39:00 Cricket pppd[2783]: ioctl(TIOCSETD, N_TTY): Interrupted system call (line 571) Jan 19 19:39:01 Cricket pppd[2783]: tcsetattr: Interrupted system call (line 1010) Jan 19 19:39:01 Cricket pppd[2783]: Hangup (SIGHUP) Jan 19 19:39:01 Cricket pppd[2783]: Modem hangup Jan 19 19:39:02 Cricket pppd[2783]: Hangup (SIGHUP) When I kill and restart pppd, I get: Jan 19 20:05:37 Cricket pppd[13614]: pppd 2.4.4 started by root, uid 0 and then nothing.
From: lrhorer on 19 Jan 2010 22:42 unruh wrote: > On 2010-01-19, lrhorer <lrhorer(a)satx.rr.com> wrote: >> >>>>> You actually do not have debugging switched on for your pppd >>>>> (option debug to pppd) >>>> >>>> Yes, I do. The debug directive is in the options file. >>> >>> Good. But what you display is NOT the pppd debug logs. >> >> Then pppd isn't creating them, unless it is someplace other >> than /var/log/. The debug option is in the options file, as I >> mentioned. >> >>> Did you put the line >>> daemon.*;local2.* /var/log/daemonlog >> >> I already responded to that. I did not have to. It was >> already there. > > Then you have a non-standard version of pppd. > With the debug option, pppd writes detailed reports to the daemon > syslog facility. Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options asyncmap 0 usehostname noipdefault usepeerdns defaultroute auth local lock hide-password debug lcp-echo-interval 30 lcp-echo-failure 4 noipx Cricket:/etc# egrep -v '#|^ *$' rsyslog.conf $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat $FileOwner root $FileGroup adm $FileCreateMode 0640 $DirCreateMode 0755 $IncludeConfig /etc/rsyslog.d/*.conf auth,authpriv.* /var/log/auth.log *.*;auth,authpriv.none -/var/log/syslog daemon.*; local2.* -/var/log/daemon.log kern.* -/var/log/kern.log lpr.* -/var/log/lpr.log mail.* -/var/log/mail.log user.* -/var/log/user.log <snip> What more can I say? >> >>> (or whereever you want to put it) into /etc/syslog.conf, and restart >>> syslog.conf >>> killall -1 syslogd >> >> Again, as I already mentioned, this isn't necessary. The >> router has >> been both cold and warm booted many, many times, and the logging >> parameters were already there when the system was created. >> >> Oh, and just BTW, this distro does not use syslog.d. It's >> using >> rsyslog.d. not that ity really matters. > > I do not know rsyslogd. Are you sure it uses /etc/syslog.conf as its > configuration file? No, it does not: Cricket:/etc# find / -name "*syslogd*" /usr/sbin/rsyslogd Cricket:/etc# find /etc -name "*syslog*" /etc/init.d/rsyslog /etc/rsyslog.d /etc/rc2.d/S10rsyslog /etc/rc1.d/K90rsyslog /etc/rc6.d/K90rsyslog /etc/default/rsyslog /etc/rc3.d/S10rsyslog /etc/logrotate.d/rsyslog /etc/rsyslog.conf /etc/rc0.d/K90rsyslog /etc/rc5.d/S10rsyslog /etc/rc4.d/S10rsyslog >> There was a lock-up just an hour ago, although this time >> without a USB >> bus reset. I've enabled he persist option in pppd, at least for the >> time being. It does allow the system to recover from an ordinary >> idle >> time-out. This event did not involve a USB reset, /dev/ACM0 stayed >> online, and the modem was able to respond to AT commands. It would >> not >> dial, however, either using pppd or minicom. The first response to a >> simgle AT command was "ERROR" after I shut down pppd and ran minicom >> to do some manual testing. >> >> Here is the log from the event: >> >> Jan 18 18:29:54 Cricket pppd[5836]: No response to 4 echo-requests >> Jan 18 18:29:54 Cricket pppd[5836]: Serial link appears to be >> disconnected. > OK, pppd is using echos to test if the link is up. It is failing. Ie, > the connection has gone down before pppd goes down. > > This looks more and more like a hardware flaw, either a bug or a bad > design. There is NOTHING that software can do to fix it. Only Nonsense. Since the modem can still dial out, still connect to the carrier, and still get a ppp session request, clearly the hardware is still working in this failure mode. Worst case, for this failure mode, I can reboot the router. It's a Draconian approach, but it works, and it is pure software. Someone suggested I unbind the ACM driver and re-bind it. I may try that, although the fact the modem still works under minicom suggests it isn't ACM. >> Jan 18 18:29:54 Cricket pppd[5836]: Connect time 306.6 minutes. >> Jan 18 18:29:54 Cricket pppd[5836]: Sent 445560 bytes, received >> 588608 bytes. >> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down started >> (pid 10015) >> Jan 18 18:29:54 Cricket pppd[5836]: sent [LCP TermReq id=0x2 "Peer >> not responding"] >> Jan 18 18:29:54 Cricket pppd[5836]: Script /etc/ppp/ip-down finished >> (pid 10015), status = 0x0 > > pppd has shut itself down. > > >> Jan 18 19:07:17 Cricket pppd[30605]: pppd 2.4.4 started by root, uid >> 0 >> >> Again, it produced a lockfile and grabbed /dev/ttyACM0, but >> it did nit >> run the chat script or anything else. I killed pppd again, and ran >> minicom to see if the modem was responding on /dev/ttyACM0. It was. >> I pulled the modem and reseated it, running pppd, again: > > ???? You'll have to make this question a bit more specific before I can answer it. > OK, now we are seeing valid pppd debug output. > > Where did you get this from? /var/log/syslog. As I explained in another message, pppd started spitting out a greater breadth of messages into /var/log/syslog when I quit calling it from wvdial. >> Jan 18 20:12:10 Cricket pppd[3062]: rcvd [LCP ProtRej id=0x1 80 fd 01 >> 01 00 0f 1a 04 78 00 18 04 78 00 15 03 2f] >> Jan 18 20:12:10 Cricket pppd[3062]: Protocol-Reject for 'Compression >> Control Protocol' (0x80fd) received > > PUt in a noccp option. I don't suppose it will hurt. I seriously doubt it will help, but I suppose one never knows. >> Jan 18 20:12:11 Cricket pppd[3062]: rcvd [IPCP ConfRej id=0x1 >> <compress VJ 0f 01>] > > And novj Ditto. >> fails. In others, the modem can be switched, but udev doesn't >> recognize it. In yet others, udev recognizes the modem and creates >> the /dev/ACM0 device, but the modem doesn't respond. Finally, >> sometimes the modem responds and dials out, but the authentication >> fails. > > Again hardware. I seem to remember hearing someone else pointing that out. There haven't been any hardware faults since the USB port was replaced. It's too soon to tell for certain if this is meaningful, but a new USB card was $24, and I can return it if it doesn't fix the problem.
From: Moe Trin on 20 Jan 2010 20:42 On Tue, 19 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, in article <M6OdnQl2lLMi5svWnZ2dnUVZ_g1i4p2d(a)giganews.com>, lrhorer wrote: >Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options >asyncmap 0 Not needed - that's the default >usehostname Are you sure you need this? >noipdefault >usepeerdns >defaultroute >auth That usually prevents dialin, as most ISPs will _not_ authenticate to you. >local I realize your USB device lacks modem control lines, but most emulations I've seen work without this. >noipx I haven't seen an ISP offering Novell IPX connections in over 15 years. Old guy
From: lrhorer on 22 Jan 2010 00:42 Moe Trin wrote: > On Tue, 19 Jan 2010, in the Usenet newsgroup comp.os.linux.networking, > in article <M6OdnQl2lLMi5svWnZ2dnUVZ_g1i4p2d(a)giganews.com>, lrhorer > wrote: > >>Cricket:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options >>asyncmap 0 > > Not needed - that's the default I didn't put it in there, but it doesn't seem to cause problems, so I left it. >>usehostname > > Are you sure you need this? I'm fairly certain I don't, but again, it was the default, and doesn't seem to cause any issues. >>noipdefault >>usepeerdns >>defaultroute >>auth > > That usually prevents dialin, as most ISPs will _not_ authenticate to > you. Not when it is contradicted in the calling file. See above. >>local > > I realize your USB device lacks modem control lines, but most > emulations I've seen work without this. It seems to. I removed the "modem" line and uncommented this one for testing. It made no difference. >>noipx > > I haven't seen an ISP offering Novell IPX connections in over 15 > years. It was the default, so I left it. I agree it's unlikely an ISP would ever support IPX. TCP/IP is definitely the playing field for WAN, and indeed for more than 90% of LAN. Heck, I feel like an orphan child because a large amount of our gear runs 7 layer OSI.
From: buck on 5 Feb 2010 00:37
lrhorer <lrhorer(a)satx.rr.com> wrote in news:3_GdndQN--INCPbWnZ2dnUVZ_jhi4p2d(a)giganews.com: irtualization of DTR, or by +++ATH. I would expect the former. > >> Perhaps you may want to >> experiment with the 'disconnect' option to pppd. > > Nope. The problem was a kernel bug. Upgrading the kernel > fixed it. So what kernel fixed the problem? -- buck |