Topic: Falling off the network

Over the years I've noticed that both my C1s will sometimes disappear off the network; they won't respond to pings for a while. However, if there's any network activity (e.g. an open ssh session) then they stay alive. After a while they re-appear, possibly when it's time for wifi key renegotiation. I've seen this with three different access point setups (well, 2 APs, 3 different firmware versions). (I reported this on the old board, but never bothered to chase it up.)

So I'm trying an experiment: get the chumby to ping the default gateway once every 30 seconds. I'll see if that's enough to keep it active on the network so it doesn't fall off.

Basically:

chumby:~# cat /psp/rfs1/userhook1
#!/bin/sh

/psp/rfs1/network_keepalive.sh

chumby:~# cat /psp/rfs1/network_keepalive.sh
#!/bin/sh

# Send a ping packet every 30 seconds to keep the network up.
# If we don't have a default gateway then sleep and retry.
# If the network goes down then we should lose the gateway...

# And we do it all in the background

while true
do
  gw=

  # Wait until a default route exists; take the first one if there are several.
  while [ -z "$gw" ]
  do
    gw=$(netstat -rn | awk '$1 == "0.0.0.0" { print $2; exit }')
    if [ -z "$gw" ]
    then
      sleep 5
    fi
  done
  ping -c 1 "$gw" > /dev/null 2>&1
  sleep 30
done &

Fingers crossed, maybe this will keep the chumbys remotely accessible :-)
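The gateway lookup can be tried on its own before trusting it in the background loop. This is only a sketch: `default_gw` is a helper name made up for illustration, and the routing table below is canned sample text with made-up addresses, not output from a real chumby. It matches the Destination column exactly and stops at the first default route, so the result is a single address even if two default routes exist.

```shell
#!/bin/sh
# Print the first default-route gateway found in `netstat -rn` style input.
# Comparing $1 as a string avoids treating the dots as regex wildcards.
default_gw() {
  awk '$1 == "0.0.0.0" { print $2; exit }'
}

# Canned routing table (illustrative addresses only).
sample='Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 wlan0
0.0.0.0         192.168.1.1     0.0.0.0         UG        0 0          0 wlan0'

gw=$(printf '%s\n' "$sample" | default_gw)
echo "gateway: ${gw:-none}"
```

On the device itself the same function would be fed from `netstat -rn | default_gw`.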

Re: Falling off the network

This is pretty neat, and a clever solution. Please let us know how you fare.

Re: Falling off the network

I've seen the same thing in my c1s

they stop responding to arps, and when the arp table ages them off on a local machine, you can't get to them anymore.

I hardcoded their mac addresses into one of my servers, and it can *always* get into the c1s.

once the beta is a bit more stable, I'll pester duane about it ;) or maybe I'll go ahead and put in a bug report now, for him to look at when he feels like it.

Cleaning up any loose bits and bytes.

4 (edited by sweh 2014-04-01 11:07:14)

Re: Falling off the network

... it failed.

As soon as it dropped out of the ARP table on my client machine, the client couldn't see the chumby again.  The router can, but my workstation can't.

Sniffing on the router I see the ARP requests go out on the wifi network, but the chumby never responds.

Hmm, I need a tcpdump for the chumby, so I can run a sniff at the same time.

Re: Falling off the network

try manually adding the chumby to the arp table on your client... I bet it'll respond then, it just won't answer arps.
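For anyone wanting to try this on a Linux client, a rough sketch follows. The IP address, MAC address and interface name are placeholders invented for illustration; substitute the chumby's real values. The helper name `static_arp_cmd` is also made up. The script only prints the command, since actually adding the entry needs root.

```shell
#!/bin/sh
# Hypothetical values: replace with the chumby's real IP and MAC,
# and with the client's own network interface.
CHUMBY_IP=192.168.1.50
CHUMBY_MAC=00:11:22:33:44:55
CLIENT_IF=eth0

# Build the iproute2 command for a permanent neighbour (ARP) entry.
# The older net-tools equivalent is:  arp -s <ip> <mac>
static_arp_cmd() {
  echo "ip neigh replace $1 lladdr $2 dev $3 nud permanent"
}

# Printed rather than executed; run the output as root to apply it.
static_arp_cmd "$CHUMBY_IP" "$CHUMBY_MAC" "$CLIENT_IF"
```

With a permanent entry the client never needs to send an ARP request for the chumby, which is exactly the step that seems to go unanswered.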


Re: Falling off the network

diamaunt wrote:

try manually adding the chumby to the arp table on your client... I bet it'll respond then, it just won't answer arps.

Oh, I'm 99% sure you're correct.  I'm just interested as to whether the incoming ARP request shows up in a tcpdump run on the chumby.  If it doesn't then the problem is likely at the wifi driver layer, and the network stack just doesn't see the packet to respond to it.
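Roughly, the check would be to capture only ARP frames on the chumby while another machine tries to reach it. A sketch, assuming a tcpdump binary is on the PATH and that the wifi interface is called wlan0 (both are assumptions; `arp_sniff_cmd` is just an illustrative helper name):

```shell
#!/bin/sh
# Assumption: wlan0 is the wifi interface; override with IFACE=... if not.
IFACE=${IFACE:-wlan0}

# -n: no name lookups, -e: show link-level (MAC) headers,
# "arp": capture filter that keeps only ARP frames.
arp_sniff_cmd() {
  echo "tcpdump -n -e -i $1 arp"
}

# Only capture when asked (needs root), e.g.  RUN=1 sh arp_sniff.sh
if [ "${RUN:-0}" = 1 ]; then
  exec $(arp_sniff_cmd "$IFACE")
else
  arp_sniff_cmd "$IFACE"
fi
```

If who-has requests for the chumby's IP show up in the capture but no reply goes out, the stack is at fault; if nothing shows up at all, the driver never handed the frame over.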

Re: Falling off the network

good question... I don't know the answer, and don't have a tcpdump for arm :(

I can tell you that the problem doesn't appear to affect the classic (the one that I have) but does affect all three c1 units that I have, or have access to.  I don't know if it affects the i3.


Re: Falling off the network

Cephalopodian tentacles are waving in extreme agitation as one of my chumbys calls upon the power of the Great Cthulhu himself to decipher runic incantations and provide a solution to this problem.  a.k.a. I have a C1 compiling tcpdump from source :-)

Re: Falling off the network

I bow before you!


Re: Falling off the network

Hmm, is there any way to attach things to the forum?  If there is, I can upload tcpdump+libpcap compiled for /psp/usr

Re: Falling off the network

I just use Dropbox and link stuff through there.

12 (edited by sweh 2014-04-01 14:10:36)

Re: Falling off the network

Well, that seems definitive.  With a ping running in one window from a machine that had lost contact with the chumby, I saw this...

# /psp/usr/sbin/tcpdump not port 22 and not port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 96 bytes
18:01:34.726127 08:00:27:e4:8a:14 (oui Unknown) Unknown SSAP 0x7c > 1c:3e:84:03:62:a9 (oui Unknown) Unknown DSAP 0xdc Information, send seq 88, rcv seq 16, Flags [Command], length 117  
18:01:34.734502 08:00:27:e4:8a:14 (oui Unknown) Unknown SSAP 0x7c > 1c:3e:84:03:62:a9 (oui Unknown) Unknown DSAP 0xdc Supervisory, Receiver Ready, rcv seq 16, Flags [Command], length 99
18:01:42.213864 00:25:22:26:13:c0 (oui Unknown) Null > d8:5d:4c:94:e3:fc (oui Unknown) Unknown DSAP 0x08 Information, send seq 0, rcv seq 16, Flags [Command], length 62
18:01:42.223114 00:25:22:26:13:c0 (oui Unknown) Null > d8:5d:4c:94:e3:fc (oui Unknown) Unknown DSAP 0x08 Information, send seq 0, rcv seq 16, Flags [Command], length 239
18:01:42.315334 f8:d1:11:bc:68:7c (oui Unknown) Null > d8:5d:4c:94:e3:fc (oui Unknown) Unknown DSAP 0x0a Information, send seq 0, rcv seq 16, Flags [Command], length 76

I can see packets from a Windows VM to the router, from my VM host and from the router to the _other_ chumby, but no traffic to this chumby!

But when the chumby wakes up and talks to the network... then traffic really flows :-)  I can see the ARP packets, the ICMP packets, the whole thing.

My gut feeling: a driver issue.

Re: Falling off the network

nathanm wrote:

I just use Dropbox and link stuff through there.

The downside to that is that if I delete the stuff then the link breaks.

But, sure, why not... https://dl.dropboxusercontent.com/u/251 … cpdump.tgz

Re: Falling off the network

I've put it on http://files.chumby.com/contrib/falconwing/tcpdump.tgz


Re: Falling off the network

Ta!

Re: Falling off the network

FWIW, it's possible for IPv4 and IPv6 to get "out of sync": a "ping" to the chumby can work while a "ping6" doesn't (typically when the IPv4 connection is in use, e.g. an ssh connection, but the IPv6 side has been left idle).  I assume this is a symptom of the same problem.  So, just a data point!
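A quick way to see the two protocols disagree is to probe both from the same client. This is a sketch only: `probe` is a made-up helper, it assumes the client has separate `ping` and `ping6` commands, and the addresses in the usage comment are placeholders.

```shell
#!/bin/sh
# Report whether a host answers over IPv4 and over IPv6, independently.
# Usage: probe <ipv4-address> <ipv6-address>
probe() {
  if ping -c 1 "$1" > /dev/null 2>&1; then v4=up; else v4=down; fi
  if ping6 -c 1 "$2" > /dev/null 2>&1; then v6=up; else v6=down; fi
  echo "ipv4=$v4 ipv6=$v6"
}

# Example call (placeholder addresses, substitute the chumby's own):
#   probe 192.168.1.50 fd00::50
```

A link-local IPv6 address usually needs the interface appended (for example `fe80::...%eth0`) before ping6 will accept it.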