From: Rahul on 10 Jul 2010 17:38

I did a tcpdump like so:

tcpdump -c 1000 -ennqti eth3 \( arp or icmp \)

In a one minute period I get 1000 ARP requests. Is this normal? I reproduce below the traffic in case this helps diagnosis. The network is static, no new devices are being added or removed. The MAC<->IP association is also static. Why is there such a lot of ARP traffic or is this normal?

The network has ~265 servers. There is only a single physical network but twin subnets: 10.0.x.x (primary traffic) and 172.16.x.x (monitoring), i.e. each server has a single physical card but it responds to two MAC and IP addresses.

Would increasing the size of my ARP cache be a solution? I'm a bit confused because (as I understand ARP caching) my ARP cache size is set to 512 or 1024 (not sure which) but the actual ARP table seems to have only 265 entries (values below). Or is my understanding of ARP wrong?

cat /proc/net/arp | wc -l
265
ip neigh | wc -l
264
cat /proc/sys/net/ipv4/neigh/default/gc_thresh2
512
cat /proc/sys/net/ipv4/neigh/default/gc_thresh3
1024

############################
00:26:b9:58:ec:29 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.2.5 is-at 00:26:b9:58:ec:29
00:26:b9:58:ec:2a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.0.11
00:26:b9:58:ec:2c > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.0.11 is-at 00:26:b9:58:ec:2c
00:26:b9:58:ec:48 > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.1.66
00:26:b9:58:ec:4a > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.1.66 is-at 00:26:b9:58:ec:4a
00:26:b9:58:ec:56 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.1.12 is-at 00:26:b9:58:ec:56
00:26:b9:58:ec:5a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.0.52
################################

--
Rahul
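
One rough way to get a feel for a capture like the one above is to tally the ARP frames by operation (request or reply) and by destination (broadcast or unicast), and to count which IP addresses are being asked for most often. The sketch below is only an illustration: the regular expression assumes the exact line format shown in the post (newer tcpdump versions print the operation differently), and the names `summarize_arp`, `LINE_RE` and `SAMPLE` are made up for this example.

```python
import re
from collections import Counter

# Matches lines in the format shown in the post, e.g.
# "00:26:b9:58:ec:2a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.0.11"
# Assumption: other tcpdump versions may word the ARP operation differently.
LINE_RE = re.compile(
    r"^(?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}), ARP, length \d+: "
    r"arp (?P<kind>reply|who-has) (?P<ip>\d+\.\d+\.\d+\.\d+)"
)
BROADCAST = "ff:ff:ff:ff:ff:ff"


def summarize_arp(lines):
    """Return (counts by (operation, cast), counts of requested IPs)."""
    kinds = Counter()
    targets = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip ICMP or anything else that does not parse
        op = "request" if m["kind"] == "who-has" else "reply"
        cast = "broadcast" if m["dst"] == BROADCAST else "unicast"
        kinds[(op, cast)] += 1
        if op == "request":
            targets[m["ip"]] += 1
    return kinds, targets


# The seven capture lines quoted in the post.
SAMPLE = [
    "00:26:b9:58:ec:29 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.2.5 is-at 00:26:b9:58:ec:29",
    "00:26:b9:58:ec:2a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.0.11",
    "00:26:b9:58:ec:2c > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.0.11 is-at 00:26:b9:58:ec:2c",
    "00:26:b9:58:ec:48 > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.1.66",
    "00:26:b9:58:ec:4a > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.1.66 is-at 00:26:b9:58:ec:4a",
    "00:26:b9:58:ec:56 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply 172.16.1.12 is-at 00:26:b9:58:ec:56",
    "00:26:b9:58:ec:5a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has 10.0.3.2 tell 10.0.0.52",
]

if __name__ == "__main__":
    kinds, targets = summarize_arp(SAMPLE)
    for key, count in kinds.most_common():
        print(key, count)
    print("most requested:", targets.most_common(3))
```

Feeding it the full 1000-line capture instead of `SAMPLE` (for instance by reading lines from a saved file) would show whether a handful of addresses account for most of the requests.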
From: Chris Cox on 10 Jul 2010 21:15

Rahul wrote:
> I did a tcpdump like so:
>
> tcpdump -c 1000 -ennqti eth3 \( arp or icmp \)
>
> In a one minute period I get 1000 ARP requests. Is this normal? I
> reproduce below the traffic in case this helps diagnosis. The network is
> static, no new devices are being added or removed. The MAC<->IP
> association is also static. Why is there such a lot of ARP traffic or is
> this normal?

In general, I'd say pretty normal. Things are always making queries... who-has messages abound, as well as i-have messages.
From: Moe Trin on 11 Jul 2010 14:27

On Sat, 10 Jul 2010, in the Usenet newsgroup comp.os.linux.networking, in article <Xns9DB1A958CCCBE6650A1FC0D7811DDBC81(a)188.40.43.230>, Rahul wrote:

> In a one minute period I get 1000 ARP requests. Is this normal?

Depends. How "busy" is the network - how many hosts talking to how many hosts how often?

> The network is static, no new devices are being added or removed. The
> MAC<->IP association is also static. Why is there such a lot of ARP
> traffic or is this normal?

RFC1122 Section 2.3.2.1 ARP Cache Validation

BRIEFLY - ARP is used to resolve IP->MAC. The querying and answering systems will keep an individual entry for on the order of one minute. For the Linux kernel, this is NORMALLY a compile-time setting. You may be able to increase the timeout.

> The network has ~265 servers. There is only a single physical network
> but twin subnets: 10.0.x.x (primary traffic) and 172.16.x.x
> (monitoring), i.e. each server has a single physical card but it
> responds to two MAC and IP addresses.

Overlaying networks rarely serves any useful purpose other than to increase overhead. Are you sure this is needed? It's bad enough with 265 hosts in one collision domain, never mind 530. (Our original subnet mask was 255.255.252.0 allowing 1000 hosts per segment - in 1994, we installed Etherswitches to break the coax into segments with no more than 50 hosts per, resulting in significant improvement in network speed.) Doing a traffic analysis (who is talking to who) could be a real eye-opener, suggesting a more efficient layout. Look at RFC0950 (Internet Standard Subnetting Procedure) and related documents (RFC0917, RFC0925, RFC0932, RFC0936, RFC0940 and even RFC1027) which should provide useful background.

> Would increasing the size of my ARP cache be a solution? I'm a bit
> confused because (as I understand ARP caching) my ARP cache size is
> set to 512 or 1024 (not sure which) but the actual ARP table seems
> to have only 265 entries (values below). Or is my understanding of ARP
> wrong?

ARP is used when host A wants to talk to host B. If it doesn't need to talk to B, why should it be caching B's MAC? Also, how is your network _physically_ connected? Is this coax (10Base2 or 10Base5) or twisted pairs with a _hub_ junction? Every host on such a "party line" hears everyone else, and ARP _may_ be configured to cache ARP replies heard from "other" systems. If the network is using switches, _broadcast_ packets are heard by all (depending on the switch), while _unicast_ packets (ARP replies) are heard only by the "interested" party. If using switches, you also need to look at the timeouts in the individual switches.

> 00:26:b9:58:ec:29 > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply
> 172.16.2.5 is-at 00:26:b9:58:ec:29
> 00:26:b9:58:ec:2a > 00:26:b9:58:d7:2f, ARP, length 60: arp who-has
> 10.0.3.2 tell 10.0.0.11
> 00:26:b9:58:ec:2c > ff:ff:ff:ff:ff:ff, ARP, length 60: arp reply
> 172.16.0.11 is-at 00:26:b9:58:ec:2c

[compton ~]$ etherwhois 00:26:b9
00-26-B9   (hex)        Dell Inc
0026B9     (base 16)    Dell Inc
                        One Dell Way, MS RR5-45
                        Round Rock Texas 78682
                        UNITED STATES
[compton ~]$

My condolences.

Something is fucked with your capture data. Example - the first line shows Dull 58:ec:29 _broadcasting_ an ARP reply. That should be a unicast from Dull 58:ec:29 to the MAC of the querying system. In the second line, Dull ec:2a sends a _unicast_ query asking who is "10.0.3.2". That should be a broadcast unless this is a reconfirm. You may also want to look at RFC0826, which is the specification for ARP referenced in RFC1122.

0826 Ethernet Address Resolution Protocol: Or Converting Network Protocol Addresses to 48.bit Ethernet Address for Transmission on Ethernet Hardware. D. Plummer. November 1982. (Format: TXT=21556 bytes) (Updated by RFC5227, RFC5494) (Also STD0037) (Status: STANDARD)

1122 Requirements for Internet Hosts - Communication Layers. R. Braden, Ed.. October 1989. (Format: TXT=295992 bytes) (Updates RFC0793) (Updated by RFC1349, RFC4379) (Also STD0003) (Status: STANDARD)

If the number of ARP packets is a concern, look at increasing the ARP timeout, or simply bite the bullet and use permanent entries in the arp cache (man arp, look at the -s and/or -f options).

Old guy
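
Before changing any timeout it helps to see what the host is currently using. The original post read gc_thresh2 and gc_thresh3 from /proc/sys/net/ipv4/neigh/default/; on Linux the same directory also holds timing knobs such as base_reachable_time_ms and gc_stale_time. The sketch below simply reads those files. It assumes a Linux host with /proc mounted, and `read_neigh_settings` and `NEIGH_KEYS` are hypothetical names chosen for this example; a missing or unreadable file is reported as None rather than guessed at.

```python
from pathlib import Path

# Neighbour (ARP) cache settings to inspect. The gc_thresh values appear in
# the original post; the two timing entries are further files in the same
# directory on Linux.
NEIGH_KEYS = (
    "base_reachable_time_ms",
    "gc_stale_time",
    "gc_thresh1",
    "gc_thresh2",
    "gc_thresh3",
)


def read_neigh_settings(base="/proc/sys/net/ipv4/neigh/default"):
    """Read each setting as an int; use None where a file is missing or unreadable."""
    settings = {}
    for key in NEIGH_KEYS:
        try:
            settings[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            settings[key] = None
    return settings


if __name__ == "__main__":
    for key, value in read_neigh_settings().items():
        print(f"{key} = {value}")
```

Per-interface values live in sibling directories named after the interface, so passing a different `base` (for example the directory for eth3) would show whether that interface overrides the defaults.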
From: Rahul on 12 Jul 2010 15:31

ibuprofin(a)painkiller.example.tld.invalid (Moe Trin) wrote in news:slrni3k385.dip.ibuprofin(a)compton.phx.az.us:

Thanks Moe for a detailed analysis!

> BRIEFLY - ARP is used to resolve IP->MAC. The querying and answering
> systems will keep an individual entry for on the order of one minute.
> For the Linux kernel, this is NORMALLY a compile-time setting. You
> may be able to increase the timeout.

Why is the cache maintained on a time basis? Isn't it more logical to specify the max number of ARP cache entries? Or are the two approaches identical?

>> In a one minute period I get 1000 ARP requests. Is this normal?
>
> Depends. How "busy" is the network - how many hosts talking to how
> many hosts how often?

I know there are ~265 physical servers and x2 = 530 IP addresses. The 10.0.x.x should be fairly busy. But I have no way to quantify it right now. In fact, what tool does one use to answer the question you raised: "How "busy" is the network?"

Maybe the answer is in the RFCs you quoted. I'm reading them now. But if anyone has pointers as to how to answer the above question please do tell. I don't have access to the switches so can't get any switch-side stats, unfortunately. All monitoring will have to be server-side.

> Overlaying networks rarely serves any useful purpose other than to
> increase overhead. Are you sure this is needed?

I am not sure. Maybe my design decision was wrong. The situation is that we have normal traffic as well as IPMI (maintenance mode) traffic piggybacking over the same physical wire and adapters. Conceptually I thought it made sense to keep those separate? But I am open to suggestions if this was a bad idea.

> It's bad enough
> with 265 hosts in one collision domain, never mind 530.

But that is only relevant for broadcast traffic, correct? Unicast traffic will be intelligently handled by the switch so that the collision domain is only equal to the number of switch ports? Pardon my networking ignorance if this is wrong.

> improvement in network speed.). Doing a traffic analysis (who is
> talking to who) could be a real eye-opener, suggesting a more
> efficient layout.

Is tcpdump the tool of choice for this? Or wireshark? Or something else?

> ARP is used when host A wants to talk to host B. If it doesn't need
> to talk to B, why should it be caching B's MAC?

Is there a downside to having a larger ARP cache? I mean sure, it takes more memory, but these days RAM is cheap and anyway a 1000-row IP<->MAC lookup table is not big.

> Also, how is your
> network _physically_ connected? Is this coax (10Base2 or 10Base5)

It's a 1GigE ethernet cable. I think it's CAT5e (1000BASE-T).

> cache ARP replies heard from "other" systems. If the network is
> using switches, _broadcast_ packets are heard by all (depending on

The network is switched. Each switch takes around 48 hosts so we have 6 Cisco Catalyst switches interconnected with 10GigE fiber links.

> the switch), while _unicast_ packets (ARP replies) are heard only
> by the "interested" party. If using switches, you need also look at
> the timeouts in the individual switches as well.

Ah! Thanks! I didn't realize the switches have an ARP cache timeout too. Makes sense. I'll ask my networking folks about that.

> My condolences.

For using Dell? :) I'm confused.

> Something is fucked with your capture data. Example - the first line
> shows Dull 58:ec:29 _broadcasting_ an ARP reply. That should be a
> unicast from Dull 58:ec:29 to the MAC of the querying system. In
> the second line, Dull ec:2a sends a _unicast_ query asking who is
> "10.0.3.2". That should be a broadcast unless this is a reconfirm.
> You may also want to look at RFC0826, which is the specification for
> ARP referenced in RFC1122.

Wow! You are right. I never noticed this. I will definitely dig deeper into this. Something is not right.

--
Rahul
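
For a server-side look at "who is talking to who", one option is to capture IP traffic with tcpdump and count packets per pair of hosts. The sketch below is a minimal illustration, not a finished tool. It assumes lines in the form printed by something like `tcpdump -nnqt` without `-e` ("IP src.port > dst.port: ..."), and the sample lines and the names `tally_conversations`, `PAIR_RE` and `SAMPLE` are invented for this example rather than taken from the thread.

```python
import re
from collections import Counter

# Assumed line shape: "IP 10.0.0.11.40000 > 10.0.3.2.2049: tcp 1448"
# The port suffix is optional so that ICMP lines (no port) also match.
PAIR_RE = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)(?:\.\d+)? > (\d+\.\d+\.\d+\.\d+)(?:\.\d+)?:"
)


def tally_conversations(lines):
    """Count packets per unordered pair of IP addresses."""
    pairs = Counter()
    for line in lines:
        m = PAIR_RE.search(line)
        if not m:
            continue  # ARP and other non-IP lines are ignored
        pair = tuple(sorted((m.group(1), m.group(2))))
        pairs[pair] += 1
    return pairs


# Invented sample lines, for illustration only.
SAMPLE = [
    "IP 10.0.0.11.40000 > 10.0.3.2.2049: tcp 1448",
    "IP 10.0.3.2.2049 > 10.0.0.11.40000: tcp 0",
    "IP 10.0.1.66.40001 > 10.0.3.2.2049: tcp 512",
    "IP 10.0.0.11 > 10.0.0.52: ICMP echo request, id 1, seq 1, length 64",
]

if __name__ == "__main__":
    for pair, count in tally_conversations(SAMPLE).most_common():
        print(pair, count)
```

Because a switched port only sees its own unicast traffic plus broadcasts, a tally like this describes one server's conversations; running it on several servers would be needed to approximate a network-wide picture.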
From: Rick Jones on 12 Jul 2010 17:03

Rahul <nospam(a)nospam.invalid> wrote:
> Why is the cache maintained on a time basis?

It helps to bound "fail-over" time when an IP is migrated from being associated with one MAC address to another.

> Is tcpdump the tool of choice for this? Or wireshark? Or something
> else?

If one is a fan of Star Trek "TOS", tcpdump can be thought of as the mnemonic memory circuits made from stone knives and bearskins. It is a basic CLI (command-line interface) packet capture utility. Wireshark adds a gooey and whatnot. They both use libpcap to perform actual packet capture. The differences would be in what they can decode and how they display it.

> Ah! Thanks! I didn't realize the switches have a ARP cache timeout too.
> Makes sense. I'll ask my networking folks about that.

Indeed, anything with an ARP cache needs to have a way to keep it up-to-date.

rick jones
--
oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...