This is a Perl script that checks whether it can ping the default gateway and obtain its MAC address. If it cannot do so for longer than a preset limit, it reboots the machine it runs on (the host, not the router).
Standard tools are used:
- arp
- cat
- ifconfig
- ping
- route
- reboot
The system uptime is read from /proc/uptime.
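As a minimal sketch of the uptime lookup, the snippet below reads the first field of /proc/uptime, which is the number of seconds since boot. The helper names parse_uptime and read_uptime are hypothetical and not taken from monitor.pl, which may well obtain the value differently (for example through cat).

```perl
use strict;
use warnings;

# Hypothetical helper: extract the seconds-since-boot field from the
# contents of /proc/uptime, e.g. "12345.67 54321.00\n".
# Returns undef if the text does not start with a number.
sub parse_uptime {
    my ($text) = @_;
    return $text =~ /^\s*(\d+(?:\.\d+)?)\s/ ? $1 : undef;
}

# Hypothetical helper: read the uptime from a file. The path is a
# parameter so the function can be exercised on systems without /proc.
sub read_uptime {
    my ($path) = @_;
    $path = '/proc/uptime' unless defined $path;
    open(my $fh, '<', $path) or return undef;
    my $line = <$fh>;
    close($fh);
    return defined $line ? parse_uptime($line) : undef;
}

my $uptime = read_uptime();
print "uptime: $uptime seconds\n" if defined $uptime;
```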
theory of operation
A cron job running every 2 minutes checks whether the default router answers a ping and shows up in the arp table. The default router is inferred with the 'route' command from the interface given on the command line.
We also send a broadcast ping, just to see what MAC information for the default router ends up in the arp cache.
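The gateway and MAC lookups described above could be sketched as follows. This is an illustration, not the code from monitor.pl: the function names default_gateway and gateway_mac are hypothetical, and the parsing assumes the column layout printed by the net-tools versions of 'route -n' and 'arp -n'. As noted under project files, the pattern matches may need adjusting on other systems.

```perl
use strict;
use warnings;

# Hypothetical helper: return the default gateway for $iface from the
# text printed by `route -n`, or undef if there is none.
sub default_gateway {
    my ($route_output, $iface) = @_;
    for my $line (split /\n/, $route_output) {
        my @f = split ' ', $line;
        # Assumed columns: Destination Gateway Genmask Flags Metric Ref Use Iface
        next unless @f == 8;
        next unless $f[0] eq '0.0.0.0' && $f[7] eq $iface;
        next unless $f[3] =~ /G/;    # the G flag marks a gateway route
        return $f[1];
    }
    return undef;
}

# Hypothetical helper: return the MAC address listed for $ip in the text
# printed by `arp -n`, or undef if the entry is missing or incomplete.
sub gateway_mac {
    my ($arp_output, $ip) = @_;
    for my $line (split /\n/, $arp_output) {
        my @f = split ' ', $line;
        next unless @f && $f[0] eq $ip;
        for my $field (@f) {
            return lc $field
                if $field =~ /^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$/;
        }
    }
    return undef;
}
```

On a live system the functions would be fed the command output, for example default_gateway(scalar `route -n`, 'eth0'). Taking the text as an argument keeps the parsing testable without network access.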
If the default router cannot be reached, an entry is appended to an error log. Each entry is the Unix time followed by the uptime of the machine at the time of the failure. The script then looks for the time of the first failure, which is stored in a file. If the file exists, the stored time is compared with the current time, and if the preset limit has been reached, a reboot is initiated. If the file does not exist, it is created. Any successful run of the script deletes the first-failure file.
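The bookkeeping described above can be sketched as a single function that updates the two files and reports what should happen next. Everything here is an assumption made for illustration: the name record_result, its named parameters, the space between the two log fields, the first-failure file holding one Unix time per line, and the limit being expressed in seconds. The actual reboot is left to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: update the state files for one run.
# Named arguments: dir, iface, reachable, now, uptime, limit (seconds).
# Returns 'ok', 'failed', or 'reboot'.
sub record_result {
    my (%arg) = @_;
    my $log   = "$arg{dir}/$arg{iface}.log";
    my $first = "$arg{dir}/$arg{iface}.first";

    if ($arg{reachable}) {
        # Any successful run clears the first-failure marker.
        unlink $first if -e $first;
        return 'ok';
    }

    # Append "<unix time> <uptime>" to the error log.
    open(my $lfh, '>>', $log) or die "cannot append to $log: $!";
    print {$lfh} "$arg{now} $arg{uptime}\n";
    close($lfh);

    # If a first-failure time is stored, compare it with the current time.
    if (open(my $rfh, '<', $first)) {
        my $since = <$rfh>;
        close($rfh);
        if (defined $since && $since =~ /^(\d+)/) {
            return ($arg{now} - $1 >= $arg{limit}) ? 'reboot' : 'failed';
        }
    }

    # No usable first-failure time is stored yet, so record this one.
    open(my $wfh, '>', $first) or die "cannot write $first: $!";
    print {$wfh} "$arg{now}\n";
    close($wfh);
    return 'failed';
}
```

A caller might pass now => time() and run the reboot command only when the function returns 'reboot'.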
monitor.pl command line arguments
One argument is required: the name of the network interface to monitor. For example, run 'monitor.pl eth0'.
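Because the interface name ends up in file names and in the commands that are run, it is worth validating before use. The check below is only a sketch: parse_args is a hypothetical name, and the set of characters accepted in an interface name is an assumption, not something taken from monitor.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: accept exactly one argument that looks like an
# interface name (assumed character set: letters, digits, '_', '.', ':', '-').
# Returns the name, or undef if the arguments are not acceptable.
sub parse_args {
    my @args = @_;
    return undef unless @args == 1;
    return $args[0] =~ /^([A-Za-z0-9_.:-]+)$/ ? $1 : undef;
}

# Typical use at the top of a script:
#   my $iface = parse_args(@ARGV);
#   die "usage: monitor.pl <interface>\n" unless defined $iface;
```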
output files generated
Up to two files are generated, named after the interface being monitored. If eth0 is specified on the command line, the files are eth0.log (the error log) and eth0.first (the time of the first failure).
project files
This script was developed on a Debian 3.1 machine. It should work on any other Linux machine, but the paths and pattern matches may have to be adjusted.
the files
monitor.pl - main Perl program
sample time-of-first-error file
A crontab entry to run the script could look like:
*/2 * * * * /cronjobs/arp_monitor/monitor.pl eth0 1> /dev/null 2> /dev/null