We are running 1.3b15 on an ALIX (Netgate m1n1wall) box on a network of about 100 residential computers. The box has been working great for several months, but in the last few weeks we have been getting "hangs" at least once per day. I have 2 identical Netgate boxes and have swapped back and forth between them, but both exhibit identical behavior.
When I say hang, I mean that traffic is no longer being forwarded to/from the WAN and it is not possible to reach the web interface. Rebooting by power cycling the box resolves the problem for a while. Nothing of interest appears in the logs (we enabled remote syslog).
We made one recent change that coincides with the start of this problem: we enabled DNS forwarding. We have one internal email/web server on the list of exceptions for DNS forwarding, under two different host names that share the same IP address on the internal NATed subnet attached to the LAN port:
intertoad  ev.ithaca.ny.us  192.168.243.1
toad-hall  ev.ithaca.ny.us  192.168.243.1
If it matters, this host is dual-homed: it also sits on a different, publicly accessible subnet which is not attached to the m0n0wall. This subnet used to be attached to the 3rd LAN interface of the m0n0wall, but we have since disconnected it (although the configuration entries for that interface were not removed until today).
Can you recommend any avenues to explore to determine the cause of this phenomenon? Is there a way to improve the logging (the logs always seem to get cleared with the reboot--is there some way to have them persist locally between reboots)? If we attach to the serial port of the box, could we see anything useful?
Obviously, disabling DNS forwarding would be a good test, but we need to provide this functionality somehow to keep our network operating.
Any suggestions will be welcome.
Regards,
Jeff