Topic: m0n0wall having frequent failures on different hardware  (Read 2194 times)
« on: May 09, 2007, 00:39:57 »
gr7000 *
Posts: 3

I've tried to post this to the mailing list twice this week, but it seems that I can't post to the list for some reason, so I'll post this here:

Hi all,
I've been using m0n0wall on my home network for the past couple years.  At
first it worked flawlessly, but several months ago, it began to fail.  It
would simply crash, with no response to network or console commands.  First,
it would happen only occasionally.  It grew to be a daily occurrence a few
weeks ago.  I blamed the failures on the old hardware I was running it on and
bought a Soekris Net 4801 box.

This was installed a few days ago and is set up the same way as the old box. 
It fails too, in exactly the same way.  There are no log messages on my
logging server, the box will not respond to any network traffic, and the
serial console echoes input but does nothing else.  The error light is not on,
and the watchdog timer does not reset the box (does m0n0wall support that?). 
When I reset the box, it boots right up again and runs for maybe 12-48 hours
more without crashing.

I have no more information than that, since it becomes unresponsive and has to
be reset.  IIRC, there are no errors of any kind printed out on the console.

I am using the old config file on the new router and have it connected to the
same equipment as the old one (a Zoom DSL modem in bridge mode, a Linksys
WAP11, and an NPI layer 3 switch).  It is running the most recent release of
m0n0wall.

The only major features I am using are: routing, NAT, firewall, traffic
shaping, and the SNMP agent.

Any thoughts on how to fix this?

-Rob
« Reply #1 on: May 09, 2007, 01:24:16 »
cmb *****
Posts: 851

Try the 1.3 beta version. It's known that there is something that causes hard freezes for a small number of users, but it's not known what causes it and the vast majority of us cannot replicate it. 1.3 has been reported to fix it.
« Reply #2 on: May 09, 2007, 02:06:34 »
gr7000 *
Posts: 3

OK.  I upgraded to the beta version.  I guess if it is still up in a couple weeks, it worked.
« Reply #3 on: May 16, 2007, 15:11:08 »
bitonw **
Posts: 79

Is it possible that something in your config file is causing this?

Why not reset the Soekris to factory defaults and create a new config instead of reusing the old one?
« Reply #4 on: May 22, 2007, 08:06:30 »
gr7000 *
Posts: 3

Well, so far it has been running fine with the beta firmware.  Of course, it may fail right after I write this, but that seems to have fixed it.  Thanks for the advice about that!

Oh, FWIW, I read the mailing list posts, and my guess is that the problem may be related to the firewall tracking a large number of connection states.  I have sometimes had several thousand listed on the status page.  I have been writing an experimental search engine, and its web crawler is largely responsible for that.  From what I read on the mailing list, the problem also seems to correlate with P2P use.  Both of these make frequent short-duration connections to many hosts.  Perhaps a race condition, an overflow, or something similar occurs when the firewall is handling too many new connections too quickly?
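
To put rough numbers on that guess: by Little's law, the average number of entries in a state table is about the rate of new connections multiplied by how long each entry is kept. The sketch below is only an illustration of that arithmetic. The rate and the linger time are made-up assumptions, not values measured on this box or taken from m0n0wall's settings, and `estimated_states` is a hypothetical helper.

```python
def estimated_states(new_conns_per_sec: float, state_lifetime_sec: float) -> float:
    """Little's law: average entries = arrival rate * average time an entry is kept."""
    return new_conns_per_sec * state_lifetime_sec


# Hypothetical numbers for illustration only (not measured on the router).
crawler_rate = 50.0          # new connections per second (assumed)
state_linger_sec = 60.0      # seconds a short-lived connection's state is kept (assumed)

print(estimated_states(crawler_rate, state_linger_sec))
```

With those assumed values the estimate lands in the low thousands, the same order of magnitude as the counts seen on the status page, so a crawler or P2P client could plausibly keep the table that full. It says nothing about whether the table size is what actually triggers the freeze.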
 