Topic: Hang with kernel panic  (Read 7184 times)
« on: March 13, 2009, 16:49:29 »
fjgalesloot *
Posts: 6

Hi,

I have three m0n0wall 1.235 instances running on VMware machines: two on VMware Server 1.0.5 and one on VMware ESXi 3.5 with all patches. I'm having trouble with the one running on ESXi.
We have had 3 crashes in the 2 weeks it has been running, while the ones on VMware Server have been up for 65 and 66 days respectively. We are in the process of deciding whether to migrate all the machines to ESXi or Citrix XenServer. Is there a known incompatibility between m0n0wall and VMware ESXi?

After the first 2 crashes, the m0n0wall no longer responded. The last time, it just rebooted out of the blue. I only have one log entry from the latest crash (reboot):

Code:
Mar 13 15:45:10 /kernel: Rebooting...
Mar 13 15:45:10 /kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Mar 13 15:45:10 /kernel: Uptime: 2d21h45m23s
Mar 13 15:45:10 /kernel: done
Mar 13 15:45:10 /kernel: syncing disks...
Mar 13 15:45:10 /kernel: 
Mar 13 15:45:10 /kernel: panic: sbdrop


Is there some debugging I can do?

Thanks for your help!

Floris Jan
« Reply #1 on: March 18, 2009, 09:39:18 »
knightmb ****
Posts: 341

Well, when you mix in a virtual environment, it's going to be tough to sort out whether it's really m0n0wall crashing, the VM doing something that makes it crash, or some hardware issue that only shows up under the VM or only shows up when running m0n0wall straight on the hardware.

I use several m0n0wall machines across many hardware types, and they have near-infinite uptime under heavy loads at the many large businesses where they are installed. The only time they go down is for an upgrade or a power outage in which the UPS didn't last long enough.

Are you running PPTP on the machines?

Can you run m0n0wall straight on the hardware and remove the VM environment variable?

Radius Service for m0n0wall Captive Portal - http://amaranthinetech.com
« Reply #2 on: March 18, 2009, 16:00:50 »
fjgalesloot *
Posts: 6

Quote from: knightmb
Are you running PPTP on the machines?

Can you run m0n0wall straight on the hardware and remove the VM environment variable?

Yes, we are running PPTP on the machine that's giving the problems. When the m0n0wall freezes, the other machines on the same VMware ESXi platform continue to run without any interruption. These machines are Linux (CentOS/RHEL) and Windows 2008 machines.

Because the hardware is used for a lot of other virtual machines, it is not possible to run the m0n0wall directly on this hardware. A dual quad-core with 10GB memory is a bit heavy for just a m0n0wall firewall.

Is it advisable to run the 1.3b version of m0n0wall? I noticed there is a special VMware image for the new version. In what stage is the beta version? Is it almost ready to be called stable?
« Reply #3 on: March 18, 2009, 18:54:31 »
Manuel Kasper
Administrator
*****
Posts: 364

I would definitely recommend using 1.3b on VMware ESX. I've been running FreeBSD 6.3 (on which m0n0wall is based) on ESX 3.5 at work for quite a while now, and it works very well (though one needs to manually configure e1000 network emulation in the .vmx file for decent performance).

1.3b15 is pretty stable, probably no less than 1.235, and I plan on re-releasing 1.3b15 as 1.3 with only minor changes soon (getting down to zero bug reports is of course impossible, as somebody is always going to have some strange problem somewhere).
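
For readers following along, the e1000 setting mentioned above is usually a single line in the VM's .vmx file. A minimal sketch, assuming the first virtual NIC is named ethernet0 (the adapter index is an assumption, so adjust it to match your VM) and that the VM is powered off while the file is edited:

```
ethernet0.virtualDev = "e1000"
```

Each additional adapter (ethernet1, ethernet2, and so on) would need the same line with its own index.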
« Reply #4 on: March 19, 2009, 00:48:02 »
knightmb ****
Posts: 341

The reason I asked about PPTP was because of the kernel error you posted. How many PPTP clients do you have going at once (light load or heavy load)?

« Reply #5 on: March 19, 2009, 09:53:24 »
fjgalesloot *
Posts: 6

@knightmb
The firewall is not experiencing heavy load. Perhaps 2-3 PPTP sessions at once, max. The environment it's protecting is now in a pre-production phase, so it will be more heavily loaded soon, but only traffic-wise; the number of PPTP tunnels will probably stay below 5. Are there known issues with heavy PPTP loads?

@Manuel
Is version 1.3b based on a different FreeBSD version than 1.2?
I will upgrade to 1.3b with the e1000 emulation in ESXi as soon as possible. It's good to know that it's in a final beta stage.


Thank you both for your replies!

« Reply #6 on: March 19, 2009, 12:21:22 »
Manuel Kasper
Administrator
*****
Posts: 364

Quote from: fjgalesloot
Is version 1.3b based on a different FreeBSD version than 1.2?

Yes. 1.3b is based on FreeBSD 6.3, while 1.23x is based on FreeBSD 4.11.

- Manuel
« Reply #7 on: March 19, 2009, 16:41:08 »
knightmb ****
Posts: 341

Quote from: fjgalesloot
@knightmb
The firewall is not experiencing heavy load. Perhaps 2-3 PPTP sessions at once, max. The environment it's protecting is now in a pre-production phase, so it will be more heavily loaded soon, but only traffic-wise; the number of PPTP tunnels will probably stay below 5. Are there known issues with heavy PPTP loads?

I'd read some threads about FreeBSD sometimes crashing with PPTP enabled under a heavy load, but from what I read, it might have been a memory issue (ran out of RAM and had nowhere to go).

m0n0wall doesn't use virtual memory, but going by the system specs you plan on using, I don't see you running out of RAM.

Tough to say, maybe some realtime components freak out if there is any delay. I've also read that some hardware can cause this by getting overloaded. You would see a bunch of errors in the m0n0wall system log before a crash happened that read like "tx underrun, increasing tx start threshold to 120 bytes" or similar.
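
As a rough aid for the kind of log checking described above, here is a minimal Python sketch that scans saved log lines for the two signatures mentioned in this thread: the "panic:" line from the original post and the "tx underrun" warning. It assumes the m0n0wall system log has been saved as plain text somewhere (for example on a remote syslog host); the function name, pattern names, and sample data are hypothetical and only for illustration.

```python
import re

# Illustrative patterns only: "panic:" comes from the log posted in this
# thread, "tx underrun" from the warning text quoted above.
PATTERNS = {
    "panic": re.compile(r"/kernel: panic: .+"),
    "tx_underrun": re.compile(r"tx underrun"),
}


def scan_log(lines):
    """Return a dict mapping each pattern name to the list of matching lines."""
    hits = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


# Sample input taken from the log excerpt in the first post.
sample = [
    "Mar 13 15:45:10 /kernel: Rebooting...",
    "Mar 13 15:45:10 /kernel: syncing disks...",
    "Mar 13 15:45:10 /kernel: panic: sbdrop",
]
result = scan_log(sample)
print(result["panic"])
print(len(result["tx_underrun"]))
```

To check a saved log file instead of the sample list, pass an open file object to scan_log, since it only needs an iterable of lines.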

« Reply #8 on: April 23, 2009, 14:07:16 »
fjgalesloot *
Posts: 6

Just to let you know:

We are running 1.3b15 and 1.3b16 now without any failures on our ESXi servers so far.
Using PPTP and IPsec as well.

Thanks for your help!

Floris Jan

 