Microsoft Forefront TMG - BSOD's and fweng.sys
Hello there.
What I'm going to try and explain now may be very vague, but I just want to see if anyone else has had the same issue as me, so here goes!
We have a Server 2008 Standard server with Forefront TMG installed on it. I have configured web chaining out to an upstream proxy, provided by our local council, that the entire site uses. I have added all the rules into the firewall that we need as a school, and the net connection works exactly how I would expect it to.
The issue I have is that whenever any change is made to any aspect of networking on the server, I get a BSOD telling me that fweng.sys is the problem (I don't have the dump log to hand, but I can get it if anyone thinks they might be onto something).
The server can run in a stable fashion if the following is done:
Disconnect internal and external interfaces from server
Turn on server and leave to boot into Windows
Log on and leave to settle
Reconnect network cables one at a time
Check that internal shares can be reached
Check that external websites can be reached
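The last two checks in the list above could be scripted rather than done by hand. What follows is only a minimal sketch, assuming a plain TCP connection test is good enough: the host names in the comments are hypothetical placeholders, port 445 stands in for the file shares (SMB), and port 80 for an external website. It tests reachability only and says nothing about whether TMG itself is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def run_checks(targets, timeout=3.0):
    """Check each (label, host, port) target and return {label: reachable}."""
    return {
        label: can_connect(host, port, timeout)
        for label, host, port in targets
    }


# Example usage (host names are hypothetical, replace with your own):
# results = run_checks([
#     ("internal share (SMB)", "fileserver.example.local", 445),
#     ("external website (HTTP)", "www.example.com", 80),
# ])
# for label, ok in results.items():
#     print(label, "reachable" if ok else "NOT reachable")
```

Running it from a client machine after each cable is reconnected would show which interface brought which service back.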
Once it's up and running, it serves the whole site perfectly until you try to make any changes to it. I can log onto the server and use it in any fashion I wish. So long as I don't try to make any changes in Forefront or adjust any network settings, I can be sure it won't "fall over" again. Obviously this is no good for us, as we are unable to do anything to this server in its current state.
We have disabled the internal network card, and have added an HP Broadcom adapter in its place. All drivers have been updated and we have attempted to run Windows Update. When Windows Update is run, we get a BSOD and the same error message about fweng.sys (which I believe is the Firewall Engine). We have NOT updated the firmware on the network cards, but we will be doing so over half term.
Basically, I was wondering if anyone has come across an issue like this. Ideas and suggestions are very welcome.
What's the onboard card? Is it also a Broadcom? What happens if you sling in an Intel Pro1000MT? (or GT, just as a test?)
Has the server always done this since Day 01, or is this something that's happening recently?
If you stop Forefront services, then fiddle with networking, does it still BSOD?
Hello Pete, and thanks for your reply.
The server itself is also used as a storage server for all of the staff/student data. It ran for months prior to having Forefront installed on it, and it was 100% stable.
The onboard card is listed as "HP NC362i Integrated DP Gigabit Server Adapter". We disabled this and added the "Broadcom BCM5709C NetXtreme II" card, as we suspected that the network card was what it didn't like. This obviously wasn't the case, as the issue is still there.
I haven't tried stopping the Forefront service yet, but this is something that I will be testing. I can't fiddle about with it during the day as it's serving the whole school's net connection and file shares.
1. Use a bootable cd of some sort to thoroughly test the RAM on the server
2. FWENG.SYS is the firewall engine. Judging by some logs I have seen (yours might be different, so you would need to look into the TMG log itself to see if it's the same), FWENG.SYS looks to be pointing to a wrong memory address, which makes it fall over. I'm wondering if there is an update for TMG to resolve this, as per the link below.
1. A BSOD involves debugging, so it would take time. Some cases will require source code, but you can try to analyse the dump yourself.
2. Not sure what version your firewall engine is, but you can try referencing "Description of the Forefront Threat Management Gateway, Medium Business Edition hotfix package: August 13, 2009" and compare the version. If the KB is newer, then apply it to get a newer version of the file.
3. It may mean you have to call MSFT and ask them to do the analysis of the dump.
I have managed to apply all available updates for Forefront that come through on Windows update. Calling Microsoft is something I have been thinking of doing, as the issue that we have seems pretty specific to us. The version we are running can only be the following:
Fweng.sys 6.0.6417.154 755,120 13-Aug-2009 05:31 x86
I'll take a proper look after the weekend. I'm hopeful about getting this up and running properly. If I get a result from this next week, I will update the thread with the solution.
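For anyone doing the version comparison suggested above, note that dotted file versions should be compared number by number, not as plain text. This is a minimal sketch: 6.0.6417.154 is the version quoted in the post above, while the "installed" values are made up purely for illustration. You would read the real one from the file's properties on the server.

```python
def parse_version(text):
    """Turn a dotted version such as '6.0.6417.154' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_newer(candidate, installed):
    """Return True if candidate is a strictly newer version than installed."""
    return parse_version(candidate) > parse_version(installed)


kb_version = "6.0.6417.154"   # fweng.sys version quoted in the post above

# Hypothetical installed versions, for illustration only.
print(is_newer(kb_version, "6.0.6417.153"))  # KB is newer, so apply it
print(is_newer(kb_version, "6.0.6417.154"))  # same version, nothing to gain
print(is_newer(kb_version, "6.0.6417.99"))   # 154 > 99 numerically

# A plain string comparison gets the last case wrong, because "9" sorts
# after "1" as text:
print("6.0.6417.154" > "6.0.6417.99")
```

Tuples of integers compare element by element, which is why the parsed form gives the right answer where the raw strings do not.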
Alright, I'll be the bad guy and ask: why are you running it on a file server? There must be lots of holes in it to enable this, and some of the compromises could be what is causing this. A few questions: do you have an AV on it, and is it set up to exclude stuff like the TMG cache folder? Also, is it set up to be hands off with the network stack? Many AVs decide to 'help' add security and really stuff up TMG.
I would recommend upgrading the firmware on the NICs, removing TMG from the physical machine and installing the Hyper-V role on the 2008 server. Even with 2008 Standard you get rights for one virtual instance. Enable the NICs again, install another instance under Hyper-V and install TMG on that. This will isolate it from all the other stuff on the server and will hopefully solve the problem, along with being more secure and stable overall.
The main function of the server was to replace our NAS server that was old and on its way out. It was replaced, and Forefront was an afterthought that was added mainly to deal with caching locally in order to reduce traffic to our external proxy. I can see why you would question putting Forefront on a server for file shares, but in reality there is very little work needed on the firewall to enable shares to be accessed easily. Pretty much just add CIFS to the firewall exceptions and you're away. On paper it may not be the best solution logically, but we can only work with the hardware we have to hand. I really do appreciate all of these ideas, and I'm taking them all into account.
For our antivirus solution, we use McAfee Enterprise. You have correctly pointed out that it may well be monitoring the cache that Forefront generates and going mad over it. I am currently waiting for someone at the local council who runs this remotely to add this as an exception. I am unable to add this as an exception at the moment but I can and WILL be removing the AV from the server as part of this testing I will be doing. I'm glad that you suggested this, as it confirms that it may be a good idea.
The Hyper-V solution you suggested would be something I would look to do as a last resort. It's not something I had really considered, but in some circumstances (like this) I guess it allows you to totally separate out the two roles into their own instances, so that both services don't fail if it decides to BSOD again.
I will be doing all this work next week, and I am taking notes of all of these suggestions. Thanks for all your input, and if you have any more suggestions I will more than welcome them.
I thought I would update this thread, so that it may help others in the future.
I have come to the conclusion that removing Forefront from the server is the best solution. I will be running Forefront from a dedicated HP DL120. Here is a list of all the actions taken, each of which ended with the same result:
Update firmware on Broadcom cards, both ports.
Totally removed Forefront and all associated services
After removing Forefront from the server, it runs perfectly. As soon as Forefront is added back onto the server, the BSODs start up again. Strange, as the server (HP DL180) is actually recommended by Microsoft for Forefront compatibility.
Anyway, I'll let you know if running on alternative hardware gives me any more joy.