Network security blunders: Tales from the field

We’ve all made one in our careers. I’m talking about that blunder you thought you would be fired for. My first was rebooting all the campus router pairs at the same time: not one by one, all at once. I had written a script to install a security update on every router and then reboot them one at a time, or so I thought. It turned out my script had an error and didn’t wait between routers.
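For what it’s worth, here is a minimal sketch of what "one by one" could look like. It is not the script from the story. The `reboot` and `is_healthy` callables are hypothetical placeholders for whatever your tooling provides, and the timeout values are assumptions.

```python
import time
from typing import Callable, Iterable, List


def rolling_reboot(
    routers: Iterable[str],
    reboot: Callable[[str], None],        # hypothetical: triggers the reboot
    is_healthy: Callable[[str], bool],    # hypothetical: True once the router is back
    timeout_s: float = 600.0,             # assumed upper bound per router
    poll_s: float = 10.0,                 # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Reboot routers one at a time; stop at the first one that does not recover."""
    done: List[str] = []
    for router in routers:
        reboot(router)
        waited = 0.0
        # Block here until this router is back before touching the next one.
        while not is_healthy(router):
            if waited >= timeout_s:
                raise RuntimeError(f"{router} did not recover; halting after {done}")
            sleep(poll_s)
            waited += poll_s
        done.append(router)
    return done
```

The point of the design is that a failed health check halts the whole run, so one bad reboot never turns into a campus-wide one.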

I thought for sure I was fired, but it turned out to be a great learning experience for everyone involved. We all learned a little about crisis management, I was introduced to change management, and my boss took a few hours to teach me how to verify the network was working properly as everything came back online.

Sometimes our blunders are not so instantly noticeable. Some of them linger in our firewalls until one day they either cause an outage or an auditor calls us on them. One of my favorite activities is visiting and talking with firewall engineers around the world. Here are some common blunders I’ve seen and heard our engineers talk about. Maybe you’ll recognize one or two of them yourself.

Creating firewall groups with no meaning
One of our engineers saw a client configuration in which a certain network object appeared in over half the rules. It was named after a famous football star; let’s call it “Joe_Montana” (after all, I need to change the facts to protect the guilty). Whenever someone needed access and it was not clear where it belonged, they added the IP address to this group, which is used in many permissive rules. The result is a rule base that looks OK to an auditor (because no rule contains ANY) but is actually a huge security hole. The rules have become meaningless, and cleaning up the rule base is going to take many months of tracking down what the actual business needs are.
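A quick way to spot a catch-all group like this is to count how many rules reference each object. The sketch below assumes a simplified, hypothetical rule format (a dict with `src` and `dst` sets of object names) and an arbitrary 50% threshold. Real firewall exports will look different.

```python
from collections import Counter
from typing import Dict, List, Set


def overused_objects(rules: List[Dict[str, Set[str]]], max_share: float = 0.5) -> List[str]:
    """Return object names referenced by more than max_share of the rules."""
    counts: Counter = Counter()
    for rule in rules:
        # Count each object once per rule, whether it is a source or a destination.
        counts.update(rule["src"] | rule["dst"])
    limit = max_share * len(rules)
    return sorted(name for name, n in counts.items() if n > limit)


# Hypothetical rule base: "Joe_Montana" shows up in three of four rules.
rules = [
    {"src": {"Joe_Montana"}, "dst": {"db_servers"}},
    {"src": {"Joe_Montana"}, "dst": {"web_servers"}},
    {"src": {"branch_office"}, "dst": {"Joe_Montana"}},
    {"src": {"hr_lan"}, "dst": {"payroll"}},
]
print(overused_objects(rules))
```

An object that appears in most of the rule base is not automatically wrong, but it is a good place to start asking what business need it represents.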

Failing to upgrade your firewall software
We find many clients running incredibly old firewall software. I always hear the same story: the version is being kept fixed for stability, the firewalls cannot be taken down for upgrades, and so on. The fact is, your firewall vendor is upgrading their software for a reason. You don’t need to be on the latest and greatest release, but when I see you on a version that is 15 or 20 releases behind and 7 years old, I can’t help but think you don’t take security seriously.

Using the wrong technology
I’ve seen all sorts of ways in which the square peg is pounded into the round hole, but here is one of my favorite blunders. One client argued with their auditors that because they had a firewall in front of their secure web servers, it formed a second layer of authentication, and so they were using two-factor authentication: a password and a firewall. I like the effort, but a firewall (by itself) is not a two-factor authentication solution. Two-factor authentication requires two different kinds of proof from your users: something they know and something they have, a password and a token for example.

The accidental outage
Many of us have caused an outage at one time or another. One of our engineers tells a story of being at a client site when one such outage happened. They were working on the production firewall server, which ran Windows, gathering some data for a support case. The admin reached across the table and accidentally leaned on the mouse, which was over the Start Menu. Since fate has it in for us network engineers at all times, the click opened the Start Menu and, unbelievably, the pointer was over the shutdown menu item when it popped up. Yep, right there in the middle of production, this financial corporation watched its production firewall shut down.

Poor documentation
All too often we talk to customers who are trying to understand what the heck all their rules do. When we are in a hurry day after day and get lazy about documentation, we create a time bomb waiting to blow up. I often find myself working with a customer who tells me, “I’m afraid to make changes on my firewall now. All the senior guys have left and we don’t know what most of these names mean or what these rules do.”

Using excessive Drop rules
Often when we are in a hurry we create a rule that grants far too much access and place a rule just above it dropping the access we don’t want to allow. We do this because we don’t want to figure out how to write the correct rule. We see rules like “All DMZ devices to All Internal devices: ACCEPT” with a rule right above it that says “All DMZ devices to Secure Network device: DROP”. The two rules look OK at first, but it is really an ugly hack, because we never wrote out the business need in the ACCEPT rule. When we operate like this over time, we end up with a rule base full of these rule “pairs”, and reordering or editing it is likely to either expose more risk or block necessary traffic. Either way, we have a mess we must rewrite at some point.

Using routing as your security policy
I’ve seen many a firewall where rulebase changes need routing changes to accompany them. That’s understandable when the change involves a new network on the firewall, but there are two versions of this blunder I see all the time. The first is the firewall with no default route. Every route is added to the firewall by hand, and the most specific netmask possible is used, so that traffic will not reach unintended devices if the firewall has no policy. Wow, that sounds great, but it’s totally unnecessary: if you remove the policy on a modern firewall, it reverts to DENY ANY. This design becomes so hard to manage that the entire team begins to dread making changes. Soon every change needs an engineer to examine the routing, so every change takes too long and the business is not served in a timely fashion. The extra security adds no real value.

The second version of this blunder is most often seen on Cisco devices, where admins have ACLs between two interfaces that include ANY as the source or destination. They don’t actually mean ANY. They mean “all the addresses behind this interface, but I’m too lazy to type them in.” This leads to a rulebase that can be understood only by knowing the routing table along with the firewall and trying to do the match in your head, which is way too complicated for a junior firewall admin taking over this firewall.

Using DNS objects in a rule base
One of the options many firewalls provide is inserting a source or destination as a DNS object like www.google.com. It sounds great, because google.com can use so many IP addresses and this lets the firewall keep passing the traffic as google.com changes IP addresses. This blunder leads to risks most organizations should consider unacceptable. First, your firewall is now more susceptible to denial of service attacks. What happens when it can’t resolve google.com? Second, your firewall is going to waste CPU, memory, and network I/O on DNS lookups for every packet while trying to decide whether it might belong to google.com. Third, what happens if your DNS is poisoned and malicious addresses for the command and control of the latest botnet are returned along with the google.com addresses? You have now allowed all the botnet command and control traffic through your firewall, and logged it as google.com.
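To make the third risk concrete, here is a small sketch that checks a set of resolved addresses against networks you have explicitly approved and separates out anything unexpected. This is only an illustration, not a feature of any particular firewall. The function name, the approved range, and the sample addresses (taken from documentation ranges) are all assumptions.

```python
import ipaddress
from typing import Iterable, List, Tuple


def split_by_approved(
    resolved: Iterable[str],
    approved_networks: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split resolved addresses into (approved, suspect) lists."""
    nets = [ipaddress.ip_network(n) for n in approved_networks]
    approved: List[str] = []
    suspect: List[str] = []
    for addr in resolved:
        ip = ipaddress.ip_address(addr)
        # An address counts as approved only if it sits inside a network we listed ourselves.
        if any(ip in net for net in nets):
            approved.append(addr)
        else:
            suspect.append(addr)
    return approved, suspect


# Pretend a DNS answer came back with one expected and one unexpected address.
ok, odd = split_by_approved(["192.0.2.10", "203.0.113.66"], ["192.0.2.0/24"])
print(ok, odd)
```

With a DNS object in the rule, both addresses would have been allowed and logged under the same name. With explicit networks, the second one stands out.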

Making changes in panic mode
Imagine that something goes wrong and you lose one of the RAID disks. You replace it and while the RAID is rebuilding, the service is slowed down – but you don’t realize that it’s because of the RAID.

At this point, your customers have been denied service for 40 hours and you’re losing a lot of money every minute, not to mention the customers who are leaving you for alternative services from your competitors. You go into panic mode and start changing configurations: switches, routers, load balancers and firewalls, anything that you suspect may be causing the problem.

After another 24 hours, another sleepless night and many hours of expensive consultants, someone figures out the real cause of the problem. Now you want to revert all the changes you made on the switches, routers, load balancers and firewalls, but no one knows what they were because they were made in a rush without any documentation. So you spend another 3 days figuring out how to undo them until the system is fully operational again.
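Even in panic mode, a few seconds spent writing down each change pays for itself. Below is a minimal sketch of a change journal that records what was changed and how to undo it, then produces the rollback steps in reverse order. The class names, the device names, and the commands in the example are all hypothetical. In practice your change management system would play this role.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Change:
    device: str
    what: str        # what was changed
    rollback: str    # how to undo it
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ChangeJournal:
    """Append-only record of emergency changes."""

    def __init__(self) -> None:
        self._changes: List[Change] = []

    def record(self, device: str, what: str, rollback: str) -> Change:
        change = Change(device, what, rollback)
        self._changes.append(change)
        return change

    def rollback_plan(self) -> List[str]:
        # Undo the most recent change first, so later changes never sit on top of reverted ones.
        return [f"{c.device}: {c.rollback}" for c in reversed(self._changes)]


journal = ChangeJournal()
journal.record("fw1", "added temporary permit rule 42", "delete rule 42")
journal.record("lb1", "lowered health-check interval to 1s", "restore interval to 5s")
for step in journal.rollback_plan():
    print(step)
```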

I hope you don’t see any of these in your organization. But if you do, rest assured you are not alone. Even the best-run organizations can find these blunders, or others, lurking in their firewall rules. The good news is that tools to automate the discovery of these blunders are now available, as are tools to proactively keep these mistakes from happening again.
