Tessian wrote:It's funny you guys mention this because I am starting a proof of concept with Mazu, a Network Behavior Analysis (NBA) tool. You feed it NetFlow from your routers, and through that and an app sensor it's able to tell you pretty much everything that's going on in your network at a high level. It'll baseline your network behavior and warn you when shit changes. Should be great for troubleshooting and tracking changes, virus outbreaks, etc. I've yet to see if it's all they claim it to be, but I'm very interested to find out.
Basically, a deep packet inspector. We have Sandvine for that. I just wish we were using it for stuff besides reporting. It's capable of mitigating virus/DDoS traffic, and a whole lot more, but big boss is paranoid about making any active changes on our network, with all of the net neutrality bullshit and Comcast's lawsuit about auto-closing connections.
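For anyone wondering what "baseline your network behavior and warn you when it changes" boils down to, here's a rough sketch. To be clear, this is not how Mazu or Sandvine actually do it; it's just a rolling mean and standard deviation over per-interval byte counts (the kind of number you could total up from NetFlow records). The class name, window size, threshold, and sample numbers are all made up for illustration.

```python
from collections import deque
from statistics import mean, stdev


class TrafficBaseline:
    """Rolling baseline of per-interval byte counts for one host or link."""

    def __init__(self, window=12, threshold=3.0):
        self.samples = deque(maxlen=window)  # most recent "normal" samples
        self.threshold = threshold           # allowed deviation, in std devs

    def observe(self, byte_count):
        """Return True if byte_count deviates sharply from the baseline."""
        anomalous = False
        if len(self.samples) >= 3:
            avg = mean(self.samples)
            spread = stdev(self.samples) or 1.0  # avoid a zero-width band
            anomalous = abs(byte_count - avg) > self.threshold * spread
        if not anomalous:
            # Only normal samples update the baseline, so a sustained
            # outbreak keeps alerting instead of becoming the new normal.
            self.samples.append(byte_count)
        return anomalous


# Hypothetical byte counts per polling interval; the 9000 is the "outbreak".
baseline = TrafficBaseline()
traffic = [1000, 1100, 950, 1050, 1020, 9000, 1010]
alerts = [count for count in traffic if baseline.observe(count)]
print(alerts)
```

A real product would presumably baseline per host, per port, and per time of day, but the idea of "learn what normal looks like, then flag what falls outside it" is the same.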
Barret wrote:I'm not sure if BB is GNU, though this may push us to finally fully implement it if you are correct about SolarWinds and the lack of application-level monitoring (i.e., Windows Services and Event Log message monitoring specifically).
Actually, there is an application monitoring piece that I forgot about. It's just that we weren't using it. However, I would ask if it works for both UNIX and Windows. (SolarWinds is a Windows product.) Also, we have our SolarWinds split up between the web server, database server, and app servers.
Barret wrote:We need something that monitors for SNMP traps on all devices. We are mostly HP and use Insight Manager, but we need something for other devices like Brocade switches and NetApp devices that periodically send traps when connections fail or drives get full.
HP has OpenView for that. We use Netcool for our "manager of managers". It's basically a global alarm system that accepts different types of SNMP traps or syslogs and puts them up on one alarm list. Our NOC monitors just Netcool 24/7, tickets the alarm, and dives into the problem, escalating if necessary.
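To give a feel for the "accepts syslogs and turns them into alarms" part, here's a small sketch that decodes the priority header on a classic BSD-style syslog line, where the number in angle brackets is facility * 8 + severity. This is not Netcool's probe or rules-file syntax, just plain Python; the function name, the alarm fields, the hostname, and the sample message are assumptions for illustration.

```python
import re

# Standard syslog severity names, indexed by severity code 0-7.
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog(line, source):
    """Split a raw syslog line into a normalized alarm dictionary."""
    match = PRI_PATTERN.match(line)
    if match is None:
        # No priority header: keep the text, but flag severity as unknown.
        return {"source": source, "facility": None,
                "severity": "unknown", "message": line.strip()}
    pri = int(match.group(1))
    facility, severity = divmod(pri, 8)  # PRI = facility * 8 + severity
    return {"source": source, "facility": facility,
            "severity": SEVERITIES[severity],
            "message": match.group(2).strip()}


# Hypothetical router message; 187 = facility 23 (local7), severity 3.
alarm = parse_syslog(
    "<187>%LINK-3-UPDOWN: Interface Gi0/1, changed state to down",
    source="core-router-1")
print(alarm["severity"], "-", alarm["message"])
```

Once every source is boiled down to the same handful of fields, the NOC only has to learn one screen instead of one per vendor.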
Netcool, OpenView, or something similar IS A REQUIREMENT of a NOC! Cannot stress this enough. You can't have your NOC staff just looking at a bunch of different status pages for different applications. And email isn't a tracking system, so don't use it for alarms. For example, we have all of our CMTSs, switches, and routers syslogging into Netcool. Fiber transport syslogs to Netcool. SolarWinds sends syslog messages to Netcool. All outages are put out as syslogs to Netcool. Most of the servers are syslogging into Netcool, or they are syslogging into OpManager (which gets forwarded to Netcool). We get SNMP traps from everywhere, from NetApp (we have that SAN, too), Netbotz (awesome headend environmental monitoring system), the mail system, etc., etc., and they all go into Netcool.
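If it helps to picture why one central list beats a pile of status pages, here's a toy version of the "everything lands in one place" idea. It is only a sketch and not Netcool's data model or API: the `Alarm` and `AlarmList` names, the fields, and the device names are invented for the example. The one behavior it shows is that repeats of the same event on the same device bump a counter instead of adding new rows.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    source: str       # feeding system, e.g. "syslog" or "snmp-trap"
    node: str         # device that raised the event
    summary: str      # short description of the problem
    count: int = 1    # how many times the same event has arrived
    ticket: str = ""  # ticket reference once the NOC picks it up


class AlarmList:
    """One shared list of open alarms fed by every monitoring source."""

    def __init__(self):
        self.open_alarms = {}

    def ingest(self, source, node, summary):
        # Repeats of the same problem on the same node collapse into one
        # row with a counter, so a flapping link does not flood the screen.
        key = (node, summary)
        if key in self.open_alarms:
            self.open_alarms[key].count += 1
        else:
            self.open_alarms[key] = Alarm(source, node, summary)
        return self.open_alarms[key]


# Hypothetical events from two different feeds.
alarms = AlarmList()
alarms.ingest("syslog", "cmts-01", "upstream port down")
alarms.ingest("syslog", "cmts-01", "upstream port down")
alarms.ingest("snmp-trap", "netapp-01", "volume nearly full")
for alarm in alarms.open_alarms.values():
    print(alarm.node, alarm.summary, alarm.count)
```

The `ticket` field is there because the alarm list, not email, is what the NOC tickets against.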
Also, you will need a Netcool admin to keep up with rules file changes, etc.
Netcool got bought by IBM recently, but we've talked with them after the buyout and they assured us that they really like Netcool, and they didn't buy it to bury the product.
Barret wrote:Nagios was another product we were looking at.
Yeah, we have that, too. Not sure how the data center is using it, though I do know that they use it for security testing. We just had to pass our PCI compliance recently.
Barret wrote:We already had Unicenter and that is a piece of shit. HP OpenView, while it would work awesome with 95% of our environment, requires at minimum 2 full-time developers and 3 admins to keep it happy.
Didn't think it was that resource intensive. Try out Netcool. Of course, keep in mind that you still DO need an admin for it. Any of these types of systems requires somebody to keep up with the rules files and alarm changes. But it's something that you NEED TO HAVE. Your company will just need to bite the bullet and hire another rep. The business impact is in catching a problem 2 minutes after it happens. You can't get that from a thousand different monitoring systems without some sort of "glue" to pull them together.