Using Apache Druid to Monitor Your Network's Performance

Whatever industry you’re in, maintaining the integrity of your IT infrastructure has never been more crucial than it is today. To do that, you need to keep a close eye on how your network is performing. The busier your network gets, the slower it can become. But what if it buckles under pressure, or is compromised by a security breach? The resulting damage and downtime can have a major impact on your business, your reputation, and your bottom line.

We understand that problem better than anybody. As data management specialists, our network is constantly under pressure to perform. That’s why, in the great tradition of practicing what we preach, we use the power of Druid to streamline our network’s performance for us. Take a glimpse into our world and find out how Druid could also help your network run more safely, securely and efficiently.

The Challenge

We work in the world of Big Data, as do most of our clients, which means the demands placed on our IT infrastructure are enormous. To keep things running as smoothly as possible, it’s vital to know what is happening on our network every moment of every day, 365 days a year. That kind of awareness involves many different elements, such as anticipating connectivity issues and flagging suspicious traffic before any unscrupulous visitors get the chance to snoop around our network and cause trouble. It also involves knowing how traffic is ebbing and flowing across our servers, pinpointing the exact moments our servers experience unusual peaks and troughs in activity, and identifying exactly which applications are causing sudden increases in bandwidth utilisation. If there’s a configuration error or an abnormal traffic fluctuation, we need to know the moment it occurs so that we can deal with it immediately. Above all, we need an effective way to monitor, analyse and maintain the health of our network on a second-by-second basis, so that we can predict potential service disruptions before they occur and investigate any network anomalies the moment they happen.

It’s a challenge that any business with a medium-to-large IT infrastructure will be familiar with. The fact you’re reading this suggests it may be familiar to you, too.

Our Solution

Apache Druid’s analytics are well suited to collecting real-time data from network flows and querying it across any set of dimensions the user chooses. For example, Druid can stream in logs and monitoring information from network switches, routers, firewalls and servers and ingest millions of events per second, allowing the user to compute advanced metrics and slice and dice the data across whichever attributes have been ingested.
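To make "slice and dice" a little more concrete, here is a minimal Python sketch of how such a query might be put together and sent to Druid's SQL endpoint (/druid/v2/sql). It is an illustration, not our production setup: the datasource name "network_flows", the column names "src_ip" and "bytes", and the base URL are all assumptions.

```python
import json
import urllib.request


def build_top_talkers_query(datasource, dimension, hours=1, limit=10):
    """Build a Druid SQL request body ranking `dimension` values by bytes sent.

    The datasource and column names are hypothetical; adjust them to
    match whatever your own ingestion spec actually defines.
    """
    sql = (
        f'SELECT "{dimension}", SUM("bytes") AS total_bytes, COUNT(*) AS flow_count '
        f'FROM "{datasource}" '
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{int(hours)}' HOUR "
        f'GROUP BY "{dimension}" '
        f"ORDER BY total_bytes DESC "
        f"LIMIT {int(limit)}"
    )
    return {"query": sql}


def run_query(base_url, payload):
    """POST the payload to the Druid SQL endpoint and return the parsed rows."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no running cluster.
payload = build_top_talkers_query("network_flows", "src_ip", hours=2, limit=5)
print(payload["query"])
```

Swapping "src_ip" for another dimension, such as a destination port or an application name, gives a different slice of the same data without any change to how it was ingested. Calling run_query with the address of a Druid router or broker would return the matching rows.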

Druid’s ad-hoc, exploratory analytics also make it possible to run a series of iterative queries and probe deeper into an unusual data pattern or anomaly. Its inverted indexes let it quickly prune out irrelevant data and scan only the rows it needs to answer a particular query. The result is that Druid makes it easier to stay on top of network performance, simplifying decision making and helping the user make better judgement calls about how their network is being utilised.
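For readers who haven't come across inverted indexes before, the toy Python sketch below shows the general idea: for each value of a dimension, record which rows contain it, so a filter can jump straight to the matching rows instead of scanning everything. This is a simplified illustration, not Druid's actual implementation (Druid stores compressed bitmaps rather than Python sets), and the sample rows and column names are made up.

```python
from collections import defaultdict


def build_inverted_index(rows, dimension):
    """Map each distinct value of `dimension` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[dimension]].add(row_id)
    return index


# Hypothetical flow records, for illustration only.
rows = [
    {"protocol": "tcp", "dst_port": 443, "bytes": 1200},
    {"protocol": "udp", "dst_port": 53, "bytes": 80},
    {"protocol": "tcp", "dst_port": 22, "bytes": 300},
    {"protocol": "tcp", "dst_port": 443, "bytes": 900},
]

protocol_index = build_inverted_index(rows, "protocol")
port_index = build_inverted_index(rows, "dst_port")

# Filter "protocol = tcp AND dst_port = 443" by intersecting two row-id sets,
# then read only the rows that survive the intersection.
matching = protocol_index.get("tcp", set()) & port_index.get(443, set())
total_bytes = sum(rows[i]["bytes"] for i in matching)

print(sorted(matching))  # [0, 3]
print(total_bytes)       # 2100
```

Only two of the four rows are ever read to answer the query. On a datasource holding billions of events, that kind of pruning is what keeps interactive queries fast.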

In addition, we created an intuitive dashboard and email alerts so that our technical team could always monitor the network in real time, no matter where in the world they’re based.
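As a rough idea of what sits behind an alert like this, the Python sketch below flags a traffic reading that strays too far from its recent baseline and prepares an email about it. It is a simplified sketch rather than our actual alerting code: the three-standard-deviation threshold, the sample figures and the email addresses are all assumptions, and the message is only built, not sent.

```python
from email.message import EmailMessage
from statistics import mean, pstdev


def is_anomalous(history, latest, k=3.0):
    """Return True if `latest` is more than k standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    return abs(latest - baseline) > k * spread


def build_alert(latest, history, recipient):
    """Build (but do not send) an email describing the anomaly."""
    message = EmailMessage()
    message["Subject"] = "Network traffic anomaly detected"
    message["From"] = "alerts@example.com"  # placeholder sender address
    message["To"] = recipient
    message.set_content(
        f"Latest reading: {latest} MB/min\n"
        f"Recent average: {mean(history):.1f} MB/min\n"
    )
    return message


# Hypothetical per-minute traffic figures, e.g. the result of a Druid query.
history = [100, 110, 90, 105, 95]
latest = 400

if is_anomalous(history, latest):
    alert = build_alert(latest, history, "netops@example.com")
    print(alert["Subject"])
```

In practice the history would come from a query against Druid, and the finished message would be handed to a mail server, for example with Python's smtplib.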

In the interests of transparency, we should probably let you in on a little secret. Until recently, all our servers were based in our office. Not only did they take up an uncomfortably large amount of space and generate an even more uncomfortable amount of heat but, as our network grew, we became increasingly concerned about their physical security.

Partnering with MIGSOLV and relocating our servers to their high-security data centre put an end to all those problems and is one of the best decisions we’ve ever made. Take a look at this article to find out more about the Spicule/MIGSOLV partnership.


Apache Druid

Druid’s speed, flexibility and real-time interactivity make it ideal for monitoring and investigating network flow and for ensuring your IT infrastructure remains stable and effective, however rigorous the demands placed upon it.

Because it can ingest, store and analyse very large volumes of real-time data at the same time, slicing and dicing along any set of attributes and using ad-hoc analytics to drill down to the root of an issue, Druid is an excellent platform for managing your network’s performance, anticipating problems and resolving them before they cause disruption, keeping your traffic moving… and keeping your business in business.

