One of the most difficult tasks in network operations is gathering accurate sampling data from a switch to create a dashboard that shows the overall health of the network. Accurate network visibility and analytics are the cornerstone of operating an efficient and reliable network. After all, how do you know that your network is running smoothly if you have no idea what’s going across it?
Network analytics is a broad term, but as network operators we generally want to provide context and answer the following questions:
What types of applications are consuming network resources?
What’s the current capacity and utilization of a given switch?
How can I quickly identify peaks and valleys?
How can I detect microbursts?
Are there hotspots forming in the network?
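To make the utilization and peak questions concrete, here is a minimal Python sketch of how utilization is typically derived from a byte-counter delta, and how the measurement window changes the answer. Everything in it is an assumption for illustration: the `utilization` helper, the 10GbE link speed, and the made-up traffic pattern (a steady 10% load with a 2 ms burst at line rate) do not come from any particular switch or tool.

```python
LINK_BPS = 10_000_000_000  # assumed 10GbE port speed, in bits per second


def utilization(bytes_delta, interval_ms, link_bps):
    """Average utilization (0.0 to 1.0) over one measurement interval.

    bytes_delta is the growth of a byte counter during the interval.
    """
    bits_sent = bytes_delta * 8
    bits_possible = link_bps * interval_ms / 1000
    return bits_sent / bits_possible


# Hypothetical traffic: bytes sent in each 1 ms bucket over a 5-second window.
LINE_RATE_BYTES_PER_MS = LINK_BPS // 8 // 1000      # 1,250,000 bytes
BASELINE_BYTES_PER_MS = LINE_RATE_BYTES_PER_MS // 10  # steady 10% load

buckets = [BASELINE_BYTES_PER_MS] * 5000
buckets[2000:2002] = [LINE_RATE_BYTES_PER_MS] * 2   # 2 ms burst at line rate

# A poller that reads the counter once every 5 seconds sees only the total.
coarse_util = utilization(sum(buckets), 5000, LINK_BPS)

# A 1 ms measurement window sees each bucket individually.
peak_util = max(utilization(b, 1, LINK_BPS) for b in buckets)

print(f"5 s average utilization: {coarse_util:.1%}")
print(f"1 ms peak utilization:   {peak_util:.1%}")
```

With these assumed numbers, the 5-second average stays at roughly 10% while the 1 ms peak reaches line rate, so the same port looks healthy or saturated depending only on how finely it is measured.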
Answering these questions has become more difficult with the standardization of 10GbE access ports in the data center. The amount of traffic is increasing rapidly, and traditional sampling techniques such as sFlow and IPFIX answer only some of the questions posed. Because microbursts and latency spikes can happen in very small windows, tools that rely on sampling every few seconds cannot detect these events, which interrupt business applications. A microburst occurs when traffic arriving on multiple ingress ports is all destined to a single egress port and exceeds that egress port’s buffer. For example, if server 1 sent a query to a set of compute clusters, ...