Session

Machine Learning Practices in Network Traffic across Data Centers

Speakers

Jasmine Mou

Label

Nuts and Bolts

Session Type

Talk

Contents

Description

As users and product lines grow, demand for network bandwidth grows with them, making network traffic management and capacity planning a major challenge for network engineers. Meanwhile, network traffic across data centers in different regions generally costs more than traffic within the same data center. Optimizing such traffic with clear priorities across the networking landscape helps reduce costs in resource planning and operations. The network traffic data is also rich in feature dimensions, such as the region and product pairs at the source and destination, capacity, and timestamped traffic volumes. With this data, we can study normal time series trends as well as abnormal ones, such as unexpected increases or drops, which helps deliver a smoother product experience and save costs in the long run.

In this talk, we will share how we apply machine learning, statistical profiling, and visualization techniques to network traffic data. We will discuss the following topics:

  1. Discovering trends in network traffic across different data centers, especially over the mid-term time range.
  2. An alerting system that flags when network traffic is expected to hit the “water level” or get close to capacity in the near future.
  3. Identifying the major contributors to unexpected traffic so they can be labeled as a high-priority focus.
  4. A projection tool that checks whether capacity is sufficient, given the number of servers and the product pairs at the source and destination ends.

We will also briefly discuss how automation turns these practices into a routine and improves productivity.
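
As a rough illustration of the alerting topic in the list above, the sketch below fits a straight-line trend to recent traffic samples and estimates how many more sampling intervals remain before the trend reaches a "water level". It is not the method presented in the talk: the linear trend, the default 80% water level ratio, and the function names `fit_linear_trend` and `steps_until_water_level` are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values[i] ~ slope * i + intercept."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def steps_until_water_level(
    values: Sequence[float],
    capacity: float,
    water_level_ratio: float = 0.8,  # assumed threshold, not from the talk
) -> Optional[int]:
    """Estimate the sampling intervals left before the trend hits the water level.

    Returns 0 if the latest sample is already at or above the water level,
    and None if the fitted trend is flat or decreasing.
    """
    threshold = capacity * water_level_ratio
    if values[-1] >= threshold:
        return 0
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    # Smallest sample index at which the fitted line reaches the threshold.
    crossing_index = math.ceil((threshold - intercept) / slope)
    return max(crossing_index - (len(values) - 1), 0)


if __name__ == "__main__":
    # Hypothetical traffic samples and link capacity, in the same unit.
    samples = [10.0, 20.0, 30.0, 40.0, 50.0]
    print(steps_until_water_level(samples, capacity=100.0, water_level_ratio=0.75))
```

A real alerting system would likely need to account for seasonality and noise in the traffic, which a straight-line fit ignores; the sketch only shows the basic idea of comparing a projected trend against a capacity-based threshold.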