We’re kicking off a new decade with a new blog series, “Lessons Learned: Network Security,” that looks back at the biggest issues, factors and milestones of the past two decades and at how we need to approach things differently in 2020.
Our team possesses 30+ years of network security and data analytics experience, so it’s only right we reflect on this experience to recognize what mattered most. In this first post, we explore the power of data and gradual applications of data science – with our very own chief data scientist (CDO), Andrew Fast. In his current role, Andrew is responsible for leading the machine learning and artificial intelligence efforts at CounterFlow. Before joining the company, Andrew worked as a consultant helping hundreds of companies expand their data science knowledge sets and capabilities.
Q: What was the biggest mistake you saw organizations make when it came to data application in the 2000s?
It wasn’t so much a matter of mistakes as of limited knowledge about the transformational power of data. Specifically, organizations didn’t know what to look for in their data or how to apply it outside of operational databases and systems. Knowledge discovery and data mining (KDD) methodologies started coming into focus in the 1990s and 2000s, but it was the research community that embraced them, not the enterprise. Most businesses had not fully mastered their internal data and, if they had, they didn’t know how to maximize the value of that data.
Q: What was the biggest turning point for data science in the last two decades?
There were several turning points. The first was the broader adoption of lean manufacturing principles. The whole idea behind lean manufacturing is to ‘minimize waste’ and optimize workflows. It forced businesses to re-evaluate how they run their operations, and part of that was looking for any and all data to help inform the decision-making process. The second turning point, which also had a major influence on the way companies look at data, was Tom Davenport’s book, Competing on Analytics: The New Science of Winning. Tom provided many powerful case studies of businesses transformed through the application of advanced modeling and predictive analytics (a precursor to machine learning) across the enterprise. The third and most obvious turning point is the exponential growth in the data sets available for companies to get their hands on and evaluate, along with the open-source software needed to process that data. Fun facts: Statista found that the total amount of data created worldwide in 2010 was just 2 zettabytes. By 2025, analyst firm IDC sees the total amount of data increasing to 175 zettabytes.
Q: What lessons did organizations learn in the 2000s when it came to data analytics?
Companies had relatively limited data (think a few GBs or less) and limited compute power. This combination forced companies and individuals to think hard about the questions they wanted the data to help answer. As time went on and companies gradually gained access to more data, it became clear that the more data you have, the more ways you can use it, and the more knowledge you gain along the way. As a result, we saw the whole field of data analytics and data science start to expand. It was no longer a specialist field involving a small group of talented individuals who excelled at complex math. We saw the creation of more accessible data processing software, such as R and Python’s scikit-learn, along with other tools that helped people across the business pursue data analytics.
Q: Why is data science so critical to the future of network security?
The scale and complexity of today’s enterprise network have introduced myriad visibility challenges for network defenders. The influx of all the new data mentioned above has led to a 3x increase in overall network traffic. In addition, we’re seeing an increase in data encryption, which makes it that much more difficult to evaluate individual connections and determine whether something is not quite right and needs to be flagged. Applying data science is absolutely necessary because it enables businesses to prioritize which network events to investigate without decrypting the data, which is too time-consuming to be practical.
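To make the idea of prioritizing events without decryption a little more concrete, here is a minimal Python sketch. It is an illustration only, not CounterFlow's method: the field names (bytes_out, duration_s), the sample connections and the scoring rule are all assumptions. It ranks connections using only metadata that remains visible when the payload is encrypted, scoring each one by how far it sits from the median in units of the median absolute deviation (MAD).

```python
from statistics import median


def robust_scores(values):
    """Return |x - median| / MAD for each value (all zeros if MAD is zero)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]
    return [abs(v - med) / mad for v in values]


def prioritize(connections, features=("bytes_out", "duration_s")):
    """Rank connection ids by their largest per-feature deviation, highest first."""
    # One list of scores per feature, each aligned with `connections`.
    per_feature = [robust_scores([c[f] for c in connections]) for f in features]
    scored = [
        (max(scores), conn["id"])
        for conn, scores in zip(connections, zip(*per_feature))
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [conn_id for _, conn_id in scored]


# Hypothetical flow metadata; no payload is inspected or decrypted.
connections = [
    {"id": "a", "bytes_out": 1200, "duration_s": 2.0},
    {"id": "b", "bytes_out": 1500, "duration_s": 2.5},
    {"id": "c", "bytes_out": 900000, "duration_s": 3.0},
    {"id": "d", "bytes_out": 1100, "duration_s": 1.5},
    {"id": "e", "bytes_out": 1300, "duration_s": 2.2},
]

ranking = prioritize(connections)
print(ranking)
```

In this toy data, connection "c" sends far more bytes than its peers, so it lands at the top of the list for an analyst to review first. A production system would draw on many more features and a properly trained model, but the principle is the same: rank by how unusual the metadata looks rather than by what the payload contains.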
Q: Why has the cybersecurity industry been slow to adopt AI approaches and what does the future hold for AI?
There are two driving factors. The first is that the sheer volume of data has limited the adoption of AI, because it takes time for security teams to understand and trust the data. The second is that the security industry has been conditioned to take a non-statistical approach to data. A transition is happening, but it’s been a marathon, not a sprint. It will continue to take a significant amount of training, comfort and confidence to adopt AI approaches. And when that happens, it will be a great thing.
The future of cybersecurity is a world that sees an unbreakable partnership between humans and computers. Many have argued that AI will eliminate jobs, but that has not happened in other fields adopting AI. Computers and AI can offload a lot of the heavy lifting, but there will always be a need for analysts and their creative and analytical thinking to identify new patterns to fuel critical decision making.