IT training oreilly security with ai and machine learning khotailieu

Security with AI and Machine Learning
Using Advanced Tools to Improve Application Security at the Edge

Laurent Gil and Allan Liska

Security with AI and Machine Learning
by Laurent Gil and Allan Liska

Copyright © 2019 O’Reilly Media. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Virginia Wilson
Production Editor, Proofreader: Nan Barber
Copyeditor: Octal Publishing, LLC
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

October 2018: First Edition

Revision History for the First Edition
2018-10-08: First Release

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Security with AI and Machine Learning, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it
is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This work is part of a collaboration between O’Reilly and Oracle Dyn. See our statement of editorial independence.

978-1-492-04312-6
[LSI]

Table of Contents

Preface

Chapter 1. The Role of ML and AI in Security
    Where Rules-Based, Signature-Based, and Firewall Solutions Fall Short
    Preparing for Unexpected Attacks

Chapter 2. Understanding AI, ML, and Automation
    AI and ML
    Automation
    Challenges in Adopting AI and ML
    The Way Forward

Chapter 3. Focusing on the Threat of Malicious Bots
    Bots and Botnets
    Bots and Remote Code Execution

Chapter 4. The Evolution of the Botnet
    A Thriving Underground Market
    The Bot Marketplace
    AI and ML Adoption in Botnets
    Staying Ahead of the Next Attack with Threat Intelligence

Chapter 5. AI and ML on the Security Front: A Focus on Web Applications
    Finding Anomalies
    Bringing ML to Bot Attack Remediation
    Using Supervised ML-Based Defenses for Security Events and Log Analysis
    Deploying Increasingly Sophisticated Malware Detection
    Using AI to Identify Bots

Chapter 6. AI and ML on the Security Front: Beyond Bots
    Identifying the Insider Threat
    Tracking Attacker Dwell Time
    Orchestrating Protection
    ML and AI in Security Solutions Today

Chapter 7. ML and AI Case Studies
    Case Study: Global Media Company Fights Scraping Bots
    When Nothing Else Works: Using Very Sophisticated ML Engines with a Data Science Team
    The Results

Chapter 8. Looking Ahead: AI, ML, and Managed Security Service Providers
    The MSSP as an AI and ML Source
    Cloud-Based WAFs Using AI and ML

Chapter 9. Conclusion: Why AI and ML Are Pivotal to the Future of Enterprise Security

Preface

It seems that every presentation from every security vendor begins with an introductory slide explaining how the number and complexity of attacks an organization faces have continued to grow exponentially. Of course, everyone from security operations center (SOC) analysts, who are drowning in
alerts, to chief information security officers (CISOs), who are desperately trying to make sense of the trends in security, is acutely aware of the situation. The question is how do we, collectively, solve the problem of overwhelmed security teams?

The answer in many cases now involves machine learning (ML) and artificial intelligence (AI). The goal of this report is to present a high-level overview of ML and AI, aimed at a security leadership audience, and to demonstrate the ways security tools are using both of these technologies to identify threats earlier, connect attack patterns, and allow operators and analysts to focus on their core mission rather than chasing false positives. This report also looks at the ways in which managed security service providers (MSSPs) are using AI and ML to identify patterns from across their customer base to improve security for everyone.

A secondary goal of the report is to help tamp down the hype associated with ML and AI. It seems that ML and AI have become the new buzzwords at security conferences, replacing “big data” and “threat intelligence” as the go-to marketing terms. This report provides a reasoned overview of the strengths and limitations of ML and AI in security today as well as going forward.

Chapter 1. The Role of ML and AI in Security

Why has there been such a sudden explosion of ML and AI in security?
The truth is that these technologies have been underpinning many security tools for years. Frankly, both tools are necessary precisely because there has been such a rapid increase in the number and complexity of attacks. These attacks carry a high cost for business. Recent studies predict that global annual cybercrime costs will grow from $3 trillion in 2015 to $6 trillion annually by 2021. This includes damage and destruction of data, stolen money, lost productivity, theft of intellectual property, theft of personal and financial data, embezzlement, fraud, post-attack disruption to the normal course of business, forensic investigation, restoration and deletion of hacked data and systems, and reputational harm.[1] Global spending on cybersecurity products and services for defending against cybercrime is projected to exceed $1 trillion cumulatively from 2017 to 2021.[2]

The reality is that, for some time now, organizations have not been able to rely on a “set it and forget it” approach to security using antiquated, inflexible, and static defenses. Instead, adaptive and automated security tools that rely on ML and AI under the hood are becoming the norm in security, and your security team must adapt to these technologies in order to succeed.

[1] Cybersecurity Ventures Annual Crime Report.
[2] Cybersecurity Market Report; published quarterly by Cybersecurity Ventures; 2018.

Security teams are tasked with protecting an organization’s data, operations, and people. To protect against the current attack posture of their adversaries, these teams will need increasingly advanced tools.

As the sophistication level of malicious bots and other attacks increases, traditional approaches to security, like antivirus software or basic malware detection, become less effective. In this chapter, we examine what is not working now and what will still be insufficient in the future, while laying the groundwork for the increased use of ML- and AI-based security tools and solutions.

Where Rules-Based,
Signature-Based, and Firewall Solutions Fall Short

To illustrate why rules-based and signature-based security solutions are not strong enough to manage today’s attackers, consider antivirus software, which has become a staple of organizations over the past 30 years. Traditional antivirus software is rules-based, triggered to block access when recognized signature patterns are encountered. For example, if a known remote access Trojan (RAT) infects a system, the antivirus installed on the system recognizes the RAT based on a signature (generally a file hash) and stops the file from executing.

What the antivirus solution does not do is close off the infection point, whether that is a vulnerability in the browser, a phishing email, or some other attack vector. Unfortunately, this leaves the attacker free to strike again with a new variation of the RAT for which the victim’s antivirus solution does not currently have a signature.

Antivirus software also does not account for legitimate programs being used in malicious ways. To avoid being detected by traditional antivirus software, many malware authors have switched to so-called file-less malware. This malware relies on tools already installed on the victims’ systems, such as a web browser, PowerShell, or another scripting engine, to carry out its malicious commands. Because these are well-known “good” programs, the antivirus solutions allow them to operate, even though they are engaging in malicious activity.

This is why many antivirus developers have switched detection to more heuristic methods. Rather than search just for matching file ...

It would not stop…

Advanced bot protection has now been in place for months, but the attackers are still sending daily bot traffic, even though the site is now well protected.

Limiting resource utilizations

There is also an unexpected benefit to the enhanced bot protection. The company’s image store infrastructure had been
strained due to the massive amount of static content that the sites were expected to serve to their global user base. Also, as expected, most of the bot traffic consisted of uncacheable search requests against the site and therefore placed a disproportionate workload on the web servers.

The caching functionality (included with the botnet protections implemented) dramatically reduced the load on infrastructure (see Figure 7-7), providing much-needed headroom while the organization continued to upgrade its infrastructure.

Figure 7-7. Caching improvements created by the bot management solution

When Nothing Else Works: Using Very Sophisticated ML Engines with a Data Science Team

For this organization, there were about 20,000 suspect requests over a few days (from a total of 56,000,000 legitimate requests over the same period). Those suspect requests came from many IP addresses, with a clear pattern: the same IP/user would systematically go through the entire subcontent of the site and never come back, and would then be replaced by another IP looking for different subcontent. The traffic came from apparently legitimate user agents.

A data science team applied an unsupervised ML algorithm to see whether the pattern of these 60,000 requests could be identified. This is what was found:

• The attack traffic was interspersed within the legitimate traffic, spread over several hundred IPs. Graphical analysis of the attack was inconclusive (that is, human analysts were not able to see visual patterns), as demonstrated in Figure 7-8.

Figure 7-8. Example of attack traffic (in red) and legitimate traffic (in green), using a data visualization tool (x and y are irrelevant for this exercise; the goal is to identify patterns)

• The ML platform finally identified a very strong pattern (with a correlation of almost 100%, the pattern represents almost all the attack
traffic), using vectors of more than 120 elements (that is, more than 120 different pieces of information define an attack request). The pattern could only be identified over a space of 120 dimensions, as shown in Figure 7-9.

Figure 7-9. Example of a vector that defines the pattern of the attack traffic

Correlated elements

The highest correlated elements of the vector represent a random sort group of headers and header content (certain sorts of HTTPS headers and values can represent the main part of the attack traffic), as shown in Figure 7-10.

Figure 7-10. Main elements of the traffic vector, with correlation to the attack traffic

After the ML engine was deployed in production, the platform was able to identify almost all malicious traffic with only a few false positives.

The Results

The bot management solution allowed the media company to mitigate attack traffic using several strategies: ML-based, automated HIC (a behavior-based analysis of the traffic) and sophisticated, supervised ML that prevents traffic corresponding to a pattern that is hard for humans to visualize. This pattern was represented by vectors of more than 120 elements. Such ML was applied by a specialized team of data scientists.

At the same time, the company was able to apply controls to restrict the resources (bandwidth and CPU) allocated to illicit traffic. The company continues to work closely with the managed services provider that provided the cloud-based bot management solution in order to research and identify increasingly advanced (and, in many cases, custom) malicious bots and other targeted attacks.

Chapter 8. Looking Ahead: AI, ML, and Managed Security Service Providers

Chapter 5 discussed how AI and ML are improving bot detection, whereas Chapter 6 discussed other areas in security for which the
introduction of AI and ML has had a real impact. This chapter examines how managed security service providers (MSSPs) are incorporating these technologies and developing AI and ML techniques of their own, and how that can benefit your organization. Your organization might not be ready to adopt AI and ML solutions in-house: given the many challenges associated with AI and ML and the fact that most security teams are already overworked, the idea of adding new capability seems daunting. However, by using an MSSP, your organization can potentially reap the benefits of the investment that the MSSP has made in AI and ML technologies. Just as with anything else in security, a successful partnership with an MSSP does require work on your end, but it can help improve your organization’s security posture.

The MSSP as an AI and ML Source

MSSPs have always had an inherent advantage when it comes to security: rather than protecting a single organization from attacks, the MSSP is protecting hundreds or thousands of clients from all types of attacks. The security operations center (SOC) analysts who work for MSSPs are presented with thousands of different attacks at any time, and they work with organizations ranging from sprawling government agencies to small businesses. This means that the SOC analysts not only need to be able to quickly pivot from one attack type to another, they also need to be able to pivot from one environment, including the level of experience of the security person or team in that environment, to another.

That is why, over the years, MSSPs have developed the use of independent layers of threat detection techniques. Many of these MSSPs have had to build AI and ML solutions in-house to process billions of security events across thousands of different security solutions every hour. The results of that AI and ML become operational, tactical, and strategic threat intelligence that gets fed back into monitoring systems for all customers and allows MSSPs to
quickly respond to a customer threat and alert that customer in a manner in which the customer can process and act on the event.

Technologies such as anti-DDoS systems, web application firewalls (WAFs), and bot management solutions are fully capable of consuming operational and even tactical threat intelligence and can be used not only to detect threat actors, but also to stop their activity. This allows MSSPs to share information garnered from monitoring one customer’s systems with all other customers, irrespective of whether the other customers have the same system. Almost all of this is invisible to the customer, unless an alert is triggered.

More importantly, MSSPs are often incentivized to share what they have learned (without attribution, of course) not just with their customers, but with the broader internet. Through blogs, conference presentations, and white papers, MSSPs are helping customers and noncustomers alike better protect themselves against the most advanced bots.

MSSPs have been taking advantage of AI-enabled log management systems and tools to find the critical events that are of the highest importance. Some of the events could indicate that attackers are probing targeted victims in search of ways to get in. Other events could suggest that an attack is currently underway or indicate that an attack has already occurred and the attacker has achieved a foothold.

Time is of the essence when it comes to identifying and halting malicious activity. Believe it or not, according to a report by Trustwave, “the median number of days from the first intrusion to detection of the compromise decreased to 49 days in 2016 from 80.5 days in 2015, with values ranging from zero days to almost 2,000 days (more than five years).”[1]

What this means is that the time from a system being taken over to the time it’s detected is currently being measured in months, not minutes. Most organizations
that have been breached allow an attacker to remain resident in their networks for days, weeks, months, or even longer. Again, AI and ML promise to enable advanced persistent threat (APT) detection technologies that are likely to help reduce attacker dwell time to days, hours, or even seconds.

That is another advantage that an MSSP brings. The MSSP SOC monitors its customers’ security stack 24/7/365 and is constantly on the lookout for new types of intrusions. The MSSP SOC staff knows about new tactics long before most in-house SOCs do, and they apply that knowledge, using AI and ML, to all of their customers.

[1] 2017 Trustwave Global Security Report.

Cloud-Based WAFs Using AI and ML

WAF appliances installed within the data center were, for the longest time, a standard requirement for many enterprises to combat malicious traffic at the network layers. In recent years, the static, appliance-based WAF has been replaced by cloud-based WAF offerings. Cloud-based WAFs provide additional scalability, cost-effectiveness due to a lack of hardware spend, and the flexibility of real-time updates from the threat intelligence team that operates the cloud-based WAF. Because of the proliferation of malicious application layer attacks, such as volumetric DDoS and content scraping, cloud-based WAFs have almost become a requirement. Their ease of deployment, flexibility, expandability, and ability to rapidly deploy protections against newly discovered threats have made them an indispensable tool for any organization looking to protect its web applications.

As the frequency and breadth of application layer data breaches continue to increase throughout 2018, the use of cloud-based WAFs is likely to surge in lockstep. Investments from cloud providers to expand the functionality of their respective WAF offerings should drive a shift away from deploying third-party virtual machines (VMs) toward adopting proprietary alternatives that will still
be able to take advantage of well-recognized rule sets from pure-play security vendors such as Alert Logic, Fortinet, and F5. The use of ML and AI to bolster WAF rule sets and reputation feeds will increase, ensuring that applications are up to date with the most recent patches to better defend against previously unknown threats.

Addressing the Application Security Challenge

One of the greatest challenges of web application security is securing applications appropriately, without blocking good traffic. It is quite the balancing act for those configuring and tuning edge defenses. For example, WAFs often take months to tune effectively, while at the same time DevOps groups are turning out application updates at intervals that are outpacing their SecOps counterparts. This is where AI and ML come in.

With AI and ML, operators can teach the WAF to get better at its job by reducing false positives and negatives in an extremely short period of time. The time to tune an ML-enabled WAF is often measured in hours, not months, and those that embrace the technology are beginning to stay ahead of DevOps, and of attackers as well.

In truth, exploited application vulnerabilities are the primary cause of web application data breaches, and WAFs are one of the most difficult technologies to use effectively. As today’s security vendors begin to embed AI and ML functionality into their cloud-based WAF technology, they are enabling the human-computer synergy so badly needed in web application security. WAF vendors who are not embracing AI and ML, for a host of different reasons, will eventually go by the wayside, like their first-generation firewall counterparts.

Chapter 9. Conclusion: Why AI and ML Are Pivotal to the Future of Enterprise Security

There is no doubt that AI- and ML-enabled technologies are already a critical part of many security teams’ arsenals. Despite the hesitation many in security have
around AI and ML, especially as buzzwords, the fact is that many security tools are already using AI and ML behind the scenes. There are just too many new and evolving threats for even the largest security team to effectively track them. AI and ML allow security vendors and security teams to focus on their core mission while letting the AI and ML do the bulk of the grunt work to build better security solutions.

Here are some steps that organizations can follow when adopting AI and ML:

• Embrace AI and ML approaches
• Agree that this is where security is going
• Develop a team to investigate the feasibility of using what is available now
• Stay abreast of what is coming
• Document and track all your research and findings

AI and ML might become very important because of regulations like the General Data Protection Regulation (GDPR) in the European Union, the Personal Information Protection and Electronic Documents Act in Canada, the California Consumer Privacy Act of 2018, and other regulations that are likely coming soon. An organization must do everything possible to protect the consumer-based data it maintains. Organizations that fail to do so will face huge fines.

However, what most organizations don’t realize is that these regulations do not advise on what types of technology are needed to protect the data of their customers and employees. The regulations broadly state that it is the responsibility of the organization to do everything possible to keep applications and data secure. Reading through the specific language used in these regulations, you will often find terms such as “reasonable security procedures,” “appropriate practices,” or “nature of the information to protect.”

What this means is that an organization that has been breached will need to demonstrate, most likely in a court of law, that everything possible was done to protect the personally identifiable information and other data it stores. This points to the concept of due care, which is defined as the
effort made by an ordinarily prudent or reasonable party to avoid harm to another.

One can easily envision the courtrooms of the future in which the defendants will be CISOs, CIOs, and CEOs of major corporations standing in front of a jury of their peers (or even worse, standing in front of a group of their government legislators), trying to explain why they did not exercise due care similar to their peers. It doesn’t take a lot of imagination to envision this; just look at the testimony of Mark Zuckerberg (CEO of Facebook) or Richard Smith (former CEO of Equifax) in front of Congress.

Moving forward, as AI and ML become embedded in the existing tools in use today, or arrive already baked into the new tools making their way to the market, highly skilled human operators will still be needed to understand how to use the tools to their fullest ability. Just as pilots understand their aircraft to an extreme degree, security professionals will need to understand the AI- and ML-enabled tools at their disposal. Or to put it another way, no one shows up to a modern-day battlefield carrying a spear.

The future of AI-enabled security is quite promising. Organizations are already beginning to understand how to operate their human-computer, AI- and ML-enabled defenses more like pilots operate their fighter jets.

In that spirit, attackers beware. Modern-day cyberpilots are getting better equipped and becoming much smarter at defeating your attacks.

About the Authors

Laurent Gil runs product strategy for internet security at Oracle Cloud. A cofounder of Zenedge Inc., Laurent joined Oracle Dyn Global Business Unit in early 2018 with Oracle’s acquisition of Zenedge. Prior to that, Laurent was CEO and cofounder of the facial recognition software and machine learning company Viewdle, which was acquired by Google in 2012.
Laurent holds degrees from the Cybernetic Institute of Ukraine (Doctorate Honoris Causa), the Wharton School of Business (MBA), Supélec (M.Sc., Computer Science and Signal Processing), and the Collège des Ingénieurs in Paris (post-graduate degree, Management), and graduated summa cum laude from the University of Bordeaux (B.S., Mathematics).

Allan Liska is a solutions architect at Recorded Future. Allan has more than 15 years of experience in information security and has worked as both a blue teamer and a red teamer for the intelligence community and the private sector. Allan has helped countless organizations improve their security posture using more effective and integrated intelligence. He is the author of The Practice of Network Security, Building an Intelligence-Led Security Program (Syngress), and NTP Security: A Quick-Start Guide (Apress), and the coauthor of DNS Security: Defending the Domain Name System (Syngress) and Ransomware: Defending Against Digital Extortion (O’Reilly).


