Artificial Intelligence (AI) is hot right now; there’s no denying it. AI is new, it’s shiny, and it does some cool things. Almost every organization I talk to is using it, looking to use it, or building its own. It’s a gold rush, and if you’re not digging for gold, then you’re selling pickaxes to those who are. Many, if not most, of the organizations looking to leverage AI have asked themselves questions like: “What are the risks of AI?”, “Is AI secure?”, and “Is my data safe?” I intend to help answer those questions, and maybe a few others, by looking at AI through the lens of cybersecurity fundamentals.
Before we get into my observations, I want to set the stage regarding what AI is, and what it is not. I’m not going to go into the strict definitions of Large Language Models (LLMs), Artificial General Intelligence (AGI), Machine Learning (ML), Generative AI (GenAI), or Neural Nets. I am going to use the term “AI” broadly, but in general, I will be referring to GenAI and LLMs that utilize neural nets. There are AI models that do not use neural nets, but currently those are in the minority. AI doesn’t think; AGI is still a pipe dream given our current technological limitations. AI is a stochastic parrot. Based on the data a model was trained on, it will present a somewhat randomized response that is statistically likely to be considered “good.” The more data there is, and the more training there is, the more likely “good” responses are to occur. It becomes a brute-force math equation, and that’s why AI models need high-powered GPUs. GPUs are number-crunching math machines. We use the same hardware in cybersecurity to crack password hashes through brute-force methods.
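To make the “stochastic parrot” idea a bit more concrete, here is a deliberately tiny Python sketch. The prompt and the probability table are invented purely for illustration; a real model learns its statistics from enormous training sets and computes them at a vastly larger scale, but the basic principle of sampling a statistically likely response rather than reasoning one out is the same:

```python
import random

# A toy, invented table of next-word probabilities. A real LLM derives
# its statistics from huge training sets using billions of parameters
# on GPUs; nothing here reflects any actual model.
next_word_probs = {
    "the cat sat on the": {"mat": 0.7, "couch": 0.2, "moon": 0.1},
}

def complete(prompt: str) -> str:
    """Pick a continuation at random, weighted by how 'likely' it is."""
    options = next_word_probs[prompt]
    words = list(options)
    weights = list(options.values())
    return random.choices(words, weights=weights, k=1)[0]

# Usually "mat", sometimes "couch", occasionally the nonsensical "moon".
print(complete("the cat sat on the"))
```

More data and more training sharpen those weights, which is why “good” responses become more likely, and why the whole exercise is ultimately number crunching.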
Open vs. Closed Models
There is another important distinction between AI models: open and closed. Open models share data across all of their users, as with OpenAI’s ChatGPT; closed models keep an organization’s data to itself and do not share it between organizations. This gets to the first question posed above: “What are the risks of AI?” Well, in the case of an open model and the use of confidential or sensitive data, the obvious risk is the exposure of that confidential/sensitive information. This is what has led organizations wanting to leverage AI to build and train their own closed-model AI platforms. They solved the confidentiality problem, but did they answer the question, “Is AI secure?”
Security (cybersecurity/information security) is often thought of as an amalgamation of three things: Confidentiality, Integrity, and Availability. This is referred to as the “CIA Triad.” The focus of confidentiality is the assurance that data is restricted from unauthorized use; threats affecting confidentiality may lead to unauthorized disclosure. Data integrity is about the veracity of data and protecting it from being altered from its original protected state. There are three components to integrity:
- Preventing unauthorized parties from making modifications
- Preventing authorized parties from making unauthorized modifications (such as mistakes)
- Maintaining consistency of data in a verifiable manner
Last, but not least, availability is about ensuring that authorized parties have timely and uninterrupted access to data.
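To make the third integrity component less abstract, here is a minimal Python sketch (the record contents are made up for illustration) showing one common way to maintain consistency of data in a verifiable manner: capture a cryptographic hash of the data in its known-good state, then re-hash later to prove nothing has changed.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest that changes if the data changes at all."""
    return hashlib.sha256(data).hexdigest()

record = b"Patient 1234: allergy = penicillin"
baseline = fingerprint(record)  # captured while the data is known to be good

# Later, re-hashing verifies consistency; any modification is detectable.
assert fingerprint(b"Patient 1234: allergy = penicillin") == baseline
assert fingerprint(b"Patient 1234: allergy = none") != baseline
```

Keep this kind of verifiability in mind; it is exactly what the black-box discussion below runs up against.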
Trust Without Verification – AI’s Black Box
AI neural nets operate as a “black box,” and that’s where the fundamental security issue resides. Data and prompts go in and responses come out, but once the process has been initiated, there is no way to know what decisions the AI platform made to arrive at the responses it gave. As an example, if an autonomous vehicle strikes a pedestrian when we would expect it to hit the brakes, the black-box nature of the system means we cannot trace its decision-making to see why it hit the pedestrian instead of braking. This stands in stark contrast to the second and third components of integrity. Is AI secure? Fundamentally, no, it is not. With AI we lose control of the data, unauthorized modifications can be made to it, and those modifications may also undermine the consistency of the data going forward.
AI also fails to preserve data integrity when it mistakenly creates content with no basis in reality. Colloquially, this content is called an AI hallucination. Hallucinations may be as innocuous as six or seven fingers on the hand of a person in an AI-generated image, but they can be far more impactful. Consider the implications and potential liability if we were to trust AI to make medical diagnoses and treatment recommendations. How could we possibly require AI to take up the Hippocratic Oath, let alone follow it?
Let’s look at another example of data being out of our control while confidentiality is maintained: ransomware. A ransomware attack encrypts your data, and the attackers demand a ransom to release the encryption keys to you. In the case of ransomware, unauthorized parties have made modifications to your data by encrypting it, and in doing so they have also cut off your access to it. This is why ransomware is considered a breach, not just an incident, even if one can prove that no data exfiltration occurred (confidentiality is maintained). Does this mean that the use of AI, even a closed model, is considered a breach? I won’t pretend to be a lawyer or a judge, so I cannot make that determination. I can say that it does give me pause.
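To see why encryption by an outside party hits both integrity and availability, here is a minimal Python sketch using the third-party cryptography package (the key names and data are hypothetical, and this only demonstrates the effect on the CIA triad, not how an actual attack works):

```python
# Requires: pip install cryptography
from cryptography.fernet import Fernet, InvalidToken

attacker_key = Fernet.generate_key()   # known only to the attacker
original = b"Q3 forecast: revenue up 12%"

# Integrity: the stored bytes are no longer the original data.
ciphertext = Fernet(attacker_key).encrypt(original)
assert ciphertext != original

# Availability: without the attacker's key, the rightful owner cannot decrypt.
try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)
except InvalidToken:
    print("No key, no access: the data is unavailable to its owner.")
```

Confidentiality may be intact, yet two legs of the triad are broken, which is precisely why ransomware is treated as a breach rather than a mere incident.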
Trends in AI Implementation
So what are other people thinking and/or doing regarding AI implementation? In a recent whitepaper, PagerDuty polled 100 Fortune 1000 executives regarding AI: 25% of those executives do not trust AI, and 100% of them have concerns about security risks; 98% have paused AI implementation to establish policies, and only 29% have guidelines in place. Another report, by Elastic, polled 3,200 individuals from all over the world: 99% of respondents acknowledge the positive impacts of AI, yet 89% reported that their use of AI is being slowed. Among the various reasons given for that slowdown, fears around security and data privacy were cited by 40% of respondents.
Embracing the Cutting Edge
So let’s revisit the three questions from earlier.
- What are the risks of AI?
- This varies from organization to organization and from use case to use case. A proper risk assessment should be conducted to identify and analyze threats associated with AI.
- Is AI secure?
- No. AI is inherently and fundamentally insecure, at least until the black-box problem is solved.
- Is my data safe?
- No. Due to the black-box nature of AI, the “safety” of one’s data is not unlike the state of Schrödinger’s cat. Because we can’t look inside the black box at the data, it is both safe and unsafe at the same time.
I urge caution to organizations and individuals who are using AI, but I don’t want to entirely dissuade people from using it. There are plenty of tools that we use daily that are also inherently insecure, like SMS texting. Text messages are not end-to-end encrypted. Does that mean you shouldn’t text anyone? No, of course not, but people should be thoughtful about what they send via text message because it can be easily intercepted. Individuals can make the call about what they include in their own texts. I urge similar caution with the use of AI, particularly considering the sensitivity of the data in play and any regulations surrounding it. Understand the limitations and risks associated with the use of AI, and make decisions accordingly.