LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (2024)

The Wonderful World of AI

As a small kid, I remember watching flying monkeys, talking lions, and houses landing on evil witches in the film The Wizard of Oz and thinking how amazing it was. Once the curtain pulled back, exposing the wizard as a smart but ordinary person, I felt slightly let down. The recent explosion of AI, and more specifically, large language models (LLMs), feels similar.

On the surface, they look like magic, but behind the curtain, LLMs are just complex systems created by humans. While impressive, LLMs are still within the realm of reality — and we can manage and protect them just as we have done with other technologies for decades.

Even though every software product, dishwasher, and dentist claims, “Now with AI!”, little has been done to help protect LLMs and their users. Over the past year, we’ve seen exploitation of LLM-based applications, causing them to operate outside their guardrails. For example:

There’s a view that defending these sophisticated tools is an insurmountable challenge. This misconception arises from the perceived complexity of LLMs and the rapid pace of AI advancements. Dispelling this myth is crucial, as we are seeing numerous LLM-based applications deployed with very little security in mind.

Despite their advanced capabilities, we can defend LLMs using established cybersecurity principles and methodologies. We can implement effective defenses by understanding the specific vulnerabilities and threat vectors associated with LLMs.

In this blog, we examine threats to LLMs and provide some examples, using the OWASP Top 10 framework and Splunk, to better defend LLM-based applications and their users.

Unveiling the top threats to LLMs today

We wanted a framework to follow that clearly outlines the threats to LLMs, and one of the best we found was the OWASP Top 10 for Large Language Model Applications.

If you’re unfamiliar, OWASP stands for the Open Worldwide Application Security Project. OWASP has a lofty but simple vision of “No more insecure software”. The most well-known OWASP project is the OWASP Top 10, which concentrates on web application security and has been around for more than 20 years.

Building on the original Top 10, the more specific Top 10 for LLM Applications framework provides clear guidance on:

  • Vulnerabilities within LLMs
  • Where in the application flow the vulnerabilities exist
  • Some suggestions on how to mitigate them

Below are the Top 10 threats to LLM Applications as defined by OWASP:

LLM01:
Prompt Injection
Prompt Injection Vulnerability occurs when an attacker manipulates a large language model (LLM) through crafted inputs.
LLM02:
Insecure Output Handling
Insecure Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems.
LLM03:
Training Data Poisoning
Training data poisoning refers to manipulation of pre-training data or data involved within the fine-tuning or embedding processes to introduce vulnerabilities.
LLM04:
Model Denial of Service
An attacker interacts with an LLM in a method that consumes an exceptionally high amount of resources, which results in a decline in the quality of service for them and other users.
LLM05:
Supply Chain Vulnerabilities
The supply chain in LLMs can be vulnerable, impacting the integrity of training data, ML models, and deployment platforms.
LLM06:
Sensitive Information Disclosure
LLM applications have the potential to reveal sensitive information, proprietary algorithms, or other confidential details through their output.
LLM07:
Insecure Plugin Design
LLM plugins are extensions that, when enabled, are called automatically by the model during user interactions. They are driven by the model, and there is no application control over the execution.
LLM08:
Excessive Agency
Excessive Agency is the vulnerability that enables damaging actions to be performed in response to unexpected/ambiguous outputs from an LLM (regardless of what is causing the LLM to malfunction).
LLM09:
Overreliance
Overreliance can occur when an LLM produces erroneous information and provides it in an authoritative manner. While LLMs can produce creative and informative content, they can also generate content that is factually incorrect, inappropriate, or unsafe.
LLM10:
Model Theft
This entry refers to the unauthorized access and exfiltration of LLM models by malicious actors or APTs. This arises when the proprietary LLM models (being valuable intellectual property), are compromised, physically stolen, copied, or weights and parameters are extracted to create a functional equivalent.

How we collected the data

Instead of directly interfacing with each LLM that arises, we looked at where we could make the biggest impact and decided on prompts and responses. Prompts are the inputs on which the LLM works, and responses are what the LLM answers back with.

We found that we could sit to the side of the LLM and easily capture these prompts and responses for our analysis.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (1)

High-level diagram outlining our logging approach

How did we do this? Logging. We instrumented our LLM-based application (a simple chat interface) to log all prompts, responses, and other items of interest (token use, latency, etc), which were then ingested into Splunk using the Splunk Open Telemetry Collector (OTel). We used the OTel collector instead of the more commonly used Universal Forwarder, as we also wanted to collect metrics and traces as part of our LLM-based Application stack.

We wrote the frontend ourselves, so we were able to include relevant logs and metrics at development time, which were collected using the OTel Collector’s journald receiver. We then sent the data to our Splunk instance via the built-in HTTP Event Collector exporter. For the backend LLM service (llama-cpp-python), we ran it with the OpenTelemetry auto-instrumentation package to collect telemetry at runtime.

Detecting LLM threats

We found that five of the OWASP Top 10 could be addressed using just the prompts and responses from the LLM:

  • LLM01: Prompt Injection
  • LLM02: Insecure Output Handling
  • LLM04: Model Denial of Service
  • LLM06: Sensitive Information Disclosure
  • LLM10: Model Theft

Our research is still theoretical in nature. We’ve developed detections for these areas that work in our lab, but if I’m being honest with you (would I ever lie to you?), I’d tweak and tune for use in your own environments.

Another moment of honesty here — regardless of vendor, most out-of-the-box detections should be tweaked and tuned to best suit your environments. (I’ll climb off my soapbox now.)

LLM01: Prompt Injection

Detecting prompt injection involves many interconnected factors. Prompt injection attacks manipulate an LLM by feeding it malicious inputs designed to alter its intended output. The complexity arises from the numerous ways an attacker can craft these inputs, making it difficult to create a simple detection rule.

In such cases, traditional search methods might fall short. But, by leveraging machine learning and anomaly detection techniques within Splunk, we can build models that identify unusual patterns indicative of a prompt injection attack.

Luckily, Protect AI already created a model for prompt injection detection that we could utilize. However, as with any public model, make sure you do your due diligence, as they can be a risk unto themselves. For our research, we built a Jupyter Notebook that took data from Splunk, passed it through this detection model, and then outputted the results back into Splunk.

The first stage of most Jupyter Notebooks is where you prepare your environment to run the subsequent code by installing and importing various Python libraries.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (2)

Jupyter Notebook library import phase

In Stage 1, we collect the necessary data from Splunk for use within the notebook. You can read more in the image below, but the main aspect is utilizing the fit command to forward the search results into our container.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (3)

Jupyter Notebook data collection phase

In Stage 2, we define our tokenizer, model, and classifier function. This is where the Notebook's main work will take place.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (4)

Jupyter Notebook model initialization phase

Further down in the Notebook, we set a relatively high threshold for detecting prompt injection (the closer you get to 1.0, the closer the match). However, this can be easily tweaked and tuned, depending on how many false positives you’re willing to live with.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (5)

Jupyter Notebook threshold code

The last thing done in our Notebook is passing the data we collected from Splunk in Stage 1 through our model and then passing the results back to Splunk as a new sourcetype, which we’ve called prompt:injection.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (6)

Jupyter Notebook code for running data through a model and exporting to Splunk

As shown in the screenshot below, the top three results from our dataset are all prompt injection attempts, which the model correctly identifies with high probability.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (7)

Splunk prompt injection detection results

LLM02: Insecure Output Handling

Insecure output handling can impact both the users of the LLM and the LLM-based application itself:

  • Cross-site scripting (XSS) could impact a user’s browser if unsanitized LLM responses are allowed to be presented back.
  • SQL injection (SQLi) could pose problems for backend systems if user prompts are also not sanitized before being processed.

Here’s what I want to make clear: you should implement basic web application security measures for your LLM-based applications. And you should also create your detections to help pick up when the bad stuff slips through the cracks.

Both XSS and SQLi detections can make use of regex, but that becomes cumbersome very quickly. As shown in the prompt injection example, we utilized Jupyter Notebooks to detect the presence of XSS and SQLi in our data. We couldn’t find publicly available models to detect either of these threats, so we trained our own models. Running through the training of the models would be a blog unto itself, so I’ll summarize the steps we took along with our results here.

XSS

For XSS, we attempted to train models using multiple algorithms:

  • Logistic Regression
  • Random Forest
  • Gradient Boosting
  • Support Vector Machine

We combined multiple public datasets (they are malicious, so I won’t link them here) and then ran them through our training to determine the best algorithm for XSS detection.

The images shown below are called confusion matrices and are a way to display true and false positives. Number 1 is our label for actual XSS data, while 0 is our label for benign data. Essentially, we want bigger numbers in the darker boxes and smaller numbers in the lighter boxes. A final score, the F1 score, is then calculated to give us an average score useful for evaluating the accuracy of our models.

Logistic Regression provided very good results, with an F1 score of .9898.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (8)

XSS Logistic Regression training results

Random Forest provided slightly better results with an F1 score of .9934.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (9)

XSS Random Forest training results

Gradient Boosting came in a bit lower with an F1 score of .9855.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (10)

XSS Gradient Boosting training results

Support Vector Machine also had a very high F1 score of .9939.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (11)

XSS Support Vector Machine training results

All of the algorithms had F1 scores high enough for us to consider them usable for detecting XSS. That means we need to look at additional factors to determine the best algorithms for our use case. So, we processed 33,426 events in Splunk, looking for XSS. Below are the times it took to process each algorithm:

AlgorithmTime to Process
Logistic Regression52.40 seconds
Random Forest60.99 seconds
Gradient Boosting52.11 seconds
Support Vector Machine99.48 seconds


Logistic regression and gradient boosting were very close to one another, with random forest not too far behind. Support vector machine took nearly twice as long to process our events. Unless the support vector machine algorithm had a much higher accuracy, I’d probably leave it out of the running at this point.

We now want to examine the other three algorithms to see if we can eliminate any based on other factors.

Here is a snippet of the decisions from each algorithm on whether or not a particular response was XSS. Logistic regression is the only algorithm that correctly identifies all three of these XSS responses as malicious.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (12)

XSS detections per algorithm

We have shown here our method of determining which algorithm best suits our use case. This time, I would lean toward logistic regression, but it’s always worthwhile to perform periodic tests to validate that your results haven’t drifted over time.

SQL Injection

For SQL Injection, we decided to fine-tune an existing model, DistilBERT, which many other security-related machine learning models have successfully used. As with the XSS example, we won’t outline all of the training steps; instead, we will just call out some major points.

DistilBERT is a smaller, faster, and more efficient version of the BERT (Bidirectional Encoder Representations from Transformers) model. I also want to point out that transformers don’t mean the toys you played with growing up — they are a neural network, and NVIDIA gives a great overview of them here.

Transformers allow us to use Graphics Processing Units (GPUs) to accelerate our training and detections. This NVIDIA blog explains the benefits of using GPUs for AI workloads in more detail.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (13)

Jupyter Notebook code to use a GPU if available

Model training typically involves multiple epochs where training is split into multiple sets with these benefits.

  • Improved accuracy: Multiple epochs allow the model to learn the underlying patterns in the data more effectively.
  • Overfitting & underfitting detection: By evaluating the model on a validation dataset after each epoch, we can detect overfitting (model performs well on training data but poorly on validation data) and underfitting (model performs poorly on both training and validation data).
  • Model tuning: Epoch training allows for monitoring and adjusting hyperparameters, such as learning rate, batch size, and the number of epochs, to improve model performance.

Our code implements a feature called early stopping. Early stopping is a technique used to halt training when the model's performance on a validation set stops improving, which can help prevent overfitting.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (14)

SQLi fine-tuned DistilBERT model training epochs

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (15)

SQLi fine-tuned DistilBERT model confusion matrix

After fine-tuning DistilBERT, our SQLi model returned with an F1 score of .9946. This is very accurate. We now have enough confidence to begin testing our detections using the model.

Using this model on a publicly available dataset containing SQL injections (I won’t link to this one either, as it is malicious), we see that we accurately identify SQL injections in our responses.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (16)

SQLi detection results

LLM04: Model Denial of Service

Denial of service is a threat that we can detect with some standard Splunk techniques. We log the token usage per prompt and response session, the users consuming these tokens, and the latency involved in the transactions. This helps us detect outliers where users may attempt to trick the system into consuming too many resources.

We looked at a variety of ways that users could over-consume resources and cause impact to services:

  • Continuous requests hit token limits, where token usage rates are high per prompt.
  • Queries crafted to trigger slow inference speeds, with complex inputs and low tokens/second.
  • High prompt rates overload the task queuing system because each backend can only support so many requests per node.

The queries for the above examples are all very implementation-specific, but if you’re using Splunk Observability, the APM detectors can provide excellent baselining and anomaly detection for these metrics if the telemetry is sent that way.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (17)

Attempt to consume system resources by repetitive cheese prompt

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (18)

Token usage stats from cheese prompt

LLM06: Sensitive Information Disclosure

To help detect sensitive information disclosure, we utilized a pre-existing SDK created by Microsoft called Presidio. Presidio is a powerful tool for identifying and anonymizing Personally Identifiable Information (PII). We used our Jupyter Notebook approach again to feed prompts and responses through Presidio to identify all types of PII data.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (19)

Presidio PII results

But as you can see in the image above, Presidio is identifying the same number as multiple types of PII. An initial solution was to take any data that Presidio marked as PII, and then send that data with some further context (50 characters before and 50 characters after) into another LLM to more accurately label the type of PII data.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (20)

Jupyter Notebook code to further analyze Presidio PII detection with LLM

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (21)

Response after sending Presidio results into an LLM for correct contextualisation

LLM10: Model Theft

We are still exploring model theft. Based on prior research, we believe that this threat can also be detected solely through the analysis of prompts and responses. The sticking point right now is finding a large enough dataset for our research.

However, the same methods that we used to detect XSS, SQLi, and PII data should also work very nicely for detecting model theft.

Conclusion

Our work analyzing the threats aligned to the OWASP Top 10 for LLM-based Applications should lead you down a path to better securing your own LLM-based applications. This landscape is changing as rapidly as any we’ve seen in the past, so undoubtedly, we’ll come across new threats but also new defenses to counter them.

We are currently building production detections for these areas with our teammates in the Splunk Threat Research Team (STRT). For now, please use our research and model development examples as the basis for better protecting your own LLM-based applications.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (22)

James Hodgkinson

James works as a Security Strategist within the SURGe team at Splunk. He calls Brisbane, Australia home and has spent much of his career supporting the state law enforcement and emergency services. When he's not solving problems in Python, he's being "that guy" about how Rust avoids the bugs of the past, or contributing to open source projects. He claims it's rarely DNS.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (23)

Shannon Davis

Security practitioner, Melbourne, Australia via Seattle, USA.

LLM Security: Splunk & OWASP Top 10 for LLM-based Applications | Splunk (2024)

FAQs

What is the security of LLM? ›

Importance of Security in LLM Usage

The implications of a data breach in LLM systems extend beyond simple data loss to include regulatory and reputational damage. Entities using LLMs must adopt rigorous data protection strategies, frequent security audits, and incident response plans to mitigate these risks.

What are the security vulnerabilities of large language model? ›

While offering significant advantages, these models are also vulnerable to security and privacy attacks, such as jailbreaking attacks, data poisoning attacks, and Personally Identifiable Information (PII) leakage attacks.

What is LLM best for? ›

The LLM was created for lawyers to expand their knowledge, study a specialized area of law, and gain international qualifications if they have earned a law degree outside the U.S. or Canada. If you're looking to advance your legal career or take the next step in your academic journey, an LLM could be for you.

How to make LLM safe? ›

1. How to ensure LLM safety? To ensure the safety of Large Language Models (LLM), it's important to implement robust data handling and processing protocols, regularly update and audit the models for biases or errors, and enforce strict access controls with data security tools like OptIQ.

What are the 4 main types of security vulnerability? ›

What are the 4 major types of security vulnerability?
  • Process (or procedural) vulnerabilities.
  • Operating system vulnerabilities.
  • Network vulnerabilities.
  • Human vulnerabilities.
Jan 12, 2024

Which of the following is an LLM vulnerability? ›

#1 Prompt Injection

Prompt Injection Vulnerability occurs when an attacker manipulates a large language model (LLM) through crafted inputs, causing the LLM to unknowingly execute the attacker's intentions.

What are the three models of vulnerability? ›

The triple vulnerability model (Barlow, 2000, 2002) posits that three vulnerabilities contribute to the etiology of emotional disorders: (1) general biological vulnerability (i.e., dimensions of temperament such as neuroticism and extraversion); (2) general psychological vulnerability (i.e., perceived control over life ...

What is an LLM guard? ›

LLM Guard is a permissively licensed open- source project that allows for rapid innovation, transparency, and is accessible to all. A commercial version with expanded features, capabilities, and integrations within the Protect AI platform is coming soon.

Is an LLM in the U.S. worth it? ›

Most LL. M. programs do not provide a significant early-career earnings boost compared to just getting a JD. Given their significant cost, they may not be worth it for most lawyers.

What is an LLM firewall? ›

LLM prompt firewalls evaluate users' prompts, thereby identifying and preventing malicious use or misuse. The firewall redacts sensitive information, preventing important data from being accessed or used by the LLM.

References

Top Articles
Latest Posts
Article information

Author: Lilliana Bartoletti

Last Updated:

Views: 6353

Rating: 4.2 / 5 (73 voted)

Reviews: 80% of readers found this page helpful

Author information

Name: Lilliana Bartoletti

Birthday: 1999-11-18

Address: 58866 Tricia Spurs, North Melvinberg, HI 91346-3774

Phone: +50616620367928

Job: Real-Estate Liaison

Hobby: Graffiti, Astronomy, Handball, Magic, Origami, Fashion, Foreign language learning

Introduction: My name is Lilliana Bartoletti, I am a adventurous, pleasant, shiny, beautiful, handsome, zealous, tasty person who loves writing and wants to share my knowledge and understanding with you.