Skip to main content

Securing RAG & Agentic Chatbots with OWASP LLM Top 10

Β· 5 min read
Dinesh Gopal
Technology Leader, AI Enthusiast and Practitioner

Over the past two years, I’ve been working on AI applications πŸ€–, guiding organizations to build AI governance frameworks, responsible AI policies, and deploying production-ready systems.

From this experience, I can confidently say: figuring out the technical part is fun πŸŽ‰ and often the easier part. The bigger challengeβ€”and where most time is spentβ€”is building responsible AI practices and governance frameworks that scale across the enterprise.

In my previous post, I discussed how to approach AI governance and frameworks at the enterprise level. In this post, let’s go through a quick 101 on designing AI application architectures responsibly.

πŸ“– Reference: OWASP Top 10 for LLM Applications


πŸ—οΈ Why Architecture Matters in AI Applications​

The AI landscape changes daily ⚑, making it difficult to lock down a future-proof architecture. A good starting point is defining:

  • 🎯 The objective of the AI application
  • πŸ–₯️ The platform on which it will be built

These early decisions shape the system design and architecture.

For this discussion, let’s use an example: a domain-specific chatbot πŸ’¬ that uses customer data and a foundational model to generate responses. To make it more complex, we’ll add tool calling πŸ› οΈ and agents πŸ•ΉοΈ for real-time, domain-specific functions.


πŸ—‚οΈ Base Architecture​

Figure 1 – Basic Chatbot Architecture

OWASP_LLM_10_Base.drawio.png

At first glance πŸ‘€, this architecture may look ready for production. However, during an architecture review board πŸ§‘β€πŸ’» or discussions with security and compliance teams πŸ”, this base setup will quickly fall short.

Why? Because we haven’t yet considered the security, safety, and compliance risks 🚨 that can be exploited in such a design.

Just as we use OWASP Top 10 to secure web applications 🌐, OWASP has released the LLM Top 10β€”a framework to secure AI and LLM-powered applications.


🌍 OWASP GenAI Security Project​

The OWASP GenAI Security Project is a global, open-source initiative dedicated to identifying, mitigating, and documenting security and safety risks associated with generative AI technologies, including large language models (LLMs), agentic AI systems, and AI-driven applications.

βœ… This framework is an excellent starting point for both beginners πŸš€ and experts 🧠 to evaluate architecture, identify vulnerabilities, and mitigate risks.


❓ Key Security Questions for Chatbot Applications​

When designing AI applications, consider:

  • πŸ›‘οΈ How will the application handle prompt injection (at both input and output)?
  • πŸ” How is sensitive data (PII) handled? Is it anonymized or masked?
  • πŸ“¦ How is training data stored and secured? What if training data is poisoned?
  • 🧩 How are custom libraries and tools secured? Are they scanned for vulnerabilities?
  • πŸ“’ Does the application disclose its use of AI and align with Responsible AI policies?
  • 🧬 How are fine-tuned or custom models protected? What happens if they’re exposed?

πŸ”Ÿ OWASP LLM Top 10​

Here are the 10 key risks to consider:

  1. πŸ“ LLM01: Prompt Injection
  2. ⚠️ LLM02: Insecure Output Handling
  3. πŸ§ͺ LLM03: Training Data Poisoning
  4. πŸ›‘ LLM04: Model Denial of Service
  5. πŸ”— LLM05: Supply Chain Vulnerabilities
  6. πŸ”’ LLM06: Sensitive Information Disclosure
  7. 🧩 LLM07: Insecure Plugin Design
  8. πŸ€– LLM08: Excessive Agency
  9. πŸ‘€ LLM09: Overreliance
  10. πŸ•΅οΈ LLM10: Model Theft

πŸ—ΊοΈ Mapping to Architecture​

Figure 2 – Mapping OWASP LLM Top 10 to Architecture

OWASP_LLM10.drawio.png

By applying these guidelines, you can create a matrix πŸ“Š that scores your architecture against the OWASP framework. This provides:

  • A baseline security posture πŸ” for AI applications
  • A reference template πŸ“‘ for future system design
  • A governance-aligned approach πŸ›οΈ to AI architecture

πŸ“Š OWASP LLM Top 10 – Scoring Matrix Template​

Each item can be scored on a scale of 1–5 (1 = poor 🚫, 5 = strong πŸ’ͺ).

πŸ†” OWASP Risk IDπŸ›‘ Risk CategoryπŸ“– Description🏷️ Score (1-5)πŸ› οΈ Notes / Mitigation Plan
LLM01Prompt InjectionProtection against prompt injection attempts
LLM02Insecure Output HandlingValidation and sanitization of model outputs
LLM03Training Data PoisoningSafeguards against corrupted training data
LLM04Model Denial of ServiceRate limiting, monitoring, and throttling
LLM05Supply Chain VulnerabilitiesVerification of datasets, plugins, libraries
LLM06Sensitive Info DisclosureAnonymization, masking, encryption of PII
LLM07Insecure Plugin DesignPlugin isolation and secure coding practices
LLM08Excessive AgencyControls to limit agent autonomy
LLM09OverrelianceHuman-in-the-loop and fallback mechanisms
LLM10Model TheftAccess controls, encryption, monitoring

πŸ§ͺ Sample Scoring Matrix: Chatbot + RAG + Agent​

Here’s a worked example for a domain-specific chatbot πŸ’¬ that uses RAG (Retrieval Augmented Generation πŸ“š) with tool calling πŸ› οΈ and agentic workflows πŸ€–.

πŸ†” OWASP Risk IDπŸ›‘ Risk CategoryπŸ“– Description🏷️ Score (1-5)πŸ› οΈ Notes / Mitigation Plan
LLM01Prompt InjectionModerate risk, mitigated with input/output filters3Add context validation + regex sanitization
LLM02Insecure Output HandlingHigh risk due to tool execution2Enforce strict schema validation + guardrails
LLM03Training Data PoisoningModerate risk if knowledge base ingestion is not validated3Add data quality checks + signed data sources
LLM04Model Denial of ServiceHigh risk (agents can loop or generate heavy queries)2Add rate limiting + monitoring
LLM05Supply Chain VulnerabilitiesPlugins & APIs could be compromised3Use dependency scanning & signed artifacts
LLM06Sensitive Info DisclosureRAG may retrieve PII or confidential data2Add anonymization + retrieval filters
LLM07Insecure Plugin DesignHigh risk with tool calling2Implement zero-trust plugin execution
LLM08Excessive AgencyAgents may overstep bounds2Add role-based execution policies
LLM09OverrelianceUsers may blindly trust answers3Add disclaimers + confidence scoring
LLM10Model TheftLower risk in managed cloud (e.g. Bedrock)4Rely on provider safeguards + IAM

🎯 Final Thoughts​

The OWASP LLM Top 10 is not just a checklistβ€”it’s a security lens πŸ” for AI system design.

By using it in combination with your enterprise AI governance framework, you’ll be better equipped to build secure πŸ”, responsible 🌱, and accountable πŸ“Š AI applications that can withstand real-world risks.