In the world of GitOps, the speed of delivery is crucial, but it cannot come at the expense of security. We all know the scenario: a scanner detects a critical vulnerability (CVE), the pipeline grinds to a halt, and a developer has to spend hours hunting for the right image version or a configuration fix.
What if the delivery pipeline didn’t just find the errors, but prepared the solution itself?
In this article, you’ll find out how we built a system that leverages an AI agent, a vector database containing CIS Benchmarks and the Trivy scanner to automatically generate merge requests with security patches.
Advantages of integrating a retrieval-augmented generation (RAG) AI agent into the DevOps pipeline
In a RAG architecture, large language models (LLMs) consult an external knowledge base before they provide an answer. This knowledge base can include Common Vulnerabilities and Exposures (CVE) repositories, image documentation and your organization’s security policies. RAG enables AI models to give precise and verifiable recommendations based on the real, up-to-date context of the project.
Cloud-native architectures usually generate hundreds of container images in different versions. Manually keeping track of CVEs for every container image is costly and practically impossible. Classic scanners like Trivy, Grype and Snyk detect vulnerabilities, but an expert still needs to understand the context and suggest the right security improvement. This can create a bottleneck that hinders the shift-left security approach.
By integrating a RAG AI agent into your CI/CD pipeline, you automate this entire decision-making process. The AI agent points out what needs to be fixed and recommends how to fix it, taking into account your tech stack and company policies. This results in:
- Shorter mean time to respond (MTTR),
- Reduced alert fatigue,
- Less repetitive work for DevSecOps teams,
- Risk exposure time cut from days to minutes.
Additionally, every AI-driven decision is auditable, which supports compliance with regulations like NIS2 and DORA.
RAG agent system architecture
Our process is based on an intelligent feedback loop where AI acts as a “security assistant.” At the heart of the solution is a Python application integrated with a large language model (LLM), serving as a bridge between scan results and source code. Following design patterns used in graph-based flow solutions (e.g., LangGraph, LangChain), the processing operates within a structured architecture.
The key components of this solution include:
- Scanner – Trivy (and/or Kubescape): The first line of defense, responsible for scanning container images and file systems for vulnerabilities.
- Vector Database (RAG) with Center for Internet Security (CIS) Benchmarks: Instead of relying solely on the general knowledge of an LLM (like the GPT-5 model family), we feed it precise CIS standards stored in a database like pgvector. This ensures the AI agent knows exactly which configuration in a Kubernetes environment is considered secure.
- Cloud AI Assistant (Graph Agent): The “brain” of the operation. It follows a Map-Reduce pattern: it distributes the analyzed files across independent nodes in a graph that scan manifests and generate fixes asynchronously (generate_manifest_fix_proposition), and then safely merges the resulting patches (merge_node). The system then analyzes the updated reports and the context from the vector database to provide ready-to-use code.
- Skopeo: An essential tool for interacting with remote repositories, it enables the dynamic retrieval of available tags for analyzed container images.
- LangGraph: This framework serves as the technological and architectural backbone. It helps build a complex, multi-node decision graph and precisely control the flow of information between agents. This library is what enables the Map-Reduce approach, making it possible to asynchronously split tasks for concurrent scanning (Map) and perform a controlled merge of partial fixes into one coherent list (Reduce).
- GitOps Workflow: The entire process is triggered automatically with every commit to the configuration repository within the CI/CD chain.
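The Map-Reduce flow described in the component list can be illustrated with a minimal sketch. It uses plain asyncio as a stand-in for LangGraph, so it shows the shape of the pattern rather than the actual graph definition. The node names generate_manifest_fix_proposition and merge_node come from the description above; their signatures, the patch format and the finding IDs are assumptions made for illustration, and the LLM call is replaced by a stub.

```python
import asyncio


async def generate_manifest_fix_proposition(path: str, findings: list[str]) -> dict:
    """Map node (stub): propose fixes for a single manifest.

    Assumption: a real node would send the manifest, the scanner findings
    and the retrieved CIS context to the LLM. Here we only fake the result.
    """
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"file": path, "patches": [f"fix for {finding}" for finding in findings]}


def merge_node(partials: list[dict]) -> list[dict]:
    """Reduce node: merge partial results into one ordered, de-duplicated list."""
    merged: dict[str, list[str]] = {}
    for partial in partials:
        bucket = merged.setdefault(partial["file"], [])
        for patch in partial["patches"]:
            if patch not in bucket:
                bucket.append(patch)
    return [{"file": name, "patches": patches} for name, patches in sorted(merged.items())]


async def run_map_reduce(findings_by_file: dict[str, list[str]]) -> list[dict]:
    # Map: one concurrent task per manifest file.
    tasks = [
        generate_manifest_fix_proposition(path, findings)
        for path, findings in findings_by_file.items()
    ]
    partials = await asyncio.gather(*tasks)
    # Reduce: a single controlled merge of all partial fixes.
    return merge_node(list(partials))


# Illustrative input: file names and finding IDs are made up for this example.
result = asyncio.run(run_map_reduce({
    "deployment.yaml": ["KSV001", "KSV012"],
    "cronjob.yaml": ["KSV001"],
}))
print(result)
```

Keeping the merge in a single reduce step means that concurrent map tasks never write to shared state, which is one plausible reason to prefer this split over letting each task patch the repository directly.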
AI-driven security scanning workflow: From commit to merge request
How does this work in practice? The main process flow follows a few repeatable steps:
1. Trigger: A DevOps engineer commits to the GitOps repository (e.g., changing an image version or tag).
2. Analysis: The CI/CD pipeline runs the Python application, which triggers a Trivy scan on specific files (e.g., manifest.yaml).
3. Benchmark Verification: If Trivy detects an error, the application searches the vector database for relevant CIS Benchmark recommendations.
4. Fix Generation: The AI agent receives a prompt: “Here is the error from Trivy, and here are the CIS recommendations. Fix the Dockerfile/Kubernetes manifest to eliminate the vulnerability.” Changes are then verified for syntactic correctness, for example, via a local dry run such as kubectl apply --dry-run=client.
5. Merge Request: The system automatically creates a new branch and opens a merge request/pull request with a description of the generated changes.
6. Human-in-the-Loop: An experienced DevSecOps engineer reviews the ready-made fix and, after verification, accepts it with one click for final deployment.
AI agent-enhanced application deployment process flow within the CI/CD pipeline.
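Steps 2 to 4 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the production application: the report is a hand-written sample that mimics the JSON layout Trivy produces with --format json, the guidance texts are placeholders rather than quotes from the CIS Benchmarks, and the keyword-count embed function is a toy replacement for a real embedding model and a pgvector similarity query. All helper names are hypothetical.

```python
import math

VOCAB = ("privilege", "root", "network")  # toy vocabulary (assumption)


def embed(text: str) -> list[float]:
    """Toy embedding: keyword counts. A real system would call an embedding model."""
    lowered = text.lower()
    return [float(lowered.count(word)) for word in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def extract_findings(report: dict) -> list[dict]:
    """Flatten vulnerabilities and misconfigurations from a Trivy-style JSON report."""
    findings = []
    for result in report.get("Results") or []:
        for key in ("Vulnerabilities", "Misconfigurations"):
            for item in result.get(key) or []:
                findings.append({
                    "target": result.get("Target", ""),
                    "id": item.get("VulnerabilityID") or item.get("ID", ""),
                    "severity": item.get("Severity", "UNKNOWN"),
                    "title": item.get("Title", ""),
                })
    return findings


def retrieve(query: str, store: list[str], k: int = 1) -> list[str]:
    """Stand-in for the vector database lookup: rank texts by cosine similarity."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda text: cosine(query_vec, embed(text)), reverse=True)
    return ranked[:k]


def build_prompt(finding: dict, guidance: list[str], manifest: str) -> str:
    recommendations = "\n".join(f"- {line}" for line in guidance)
    return (
        f"Here is the error from Trivy: {finding['id']} ({finding['severity']}) {finding['title']}\n"
        f"Here are the CIS recommendations:\n{recommendations}\n"
        "Fix the Kubernetes manifest to eliminate the vulnerability.\n"
        f"Manifest:\n{manifest}"
    )


# Hand-written sample data, not real scanner output or real CIS text.
SAMPLE_REPORT = {"Results": [{"Target": "manifest.yaml", "Misconfigurations": [
    {"ID": "KSV001", "Title": "Process can elevate its own privileges", "Severity": "MEDIUM"},
]}]}
CIS_STORE = [
    "Placeholder: containers should not allow privilege escalation.",
    "Placeholder: containers should not run as root.",
]
MANIFEST = "apiVersion: v1\nkind: Pod\nmetadata:\n  name: demo\n"

findings = extract_findings(SAMPLE_REPORT)
guidance = retrieve(findings[0]["title"], CIS_STORE)
prompt = build_prompt(findings[0], guidance, MANIFEST)
print(prompt)
```

In a pipeline, the report would come from the scanner's JSON output instead of SAMPLE_REPORT, and the resulting prompt would be passed to the LLM before the dry-run check in step 4.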
Autonomous search for the safest images with a live AI agent
One of the most innovative elements of our system is a dedicated path for the AI agent to find the safest alternative for a used container image. This process runs fully “live” and consists of several interconnected steps within the decision graph:
- Fetching Tags: The agent uses Skopeo to list all available tags for a given image directly from the registry (e.g., Docker Hub, Harbor).
- Selection of Variants: The LLM verifies the list and selects a representative pool of diverse tags for analysis, giving preference to various major/minor versions and special variants (e.g., alpine, slim).
- On-the-Fly Scanning: The system dynamically runs Trivy for each selected image variant.
- Selecting the Safest Option: The AI agent verifies the results and chooses the version with the fewest vulnerabilities. It then updates the tag in the target Kubernetes manifest.
Why is this so important? Traditional systems and static scanners usually only tell you that an image has vulnerabilities or that using the :latest tag is dangerous. They don’t tell you which specific version you should switch to. Simple scripts might hard-code replacements, but an autonomous agent connected to Skopeo and a scanner can make an informed decision: it checks in real time which available version has the fewest known vulnerabilities at that moment, removing the tedious manual process of downloading and scanning dozens of versions.
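The tag-selection path can be sketched as follows. This is a minimal illustration that assumes skopeo and trivy are installed and on the PATH. The helper names (list_tags, scan_image, count_vulnerabilities, pick_safest) are hypothetical, the LLM-driven choice of candidate tags is left out, and the tie-break by tag name is an arbitrary choice made only to keep the result deterministic. The demo at the bottom runs on hand-written reports, so it does not call either tool.

```python
import json
import subprocess


def list_tags(repository: str) -> list[str]:
    """List tags with `skopeo list-tags`, which prints JSON containing a "Tags" array."""
    out = subprocess.run(
        ["skopeo", "list-tags", f"docker://{repository}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out).get("Tags", [])


def scan_image(image_ref: str) -> dict:
    """Scan one image with Trivy and return the parsed JSON report."""
    out = subprocess.run(
        ["trivy", "image", "--quiet", "--format", "json", image_ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def count_vulnerabilities(report: dict) -> int:
    """Count all vulnerabilities across the targets of a Trivy-style report."""
    return sum(len(result.get("Vulnerabilities") or []) for result in report.get("Results") or [])


def pick_safest(reports_by_tag: dict[str, dict]) -> str:
    """Return the tag with the fewest vulnerabilities (ties broken by tag name)."""
    return min(reports_by_tag, key=lambda tag: (count_vulnerabilities(reports_by_tag[tag]), tag))


def fake_report(n: int) -> dict:
    """Build a hand-written report with n placeholder vulnerabilities (demo only)."""
    return {"Results": [{"Target": "demo", "Vulnerabilities": [
        {"VulnerabilityID": f"CVE-EXAMPLE-{i}"} for i in range(n)
    ]}]}


# Illustrative data: tags and counts are made up for this example.
sample_reports = {
    "1.25": fake_report(3),
    "1.25-alpine": fake_report(1),
    "1.24-slim": fake_report(2),
}
safest = pick_safest(sample_reports)
print(safest)
```

In a live run, the reports would be built with something like {tag: scan_image(f"{repository}:{tag}") for tag in selected_tags}, where selected_tags is the pool chosen from list_tags(repository). A raw count treats every finding equally, so a real agent might also weigh severity before updating the manifest.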
Benefits of integrating a RAG AI agent into your DevOps pipeline
Traditional security scanning often results in “information noise.” The solution we’ve developed shifts the focus from reporting to remediation and significantly reduces the manual effort required to run this process. For a real-world example of these productivity gains, take a look at the data from 11 pilot projects where we implemented our RAG AI agent. Each project involved around 20 Kubernetes services and used three to four images.
[Figure: productivity data from the 11 pilot projects]
Along with speeding up the process, integrating the RAG AI agent in security scanning also offers the following benefits:
- Technical Debt Reduction: Fixes are created immediately, rather than waiting for a slot in a future sprint.
- Team Education: AI-generated merge requests include justifications based on CIS standards, constantly raising security awareness among developers.
- Compliance: The vector database grounds fixes in the official CIS guidelines, which reduces the risk of hallucinations or guesses by the model.
Strengthen your DevSecOps with AI
Integrating AI into CI/CD pipelines delivers real time savings at a significant scale and raises the security bar for Kubernetes clusters. The RAG AI agent showcased in this article doesn’t replace human experts; it removes the most repetitive and tedious parts of their job, accelerating the security scanning process and freeing them to apply their expertise to other value-added tasks.
The workflow you’ve seen earlier focuses on the shift-left security approach. In the next stage of developing this solution, we’re expanding it to the shift-right approach using tools like Falco and Kyverno, alongside an AI agent on the runtime side, to detect vulnerabilities in services that are already running.
If you want to find out how our DevOps and AI experts can help you boost operational efficiency with AI, get in touch with us. You’ll be able to see the demo of the RAG AI agent and learn more about tailoring AI to support your processes.
FAQ
What are the key components of the RAG AI agent for security scanning?
This solution involves a scanner like Trivy or Kubescape, a vector database with information about CIS Benchmarks, a cloud-based AI agent, a container image management tool (Skopeo), an agent orchestration framework (LangGraph) and a GitOps workflow.
What are the main steps in the AI-driven security scanning process?
When an operator creates a commit to the GitOps repository, it triggers a Python application and a Trivy scan. When an error is detected, the application looks for CIS Benchmark recommendations. Then the AI agent generates a fix to eliminate the vulnerability and opens a merge request. Finally, the proposed fix is verified and approved by a human operator.
How can companies benefit from integrating AI into security scanning?
This solution helps speed up the security scanning process, strengthen security, boost automation in the CI/CD pipeline, reduce technical debt, increase security awareness within a development team and enhance compliance.
About the author: Mateusz Kozieł
DevOps Engineer
With over 4 years’ experience in software development, Mateusz has been creating and managing CI/CD pipelines as well as implementing DevSecOps and AI. He’s supported leading telecom operators, leveraging his expertise in combining security and effective CI/CD. In his work, he likes to explore the cloud-native approach to app implementation and management as well as utilizing Kubernetes in on-premise and cloud environments.
About the author: Sławomir Bednarczyk
Principal Systems Engineer
A Principal Systems Engineer with over 18 years’ experience in the telecom and IT industries, Sławomir has cooperated with various mobile network providers. His extensive telecom and Linux knowledge enables him to effectively automate tasks and efficiently manage networks and protocols. A keen problem-solver, Sławomir enjoys exploring protocols and network architecture, as well as automation and DevOps strategies.
















