Threat modeling has always been about understanding how the components of an application interact, where the boundaries lie, and what could go wrong at each connection. Traditionally, this process has been manual, relying on diagrams and the expertise of security professionals to map out relationships and identify risks. But as modern applications grow in complexity, so do the relationships between their components. Keeping track of every data flow, trust boundary, and potential attack surface quickly becomes overwhelming.
This is where AI, and particularly LLMs, fundamentally change the game. By leveraging LLMs, we can now parse complex application descriptions, codebases, and architecture diagrams with a level of depth and speed that was previously unattainable. Instead of painstakingly mapping out each component and connection by hand, AI-powered tools like Arrows analyze the system holistically.
LLMs excel at extracting and synthesizing information from diverse sources, be it documentation, code, or even informal descriptions. They can identify how components communicate, where sensitive data flows, and which trust boundaries are present. This enhanced understanding is not just about automation; it’s about seeing the bigger picture. AI enables us to visualize the architecture as a dynamic network, making it easier to spot weak links and potential attack paths that emerge from the interplay between components.
The result is a more comprehensive and nuanced threat model. With AI-driven analysis, security teams can move beyond static diagrams to interactive visualizations that reflect the true complexity of modern systems. This not only accelerates the threat modeling process but also improves its accuracy, ensuring that critical relationships between components are understood and potential threats are identified before they can be exploited.
Arrows is an AI Agent for threat modeling that I developed as a side project with Fadi Arab, as part of the SANS AI Cybersecurity Hackathon. Its goal is to map out the threats of a web application, producing a final report with an interactive diagram.
Arrows offers two ways to threat model a system: from a technical text description, or directly from the codebase for a whitebox analysis. Either mode can be run with the LLM of your choice. We settled on three options: Claude 3.7 for the best balance of speed and quality (the most expensive), GPT-3.5 Turbo (the fastest), and DeepSeek R1 (the cheapest but slowest).
Arrows lets you quickly get a sense of the threats facing your software from a text description that includes technical details such as data flows, technologies used, and trust boundaries. It then follows the STRIDE methodology to produce a threat model with an interactive diagram.
For example, let’s prompt Arrows with this description:
“The web application consists of a React frontend, a Node.js backend with a RESTful API, a PostgreSQL database for structured data, and an S3-compatible file storage system for user uploads, with Redis used for caching. Data flows from the client to the backend via HTTPS, where it is processed, stored, or forwarded to integrated third-party APIs (e.g., payment gateways or analytics services), all secured with API keys and mutual TLS. Trust boundaries are defined by role-based access control (RBAC), with user roles (admin, editor, viewer) restricting access to sensitive data, while JWT tokens handle authentication and session management. The system processes personally identifiable information (PII) and financial data, encrypted at rest and in transit, with OAuth 2.0 for third-party integrations and multi-factor authentication (MFA) for enhanced security.”
We get the following threat model (a few elements are not shown in the screenshot), with components, data flows, assets, and trust boundaries. Arrows also prints a summary of the possible threats on the right side, such as data tampering via SQL injection and identity spoofing via JWT.
Perhaps the most compelling feature of Arrows is its ability to perform whitebox analysis, mapping out threats directly from your web application’s codebase. Unlike traditional approaches that rely on high-level diagrams or manual documentation, Arrows dives deep into the source code, leveraging the power of LLMs to systematically uncover the architecture and potential vulnerabilities of your application.
The process begins with component identification. Arrows starts by scanning the codebase for a predefined set of common web application elements: entry points, routes, controllers, models, middleware, authentication systems, databases, APIs, and configuration files. This initial sweep is language-agnostic, prioritizing files written in widely used languages such as JavaScript, TypeScript, Python, PHP, Ruby, Go, and Java, as well as infrastructure and configuration files. By focusing on these high-value targets, Arrows ensures that the analysis is both thorough and efficient.
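To make the selection step concrete, here is a minimal Python sketch of how such a sweep could work. It is not the Arrows implementation: the extension set, the configuration file names, the keyword list, and the function names `is_candidate`, `priority`, and `select_files` are all assumptions chosen for illustration. Only the cap of 30 files per run comes from the workflow described in this article.

```python
from pathlib import PurePosixPath

# Assumed extensions for the languages named in the article.
SOURCE_EXTENSIONS = {".js", ".ts", ".py", ".php", ".rb", ".go", ".java"}
# Assumed examples of infrastructure and configuration files.
CONFIG_NAMES = {"dockerfile", "docker-compose.yml", "package.json", "requirements.txt"}
# Assumed path keywords hinting at high-value components.
PRIORITY_KEYWORDS = ("route", "controller", "model", "middleware", "auth", "api", "config", "db")


def is_candidate(path: str) -> bool:
    """Return True if the path looks like source code or a known config file."""
    p = PurePosixPath(path.lower())
    return p.suffix in SOURCE_EXTENSIONS or p.name in CONFIG_NAMES


def priority(path: str) -> int:
    """Count how many priority keywords appear anywhere in the path."""
    lowered = path.lower()
    return sum(1 for keyword in PRIORITY_KEYWORDS if keyword in lowered)


def select_files(paths: list[str], limit: int = 30) -> list[str]:
    """Keep candidate files, highest keyword count first, capped at `limit`."""
    candidates = [p for p in paths if is_candidate(p)]
    # sorted() is stable, so files with equal scores keep their input order.
    return sorted(candidates, key=lambda p: -priority(p))[:limit]
```

A keyword count is a crude heuristic; its only purpose here is to show how a fixed file budget can be spent on the files most likely to describe routes, authentication, and data access.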
Each file is then subjected to LLM-powered scrutiny. For every eligible file, whether it’s a controller in Python or a middleware in Node.js, the system uses expert-crafted prompts to instruct the LLM to act as a security analyst. The model identifies the purpose of each component and provides concise, context-aware descriptions of their implementation. Large files are intelligently chunked to stay within token limits, ensuring that even sprawling codebases can be processed without losing critical context.
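Chunking can be done in many ways, and the article does not specify which one Arrows uses. The sketch below shows one simple approach under stated assumptions: a character budget stands in for a token budget, chunks are cut on line boundaries, and a few trailing lines are repeated at the start of the next chunk so that context is not lost at the cut. The function name `chunk_text` and its default values are hypothetical.

```python
def chunk_text(text: str, max_chars: int = 8000, overlap_lines: int = 5) -> list[str]:
    """Split text into line-aligned chunks of roughly `max_chars` characters.

    The last `overlap_lines` lines of a chunk are repeated at the start of
    the next one. A chunk can exceed `max_chars` when a single line or the
    carried-over overlap is itself larger than the budget.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            # Carry the tail of the finished chunk into the next one.
            current = current[-overlap_lines:] if overlap_lines else []
            size = sum(len(kept) for kept in current)
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

A real pipeline would more likely measure length with the tokenizer of the chosen model, and might prefer to cut at function or class boundaries rather than at arbitrary lines.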
Once all relevant files have been analyzed, Arrows synthesizes the findings into a comprehensive textual summary, grouping components by type and detailing their relationships. This summary is then fed into the system analyzer, which extracts a high-level architectural model: mapping out components, data flows, trust boundaries, and key assets. This model forms the backbone for the next phase, automated STRIDE threat analysis.
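One way to picture the architectural model is as a small set of typed records. The Python sketch below is an illustration only, not the schema Arrows uses: the class names, the `trust_zone` field, and the zone labels are assumptions, and the protocols other than HTTPS are placeholders. It reuses a few components from the example description earlier in the article and shows how data flows that cross a trust boundary can be derived from such a model.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    kind: str        # e.g. "frontend", "backend", "database"
    trust_zone: str  # assumed label for the zone the component lives in


@dataclass
class DataFlow:
    source: str
    target: str
    protocol: str
    assets: list[str] = field(default_factory=list)  # data carried by the flow


@dataclass
class SystemModel:
    components: dict[str, Component] = field(default_factory=dict)
    flows: list[DataFlow] = field(default_factory=list)

    def add_component(self, component: Component) -> None:
        self.components[component.name] = component

    def boundary_crossings(self) -> list[DataFlow]:
        """Return the flows whose endpoints sit in different trust zones."""
        return [
            flow for flow in self.flows
            if self.components[flow.source].trust_zone
            != self.components[flow.target].trust_zone
        ]


def example_model() -> SystemModel:
    """Build a small model loosely based on the example description."""
    model = SystemModel()
    model.add_component(Component("React frontend", "frontend", "client"))
    model.add_component(Component("Node.js backend", "backend", "application"))
    model.add_component(Component("PostgreSQL", "database", "data"))
    model.add_component(Component("Redis", "cache", "application"))
    model.flows.append(DataFlow("React frontend", "Node.js backend", "HTTPS", ["PII", "financial data"]))
    model.flows.append(DataFlow("Node.js backend", "PostgreSQL", "TCP", ["PII", "financial data"]))
    model.flows.append(DataFlow("Node.js backend", "Redis", "TCP"))
    return model
```

Calling `example_model().boundary_crossings()` returns the frontend-to-backend and backend-to-database flows, which are the connections a STRIDE pass would typically examine first.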
In the STRIDE phase, Arrows deploys specialized analyzers for each threat category: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege. Each analyzer uses tailored prompts to guide the LLM in identifying vulnerabilities, assessing risk levels, and suggesting actionable mitigations. The result is a detailed, context-rich threat model that not only highlights security gaps but also provides practical recommendations for remediation.
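The per-category structure can be sketched as a table of focus questions and a prompt builder. This is a hypothetical illustration: the prompt wording, the `STRIDE_FOCUS` table, and the function names are assumptions, not the prompts shipped with Arrows. No LLM is called here; the sketch only shows how one architecture summary could fan out into six category-specific prompts.

```python
# Each STRIDE category paired with the security property it threatens.
STRIDE_FOCUS = {
    "Spoofing": "authenticity: can an attacker pretend to be another user or component?",
    "Tampering": "integrity: can data or code be modified without authorization?",
    "Repudiation": "non-repudiation: can an actor deny an action because it is not logged?",
    "Information Disclosure": "confidentiality: can data reach parties not entitled to see it?",
    "Denial of Service": "availability: can the system be made unavailable or degraded?",
    "Elevation of Privilege": "authorization: can an actor gain rights it should not have?",
}


def build_prompt(category: str, system_summary: str) -> str:
    """Build an illustrative analyst prompt for one STRIDE category."""
    focus = STRIDE_FOCUS[category]
    return (
        "You are a security analyst performing STRIDE threat modeling.\n"
        f"Category: {category} ({focus})\n"
        "For each threat, give the affected component, a risk level "
        "(low, medium, high), and a concrete mitigation.\n\n"
        f"System summary:\n{system_summary}"
    )


def build_all_prompts(system_summary: str) -> dict[str, str]:
    """Return one prompt per STRIDE category, in STRIDE order."""
    return {category: build_prompt(category, system_summary) for category in STRIDE_FOCUS}
```

Keeping one prompt per category lets each request stay narrow, at the cost of six model calls per analysis instead of one.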
All of this is seamlessly integrated into the Arrows workflow. Users simply upload their codebase (as a ZIP file) through a dedicated endpoint, and the analysis runs in the background—extracting architecture, running STRIDE, and generating interactive visualizations and detailed threat reports. To keep the process efficient and manageable, the system analyzes up to 30 files per run and employs chunking strategies for large files.
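Handling an uploaded archive deserves some care, since its contents are untrusted. The sketch below is an assumption about how the intake step could look, not the Arrows endpoint itself: the function name `read_archive`, the per-file size limit, and the path checks are hypothetical. It uses Python's standard `zipfile` module, reads entries in memory rather than extracting them to disk, and applies the 30-file cap mentioned above.

```python
import io
import zipfile
from pathlib import PurePosixPath

MAX_FILES = 30  # per-run cap mentioned in the article


def read_archive(data: bytes, max_files: int = MAX_FILES, max_bytes: int = 1_000_000) -> dict[str, str]:
    """Read up to `max_files` text files from a ZIP archive held in memory."""
    files: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if len(files) >= max_files:
                break
            path = PurePosixPath(info.filename)
            # Skip directories and entries with suspicious paths.
            if info.is_dir() or path.is_absolute() or ".." in path.parts:
                continue
            # Skip entries whose declared uncompressed size is too large (assumed limit).
            if info.file_size > max_bytes:
                continue
            files[info.filename] = archive.read(info).decode("utf-8", errors="replace")
    return files
```

In this sketch the cap is applied in archive order; a fuller version would rank the entries by relevance before cutting the list down to 30.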
The whitebox analysis capability of Arrows is designed for real-world use: language-agnostic, framework-flexible, and scalable. By automating the extraction of architectural models and threat identification directly from source code, Arrows empowers security teams to move beyond static diagrams and manual reviews, making threat modeling both practical and actionable for modern web applications.
Let’s try the whitebox analysis on a classic web application page that I zipped into app.zip:
This ultimately gives us a more precise threat model. Having the actual code enables a deeper, more detailed analysis of potential threats, whereas a textual description alone yields a more general, less specific model.
Arrows proves that combining LLMs with practical workflows can turn threat modeling from a static, manual process into a dynamic and continuous one. By supporting both text-based and code-based analysis, it adapts to different stages of development, whether you’re drafting an architecture or auditing an existing system. The whitebox mode, in particular, shows how direct code analysis leads to richer, more actionable threat models by exposing implementation-level flaws that diagrams alone can’t reveal. With this approach, threat modeling becomes not only faster but significantly more aligned with the realities of modern web development.
Want to check it out? Arrows is on GitHub: https://github.com/yacwagh/arrows
Yacine Souam / @yacwagh
Founded in 2021 and headquartered in Paris, FuzzingLabs is a cybersecurity startup specializing in vulnerability research, fuzzing, and blockchain security. We combine cutting-edge research with hands-on expertise to secure some of the most critical components in the blockchain ecosystem.