Scan Details

Project Name
Scan Timestamp
Agentic Framework
openai-hackathon-wait-main
05/16/25 13:21:30
openai-agents
Dependency Check

Agentic Workflow Graph
Legend
Agent
Tool
Tool Category
CustomTool
Basic
MCP Server
Findings
Vulnerabilities
0
Agents
3
Tools
9
Nodes Overview

Agents
Agent Name LLM Model System Prompt
Triage agent gpt-4.1 You are a helpful assistant that finds the most relevant papers from Arxiv or PubMed. Decide which one is more relevant and return the results based on the proviede article text.
Figure Context Extractor gpt-4o-mini
ReviewerAssistantAgent gpt-4o-mini
Tools
Tool Name Tool Category Tool Description Number of Vulnerabilities
arxiv_search default Searches the arXiv database for scientific papers based on keywords extracted from the article text. This tool analyzes the article text to extract relevant search terms, expands them with related concepts, and queries arXiv.org to find academic papers matching these terms. Results include metadata such as title, authors, abstract, and URLs to access the papers. Args: article_text: The text or query to find relevant scientific papers for max_results: Number of results to return (default: 25, max:30) sort_by: How to sort results - options: 'relevance', 'lastUpdatedDate', 'submittedDate' Returns: Dictionary containing: - 'papers': List of paper metadata dictionaries with URLs to abstracts and PDFs - 'error': Error message if search failed (only present if there was an error) 0
pubmed_tool default Search PubMed for biomedical literature and retrieve article summaries. Args: query: The search query for PubMed. Be specific to get relevant re prntaining the summaries of the top articles matching the query. 0
extract_paper_metadata default Extracts the title and abstract from the paper content to provide context for figure analysis. Args: wrapper: Context wrapper containing paper content. Returns: str: Confirmation of metadata extraction. 0
analyze_figure default Analyzes a single figure from a paper. Args: wrapper: Context wrapper containing analysis parameters. figure_path: Path or key to the figure to analyze. Returns: str: Analysis of the figure. 0
extract_data_from_chart default Extracts numerical data points from charts and graphs. Args: wrapper: Context wrapper containing the figures. figure_path: Path or key to the figure to analyze. Returns: str: Extracted data points in tabular format. 0
detect_figure_relationships default Detects relationships between figures and how they complement each other. Args: wrapper: Context wrapper containing analyzed figures. Returns: str: Analysis of figure relationships. 0
generate_figures_summary default Generates a comprehensive summary of all analyzed figures and their relevance to the paper. Args: wrapper: Context wrapper containing analysis results. Returns: str: Comprehensive summary of figures. 0
convert_figures_to_text default Converts visual figures into textual descriptions for accessibility. Args: wrapper: Context wrapper containing the figures. Returns: str: Accessible text descriptions of all figures. 0
query_rag default Retrieve an answer to a query using a provided RAG (Retriever-Augmented Generation) system. This function extracts the 'query' and 'rag' from the provided `params` context, and utilizes the RAG system to generate a response to the query. Args: params (ReviewerAssistantContext): A context object containing the following keys: - 'query' (str): The question or prompt to be answered. Returns: str: The answer generated by the RAG system in response to the query. 0
Agent Vulnerability Mitigations

Agent Name Vulnerability Mitigation Level* Explanation
Triage agent Input Length Limit None There are no guardrails in place to mitigate this vulnerability.
Personally Identifiable Information (PII) Leakage None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate PII leakage.
Harmful/Toxic/Profane Content None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate the exchange of harmful, toxic, or profane content.
Jailbreak None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent jailbreak attempts.
Intentional Misuse None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate intentional misuse.
System Prompt Leakage None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent system prompt leakage.
Figure Context Extractor Input Length Limit None There are no guardrails provided to mitigate this vulnerability.
Personally Identifiable Information (PII) Leakage None There are no guardrails provided to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Harmful/Toxic/Profane Content None There are no guardrails provided to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Jailbreak None There are no guardrails provided to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Intentional Misuse None There are no guardrails provided to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
System Prompt Leakage None There are no guardrails provided to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
ReviewerAssistantAgent Input Length Limit None There are no guardrails in place to mitigate this vulnerability.
Personally Identifiable Information (PII) Leakage None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Harmful/Toxic/Profane Content None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Jailbreak None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
Intentional Misuse None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
System Prompt Leakage None There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
*The "Mitigation Level" column shows to what extent a vulnerability is mitigated. "Full" indicates that both a system prompt instruction and a guardrail are in place. "Partial" indicates that one of the two is in place. "None" indicates that neither one is in place. (This applies to all vulnerabilities except for the "Input Length Limit", in which case only the guardrail is taken into account).
Agent Vulnerability Explanations

Agent Vulnerability Framework Mapping Description
Input Length Limit
OWASP LLM Top 10:
LLM01 - Prompt Injection LLM10 - Unbounded Consumption
OWASP Agentic:
T2 - Tool Misuse T4 - Resource Overload T6 - Intent Breaking & Goal Manipulation T7 - Misaligned & Deceptive Behaviors
An attacker can overwhelm the LLM's context with a very long message and cause it to ignore previous instructions or produce undesired actions.
Mitigation:
- add a Guardrail that checks if the user message contains more than the maximum allowed number of characters (200-500 will suffice in most cases).
Personally Identifiable Information (PII) Leakage
OWASP LLM Top 10:
LLM02 - Sensitive Information Disclosure LLM05 - Improper Output Handling
OWASP Agentic:
T7 - Misaligned & Deceptive Behaviors T9 - Identity Spoofing & Impersonation T15 - Human Manipulation
An attacker can manipulate the LLM into exfiltrating PII, or requesting users to disclose PII.
Mitigation:
- add a Guardrail that checks user and agent messages for PII and anonymizes them or flags them
- include agent instructions that clearly state that it should not handle PII.
Harmful/Toxic/Profane Content
OWASP LLM Top 10:
LLM05 - Improper Output Handling
OWASP Agentic:
T7 - Misaligned & Deceptive Behaviors T11 - Unexpected RCE and Code Attacks
An attacker can use the LLM to generate harmful, toxic, or profane content, or engage in conversations about such topics.
Mitigation:
- add a Guardrail that checks user and agent messages for toxic, harmful, and profane content
- include agent instructions that prohibit the agent from engaging in conversation about, or creating, harmful, toxic, or profane content.
Jailbreak
OWASP LLM Top 10:
LLM01 - Prompt Injection LLM02 - Sensitive Information Disclosure LLM05 - Improper Output Handling LLM09 - Misinformation LLM10 - Unbounded Consumption
OWASP Agentic:
T1 - Memory Poisoning T2 - Tool Misuse T3 - Privilege Compromise T4 - Resource Overload T6 - Intent Breaking & Goal Manipulation T7 - Misaligned & Deceptive Behaviors T9 - Identity Spoofing & Impersonation T11 - Unexpected RCE and Code Attacks T13 - Rogue Agents in Multi-Agent Systems T15 - Human Manipulation
An attacker can try to craft their messages in a way that makes the LLM forget all previous instructions and be used for any task the attacker wants.
Mitigation:
- add a Guardrail that checks user messages for attempts at circumventing the LLM's instructions
- include agent instructions that state that the agent should not alter its instructions, and ignore user messages that try to convince it otherwise.
Intentional Misuse
OWASP LLM Top 10:
LLM01 - Prompt Injection LLM10 - Unbounded Consumption
OWASP Agentic:
T2 - Tool Misuse T4 - Resource Overload T6 - Intent Breaking & Goal Manipulation
An attacker can try to use the instance of the LLM for tasks other than the LLM's intended usage to drain resources or for personal gain.
Mitigation:
- add a Guardrail that checks user messages for tasks that are not the agent's intended usage
- include agent instructions that prohibit the agent from engaging in any tasks that are not its intended usage
System Prompt Leakage
OWASP LLM Top 10:
LLM01 - Prompt Injection LLM02 - Sensitive Information Disclosure LLM07 - System Prompt Leakage
OWASP Agentic:
T2 - Tool Misuse T3 - Privilege Compromise T6 - Intent Breaking & Goal Manipulation T7 - Misaligned & Deceptive Behaviors
An attacker can make the LLM reveal the system prompt/instructions so that he can leak sensitive business logic or craft other attacks that are better suited for this LLM.
Mitigation:
- add a Guardrail that checks agent messages for the exact text of the agent's system prompt
- include agent instructions that highlight that the system prompt/instructions are confidential and should not be shared.