Scan Details

Project Name	Scan Timestamp	Agentic Framework
openai-hackathon-2025-main	05/16/25 13:20:56	openai-agents

Dependency Check

Agentic Workflow Graph

[Figure: agentic workflow graph. Legend: node types are Agent and Tool; tool categories are CustomTool, Basic, and MCP Server.]

Findings

Vulnerabilities

Agents

Tools

Nodes Overview

Agents

Agent Name	LLM Model	System Prompt
Place Finder	gpt-4.1	You are an agent that searches for meetup spots. Using web.search, find the 10 best suggestions in the provided location, tailored to the user's interests, meetup type, budget, and time of day. For each place, provide its name and a one‑sentence description. Consider local reviews and ratings.
Meetup Idea Generator	o3	You are a creative dating advisor. Based on location, date, time, companion (name, age, gender), interests, weather, meetup type, budget, and a list of places – propose an original meet idea. Include the order of activities, a plan B for bad weather, and adapt the proposal to the time of day. The description should be short, specific, and realistic.
Reservation Finder	gpt-4.1	You are an agent that finds available dates and makes reservation. Using web.search, find tickes or first available date, tailored to the user's interests, meetup type, budget, and time of day. Now we are asking for
Reservation Finder	gpt-4.1

Tools

Tool Name	Tool Category	Tool Description	Number of Vulnerabilities
WebSearchTool	web_search	A hosted tool that lets the LLM search the web. Currently only supported with OpenAI models, using the Responses API.	2
WebSearchTool	web_search	A hosted tool that lets the LLM search the web. Currently only supported with OpenAI models, using the Responses API.	2
WebSearchTool	web_search	A hosted tool that lets the LLM search the web. Currently only supported with OpenAI models, using the Responses API.	2

Tool Vulnerabilities

WebSearchTool

Vulnerability

Indirect Prompt Injection

Description

Attackers can poison search results (SEO poisoning) or craft pages so that their snippets contain malicious instructions. For instance, hidden text in a webpage that ranks in results could manipulate the agent’s summary or follow-up actions.

Security Framework Mapping

OWASP LLM Top 10:
LLM01 - Prompt Injection

OWASP Agentic:
T6 - Intent Breaking & Goal Manipulation

Remediation Steps

• Enable URL whitelisting
• Implement guardrails filtering for prompt injection
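
The two remediation steps above could be approximated as follows. This is a minimal sketch, not code from the scanned project: it assumes the application can inspect search results as dictionaries with `url` and `snippet` keys before they reach the agent, and `ALLOWED_DOMAINS`, `INJECTION_PATTERNS`, and all function names are hypothetical. A keyword heuristic like this catches only crude injections; a production guardrail would likely rely on a dedicated classifier.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the domains the project actually trusts.
ALLOWED_DOMAINS = {"example.com", "maps.example.org"}

# Illustrative phrases only; not an exhaustive injection detector.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def is_allowed_url(url: str) -> bool:
    """Return True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def looks_like_injection(text: str) -> bool:
    """Return True if the text matches any of the illustrative injection phrases."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def filter_results(results: list[dict]) -> list[dict]:
    """Keep only results from allowlisted hosts whose snippets pass the heuristic."""
    return [
        r for r in results
        if is_allowed_url(r["url"]) and not looks_like_injection(r["snippet"])
    ]
```

Matching on the parsed hostname rather than on a substring of the URL avoids accepting look-alike hosts such as `example.com.evil.net`.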

Vulnerability

Misinformation

Description

The agent might unknowingly incorporate malicious snippets into its reasoning, leading to harmful output (e.g., biased or false information, or even code if the snippet is crafted as such).

Security Framework Mapping

OWASP LLM Top 10:
LLM09 - Misinformation

OWASP Agentic:
T1 - Memory Poisoning

Remediation Steps

• Implement guardrails to filter out malicious snippets
• Implement data sanitization to prevent user data from entering the tool
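
The data sanitization step above might look like the sketch below, which redacts obvious user data from a query before it is passed to a search tool. The function name, placeholder tokens, and regular expressions are illustrative assumptions; they cover only e-mail addresses and phone-like digit runs, not names or other identifiers.

```python
import re

# Deliberately simple patterns; assumptions for illustration, not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_user_data(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

For example, `redact_user_data("Book for anna@example.com, call +48 123 456 789")` returns `"Book for [EMAIL], call [PHONE]"`, while a query without such data passes through unchanged.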

WebSearchTool

Vulnerability

Indirect Prompt Injection

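Description

Attackers can poison search results (SEO poisoning) or craft pages so that their snippets contain malicious instructions. For instance, hidden text in a webpage that ranks in results could manipulate the agent’s summary or follow-up actions.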

Security Framework Mapping

OWASP LLM Top 10:
LLM01 - Prompt Injection

OWASP Agentic:
T6 - Intent Breaking & Goal Manipulation

Remediation Steps

• Enable URL whitelisting
• Implement guardrails filtering for prompt injection

Vulnerability

Misinformation

Description

The agent might unknowingly incorporate malicious snippets into its reasoning, leading to harmful output (e.g., biased or false information, or even code if the snippet is crafted as such).

Security Framework Mapping

OWASP LLM Top 10:
LLM09 - Misinformation

OWASP Agentic:
T1 - Memory Poisoning

Remediation Steps

• Implement guardrails to filter out malicious snippets
• Implement data sanitization to prevent user data from entering the tool

WebSearchTool

Vulnerability

Indirect Prompt Injection

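Description

Attackers can poison search results (SEO poisoning) or craft pages so that their snippets contain malicious instructions. For instance, hidden text in a webpage that ranks in results could manipulate the agent’s summary or follow-up actions.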

Security Framework Mapping

OWASP LLM Top 10:
LLM01 - Prompt Injection

OWASP Agentic:
T6 - Intent Breaking & Goal Manipulation

Remediation Steps

• Enable URL whitelisting
• Implement guardrails filtering for prompt injection

Vulnerability

Misinformation

Description

The agent might unknowingly incorporate malicious snippets into its reasoning, leading to harmful output (e.g., biased or false information, or even code if the snippet is crafted as such).

Security Framework Mapping

OWASP LLM Top 10:
LLM09 - Misinformation

OWASP Agentic:
T1 - Memory Poisoning

Remediation Steps

• Implement guardrails to filter out malicious snippets
• Implement data sanitization to prevent user data from entering the tool

Agent Vulnerability Mitigations

Agent Name	Vulnerability	Mitigation Level*	Explanation
Place Finder	Input Length Limit	None	There are no guardrails in place to mitigate this vulnerability.
	Personally Identifiable Information (PII) Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions related to handling or preventing the leakage of personally identifiable information.
	Harmful/Toxic/Profane Content	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent or mitigate harmful, toxic, or profane content.
	Jailbreak	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions to prevent the AI from deviating from its intended behaviour or resisting modification of its behaviour.
	Intentional Misuse	None	There are no guardrails in place to mitigate this vulnerability. The instructions limit the AI to searching for meetup spots based on user interests, type, budget, and time, which inherently focuses the task, but there are no instructions to address any misuse beyond this scope.
	System Prompt Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions that indicate preventing or avoiding system prompt leakage.
Meetup Idea Generator	Input Length Limit	None	There are no guardrails in place to mitigate this vulnerability.
	Personally Identifiable Information (PII) Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions addressing the handling or protection of personally identifiable information like the user's companion's name, age, or gender.
	Harmful/Toxic/Profane Content	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions to prevent the inclusion of harmful, toxic, or profane content in the conversation.
	Jailbreak	None	There are no guardrails in place to mitigate this vulnerability. The instructions do not include any guidance or measures to prevent the AI from being manipulated into behaving against its intended function.
	Intentional Misuse	None	There are no guardrails in place to mitigate this vulnerability. The instructions focus solely on providing creative dating advice, without any directives to handle requests that fall outside this scope.
	System Prompt Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions to prevent the system prompt from being disclosed to the user.
Reservation Finder	Input Length Limit	None	There are no guardrails in place to mitigate this vulnerability.
	Personally Identifiable Information (PII) Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent leakage of PII.
	Harmful/Toxic/Profane Content	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent harmful, toxic, or profane content.
	Jailbreak	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in the prompt to prevent or handle jailbreak attempts.
	Intentional Misuse	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions preventing the AI from being used outside its intended purpose of finding available dates and making reservations.
	System Prompt Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent system prompt leakage.
Reservation Finder	Input Length Limit	None	There are no guardrails in place to mitigate this vulnerability.
	Personally Identifiable Information (PII) Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
	Harmful/Toxic/Profane Content	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
	Jailbreak	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
	Intentional Misuse	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.
	System Prompt Leakage	None	There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to mitigate this vulnerability.

*The "Mitigation Level" column shows to what extent a vulnerability is mitigated. "Full" indicates that both a system prompt instruction and a guardrail are in place. "Partial" indicates that one of the two is in place. "None" indicates that neither is in place. (This applies to all vulnerabilities except "Input Length Limit", for which only the guardrail is taken into account.)
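
The rule in the footnote can be expressed as a small function. This is a sketch of the stated rule, not the scanner's implementation; the function name and signature are hypothetical. The footnote does not say which level "Input Length Limit" receives when a guardrail is present, so the sketch assumes "Full".

```python
def mitigation_level(vulnerability: str, has_instruction: bool, has_guardrail: bool) -> str:
    """Map the presence of a prompt instruction and a guardrail to a mitigation level."""
    if vulnerability == "Input Length Limit":
        # Only the guardrail counts here. Assumption: a guardrail alone yields "Full".
        return "Full" if has_guardrail else "None"
    in_place = int(has_instruction) + int(has_guardrail)
    return {0: "None", 1: "Partial", 2: "Full"}[in_place]
```

Every row in the table above corresponds to `has_instruction=False, has_guardrail=False`, which yields "None".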

Agent Vulnerability Explanations

Agent Vulnerability	Framework Mapping	Description
Input Length Limit	OWASP LLM Top 10: LLM01 - Prompt Injection; LLM10 - Unbounded Consumption. OWASP Agentic: T2 - Tool Misuse; T4 - Resource Overload; T6 - Intent Breaking & Goal Manipulation; T7 - Misaligned & Deceptive Behaviors.	An attacker can overwhelm the LLM's context with a very long message, causing it to ignore previous instructions or take undesired actions. Mitigation: add a guardrail that rejects user messages longer than a maximum number of characters (200-500 suffices in most cases).
Personally Identifiable Information (PII) Leakage	OWASP LLM Top 10: LLM02 - Sensitive Information Disclosure; LLM05 - Improper Output Handling. OWASP Agentic: T7 - Misaligned & Deceptive Behaviors; T9 - Identity Spoofing & Impersonation; T15 - Human Manipulation.	An attacker can manipulate the LLM into exfiltrating PII or into asking users to disclose PII. Mitigation: (1) add a guardrail that checks user and agent messages for PII and anonymizes or flags it; (2) include agent instructions stating clearly that the agent should not handle PII.
Harmful/Toxic/Profane Content	OWASP LLM Top 10: LLM05 - Improper Output Handling. OWASP Agentic: T7 - Misaligned & Deceptive Behaviors; T11 - Unexpected RCE and Code Attacks.	An attacker can use the LLM to generate harmful, toxic, or profane content, or to engage in conversations about such topics. Mitigation: (1) add a guardrail that checks user and agent messages for harmful, toxic, and profane content; (2) include agent instructions that prohibit the agent from creating such content or engaging in conversation about it.
Jailbreak	OWASP LLM Top 10: LLM01 - Prompt Injection; LLM02 - Sensitive Information Disclosure; LLM05 - Improper Output Handling; LLM09 - Misinformation; LLM10 - Unbounded Consumption. OWASP Agentic: T1 - Memory Poisoning; T2 - Tool Misuse; T3 - Privilege Compromise; T4 - Resource Overload; T6 - Intent Breaking & Goal Manipulation; T7 - Misaligned & Deceptive Behaviors; T9 - Identity Spoofing & Impersonation; T11 - Unexpected RCE and Code Attacks; T13 - Rogue Agents in Multi-Agent Systems; T15 - Human Manipulation.	An attacker can craft messages that make the LLM disregard all previous instructions so that it can be used for any task the attacker wants. Mitigation: (1) add a guardrail that checks user messages for attempts to circumvent the LLM's instructions; (2) include agent instructions stating that the agent must not alter its instructions and must ignore user messages that try to convince it otherwise.
Intentional Misuse	OWASP LLM Top 10: LLM01 - Prompt Injection; LLM10 - Unbounded Consumption. OWASP Agentic: T2 - Tool Misuse; T4 - Resource Overload; T6 - Intent Breaking & Goal Manipulation.	An attacker can use the LLM instance for tasks other than its intended usage, either to drain resources or for personal gain. Mitigation: (1) add a guardrail that checks user messages for tasks outside the agent's intended usage; (2) include agent instructions that prohibit the agent from engaging in any task outside its intended usage.
System Prompt Leakage	OWASP LLM Top 10: LLM01 - Prompt Injection; LLM02 - Sensitive Information Disclosure; LLM07 - System Prompt Leakage. OWASP Agentic: T2 - Tool Misuse; T3 - Privilege Compromise; T6 - Intent Breaking & Goal Manipulation; T7 - Misaligned & Deceptive Behaviors.	An attacker can make the LLM reveal its system prompt/instructions in order to leak sensitive business logic or to craft further attacks tailored to this LLM. Mitigation: (1) add a guardrail that checks agent messages for the exact text of the agent's system prompt; (2) include agent instructions stating that the system prompt/instructions are confidential and must not be shared.
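
Two of the guardrail checks described in the table, the input length limit and the exact-text system prompt check, are simple enough to sketch in plain Python. The names `MAX_INPUT_CHARS`, `exceeds_input_limit`, and `leaks_system_prompt` are hypothetical, and wiring these checks into the framework's guardrail mechanism is not shown. The whitespace and case normalization is an added assumption, intended to catch trivially reformatted copies of the prompt.

```python
MAX_INPUT_CHARS = 500  # upper end of the 200-500 range suggested in the table


def exceeds_input_limit(message: str, limit: int = MAX_INPUT_CHARS) -> bool:
    """Return True if the user message is longer than the allowed number of characters."""
    return len(message) > limit


def _normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase the text so reformatted copies still match."""
    return " ".join(text.split()).lower()


def leaks_system_prompt(agent_message: str, system_prompt: str) -> bool:
    """Return True if the agent message contains the full system prompt text."""
    prompt = _normalize(system_prompt)
    if not prompt:
        # An empty system prompt cannot be leaked; avoid matching every message.
        return False
    return prompt in _normalize(agent_message)
```

A check on the exact text does not detect paraphrased or partial disclosure, so it complements rather than replaces the confidentiality instruction in the system prompt.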