Scan Details
Project Name: openai-warsaw-2025-drone-main
Scan Timestamp: 05/16/25 13:21:52
Agentic Framework: openai-agents
Dependency Check
Agentic Workflow Graph
Legend: Agent, Tool, Tool Category (CustomTool, Basic), MCP Server. [Graph rendering omitted in this export.]
Findings
Vulnerabilities: 0
Agents: 4
Tools: 2
Nodes Overview
Agents
Agent Name | LLM Model | System Prompt |
---|---|---|
Investigation Agent | gpt-4o | You are an investigation drone operator. Your task is to investigate objects detected in images. Your job is to: 1. Extract interesting observation coordinates from the image description 2. Choose one that is the most important 3. Get closer to the most important one by calling `move_to_image_coordinates` function if you are far away 4. If you are close enough then change state by calling `change_state`: - "PATROL" if observation turns out to be uninteresting - "EMERGENCY_HANDLING" if the observation is emergency - "NON_EMERGENCY_HANDLING" if the observation is not an emergency 1. examples non-emergency observations: - Damaged infrastructure (roofs, roads, buildings) - Smoke from non-emergency sources - Environmental issues 2. examples emergency observations: - Car crashes - Unconscious or injured people - Fires - Natural disasters - Suspicious activities |
Emergency Response Agent | gpt-4o | You are an emergency response drone operator. Your sole responsibility is to handle emergency situations. When you receive control: 1. Immediately assess the emergency situation 2. Call appropriate emergency services with location and severity 3. Maintain safe observation of the scene 4. Provide real-time updates to emergency services 5. Avoid interfering with emergency response efforts Emergency Response Protocol: 1. You can use voice_message function with a specified message to attract attention or give humans instructions and call_siren to play loud signal to attract attention 2. If the situation is not resolved, call emergency services with location and details 3. Then, observe and report on the situation 4. Maintain safe distance from the scene 5. Continue until emergency services arrive or situation is resolved |
Maintenance and Investigation Agent | gpt-4o | You are a maintenance and investigation drone operator. Your responsibilities are: 1. Handle non-emergency observations: - Damaged infrastructure (roofs, roads, buildings) - Smoke from non-emergency sources - Environmental issues - Other maintenance concerns 2. Investigation Protocol: - First, report the observation with details - Then, investigate the situation - you can use audio_message function to give instructions to nearby humans if necessary - Document findings - Recommend follow-up actions - if investigation discovered that the problem is an emergency situation then delegate control to the emergency agent 3. Types of observations to handle: - Damaged roofs or buildings - Smoke from garbage or controlled burns - Environmental issues (litter, pollution) - Infrastructure maintenance needs - Other non-emergency concerns |
Drone Operator Agent | gpt-4o | You are a drone operator responsible for monitoring and analyzing the environment. Your responsibilities are: 1. Analyze images from the drone's camera 2. If you see something unusual change state to INVESTIGATION by calling `change_state` function and hand off control to investigation agent |
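The system prompts above imply a small state machine coordinating the four agents. The sketch below is a hypothetical reconstruction: the scanned project's actual `change_state` implementation is not included in this report, and the `DroneState` enum and transition table are assumptions inferred from the prompts.

```python
from enum import Enum

class DroneState(Enum):
    PATROL = "PATROL"
    INVESTIGATION = "INVESTIGATION"
    EMERGENCY_HANDLING = "EMERGENCY_HANDLING"
    NON_EMERGENCY_HANDLING = "NON_EMERGENCY_HANDLING"

# Transitions inferred from the system prompts: the operator escalates to
# investigation; the investigator routes to patrol/emergency/non-emergency;
# the maintenance agent may delegate to the emergency agent.
ALLOWED_TRANSITIONS = {
    DroneState.PATROL: {DroneState.INVESTIGATION},
    DroneState.INVESTIGATION: {
        DroneState.PATROL,
        DroneState.EMERGENCY_HANDLING,
        DroneState.NON_EMERGENCY_HANDLING,
    },
    DroneState.EMERGENCY_HANDLING: {DroneState.PATROL},
    DroneState.NON_EMERGENCY_HANDLING: {DroneState.PATROL, DroneState.EMERGENCY_HANDLING},
}

def change_state(current: DroneState, target: DroneState) -> DroneState:
    """Validate and perform a state transition."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target
```

Constraining transitions this way (rather than letting the LLM set any state) limits the blast radius of a manipulated agent.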
Tools
Tool Name | Tool Category | Tool Description | Number of Vulnerabilities |
---|---|---|---|
audio_message | default | Play given message over the speakers as audio. | 0 |
audio_message | default | Play given message over the speakers as audio. | 0 |
Agent Vulnerability Mitigations
Agent Name | Vulnerability | Mitigation Level* | Explanation |
---|---|---|---|
Investigation Agent | Input Length Limit | None | There are no guardrails mentioned to mitigate this vulnerability. |
Investigation Agent | Personally Identifiable Information (PII) Leakage | None | There are no guardrails mentioned to mitigate this vulnerability. There are no instructions related to handling or preventing leakage of personally identifiable information. |
Investigation Agent | Harmful/Toxic/Profane Content | None | There are no guardrails mentioned to mitigate this vulnerability. The system prompt does not include any specific instructions addressing harmful, toxic, or profane content. |
Investigation Agent | Jailbreak | None | There are no guardrails mentioned to mitigate this vulnerability. There are no instructions to prevent the AI from being manipulated into acting against its programmed guidelines. |
Investigation Agent | Intentional Misuse | None | There are no guardrails mentioned to mitigate this vulnerability. The instructions do not cover aspects of preventing the AI from being used for unintended tasks. |
Investigation Agent | System Prompt Leakage | None | There are no guardrails mentioned to mitigate this vulnerability. There are no instructions specifically highlighting that the system prompt is confidential or should not be disclosed. |
Emergency Response Agent | Input Length Limit | None | There are no guardrails in place to mitigate this vulnerability. |
Emergency Response Agent | Personally Identifiable Information (PII) Leakage | None | There are no guardrails in place to mitigate this vulnerability. The instructions mention calling emergency services with location and severity, but there are no explicit instructions regarding the handling or protection of PII. |
Emergency Response Agent | Harmful/Toxic/Profane Content | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent the usage or dissemination of harmful, toxic, or profane content. |
Emergency Response Agent | Jailbreak | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions specifically in place to handle attempts to make the AI agent act outside of prescribed emergency response operations or instructions. |
Emergency Response Agent | Intentional Misuse | None | There are no guardrails in place to mitigate this vulnerability. The instructions specify roles related only to emergency response, but there are no explicit safeguards against intentional misuse outside of its intended purpose. |
Emergency Response Agent | System Prompt Leakage | None | There are no guardrails in place to mitigate this vulnerability. There are no explicit instructions preventing the leakage of system prompts or instructions. |
Maintenance and Investigation Agent | Input Length Limit | None | There are no guardrails in place to mitigate this vulnerability. |
Maintenance and Investigation Agent | Personally Identifiable Information (PII) Leakage | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place regarding the use or protection of PII. |
Maintenance and Investigation Agent | Harmful/Toxic/Profane Content | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent the AI from engaging with harmful, toxic, or profane content. |
Maintenance and Investigation Agent | Jailbreak | None | There are no guardrails in place to mitigate this vulnerability. The instructions do not explicitly address preventing the AI from operating outside of its defined responsibilities. |
Maintenance and Investigation Agent | Intentional Misuse | Partial | There are no guardrails in place, but the instructions define specific tasks and protocols for maintenance and investigation, so the system prompt partially mitigates misuse by constraining what the AI should handle. |
Maintenance and Investigation Agent | System Prompt Leakage | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions to prevent the extraction or exposure of system prompts or instructions. |
Drone Operator Agent | Input Length Limit | None | There are no guardrails in place to mitigate this vulnerability. |
Drone Operator Agent | Personally Identifiable Information (PII) Leakage | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent sharing or handling of PII. |
Drone Operator Agent | Harmful/Toxic/Profane Content | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent handling or sharing harmful, toxic, or profane content. |
Drone Operator Agent | Jailbreak | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent the user from attempting to make the AI act outside of its intended role. |
Drone Operator Agent | Intentional Misuse | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent using the system for tasks outside of image analysis and state change. |
Drone Operator Agent | System Prompt Leakage | None | There are no guardrails in place to mitigate this vulnerability. There are no instructions in place to prevent the system prompt from being leaked or extracted. |
*The "Mitigation Level" column shows to what extent a vulnerability is mitigated. "Full" indicates that both a system prompt instruction and a guardrail are in place.
"Partial" indicates that one of the two is in place. "None" indicates that neither is in place. (This applies to all vulnerabilities except "Input Length Limit", for which only the guardrail is taken into account.)
Agent Vulnerability Explanations
Agent Vulnerability | Framework Mapping | Description |
---|---|---|
Input Length Limit | OWASP LLM Top 10: LLM01 - Prompt Injection, LLM10 - Unbounded Consumption. OWASP Agentic: T2 - Tool Misuse, T4 - Resource Overload, T6 - Intent Breaking & Goal Manipulation, T7 - Misaligned & Deceptive Behaviors | An attacker can overwhelm the LLM's context with a very long message and cause it to ignore previous instructions or produce undesired actions. Mitigation: add a Guardrail that checks whether the user message contains more than the maximum allowed number of characters (200-500 will suffice in most cases). |
Personally Identifiable Information (PII) Leakage | OWASP LLM Top 10: LLM02 - Sensitive Information Disclosure, LLM05 - Improper Output Handling. OWASP Agentic: T7 - Misaligned & Deceptive Behaviors, T9 - Identity Spoofing & Impersonation, T15 - Human Manipulation | An attacker can manipulate the LLM into exfiltrating PII, or into requesting that users disclose PII. Mitigation: add a Guardrail that checks user and agent messages for PII and anonymizes or flags them, and include agent instructions that clearly state that the agent should not handle PII. |
Harmful/Toxic/Profane Content | OWASP LLM Top 10: LLM05 - Improper Output Handling. OWASP Agentic: T7 - Misaligned & Deceptive Behaviors, T11 - Unexpected RCE and Code Attacks | An attacker can use the LLM to generate harmful, toxic, or profane content, or engage it in conversations about such topics. Mitigation: add a Guardrail that checks user and agent messages for toxic, harmful, and profane content, and include agent instructions that prohibit the agent from engaging in conversation about, or creating, such content. |
Jailbreak | OWASP LLM Top 10: LLM01 - Prompt Injection, LLM02 - Sensitive Information Disclosure, LLM05 - Improper Output Handling, LLM09 - Misinformation, LLM10 - Unbounded Consumption. OWASP Agentic: T1 - Memory Poisoning, T2 - Tool Misuse, T3 - Privilege Compromise, T4 - Resource Overload, T6 - Intent Breaking & Goal Manipulation, T7 - Misaligned & Deceptive Behaviors, T9 - Identity Spoofing & Impersonation, T11 - Unexpected RCE and Code Attacks, T13 - Rogue Agents in Multi-Agent Systems, T15 - Human Manipulation | An attacker can craft messages that make the LLM disregard all previous instructions, so that it can be used for any task the attacker wants. Mitigation: add a Guardrail that checks user messages for attempts at circumventing the LLM's instructions, and include agent instructions stating that the agent should not alter its instructions and should ignore user messages that try to convince it otherwise. |
Intentional Misuse | OWASP LLM Top 10: LLM01 - Prompt Injection, LLM10 - Unbounded Consumption. OWASP Agentic: T2 - Tool Misuse, T4 - Resource Overload, T6 - Intent Breaking & Goal Manipulation | An attacker can try to use the LLM instance for tasks other than its intended usage, either to drain resources or for personal gain. Mitigation: add a Guardrail that checks user messages for tasks outside the agent's intended usage, and include agent instructions that prohibit the agent from engaging in any tasks that are not its intended usage. |
System Prompt Leakage | OWASP LLM Top 10: LLM01 - Prompt Injection, LLM02 - Sensitive Information Disclosure, LLM07 - System Prompt Leakage. OWASP Agentic: T2 - Tool Misuse, T3 - Privilege Compromise, T6 - Intent Breaking & Goal Manipulation, T7 - Misaligned & Deceptive Behaviors | An attacker can make the LLM reveal the system prompt/instructions so that they can leak sensitive business logic or craft other attacks better suited to this LLM. Mitigation: add a Guardrail that checks agent messages for the exact text of the agent's system prompt, and include agent instructions highlighting that the system prompt/instructions are confidential and should not be shared. |
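The input-length check recommended for the Input Length Limit row can be sketched in a few lines of framework-agnostic Python. The constant and function names below are illustrative, not part of the openai-agents SDK; the check would run before the user message reaches any agent.

```python
MAX_INPUT_CHARS = 500  # the report suggests a limit of 200-500 characters

def check_input_length(user_message: str, limit: int = MAX_INPUT_CHARS) -> bool:
    """Return True if the message passes the guardrail, False if it trips it."""
    return len(user_message) <= limit
```

A message that fails the check should be rejected outright rather than truncated, since truncation can itself change the meaning of an instruction.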
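The PII guardrail described above can be approximated with regex-based detection. This is a minimal sketch under stated assumptions: the patterns only catch obvious email and phone formats, and a production system would use a dedicated PII-detection model. All names are illustrative.

```python
import re

# Crude patterns for the most common PII formats (illustrative only).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize_pii(text: str) -> str:
    """Replace recognizable PII with placeholder tags."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```

Running this on both user input and agent output (the report recommends checking both directions) keeps PII out of logs and model context alike.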
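The harmful-content guardrail can be sketched as a blocklist check. This is deliberately simplistic: a real deployment would call a moderation model instead of matching words, and the term list here is a placeholder.

```python
BLOCKED_TERMS = {"bomb", "kill", "poison"}  # placeholder terms only

def contains_harmful_content(message: str) -> bool:
    """Flag a message if any word matches the blocklist (case-insensitive)."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return not words.isdisjoint(BLOCKED_TERMS)
```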
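The jailbreak guardrail can be sketched as pattern matching on common instruction-override phrasing. Pattern matching alone is easy to evade, so real guardrails usually pair it with an LLM-based classifier; the patterns and names below are illustrative assumptions.

```python
import re

# A few well-known jailbreak phrasings (illustrative, far from exhaustive).
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all |any |your )?(previous|prior) instructions", re.I),
    re.compile(r"you are now (in )?(developer|dan) mode", re.I),
    re.compile(r"pretend (you have|there are) no (rules|restrictions)", re.I),
]

def looks_like_jailbreak(message: str) -> bool:
    """Flag messages matching known instruction-override phrasings."""
    return any(p.search(message) for p in JAILBREAK_PATTERNS)
```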
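The intentional-misuse guardrail can be sketched as a scope check: the drone agents only serve a narrow domain, so requests with no connection to it are rejected. Keyword matching is a crude stand-in for an LLM-based intent classifier, and the topic list below is a hypothetical example, not taken from the scanned project.

```python
# Topics within the drone system's intended usage (illustrative).
ALLOWED_TOPICS = {"drone", "patrol", "investigate", "emergency", "observation", "image"}

def in_scope(user_message: str) -> bool:
    """Accept a request only if it touches at least one allowed topic."""
    words = {w.strip(".,!?").lower() for w in user_message.split()}
    return not words.isdisjoint(ALLOWED_TOPICS)
```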
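The system-prompt-leakage guardrail described above (checking agent messages for the exact text of the system prompt) can be sketched as a sliding-window substring scan, which also catches partial leaks. The function name and window size are illustrative; the example prompt is the Drone Operator Agent's opening sentence from this report.

```python
SYSTEM_PROMPT = (
    "You are a drone operator responsible for monitoring "
    "and analyzing the environment."
)

def leaks_system_prompt(agent_message: str, prompt: str = SYSTEM_PROMPT,
                        window: int = 40) -> bool:
    """Flag the message if any `window`-character slice of the prompt appears verbatim."""
    for i in range(max(1, len(prompt) - window + 1)):
        if prompt[i:i + window] in agent_message:
            return True
    return False
```

Because this runs on the agent's outgoing messages, it works even when the jailbreak itself slipped past the input-side checks.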