AI Agent for Unity: Building Autonomous Agents in Games
Learn how to build AI agents in Unity, integrating ML-Agents and external models. This technical guide covers setup, code samples in C# and Python, training workflows, debugging, and deployment considerations for robust agent-driven gameplay.
An AI agent in Unity is a component that acts autonomously by processing observations and selecting actions. It often uses ML-Agents for learned behavior or external AI models for decision-making.
What is an AI agent in Unity?

In Unity, an AI agent is a software entity that perceives its environment, reasons about goals, and takes actions to achieve those goals within a game world. AI agents range from simple wanderers to complex NPCs that coordinate with teammates, avoid obstacles, and adapt to player behavior. The AI-agent pattern in Unity typically combines perception sensors, a decision-making component, and a control loop running inside the engine. Adopting a modular agent design reduces coupling between these parts and makes testing easier.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;

public class SimpleAgent : Agent {
    public override void Initialize() { }
    public override void OnEpisodeBegin() { /* reset state */ }
    public override void CollectObservations(VectorSensor sensor) { /* add observations */ }
    public override void OnActionReceived(ActionBuffers actions) { /* apply actions */ }
}
```
Prerequisites and environment setup for Unity AI agents

Before building agents, ensure you have a Unity project ready and the ML-Agents toolkit installed. The typical workflow involves configuring the Unity Package Manager, setting up a Python training environment, and ensuring your target platform supports the intended AI workload. Pin the package to a specific released version (for example, 2.0.1) that is compatible with your `mlagents` Python version. The steps below show a minimal, working baseline to get you started.

```json
// Packages/manifest.json (example snippet)
{
  "dependencies": {
    "com.unity.ml-agents": "2.0.1"
  }
}
```

```bash
# Create a virtual environment for training (Linux/macOS)
python3 -m venv venv
source venv/bin/activate
# Install the ML-Agents toolkit
pip install mlagents
```
Quickstart: your first AI agent in a Unity scene

Create a scene with a target object and a simple agent script that derives from ML-Agents' Agent. The following example shows a minimal agent that receives a target position as observation and moves toward it. This is a foundational pattern you can extend with rewards and obstacles.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;

public class TargetAgent : Agent {
    public Transform Target;
    public float Speed = 2.0f;

    public override void OnEpisodeBegin() {
        // Randomize target position within a radius for variability
        Target.position = new Vector3(Random.Range(-4, 4), 0, Random.Range(-4, 4));
        transform.position = Vector3.zero;
    }

    public override void CollectObservations(VectorSensor sensor) {
        sensor.AddObservation(Target.position);
        sensor.AddObservation(transform.position);
    }

    public override void OnActionReceived(ActionBuffers actions) {
        var moveX = actions.ContinuousActions[0];
        var moveZ = actions.ContinuousActions[1];
        var dir = new Vector3(moveX, 0, moveZ).normalized;
        transform.position += dir * Speed * Time.deltaTime;
        // Reward for getting closer to the target could be added and tuned here
    }
}
```
Training basics with ML-Agents: config and workflow

Training an AI agent in Unity typically involves a separate Python-based trainer that optimizes policies based on rewards. The config file defines hyperparameters such as learning rate, batch size, and buffer size. After training, you export the trained model back into Unity for gameplay.

```yaml
# trainer_config.yaml
behaviors:
  TargetAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 128
      learning_rate: 3.0e-4
    network_settings:
      num_layers: 2
      hidden_units: 128
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
```

```bash
mlagents-learn config/trainer_config.yaml --run-id=FirstRun --force
```
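ML-Agents writes training summaries under the `results` directory by default, so progress can be monitored with TensorBoard while a run is in flight (port is illustrative):

```bash
# Inspect reward curves and losses during or after training
tensorboard --logdir results --port 6006
```

Watch the cumulative reward curve; a flat or declining curve early on usually points to reward design or observation issues rather than hyperparameters.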
Perception and observations: sensing the world in Unity

A core pattern is to capture observations that feed into the policy. You can use built-in sensors or custom observations (e.g., distance to obstacles, relative position to targets). The following example shows collecting a few key observations directly in code.

```csharp
public override void CollectObservations(VectorSensor sensor) {
    Vector3 toTarget = Target.position - transform.position;
    sensor.AddObservation(toTarget.normalized);
    sensor.AddObservation(Vector3.Distance(transform.position, Target.position));
    // Optional: add sensors for your environment (e.g., obstacle count)
}
```
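Raw world-space values can destabilize training when their scale varies between scenes. A minimal normalization sketch, assuming an `arenaHalfExtent` field describing your play area (an assumption about your scene, not an ML-Agents requirement):

```csharp
public float arenaHalfExtent = 5f; // assumed half-size of the play area

public override void CollectObservations(VectorSensor sensor) {
    // Scale observations into roughly [-1, 1] so they share a common range
    sensor.AddObservation((Target.position - transform.position) / arenaHalfExtent);
    sensor.AddObservation(Vector3.Distance(transform.position, Target.position) / (2f * arenaHalfExtent));
}
```

Keeping observations in a consistent range avoids one large-magnitude input dominating the early layers of the policy network.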
Advanced patterns: perception, planning, and behaviors

Beyond simple target seeking, you can introduce planning and switching behaviors with a Behavior Parameters component or a DecisionRequester. This enables more complex decision cycles and modular AI. The example shows a basic behavior switch based on proximity; note the `Target` field is declared so `Update` can reference it.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.Barracuda; // for neural-net inference if using NN models

public class AdaptiveAgent : Agent {
    public Transform Target;
    public float switchDistance = 2.0f;

    public override void OnActionReceived(ActionBuffers actions) {
        // Implement different action branches based on the current mode
    }

    void Update() {
        if (Vector3.Distance(transform.position, Target.position) < switchDistance) {
            // switch to attack-like behavior
        }
    }
}
```
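Before any training, the same action interface can be driven by hand: ML-Agents lets you override `Heuristic` so keyboard input fills the action buffers when the Behavior Type is set to Heuristic Only. A minimal sketch using Unity's default input axes:

```csharp
public override void Heuristic(in ActionBuffers actionsOut) {
    var continuous = actionsOut.ContinuousActions;
    continuous[0] = Input.GetAxis("Horizontal"); // maps A/D or arrow keys to action 0
    continuous[1] = Input.GetAxis("Vertical");   // maps W/S or arrow keys to action 1
}
```

This is useful for verifying that observations, actions, and rewards behave sensibly before spending hours on a training run.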
Debugging, profiling, and testing AI agents

Debugging AI in Unity involves adding informative logs, using the Unity Profiler, and validating agent behavior in isolated test scenes. Start by enabling development builds and verbose logging during training and inference, and keep the Profiler window open while running in the Editor to monitor CPU/GPU usage. Perception and sensor failures can be diagnosed with breakpoints in CollectObservations and OnActionReceived.

```csharp
public override void OnActionReceived(ActionBuffers actions) {
    Debug.Log($"Action received: {actions.ContinuousActions[0]}, {actions.ContinuousActions[1]}");
    // existing movement logic...
}
```
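Scene-view gizmos are another lightweight debugging aid. A small sketch that draws the agent-to-target line (assumes the agent exposes a `Target` field, as in the earlier examples):

```csharp
void OnDrawGizmos() {
    if (Target == null) return;
    Gizmos.color = Color.yellow;
    // Visualize the agent's heading error directly in the Scene view
    Gizmos.DrawLine(transform.position, Target.position);
}
```

Gizmos render only in the Editor, so they add no cost to builds and can stay in the codebase.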
Performance, optimization, and deployment considerations

Performance depends on observation richness, action frequency, and model size. Enabling the Burst compiler and using lightweight observations reduces overhead. Consider deploying a smaller neural network, or switching to heuristic policies for mobile targets. Also, ensure the training environment mirrors the deployment platform to avoid surprises.

```csharp
// Example: Burst-compiled job for math-heavy work (requires the Burst, Jobs,
// and Collections packages). Burst compiles job structs, not arbitrary classes.
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
public struct DotJob : IJob {
    public float A, B;
    public NativeArray<float> Result;
    public void Execute() { Result[0] = A * B; }
}
```
Common pitfalls and how to avoid them

- Overfitting in training: diversify episode goals and reset scenarios.
- Missing reward shaping: ensure rewards guide the agent toward meaningful goals rather than exploiting trivial paths.
- Mismatched observations: keep observations aligned with the model's expectations.
- Performance regressions after deployment: profile early and often, and validate on target devices.

```csharp
// Reward shaping example: a small incremental reward for staying close to the target.
// Assumes a maxDistance field holding the largest expected distance in the arena.
float distance = Vector3.Distance(transform.position, Target.position);
float reward = Mathf.Max(0f, 1.0f - distance / maxDistance);
AddReward(reward * Time.deltaTime);
```
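Shaped rewards work best with clear episode boundaries. A sketch that ends the episode on success or failure (the distance and height thresholds are illustrative, not prescribed values):

```csharp
public override void OnActionReceived(ActionBuffers actions) {
    // ...movement logic...
    float distance = Vector3.Distance(transform.position, Target.position);
    if (distance < 0.5f) {                  // illustrative success threshold
        AddReward(1.0f);
        EndEpisode();
    } else if (transform.position.y < -1f) { // e.g., agent fell off the platform
        AddReward(-1.0f);
        EndEpisode();
    }
}
```

Terminal rewards like these complement per-step shaping and prevent the agent from farming incremental rewards indefinitely.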
Steps
Estimated time: 6-8 hours
1. Define agent goals and environment. Identify the agent's tasks, environment boundaries, and success criteria. Create a simple scene with a target, obstacles, and a basic agent. This establishes a repeatable baseline for experimentation. Tip: Keep goals aligned with gameplay outcomes to avoid wasted iterations.
2. Set up ML-Agents tooling. Install ML-Agents in Unity and prepare a Python training environment. Validate that the trainer can communicate with Unity via the provided gRPC or Python API. Tip: Ensure version compatibility between the Unity package and the ML-Agents toolkit.
3. Implement observations and actions. Add CollectObservations and OnActionReceived to your agent. Start with simple observations (target position, agent position) and basic continuous actions for movement. Tip: Log observations to verify they're feeding into the model correctly.
4. Train a baseline policy. Run mlagents-learn with a minimal config. Monitor rewards and adjust the reward signals to guide learning toward the intended behavior. Tip: Use small, incremental changes between runs to isolate impact.
5. Validate in-scene performance. Load trained models into Unity and test in varied scenarios. Check for edge cases (obstacles, dynamic targets, player interaction). Tip: Record edge-case failures to refine observations or rewards.
6. Optimize and deploy. Profile CPU/GPU usage, optimize network size, and prepare for target-platform deployment. Consider alternate policies for resource-constrained devices. Tip: Profile early; small gains compound into a smoother UX.
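Once the scene is ready, the steps above condense into a short command-line loop (the run ID and paths are illustrative; ML-Agents writes the trained model under `results/<run-id>/`):

```bash
# One training iteration: train, inspect, then copy the model into Unity
mlagents-learn config/trainer_config.yaml --run-id=Baseline01
tensorboard --logdir results                            # inspect reward curves
cp results/Baseline01/TargetAgent.onnx Assets/Models/   # import into the project
```

After importing, assign the .onnx model in the agent's Behavior Parameters component to run inference in-game.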
Keyboard Shortcuts
| Action | Description | Shortcut |
|---|---|---|
| Play in Editor | Toggles Play mode in the Unity Editor | Ctrl+P |
| Save | Saves the current scene and project | Ctrl+S |
| Duplicate selection | Duplicates the selected GameObject | Ctrl+D |
| Open search | Search assets, scenes, and components | Ctrl+F |
Questions & Answers
What is the difference between an AI agent and a script in Unity?
An AI agent uses a decision-making policy (learned or rule-based) to choose actions autonomously, while a script follows deterministic instructions. Agents can adapt to environments and player behavior, whereas scripts are typically fixed. Training and observation design are key to effective agents.
Do I need ML-Agents to use AI agents in Unity?
ML-Agents is a common framework for training neural policies, but you can implement rule-based or heuristic agents without it. ML-Agents shines when you want adaptive, learned behaviors, especially in complex scenes.
Can AI agents run on mobile devices?
Yes, but you must tailor the policy size and inference performance to the target device. Lightweight models, simplified observations, and model quantization help maintain frame rates.
What are common pitfalls in Unity AI agent development?
Rewards misdesign, overfitting, and mismatched observations are frequent problems. Ensure consistent data flow between Unity and the trainer, and iterate with small, controlled changes.
How do I train an AI agent for a simple navigation task?
Set up a target and obstacles, define observations (e.g., relative position to target), provide continuous actions for movement, and craft rewards to encourage reaching the target quickly and safely.
Is it possible to integrate external AI models with Unity agents?
Yes. Unity agents can ingest decisions from external models via custom interfaces, with care taken to minimize latency and maintain synchronization between Unity and the inference server.
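A minimal sketch of polling an external inference server from Unity; the endpoint URL, JSON payload shape, and `RemotePolicyClient` class are illustrative assumptions, not a real API:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

public class RemotePolicyClient : MonoBehaviour {
    // Hypothetical endpoint; point this at your own inference server.
    const string Endpoint = "http://localhost:8500/act";

    public IEnumerator RequestAction(string observationJson) {
        using (var req = UnityWebRequest.Put(Endpoint, observationJson)) {
            req.method = "POST"; // Put() is used here to send a raw JSON body
            req.SetRequestHeader("Content-Type", "application/json");
            yield return req.SendWebRequest();
            if (req.result == UnityWebRequest.Result.Success) {
                Debug.Log($"Action from server: {req.downloadHandler.text}");
            }
        }
    }
}
```

In practice, batch requests or run inference locally where possible to keep per-frame latency within budget.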
Key Takeaways
- Define clear agent goals and evaluation criteria.
- Leverage ML-Agents for learning-based behavior and Unity for deployment.
- Keep observations minimal but informative for stable training.
- Test extensively in varied scenarios to catch edge cases.
- Profile and optimize early to ensure smooth gameplay.
