
Give Your Agent
Eyespot
The world's first multimodal hardware Agent Skill — depth, vision, and audio in one device, giving your Agent the richest possible context of the real world.
The Problem
AI Agents Are Brilliant.
But They Need to See the Real World.
Today's AI Agents can write code, make plans, and manage schedules. But they have one fatal flaw — they can't see the physical world. An Agent that can book your flights doesn't know your package has been left at the door; an Agent that can write your reports doesn't know your pet is chewing on the sofa leg.
"Intelligence without perception is hallucination."
Our Vision
Give your Agent eyes,
help it see and even change the physical world.
Eyespot is not a traditional camera, nor standalone AI hardware. It is a visual perception layer: a bridge connecting your existing AI Agent to the physical world. When an Agent gains visual capabilities, it evolves from a 'digital assistant' into a 'physical-world intelligent partner'.
Three Trends Converging
AI Moving from Cloud to Edge
User demand for data privacy and instant response is driving AI capabilities to run on local devices. Edge AI Agents have moved from concept to productization.
Agents Going from Digital to Physical
Physical AI — machines that interact with the physical world — cannot succeed without vision. Vision is not a nice-to-have, but a prerequisite for Agents entering the physical world.
From Single-Sensor to Multimodal Perception
Next-generation AI hardware is moving beyond RGB cameras alone. Depth sensors (ToF), microphones, and speakers combined give Agents a far richer model of the physical world — enabling spatial awareness, audio context, and two-way interaction that pixels alone can never provide.
The Product
Not a Camera.
Your Agent's Eyes.
Smart hardware combining a ToF depth camera, RGB camera, microphone array, and speaker with edge AI computing — giving your Agent not just vision, but depth perception and audio awareness for richer real-world context.

Richer Context Than Any Camera
Standard cameras give Agents pixels. Eyespot gives Agents depth (ToF), sound (microphone array), and voice (speaker) — three sensing modalities that together deliver far richer context for smarter decisions.
Give Agent Vision
Let AI Agents go beyond text and voice to see the physical world, understanding scenes, objects, and events.
Proactive Scene Intelligence
From passively responding to commands to actively observing and understanding: Eyespot can distinguish 'someone is at the door' from 'someone left a package at the door and walked away'.
Privacy-First Edge Computing
All visual data is processed locally, never uploaded to the cloud. User privacy and data sovereignty receive the highest level of protection.
Seamless Integration & Open Ecosystem
Deep integration with Home Assistant first, while providing open APIs to empower developers to create unlimited possibilities.
Three-Layer Architecture
Perception layer: RGB camera + ToF depth sensor + microphone array + speaker + wide-angle lens + night vision + edge AI chip
Intelligence layer: local multimodal vision-language model + scene-specific lightweight models
Integration layer: deep Home Assistant integration + RESTful API + MCP protocol
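To make the integration layer concrete, here is a minimal sketch of how an Agent might query the device over its local REST API. The endpoint path (`/api/v1/query`), hostname, and payload shape are assumptions for illustration, not a published Eyespot API.

```python
# Hypothetical sketch: how an Agent might query Eyespot's integration layer.
# The endpoint path and payload shape are assumptions, not a published API.

def build_scene_query(question: str, camera_id: str = "front_door") -> dict:
    """Build the request an Agent would POST to the device's local REST API."""
    return {
        "url": "http://eyespot.local/api/v1/query",  # hypothetical endpoint
        "json": {"camera": camera_id, "question": question},
    }

req = build_scene_query("What's happening at the door?")
```

The same question could equally be exposed as an MCP tool, so any MCP-capable Agent can call it without custom glue code.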

Hardware Sensor Suite
Beyond Pixels. Full-Spectrum Perception.
Standard cameras give Agents 2D pixels. Eyespot integrates four sensors so your Agent simultaneously perceives spatial depth, visual scene, ambient audio, and can respond with voice — context richness no single camera can match.
ToF Depth Camera
Spatial depth perception beyond RGB
RGB Camera
4K wide-angle + night vision
Microphone Array
Ambient audio awareness & voice input
Speaker
Two-way Agent interaction & voice output
Compared with traditional AI cameras (Arlo, Ring, etc.): RGB-only, no depth data, no audio sensing. The context they provide to Agents is severely limited, making truly intelligent decisions impossible.
Live Demo
See → Understand → Act
Camera Captures
Edge AI Processes
Agent Understands
Action Triggered
"What's happening at the door?"
"A delivery person just placed a package and left."
Agent → Sends notification + Unlocks smart lock
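The See → Understand → Act flow above can be sketched as two small functions: one that turns a raw edge-model detection into an Agent-readable event, and one that maps the event to actions. All names here (`SceneEvent`, `understand`, `act`, the `package_delivered` label) are illustrative assumptions, not the actual Eyespot API.

```python
# Hypothetical sketch of the "See -> Understand -> Act" pipeline.
# Names and event labels are illustrative, not the actual Eyespot API.
from dataclasses import dataclass

@dataclass
class SceneEvent:
    label: str        # e.g. "package_delivered"
    description: str  # natural-language summary from the edge model
    confidence: float

def understand(raw_detection: dict) -> SceneEvent:
    """Map a raw edge-model detection into an Agent-readable event."""
    return SceneEvent(
        label=raw_detection["event"],
        description=raw_detection["summary"],
        confidence=raw_detection["score"],
    )

def act(event: SceneEvent) -> list[str]:
    """Decide which actions the Agent should trigger for an event."""
    actions = []
    if event.label == "package_delivered" and event.confidence > 0.8:
        actions += ["notify_user", "unlock_smart_lock"]
    return actions

detection = {"event": "package_delivered",
             "summary": "A delivery person placed a package and left.",
             "score": 0.93}
print(act(understand(detection)))  # ['notify_user', 'unlock_smart_lock']
```

The point of the split is that the edge model only describes the scene; the decision of what to do stays with the Agent, where it can be combined with the rest of the user's context.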
Product Line
Three Products. One Platform.
From smart home to the great outdoors — Eyespot covers every scenario where AI needs to see the real world.
Eyespot Lite
Smart Home, Simplified
Cloud-powered AI vision for everyday smart home users. No local compute required.
Eyespot Pro
Agent-Ready Intelligence
Local MLLM on-device. Full multimodal sensing — depth, vision, audio. The complete Agent Skill.
Eyespot Wild
AI Eyes for the Outdoors
Battery-powered, IP67 weatherproof. Built for birdwatchers, anglers, and wildlife photographers.
Core Features
Six Core Capabilities
Scene Observation
Scene observation & description
Ask in natural language, get real-time scene descriptions
"What's happening in the living room?" → "Lights are on, cat is sleeping on the sofa"
Object Recognition
Object recognition & tracking
Identify people, pets, packages, and other common objects and their states
"There is a blue package at the door"
Event Detection
Event detection & triggers
Proactively notify the Agent when specific events occur
"Package detected placed at the door"
Visual Tasks
Natural language task setting
Set complex tasks with visual conditions for the Agent
"If you see me pick up my car keys, activate away mode"
Timeline Recall
Timeline recall
An AI-understood event timeline, not traditional video playback
"What happened at home this afternoon?"
Open API
Open API integration
RESTful API, Webhook, and deep Home Assistant integration
Developers can build custom visual automation workflows
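As one example of the kind of workflow a developer could build, here is a minimal sketch that turns an incoming Eyespot webhook payload into a Home Assistant service call. The payload schema (`event`, `description` fields) is an assumption for illustration; the real webhook format may differ.

```python
# Hypothetical sketch: turning an Eyespot webhook payload into a
# Home Assistant service call. The payload schema is an assumption;
# the real Eyespot webhook format may differ.
import json

def to_ha_service_call(payload: dict):
    """Map an Eyespot event payload to a Home Assistant notify call."""
    if payload.get("event") != "package_detected":
        return None  # ignore events this automation doesn't handle
    return {
        "domain": "notify",
        "service": "mobile_app",
        "data": {"message": payload.get("description", "Package detected")},
    }

raw = '{"event": "package_detected", "description": "Blue package at the front door"}'
call = to_ha_service_call(json.loads(raw))
```

In practice this function would sit behind a small webhook receiver, with the returned dict passed to Home Assistant's service-call API.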
Market Opportunity
A Convergence of
Massive Markets.
Eyespot sits at the intersection of several high-growth markets. In the overlap between AI Agents and visual hardware, no clear category leader has yet emerged.
- AI Smart Home (2025 market size)
- Smart Home Camera (2025 market size)
- AI Agent Market (2025 market size)
- China Smart Camera (2022 market size)

Competitive Landscape
Differentiated Positioning
Our competitive strategy is not to make a better camera, but to be the Agent's eyes, ears, and voice — delivering depth, audio, and visual context that no standard camera can match.
| Category | Representative | Positioning | Key Difference |
|---|---|---|---|
| Open Source Physical AI Agent | SenseCAP Watcher, XiaoZhi | Developer-oriented desktop AI Agent | Niche, not mainstream consumer; self-contained, weak Agent integration |
| Mobile AI Hardware | Rabbit R1, Humane AI Pin | Personal AI assistant | Vague value prop, no killer app; not focused on space/scene |
| AI Smart Glasses | Meta Ray-Ban, Brilliant Halo | Personal wearable AI vision | First-person POV only; cannot monitor fixed scenes |
| Traditional AI Camera | Arlo, Ring, Ezviz | Home security monitoring | RGB-only, no depth data, no audio sensing — severely limited context for Agent reasoning |
| Smart Home Hub (Software) | Home Assistant | Open-source smart home platform | Powerful ecosystem but lacks official visual Agent hardware |
Roadmap
From Prototype
to Ecosystem.
0 - 6 Months
Product
- Complete hardware prototype with edge AI chip
- Implement basic Home Assistant integration
- Provide scene description and basic object recognition
- Release Skills for Agent integration
Market
- Small-scale beta testing in Home Assistant community (100-500 units)
- Rapid iteration through community feedback
- Establish brand social media presence and developer docs
Key Milestone
100+ beta users, NPS > 40
6 - 12 Months
Product
- Optimize AI model accuracy and inference speed
- Add package detection, fall detection, pet behavior analysis
- Refine industrial design and packaging to consumer grade
- Complete Home Assistant, IFTTT, n8n integrations
Market
- Kickstarter / Indiegogo crowdfunding launch
- Enter mainstream tech media spotlight
- Establish official website and e-commerce channels
Key Milestone
Crowdfunding > $500K, 5,000+ units shipped
12 - 24 Months
Product
- Launch Model Store
- Support MCP (Model Context Protocol)
- Explore more hardware form factors (portable, outdoor, etc.)
Market
- Host developer competitions to inspire community creativity
- Establish an official deep partnership with Home Assistant
- Explore entering more regional markets
Key Milestone
50,000+ units shipped, 1,000+ active developers
Business Model
Hardware + Software
Revenue Strategy.
Hardware Sales
Three product lines for every need: Lite starts at $59, Pro starts at $99, Wild outdoor edition starts at $149. Super early bird pricing available during Kickstarter campaign.
Value-Added Services
Premium AI model subscriptions, optional cloud backup services, and Model Store platform revenue share. We deliberately avoid introducing subscriptions too early, keeping the purchase decision simple.
Founding Team
Product + Algorithm,
The Perfect Duo.
Two co-founders with over a decade of deep expertise in AI product commercialization and edge visual algorithms respectively. Their combined capabilities perfectly match the vision of giving Agents eyes.
Alex
Co-founder / Product
M.S. Computer Science, Renmin University of China (Data Mining)
01.AI
Head of To-B Product, driving LLM commercialization
Meituan
Led smart camera product line with edge-cloud architecture for warehousing, logistics, and rider compliance monitoring, achieving millions of daily API calls
Ant Financial
Built the Dragonfly face-payment system handling tens of millions of daily verifications; pioneered unmanned vending cabinets and self-checkout hardware
David
Co-founder / Technology
M.S. Tsinghua University
Baidu Visual Tech
Tech lead for edge devices, leading visual algorithm R&D, model quantization, and end-to-end edge deployment
Meituan Vision AI
Shipped multiple edge AI hardware products including smart capture devices, attendance systems, and rider cameras
Autonomous Driving
Leading algorithm quantization deployment and VLA algorithm R&D, specializing in edge optimization and multimodal visual perception
Join Us
Let's Build the Future
of Physical AI.
We're looking for investors, partners, and early adopters who share our vision of giving AI agents the ability to see the physical world.
Coming to Kickstarter · Limited early bird spots