VoxSight

Voice-driven visual web navigator powered by Gemini AI

As of June 2026, VoxSight has 2 users in the Accessibility category.

Users: 2 (no change, 0%)
Rating: none (no change, 0%)
Reviews: none (no change, 0%)
Version: 1.0.0
Manifest: V3

History

3 snapshots

Tracking since Apr 16, 2026.

Date          Users  Rating  Reviews  Version
Apr 16, 2026  2      -       -        1.0.0
Apr 27, 2026  1      -       -        1.0.0
May 21, 2026  1      -       -        1.0.0
Now           2      -       -        1.0.0

Permissions & access

Permissions
activeTab, sidePanel, tabs, scripting, storage, webNavigation
Host access
<all_urls>
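
For orientation, the permissions and host access listed above would correspond to manifest fields roughly like the following. This is a minimal sketch written as a TypeScript object, not the extension's actual manifest: only values shown on this page are filled in, and the `ManifestSketch` type and `hasBroadHostAccess` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical reconstruction of the manifest fields implied by this listing.
// Only values that appear on this page are included; all other fields are omitted.
interface ManifestSketch {
  manifest_version: 3;
  name: string;
  version: string;
  minimum_chrome_version: string;
  permissions: string[];
  host_permissions: string[];
}

const manifestSketch: ManifestSketch = {
  manifest_version: 3,
  name: "VoxSight",
  version: "1.0.0",
  minimum_chrome_version: "88",
  permissions: ["activeTab", "sidePanel", "tabs", "scripting", "storage", "webNavigation"],
  host_permissions: ["<all_urls>"],
};

// True when the manifest requests access to every URL, which is what
// lets a screenshot-based tool operate on any site without per-site setup.
function hasBroadHostAccess(manifest: ManifestSketch): boolean {
  return manifest.host_permissions.indexOf("<all_urls>") !== -1;
}
```

The `<all_urls>` host pattern is the broadest access an extension can request, so it is worth weighing against the privacy statements in the About section.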

About

VoxSight is a Chrome extension that transforms web browsing into a voice-driven experience. Speak natural commands like "Click the search button" or "Describe this page", and VoxSight understands your intent, analyzes the page visually, and executes precise actions.

How it works:
1. Open the VoxSight side panel (Alt+V)
2. Hold the mic button or press Space to speak
3. VoxSight captures a screenshot and sends it to Gemini's multimodal vision model
4. Actions are executed directly on the page with visual highlighting
5. Results are verified with a follow-up screenshot
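
The five steps above can be read as a small state machine. The sketch below is one way to model it, assuming each step finishes with a single event; the phase names, event names, and the `nextPhase` and `runEvents` functions are hypothetical and are not taken from VoxSight's source.

```typescript
// Hypothetical phases, one per step of the "How it works" list.
type Phase = "idle" | "listening" | "capturing" | "analyzing" | "executing" | "verifying";

// Hypothetical events that move the panel from one phase to the next.
type PanelEvent =
  | "micPressed"
  | "speechEnded"
  | "screenshotSent"
  | "actionsReceived"
  | "actionsExecuted"
  | "verified"
  | "cancelled";

// Allowed transitions; any event not listed for a phase is ignored.
const transitions: Record<Phase, Partial<Record<PanelEvent, Phase>>> = {
  idle: { micPressed: "listening" },
  listening: { speechEnded: "capturing" },
  capturing: { screenshotSent: "analyzing" },
  analyzing: { actionsReceived: "executing" },
  executing: { actionsExecuted: "verifying" },
  verifying: { verified: "idle" },
};

function nextPhase(phase: Phase, event: PanelEvent): Phase {
  // Cancelling (for example with Escape) always returns to idle.
  if (event === "cancelled") return "idle";
  const next = transitions[phase][event];
  return next !== undefined ? next : phase;
}

// Replays a sequence of events starting from idle and returns the final phase.
function runEvents(events: PanelEvent[]): Phase {
  let phase: Phase = "idle";
  for (const event of events) {
    phase = nextPhase(phase, event);
  }
  return phase;
}
```

Keeping the transitions in a table makes out-of-order events harmless: a stray `verified` while idle simply leaves the panel idle.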

Key Features:
- Voice commands in Chinese and English with automatic language detection
- Works on any website -- no site-specific setup needed
- High-risk action confirmation for safety (submit, pay, delete)
- Visual highlight overlay showing exactly where actions will occur
- Continuous conversation with multi-turn context
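
The high-risk confirmation feature could be approximated with a keyword check on the target element's label. The sketch below is an illustration only: the `PageAction` shape, the `needsConfirmation` function, the choice to gate clicks alone, and the Chinese keywords are all assumptions; the listing names only submit, pay, and delete.

```typescript
// Hypothetical action shape; the listing does not document the real one.
interface PageAction {
  kind: "click" | "type" | "scroll";
  targetLabel: string; // visible text of the element the action targets
}

// English keywords come from the listing; the Chinese equivalents
// (submit, pay, delete) are an assumption based on the bilingual support.
const HIGH_RISK_KEYWORDS = ["submit", "pay", "delete", "提交", "支付", "删除"];

// Returns true when the user should be asked to confirm before the action runs.
// Assumption: only clicks can trigger a high-risk operation.
function needsConfirmation(action: PageAction): boolean {
  if (action.kind !== "click") return false;
  const label = action.targetLabel.toLowerCase();
  return HIGH_RISK_KEYWORDS.some((word) => label.indexOf(word) !== -1);
}
```

Substring matching is deliberately crude: it also flags labels such as "Repay", so a real implementation would likely combine it with the model's own risk assessment.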

Accessibility:
- WCAG 2.1 AA compliant
- High contrast mode for low vision users
- Adjustable font sizes (normal / large / extra-large)
- Full keyboard navigation (Space to speak, Escape to cancel, Alt+D to describe page)
- Bilingual support (Chinese / English / Auto-detect)
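
The keyboard shortcuts listed above could be routed through a single mapping function. This is a minimal sketch: `KeyLike`, `PanelCommand`, and `commandForKey` are hypothetical names, and the use of `KeyboardEvent.code` values (`"Space"`, `"Escape"`, `"KeyD"`) is an assumption about how the panel reads keys.

```typescript
// Minimal subset of the DOM KeyboardEvent, so the function can also be
// exercised outside a browser. A real KeyboardEvent satisfies this shape.
interface KeyLike {
  code: string;
  altKey: boolean;
}

// Hypothetical command names for the three shortcuts in the listing.
type PanelCommand = "startSpeaking" | "cancel" | "describePage" | null;

function commandForKey(event: KeyLike): PanelCommand {
  if (event.altKey && event.code === "KeyD") return "describePage"; // Alt+D
  if (!event.altKey && event.code === "Space") return "startSpeaking"; // Space
  if (event.code === "Escape") return "cancel"; // Escape
  return null; // not a VoxSight shortcut
}
```

In a side panel this would typically be called from a `keydown` listener, for example `document.addEventListener("keydown", (e) => commandForKey(e))`.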

Technical Details:
- Built with Chrome Manifest V3
- Powered by Gemini Live API with bidirectional streaming
- Screenshot-based analysis works universally across all websites
- Backend hosted on Google Cloud Run with WebSocket streaming
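
To make the WebSocket streaming concrete, the sketch below shows one possible JSON framing between the extension and the backend. Every message shape here is hypothetical: the listing does not document the wire format, and `ClientFrame`, `ServerFrame`, `PlannedAction`, `encodeFrame`, and `decodeFrame` are names invented for this example.

```typescript
// Hypothetical frames sent from the extension to the backend.
type ClientFrame =
  | { type: "command"; transcript: string; screenshotPngBase64: string }
  | { type: "verify"; screenshotPngBase64: string };

// Hypothetical action returned by the backend: what to do and where.
interface PlannedAction {
  kind: string;
  x: number;
  y: number;
}

// Hypothetical frames sent from the backend to the extension.
type ServerFrame =
  | { type: "actions"; actions: PlannedAction[] }
  | { type: "speech"; text: string };

function encodeFrame(frame: ClientFrame): string {
  return JSON.stringify(frame);
}

// Parses a raw WebSocket text message; returns null for anything unrecognized
// so that a malformed frame never reaches the code that acts on the page.
function decodeFrame(raw: string): ServerFrame | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return null; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = parsed as { type?: unknown; text?: unknown; actions?: unknown };
  if (candidate.type === "speech" && typeof candidate.text === "string") {
    return { type: "speech", text: candidate.text };
  }
  if (candidate.type === "actions" && Array.isArray(candidate.actions)) {
    return { type: "actions", actions: candidate.actions as PlannedAction[] };
  }
  return null;
}
```

In a browser, `encodeFrame` output would be passed to `WebSocket.send`, and `decodeFrame` would be called on the `data` of each `message` event.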

Privacy:
- No browsing history collected
- No passwords or personal data stored
- Screenshots are processed in memory only, never saved to disk
- Voice recognition runs locally in your browser via Web Speech API

Technical

Version: 1.0.0
Manifest: V3
Size: 28.21 KiB
Min Chrome: 88
Languages: 1
Featured: No

Metadata

ID: dfepmfcgbaceajaapbbakoikpfebeiic
Developer ID: u46814095c6ae9e8f15975196753c9d6d
Developer Email: [email protected]
Created: Mar 17, 2026
Last Updated (Store): Mar 17, 2026
Last Scraped: Jun 9, 2026

Data sourced from the Chrome Web Store · last verified Jun 9, 2026.