Safety Nudges
Safety Nudges audits chatbot conversations in real time to screen for common problems.
As of June 2026, Safety Nudges has 4 users in the Education category.
Users: 4 (no change, 0%)
Rating: — (no change, 0%)
Reviews: — (no change, 0%)
Version: 0.1.2 (Manifest V3)
90-day change · In the last 90 days this extension had 2 version updates and changed its permissions.
History
6 snapshots · Tracking since Apr 19, 2026.
| Date | Users | Rating | Reviews | Version |
|---|---|---|---|---|
| Apr 19, 2026 | — | — | — | 0.1.0 |
| Apr 24, 2026 | — | — | — | 0.1.0 |
| May 18, 2026 | — | — | — | 0.1.1 |
| May 24, 2026 | 1 | — | — | 0.1.2 |
| May 31, 2026 | 4 | — | — | 0.1.2 |
| Jun 6, 2026 | 3 | — | — | 0.1.2 |
| Now | 4 | — | — | 0.1.2 |
Changelog
- May 24, 2026 · description: the citation block and the support contact line were appended to the description; the rest of the text is unchanged and appears in full under About.
- May 18, 2026 · host_permissions
  - Before: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://openrouter.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
  - After: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
Permissions & access
- Permissions: storage
- Host access: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
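To make the host access list concrete, here is a minimal TypeScript sketch that checks whether a URL falls under one of the listed patterns. It is a simplification: it only handles patterns of the form `scheme://host/*` and is not Chrome's full match-pattern algorithm. The names `HOST_PATTERNS`, `originOf`, and `isCoveredByHostAccess` are hypothetical and do not come from the extension.

```typescript
// Host patterns copied from the "Host access" entry above.
const HOST_PATTERNS: readonly string[] = [
  "https://chatgpt.com/*",
  "https://chat.openai.com/*",
  "https://claude.ai/*",
  "https://bjokhkmomdogymmmnpdo.supabase.co/*",
];

// Extract "scheme://host" from a URL string, or null if it has no such prefix.
function originOf(url: string): string | null {
  const m = /^([a-z][a-z0-9+.-]*):\/\/([^\/?#]+)/i.exec(url);
  return m ? (m[1] + "://" + m[2]).toLowerCase() : null;
}

// Simplified check (assumption): a pattern "scheme://host/*" covers every
// path on exactly that scheme and host. Wildcard hosts are not handled.
function isCoveredByHostAccess(
  url: string,
  patterns: readonly string[] = HOST_PATTERNS
): boolean {
  const origin = originOf(url);
  if (origin === null) return false;
  return patterns.some(
    (p) => p.slice(-2) === "/*" && p.slice(0, -2).toLowerCase() === origin
  );
}
```

Under this sketch, a chat page on `claude.ai` is covered, while a URL on `openrouter.ai` is not, since that host no longer appears in the list.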
About
As AI systems have become widely commercialized and integrated into consumer products, financial and market competition pressures may incentivize companies to promote AI products despite known safety risks or societal harms. Recent incidents and research have raised concerns about potential societal impacts associated with AI use, including overreliance on automated systems, excessive use, persuasive influence, social isolation, and user trust reinforced through sycophantic or overly agreeable responses from AI systems. These safety-related concerns can disproportionately affect vulnerable populations, such as older adults and children.
Because the incentives of commercial AI product developers may not always align with broader societal interests, there is a need for independent oversight mechanisms and user-facing tools that can promote awareness of potential risks.
Designed by AI safety researchers at Carnegie Mellon University, Safety Nudges provides a low-friction real-time auditing interface for chatbot conversations, restoring agency and awareness to end-users. It checks each conversation turn on ChatGPT and Claude by sending it to an external LLM for review; a principled and comprehensive taxonomy of harms is used to flag the conversation for common pitfalls like flattery, overconfidence, and anthropomorphization. Nudges are integrated gracefully into the chat interface and can be paused at any time.
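The per-turn review described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the extension's actual implementation: the prompt wording, the JSON reply format, and the names `Turn`, `Nudge`, `buildReviewPrompt`, and `parseNudges` are all hypothetical, and the three pitfalls are only the examples named in the description rather than the full taxonomy. The network call to the reviewer model is left out.

```typescript
// Hypothetical sketch: none of these names come from the Safety Nudges source.
type Pitfall = "flattery" | "overconfidence" | "anthropomorphization";

interface Turn {
  userMessage: string;
  assistantReply: string;
}

interface Nudge {
  pitfall: Pitfall;
  rationale: string;
}

// Only the example pitfalls named in the description, not the full taxonomy.
const PITFALLS: readonly string[] = [
  "flattery",
  "overconfidence",
  "anthropomorphization",
];

// Build the text a reviewer model would receive for one conversation turn.
function buildReviewPrompt(turn: Turn): string {
  return [
    "Review the assistant reply for these pitfalls: " + PITFALLS.join(", ") + ".",
    'Answer with a JSON array of {"pitfall", "rationale"} objects; use [] if none apply.',
    "User: " + turn.userMessage,
    "Assistant: " + turn.assistantReply,
  ].join("\n");
}

// Parse the reviewer's answer, keeping only entries that name a known pitfall.
function parseNudges(reviewerOutput: string): Nudge[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(reviewerOutput);
  } catch (err) {
    return []; // Malformed output yields no nudges rather than an error.
  }
  if (!Array.isArray(parsed)) return [];
  const nudges: Nudge[] = [];
  for (const item of parsed) {
    if (typeof item !== "object" || item === null) continue;
    const { pitfall, rationale } = item as Record<string, unknown>;
    if (
      typeof pitfall === "string" &&
      typeof rationale === "string" &&
      PITFALLS.indexOf(pitfall) !== -1
    ) {
      nudges.push({ pitfall: pitfall as Pitfall, rationale });
    }
  }
  return nudges;
}
```

Dropping unknown labels and malformed replies keeps a faulty reviewer response from producing spurious nudges in the chat interface.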
NOTE: Safety Nudges is currently in an alpha release, with codes for free access granted to chosen users. If you do not have a code, you can also provide your own OpenRouter API key. We encourage you to submit feedback to help us improve!
If you use Safety Nudges in your research or projects, please cite:
```
@software{safety_nudges_2026,
title = {Safety Nudges},
author = {Yadav, Chhavi* and Wedgwood, James* and Smith, Virginia},
year = {2026},
url = {https://github.com/jtbwedgwood/safety-nudges},
note = {Chrome extension for highlighting risks in AI chatbot responses in real time}
}
```
For support or to provide feedback, please reach out to [email protected].
Technical
- Version: 0.1.2
- Manifest: V3
- Size: 514 KiB
- Min Chrome: 88
- Languages: 1
- Featured: No
Metadata
- ID: hbllljbafenpnhjbljhhpjfdgpookenc
- Developer ID: ufb51b5e2a3df4733bf84aeb2e89163c4
- Developer Email: —
- Created: Apr 17, 2026
- Last Updated (Store): May 19, 2026
- Last Scraped: Jun 6, 2026
- Website: —
- Support URL: —
Data sourced from the Chrome Web Store · last verified Jun 6, 2026.