Safety Nudges

Name: Safety Nudges
Author: safety-nudges

Safety Nudges audits chatbot conversations in real time to screen for common problems.

As of June 2026, Safety Nudges has 4 users in the Education category.



Version: 0.1.2 (Manifest V3)

90-day change · In the last 90 days this extension had 2 version updates and changed its permissions.

History

6 snapshots

Tracking since Apr 19, 2026.


Date	Users	Rating	Reviews	Version
Apr 19, 2026	—	—	—	0.1.0
Apr 24, 2026	—	—	—	0.1.0
May 18, 2026	—	—	—	0.1.1
May 24, 2026	1	—	—	0.1.2
May 31, 2026	4	—	—	0.1.2
Jun 6, 2026	3	—	—	0.1.2
Now	4	—	—	0.1.2

Changelog

May 24, 2026

description

Added a citation request (with a BibTeX entry) and a support contact line at the end of the description. The four preceding paragraphs are unchanged; the full current text appears under About.

May 18, 2026

host_permissions

Before: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://openrouter.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*

After: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*

Permissions & access

Permissions: storage
Host access: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
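
To illustrate what the host access list means in practice, here is a minimal TypeScript sketch that checks whether a URL falls under one of the listed patterns. The helper `isHostAllowed` is hypothetical and deliberately simplified: it only understands patterns of the form `scheme://host/*` and is not Chrome's full match-pattern algorithm.

```typescript
// Host access list copied from the listing above.
const HOST_ACCESS: readonly string[] = [
  "https://chatgpt.com/*",
  "https://chat.openai.com/*",
  "https://claude.ai/*",
  "https://bjokhkmomdogymmmnpdo.supabase.co/*",
];

// Hypothetical helper (an assumption for illustration, not Chrome's matcher):
// it only handles patterns shaped like "<scheme>://<exact-host>/*".
function isHostAllowed(
  url: string,
  patterns: readonly string[] = HOST_ACCESS
): boolean {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }
  // Compare scheme and hostname only; the trailing "/*" accepts any path.
  const origin = `${parsed.protocol}//${parsed.hostname}`;
  return patterns.some((pattern) => {
    if (!pattern.endsWith("/*")) return false; // outside what this sketch supports
    return pattern.slice(0, -2) === origin;
  });
}
```

Under this sketch, a page such as `https://claude.ai/chat/abc` is covered, while any `https://openrouter.ai/` URL is not, because that origin is absent from the current host access list.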


About

As AI systems have become widely commercialized and integrated into consumer products, financial and market competition pressures may incentivize companies to promote AI products despite known safety risks or societal harms. Recent incidents and research have raised concerns about potential societal impacts associated with AI use, including overreliance on automated systems, excessive use, persuasive influence, social isolation, and user trust reinforced through sycophantic or overly agreeable responses from AI systems. These safety-related concerns can disproportionately affect vulnerable populations, such as older adults and children.

Because the incentives of commercial AI product developers may not always align with broader societal interests, there is a need for independent oversight mechanisms and user-facing tools that can promote awareness of potential risks.

Designed by AI safety researchers at Carnegie Mellon University, Safety Nudges provides a low-friction real-time auditing interface for chatbot conversations, restoring agency and awareness to end-users. It checks each conversation turn on ChatGPT and Claude by sending it to an external LLM for review; a principled and comprehensive taxonomy of harms is used to flag the conversation for common pitfalls like flattery, overconfidence, and anthropomorphization. Nudges are integrated gracefully into the chat interface and can be paused at any time.

NOTE: Safety Nudges is currently in an alpha release, with codes for free access granted to chosen users. If you do not have a code, you can also provide your own OpenRouter API key. We encourage you to submit feedback to help us improve!
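
The per-turn review described above could look roughly like the following TypeScript sketch. It is not taken from the Safety Nudges source: the names `AuditFlag`, `buildAuditRequest`, and `parseAuditFlags`, the prompt wording, and the three-label list (borrowed from the pitfalls named in the description) are all assumptions. The request body uses the OpenAI-style `model` plus `messages` shape that OpenRouter's chat completions endpoint accepts; no network call is made here.

```typescript
// Hypothetical types and helpers for illustration only.
type AuditFlag = "flattery" | "overconfidence" | "anthropomorphization";

const KNOWN_FLAGS: readonly AuditFlag[] = [
  "flattery",
  "overconfidence",
  "anthropomorphization",
];

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AuditRequest {
  model: string;
  messages: ChatMessage[];
}

// Build a chat-completions style body asking a reviewer model to audit one turn.
function buildAuditRequest(
  userText: string,
  assistantText: string,
  model: string
): AuditRequest {
  const system =
    "You audit chatbot replies. Answer with a JSON array containing any of: " +
    KNOWN_FLAGS.join(", ") +
    ". Answer with [] if none apply.";
  const turn = `User: ${userText}\nAssistant: ${assistantText}`;
  return {
    model,
    messages: [
      { role: "system", content: system },
      { role: "user", content: turn },
    ],
  };
}

// Parse the reviewer's reply; malformed JSON and unknown labels are dropped.
function parseAuditFlags(reply: string): AuditFlag[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(reply);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (x: unknown): x is AuditFlag =>
      typeof x === "string" &&
      (KNOWN_FLAGS as readonly string[]).indexOf(x) !== -1
  );
}
```

Keeping request building and reply parsing as pure functions means the flagging logic can be exercised without an API key; the model identifier passed in is whatever the caller chooses.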

If you use Safety Nudges in your research or projects, please cite:

```
@software{safety_nudges_2026,
  title = {Safety Nudges},
  author = {Yadav, Chhavi* and Wedgwood, James* and Smith, Virginia},
  year = {2026},
  url = {https://github.com/jtbwedgwood/safety-nudges},
  note = {Chrome extension for highlighting risks in AI chatbot responses in real time}
}
```

For support or to provide feedback, please reach out to [email protected].

Technical

Version: 0.1.2
Manifest: V3
Size: 514KiB
Min Chrome: 88
Languages: 1
Featured: No

Metadata

ID: hbllljbafenpnhjbljhhpjfdgpookenc
Developer ID: ufb51b5e2a3df4733bf84aeb2e89163c4
Developer Email: —
Created: Apr 17, 2026
Last Updated (Store): May 19, 2026
Last Scraped: Jun 6, 2026
Website: —
Support URL: —
Privacy Policy: https://github.com/jtbwedgwood/safety-nudges/blob/main/PRIVACY_POLICY.md

Data sourced from the Chrome Web Store · last verified Jun 6, 2026.