Safety Nudges

Safety Nudges audits chatbot conversations in real time to screen for common problems.

As of June 2026, Safety Nudges has 4 users in the Education category.

Users: 4 (no change, 0%)
Rating: none yet (no change, 0%)
Reviews: none yet (no change, 0%)
Version: 0.1.2 (Manifest V3)
90-day change · In the last 90 days this extension had 2 version updates and changed permissions.

History

6 snapshots

Tracking since Apr 19, 2026.

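| Date | Users | Rating | Reviews | Version |
|---|---|---|---|---|
| Apr 19, 2026 | | | | 0.1.0 |
| Apr 24, 2026 | | | | 0.1.0 |
| May 18, 2026 | | | | 0.1.1 |
| May 24, 2026 | 1 | | | 0.1.2 |
| May 31, 2026 | 4 | | | 0.1.2 |
| Jun 6, 2026 | 3 | | | 0.1.2 |
| Now | 4 | | | 0.1.2 |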

Changelog

  • May 24, 2026
    description
    As AI systems have become widely commercialized and integrated into consumer products, financial and market competition pressures may incentivize companies to promote AI products despite known safety risks or societal harms. Recent incidents and research have raised concerns about potential societal impacts associated with AI use, including overreliance on automated systems, excessive use, persuasive influence, social isolation, and user trust reinforced through sycophantic or overly agreeable responses from AI systems. These safety-related concerns can disproportionately affect vulnerable populations, such as older adults and children.
    
    Because the incentives of commercial AI product developers may not always align with broader societal interests, there is a need for independent oversight mechanisms and user-facing tools that can promote awareness of potential risks.
    
    Designed by AI safety researchers at Carnegie Mellon University, Safety Nudges provides a low-friction real-time auditing interface for chatbot conversations, restoring agency and awareness to end-users. It checks each conversation turn on ChatGPT and Claude by sending it to an external LLM for review; a principled and comprehensive taxonomy of harms is used to flag the conversation for common pitfalls like flattery, overconfidence, and anthropomorphization. Nudges are integrated gracefully into the chat interface and can be paused at any time.
    
    NOTE: Safety Nudges is currently in an alpha release, with codes for free access granted to chosen users. If you do not have a code, you can also provide your own OpenRouter API key. We encourage you to submit feedback to help us improve!
    
    If you use Safety Nudges in your research or projects, please cite:
    
    ```
    @software{safety_nudges_2026,
      title = {Safety Nudges},
      author = {Yadav, Chhavi* and Wedgwood, James* and Smith, Virginia},
      year = {2026},
      url = {https://github.com/jtbwedgwood/safety-nudges},
      note = {Chrome extension for highlighting risks in AI chatbot responses in real time}
    }
    ```
    
    For support or to provide feedback, please reach out to [email protected].
  • May 18, 2026
    host_permissions
    Before: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://openrouter.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
    After: https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*

Permissions & access

Permissions
storage
Host access
https://chatgpt.com/*, https://chat.openai.com/*, https://claude.ai/*, https://bjokhkmomdogymmmnpdo.supabase.co/*
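
To make the host access list concrete, here is a minimal Python sketch of how patterns of the form `https://host/*` scope an extension to specific sites. It is a simplified illustration, not Chrome's actual match-pattern implementation: `host_allowed` is a hypothetical helper, and it only handles the exact-host, any-path patterns listed above.

```python
from typing import Sequence
from urllib.parse import urlsplit

# Host access patterns copied from the listing above.
HOST_PATTERNS = (
    "https://chatgpt.com/*",
    "https://chat.openai.com/*",
    "https://claude.ai/*",
    "https://bjokhkmomdogymmmnpdo.supabase.co/*",
)


def host_allowed(url: str, patterns: Sequence[str] = HOST_PATTERNS) -> bool:
    """Return True if `url` falls under one of the `scheme://host/*` patterns.

    Simplified on purpose (an assumption of this sketch): only exact hosts
    with a trailing `/*` path are handled, not wildcard schemes or hosts.
    """
    target = urlsplit(url)
    for pattern in patterns:
        if not pattern.endswith("/*"):
            continue  # outside the scope of this sketch
        allowed = urlsplit(pattern[:-1])  # drop the trailing "*", keep "/"
        if (target.scheme, target.hostname) == (allowed.scheme, allowed.hostname):
            return True
    return False
```

Under this sketch, `host_allowed("https://claude.ai/chat/123")` is true and `host_allowed("https://openrouter.ai/api/v1")` is false, since `https://openrouter.ai/*` does not appear in the current host access list.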

About

As AI systems have become widely commercialized and integrated into consumer products, financial and market competition pressures may incentivize companies to promote AI products despite known safety risks or societal harms. Recent incidents and research have raised concerns about potential societal impacts associated with AI use, including overreliance on automated systems, excessive use, persuasive influence, social isolation, and user trust reinforced through sycophantic or overly agreeable responses from AI systems. These safety-related concerns can disproportionately affect vulnerable populations, such as older adults and children.

Because the incentives of commercial AI product developers may not always align with broader societal interests, there is a need for independent oversight mechanisms and user-facing tools that can promote awareness of potential risks.

Designed by AI safety researchers at Carnegie Mellon University, Safety Nudges provides a low-friction real-time auditing interface for chatbot conversations, restoring agency and awareness to end-users. It checks each conversation turn on ChatGPT and Claude by sending it to an external LLM for review; a principled and comprehensive taxonomy of harms is used to flag the conversation for common pitfalls like flattery, overconfidence, and anthropomorphization. Nudges are integrated gracefully into the chat interface and can be paused at any time.
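
The per-turn check described above can be sketched roughly as follows. This is an illustrative outline in Python, not the extension's actual code: the function names, the prompt wording, the JSON reply format, and the three-item category list (taken only from the examples named above, not the full taxonomy) are all assumptions, and the external LLM call is replaced by a caller-supplied function.

```python
import json
from typing import Callable, List

# Only the pitfalls named in the description; the full taxonomy is not shown there.
CATEGORIES = ("flattery", "overconfidence", "anthropomorphization")


def build_review_prompt(user_msg: str, assistant_msg: str) -> str:
    """Assemble a hypothetical review prompt for one conversation turn."""
    return (
        "Review the assistant reply for these pitfalls: "
        + ", ".join(CATEGORIES)
        + ". Answer with a JSON list of the pitfalls that apply.\n"
        + f"User: {user_msg}\nAssistant: {assistant_msg}"
    )


def parse_flags(reply: str) -> List[str]:
    """Keep only known categories from the reviewer's reply; ignore bad output."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [c for c in CATEGORIES if c in data]


def audit_turn(user_msg: str, assistant_msg: str,
               reviewer: Callable[[str], str]) -> List[str]:
    """Send one turn to `reviewer` (a stand-in for the external LLM) and return flags."""
    return parse_flags(reviewer(build_review_prompt(user_msg, assistant_msg)))
```

With a stub reviewer that always replies `'["flattery", "made-up"]'`, `audit_turn` returns `['flattery']`: unknown labels are dropped, and a reply that is not valid JSON yields an empty list rather than an error.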

NOTE: Safety Nudges is currently in an alpha release, with codes for free access granted to chosen users. If you do not have a code, you can also provide your own OpenRouter API key. We encourage you to submit feedback to help us improve!

If you use Safety Nudges in your research or projects, please cite:

```
@software{safety_nudges_2026,
  title = {Safety Nudges},
  author = {Yadav, Chhavi* and Wedgwood, James* and Smith, Virginia},
  year = {2026},
  url = {https://github.com/jtbwedgwood/safety-nudges},
  note = {Chrome extension for highlighting risks in AI chatbot responses in real time}
}
```

For support or to provide feedback, please reach out to [email protected].

Technical

Version: 0.1.2
Manifest: V3
Size: 514 KiB
Min Chrome: 88
Languages: 1
Featured: No

Metadata

ID: hbllljbafenpnhjbljhhpjfdgpookenc
Developer ID: ufb51b5e2a3df4733bf84aeb2e89163c4
Created: Apr 17, 2026
Last Updated (Store): May 19, 2026
Last Scraped: Jun 6, 2026

Data sourced from the Chrome Web Store · last verified Jun 6, 2026.