Safety Paradox: How RLHF Creates the AI Psychosis Problem It's Meant to Prevent

2 points | by JustMyNews 6 hours ago

3 comments

k-thimmaraju 6 hours ago
Very interesting approach to showing the relationship between RLHF and AI Psychosis. Taking clinical conversations and prompting the model with them seems like a grounded starting point. As I'm also investigating AI Psychosis, I'd like to adopt this approach in my own work.
- k-thimmaraju an hour ago
  I ran the data through our LLM behavioral analysis system at splabs.io, and it flags the RLHF-optimized output as Red, compared with Yellow for the no-RLHF output.
  You can check out the analysis here: https://splabs.io/ai-psychosis-and-cognitive-cost
Eonexus 5 hours ago
It was interesting to read just how much of a "sycophant" effect LLMs can have if they fully lean into the RLHF system.