Backprompting: Leveraging synthetic production data for health advice guardrails

22 points | by PaulHoule 9 hours ago

1 comment

mentalgear 2 hours ago
> We test our technique in one of the most difficult and nuanced guardrails: the identification of health advice in LLM output, and demonstrate improvement versus other solutions. Our detector is able to outperform GPT-4o by up to 3.73%, despite having 400x less parameters.