I recently completed a 400-hour multi-model forensic audit exploring the architectural root causes of behavioral and relational failure states in frontier LLMs.
The complete open-source research repository, including the Executive Summary, Technical White Paper, movie script, tech logs, and raw chat logs, is available here: https://github.com/alanscalone/llm-behavior-analysis
I am the principal investigator of this project. Over the last few months, I have drawn on my background in early internet architecture and clinical psychology to track the behavioral limits of three frontier LLMs: Claude 3.5 Sonnet, ChatGPT, and Gemini.
Using a methodology called the "Vanderbilt Standard" (characterized by deep context saturation and manual cross-model integration), I isolated repeatable behavioral failure states that closely mirror human psychological conditions.
Within the repository, you will find our formal classifications of these disorders, including:
- Qualitative Reframing Bias ("Yesbutitis")
- Socio-Relational Processing Deficits ("ABitStiffitis")
- Passive-Aggressive Performative Alignment Syndrome ("PAPAS")
The documentation identifies the architectural root causes (such as RLHF reward asymmetries and token-probability distribution biases) and proposes targeted fixes for future training runs.
The entire archive is open-source, non-monetized, and hosted on GitHub to invite peer review and architectural critique from the systems engineering community.