Raw LLM Responses
Inspect the exact model output for any coded comment.
Comment
My impression has been that Opus is actually much worse at coding than Sonnet. It overcomplicates everything, overgeneralizes simple discrete tasks, creates more bugs, and really does nothing better except cost more money, so idiots and Anthropic bots promote it the hardest. Like bear boxes in national parks, any heuristic you develop to defeat bots will also defeat the dumbest quintile of humanity…
My impression was also that 4.5 was moderately better than 3.7 at remaining “on track” with more complex tasks and managing context rot. Similar to the incremental change from 3.5 to 3.7. I do think Claude 3.5 was a real step change forward. LLMs did not impress me much before that. Perhaps due to my own ignorance of NLP, I didn’t foresee the evolution from the initial release of ChatGPT to the near-future applications of semantic search and RAG.
Like you, I am skeptical of the integrity of anyone who acts as if we’re not *deep* into the curve of diminishing returns with blindly scaling the current system architecture of LLMs. I have not met anyone IRL who thinks 4.5 was anything more than incremental progress, or that 4.6 was noticeably different in any way. Perhaps now that Claude Code’s embedded system prompts have been optimized for Claude 4.6, then 4.5 will seem worse if you downgraded, but judging model performance by the quality of prompt engineering is a category error in my book.
That being said: to play devil’s advocate, perhaps there are people who know less than I did, who are genuinely impressed by the latest models, that they are now able to use to do new things. Perhaps it’s not the model capabilities that upgraded, but the user’s capabilities.
This is all moving very fast. It’s genuinely exciting. Anyone who starts building applications with AI today is still an early adopter. Skepticism is warranted IMO. Crypto burned a lot of people, the neutered chat interfaces are utter garbage, and OpenAI and Anthropic are SoftBank-level financial dumpster fire
reddit
AI Jobs
1774268079.0
♥ 5
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | company |
| Reasoning | deontological |
| Policy | none |
| Emotion | outrage |
| Coded at | 2026-04-25T08:33:43.502452 |
Raw LLM Response
[
{"id":"rdc_obw7q2f","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
{"id":"rdc_obvmgt7","responsibility":"ai_itself","reasoning":"mixed","policy":"none","emotion":"mixed"},
{"id":"rdc_oc0674s","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
{"id":"rdc_obv69wi","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"},
{"id":"rdc_obv8w5z","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"}
]
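The raw response above is a batch: one JSON array with one record per comment, each carrying the same four dimensions shown in the Coding Result table. As a minimal sketch of cross-checking the table against the raw output, the snippet below parses the array, indexes it by `id`, and lists the records whose values equal the table row. The names `RAW_RESPONSE`, `DIMENSIONS`, `index_by_id`, and `find_matching` are hypothetical and not part of the tool. The page does not state the ID of the displayed comment, so a match here only means the values agree, not that the ID is confirmed.

```python
import json

# Raw batch response, copied verbatim from the section above.
RAW_RESPONSE = """
[
{"id":"rdc_obw7q2f","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
{"id":"rdc_obvmgt7","responsibility":"ai_itself","reasoning":"mixed","policy":"none","emotion":"mixed"},
{"id":"rdc_oc0674s","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
{"id":"rdc_obv69wi","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"},
{"id":"rdc_obv8w5z","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"}
]
"""

# The four coded dimensions that appear in the Coding Result table.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def index_by_id(raw: str) -> dict:
    """Parse the JSON array and key each record's dimensions by comment ID."""
    records = json.loads(raw)
    return {rec["id"]: {dim: rec[dim] for dim in DIMENSIONS} for rec in records}


def find_matching(coded: dict, expected: dict) -> list:
    """Return the IDs whose coded dimensions all equal the expected values."""
    return [cid for cid, dims in coded.items() if dims == expected]


coded = index_by_id(RAW_RESPONSE)

# Values taken from the Coding Result table ("Coded at" is not in the raw output).
expected = {
    "responsibility": "company",
    "reasoning": "deontological",
    "policy": "none",
    "emotion": "outrage",
}

matches = find_matching(coded, expected)
print(matches)  # → ['rdc_oc0674s']
```

Only one record in this batch agrees with the table on all four dimensions, which suggests (but does not prove) that it is the entry for the comment shown above.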