Raw LLM Responses
Inspect the exact model output for any coded comment.
Comment
So what I would recommend is actually rerunning this test, but don't use a greenfield project. Rerun it with something that is well established. Maybe you have some hobby project that you gave up on because you never implemented feature XYZ. Try to get the LLM to implement that feature.
Personally, what I have found is that the agents are okay to good for greenfield projects and for getting the basic scaffolding out of the way, which is essentially a time sink and not actually a hard problem. Once I've got that established, the agents very rapidly become entirely useless. It starts taking more time and effort to get them to do the right thing than to just do it myself. It's essentially like having the worst junior in the world. But at least a junior learns and becomes better over time, while the agent is just stagnant and makes the same mistake on every single project in the future.
I have some code bases that are half a million lines plus, and getting the agent to add even the smallest of functions won't work. It will often break the formatting of the file itself.
Later on in the video you mentioned code reuse. LLMs are not capable of it. I have had prompts where, within the same prompt and within the same file, there would be two functions right next to each other and they would be identical. They just had slightly different names because they were used in different contexts. Not to say that humans are incapable of doing the same thing; I see it all the time, and across our company we often repeat code in different projects. But this repeated pattern I see in the LLMs is exceptionally hard, even impossible, to properly maintain.
youtube
AI Jobs
2026-01-19T15:1…
♥ 33
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | none |
| Reasoning | consequentialist |
| Policy | none |
| Emotion | approval |
| Coded at | 2026-04-27T06:24:59.937377 |
Raw LLM Response
[{"id":"ytc_UgwTHACpuNH3alMZ-At4AaABAg","responsibility":"developer","reasoning":"deontological","policy":"none","emotion":"indifference"},{"id":"ytc_UgyBCSaYB73YLKszb_d4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},{"id":"ytc_UgyvruCjW_584YguWzR4AaABAg","responsibility":"developer","reasoning":"deontological","policy":"none","emotion":"outrage"},{"id":"ytc_UgxPiP0ljQ6nDspT13h4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},{"id":"ytc_UgwwfLE4DTHG6aKqYM14AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},{"id":"ytc_UgyO2So829NZqsYinYB4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"indifference"},{"id":"ytc_UgxbxoDntkTRGrw-lkp4AaABAg","responsibility":"user","reasoning":"deontological","policy":"none","emotion":"indifference"},{"id":"ytc_UgxvMpPopnv6lGrp3Ah4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},{"id":"ytc_UgwdSCD56Lxs9Dtfzwt4AaABAg","responsibility":"government","reasoning":"consequentialist","policy":"regulate","emotion":"fear"},{"id":"ytc_UgzpHMTXZEUlnjrN1Jt4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"mixed"}]
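The raw response above is a JSON array with one object per coded comment, keyed by `id`. As a minimal sketch of how a single comment's coding could be pulled out of such a batch, assuming the response always parses as valid JSON: the helper name `find_coding` and the `DIMENSIONS` tuple are hypothetical and not part of the tool, and the sample string holds two records copied from the response above.

```python
import json
from typing import Optional

# The four coded dimensions that appear in each record of the raw response.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def find_coding(raw_response: str, comment_id: str) -> Optional[dict]:
    """Return the coded dimensions for one comment ID, or None if it is absent.

    Hypothetical helper: assumes raw_response is a JSON array of objects,
    each carrying an "id" field plus the four dimension fields.
    """
    records = json.loads(raw_response)
    for record in records:
        if record.get("id") == comment_id:
            return {dim: record.get(dim) for dim in DIMENSIONS}
    return None


# Two records copied from the raw response, used here as sample input.
sample = (
    '[{"id":"ytc_UgwTHACpuNH3alMZ-At4AaABAg","responsibility":"developer",'
    '"reasoning":"deontological","policy":"none","emotion":"indifference"},'
    '{"id":"ytc_UgwdSCD56Lxs9Dtfzwt4AaABAg","responsibility":"government",'
    '"reasoning":"consequentialist","policy":"regulate","emotion":"fear"}]'
)

coding = find_coding(sample, "ytc_UgwdSCD56Lxs9Dtfzwt4AaABAg")
print(coding)
```

Returning `None` for an unknown ID, rather than raising, makes it easy to spot a comment that was sent in a batch but is missing from the model's output.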