Raw LLM Responses
Inspect the exact model output for any coded comment.
Comment
AI Risk expert here. It's unfortunately a bit worse than that.
If you start with the assumption that a superintelligent AI will learn from data that does not contain any examples of AI being the enemy of humanity, that does not change the fact that it will still want* to accomplish whatever its goals are (which we did not get to robustly and precisely set in advance). Almost no matter what those goals are, when pursued with superhuman effectiveness, anything that is not directly implicit in those goals will be sacrificed for number-go-up on its favorite thing. This is known to be true for current, toy-model machine learning systems: Any parameter that is not specified to be within a specific range will be set to extreme values in order to boost reward by a tiny fraction of a percent. And we don't have any realistic way to get human values into an AI system.
When you look at instrumentally useful steps toward takeover, like deception, the same holds true: If we assume a superintelligence is trained without any examples of deception, it will still independently discover deception as a useful strategy. Even humans can do that.
*"want" is anthropomorphic language that here means something like "contain a preference ordering in its action policy". Anything that behaves as if it has a goal comes under the sway of things like convergent instrumental goals (such as self-preservation and resource acquisition) and the principle of the orthogonality of intelligence and goals (there are no stupid end goals, just stupid ways to achieve them).
youtube
AI Governance
2025-08-27T07:4…
♥ 4
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | ai_itself |
| Reasoning | consequentialist |
| Policy | unclear |
| Emotion | fear |
| Coded at | 2026-04-27T06:24:59.937377 |
Raw LLM Response
[
{"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
{"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMNyVp0rjJM","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"},
{"id":"ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMJL24pN8TG","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
{"id":"ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMNwVYNoUgv","responsibility":"none","reasoning":"mixed","policy":"none","emotion":"resignation"},
{"id":"ytr_UgzxhyJPjFMsVi_d8wx4AaABAg.AMIhqBOAtBRAMIjv6JBEN7","responsibility":"developer","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
{"id":"ytr_UgyyL7XV_MC4trFI6aV4AaABAg.AMIguPtLmJuANYjJ5A_u99","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"},
{"id":"ytr_UgxctM15P1ZgsaHX4LV4AaABAg.AMIeP8YK4mIAMIlxAEmQh4","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"mixed"},
{"id":"ytr_UgwG6Iv7Xr-9JuDYNxx4AaABAg.AMIdteL9JxmAMIgNHX0ab1","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"resignation"},
{"id":"ytr_UgwJ4nifqmzvJuYoXj94AaABAg.AMIdhKGxOgYAMSc9sARn7t","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
{"id":"ytr_Ugx5YCRGCoCkjdOM2m14AaABAg.AMId3fhlf7CAMK71jVy4zP","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"}
]
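The raw response is a JSON array with one record per comment, each carrying an `id` plus the four coded dimensions. As a minimal sketch of how such a batch could be inspected, the snippet below parses the array, indexes it by comment ID, and tallies one dimension. The helper names `index_by_id` and `tally` and the `DIMENSIONS` tuple are hypothetical and chosen for illustration; only the field names and the two sample records come from the response above.

```python
import json
from collections import Counter

# Two records copied verbatim from the raw response; the rest are omitted for brevity.
RAW_RESPONSE = """[
{"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
{"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMNyVp0rjJM","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"}
]"""

# Field names as they appear in the response (assumed to be the full coding scheme).
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def index_by_id(raw):
    """Parse the JSON array and map each comment ID to its coded dimensions."""
    records = json.loads(raw)
    return {r["id"]: {d: r[d] for d in DIMENSIONS} for r in records}


def tally(coded, dimension):
    """Count how often each value of one dimension occurs in the batch."""
    return Counter(codes[dimension] for codes in coded.values())


coded = index_by_id(RAW_RESPONSE)
first_id = "ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj"
print(coded[first_id])
print(tally(coded, "emotion"))
```

A dictionary keyed by `id` keeps lookups simple when checking a single comment's coding against the batch it was returned in. A record missing one of the four fields would raise a `KeyError` here, which may be the desired behaviour when validating model output.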