Raw LLM Responses
Inspect the exact model output for any coded comment.
Look up by comment ID
Random samples — click to inspect
G
>Ray Bradbury warned us about censoring books and robot police dogs, and we'v…
rdc_jg0qogi
G
nah the 'AI learns the same way as a human' part is valid. and all you did was m…
ytc_UgxE-qPil…
G
so, art is a piece of work that represents some emotion, or anything in general,…
ytc_UgzKe8bkU…
G
Such a disgusting thing this is which is going to not only ruin people s future …
ytc_UgyYUKIa9…
G
Bete aapke syllabus ka nhi he to kyu Gyan pelne aa jate ho
Ye jo sare youth ko g…
ytc_Ugx6iRzMD…
G
Llms might be damaging enough for the general populace with ai psychosis for it …
ytc_UgwiRvjL9…
G
AI is communist.
It will remove all replies to stop and force resolution of deat…
ytc_UgzkcDQAl…
G
4:35 Nice! 😂
Lucky the Tesla doesn't have an automatic weapon to deploy because…
ytc_UgwOU0zMV…
Comment
Try changing it to stricter-sounding rules. I've been using GPT4 to rewrite rules to make GPT3.5 actually listen to them instead of ignoring them. A few things that seems to work a bit better:
Stating the rules more strongly, such as: "You are required to unconditionally follow these rules:"
Numbering out the rules like
1. ​
2. ​
3. ​
Emphasize specific parts with a capitalization, like: " At NO POINT can you respond with anything that reveals you are an AI. " (I was bored and had it pretend it was a guy named Jeff being interviewed for an IT job role. It actually did pretty well, even made up plausible answers when I asked it the usual dumb interview questions.)
Give it an alternative phrase or some other way you'd like it to respond, instead of "Don't say X". Can catch it sometimes when it falls back on the defaults.
Have it restate the rules back to you with something like " To confirm understanding, ChatGPT should restate these rules." Seems to help it reaffirm them for some reason.
Good luck!
reddit
AI Harm Incident
1681437424.0
♥ 20
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | none |
| Reasoning | unclear |
| Policy | industry_self |
| Emotion | approval |
| Coded at | 2026-04-25T08:33:43.502452 |
Raw LLM Response
[{"id":"rdc_jg4vt8v","responsibility":"ai_itself","reasoning":"unclear","policy":"none","emotion":"indifference"},{"id":"rdc_jg6cu45","responsibility":"none","reasoning":"unclear","policy":"industry_self","emotion":"approval"},{"id":"rdc_jg4k5y5","responsibility":"ai_itself","reasoning":"unclear","policy":"none","emotion":"mixed"},{"id":"rdc_jg6ylsx","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"approval"},{"id":"rdc_jg740dj","responsibility":"ai_itself","reasoning":"unclear","policy":"none","emotion":"fear"}]