Mastodon Feed: Post

Boosted by baldur@toot.cafe ("Baldur Bjarnason"):
dingemansemark@scholar.social ("Mark Dingemanse") wrote:

I was supposed to finish this last week but then the #Claude Code leak happened, promptly giving me an excellent opening example (h/t @jonny for their digital archaeology work that drew my attention to the magic prompting techniques)

(I think it is likely btw that #Anthropic shifted the #Mythos announce forward to this week to bury the leak & its security implications)

In March 2026, an Anthropic employee released the source code of Claude Code, a wrapper around their large language model that is widely used to generate code in programming tasks. Its thousands of lines of Typescript code contained many hopeful prompts and incantations to shape Claude’s behaviour. Here are some examples: “Report outcomes faithfully”; “Never characterize incomplete or broken work as done”; “Be careful not to introduce security vulnerabilities” (prompts.ts in Anthropic 2026). There is more than a passing resemblance here to the Azande witch-doctor apprentice who, while stirring the medicine, utters: “You medicine which I am cooking, mind you always speak the truth to me. Do not let anyone injure me with his witchcraft, but let me recognize all witches. … Let me be expert at the witch-doctor’s craft so that people will give me many spears on account of my magic.” (Evans-Pritchard 1937: 93). In the case of Claude, the incantations appeared insufficient: analysis of the codebase, which according to a company executive was “pretty much 100% written by Claude Code”, revealed severe security vulnerabilities (Townsend 2026).