we will have to alias all these into a module shrimptools.exe and then make it callable where calling it just executes a random one because i think the world needs more code in it that looks like shrimptools.exe()
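
(a minimal sketch of what that could look like in Python. the thread doesn't name the tools being aliased here, so `peel`, `devein` and `fry` are made-up placeholders, and `_Exe` is just one hypothetical way to get a callable `exe` that also carries the aliases as attributes)

```python
import random
import types


# Hypothetical stand-ins for the tools being aliased; not from the thread.
def peel():
    return "peeled"


def devein():
    return "deveined"


def fry():
    return "fried"


class _Exe:
    """Holds the aliased tools as attributes; calling it runs a random one."""

    def __init__(self, *tools):
        self._tools = list(tools)
        for tool in tools:
            # Expose each tool under its own name, e.g. shrimptools.exe.fry()
            setattr(self, tool.__name__, tool)

    def __call__(self, *args, **kwargs):
        # Pick one of the registered tools at random and execute it.
        return random.choice(self._tools)(*args, **kwargs)


# A namespace standing in for a real `shrimptools` module.
shrimptools = types.SimpleNamespace(exe=_Exe(peel, devein, fry))

result = shrimptools.exe()  # runs peel, devein, or fry; you don't get to know which
```

in a real package `exe` would live in `shrimptools/__init__.py` rather than a `SimpleNamespace`, but the call site looks the same: `shrimptools.exe()`.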

i wonder if the LLMs are susceptible to old-style language model attacks. i wonder whether, if you created enough training instances of a very unique phrase like shrimptools.exe() in the context of a bunch of example code based on tools/key phrases that are individually common but combinatorially rare within a popular LLM code generation domain like web tech, you could get the LLMs to occasionally try to import and execute shrimptools.exe(). that way you make a sleeper vuln that acts as a mine in the latent space: one day the odds are not zero that you will wake up and have already executed shrimptools.exe()

@jonny If I recall correctly, some infosec folks have already successfully demonstrated such an attack on LLMs (this is distinct from the "register packages with commonly-LLM-fabricated names" attack)

@joepie91 see that's the kind of "it must necessarily be the case based on their nature but it is so obvious and funny that it can't be real" vuln i love to see

@jonny "Surely they would've thought of this? Right? RIGHT?"

(This is the theme song that plays in my mind half the time I'm doing code auditing for work)
