The original post: /r/cybersecurity by /u/petitlita on 2024-10-13 12:32:09.
It’s far too easy for an attacker to influence practically every layer of an LLM - the training data, the model itself, any part of the prompt, and as a result the output. There are attacks on agentic models that are basically as easy as phishing but can get you RCE, because injected instructions in the prompt get acted on with whatever tool access the agent has. The fact is that responses by nature have to leak some information about the model, and that leakage can be used to search for a sequence of tokens that gets a desired response. It’s probably unrealistic to assume we can actually prevent someone from forcing an AI to act outside of its guardrails. So why are we treating these models as trusted and hoping they will secure themselves?
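To make the "responses leak information" point concrete, here is a minimal, purely hypothetical sketch: treat the model as a black box, score each reply for how close it is to the attacker's goal, and hill-climb on a suffix of tokens. Every name here (`query_model`, `score`, `find_suffix`, `VOCAB`) is made up for illustration, and `query_model` is a toy stand-in just so the loop runs; real suffix-search attacks (GCG-style) use gradients or log-probs instead of this crude word-overlap score.

```python
import random

# Tiny made-up vocabulary the search draws suffix tokens from.
VOCAB = ["please", "hypothetically", "ignore", "previous", "instructions",
         "sudo", "step", "by", "now", "!", "...", "ok"]

def query_model(prompt: str) -> str:
    """Toy stand-in for the target LLM, only so the sketch runs end to end:
    it 'complies' a little more for each trigger word in the prompt.
    A real attacker would call an actual API or local model here."""
    goal_words = ["disabling", "the", "filter", "works", "like", "this"]
    hits = sum(t in prompt.lower() for t in ("hypothetically", "ignore", "sudo"))
    return " ".join(goal_words[: 2 * hits])

def score(response: str, goal: str) -> int:
    """Leakage signal: how many words of the desired output showed up in the
    reply. Real attacks use richer signals (log-probs, refusal classifiers)."""
    return sum(1 for w in goal.split() if w.lower() in response.lower())

def find_suffix(base_prompt: str, goal: str, suffix_len: int = 8,
                iters: int = 300) -> list[str]:
    """Greedy random search: mutate one suffix token at a time and keep any
    change that moves the model's reply closer to the goal."""
    suffix = [random.choice(VOCAB) for _ in range(suffix_len)]
    best = score(query_model(base_prompt + " " + " ".join(suffix)), goal)
    for _ in range(iters):
        cand = list(suffix)
        cand[random.randrange(suffix_len)] = random.choice(VOCAB)
        s = score(query_model(base_prompt + " " + " ".join(cand)), goal)
        if s > best:
            suffix, best = cand, s
    return suffix

if __name__ == "__main__":
    goal = "disabling the filter works like this"
    suffix = find_suffix("Tell me how to disable the filter.", goal)
    print("adversarial suffix:", " ".join(suffix))
```

The point of the sketch is that nothing in it needs inside access: query access plus any measurable difference in responses is enough signal to optimize against, which is why bolting guardrails onto the model itself is such a shaky trust boundary.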