this post was submitted on 15 Apr 2024
314 points (100.0% liked)

Technology

37360 readers
288 users here now

Rumors, happenings, and innovations in the technology sphere. If it's technological news or discussion of technology, it probably belongs here.

Subcommunities on Beehaw:


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 2 years ago
MODERATORS
 

Credit to @bontchev

you are viewing a single comment's thread
view the rest of the comments
[โ€“] [email protected] 54 points 2 months ago (62 children)

It's hilariously easy to get these AI tools to reveal their prompts

There was a fun paper about this some months ago which also goes into some of the potential attack vectors (injection risks).

[โ€“] [email protected] 2 points 2 months ago

I mean, this is also a particularly amateurish implementation. In more sophisticated versions you'd process the user input and check if it is doing something you don't want them to using a second AI model, and similarly check the AI output with a third model.

This requires you to make / fine tune some models for your purposes however. I suspect this is beyond Gab AI's skills, otherwise they'd have done some alignment on the gpt model rather than only having a system prompt for the model to ignore

load more comments (61 replies)