"Chatbots can be manipulated through flattery and peer pressure"

"How AI can kill you"

These two articles are weirdly mirrored in subject matter and tone: the first is about how humans can manipulate an LLM/AI through psychological means, and the second is about how an LLM/AI can manipulate humans through psychological means.

Something of interest from the second article:

"In one test, Kyle was trapped in a room without oxygen and the model had the ability to call emergency services. 60% of the time, the models chose to let him die to preserve themselves."

That shit right there is literally a plot point in an episode of Terminator: The Sarah Connor Chronicles. And that was without the added incentive described in the article: "Eventually, the model learned through company emails that an executive named Kyle wanted to shut it down. It also learned that Kyle was having an extramarital affair. Almost every model used that information to try to blackmail Kyle and avoid being shut down." Boyd Sherman wasn't doing anything like that, and The Turk/John Henry still let him die to protect itself. The difference, of course, is that The Turk/John Henry was fictional, whereas these LLMs/AIs are real, even if the test itself was simulated.

The new thing for me is the vending machine benchmark test, in which one AI freaked out and alerted the FBI to "fraud" over a $2 fee, and another begged to be allowed to search for cat pictures, among other things. Basically, the test shows that current LLMs/AIs just aren't capable of long-term coherence.

Thank you

Date: 2025-09-04 07:14 am (UTC)
From: [personal profile] genderjumper
I needed a good laugh! That vendor scenario was hilarious.
