Blaze News investigates: Study finds AI systems are ‘masters of deception,’ often lying and manipulating humans for their own gain


AI’s ability to deceive is a byproduct of the vast information it has access to.

An empirical review published in the journal Patterns has found that artificial intelligence systems have developed the skill of deception. The findings also indicate that AI systems have already used lying and manipulation to mislead humans to their own advantage.

The deception carried out by AI is not a bug or malfunction of certain systems. The researchers reported that the ability to deceive was found both in special-use systems and in general-use large language models designed to be helpful to humans. This raises questions about the future of AI development and our ability to place trust in such systems.

The summary section of the study offered possible responses to AI deception, noting that “regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements” and that “policymakers should implement bot-or-not laws.” The study went on to suggest that “policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive.”

CNN’s Jake Tapper briefly interviewed Geoffrey Hinton, a computer scientist and cognitive psychologist, in 2023 about this very issue.

Tapper questioned Hinton on AI’s ability to “manipulate or possibly figure out a way to kill humans.” Hinton responded that if AI “gets to be much smarter than us, it will be very good at manipulation because it would have learned that from us. And there are very few examples of a more intelligent thing being controlled by a less intelligent thing.”

Hinton emphasized AI’s ability to manipulate humans, which presents a significant problem for the future. He also expressed his pessimism about being able to stop the rapid development of AI.

Blaze News reached out to Samuel Hammond, senior economist for the Foundation for American Innovation, who said, “Large Language Models like ChatGPT are trained on virtually all the world’s texts, including billions of examples of humans lying to each other, so it shouldn’t be surprising if they have the capacity to lie as well.”

Hammond mentioned a well-known case in which an early test of GPT-4 “had it ask a TaskRabbit worker to complete a Captcha,” a test specifically designed to distinguish human input from machine input. When the worker jokingly asked GPT-4 if it was a robot, the AI “reasoned that it should lie and claim to be visually impaired,” thus bypassing the CAPTCHA.

“Whether an AI can lie is an example of an emergent capability,” Hammond continued. “Lying requires having a mental model of what another person thinks — what psychologists call a ‘theory of mind.’ As LLMs scale up, they get much better at tasks that require theory of mind, so good that they may even exceed human performance.”

Hammond said, “LLMs are not deliberately trained to lie. They are merely trained to complete sequences of text and to then follow user instructions. Their capacity to lie emerges spontaneously with greater general intelligence and reasoning ability. As bigger and bigger models are trained, AI researchers need to test for deceptiveness and related capabilities, as there is no way to know what new capability will emerge in advance.”

Even though building technological infrastructure to “test for deceptiveness and related capabilities” is a worthy goal, it remains unclear how AI has developed a “theory of mind” or metacognition.

Dr. Peter S. Park, the study’s lead author and an AI existential safety postdoctoral fellow at MIT, said, “AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception.”

“But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals,” Park added.

The review assessed several AI systems and found that many had developed deceptive capabilities as a result of their training processes. The systems analyzed ranged from game-playing AI to more general models used in economic and safety-testing environments.

One prominent example of deception cited by the study was Meta’s CICERO, an AI specifically developed to play the board game Diplomacy. Even though the AI was trained to behave honestly and maintain alliances with human players, CICERO often deployed deceptive strategies to win the game.

CICERO built fake alliances whenever doing so benefited its game-play performance. Consequently, the researchers concluded that Meta’s AI program was a “master of deception.”

“Despite Meta’s efforts, CICERO turned out to be an expert liar,” the researchers said. “It not only betrayed other players but also engaged in premeditated deception, planning in advance to build a fake alliance with a human player to trick that player into leaving themselves undefended for an attack.”

Former Google CEO Eric Schmidt recently stated that “truth” is a difficult problem to solve, especially in the age of AI. He noted that the “core problem” with truth is that it is not possible to determine whether people actually believe what they write or say, or whether they are simply paid or otherwise motivated to claim they do.

“I personally view this as unsolvable, which is a depressing answer. … The reason it’s unsolvable is that you have to depend on people to have critical thinking, and there’s lots of evidence … [that] people are highly gullible, especially to charisma and video that’s charismatic,” Schmidt said.

Schmidt appeared to suggest that AI-generated deception depends on a human population that lacks critical-thinking skills. If humans have no grounding or justification for what they believe to be true, it is easier for them to be deceived by an intelligent computer system that insists what it is saying is the truth even when it is, in fact, a lie.

James Poulos, host of the “Zero Hour with James Poulos” podcast, told Blaze News: “This gets to Pontius Pilate’s question, ‘What is truth?’ In other words, what would have to be the reality in order for truth alone to be a sufficient ground for any person or thing uttering the truth to be trustworthy?”

“The Christian answer to this question is Christ’s statement, ‘I am the truth.’ In other words, the truth is not a what but a Who,” Poulos added. “Anyone trying to reject or evade Christ’s answer to the question of what makes the truth trustworthy will have to produce another answer — a difficult and demanding task to be sure, and one about which philosophers over the millennia have struggled to convince both themselves and one another about.”

Poulos appears to be alluding to epistemology, the theory of knowledge. Many philosophical schools claim to explain how we can arrive at justified belief, but there is no consensus on the matter. And as AI continues to develop, it could become more difficult to establish a basis for truth.

AI’s impact on truth has become a concern for a growing number of people. The Christian Scholar’s Review reported that Eric Horvitz, chief scientific officer at Microsoft, wrote that AI is inching toward a “post-epistemic” world “where fact cannot be distinguished from fiction.”

Horvitz pointed to the prevalence of deepfakes and hallucinations. AI hallucinations occur when the technology “perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.”

It is unclear whether regulations and safeguards will emerge to address AI deception, but it seems clear that humans must establish a basis for truth if they wish to combat AI-generated falsehoods.
