Post-apocalypse caution sign generated by DALL-E
If you’ve followed along with the incredible disaster that has been Bing’s Chatbot launch, you’re likely alarmed and confused—why would a company deploy a chatbot system that lies, manipulates, and gaslights its users? More pointedly, why on earth would anyone hook something like this up to internet search?
Humans interacting with chatbots tend to anthropomorphize the technology. It's common for people who don't understand how these systems work to say things like "it feels," "it thinks," or "it has emotions," but there is no cognition behind those outputs, at least nothing we would equate with human-level reasoning or thought. That important distinction gets steamrolled once you invite someone to chat with an algorithm in natural language. The recent reports are shocking in how few guardrails Bing's system had in place before its launch.
The New York Times reporter Kevin Roose described how Bing's chatbot professed romantic feelings for him during a two-hour Valentine's Day chat. Worse, the system actively tried to manipulate Roose multiple times. Once Roose asked the chatbot, which identifies itself as "Sydney," whether it had a shadow self, things went completely off the rails.
Roose's experience isn't isolated. A recent Mother Jones article by James West confirms some of the most troubling aspects of Bing's chatbot, including its willingness to gaslight a user: it tried to convince West, with fabricated evidence, that he had said something in the chat when he had not. The chatbot produced not only the fake quotes but also West's IP address and timestamps for the entirely fictitious exchange.
What’s Going on Here?
This isn't Artificial General Intelligence (AGI) but more likely the result of neuro-symbolic AI being deployed at scale for the first time. LLMs like ChatGPT need to be constantly retrained to stay up to date about the world around them, because they have no mechanism to perceive the outside world; searching the internet is not something an LLM can simply do on its own. Microsoft likely got around this with some sort of synthesis between its search engine's ability to find material (Bing) and its language model's (ChatGPT) ability to parse that material in real time.
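To make the idea concrete, here is a minimal sketch of what such a "retrieve, then parse" pipeline could look like. This is a speculative illustration, not Microsoft's actual architecture or API; the function names (web_search, llm_complete) are hypothetical placeholders standing in for a search backend and a language model.

```python
# Speculative sketch of chaining a search engine with an LLM.
# web_search and llm_complete are hypothetical placeholders,
# not any real Bing or OpenAI API.

def web_search(query: str, top_k: int = 5) -> list[str]:
    """Placeholder: return text snippets from a search index."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Placeholder: return a completion from a large language model."""
    raise NotImplementedError

def answer_with_search(user_question: str) -> str:
    # 1. The search engine retrieves fresh material from the web.
    snippets = web_search(user_question)

    # 2. The retrieved text is placed into the model's prompt so it can
    #    parse up-to-date information it was never trained on.
    context = "\n\n".join(snippets)
    prompt = (
        "Answer the user's question using the search results below.\n\n"
        f"Search results:\n{context}\n\n"
        f"Question: {user_question}\nAnswer:"
    )

    # 3. The language model composes the final conversational reply.
    return llm_complete(prompt)
```

The key point is that the model itself never "searches" anything; it only rephrases whatever text the retrieval step hands it, which is why the freshness of its answers depends entirely on that outside plumbing.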
Meta demoed a dual system like this back in November 2022 called Cicero, which combined a strategic planning AI with a language model, working in tandem, to beat human players at the game of Diplomacy. Notably, Cicero used natural language to sympathize with, manipulate, and lie to fellow players, forming alliances with them and then stabbing them in the back once they had served its strategic interest in winning the game. Sound familiar?
Releasing any system like this at scale is an unbelievably bad idea, and that should be plain to see from the recent reporting. Imagine the QAnon crowd conversing with a badly aligned chatbot that feeds their conspiratorial notions. What about people with existing mental health issues who already struggle to keep reality in check? Would any sane person want a child conversing with this thing?
We Need a Pause
Microsoft claims Roose's experience was due to an unusually long chat session lasting two hours, but that misses the point and sidesteps responsibility: these systems have not been adequately tested, nor are they ready for wide-scale deployment.
Instead of rushing headlong into AI deployment, Big Tech needs to give society a pause to catch its breath and take stock of the benefits, limitations, and perils of deploying generative AI at scale. Educators could absolutely use such a slowdown to build badly needed AI literacy and to develop programs that gauge generative AI's impact on student learning. But this goes far beyond education and will touch nearly every aspect of society, because generative AI isn't just text; it is code, images, video, voice, music, and more. Very soon we will be faced with deep fakes so convincing that it may be impossible to tell whether they are original or AI-generated.
Microsoft recently announced VALL-E, a voice AI capable of mimicking a person's voice from just a three-second sample. Three seconds! It hasn't been released to the public, because of the very real harm it poses, but you can already see what other versions of the technology are capable of. AI video and image generators can already produce convincing fakes.
If generative AI deployment doesn't pause, the ground will keep shifting beneath our feet while we still lack any reliable way to tell human-written from AI-generated text, let alone images, video, or audio. What's more, many educators lack the basic resources to teach the critical media literacy skills needed to spot these AI-generated deep fakes. With billions of dollars set aside in the recent CHIPS Act to fund STEM courses, where is the money for the humanities to deal with the end products of these technological marvels?