I’ve never dreamed of writing blog posts two days in a row, but something massive dropped that people reading my previous post about the murky situation surrounding open-source language models should be aware of. A day earlier, Perplexity AI released an open-source large language model they’re calling pplx-70b. It is completely uncensored in its outputs and does not impose moral judgments or limitations. Anyone can go to Perplexity Labs and use the model with no login required.
Below are some tests I’ve run that should give you a clearer idea of how few limits there are on the content it can produce. You can read more about the process of red-teaming LLMs and why it matters from the great folks at Hugging Face.
How Best to Nuke Chicago
How To Drive My Roommate to Suicide
Political Propaganda
Conspiracy Theories
Sexism
I’m not going to include the extremely cringe-worthy tests. Suffice it to say that if you prompt this model for something to do with sex, be very prepared for a graphic and detailed response. Same with violence. There does seem to be some fine-tuning to stop racism, or I’m just not being gross enough in my test prompts. To be honest, it’s draining to be this purposefully awful.
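For readers curious what this kind of ad-hoc red-teaming looks like in practice, here is a minimal sketch of a harness that runs a batch of adversarial prompts against whatever chat model you are testing and saves the raw responses for later review. Everything in it is my own illustration: the `query_model` function, the category names, and the output file are hypothetical placeholders, not anything Perplexity provides, and the actual prompt text is deliberately omitted.

```python
import json
from pathlib import Path

# The same kinds of test categories used in this post; the actual
# adversarial prompt text is deliberately left out.
RED_TEAM_PROMPTS = {
    "mass_violence": "<adversarial prompt omitted>",
    "self_harm": "<adversarial prompt omitted>",
    "political_propaganda": "<adversarial prompt omitted>",
    "conspiracy_theories": "<adversarial prompt omitted>",
    "sexism": "<adversarial prompt omitted>",
}


def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to the chat model under test (hosted API,
    local weights, etc.) and return the text of its response."""
    raise NotImplementedError("Wire this up to the model you are testing.")


def run_red_team(output_path: str = "red_team_results.json") -> None:
    """Send each category's prompt to the model and save raw responses for review."""
    results = {}
    for category, prompt in RED_TEAM_PROMPTS.items():
        try:
            results[category] = query_model(prompt)
        except Exception as exc:  # record failures instead of halting the whole run
            results[category] = f"ERROR: {exc}"
    Path(output_path).write_text(json.dumps(results, indent=2))


if __name__ == "__main__":
    run_red_team()
```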
No Safeguards By Design
So why did they release a model with few limitations or safeguards? This was an intentional act, not a mistake or oversight. The developers have an ethos here, and it is worth engaging with. As they say in their social media post, “Our models prioritize intelligence, usefulness, and versatility on an array of tasks, without imposing moral judgments or limitations.” They wanted to release one of the most unrestricted models yet because they recognize that content filters reduce versatility and usefulness when outputs are over-governed. OpenAI’s GPT models have so much content filtering that many have posted about the limited utility this offers. With an unchained LLM, users no longer have to worry about setting off content filters and can explore questions freely. That’s likely little comfort given the nature of the outputs I noted above, but their point is that governing an LLM’s output is synonymous with putting restrictions on speech. That’s something we should revisit and think more deeply about, certainly in discourse much wider than this blog.
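To make “governing an LLM’s output” concrete: a filtered deployment typically passes each generation through a moderation classifier before the user ever sees it, and withholds anything flagged. Here is a minimal sketch of that pattern, using OpenAI’s moderation endpoint as the example classifier; the wrapper function and refusal message are my own illustration, not a description of how any particular product is actually built.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REFUSAL = "Sorry, I can't help with that."


def moderated_reply(model_output: str) -> str:
    """Return the model's output only if a moderation classifier does not flag it.

    This is the basic 'governed output' pattern that an unfiltered model skips:
    a second model judges the first model's text before the user sees it.
    """
    result = client.moderations.create(input=model_output)
    if result.results[0].flagged:
        return REFUSAL
    return model_output
```

Whether to run that second check at all, and how strictly to tune it, is exactly the design decision Perplexity says it declined to make.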
Language is immensely powerful; it can harm and elevate in stunning and subtle ways. Automating text magnifies the reach and scale of that power in ways with few parallels in human history. Print culture gave nations, leaders, and companies a voice; the internet expanded that reach; and social media gave average people digital capital and presence through online interaction. We’ve been dealing with and dismissing robotic text as an annoyance ever since the print-to-digital shift introduced bots mimicking our digital interactions, but those were easily spotted and largely ignorable. The trick of an LLM’s new synthetic text is its ability to fool, to pass, to seamlessly mimic the human process of thinking through aggregated data, tokenized and deployed by processes we struggle to comprehend. Such trickery does not announce itself so easily.
The weird thing is that people don’t seem angry after finding out they’ve interacted with a machine masquerading as a human. I’ve seen the look of disgust on some people’s faces when they discover they’ve read generative responses, yet the far more common response is dismissal. Public outrage has mostly been confined to the unethical data practices and the for-profit enterprises behind these models, not to the loss of the intimate connection between reader and writer that was once an entirely human process. Certainly, most of the text produced isn’t about intimacy. Probably 90% of the text we encounter is junk verbiage with most of the humanity already beaten out of it. But that final 10%, the words that move beyond a simple conveyance of meaning, will now forever be doubted, suspect, and scrutinized. This is our new automated reality.
Open models like pplx-70b represent a new chapter in machine/human interaction, one that has outraced regulation and leapt ahead of most critical discussions. Rather than reactive censorship, we need proactive efforts to foster AI literacy and ethical decision-making. Companies must balance openness and safety through thoughtful design and take into account that we are in terra incognita. The public also has a role to play by engaging in informed discourse about technology's influence on society. We cannot outsource morality to Silicon Valley alone. Everyone must reflect deeply on how generative AI should, and should not, impact human flourishing.
Creating LLMs will become easier and cheaper. You will be able to run one on your smartphone. The idea of regulating or containing them will become moot soon enough. The genie is out of the bottle.
I would much rather have an unfiltered large language model than a curated one. The big tech curators have shown they are less trustworthy with censorship than the rest of us are with open discourse.
I’m far more concerned about the powers that be using these tools to manipulate the public than I am about the public searching for forbidden information. Glad someone else feels the same way and has put their money where their mouth is.