How An AI Detector Made Me Trust People Less
A week with Pangram's new Chrome extension changed how I read social media — and not for the better.
Pangram recently launched an extension for Google Chrome that labels posts as human or AI-generated across your social media feeds: X, Medium, LinkedIn, Substack, and Reddit. I used the feature for a week and was quite disturbed by the results. I noticed my behavior change. Dramatically. I stopped reading and interacting with posts and instead focused on the labels. When we allow a company to place labels on our social media interactions, we cede some agency to opaque systems, ultimately giving them a great deal of power over our interactions.
AI tools have undergone significant changes over the past three years, and AI detection classifiers have evolved accordingly. Pangram is one of the best AI classifiers currently on the market. Unlike earlier AI detection tools from Turnitin or GPTZero, Pangram reports “an accuracy rate of 99.98 percent and a false positive rate of just one in 10,000.” Taylor Lorenz recently posted an advert on Substack about using Pangram’s API to scan the top Substack posts for AI. Suddenly, “Pangramming” users by outing them for supposed AI misuse became a thing.
As with many AI tools that automate your interactions, activating Pangram’s Chrome extension is a security risk. You have to give the tool and the company full access to your social media feed and information.
The Promise of Knowing What Is Human in the AI Era
Pangram’s pitch for its Chrome extension is provocative and arguably needed in a world overrun with digital slop: “Ignore the noise. Focus on what’s real.” The idea is tantalizing. For one week, it felt like I had X-ray vision and could see through every user’s process and know what was real vs. generated. The value of understanding what was human, AI-assisted, or entirely AI-generated is immense in our era of ubiquitous AI tools. But it is also illusory.
The idea that a classifier can filter our feeds and parse what is real vs. generated with exceptionally low false positive and false negative rates is not realistic at this stage. In my limited time with the tool, I noted and catalogued several questionable labels that didn’t hold up under scrutiny or further scans with Pangram’s own classifier. I’ve detailed those below. The risk of reputational harm to account holders is immense from a system that automatically labels content while you scroll.
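To put the reported figure in context, here is a rough back-of-the-envelope sketch in Python. It takes Pangram’s reported false positive rate of one in 10,000 at face value and multiplies it by a few feed sizes. The post counts are hypothetical, chosen only for illustration, and the function name is an assumption rather than anything from Pangram.

```python
def expected_false_positives(human_posts: int, false_positive_rate: float) -> float:
    """Expected number of human-written posts wrongly labeled as AI."""
    return human_posts * false_positive_rate


# Rate as reported by Pangram: one false positive per 10,000 human texts.
REPORTED_FPR = 1 / 10_000

# Hypothetical volumes of human-written posts scanned (illustrative only).
for human_posts in (10_000, 1_000_000, 100_000_000):
    flagged = expected_false_positives(human_posts, REPORTED_FPR)
    print(f"{human_posts:>11,} human posts -> about {flagged:,.0f} wrongly labeled AI")
```

Even at the advertised rate, the number of mislabeled human posts grows with the number of posts scanned. The examples in this essay suggest the rate on short social media posts may be higher than the advertised one.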
It is undeniable that there’s a need to validate information as authentic and verify the veracity of claims, both human and AI. However, I’m not sold on using text classifiers to do it. That doesn’t mean AI cannot play a role in helping us parse information. Mike Caulfield’s Deep Background prompt is an example I use with both faculty and students to show how AI-assisted fact checking can help people navigate dubious claims. Caulfield adapted his SIFT framework (Stop, Investigate the Source, Find Better Coverage, and Trace Claims to the Original Context) into a prompt for a reasoning model, like Claude or Gemini, that helps a person make sense of information.
Deep Background uses AI to enhance a user’s critical thinking and research skills, not to replace them with an automated process. It doesn’t label content as truthful, somewhat truthful, or false, the way Pangram’s Chrome extension labels content as human, AI-assisted, or AI-generated. Labels like those strip nuance from the conversation. With Deep Background, the user is forced to contend with the information, to think about and judge it based on the quality of claims, sources, and rhetoric. Auto-labels require no such judgment. Your interaction with the label informs your decision-making process, and once the extension is activated, your attention is drawn to the label, not the post’s message.
It is true that clicking on one of the labels takes you to a report where you get to see some details about why a post was flagged as human, AI, or some mix of the two, but this really doesn’t tell me much. There’s little transparency about why a certain post was labeled one way vs. another. That became a problem quickly when I noticed that the tool would flag certain content that was quoted partially on Substack Notes. Case in point: Lincoln Michel quoted a post by Adam Kotsko that was flagged as “AI-assisted” in my feed. But when I clicked on Kotsko’s post, the label flipped from “AI-assisted” to “Human!”
Despite marketing claims, internal lab benchmarks, and external testing, AI detection often fails under real-world conditions. That’s because our digital world is messy. There’s the AI tool itself, the harness it is using, and the data it is trying to collect. Each represents a potential chokepoint where an error can occur. On top of this, people increasingly use adversarial tools and techniques to get past AI detection. You can use bypassers/humanizers to some degree to fool a detector, mix human text with generative text (the question is always “how much”), or simply use a novel LLM. One recent example is the newly launched Talkie, an LLM trained on pre-1930s text. Using Talkie for simple text generation was enough to fool Pangram, likely because the classifier wasn’t optimized to pick up on the signals and linguistic patterns present in older training data.
No, you won’t suddenly see students or users generating text that sounds like they just walked into a speakeasy, which would be an obvious tell of masked AI content. Companies that sell bypassers/humanizers might simply adopt a novel model like Talkie and integrate it into their product. That means you can generate sophisticated responses using ChatGPT or Claude, then ‘wash’ the content using a bypasser optimized with some of the novel features found in a model like Talkie. Pangram and other detectors will respond by upgrading their classifiers, and on and on the arms race of AI detection and evasion goes.
Second-Order Problems that Few Consider
Unfortunately, the fleeting promise of discovering which posts contain AI text across your socials comes with a series of tradeoffs and potential problems. Deep ones. For starters, I had to put my trust in a technology company using a different form of AI to tell me if the accounts I was interacting with were likewise using AI. I’m not keen on scenarios where one form of AI becomes the solution to a problem that a different form of AI created across my social networks.
But these issues are not limited to trust. Seeing labels in red, yellow, and green changed my engagement habits. I stopped reading content marked with a red “AI” label, became highly suspicious of text labeled yellow as “mixed” or “AI-assisted,” and focused primarily on posts labeled green as “human.” There’s immense power in labels, and color-coding content with a stoplight-style system changed my reading behavior. But was it accurate?
I was shocked to see red AI labels appear on posts from some of the biggest AI critics I follow.
This is where my skepticism about AI detection went into overdrive. Gary Marcus is a vocal and prolific critic of AI developers, and as such, his would be one of the last accounts on X I could imagine using AI to create a post. When you look closely at the report, you see Pangram lists his post with a “low confidence” rating. It’s also pretty clear Marcus is quoting someone. But that nuance is not present in the label. All it shows me is “AI” in red on my feed. Megan Fritts’s X post is likewise listed as AI, but Pangram only identifies the original post at the start of a thread as being “AI.” When you copy all the posts in the thread into a single scan, Pangram reclassifies it from “AI” to “Human.”
Do you think a user would start a thread using AI and then switch to writing the following tweets organically? Neither Marcus nor Fritts used AI to generate their posts. You can scan their other social media profiles as well, with human eyes or with a classifier. AI isn’t present. These posts are false positives, and I didn’t have to dig to find them. I simply scrolled my feed and clicked a few profiles when red labels appeared. Users should know that these tools are active on social media now and potentially mislabeling both posts and accounts.
My guess, and it is only a guess, is that these errors largely have to do with context length. Pangram’s Chrome extension classifier only works on text that’s longer than 50 words, but even that is really tiny for an accurate report. As in the Substack example, the shorter the context the classifier has to work with, the more random the response. When you add more words, the results shift. That might explain the technical issue at hand, but it doesn’t excuse the deeper human issue at stake. What’s being sold here is a system that markets itself as being “99.98%” accurate, a claim that clearly should be in doubt based on the examples I was able to pull in my free time during a one-week trial. How would you feel if your words were wrapped, without your knowledge or consent, in an AI label you cannot see?
Using a classifier to detect AI takes work and some level of knowledge to be effective. It isn’t easy, nor should it be treated as an automated process. A user isn’t likely to pause and investigate a label on social media, or know enough about how classifiers might mislabel content to bring meaningful nuance to their social media feed. It’s much more likely they will simply judge the post as AI, disengage from it, and possibly block or unfollow the user before moving on. Accounts could lose followers who assume they are using AI, and the labels could artificially change how users interact. That’s a dangerous problem, because Pangram’s Chrome extension is active across X, Reddit, LinkedIn, Substack, and Medium. By trusting these labels, I’m ceding my judgment to an algorithm that I’m increasingly convinced isn’t optimized to accurately identify AI across social media platforms.
Take the example below on LinkedIn from a colleague of mine here at the University of Mississippi, Katerina Berezina, who wrote a post about how AI and AI detection warp our sense of connection. The words, tone, and imagery all read as human, but the label suggests otherwise.
Once you run the post paragraph by paragraph through Pangram’s classifier, you start seeing something alarming. All the paragraphs, save one that starts with “I saw two posts”, are labeled as 100% human. Even if you believe that the author used AI for just one paragraph (I do not), Pangram’s classifier should have labeled the entire post as “mixed” or “AI-assisted,” not as being “AI.”
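The labeling rule argued for above can be written down in a few lines of Python. This is a hypothetical sketch of how paragraph-level results could roll up into a single post-level label. It is not Pangram’s actual method, and the function name and label strings are assumptions made for illustration.

```python
from typing import List


def aggregate_label(paragraph_labels: List[str]) -> str:
    """Roll per-paragraph labels ("human" or "ai") up into one post label.

    Hypothetical rule: a post is "ai" only if every paragraph is "ai",
    "human" only if every paragraph is "human", and "mixed" otherwise.
    """
    labels = {label.lower() for label in paragraph_labels}
    if not labels:
        raise ValueError("need at least one paragraph label")
    if labels == {"human"}:
        return "human"
    if labels == {"ai"}:
        return "ai"
    return "mixed"


# A post where every paragraph but one scans as human, as in the example above.
print(aggregate_label(["human", "human", "ai", "human"]))
```

Under a rule like this, a single flagged paragraph could never turn an otherwise human post into one labeled simply “AI.”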
We Should Normalize AI Disclosure, Not Automate It
The allure of having your social media feed curated so that you can pick and choose which accounts to follow and live in a space free of AI slop might be the most attractive pitch I’ve yet seen from a company selling AI detection. But as with many AI claims, the results are more complicated than the sales pitch. The impact this has on my online interactions isn’t worth it to me. I likewise think the potential harms to an account holder are far more dangerous than having AI-generated material within your feed. People could have their audience assume that the messages they write are inauthentic or deceptive, all based on a label that may not be accurate.
The crux of the issue is that generative text can be used to effectively convey a message, to persuade you, to translate a text, or to engage you. It can also be used for slop, propaganda, and disinformation. The use cases are simply too complex to silo AI usage behind a label that isn’t designed to be scrutinized. When I ask students to disclose how AI was used or not used in an assignment, I’m not looking to police their behavior with technology. Rather, I’m interested in giving them the opportunity to surface their process and make visible some of the decisions that went into using or avoiding a system that is often anything but transparent. That requires far more time and judgment than a simplified labeling system can offer.
I offer my teaching as one example to consider. Even if a student is accused of using AI on the basis of a detector, they are still entitled to due process. They can challenge the evidence, point out biases, and explain their decisions. There is no such process when this tool becomes active within your social media feed.
Imagine for a moment if such a system took hold in broader society. What if your browser constantly labeled the text of every site you visited as being “human,” “mixed,” or “AI”? Would you still buy a house with a listing that has an AI label next to it? Trust product reviews that were labeled as being written by human beings, even if they may have been written as part of a paid promotion? Find a different doctor or dentist because the text on their homepage had an “AI-assisted” label next to it?
Platforms will likely take issue with third-party tools that impact how their users interact with one another. I cannot imagine that any social media company will allow such a system to go unchecked once they discover how it changes user behavior, especially with ads.
It’s clear AI is one of the most contentious new technologies to arise in our lifetimes. I don’t think we need to seek out a technological solution that solves one problem caused by technology by introducing another. Doing so only increases the hold AI companies have on us by offering both product and solution. For me, the human solution is to spend less time online pointlessly scrolling and instead use that time to create more intentional engagements by reading and responding to posts that interest me, regardless of whether AI was used. I know how detectors work and can use them if needed. But that should be intentional and based on my own judgment. After all, auto-labeling your feed is just the flip side of AI marketing efficiency, another way of saying “let us do the thinking for you.”
The Pangram Chrome Extension lists many of my essays as 100% human (all that I tested) even though I use and disclose AI assistance. Never trust its output! The AI detection companies are gaslighting people with accuracy claims that fall apart as soon as you have a closer look.
In a pure fluke, I read this right after listening to the interview on the Atlantic with the founder of Pangram.
Thank you for clearly articulating 90% of my skin crawl moments from that discussion.
Automation to solve the problems of automation does not remove the problems of automation.