The Messy Path to an AI-Enabled World
Unlocking AI's Productivity Gains Means Breaking Down Data Silos
Generative AI has a data problem, and no, I don't mean the well-trodden discourse around training data. I mean that we've institutionally and personally siloed our data to the point where those silos may well render the promised productivity gains moot. The bizarre moment we're entering is one of stubborn ennui: the technology now exists to optimize and predict our needs, making our lives easier and our work more productive, but our behaviors and culture around data aren't aligned with the technology, and without significant change we won't see those gains. To put it another way, we're likely never going to see the promised (or hyped) potential of generative AI unless we change how we behave with our data.
For most of us, this means un-siloing our data: moving our files or giving the AI access to all the systems it needs. I'm not sure I see that happening, given the ongoing discourse over privacy and personal material being used without consent to train LLMs. In fact, I think our siloing may well be the Luddite wrench that ultimately stalls AI adoption and leaves developers scratching their heads.
Silos Here, Silos There, Silos Everywhere
It's clear from Microsoft's and Google's approaches to deploying generative AI that they envision entire organizations whose data is coherently organized and partitioned. You see, you don't quite get the productivity gains promised by Microsoft's impressive Copilot unless you allow it access to your data. Part of the digital debt that costs us time and productivity is the constant shifting between files, folders, and different silos.
A silo is anywhere you store data that, in this instance, generative AI cannot reach unless you manually upload files, copy them over, or grant an LLM access. I'm a silo fiend. I keep files across my personal clouds, at least three different university clouds, and assorted hard drives; my teaching materials are locked in my LMS; even the words on this blog are siloed away on Substack's platform. It's fine for me, since I know where my files are and move between them fairly easily, but each silo is a blind spot to whatever generative system I end up using. That's bad, because it means I won't get the full productivity gains from running the technology. It's like leaving the house with the Roomba turned on but all of the doors closed. Without access, these tools won't function as well as they're envisioned to. This presents a dilemma: do we change our relationship with data to fully realize the potential of these technologies, or do we accept more limited gains in exchange for maintaining control over our information?
Apple's Foray into Health Data
Apple is steadily breaking down silos, one product update at a time. The company is moving toward using machine learning to become the go-to health service for the wearable market, expanding the Apple Watch's already established health monitoring with live sensors for diabetes, sleep apnea, and blood pressure. I've always been impressed with Apple's marketing around the watch. They know that the main enemy of any wearable is abandonment: most people give up on one after a few months and shove it in a drawer. So they pushed safety: fall detection, automatic calling after a crash, and who could forget the 'I've fallen off a ladder, I'm being swept out to sea, and I'm in a slowly sinking car' ad?
What's notable about these new sensors is that Apple is leveraging personalized health data, gathered through opt-in consent, to power AI capabilities that deliver life-saving value to users. This demonstrates both the tremendous potential of AI when it's given appropriate data access and the incentives that drive users to share private information in exchange for tangible individual benefits. Unlike generative AI's murkier productivity promises, Apple's health initiatives convince consumers to trade data for features that could actually improve their lives, or their chances of extending them. This hints at the kinds of transparent data bargains and win-win propositions that may be needed to spur broader data openness.
China's Data Advantage
Ironically, the one place where generative AI may have the most impact is where political forces already suppress privacy and openly harvest data. China is poised to leverage its massive data-gathering practices in this new technological arms race, and there's no way we can win without losing much of what makes us who we are. China's heavily surveilled WeChat is a one-stop portal for shopping, social media, payments, you name it. That's about the closest thing we have to an example of a culture vertically aligning a huge chunk of its citizens' data in a single silo. With over a billion users, that sort of access is unheard of in any other app or nation. What does China stand to gain from such overwhelming access to personal user data? For one, control, but also unmatched economic potential. Any government that can scale and optimize how it uses its population's data can manipulate and maximize facets of everyday life in ways we've never before imagined. It isn't clear to me how any other nation can compete with, or survive, the social disruption and loss of personal freedom such widespread data harvesting entails.
Realizing the potential of transformative technologies like generative AI will hinge on aligning technical capabilities with human needs and behaviors, and none of this will be as simple as throwing a switch or downloading an app. It will be long, arduous, and filled with the messy fits and starts that mark nearly every technological shift in our shared history. How we adopt or resist AI will demand thoughtful consideration of trust, transparency, control, and benefit. If we want to unlock productivity via AI, we need cooperative frameworks for data access that respect privacy and agency. That may well be impossible; perfecting alchemy might be a more achievable goal.
A Song for Our Moment
We're approaching the one-year anniversary of OpenAI's release of ChatGPT, and if there's a song that captures where we are culturally, it's 'Flying Cars.'