Let’s talk about open source AI and Public Policy
November 2024
As the challenge of delivering AI-at-Scale continues, the lines between reality and hype have become increasingly blurred. A deluge of claims, counterclaims, and sensationalist news articles about AI’s capabilities inundates us daily, making it seemingly impossible to discern fact from fiction. The rapid pace of technology evolution, coupled with the complex nature of AI itself, has created an environment ripe for misinformation and misunderstanding.
Quite rightly, much of the concern is not just whether what we see is true, but also how to determine where what we’re watching and hearing originated and whether it is human or AI generated. The problem is so severe that we’re said to be in danger of drowning in a tidal wave of “AI slop”.
But perhaps a much deeper question must also be asked:
Do we have the right expectations for AI, or are we being misled about its impact?
Much is expected from AI, from generating billions of pounds of savings and replacing hundreds of thousands of jobs, to redefining key industries and reshaping society. Are such aspirations legitimate?
In this age of information overload, it is essential that we approach such claims for AI with a critical eye. Only by examining the underlying technologies, scrutinizing the claims made by researchers and industry leaders, and evaluating the real-world impact of AI applications can we begin to unravel the complexities of responsibly delivering AI-at-Scale.
A good place to start in addressing this question is with Arvind Narayanan and Sayash Kapoor’s book “AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference”. Released in September 2024, it provides a critical examination of the excessive hype surrounding AI. In this book, the authors effectively highlight the limitations of current AI technology, particularly in predictive AI, and expose the often-misleading marketing tactics employed by companies investing billions of dollars in its use.
The key strength of the book lies in its dissection of inflated claims about AI’s capabilities. Narayanan and Kapoor emphasize the importance of understanding the underlying mechanisms of AI systems and recognizing their inherent limitations. Unsurprisingly, they caution against the dangers of overreliance on AI, especially in critical decision-making processes. They argue that many AI models are fundamentally flawed, lacking robustness and transparency – a critique they especially level at predictive AI, which they claim often relies on unreliable and biased data.
The book offers a much-needed dose of realism about today’s AI. However, it also paints a rather pessimistic outlook and offers few solutions to the problems it highlights. The authors do recognize the significant advances in AI that have begun to revolutionize various industries and improve people’s lives. Even so, they remain very sceptical about AI’s potential benefits, especially in generative AI. Their best advice is that those adopting AI capabilities need to be much more informed and take a much more critical approach to the technologies.
There is much to be learned from this book. Most importantly, underlying it are several significant concerns with AI that we all should acknowledge. The rapid proliferation of AI technologies has created a complex ecosystem of academic research, commercial claims, and media narratives that demands critical examination. Based on Narayanan and Kapoor’s work, we can highlight three fundamental aspects of AI’s current capabilities and potential that require a deeper review.
Academic studies: The reliability conundrum
As Narayanan and Kapoor highlight, academic research on AI is fraught with methodological challenges that undermine the credibility of many published findings. Several critical issues compromise the reliability of AI research.
Reproducibility Crisis. There is concern that a significant proportion of AI research studies fail the fundamental scientific test of reproducibility. Multiple investigations have revealed that breakthrough results often cannot be independently replicated. Machine learning papers too frequently present performance metrics that prove difficult to recreate under similar conditions, suggesting potential methodological artefacts or selective reporting. Adding to this is the worry that ill-informed use of AI is driving even more unreliable or useless research.
The machine learning community has begun to acknowledge this problem. A 2016 article in Nature reported that, in a survey of over 1,500 researchers, approximately 70% had tried and failed to reproduce another scientist’s experiments, and over half had failed to reproduce their own. This reproducibility challenge is particularly acute in complex AI domains such as natural language processing and computer vision.
Selective Reporting and Publication Bias. Many aspects of the academic publishing process are currently under question. Some even ask if the current process is fit for purpose in the age of AI. Academic incentive structures inadvertently encourage researchers to prioritize positive, headline-generating results. Journals demonstrate a strong preference for publishing studies with significant, novel findings, creating a systemic bias that marginalizes negative or inconclusive research. This phenomenon may be leading to an artificially inflated perception of AI’s capabilities.
Limited Real-World Validation. Furthermore, many academic studies utilize constrained, controlled environments that poorly represent the complexity of real-world applications. Laboratory conditions rarely capture the nuanced, dynamic contexts in which AI systems must operate. Controlled experiments with carefully curated datasets frequently fail to translate into robust performance across diverse, unpredictable scenarios.
Facing financial pressures, AI vendors consistently present technologies through a lens of extraordinary potential, often substantially disconnected from actual capabilities.
Performance Exaggeration. Commercial AI presentations typically showcase best-case scenarios, carefully selecting demonstration contexts that maximize perceived effectiveness. Machine learning product demonstrations frequently employ:
These presentations create a significant gap between demonstrated potential and actual operational reliability. For instance, natural language processing tools marketed as comprehensive communication solutions frequently struggle with contextual nuance, cultural complexity, and domain-specific terminology.
Overstated Generalizability. Vendors frequently portray AI technologies as universally applicable solutions, obscuring the significant limitations of their systems. In some cases, these boundaries are only now being realised. A machine learning model successfully performing in one specific context may be erroneously presented as a generalizable technological breakthrough.
Economic Incentives for Exaggeration. Money is pouring into AI product and service companies. This substantial venture capital investment creates powerful economic motivations for presenting overly optimistic narratives. Startup funding models reward bold claims and potential disruption, incentivizing technological storytelling over measured, realistic assessments.
Journalists and technological commentators play a complex role in shaping public understanding of AI, often contributing more to mystification than clarification.
Sensationalist Reporting. Media outlets consistently prioritize dramatic, attention-grabbing narratives over more thoughtful technological analysis. In this domain it is all too easy to fall back on the images of mad scientists, rogue robots, and other sci-fi tropes. How often have we seen headlines anthropomorphize AI technologies, presenting them as quasi-sentient entities capable of revolutionary transformation?
Lack of Technical Expertise. Many technology journalists lack the deep technical understanding necessary to critically evaluate AI claims. This knowledge deficit results in uncritical reproduction of vendor narratives and academic press releases, further distorting public perception.
Oversimplification of Complex Technologies. Mainstream technological commentary tends to reduce sophisticated machine learning concepts to simplistic, easily digestible narratives. As a result, the term “AI” is now broadly used in the media for a wide range of technology-driven initiatives. This approach obscures the genuine challenges and limitations of AI technologies.
So, while Narayanan and Kapoor’s book provides valuable insights in directing us to the challenges facing the understanding and use of AI, there are several areas where its review of AI’s current status and prospects falls short.
Firstly, it fails to adequately contextualize the adoption of AI within the broader landscape of digital transformation. The authors overlook the significant progress made in digital technology adoption and use over the past decades. Many organizations have already established robust digital foundations, making the integration of AI a more seamless process than the book suggests. Adopting AI at scale can be enhanced by building on these foundations.
Secondly, the book neglects to address the systemic challenges that large organizations face in implementing major organizational changes. Reshaping an organization to operate in a new way is a complex and multifaceted task, requiring significant investment, cultural shifts, and strategic planning. Much of what they describe as challenges to AI are inherent in any significant technology-driven shift. The authors’ focus on the technical limitations of AI overshadows the broader organizational challenges that often hinder successful technology adoption of any kind.
Finally, the book’s analysis lacks a consideration of the evolving geopolitical and economic landscape. The complex geopolitical tensions and economic uncertainties faced by many countries and organizations today necessitate a more nuanced approach to balancing risk and innovation in AI adoption. Consider, for example, the heated debate about the political implications of establishing an “AI pause” early in 2023. This was just one of many inputs to the geopolitical tensions shaping the future of AI.
We must accept that these new challenges will require adjustments to our values and value systems. Consequently, the authors’ somewhat pessimistic outlook may not adequately reflect the need for organizations to embrace digital transformation, including AI, to remain competitive and resilient in this rapidly changing world.
Of course, we must all be concerned about the proliferation of misinformation and exaggerated claims that fuel “AI snake oil”. However, responsible AI adoption demands we move beyond sensationalist narratives and cultivate a more sophisticated, nuanced understanding of AI’s technological capabilities and impact. This requires collaborative efforts from researchers, industry practitioners, journalists, and policymakers to develop a deeper, more accurate and contextually grounded understanding of AI.
We all bear the responsibility for navigating the complex landscape of AI. Faced with a barrage of claims and counter-claims on AI, it is essential to arm ourselves with:
The future of AI lies not in mythological promises but in careful, incremental technological development guided by scientific integrity and human-centric design principles.