Sam Altman, CEO of OpenAI, was recently observed in a contemplative pose within the company’s new, architecturally striking headquarters in San Francisco’s Mission Bay. The space, a minimalist expanse of glass and blond wood, is decorated with artistic representations of AI’s "Eras" and historical milestones. Yet despite the serene setting and the company’s celebrated advances, beginning with ChatGPT, a critical question has emerged: why has OpenAI, a pioneer in artificial intelligence, seemingly lagged in the fast-growing field of AI-driven coding?
Software development is undergoing a profound shift. Millions of engineers now use AI to automate coding tasks, prompting a widespread re-evaluation of job security across the tech industry. AI-driven coding has quickly become one of the most lucrative applications of artificial intelligence, a domain where enterprises are willing to spend heavily. For OpenAI, it represents a prime opportunity to add another landmark achievement to its record. At the moment, however, the spotlight is fixed on a competitor.
Anthropic, a formidable rival founded by former OpenAI researchers, has achieved remarkable success with its coding agent, Claude Code. The product has rapidly become a cornerstone of Anthropic’s business, accounting for nearly a fifth of its revenue and generating annualized revenue of over $2.5 billion as of February. By contrast, OpenAI’s own coding agent, Codex, had reached annualized revenue of just over $1 billion by the end of January, according to sources with direct knowledge of the matter. What explains the gap?
Altman, after a period of reflection, attributed the difference to the significant value of being first to market, a strategy that paid dividends with ChatGPT. However, he asserted that the time is now ripe for OpenAI to fully embrace the coding domain. He believes that OpenAI’s AI models have matured to a point where they can power exceptionally capable coding agents. The company has invested billions in developing these models, aiming not only for economic success but also to unlock the vast potential of coding for broader applications. Altman confidently predicts that AI-powered coding represents "one of these rare multitrillion-dollar markets." Furthermore, he suggested that Codex might represent the most direct pathway toward achieving Artificial General Intelligence (AGI) – AI systems capable of outperforming humans across most economically valuable tasks.

The Genesis of Codex and Early Momentum
The journey of AI in coding for OpenAI began earlier than many realize. Back in 2021, OpenAI leadership provided WIRED journalists with an exclusive look at an early iteration of their coding prowess. This offshoot of the GPT-3 model, meticulously trained on billions of lines of open-source code from GitHub, was codenamed Codex. Demonstrations showcased its ability to translate natural language commands into functional code snippets. Greg Brockman, OpenAI’s president and co-founder, described it as a system that could "actually act in the computer world on your behalf," heralding it as a significant step towards a "super assistant."
At this formative stage, Altman and Brockman were deeply engaged in discussions with Microsoft, OpenAI’s principal investor. Microsoft recognized the potential of Codex and began integrating it into its nascent commercial AI offerings, most notably GitHub Copilot, a code completion tool designed to enhance the productivity of programmers within their existing workflows. While early OpenAI employees described Codex at this time as being capable of little more than "autocomplete," Microsoft executives lauded it as a harbinger of the AI-driven future. Upon its public launch in June 2022, GitHub Copilot garnered hundreds of thousands of users within months, underscoring the immediate demand for such assistive technologies.
The ChatGPT Diversion and Strategic Realignment
Following the initial excitement surrounding Codex, OpenAI’s strategic focus underwent a significant pivot. The original Codex team transitioned to other projects, with the company’s leadership envisioning coding capabilities being integrated into future, more generalized AI models rather than being developed as a standalone product. Some engineers were reassigned to work on DALL-E 2, OpenAI’s groundbreaking image generation model, while others concentrated on training GPT-4, which was perceived as the most direct route to achieving AGI.
The launch of ChatGPT in November 2022 marked a watershed moment. Its rapid adoption, exceeding 100 million users in just two months, effectively brought all other projects to a standstill. For an extended period, OpenAI lacked a dedicated team focused on developing AI coding products. This strategic void, according to former Codex team members, stemmed from a perceived shift towards consumer-facing applications and the belief that the coding assistance market was already "covered" by GitHub Copilot. OpenAI’s role, in this view, was primarily to supply the underlying models to power Microsoft’s product, thus positioning it as a partner rather than a direct competitor in the coding agent space.
Throughout much of 2023 and 2024, OpenAI channeled its resources into developing multimodal AI models and agents capable of processing and interacting with text, images, video, and audio, and controlling computer interfaces with human-like dexterity. This direction aligned with broader industry trends, as companies like Midjourney were achieving viral success with their AI image generators, and a consensus was forming that truly intelligent AI would need to perceive and understand the world through multiple sensory inputs.

Anthropic’s Focused Pursuit of Coding Excellence
While OpenAI explored multimodal capabilities, Anthropic adopted a more focused strategy. Though it, too, worked on chatbots and multimodal research, Anthropic recognized earlier the profound potential of AI in coding. Greg Brockman has publicly commended Anthropic for its early, intense focus on the field: training models not only on complex academic coding challenges but also on real-world codebases, with all their inherent messiness. Brockman acknowledged this as a "lesson that we were delayed on."
By early 2024, Anthropic was training its Claude 3.5 Sonnet model on these intricate, real-world code repositories. The model’s launch that June garnered significant praise for its coding proficiency. This was particularly evident for startups like Cursor, founded by a group of young entrepreneurs, which enabled developers to direct AI coding tasks in plain English. The integration of Anthropic’s new model reportedly led to a dramatic surge in Cursor’s user engagement. This success paved the way for Anthropic’s internal development and testing of its own dedicated coding agent: Claude Code.
As Cursor’s popularity grew, OpenAI reportedly approached the startup for a potential acquisition. However, sources close to Cursor revealed that the founders declined the offer, prioritizing their independence and their vision for the future of the coding industry.
OpenAI’s Re-engagement and the Race to Catch Up
The development of OpenAI’s own coding capabilities experienced a resurgence in early 2024 with the training of its first "reasoning model," o1. This model was designed to process problems step-by-step before arriving at a solution. OpenAI stated that the model excelled at generating and debugging complex code. Andrey Mishchenko, OpenAI’s research lead for Codex, emphasized the critical role of verifiable tasks in AI development, noting that code’s binary nature—it either runs or it doesn’t—provides a clear signal for model improvement. This feedback loop enabled OpenAI to train o1 on increasingly sophisticated coding problems. Mishchenko stated, "Without the ability to crawl around a code base, implement changes, and test their own work—these are all under the umbrella of reasoning—coding agents would not be anywhere near as capable as they are today."
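The loop Mishchenko describes, generate code, execute it, and use the pass/fail result as a training signal, can be sketched in a few lines of Python. This is a hypothetical illustration of the verifiable-reward idea, not OpenAI’s actual pipeline; the `verify` helper and the sample task are invented for the example.

```python
import os
import subprocess
import sys
import tempfile

def verify(candidate_code: str, test_code: str) -> bool:
    """Run a candidate solution against its tests in a fresh interpreter.
    The exit code is the 'binary' signal: the code either passes or it doesn't."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code + "\n" + test_code)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True)
        return result.returncode == 0
    finally:
        os.remove(path)

# Two hypothetical model outputs for the same task: "return n squared".
good = "def square(n):\n    return n * n"
bad = "def square(n):\n    return n + n"
tests = "assert square(3) == 9"

# The verifiable reward: 1 for a passing solution, 0 for a failing one.
reward_good = 1 if verify(good, tests) else 0
reward_bad = 1 if verify(bad, tests) else 0
```

The appeal of this setup is that the signal needs no human grader: correctness is checked mechanically, so a model can be trained on ever harder coding problems at scale.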
By December 2024, several internal groups at OpenAI had begun to re-focus on AI coding agents. One team, led by Mishchenko and Thibault Sottiaux (OpenAI’s head of Codex and former Google DeepMind researcher), initially explored coding agents as a means to accelerate AI research by automating the management of training runs and monitoring GPU clusters. Concurrently, Alexander Embiricos, who had previously worked on OpenAI’s multimodal agents and now leads product for Codex, developed a demonstration called Jam. This project garnered significant internal attention.

Unlike the 2021 Codex demo, which produced code for human execution, Embiricos’s Jam could execute code directly. He described being captivated by watching Jam’s actions update in real-time on his laptop. Embiricos mused, "For a while, I had been thinking that multimodal interaction might be how we achieve our mission—like we would just be screen-sharing with AI all day. Then it became super clear: Maybe giving models programmatic access to a computer is how we’re going to get there."
The O3 Model and the Impending Competition
It took several months for these nascent projects to coalesce into a unified effort. The completion of training for o3 in early 2025, a model further optimized for coding than o1, provided OpenAI with the foundational technology to develop a robust AI coding product. However, this development coincided with Anthropic’s impending launch of Claude Code.
Prior to Claude Code’s public debut, initially as a "limited research preview" in February 2025 and then a general release in May, the cutting edge of AI coding was largely characterized by "vibe coding." This approach involved human programmers guiding AI tools that filled in specific code segments. In contrast, Anthropic’s Claude Code, akin to the Jam demo, operated directly from a computer’s command line, granting it access to a developer’s entire file system and applications. This represented a paradigm shift, moving beyond simple assistance to enabling developers to fully delegate coding tasks to AI agents.
OpenAI found itself in a race to deploy a competing product. Sottiaux formed a "sprint team" in March 2025 with the ambitious goal of integrating OpenAI’s internal groups and releasing an AI coding product within weeks. Simultaneously, Altman pursued another strategic acquisition to bolster OpenAI’s position: the AI coding startup Windsurf, for an estimated $3 billion. The expectation was that Windsurf would provide an established product, an experienced team, and a base of enterprise customers.
Acquisition Hurdles and Strategic Realignment
The Windsurf acquisition, however, encountered significant delays. Reports from The Wall Street Journal indicated that Microsoft, a crucial partner for OpenAI, sought access to Windsurf’s intellectual property. Microsoft’s long-standing reliance on OpenAI’s models for GitHub Copilot, a product it touted in its financial reporting, made the acquisition a point of contention. With the emergence of more advanced agentic coding experiences from Cursor, Windsurf, and Claude Code, GitHub Copilot had begun to look dated, and OpenAI launching its own competing product would complicate matters further.

The Windsurf deal became entangled in a period of tense renegotiations between OpenAI and Microsoft regarding their overarching partnership. OpenAI sought to reduce Microsoft’s influence over its AI products and computing resources. Ultimately, the Windsurf acquisition faltered by July. Subsequently, Google hired Windsurf’s founders, and the remaining team was acquired by Cognition, another AI coding startup.
Altman expressed his disappointment, stating, "I would have loved to get that done. You can’t control every deal." While he acknowledged that the acquisition would have "accelerated us somewhat," he also commended the progress of the Codex team, led by Sottiaux and Embiricos, who continued to develop and release updates during the acquisition negotiations. By August, Altman indicated that OpenAI had intensified its efforts in this domain.
The Reverse Turing Test and the Rise of Codex
Greg Brockman’s preferred metric for assessing AI performance is a custom-built computer game called the Reverse Turing Test, which he challenges AI agents to construct from scratch, given a foundational description. The game itself involves two human players on separate computers, each presented with two chat windows: one connects to the other player, the other to an AI. The objective is to correctly identify the AI while convincing the opponent that you are, in fact, the AI.
Brockman reported that for most of the previous year, the company’s most advanced models required hours to build such a game, with explicit human guidance along the way. By December, however, Codex, powered by the new GPT-5.2 model, could generate a fully functional game from a single, well-crafted prompt. The advance was not isolated; developers worldwide began noting a significant improvement in the capabilities of AI coding agents. The conversation, which had largely revolved around Claude Code, spread beyond Silicon Valley into mainstream news. People with no prior coding experience began building custom software projects.
This surge in usage was not accidental. Both Anthropic and OpenAI invested heavily in acquiring new customers for their AI coding agents. Numerous developers reported that their $200 monthly subscriptions for Codex and Claude Code provided usage well exceeding $1,000, indicating generous rate limits designed to encourage widespread adoption in workplaces. This strategy allows OpenAI and Anthropic to transition to usage-based pricing models as AI coding becomes integrated into daily workflows.

In September 2025, Codex reportedly accounted for only 5 percent of the usage of Claude Code. However, by January 2026, this figure had climbed to approximately 40 percent, according to individuals familiar with the usage data.
George Pickett, a developer with a decade of experience in tech startups, has begun organizing meetups focused on Codex. He stated, "I think it’s clear we’re going to replace white-collar work with agents. Societally, who fucking knows what this means. It’s going to be disruptive, but I’m pretty optimistic about what’s happening."
Simon Last, co-founder of the AI productivity startup Notion, noted that he and his senior engineers switched to Codex around the launch of GPT-5.2, primarily due to its reliability. "I found that Claude Code just lies to me," Last commented. "It says it’s working, but it actually isn’t."
Katy Shi, a research lead on OpenAI’s Codex team, acknowledged that while some users describe its default persona as "dry bread," many appreciate its less overtly agreeable style. "A lot of engineering work is about being able to take critical feedback without interpreting it as mean," Shi explained.
Fidji Simo, CEO of applications at OpenAI, highlighted the significant advantage of ChatGPT’s brand recognition in the business-to-business market. "Companies want to use technologies their workers are already familiar with," she said. OpenAI’s strategy for selling Codex involves bundling it with ChatGPT and other OpenAI products.

Jeetu Patel, president and chief product officer at Cisco, has encouraged employees to prioritize learning Codex, downplaying cost concerns. He articulated the sentiment: "When employees ask if they’re going to lose their job because they’re using these tools, what we have to tell our people is no, but I guarantee you’ll lose your job if you don’t use them, because you won’t be relevant. So you’re going to be out."
Societal Impact and the Evolving AI Landscape
The widespread adoption of AI coding agents has sparked significant concern beyond Silicon Valley. The Wall Street Journal attributed a $1 trillion tech stock sell-off to fears that software would soon become obsolete, partly fueled by Claude Code’s capabilities. IBM experienced its worst stock day in 25 years following Anthropic’s announcement that Claude Code could modernize legacy COBOL systems, a common technology on IBM machines. OpenAI has actively engaged in public discourse surrounding Codex, including a Super Bowl commercial, shifting the focus from ChatGPT to its coding agent.
Within OpenAI’s headquarters, the utility of Codex is widely recognized. Many engineers report doing far less manual coding, opting instead to collaborate with Codex. The company runs internal hackathons where teams use Codex to rapidly build projects, often tools that improve the use of Codex itself, such as summarizing communications or generating internal documentation. Work that previously took days or weeks can now be finished in a single afternoon.
Kevin Weil, who leads OpenAI for Science, a new division focused on AI products for researchers, shared that Codex is currently working on overnight projects for him, with results to be reviewed in the morning. This "automated intern" concept aligns with OpenAI’s 2026 goal of developing AI that can perform research tasks.
Simo indicated that Codex is intended to power features across all OpenAI products, not just for programming but for task completion in general. Altman has expressed interest in releasing a general-purpose version of Codex but remains cautious about the safety implications. He cited a recent instance where a non-technical friend requested the setup of OpenClaw, a viral AI coding agent, which Altman declined due to its potential to delete critical files. Notably, OpenAI subsequently announced the hiring of OpenClaw’s creator, signaling a complex approach to integrating external innovations while managing risks.

The competition between Codex and Claude Code is intensifying, creating a dynamic market for AI coding agents. As these tools grow more sophisticated and are increasingly mandated by corporations seeking efficiency gains, they raise broader questions about the future of work.
Concerns have been raised by watchdog groups, such as the Midas Project, that OpenAI’s rapid pursuit of coding advancements might compromise its safety commitments. They accused OpenAI of failing to adequately outline the cybersecurity risks associated with GPT-5.3-Codex. Amelia Glaese, OpenAI’s head of alignment, has refuted these claims, asserting that safety is not being sacrificed and that Midas misinterpreted the company’s commitments.
Even for Brockman, who has made substantial political donations to advance OpenAI’s mission, the current landscape evokes mixed emotions. While he maintains that OpenAI is "right on schedule" for AGI, he acknowledges the shift in his own role. Delegating objectives to AI agents lifts him out of the granular details, which he finds "very freeing." At the same time, he notes a certain detachment from the intricacies of problem-solving, remarking, "you’re not as in the weeds on exactly how different things are solved." This evolution in the nature of work, Brockman suggests, can leave you "losing your pulse on the problem."
