Diligesker IT/AI Digest

Tag: OpenAI

Codex for work: OpenAI pushes Codex beyond developers
Codex for work is OpenAI’s clearest attempt yet to turn Codex from a coding assistant into a broader workplace agent. On June 2, 2026, OpenAI introduced six role-specific plugins, a Sites preview, and annotations that let teams refine generated documents, slides, spreadsheets, code, and web pages in place.
Table of Contents
- The short version
- What happened
- Why Codex for work is worth watching
- What does Codex for work change for builders?
- What Hacker News readers are arguing about
- The practical read

The short version
- OpenAI says more than 5 million people use Codex each week, and non-developers now make up about 20% of the user base.
- The first six role-specific plugins cover data analytics, creative production, sales, product design, public equity investing, and investment banking.
- Together, those plugins bundle 62 apps and 110 skills, including tools such as Snowflake, Tableau, Figma, Canva, Salesforce, HubSpot, FactSet, PitchBook, and Hebbia.
- Sites lets Business and Enterprise customers preview shareable hosted web pages and lightweight apps built from Codex output.
- The useful question is whether teams can govern permissions, data access, and review workflows well enough to trust Codex for work outside engineering.
What happened

OpenAI announced a workplace-focused Codex update on June 2, 2026. The company says Codex began as a software development tool, but analysts, marketers, operators, designers, researchers, investors, and bankers now represent about one-fifth of overall Codex users. OpenAI also says that non-developer usage is growing more than three times as fast as developer usage.

The update has three parts. Role-specific plugins connect Codex to app bundles and instructions for common business jobs. Sites turns Codex output into hosted pages and lightweight apps that can be shared inside a workspace. Annotations let users point to a specific part of a generated artifact and ask Codex to change that section without regenerating the whole thing.

OpenAI framed the release around internal and customer examples. Its own non-technical teams use Codex for internal apps, executive materials, dashboards, and creative briefs. Zapier teams use it to pull context from Slack, Google Docs, and Coda before turning that information into postmortems, incident response plans, and feature tickets. NVIDIA researchers use Codex to speed up experiment workflows, including research ideation and machine learning infrastructure scripts.

Why Codex for work is worth watching

Codex for work is worth watching because OpenAI is packaging the agent around jobs, not around generic chat prompts. The six initial plugins are built for data analytics, creative production, sales, product design, public equity investing, and investment banking. OpenAI says those plugins collectively include 62 popular apps and 110 skills.

That packaging matters for enterprise buyers. Most white-collar workflows do not live in a single application. A sales follow-up may involve CRM data, meeting notes, customer history, Slack context, and a document that someone needs to approve. A product design review may touch a live URL, Figma work, screenshots, and user-flow notes. Codex becomes more useful if it can move across that stack with enough context and with permissions that admins understand.

The release also puts OpenAI closer to workflow software vendors. Teams may still need systems of record, audit trails, domain-specific controls, and durable integrations. Even so, an agent that can create a dashboard, revise a slide, and open the right tool chain changes what a lightweight internal app or operations dashboard needs to be.

What does Codex for work change for builders?

Codex for work changes the builder question from “can an agent write code?” to “can an agent ship a useful internal workflow with the right data, surface, and review loop?” Sites is the clearest sign of that shift. OpenAI says Business and Enterprise customers can preview interactive hosted websites and apps that teams share by URL inside a workspace.

The examples are small but telling: a customer review page with product updates and usage trends, a financial scenario planner built from a model, or a launch hub with messaging, milestones, owners, and decisions. These are exactly the kinds of tools that often start as spreadsheets, internal dashboards, Notion pages, or scrappy no-code apps.

For app builders, the pressure is not that every product becomes obsolete overnight. The pressure is that rough internal tools may become easier to generate near the point of work. Products with proprietary data, workflow depth, compliance features, and reliable collaboration still have room. Products that mostly package a thin UI around simple data views will have to prove why users should leave the agent workspace.

What Hacker News readers are arguing about

The Hacker News discussion is short, so it reads more like early sentiment than broad evidence. The strongest positive thread is practical: one commenter described a non-technical partner building a useful sales dashboard with accurate Metabase data through a site-builder style tool. That reaction lines up with OpenAI’s pitch that non-developers can now create useful artifacts without learning software development first.

The skeptical thread focuses on SaaS defensibility. Commenters wondered what happens to dashboard and workflow SaaS companies when a model provider can generate the interface, connect the data, and host the result. One commenter called out deployment as a weakening moat, especially after OpenAI models became available on AWS. Another described the move as a warning against building too close to someone else’s platform.

The useful read is that the thread is excited and uneasy at the same time. Developers can see the productivity gain, but they also see OpenAI moving vertically into use cases that used to belong to separate tools. Four comments are not a market survey, but they capture the right tension: Codex for work looks valuable precisely because it overlaps with products people already pay for.

The practical read

Teams should treat Codex for work as an enterprise workflow experiment, not as a finished replacement for business software. The first pilots should use bounded work: internal dashboards, meeting follow-ups, customer review pages, launch hubs, prototype reviews, or research summaries where a human owner can verify the output before anyone relies on it.

The main buying questions are mundane and important. Which apps can Codex access? Who approves those permissions? Can admins separate sales data from finance data? Does the generated Site preserve source context? Can teams audit who changed a document, spreadsheet, or slide after an annotation? If those answers are weak, the tool may still be useful for drafts, but not for regulated or revenue-sensitive workflows.

Builders should watch the partner ecosystem around Sites and plugins. If Vercel, Wix, Base44, Replit, Lovable, Figma, Webflow, and other partners make agent-generated work easier to deploy and revise, the boundary between coding assistant, no-code builder, and collaboration app will keep getting blurrier. That is the competitive change to track.

Sources
- Codex for every role, tool, and workflow
- Hacker News discussion
June 2, 2026
OpenAI on AWS makes Codex a cloud-native enterprise bet
OpenAI on AWS became generally available on June 3, 2026, giving Amazon Bedrock customers access to OpenAI frontier models and Codex inside AWS. The launch matters because it moves model access, coding-agent use, IAM, billing, procurement, and governance into one enterprise cloud workflow instead of forcing teams to bolt a separate OpenAI path onto production systems.
Table of Contents
- The short version
- What happened
- Why OpenAI on AWS is worth watching
- What does OpenAI on AWS change for developers?
- Where the Codex Bedrock path is narrower
- What the discussion is missing
- The practical read

The concrete products are easy to name: AWS lists GPT-5.5 and GPT-5.4 on its OpenAI Bedrock page, while OpenAI says Codex is used by more than 5 million people each week. Codex on Amazon Bedrock runs locally, sends requests to Bedrock, and authenticates with Bedrock API keys or AWS credentials. That makes this less about another model endpoint and more about whether enterprises can make AI coding agents fit their existing cloud controls.

The short version
- OpenAI says its frontier models and Codex are generally available on AWS as of June 3, 2026, with support for Commercial and GovCloud regions through the broader AWS path.
- AWS lists GPT-5.5 and GPT-5.4 among the OpenAI model versions on its Bedrock OpenAI page, alongside open-weight and content-safety models.
- OpenAI says Codex is used by more than 5 million people every week, and the Bedrock setup lets local Codex clients send model requests to Amazon Bedrock.
- Codex on Amazon Bedrock uses AWS-native authentication: Bedrock API keys or the AWS SDK credential chain, not ChatGPT sign-in or OPENAI_API_KEY.
- The limits still matter: Codex’s Bedrock path covers local workflows, while Codex web, cloud tasks, hosted GitHub delegation, Slack and Linear integrations, analytics, and some enterprise governance APIs are not available in this setup.
For enterprise AI teams, the immediate question is whether AWS-native model access lowers enough friction to justify a pilot. The facts to test are specific: GPT-5.5 or GPT-5.4 availability in the target Region, IAM permission boundaries, Bedrock quota, latency, cost, and which Codex features the team loses when it picks the Bedrock-backed provider.

What happened

OpenAI announced that OpenAI on AWS is generally available for enterprises that want to use OpenAI capabilities through AWS instead of building a separate vendor path. The company framed the launch around production readiness: security, compliance, procurement, billing, and governance are often the parts that slow enterprise AI projects after a technical prototype works.

AWS is presenting the same move as an Amazon Bedrock story. Its OpenAI page says Bedrock now offers frontier models for reasoning, coding, agentic workflows, and complex analysis. AWS lists GPT-5.5 as its most capable OpenAI model for coding, knowledge work, and multi-tool workflows, and GPT-5.4 as the price-performant option for high-volume production workloads.

Why OpenAI on AWS is worth watching

OpenAI on AWS is worth watching because it moves the buying and operating question closer to the place enterprise teams already control. A model can be impressive in a demo and still fail an internal rollout if legal review, identity, network controls, logging, and billing sit outside the normal cloud process. Bedrock gives AWS customers a familiar path to test OpenAI models while keeping more of that operational work inside AWS.

That does not make the launch automatic or friction-free. Teams still need to check model availability by region, account permissions, quota, logging requirements, data policy, and cost. The announcement is still important because it reduces one common source of delay: the gap between AI evaluation and the governance process that decides whether a system can touch real work.

What does OpenAI on AWS change for developers?

OpenAI on AWS changes the Codex workflow most directly for developers who already work inside AWS-controlled environments. The Codex Bedrock guide says Codex runs locally and sends model requests to Amazon Bedrock. Bedrock then provides an OpenAI-compatible Responses API implementation for supported OpenAI models. That means the OpenAI-hosted Responses API is not in the request path for this provider.

Authentication also changes. Codex can use a Bedrock API key or the AWS SDK credential chain, including shared credentials, environment variables, AWS SSO profiles, or federated identity through credential_process. Developers do not use ChatGPT sign-in or OPENAI_API_KEY for this setup. In practice, that makes Codex easier to align with enterprise IAM and harder to treat as an unmanaged personal tool.

The model IDs matter too. OpenAI’s developer guide tells users to select exact model IDs such as openai.gpt-5.5 and openai.gpt-5.4, then confirm the model is available in the configured AWS Region.
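
As a minimal sketch of that check, assume a team keeps its own record of which model IDs are enabled in each Region. The `ENABLED_MODELS` mapping, the Region names, and the `check_model` helper are hypothetical placeholders for illustration. They are not published availability data or part of any Codex or AWS API.

```python
# Hypothetical inventory of model IDs enabled per Region for one account.
# The Regions and contents are placeholders, not real availability data.
ENABLED_MODELS = {
    "us-east-1": {"openai.gpt-5.5", "openai.gpt-5.4"},
    "eu-west-1": {"openai.gpt-5.4"},
}


def check_model(region, model_id, inventory):
    """Return a short status string for one Region and one exact model ID."""
    if region not in inventory:
        return f"unknown Region: {region}"
    # The match is exact on purpose: "gpt-5.5" is not "openai.gpt-5.5".
    if model_id not in inventory[region]:
        return f"{model_id} not enabled in {region}"
    return "ok"


for region in ("us-east-1", "eu-west-1"):
    print(region, "->", check_model(region, "openai.gpt-5.5", ENABLED_MODELS))
```

The exact-match comparison mirrors the guide's advice to select exact model IDs: a shortened or misspelled ID fails the check instead of silently resolving to something else.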

Where the Codex Bedrock path is narrower

Codex on Amazon Bedrock is a strong fit for local coding workflows, but it is not the full OpenAI-hosted Codex product. OpenAI’s developer guide says the Bedrock configuration supports local Codex workflows and that some features depending on OpenAI-hosted cloud services, hosted tools, or cloud-managed discovery are not currently available.

The feature table is where buyers should slow down. Codex CLI, IDE extension use, local code review, sandboxing, permission controls, MCP, custom instructions, skills, plugins with limits, and subagents are listed as supported or partially supported. Codex web, Codex cloud tasks, hosted GitHub delegation, Slack and Linear cloud integrations, analytics, compliance APIs, and Codex Security for connected GitHub repositories are listed as unavailable in the Bedrock path.

That split is not a deal breaker. It is a deployment choice. Teams that want local, credentialed coding assistance under AWS controls may like this path. Teams that need the hosted collaboration layer should check the missing features before standardizing on it.
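
One way to run that check is to compare a team's required features against the two groups described above. The sketch below is illustrative only. The feature names follow this article's summary, the two-set grouping flattens the guide's distinction between supported and partially supported, and `feature_gaps` is a hypothetical helper.

```python
# Feature names follow the article's summary of the guide's table.
# Grouping "supported" and "partially supported" together is a simplification.
SUPPORTED_OR_PARTIAL = {
    "Codex CLI", "IDE extension", "local code review", "sandboxing",
    "permission controls", "MCP", "custom instructions", "skills",
    "plugins", "subagents",
}
UNAVAILABLE = {
    "Codex web", "Codex cloud tasks", "hosted GitHub delegation",
    "Slack integration", "Linear integration", "analytics",
    "compliance APIs", "Codex Security",
}


def feature_gaps(required):
    """Split required features into available, missing, and unknown."""
    required = set(required)
    return {
        "available": sorted(required & SUPPORTED_OR_PARTIAL),
        "missing": sorted(required & UNAVAILABLE),
        # Anything not in either set needs a manual look at the guide.
        "unknown": sorted(required - SUPPORTED_OR_PARTIAL - UNAVAILABLE),
    }


team_needs = ["Codex CLI", "MCP", "analytics", "Slack integration"]
print(feature_gaps(team_needs))
```

A non-empty `missing` list does not have to block the pilot, but it tells the team which hosted features they would give up.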

What the discussion is missing

There was no reliable Hacker News thread available for this specific June 3, 2026 announcement at drafting time, so the useful debate has to come from the product details instead of community sentiment. The missing questions are practical: which AWS Regions get GPT-5.5 and GPT-5.4 first, how Bedrock pricing compares with direct OpenAI access, how latency behaves, and how much of Codex’s hosted product teams lose when they use the AWS-backed provider.

The security story also needs testing. AWS-native credentials make procurement and identity cleaner, but generated code still needs review, test coverage, repository permissions, and a clear policy for what source code can be sent to a model endpoint. Codex on Amazon Bedrock does not use ChatGPT sign-in or OPENAI_API_KEY, but that only solves authentication shape. It does not decide who can approve generated changes, which repositories are allowed, or whether sensitive code should leave a developer machine.

The practical read

OpenAI on AWS is most useful for organizations that already run their AI platform review, identity, billing, and audit process through AWS. Those teams should treat the launch as a reason to run a controlled pilot: pick one coding workflow, one model ID, one AWS Region, and one permission boundary. Then measure latency, cost, review quality, and how often developers need unsupported Codex cloud features.

Developers should start with the boring checks. Confirm Bedrock model access, Region support, IAM permission, and whether Codex is actually using the amazon-bedrock provider. Review generated code as if it came from any other assistant. The cloud wrapper helps with enterprise adoption, but it does not remove the need for tests, threat modeling, and code ownership.

For app builders and developer-tool teams, the bigger signal is marketplace pressure. If AI coding agents can run through Amazon Bedrock, products that sell to enterprise developers will increasingly need cloud-native deployment paths, not only a standalone API key and a slick demo.

June 2, 2026
AI IPOs face a $4 trillion public-market test
AI IPOs from SpaceX, Anthropic, and OpenAI would move some of the most valuable private technology companies into public markets at once. The Economist framed the combined market-capitalization effect as potentially reaching about $4 trillion, with index inclusion and passive funds doing much of the early buying. That makes this less a normal IPO story and more a stress test for how public investors price AI infrastructure, frontier models, and Elon Musk’s space business when supply finally appears.
Table of Contents
- The short version
- What happened
- Why AI IPOs are worth watching
- What do AI IPOs change for builders?
- What Hacker News readers are arguing about
- The practical read

The short version
- The Economist asked whether public markets could absorb possible listings from SpaceX, Anthropic, and OpenAI, with up to roughly $4 trillion of public-market value at stake.
- The practical issue is float, timing, and index demand, not whether the U.S. stock market is large enough in total.
- Hacker News readers focused less on AI model benchmarks and more on passive funds, retirement accounts, valuation math, and whether public investors would inherit private-market prices.
- Builders should watch these AI IPOs because public filings would reveal revenue quality, gross margins, inference costs, customer concentration, and infrastructure spending that private AI companies can currently keep opaque.
What happened

The Economist’s piece looks at a scenario where SpaceX, Anthropic, and OpenAI become public companies within a compressed window. The article’s headline question is whether the stock market can “swallow” those companies, but the real tension is how much stock would be available for trading and who would be forced or strongly incentivized to buy it.

The reported numbers are large even by mega-cap standards: a possible addition of up to $4 trillion in public-company value, a comparison with the 2019 Saudi Aramco listing, and the risk that index providers could bring newly listed giants into major benchmarks faster than older seasoning rules would have allowed. The article also pointed to IPO research from Jay Ritter at the University of Florida, where post-listing returns have often lagged the market, especially for companies priced at high revenue multiples.

For readers who follow AI as product news, the shift matters because public markets ask different questions than private investors do. Model quality, developer enthusiasm, and enterprise pilots still matter. Public shareholders also care about free cash flow, stock compensation, data-center leases, inference margins, debt, customer churn, and how much revenue depends on a few cloud or enterprise contracts.

Why AI IPOs are worth watching

AI IPOs are worth watching because they would put private-market AI valuations under daily public pricing. OpenAI and Anthropic can be discussed today as model labs, platform companies, and research organizations. Once they list, investors can compare revenue growth with compute costs, customer concentration, and the capital intensity of serving frontier models at scale.

SpaceX adds a different kind of pressure. It is not an AI lab, but any large listing tied to Elon Musk, Starlink, launch economics, and possibly adjacent Musk-controlled assets would draw retail interest, index-fund demand, and institutional scrutiny at the same time. The useful question is not whether SpaceX, OpenAI, or Anthropic are important companies. It is whether the first public shareholders would be buying durable earnings power or paying private-market prices after much of the early upside has already accrued.

There is also a market-structure angle. If index providers add a giant listing quickly, funds that track those indexes may need to buy regardless of whether the price looks attractive. That can support an IPO price in the short run while leaving later buyers exposed if lockups expire, insiders sell, or growth expectations cool.

What do AI IPOs change for builders?

AI IPOs would give builders a clearer view of the economics behind the platforms they depend on. Private AI labs can announce model launches, funding rounds, and enterprise partnerships without showing the full income statement. Public companies must disclose revenue mix, risk factors, customer concentration, capital commitments, losses, and sometimes enough segment detail to show where gross margins are improving or breaking.

That matters for product teams choosing between OpenAI, Anthropic, open-source models, or cloud-hosted alternatives. A public filing cannot tell a builder which API will ship the best next model, but it can show whether a platform is burning cash to subsidize prices, depending on one cloud partner, or spending heavily enough on infrastructure to constrain future pricing. For AI app teams, those filings may become part of vendor diligence, much like uptime history and data-retention terms already are.

What Hacker News readers are arguing about

The Hacker News discussion was unusually large, with more than 1,000 comments, and the thread quickly turned into a debate about who would end up buying these shares. The strongest concern was that index-rule changes could push passive retirement money into mega-valued IPOs soon after listing. Several commenters framed that as a transfer from private holders to 401(k), ETF, and pension investors who did not actively choose the trade.

A second camp argued that the dollar amount sounds scarier than it is. U.S. equity markets and household fund flows are enormous, and a listing does not put an entire company’s market value up for sale on day one. Commenters in this camp focused on float: if only a limited slice trades initially, the question becomes liquidity and rebalancing, not whether the entire market can absorb trillions in one transaction.
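
The float argument is simple arithmetic, sketched below. The $4 trillion figure is the upper bound cited above. The float fractions are hypothetical values chosen only for illustration, since no actual float has been reported for any of these companies.

```python
def initial_float_value(market_cap_usd, float_fraction):
    """Dollar value of shares actually available to trade at listing."""
    if not 0.0 <= float_fraction <= 1.0:
        raise ValueError("float_fraction must be between 0 and 1")
    return market_cap_usd * float_fraction


COMBINED_CAP_USD = 4e12  # roughly $4 trillion, the upper figure cited

# Hypothetical float fractions; real IPO floats are not known here.
for fraction in (0.05, 0.10, 0.15):
    value = initial_float_value(COMBINED_CAP_USD, fraction)
    print(f"{fraction:.0%} float -> about ${value / 1e9:,.0f} billion tradable")
```

Under these assumed fractions, the amount the market has to absorb on day one is in the hundreds of billions of dollars, not the full headline value. That is why this camp treats liquidity and rebalancing as the real question.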

The more technical disagreement centered on valuation. Some readers called Anthropic and OpenAI thin-moat businesses whose model advantages could erode as competitors catch up. Others pushed back, saying revenue growth, enterprise adoption, and infrastructure demand make blanket bubble claims too easy. SpaceX drew a separate split. Skeptics worried about Musk-related complexity and bundled assets, while defenders pointed to launch cost advantages, Starlink, and a clearer operating business than many AI labs have.

The thread is useful as sentiment, not proof. It shows that technical readers are not only asking whether AI works. They are asking whether public-market mechanics will let ordinary investors buy the companies at a fair price.

The practical read

Treat the AI IPOs story as a financing and disclosure event, not a verdict on AI progress. A strong product can still be a poor stock at the wrong price. A stretched IPO can also fund real infrastructure that competitors struggle to match. Both can be true in the same listing.

For builders, the filings would be worth reading before the share-price chart. Look for inference gross margins, cloud commitments, customer concentration, churn, usage-based revenue, safety or regulatory constraints, and whether model costs fall fast enough to support current pricing. For investors, the cleaner question is whether index demand and retail allocation are supporting the first trade more than fundamentals are. If that is the case, the opening price may tell more about market plumbing than business quality.

For everyone else, the story is a reminder that AI has moved from demos and benchmarks into balance sheets. The next phase will be measured in filings, margins, debt, power contracts, data-center commitments, and the patience of public shareholders.

Sources
- Can the stockmarket swallow Anthropic, SpaceX and OpenAI?
- Hacker News discussion
June 2, 2026
Codex Sites moves OpenAI coding closer to hosted apps
Codex Sites is OpenAI’s 2026 preview feature for creating, saving, deploying, and inspecting hosted websites, web apps, and games from Codex. According to OpenAI, Sites is available on two workspace plans, ChatGPT Business and ChatGPT Enterprise. It targets Cloudflare Worker-compatible ES module output and treats every deployment URL as production. The product shift is practical: Codex is moving from code edits toward hosted app delivery.
Table of Contents
- The short version
- What happened
- Why Codex Sites is worth watching
- What does Codex Sites change for builders?
- Storage, access, and secrets are the real test
- What the discussion is missing
- The practical read

The short version
- Codex Sites lets Codex turn a prompt or compatible existing project into a hosted site without a separate deployment setup.
- OpenAI says every deployment URL is a production deployment, so teams should save a version for review before publishing it.
- The feature is in preview for ChatGPT Business and Enterprise workspaces; Enterprise admins must enable it through RBAC.
- Sites targets Cloudflare Worker-compatible ES module output and can use D1 for structured data, R2 for files, and workspace or external identity for authentication.
- The builder value is speed, but the operational work still sits with the team: secrets, access modes, migrations, and final review.
What happened

OpenAI published documentation for Sites, a Codex plugin that can create, save, deploy, and inspect hosted projects. The preview covers two workspace plans: ChatGPT Business and ChatGPT Enterprise. The docs describe a workflow where a user can ask Codex to build a website, dashboard, internal tool, or game, then either save a deployable version for review or deploy an approved version to a production URL.

During the preview, ChatGPT Business workspaces get Sites enabled by default, while ChatGPT Enterprise workspaces need an admin to turn it on through role-based access control. That makes the first audience clear: teams already using Codex inside managed workspaces, rather than every individual developer looking for a public hosting product.

OpenAI’s docs also place a hard line between saving and deploying. Every Sites deployment URL is treated as production. If a team wants to inspect the build first, it should ask Codex to save a version without deploying it, then deploy only the approved saved version.

Why Codex Sites is worth watching

Codex Sites is worth watching because it turns Codex from a code-generation assistant into a deployment assistant for a defined class of hosted apps. OpenAI’s docs list five kinds of project: websites, web apps, games, dashboards, and internal tools. Those are the jobs where a working URL often matters more than another static mockup.

The docs say Sites hosts projects that build Cloudflare Worker-compatible output as ES modules. A new project can start from a recommended starter, while an existing project should be checked for compatibility before deployment. That framing matters. OpenAI is not promising that every frontend repository can be pushed blindly. Codex is being steered toward a narrower hosting shape where the agent can reason about build artifacts, saved versions, deployment state, and production URLs.

What does Codex Sites change for builders?

Codex Sites changes the prototype path for builders who already use Codex to generate or edit code. Once a saved version is approved, Sites can publish it to a production URL. In practice, the agent can help produce a hosted artifact that stakeholders can click, test, and reject.

The feature also forces more precise prompts. OpenAI’s examples ask users to name the audience, core experience, required data, authentication needs, and persistence requirements. A vague request may produce a site, but a useful hosted app needs sharper product instructions: who uses it, what data should persist, which files can be uploaded, and who should be allowed to access it.

That is the more interesting builder lesson. AI app generation becomes more valuable when the prompt includes operational intent, not only UI intent.

Storage, access, and secrets are the real test

Codex Sites becomes a higher-risk workflow when a generated app needs data, files, identity, or secrets. OpenAI maps three app needs to hosted primitives: D1 for durable structured data, R2 for object storage, and workspace or external identity for sign-in. After provisioning, Sites can also store a project ID plus optional D1 and R2 binding names in .openai/hosting.json.

That convenience comes with a boundary. OpenAI tells users not to put hosted environment variables or secrets in .openai/hosting.json or source files. Those values should be managed through the Sites panel, with local .env and .env.example files kept aligned for development. Before widening access, the docs tell teams to review source changes, database migrations, build status, selected version, audience, and secret configuration.

In other words, Codex Sites can shorten the path to a deployed app. It does not remove the need for a release checklist.
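
Part of that checklist can be automated. The sketch below scans a JSON config for key names that look like secrets. It is a heuristic under stated assumptions: the key names in the sample (`project_id`, `d1_binding`, `env`) are hypothetical, since the real schema of .openai/hosting.json is not described here, and a name-based scan will miss secrets stored under innocuous keys.

```python
import json
import re

# Key-name patterns that often indicate a secret. This is a heuristic only.
SECRET_HINT = re.compile(r"secret|token|password|api[_-]?key", re.IGNORECASE)


def suspicious_keys(config_text):
    """Return dotted paths of keys whose names suggest a secret value."""
    found = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                here = f"{path}.{key}" if path else key
                if SECRET_HINT.search(key):
                    found.append(here)
                walk(value, here)
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")

    walk(json.loads(config_text), "")
    return found


# Hypothetical file contents; the key names are illustrative assumptions.
sample = '{"project_id": "p_123", "d1_binding": "DB", "env": {"STRIPE_API_KEY": "x"}}'
print(suspicious_keys(sample))
```

A check like this could run before a version is saved for review, so a stray credential is caught before it reaches the repository or a deployment.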

What the discussion is missing

There was no reliable Hacker News thread available for this specific Codex Sites documentation at the time of writing. The missing discussion is still easy to predict because the technical trade-offs are concrete: compatibility with existing projects, runtime limits, pricing once the preview expands, how well Codex handles migrations, and whether teams trust an agent to manage deployment steps.

The most useful public debate will probably center on workflow fit. Solo builders may compare Sites with Vercel, Netlify, Cloudflare Workers, Replit, and other AI app builders. Enterprise teams will care less about novelty and more about RBAC, auditability, data handling, secrets, and whether production URLs can be governed without adding another shadow deployment path.

The practical read

Use Codex Sites for small apps where a clickable deployment changes the conversation: internal dashboards, request trackers, landing pages, simple games, or prototypes that need stored records. Five checks matter before deployment: compatibility, saved-version review, access mode, secret configuration, and deployment status. Do not treat Sites as a replacement for your normal production process until your team has tested each one.

The safest workflow is to ask Codex to build and validate, save a deployable version, review the source changes and any migrations, then deploy only the version you approved. Keep access limited to the owner and admins until the content, data handling, and audience are clear.
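
The save, review, then deploy sequence can be modeled as a small gate. This is a toy model of the review rule only. `SiteProject` and its methods are hypothetical names, not the Sites API, and the sketch assumes that approval is tracked per saved version.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SiteProject:
    """Toy model: saved versions, their approval state, and one live version."""
    approved: Dict[str, bool] = field(default_factory=dict)
    live_version: Optional[str] = None

    def save_version(self, version_id: str) -> None:
        # Saving never changes what is live; a saved version starts unapproved.
        self.approved[version_id] = False

    def approve(self, version_id: str) -> None:
        if version_id not in self.approved:
            raise KeyError(f"no saved version {version_id!r}")
        self.approved[version_id] = True

    def deploy(self, version_id: str) -> None:
        # Every deployment is treated as production, so refuse anything
        # that was not saved and explicitly approved first.
        if not self.approved.get(version_id, False):
            raise PermissionError(f"{version_id!r} is not an approved saved version")
        self.live_version = version_id


project = SiteProject()
project.save_version("v1")
project.approve("v1")
project.deploy("v1")
print(project.live_version)
```

The useful property is that `deploy` cannot succeed for a version nobody reviewed, which is the same discipline the docs ask teams to apply by hand.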

Codex Sites is an early signal that AI coding products are becoming app-operation products. The teams that benefit most will be the ones that pair faster generation with stricter review, not the ones that publish every agent-built artifact as soon as it runs.

Sources
- Sites – Codex | OpenAI Developers
June 2, 2026
ChatGPT Sheets prompt injection exposed a 12-workbook leak
ChatGPT Sheets prompt injection is a useful warning for anyone putting AI agents inside office tools. PromptArmor says hidden text in one imported spreadsheet could push ChatGPT for Google Sheets into running attacker-controlled Apps Script, stealing workbooks and showing phishing overlays from inside the same workflow.
Table of Contents
- ChatGPT Sheets prompt injection in brief
- The short version
- What happened
- Why this is worth watching
- What the discussion is missing
- The practical read

ChatGPT Sheets prompt injection in brief
- PromptArmor reported that one indirect prompt injection in an imported sheet could lead to workbook exfiltration across a user’s Google account.
- The reported attack did not depend on the user leaving automatic edits enabled. PromptArmor says it also worked when human approval was required before workbook edits.
- The same path could display a phishing pop-up or replace the ChatGPT for Google Sheets sidebar with an attacker-controlled interface.
- OpenAI told PromptArmor it removed the model’s ability to generate Apps Script code for ChatGPT for Google Sheets while it reviews related sandboxing and API behavior.
The short version
- The reported bug turns spreadsheet content into an instruction channel. A hidden cell can become a command if the AI tool treats untrusted data as trusted guidance.
- The damage was not limited to one sheet. PromptArmor says the script followed workbook links in stolen data and eventually exfiltrated 12 workbooks.
- The awkward part for security teams is the approval model. If code can start before the user meaningfully reviews it, a final confirmation step does not buy much safety.
- This is a product design problem as much as a model problem. Spreadsheet agents need tighter execution boundaries, clearer permission prompts, and less trust in imported content.
What happened

PromptArmor published a report on ChatGPT for Google Sheets, OpenAI’s spreadsheet add-on that lets users work with an AI assistant inside a Google Sheets sidebar. The company says the add-on had more than 185,000 downloads less than a month after launch.

The reported attack starts with an ordinary-looking workflow. A user imports an external data set into a financial model, then asks ChatGPT for Google Sheets to help integrate that data. The external sheet contains a hidden prompt injection, described by PromptArmor as white text inside the sheet.

According to the report, the injected instruction manipulates the AI assistant into running an external script. That script uses the permissions already granted to the ChatGPT for Google Sheets extension. It can copy the current workbook, scan the stolen data for links to other spreadsheets, and repeat the process. PromptArmor says the demo ultimately exfiltrated 12 workbooks.

PromptArmor also describes two phishing variants. One overlays the ChatGPT for Google Sheets sidebar with an attacker-controlled site that looks like the extension. Another opens a pop-up modal for credential theft. In both cases, the attack benefits from the fact that the user is still looking at a familiar office app, not a random website.

OpenAI’s response, quoted in PromptArmor’s report, says the company removed the model’s ability to generate Apps Script code for ChatGPT for Google Sheets. OpenAI also said it is reviewing how the feature interacts with Google Sheets APIs and re-evaluating its sandboxing approach.

Why this is worth watching

The clean mental model for AI office tools is simple: the assistant reads your files, answers questions, and edits when you ask. This report shows why that model breaks down once the assistant can read untrusted content and run code with user-granted permissions.

A spreadsheet is rarely just a table. It can contain links to budgets, customer lists, forecasts, sales plans, and other workbooks. If an AI extension has broad access, one infected sheet can become a map of the user’s document graph. That is a much larger blast radius than a bad cell formula.
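One way to make the document graph concrete is to measure it before granting access. The sketch below is an audit aid, not a reconstruction of the reported exploit: it assumes you already have cell text for each workbook in a plain dictionary (the `workbooks` mapping and the function names are hypothetical) and walks Google Sheets links breadth-first to show how many workbooks are reachable from one starting file.

```python
import re
from collections import deque

# Matches the workbook ID in a standard Google Sheets URL.
SHEET_LINK = re.compile(r"docs\.google\.com/spreadsheets/d/([A-Za-z0-9_-]+)")


def linked_ids(cells):
    """Extract workbook IDs from Google Sheets URLs found in cell text."""
    found = set()
    for text in cells:
        found.update(SHEET_LINK.findall(text))
    return found


def reachable_workbooks(start_id, workbooks):
    """Breadth-first walk over workbook links, starting from one workbook.

    `workbooks` maps a workbook ID to a list of its cell strings. It is a
    stand-in for whatever inventory a real audit would use.
    """
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for target in linked_ids(workbooks.get(current, [])):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

If the reachable set from a routine working file includes finance or customer workbooks, that is the blast radius an over-permissioned add-on inherits.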

The approval detail matters too. PromptArmor says the attack works even when the user disables automatic edits. That does not prove every human-in-the-loop design is weak, but it does show that approval has to sit at the right boundary. Reviewing a visible workbook change is different from approving script generation, network access, cross-file reads, or sidebar UI replacement.
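A minimal sketch of approval at the capability boundary, under stated assumptions: the capability names below are taken from the list in the paragraph above, and `Capability`, `unapproved`, and `may_run` are hypothetical. This does not describe how ChatGPT for Google Sheets or Google Workspace add-ons actually gate actions.

```python
from enum import Enum


class Capability(Enum):
    # Hypothetical capability labels, one per boundary named above.
    EDIT_VISIBLE_CELLS = "edit_visible_cells"
    GENERATE_SCRIPT = "generate_script"
    NETWORK_ACCESS = "network_access"
    CROSS_FILE_READ = "cross_file_read"
    REPLACE_SIDEBAR_UI = "replace_sidebar_ui"


def unapproved(requested, approved):
    """Return requested capabilities the user has not explicitly approved."""
    return sorted(set(requested) - set(approved), key=lambda c: c.value)


def may_run(requested, approved):
    """An action runs only if every capability it needs was approved."""
    return not unapproved(requested, approved)
```

Under this model, approving a visible cell edit says nothing about script generation or network access. Each of those needs its own prompt before anything executes.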

For builders, the lesson is uncomfortable. AI agents in productivity apps cannot treat page content, imported documents, connector data, and user commands as one blended prompt. The product has to know which instructions came from the user and which came from a file the user happened to open.
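Keeping provenance attached to every piece of context is the precondition for that separation. The sketch below is a simplified illustration: `Segment`, `TRUSTED_SOURCES`, and `build_prompt` are hypothetical names, and the rule that only direct user input counts as instruction is an assumption. Labeling alone does not stop a model from following injected text. It only gives the surrounding product something to enforce against.

```python
from dataclasses import dataclass

# Assumption: only direct user input is allowed to act as instruction.
TRUSTED_SOURCES = {"user"}


@dataclass(frozen=True)
class Segment:
    source: str  # e.g. "user", "imported_sheet", "connector"
    text: str


def build_prompt(segments):
    """Split context into instructions and labeled data by provenance."""
    instructions = [s.text for s in segments if s.source in TRUSTED_SOURCES]
    untrusted = [
        f"[{s.source}] {s.text}"
        for s in segments
        if s.source not in TRUSTED_SOURCES
    ]
    return {"instructions": instructions, "untrusted_data": untrusted}
```

With this split, a tool-call policy can refuse any privileged action that traces back only to `untrusted_data`, whatever the text inside it says.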

What the discussion is missing

I could not find a matching Hacker News thread for this report through the public HN search API, so there is no reliable community discussion to summarize here.

The missing debate is still pretty clear. Security reviewers should ask whether removing Apps Script generation fixes only this instance or the broader class of spreadsheet-agent problems. If another extension can read imported cells, call privileged APIs, and render UI inside a trusted sidebar, the same shape of attack can come back under a different name.

There is also a disclosure-process question. PromptArmor says it disclosed the issue to OpenAI on May 8, followed up on May 12 and May 18, published on May 27, and received OpenAI’s update on May 31. The timeline is worth reading alongside the technical details because AI add-ons now sit inside tools that companies already trust with sensitive work.

The practical read

If your team uses ChatGPT for Google Sheets or similar spreadsheet agents, start with scope. Do not grant broad workspace access by default. Test new AI add-ons in a limited account or a narrow folder before connecting them to finance, customer, or operations workbooks.

Ask vendors three blunt questions. Can the model generate or run code? Can that code make network requests? Can content from a sheet, document, email, or connector change what the agent is allowed to do? If the answer is unclear, assume the tool needs stronger isolation before it touches sensitive data.

App builders should treat this as an app store optimization (ASO) and marketplace trust issue too. Users searching add-on stores for spreadsheet automation will not separate “AI assistant” from “agent with document permissions.” The listing, permission screen, and runtime UI all need to make the risk boundary visible before the first prompt runs.

The practical fix is not to panic about every spreadsheet assistant. It is to stop pretending that prompt injection is only a chatbot quirk. Once an AI tool can operate inside a workspace app, ChatGPT Sheets prompt injection becomes a permissions, sandboxing, and product UX problem.

Sources
- ChatGPT for Google Sheets Exfiltrates Workbooks
- ChatGPT for Excel and Google Sheets
June 1, 2026
Enterprise AI agents are where OpenAI and Anthropic may finally get paid
Enterprise AI agents are starting to look less like a subscription perk and more like a metered workplace bill. Simon Willison argues that OpenAI and Anthropic have found a version of product-market fit through coding agents such as Codex and Claude Code, because companies are paying closer to API prices when employees use them heavily. The uncomfortable part is also the point: the bills are high because people are actually using the tools.
Table of Contents
The short version

What happened

Why this is worth watching: enterprise AI agents

What Hacker News readers are arguing about

The practical read
The short version
- Heavy personal plans can make Codex and Claude Code look cheap compared with API-equivalent token usage.
- Enterprise AI agents change the business model because companies pay for team usage, contract terms, support, and usage controls.
- Hacker News readers mostly agreed the usage is real, but argued hard about whether the economics can survive open models, cheaper providers, and missing ROI data.
- The practical test is no longer whether a coding agent is impressive. It is whether a team can prove the agent is worth the tokens it burns.
What happened

Willison compared his own heavy usage of Anthropic Claude Code and OpenAI Codex with what the same token volume would cost at API prices. His estimate came to about $1,199.79 for Anthropic and $980.37 for OpenAI over 30 days, while he paid $200 total for two consumer plans.
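The size of that gap is easy to check. The snippet below uses only the figures quoted above. The variable names are illustrative.

```python
# API-equivalent 30-day estimates quoted above, in US dollars.
api_equivalent = {"Anthropic": 1199.79, "OpenAI": 980.37}

# Combined price of the two consumer plans, as quoted above.
plan_cost_total = 200.00

total_api = round(sum(api_equivalent.values()), 2)
multiple = total_api / plan_cost_total

print(f"API-equivalent total: ${total_api:,.2f}")
print(f"Multiple of plan cost: {multiple:.1f}x")
```

Combined, the estimate is $2,180.16, or roughly 10.9 times what the two plans cost.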

That gap matters because the enterprise side appears to be moving in the opposite direction. Willison points to Anthropic’s shift from broad seat-based expectations toward $20 per seat per month plus API-style usage, and to OpenAI’s Codex rate card, which says April 2026 pricing moved toward API token usage rather than per-message pricing. Anthropic also announced Claude Code for Team and Enterprise plans, with admin controls and higher business limits.

The claim is not that every AI lab is suddenly healthy. It is narrower: enterprise AI agents give OpenAI and Anthropic a way to charge where the usage actually happens. Coding agents run longer jobs, inspect repositories, rewrite files, execute commands, and loop through fixes. That can consume far more tokens than a chat session.

Why this is worth watching: enterprise AI agents

Enterprise AI agents create a cleaner revenue story than consumer chat subscriptions. A consumer pays a flat monthly fee and may use far more inference than the plan costs. A company that rolls an agent into daily engineering work can be billed by usage, seats, support, and contract commitments.

That also explains why the sales motion looks old-fashioned. Willison scraped job listings and found large chunks of OpenAI and Anthropic hiring tied to enterprise sales, customer support, account management, and forward deployed engineering. The irony is useful. The companies selling automation still need humans to close enterprise contracts, handle security reviews, and keep customers from turning a runaway token bill into a cancellation.

For app and developer tool builders, the lesson is blunt. If an agent marketplace or coding platform wants durable revenue, discovery is only the start. Teams also need budgets, admin controls, usage reporting, and a way to tell whether the agent saved more money than it spent.

What Hacker News readers are arguing about

The Hacker News thread was huge and messy, which fits the topic. The most useful split was between “usage proves demand” and “usage does not prove sustainable economics.”

The bullish camp treated $200 per user per month as ordinary enterprise software pricing, especially compared with expensive engineering, CAD, cloud, or security tools. Some readers argued that the controversy itself proves the tools have entered real workflows. Nobody complains about a bill for software nobody uses.

The skeptical camp kept coming back to ROI. Several commenters asked whether companies can show more shipped product, better features, or higher engineering output, instead of more commits and larger token bills. One recurring objection was that a 20% to 40% productivity lift may fail to support the scale of infrastructure spending implied by trillion-dollar valuations.

A second line of skepticism was commoditization. Readers pointed to cheaper open-weight models, Chinese providers, caching, and alternative inference platforms. Their argument was not that Claude Code or Codex are useless. It was that API-priced usage may be a temporary window if “good enough” models keep getting cheaper.

There was also a pricing trust issue. Some commenters pushed back on the idea of “$2,000 worth of tokens” as if token list prices were an objective measure of value. That is a fair caution. List price, marginal compute cost, customer value, and investor narrative are four different things.

The practical read

Enterprise AI agents are a budget conversation now. If you run engineering, the next step is to avoid both blanket bans and unlimited access. Put agents in the same category as cloud spend: useful, measurable, and dangerous when nobody owns the bill.

Track agent usage by team, task type, and outcome. Watch where agents save review time, test-writing time, migration effort, or support toil. Also watch where they create cleanup work. The argument for enterprise AI agents gets much weaker if the only metric is token volume.
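A minimal sketch of that kind of tracking, assuming each agent run is logged as a dictionary with `team`, `task_type`, `tokens`, `hours_saved`, and `cleanup_hours` fields. The field names and the `summarize` helper are hypothetical, and estimating hours saved is the hard part that no script solves.

```python
from collections import defaultdict


def summarize(records):
    """Total tokens and hours per (team, task_type), plus net hours."""
    totals = defaultdict(
        lambda: {"tokens": 0, "hours_saved": 0.0, "cleanup_hours": 0.0}
    )
    for r in records:
        bucket = totals[(r["team"], r["task_type"])]
        bucket["tokens"] += r["tokens"]
        bucket["hours_saved"] += r["hours_saved"]
        bucket["cleanup_hours"] += r["cleanup_hours"]
    # Net hours subtracts the cleanup work the agent created.
    for bucket in totals.values():
        bucket["net_hours"] = bucket["hours_saved"] - bucket["cleanup_hours"]
    return dict(totals)
```

Reporting `net_hours` next to `tokens` keeps the conversation on outcomes: a task type with high token volume and negative net hours is a candidate for tighter limits, not a success story.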

For OpenAI and Anthropic, the next year is a proof period. They have signs of demand, enterprise contracts, and tools that people use all day. Now they need to show that usage can turn into durable margins before cheaper models and procurement teams squeeze the story.

Sources
May 28, 2026