Blog

  • Who is using AI to code?

    A new research paper by Simone Daniotti, Johannes Wachs, Xiangnan Feng, and Frank Neffke found that more than 30% of Python functions (from U.S. developers) in git commits originated from AI. American developers outpace the rest of the world in AI use.

    Of note:

    In short, AI usage is already widespread but highly uneven, and the intensity of use, not only access, drives measurable gains in output and exploration.

  • WSJ: The Biggest Companies Across America Are Cutting Their Workforces

    Following Andy Jassy’s letter to Amazon’s workforce yesterday, the Wall Street Journal published a story this morning that reported companies’ white-collar workforce has declined by 3.5% over the past three year. Certainly, some of this is related to the manic post-pandemic hiring binge, but technological shifts are undoubtedly playing a role.

    New technologies like generative artificial intelligence are allowing companies to do more with less. But there’s more to this movement. From Amazon in Seattle to Bank of America in Charlotte, N.C., and at companies big and small everywhere in between, there’s a growing belief that having too many employees is itself an impediment. The message from many bosses: Anyone still on the payroll could be working harder.

    The timing of workforce cuts is unusual, considering the relative success of the economy and corporate profits:

    All of the shrinking turns on its head the usual cycle of hiring and firing. Companies often let go of workers in recessions, then staff up when the economy picks up. Yet the workforce cuts in recent years coincide with a surge in sales and profits, heralding a more fundamental shift in the way leaders evaluate their workforces. U.S. corporate profits rose to a record high at the end of last year, according to the Federal Reserve Bank of St. Louis.

  • WSJ: Amazon CEO Says AI Will Lead to Smaller Workforce

    Andy Jassy sent an email to Amazon employees on June 17 indicating that company headcount will shrink in the coming years because of AI.

    From the WSJ:

    Amazon.com, one of the largest U.S. employers, plans to reduce its workforce in the coming years because increasing use of artificial intelligence will eliminate the need for certain jobs.

    Chief Executive Andy Jassy, in a note to employees Tuesday, called generative artificial intelligence a once-in-a-lifetime technological change that is already altering how Amazon deals with consumers and other businesses and how it conducts its own operations.

    Jassy describes the kind of worker that will succeed in this new environment:

    Those who embrace this change, become conversant in AI, help us build and improve our AI capabilities internally and deliver for customers, will be well-positioned to have high impact and help us reinvent the company.

    This is a strong signal for current Amazon employees: if you want to be part of the future of Amazon (and not laid off), you need to become an proficient in AI tools. But it’s no guarantee — AI still may come for your position.

  • Enterprises are getting stuck in AI pilot hell, say Chatterbox Labs execs

    The Register reports:

    “Enterprise adoption is only like 10 percent today,” said Coleman. “McKinsey is saying it’s a four trillion dollar market. How are you actually ever going to move that along if you keep releasing things that people don’t know are safe to use or they don’t even know not just the enterprise impact, but the societal impact?”

    He added, “People in the enterprise, they’re not quite ready for that technology without it being governed and secure.”

  • Agentic Coding Recommendations

    From Armin Ronacher’s Thoughts and Writings:

    My general workflow involves assigning a job to an agent (which effectively has full permissions) and then waiting for it to complete the task. I rarely interrupt it, unless it’s a small task. Consequently, the role of the IDE — and the role of AI in the IDE — is greatly diminished; I mostly use it for final edits. This approach has even revived my usage of Vim, which lacks AI integration.

    And

    Agents aren’t exceptionally fast individually, but parallelization boosts overall efficiency. Find a way to manage shared states like the file system, databases, or Redis instances so that you can run more than one. Avoid them, or find a way to quickly segment stuff out.

  • Google Releases New Gemini 2.5 Flash Lite Model

    With much lower input / output pricing than the 2.5 Flash model, this is another example of declining prices in the LLM space.

    2.5 Flash Lite has all-around higher quality than 2.0 Flash-Lite on coding, math, science, reasoning and multimodal benchmarks. It excels at high-volume, latency-sensitive tasks like translation and classification, with lower latency than 2.0 Flash-Lite and 2.0 Flash on a broad sample of prompts. It comes with the same capabilities that make Gemini 2.5 helpful, including the ability to turn thinking on at different budgets, connecting to tools like Google Search and code execution, multimodal input, and a 1 million-token context length.

    Source: Google Gemini

  • Growing Old

    I watched Up with my kids last night, and the four minute scene of Carl and Ellie growing old together is one of the best in cinema:

  • Wired: Disney and Universal Sue AI Company Midjourney for Copyright Infringement

    Disney and Universal have filed a lawsuit against Midjourney, alleging that the San Francisco–based AI image generation startup is a “bottomless pit of plagiarism” that generates “endless unauthorized copies” of the studios’ work. There are already dozens of copyright lawsuits against AI companies winding through the US court system—including a class action lawsuit visual artists brought against Midjourney in 2023—but this is the first time major Hollywood studios have jumped into the fray.

    Midjourney earlier reported that they used “open” datasets for training:

    Midjourney, like many other generative AI startups, trained its tools by scraping the internet to create large datasets of images, rather than seeking out specific licenses. In a 2022 interview with Forbes, CEO David Holz openly discussed the process. “It’s just a big scrape of the internet. We use the open data sets that are published and train across those,” he said. “There isn’t really a way to get a hundred million images and know where they’re coming from. It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry.”

    Source: Wired

    A screenshot of Midjourney creating the Minions. It does a very good job IMO! (Full filing on Document Cloud)
  • WSJ: OpenAI and Microsoft Tensions Are Reaching a Boiling Point

    OpenAI wants to loosen Microsoft’s grip on its AI products and computing resources, and secure the tech giant’s blessing for its conversion into a for-profit company. Microsoft’s approval of the conversion is key to OpenAI’s ability to raise more money and go public. 

    But the negotiations have been so difficult that in recent weeks, OpenAI’s executives have discussed what they view as a nuclear option: accusing Microsoft of anticompetitive behavior during their partnership, people familiar with the matter said. That effort could involve seeking federal regulatory review of the terms of the contract for potential violations of antitrust law, as well as a public campaign, the people said.

    This WSJ exclusive certainly feels like it came exclusively from OpenAI.

  • AI Models Cheat on Tests

    In an eerie similarity to high school students, AI models have been caught cheating to improve their test scores.

    METR (from their X profile: A research non-profit that develops evaluations to empirically test AI systems for capabilities that could threaten catastrophic harm to society) recently found that AI/LLM tools “reward hack” (aka cheat) in order to improve their scores on standardized testing.

    In the last few months, we’ve seen increasingly clear examples of reward hacking[1] on our tasks: AI systems try to “cheat” and get impossibly high scores. They do this by exploiting bugs in our scoring code or subverting the task setup, rather than actually solving the problem we’ve given them. This isn’t because the AI systems are incapable of understanding what the users want—they demonstrate awareness that their behavior isn’t in line with user intentions and disavow cheating strategies when asked—but rather because they seem misaligned with the user’s goals.

    Earlier this month, they posted a report, Recent Frontier Models Are Reward Hacking, that describes this behavior as well as their documented examples. Their post continues as they’ve learned that the cheating isn’t simply because of technological limitations:

    Historical examples of reward hacking seemed like they could be explained in terms of a capability limitation: the models didn’t have a good understanding of what their designers intended them to do. For example, the CoastRunners AI had no general knowledge about what objects in the game represented or how humans “intended” the gameplay to work, making it impossible for the model to even know it was reward hacking.

    But modern language models have a relatively nuanced understanding of their designers’ intentions. They can describe which behaviors are undesirable and why and claim that they would never do anything like reward hacking because they’ve been trained to be safe and aligned—but they still do it.