Blog

  • Joanna Stern on AI Energy Uses

    Joanna Stern from the Wall Street Journal explores: How Much Energy Does Your AI Prompt Use? I Went to a Data Center to Find Out. Her findings are helpful but not surprising.

    Estimated Energy Usage by AI Task:

    • Text generation: 0.17-1.7 watt-hours (depending on model size)
    • Image generation: About 1.7 watt-hours for a 1024×1024 image
    • Video generation: 20-110 watt-hours for just 6 seconds of video

For context: I can turn off an 8 W LED lamp (60 W equivalent) for an hour and save roughly enough energy to create 5 images. Or, if you have a 4-ton AC, you could turn it off for one hour and generate 40 videos.

    In terms of consumption, a gallon of gas contains 33.7 kilowatt-hours, meaning I could ask ChatGPT nearly 100,000 questions for the same energy cost as driving 26 miles (for the average 2022 model-year vehicle).
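The comparisons above are easy to verify as back-of-envelope arithmetic. A minimal sketch, using the article's per-task estimates; the ~4 kW electrical draw for a 4-ton AC is my assumption, not a figure from the piece:

```python
# Back-of-envelope check of the energy comparisons above.
# Per-task figures are the article's estimates; the AC draw is my assumption.
WH_PER_IMAGE = 1.7          # one 1024x1024 image
WH_PER_VIDEO = 100          # upper end for 6 seconds of video
WH_PER_QUESTION = 0.34      # mid-range text prompt
LED_LAMP_W = 8              # 8 W LED (60 W equivalent)
AC_DRAW_W = 4000            # assumed draw of a 4-ton AC, roughly 4 kW
GALLON_OF_GAS_WH = 33_700   # 33.7 kWh per gallon

images_per_led_hour = LED_LAMP_W / WH_PER_IMAGE            # ~4.7, i.e. about 5
videos_per_ac_hour = AC_DRAW_W / WH_PER_VIDEO              # 40
questions_per_gallon = GALLON_OF_GAS_WH / WH_PER_QUESTION  # ~99,000

print(f"{images_per_led_hour:.1f} images, {videos_per_ac_hour:.0f} videos, "
      f"{questions_per_gallon:,.0f} questions")
```

All three numbers come out close to the article's claims, which is reassuring given how rough the per-task estimates are.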

    I think we ought to be mindful of the environment and be good stewards of our planet, but I think it’s also important to have context behind these numbers. The potential scope of use is huge (7+ billion people), but relative energy consumption per request remains low and declining with silicon improvements.

    Nvidia has seen a jump in energy efficiency with its latest Blackwell Ultra chips, according to Josh Parker, the company’s head of sustainability. “We’re using 1/30th of the energy for the same inference workloads that we were just a year ago,” Parker said.

We saw this with the shift from incandescent to LED light bulbs. The cost of lighting a building, in both energy use and dollars spent, is much lower today than it was 20 years ago. I have every reason to expect the same to happen in computing, particularly in AI.

  • IEEE Spectrum: Large Language Models Are Improving Exponentially

    Recent report predicts a bright future for LLMs:

    That was a key motivation behind work at Model Evaluation & Threat Research (METR). The organization, based in Berkeley, Calif., “researches, develops, and runs evaluations of frontier AI systems’ ability to complete complex tasks without human input.” In March, the group released a paper called Measuring AI Ability to Complete Long Tasks, which reached a startling conclusion: According to a metric it devised, the capabilities of key LLMs are doubling every seven months. This realization leads to a second conclusion, equally stunning: By 2030, the most advanced LLMs should be able to complete, with 50 percent reliability, a software-based task that takes humans a full month of 40-hour workweeks. And the LLMs would likely be able to do many of these tasks much more quickly than humans, taking only days, or even just hours.
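METR's extrapolation is easy to sanity-check. A quick sketch, where the ~1-hour current task horizon is my assumption for a frontier model circa early 2025 and the 7-month doubling time is from the paper:

```python
import math

# Sanity check on METR's extrapolation. The ~1-hour current horizon is an
# assumption; the 7-month doubling time is the paper's measured rate.
current_horizon_hours = 1.0
target_hours = 4 * 40            # one month of 40-hour workweeks = 160 hours
doubling_months = 7

doublings = math.log2(target_hours / current_horizon_hours)  # ~7.3 doublings
months = doublings * doubling_months                         # ~51 months
print(f"{doublings:.1f} doublings, ~{months / 12:.1f} years")
```

Roughly four and a half years from early 2025, which is consistent with the paper's 2030 horizon.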

    As a caveat — I’m not sure how many companies would be satisfied with a 50% success rate for key software. Having an AI tool complete a job that would take a human a full month would be a good thing. But let’s face it, a person still has to determine if the work was done satisfactorily. 50% isn’t a passing grade for any subject.

  • WSJ: CEOs Start Saying the Quiet Part Out Loud: AI Will Wipe Out Jobs

Analysts have been seeing structural changes in the job market related to AI, and now CEOs are admitting it openly. Ford CEO Jim Farley suggests that 50% of white-collar jobs will be trimmed, and JPMorgan executive Marianne Lake foresees a 10% drop in headcount.

    “I think it’s going to destroy way more jobs than the average person thinks,” James Reinhart, CEO of the online resale site ThredUp, said at an investor conference in June.

While Microsoft’s CEO isn’t publicly declaring that AI will cause job losses, the company did announce another reduction this month, bringing its recent layoffs to a total of around 15,000 people.

  • WSJ: How a Bold Plan to Ban State AI Laws Fell Apart—and Divided Trumpworld

As I noted last week, the Congressional effort to block state AI laws in the Big Beautiful Bill lost support and was ultimately stripped from the Senate bill by a lopsided 99-1 vote.

  • Bad (Uses of) AI

    From MIT Technology Review: People are using AI to ‘sit’ with them while they trip on psychedelics. “Some people believe chatbots like ChatGPT can provide an affordable alternative to in-person psychedelic-assisted therapy. Many experts say it’s a bad idea.” I’d like to hear from the experts who say this is a good idea.

    Above the Law: Trial Court Decides Case Based On AI-Hallucinated Caselaw. “Shahid v. Esaam, out of the Georgia Court of Appeals, involved a final judgment and decree of divorce served by publication. When the wife objected to the judgment based on improper service, the husband’s brief included two fake cases.” From the appellate court: “As noted above, the irregularities in these filings suggest that they were drafted using generative AI.”

    Futurism: People Are Being Involuntarily Committed, Jailed After Spiraling Into “ChatGPT Psychosis” “At the core of the issue seems to be that ChatGPT, which is powered by a large language model (LLM), is deeply prone to agreeing with users and telling them what they want to hear.”

    “What I think is so fascinating about this is how willing people are to put their trust in these chatbots in a way that they probably, or arguably, wouldn’t with a human being,” Pierre said. “And yet, there’s something about these things — it has this sort of mythology that they’re reliable and better than talking to people. And I think that’s where part of the danger is: how much faith we put into these machines.”

The Register: AI agents get office tasks wrong around 70% of the time, and a lot of them aren’t AI at all. “IT consultancy Gartner predicts that more than 40 percent of agentic AI projects will be cancelled by the end of 2027 due to rising costs, unclear business value, or insufficient risk controls.” Gartner further notes that much of what agentic “AI” vendors sell isn’t actually AI.

  • Bad Questions & Answers

    Ethan Mollick recently cited a paper that tripped up DeepSeek:

    Garbage in, garbage out. AI tools are still in their relative infancy, and it’s not surprising that confusing queries would lead to useless or misleading results.

    Simon Willison posted a similar idea but with a decided historical bent:

On two occasions I have been asked, — “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

    — Charles Babbage, Passages from the Life of a Philosopher, 1864

    For personal use, I don’t find discoveries like this troubling. I do think that it opens countless avenues for scammers and hackers to trick systems into doing things that we may very well want to avoid.

  • Douthat: Conservatives Are Prisoners of Their Own Tax Cuts

    As a parent of three, point number 2 on Douthat’s opinion piece resonates with me:

    Second (in the voice of a social conservative), the law doesn’t do enough for family and fertility. No problem shadows the world right now like demographic collapse, and while the United States is better off than many countries, the birthrate has fallen well below replacement levels here as well. Family policy can’t reverse these trends, but public support for parents can make an important difference. Yet the law’s extension of the child tax credit leaves it below the inflation-adjusted level established in Trump’s first term.

One of the odd parts of political haggling is how loud some small constituencies can be, particularly those pushing tax deductions for high earners in high-tax states. (Yes, the SALT deduction.) It’s a small group of high earners in a small number of states, yet they’ve been squeaky enough to expand the deduction cap from $10k to $40k. Well done, lobbyists!

    From Claude:

    Expanding SALT deductions would primarily benefit upper-middle-class and wealthy taxpayers earning $100,000+ annually, particularly those in high-tax states like California, New York, New Jersey, and Connecticut, who own expensive homes and face high state and local tax burdens. The benefits become increasingly concentrated among the highest earners, with the top 1% receiving disproportionate benefits from any expansion.

Back to the child tax credit itself. At $2,200, it represents an expansion, but it remains well below the original law’s level in inflation-adjusted dollars. So it seems that our Congress cares more about a handful of high-income earners than it does about a large (and important) swath of the country: parents.
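A rough inflation check makes the gap concrete. Note the ~1.30 cumulative CPI factor for 2017 to 2025 is my approximation, not a figure from the piece:

```python
# Rough inflation check on the child tax credit. The ~1.30 cumulative
# CPI factor for 2017 -> 2025 is an approximation I supplied.
ORIGINAL_CREDIT = 2000       # 2017 Tax Cuts and Jobs Act level
NEW_CREDIT = 2200            # level in the new law
CPI_FACTOR = 1.30            # assumed cumulative inflation, 2017 -> 2025

inflation_adjusted = ORIGINAL_CREDIT * CPI_FACTOR   # ~$2,600
shortfall = inflation_adjusted - NEW_CREDIT         # ~$400
print(f"${inflation_adjusted:,.0f} adjusted vs ${NEW_CREDIT:,}; "
      f"about ${shortfall:,.0f} short")
```

Even with a generous rounding of the inflation factor, the new credit comes up a few hundred dollars short of the 2017 level in real terms.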

  • AI Free Agency

    From the Wall Street Journal: Mark Zuckerberg Announces New Meta ‘Superintelligence Labs’ Unit and a partial reorganization of Meta.

    Mark Zuckerberg announced a new “Superintelligence” division within Meta Platforms, officially organizing an effort that has been the subject of an intense recruiting blitz in recent months.

    Former Scale CEO Alexandr Wang will lead the team as chief AI officer, and former GitHub CEO Nat Friedman will lead the company’s work on AI products, according to an internal memo Zuckerberg sent to employees that was viewed by The Wall Street Journal. 

This follows another WSJ article last week about “The List,” the recruiting effort designed to remedy Meta’s recent disappointing Llama work.

    All over Silicon Valley, the brightest minds in AI are buzzing about “The List,” a compilation of the most talented engineers and researchers in artificial intelligence that Mark Zuckerberg has spent months putting together. 

Facebook’s pivot from virtual reality and the metaverse (Facebook -> Meta) to AI suggests that the metaverse was the wrong bet. I suspect Zuckerberg knows it, too, but this huge spending spree aligns with his ethos: move fast and break things.

    In a world where a really good basketball player (Shai Gilgeous-Alexander) can command $285 million over four years, spending upwards of $100 million per transformative engineer seems like a relative bargain.

  • Maginative: Microsoft’s MAI-DxO Crushes Doctors at Medical Diagnosis while Cutting Costs

    Maginative reports on Microsoft’s new AI Diagnostic Orchestrator and how it outperformed doctors in a recent study. (As an aside, I always wonder about reports that use words like crush in the title. Beware of hyperbole!)

    From the report’s abstract, you’ll find exciting results:

    When paired with OpenAI’s o3 model, MAI-DxO achieves 80% diagnostic accuracy—four times higher than the 20% average of generalist physicians. MAI-DxO also reduces diagnostic costs by 20% compared to physicians, and 70% compared to off-the-shelf o3. 

    A 4x improvement in diagnostic accuracy. This is transformative stuff.

    But when considering the experimental setup:

    Physicians were explicitly instructed not to use external resources, including search engines (e.g., Google, Bing), language models (e.g., ChatGPT, Gemini, Copilot, etc), or other online sources of medical information.

Now the results don’t seem quite so impressive. In fact, given restrictions this far removed from real-world practice, I have a hard time seeing how the report has much utility at all.

  • If AI Lets Us Do More in Less Time—Why Not Shorten the Workweek?

    It’s a good question for work (particularly for white collar roles) — if workers are more productive because of AI, should the workweek be shorter?

    This question is increasingly central to debates about the future of work and closely tied to the growing interest in the four-day workweek. According to Convictional CEO Roger Kirkness, his team was able to shift to a 32-hour schedule without any pay cuts—thanks to AI. As he told his staff, “Fridays are now considered days off.” The reaction was enthusiastic. “Oh my God, I was so happy,” said engineer Nick Wechner, who noted how much more quickly he could work using AI tools.

Aside from his contention for the boss-of-the-year award, Kirkness recognizes the key criterion for success: getting the work done. If the work can be done faster, companies can choose to: (1) reduce the total number of hours worked per employee (without reducing headcount); (2) reduce headcount by a commensurate amount (in Convictional’s case, 20%); or (3) grow the company, doing more work with a similar number of employees.
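The trade-off can be put in simple numbers. A sketch, where the 25% productivity gain is my assumption, chosen so the first option matches Convictional's 32-hour week:

```python
# Sketch of the three options for an AI productivity gain. The 25% figure
# is an assumption chosen to match Convictional's 32-hour week.
productivity_gain = 1.25   # each worker now does 25% more per hour
baseline_hours = 40
baseline_headcount = 100
baseline_output = 1.0

# Option 1: same output, same headcount, fewer hours
new_hours = baseline_hours / productivity_gain          # 32-hour week
# Option 2: same output, same hours, fewer people
new_headcount = baseline_headcount / productivity_gain  # 80 people, a 20% cut
# Option 3: same hours, same headcount, more output
new_output = baseline_output * productivity_gain        # 25% more output

print(new_hours, new_headcount, new_output)
```

The symmetry is the point: a 25% per-hour gain funds a 20% cut in hours, a 20% cut in headcount, or 25% more output, and management gets to pick which.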

As a worker, I’m sympathetic to the idea of a shorter workweek, but I suspect that growth is the more realistic option: employees continue to work similar hours, and increased productivity leads to company growth rather than headcount growth.