Short Takes #2
Slow Adoption Applies To Evil AI, Too; The Market for AI Coding Tools Could Be Dwarfed by Health Care Administration; Why Do Radiologists Still Have Jobs?; and More
Here once again is Short Takes – a grab-bag of small and, hopefully, interesting items.
Slow Adoption Applies To Evil AI, Too
(Inspired by this section of a recent AI news roundup from Zvi Mowshowitz)
We talk a lot about the gap between the theoretical capabilities of AI and its impact on the broader world. There’s no end of impressive demos and benchmark scores, but results in practice are highly inconsistent. Demos don’t always translate to the real world; it takes time to find & refine new ways of doing things; not everyone is eager to try.
This applies to harmful uses of AI, as well. There have been predictions that, by now, we’d already have seen widespread fraud, cyberattacks, and other impacts from AI. It hasn’t happened to any great extent.
One reason is that for most applications, AI still can’t do much that can’t also be done by hand. It can do some jobs faster or cheaper, but if the market isn’t cost- or supply-limited, that may not have much impact. So, for instance, if the supply of misinformation already exceeds the demand, then deepfakes might not have a profound impact.
In other cases, there is probably room for enterprising criminals to profit by introducing AI into their workflow. But people who have the knowledge, skills, and motivation to innovate in this fashion are more likely to be gainfully employed in more legitimate (or, at a minimum, legal) pursuits. Sadly, I’m sure we’ll see more diffusion of AI into criminal applications over time. But we shouldn’t expect it to happen overnight.
This is good, mostly. But there is a concerning implication relating to open-weight models (AI models that anyone can freely use and even modify, because the “weights” that define the neural network have been published). Open-weight models provide many benefits for competition, research, privacy, and diffusion of power. They also pose unique risks, because it’s impossible to control how they are used, or to recall them if they are found to be used for harm. In practice, the capabilities of open-weight models (like DeepSeek-R1) tend to lag behind “closed” models (like OpenAI’s o3), and it’s often been suggested that this will allow us to detect dangerous use cases before they show up in open-weight models. If the bad guys aren’t holding up their end of the bargain by quickly exploiting opportunities, this might not work out.
Forget Coding, AI Companies Should Go After Health Care Administration
By Steve
I’m listening to the latest episode of one of my favorite podcasts – The Cognitive Revolution. This week’s interview features Ambience Healthcare, a startup that builds AI tools to help health care providers with tasks such as recording visit notes, billing insurers, and following up with patients.
Host Nathan Labenz begins by noting that the US healthcare system “spends $1 trillion annually on administrative tasks”. I’ve read plenty of stories about the burden of medical paperwork, and the hours that doctors and nurses spend on documentation, to say nothing of the nightmare that navigating the system can be for patients. But I hadn’t realized just how large a portion of the economy this was. It seems like a colossal opportunity for AI app developers.
For perspective, US salary costs for medical paperwork appear to be roughly 60% higher than for software development – about $560 billion vs. $350 billion! (These are AI estimates, take them with a grain of salt¹.) The implication is that the market for AI tools in health care administration may be even bigger than the market for coding tools².
Intuitively, I would expect medical paperwork to be easier to automate than software engineering. Paperwork consists of isolated tasks that may require a large amount of generic background knowledge (of medicine, insurance rules, etc.), but the inputs and outputs are well defined (e.g. the patient’s medical record; an insurance form) and each task is discrete and short-lived. AIs are superhuman at acquiring background knowledge, but struggle to manage context (such as a large codebase and unique product requirements) or long, squishy projects. Perhaps I’m missing something, or perhaps the AI industry is simply focusing much harder on coding tools than on medical paperwork.
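To make the “well-defined inputs and outputs” point concrete, here’s a toy sketch of what a bounded paperwork task might look like. Everything in it is invented for illustration – `call_llm` is a hypothetical stand-in for whatever chat-model API you have on hand, and real medical coding is of course far more involved:

```python
# Toy illustration of a bounded paperwork task: structured input in,
# structured output out, no long-lived context to manage.
# `call_llm` is a hypothetical stand-in for any chat-model API call.
import json

def extract_billing_codes(visit_note: str, call_llm) -> dict:
    """Draft structured claim data from a free-text visit note."""
    prompt = (
        "Read the visit note below and return JSON with two keys: "
        "'icd10' (a list of diagnosis codes) and 'cpt' (a list of "
        "procedure codes). Return only JSON.\n\n" + visit_note
    )
    return json.loads(call_llm(prompt))
```

The task begins and ends inside a single call; contrast that with a coding agent, which has to carry a whole codebase and evolving requirements through a long session.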
It’s worth noting that the interview shows Ambience found it necessary to invest heavily in product design. It wasn’t enough to build a general tool for doctors to record notes: every specialty has a different way of doing things; cardiologists record information in a different format than general practitioners. Ambience has had to design specific AI workflows, using multiple off-the-shelf and fine-tuned models, for a long list of distinct specialties and use cases.
This illustrates that there are, apparently, two approaches to applying AI. One is to use the simplest possible product – often, just a generic chatbot – and count on the (ever increasing!) capabilities of the raw models to solve real problems for users. The other is to create purpose-built interfaces and workflows that break down problems into carefully arranged pieces which today’s AI models can handle reliably. Ambience is using the latter approach. Claude Code is closer to the former, minimizing hand-coded components that can supplement models but also constrain them. It’ll be interesting to see under what circumstances each approach wins out!
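As a cartoon of the difference, a purpose-built workflow of the kind Ambience describes might route each case through a specialty-specific pipeline. The specialties, templates, and structure below are invented for illustration, not Ambience’s actual design:

```python
# Cartoon of the "purpose-built workflow" approach: route each case to a
# specialty-specific pipeline instead of relying on one generic chatbot.
# Specialty names and prompt templates are illustrative only.

NOTE_TEMPLATES = {
    "cardiology": "Draft a cardiology note: EF, rhythm, medications, ...",
    "general_practice": "Draft a SOAP note: Subjective, Objective, Assessment, Plan.",
}

def draft_visit_note(transcript: str, specialty: str, call_llm) -> str:
    template = NOTE_TEMPLATES[specialty]  # fail loudly on an unknown specialty
    return call_llm(f"{template}\n\nTranscript:\n{transcript}")
```

The generic-chatbot approach collapses all of this into a single prompt and trusts the raw model to infer the right format on its own.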
Why Do Radiologists Still Have Jobs?
By Steve
Arvind Narayanan posted an interesting musing on radiology jobs:
I find the story of AI and radiology fascinating…
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which human radiologists beat AI. So maybe the "jobs are bundles of tasks" model in labor economics is incomplete. Paraphrasing something @MelMitchell1 pointed out to me, if you define jobs in terms of tasks maybe you're actually defining away the most nuanced and hardest-to-automate aspects of jobs, which are at the boundaries between tasks.
Can you break up your own job into a set of well-defined tasks such that if each of them is automated, your job as a whole can be automated? I suspect most people will say no. But when we think about *other people's jobs* that we don't understand as well as our own, the task model seems plausible because we don't appreciate all the nuances.
If this is correct, it is irrelevant how good AI gets at task-based capability benchmarks. If you need to specify things precisely enough to be amenable to benchmarking, you will necessarily miss the fact that the lack of precise specification is often what makes jobs messy and complex in the first place. So benchmarks can tell us very little about automation vs augmentation.
[Emphasis added]
How can it be that radiologists fail to beat AI at any identified tasks, and yet they are not losing jobs to automation? Here are some possible explanations:
The “enthusiastic embrace” of AI for radiology is still concentrated at a few early adopters, not enough to affect overall employment.
There are tasks at which human radiologists beat AI, they just haven't been identified in a study – probably because they’re precisely the sorts of messy gap-filling work that’s hard to study.
Radiologists could be replaced, en masse, but it hasn't happened due to regulation, resistance to change, etc.
Jevons Paradox: radiologists are losing work to AIs, but the resulting increases in efficiency and capacity leave room for more overall work to be done, thus preserving employment. (Rachel notes that the demand for radiology services is probably pretty inelastic, which casts doubt on this idea – see the toy calculation below.)
I suspect “all of the above” (usually the correct answer to any question about a complex system), with the explanations contributing roughly in the order listed.
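To see why elasticity is the crux of the Jevons argument, here’s a toy model. It assumes prices fall in proportion to productivity gains and that demand has constant elasticity; all numbers are made up:

```python
# Toy Jevons model: employment = quantity demanded / productivity.
# Assumes price falls in proportion to the productivity gain, and
# constant-elasticity demand: quantity ∝ price ** (-elasticity).
# All numbers are illustrative.

def employment(productivity_gain: float, elasticity: float) -> float:
    """Relative employment after a productivity gain (baseline = 1.0)."""
    price = 1 / productivity_gain
    quantity = price ** (-elasticity)
    return quantity / productivity_gain

print(employment(2.0, 1.5))  # elastic demand: ~1.41, employment grows
print(employment(2.0, 0.3))  # inelastic demand: ~0.62, employment shrinks
```

In this toy model, a productivity gain preserves or grows employment only when elasticity exceeds 1, which is why the inelasticity of radiology demand matters so much.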
Further reading:
Your A.I. Radiologist Will Not Be With You Soon – New York Times (non-paywalled link). “Experts predicted that artificial intelligence would steal radiology jobs. But at the Mayo Clinic, the technology has been more friend than foe.”
Reimagining the Economy for the Age of AI
By Taren
We had a packed house at our panel with Natalie Foster, Geoff Ralston, and Anna Makanju on Reimagining the Economy for the Age of AI! Key takeaways for me:
With AI-powered job market disruptions coming down the pike, we need more than ever to guarantee an economic floor for Americans: Access to housing, health care, aged care and childcare, good work, income, and more. Natalie's excellent book, The Guarantee, reports on the progress we've made on these guarantees over the past decade and outlines how we can finish the job.
The COVID crisis showed us that this was all possible. The U.S. basically got forms of UBI, guaranteed housing, and more during COVID. Other countries like Germany also introduced innovative policies, such as large financial incentives to companies to retain their workers during the pandemic -- which allowed the German economy to bounce back quickly after things went back to normal. Where there's political will, there's a way.
Milton Friedman said "Only a crisis -- actual or perceived -- produces real change. When that crisis occurs, the actions that are taken depend on the ideas that are lying around." Meaning: We need to build out and test a set of policy ideas for how to respond to AI's impending impacts on the economy NOW, so that when a crisis hits, those ideas are "lying around."
For more on the guarantees, check out Natalie's book -- it's a great read and very thought-provoking.
Thanks to Bloomberg Beta and Roy E. Bahat for hosting us!
A shout out to other recent SF events
By Rachel
Last week, we ran our second AI & Democracy Coworking Day, which brought together 35 folks working on a variety of issues at this intersection, from democratic governance of AI, to leveraging AI to bolster democratic institutions, to AI-driven threats to democracy.
This week, we hosted a fireside chat with the two primary authors of METR’s latest paper — Joel Becker and Nate Rush. First Steve, and then the audience, got to dig into their motivations for the study, explanations for the result they got, how far to generalize various takeaways, plans for future work, and so much more. You can read Steve’s post about the paper here. Keep an eye out for a follow-up, including a recording of the talk and new insights from the extensive public discussion the paper has inspired.
Thanks to Mox for hosting both of these!
If you’d like to hear about our events before they happen, sign up here to get invites.
¹ I asked Claude Opus 4, ChatGPT o3, and Gemini 2.5 Pro to estimate salaries for medical paperwork and software development.
Health care administration: “Estimate the total annual US salary cost for administration and paperwork involved in health care. Include relevant staff at care providers, insurers, and other relevant organizations, including fractional time for doctors, nurses, and others who spend part of their time on administrative tasks.” o3 estimated $801 billion / year, Gemini 2.5 Pro estimated $471–$589 billion, and Claude Opus 4 estimated $250–$480 billion. Taking the midpoint of each range, and then the average of the three resulting point estimates, yields $565 billion.
For software engineering: “Please estimate annual US salary costs for (a) health care administration (including e.g. the cost for insurers to process claims), and (b) software development.” (I then only used answer (b), as I hadn’t framed (a) well; I rephrased it as shown above.) o3: $200–$250 billion, or (alternative answer) $292 billion / year. Gemini: $225.2 billion. Claude: $540–$585 billion. Averaging o3’s two answers into one point estimate and then averaging across models: $349 billion.
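For transparency, here’s that ensemble arithmetic as a short Python sketch. The dollar figures are just the model outputs quoted above:

```python
# Reproduce the ensemble estimates above: take the midpoint of each
# model's range, then average the resulting point estimates.

def midpoint(lo, hi):
    return (lo + hi) / 2

# Health care administration estimates ($ billions / year)
admin = [
    801,                 # o3 (point estimate)
    midpoint(471, 589),  # Gemini 2.5 Pro
    midpoint(250, 480),  # Claude Opus 4
]

# Software development estimates ($ billions / year); o3's range and its
# alternative answer are first averaged into a single point estimate.
software = [
    midpoint(midpoint(200, 250), 292),  # o3
    225.2,                              # Gemini 2.5 Pro
    midpoint(540, 585),                 # Claude Opus 4
]

print(f"Health care admin: ~${sum(admin) / len(admin):.0f} billion")   # ~$565 billion
print(f"Software dev: ~${sum(software) / len(software):.0f} billion")  # ~$349 billion
```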
Mechanize recently estimated US software engineering wages at $300 billion / year, similar to the ensemble estimate I obtained here, but they didn’t cite a source.
Again, these are estimates for salaries, not total cost.
² This might change over time. As software development productivity increases, we will certainly produce more software. As medical paperwork productivity increases, one hopes we will not choose to produce more paperwork! One hopes…