Thanks so so much Kenneth! I'm curious - which biomed capabilities do you mean? Claude downgraded me to Sonnet 4.6 almost 100% of the time when doing research for this series; it seems like any curiosity about bioweapons and virology is seen as suspect by the current model. Is this what you meant?
Nice post! Any thoughts about the claims from this paper that challenges the importance of tacit knowledge in biological weapons development? https://www.rand.org/pubs/perspectives/PEA3853-1.html
Thanks for flagging, Ze Shen! The paper’s main evidence is a man who created a *chemical* weapon (much, much easier than a bioweapon) and his “efforts to document the steps” of constructing viral pathogens, which is definitely not a way to measure tacit knowledge. To assess tacit knowledge, the study would have had to verify in the real world whether this documentation actually increased success, similar to what ActiveSite did. :)
Can absolutely confirm on the pipetting thing. It's a bit like driving. When you can drive it seems very natural. Very few people are able to drive confidently and accurately and consistently after their first lesson/hour of experience.
Thanks Jacob! Good to know! I've not pipetted myself but now I really wish I had minored in biology - it's fascinating.
One of my more traumatic memories of undergrad was my dissertation supervisor getting unnecessarily angry (to a degree I could easily have reported him for) very, very quickly because I wasn't able to pipette like him within the first few tries. (FWIW my proprioception and hand-eye coordination and hand steadiness are all decent; I've picked up stuff from spillikins to target shooting to card flourishes reasonably well.) Like driving, I think people sometimes forget how hard it once was for them to do.
So it's not all good 😉
o.o That's rough - the skill feels almost random! From the interviews, it seemed that some people were very good and others weren't, and it was sometimes years of experience, but sometimes not!
The irony that Claude wrote most of the examples of 'tacit' knowledge :P (of course being able to name a challenge in abstract is far from being able to solve it).
This is a welcome and well written article (my jibe notwithstanding). The importance of tacit and procedural knowledge - both in pushing the frontier of the known/doable and in proliferating access to that - is routinely underappreciated by AI doomers and optimists alike. But the in principle (and plausibly soon - months to years) access and accumulation of tacit knowledge into AI is important to recognise, as you acknowledged towards the end.
Thanks Oliver - I did use Claude, though not for the category of examples, e.g. the choice to focus on pipetting. It did help me get more info about why pipetting is hard - which I then checked with the bio folks. :)
One thing I'm super curious about is whether there's a way to track how much tacit knowledge is left to figure out? It seems hard to measure, but not impossible.
My sarcasm was intended in comradely fashion - I thought the pipetting challenges maybe had AI voice and something like AI vibe (no body: no pipetting experience!). I trust you to have done sufficient diligence for readers to be able to take these as reasonable high-level descriptions and gestures at tacit knowledge bottlenecks.
I'm also curious about that! Some indirect approaches might look at degree (and depth) of automation, maybe something like '(automated) capital intensity', use of AI by lab practitioners, 'productivity' (perhaps as measured by other outputs than crude GDP). Even more indirect, but perhaps telling, might be indicators based on the distribution of years of experience of lab practitioners, and maybe overall number of lab employees in various capacities (also likely tracking number of new labs over time).
More direct... you could do something like surveys (on whom? likely very noisy and with some systematic bias, but could reveal trends). Another approach to estimation might be really fine-grained ecologically valid breakdown of subtasks, though of course this would miss the possibility for automation to 'route around' particular 'hard steps' right now (i.e. playing to comparative advantages of AI might unlock, or simply make prudent, alternative task trajectories that no human lab would pursue).
Do you have thoughts on this question? What about the more general case besides wetlab work?
Thanks, Oliver, these are great points. One thing we've been investigating, at least in self-driving labs, is the fine-grained breakdown of subtasks. This seems to be what any physical automation has to focus on. They're automating very narrow workflows with a lot of effort, e.g. custom-built robotics. Maybe this will end up like how automated vehicles slowly became more automated and gained more abilities: they started out with just lane control, then moved on to specific routes. I think the hardest thing here is that biology is weird and wacky. Every organism itself seems very unpredictable, which makes guessing or forecasting the transition hard. :)
Heh, yep, that makes a lot of sense for self driving. If you bracket out the 'routing around' (or more generally, changing priority distribution over multi-step trajectory paths) problem, this kind of breakdown is a great way to go. (There's also something like the additional task of composing subtasks.) If you start to see low but nonzero subtask success rates, this can also be a leading indicator of overall composed task success rates, naively multiplying success rates at subtasks for an estimate of the composed success rate.
At AISI we did something like this for RepliBench (https://arxiv.org/abs/2504.18565), crudely. The 'science of evals' team at AISI has also done some investigation into that type of approach in general. I don't know where they're at on it - could put you in touch if you like.
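For concreteness, the naive estimate mentioned above (multiplying subtask success rates to get a composed rate) can be sketched in a few lines of Python. This is only an illustration: the function name and the example rates are hypothetical, and the calculation assumes subtasks succeed or fail independently, which real lab workflows may well violate.

```python
import math


def composed_success_rate(subtask_rates):
    """Naive composed estimate: the product of per-subtask success rates.

    Assumption (not from the thread): subtasks are independent, so the
    chance of completing the whole chain is the product of the parts.
    """
    for p in subtask_rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"success rate out of range: {p}")
    return math.prod(subtask_rates)


# Hypothetical, made-up rates for a three-step workflow.
rates = [0.5, 0.2, 0.1]
print(f"naive composed success rate: {composed_success_rate(rates):.3f}")
```

Even moderately low per-step rates compound quickly under this assumption, which is why low but nonzero subtask rates could serve as a leading indicator before the full task succeeds at all.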
Whoa, this is fascinating. I did not realize AISI had a science of evals team! I'll PM you!
Really resonates. AI can give you the steps, but not the instinct that comes from actually doing the work.
Thank you! I really appreciate it
Interesting!
I had some similar thoughts (albeit much less detailed) in comments to this 2023 article: https://forum.effectivealtruism.org/posts/ZuzK2s4JsJcexBJxy/will-releasing-the-weights-of-large-language-models-grant?commentId=wm7JrifbiDXDBWdgf
Fascinating discussion - I did not realize chlorine gas was so easy to make!
From someone who’s worked with DNA and other relatively common molecules that could reasonably be used in creation of a bioweapon, this explanation of tacit knowledge is a slight exaggeration imo.
You can point to a few instances where it may make a difference whether you have completely perfect technique (ultra temperature sensitive proteins were mentioned), but for the vast majority of solutions having basic hand eye coordination and knowledge of a micropipette works. The DNA being manipulated in these cases wouldn’t be in its pure form for the vast majority of the process - it would be in a TRIS or PBS buffer solution with plenty of protecting proteins (about as stable as you get). This shouldn’t need expert level tacit knowledge.
On the point of concentration, the use of spectroscopy is not nearly as confusing as portrayed. Identification of bubbles/various confounders in absorbance is taught in CHM101 labs almost universally. The year I took the AP chem exam there was a question about this! Someone who would enact a plan like this would surely have at least a freshman-in-college-level understanding of these processes, which is plenty.
The point about Aum Shinrikyo is interesting, but it was also a case of freakish incompetence I’m not confident would replicate. Relatively successful attacks (https://en.wikipedia.org/wiki/1984_Rajneeshee_bioterror_attack) are less dramatic and more methodical.
My timeline for bioweapons being a huge issue is not incredibly short, but I think dumping a “tacit knowledge” bandaid on it to explain why isn’t super rigorous.
Thanks Otothebo! This isn’t so much aimed at the simplest scenarios of a bioweapon. Salmonella on a salad bar isn’t typically the type of bioweapon people are worried about in AIxBio discussion; that can already be done pretty easily without AI. This is more geared at bioweapons aimed at larger casualties (over 1,000 deaths?), like what Aum Shinrikyo was aiming for. They were either dealing with larger quantities of anthrax, or likely would have aimed for a transmissible virus for mass casualties. I asked a bunch of practitioners about the bottlenecks to these types of bioweapons, from start to finish.
I’m aware of the scale of AIxBio discussions. That was not the point of the example; it was more to highlight the incompetence of Shinrikyo and to suggest a modern attack would not emulate many of their tactics.
Regardless, that was tangential. The primary reason why I commented was because your examples of tacit knowledge are not actually bottlenecks. Checking with practitioners is great, but it seems that you were asked for cases in which these skills are limiting in general, rather than if they would be limiting in weapon development.
Pipetting ultra temperature sensitive proteins is quite unusual. Working with pure, unprotected DNA is similarly odd. DNA is an amazingly robust molecule in 99% of cases. Spectroscopy is a remarkably simple process, and the tells of when something is off are identifiable by anyone. The large majority of what is described here is grunt work done in biology labs by undergraduates.
These are good things to write about, but this article ironically lacks tacit knowledge itself.
My original question was: Why are bioweapons hard to make? The answer seems to be a big limiter for "Why aren't there more bioattacks" (the focus of Part 1). Bioattacks that don't kill that many people aren't competitive with chemical weapons or bombs. Hence, this essay mostly focuses on a very large quantity of lethal bacteria or a very transmissible and harmful virus.
What is it like to deal with things that aren't just "put E. coli on a salad bar"?
Shearing during pipetting is not unusual. DNA is robust but not RNA, or other components. If you're having to reverse attenuation or constructing from DNA fragments, presumably dealing with pure DNA is a thing you'd do for some viruses that you can't get otherwise, so then yes, you might do that.
It's not possible to explain the details of every candidate virus or bacteria in hyper detail without making the piece ONLY about that one pathogen, so something did get lost by trying to combine the various routes to Bioweapons Deadly Enough to be Worth Attempting (given Part 1).
Tacit knowledge is hard to explain, but it is the crux of why people with decades of wet lab experience don't follow much of the work that comes out in circles that are predominantly focused on "AI will make bioweapons easier". This isn't a perfect article at explaining tacit knowledge, but it seems to be a thing no one else is explaining because it 1. is hard to explain and 2. can seem to undermine advocacy goals of some organizations.
Thanks for the feedback - if you have other ideas for how to explain tacit knowledge for very very deadly (possibly modified and reverse attenuated) viruses or very large quantities of bacteria all the way through the super fine aerosolization methods, that would be great. Explaining easy parts of the hard process or using an example of an easy bio agent (like ricin) wouldn't really help because it's not representative either.
Have also worked with DNA / RNA / proteins and share your view that the tacit knowledge gap seems to be overindexed in AI x bio discourse.
Having an infinitely patient tutor helps you get “shots on target” and iterate on the tacit skills you need for an experiment, provided you can articulate them, and thus shortens your timelines.
I'm not saying that it's impossible; someone could go get lab experience and do infinite tries. Though my Part 1 also mentions that multiple tries and ordering of supplies often degrade secrecy, so people will often default to bombs, chemical weapons, or guns instead. I'm hoping to explain more of the real-world dynamics that are happening.
Mine was not meant as a critique of your article; I agree with the impetus behind most of your points 🙂, and was speaking more of the discourse in general.
Sorry for not clarifying — initial message sent from phone
Oh sorry, makes sense. :) Thanks!