Limits to experimental induction

Progress requires experimentation. Suggested ways AI could speed progress by automating experiments appear mistaken.

There’s a folk theory that you do science by performing experiments, which means measuring a slew of similar things, which produces data, and then a statistics person does an inscrutable Statistics Thing which says that all the data add up to a Science Knowledge, so you add it to the giant pile of Science Knowledges, and when you have enough of those you get flying cars. If that were true, and if AI could do the experiments much faster, we’d get flying cars (or the robopocalypse) much sooner.

There are a couple of problems with this: it’s not how science works, and there’s not much reason to think AI could do experiments much faster. If you ask “isn’t it possible in principle for a god-like AI to do this,” the answer is always “yes, by definitional fiat”; but without a “how” that is meaningless.

Inductive reasoning blesses data as knowledge using formal methods. There is no generally correct way of doing this.[1] A statistical analysis is sound only relative to unavoidable assumptions that cannot be justified rationally; particularly a small-world assumption that limits what factors it considers.[2] Those assumptions must be chosen meta-rationally.[3] This means the “artificial scientist” project is quite different from what is usually imagined, and probably much more difficult. We have much less idea how to automate meta-rationality than rationality.

Risk discussions suggest Scary AI would build a vast army of robots to carry out experiments much faster than humans can. Assuming for the sake of argument that its manufacturing prowess somehow made that more feasible for it than for us, there’s not much reason to think the robot army would be useful.

Most wet lab work is inherently serial. You can’t know what to do next until you have the results from the experiment you are doing now.

Relatedly, most scientific breakthroughs come from hands-on experimentation; from “getting a feel” for the subject matter. Empirical studies don’t just test hypotheses; they are where hypotheses come from.

Large-scale laboratory automation is already in routine use in pharmaceutical discovery, along with advanced statistical methods for interpreting the vast quantities of data it produces. These are helpful, but not magic bullets. Scaling them up by building or buying way more robots is perfectly feasible, and wouldn’t be expensive on the scale of pharmaceutical research costs. We don’t do that because it wouldn’t help much. Robot count is not a bottleneck.

Better robots might help more. On the other hand, it’s not clear how AI would be necessary, or even helpful, in creating or making use of them. There’s been almost no advancement in practical robot hands for decades, due to mechanical issues rather than AI limitations. And when parallelizing biology experiments, microfluidics difficulties—getting diverse sorts of sticky glop to not gum up tiny tubes—usually exceed robot limitations.

  1. See “No solution to the problem of induction” in my “Statistics and the replication crisis.”
  2. See “The probability of green cheese” in In the Cells of the Eggplant.
  3. See the “Meta-rational statistical practice” section of “Statistics and the replication crisis” in In the Cells of the Eggplant.