The Latent Shape of Discovery
Published:
Fairly speculative, probably underformalized in several places, although I think the core question is important enough that it is worth trying to make the shape of it visible.
I have recently been thinking a lot about the intrinsic space of human knowledge, by which I do not mean the internet or the scientific literature, but something closer to the compressed structure that would remain if you took everything humans know, stripped away the surface forms, and asked what the actual topology of the thing was.
This is a slightly dangerous metaphor, because the moment we say “space” we begin smuggling in assumptions about distance, continuity, density, curvature, and dimension, while the actual object we care about might be something stranger than a manifold and closer to a stack of partially aligned representational systems that only locally behave like a geometric object. Still, the metaphor is useful, since if knowledge has a latent geometry then scientific discovery becomes a question about where we are in that geometry, what regions are populated by existing theories and observations, what regions are sparse yet reachable, and what regions only become meaningful after experiments in the world.
One crude version of this story says that language models are trained on a huge projection of human knowledge and therefore learn some compressed representation of this knowledge inside their activations, after which generation becomes a kind of movement through that representational space, producing new strings that are locally plausible because they lie near high-density regions of human thought. This version is almost certainly too simple, partly because tokens feel like the wrong atoms for thinking about knowledge and partly because the model is doing a non-linear computation over many interacting representational layers. However, the basic picture still captures something important about why these systems can recombine ideas in ways that feel intelligent. Maybe the right unit is a sentence, maybe it is a concept, maybe it is a causal model with a handle on how the world would respond under intervention, and maybe different sciences operate with different units.
If we are asking whether AI can discover new knowledge, this choice of unit matters a lot, since “newness” at the level of text is cheap, “newness” at the level of a plausible hypothesis is more interesting, “newness” at the level of a verified causal mechanism is much harder, and “newness” at the level of a new scientific worldview is paradigm-changing. A language model can generate a sentence nobody has written before, and almost nobody cares, while a system that proposes a strange drug repurposing hypothesis that survives wet-lab validation has moved a different kind of object through the space.
In this context, I find the interpolation versus extrapolation debate both relevant and somewhat unsatisfying. The convex hull story, in which a model interpolates when a new input falls inside the convex hull of its training examples and extrapolates otherwise, makes sense mathematically, but the thing we actually care about is whether the model has learned transformations that preserve contact with the causal structure of the world as it moves into low-density regions. Yann LeCun and others have argued that in high dimensions nearly everything is extrapolation under the convex hull definition, and this is useful, but it also seems too blunt for the question of discovery because a point can be outside the convex hull of training examples while still sitting on a meaningful low-dimensional structure that the model has captured. The more interesting question is probably local generalization on the right intrinsic representation. Scientific intelligence often means following a lawful direction away from what is already known, guided by invariances, conservation laws, mechanistic constraints, and repeated measurements.
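A minimal sketch of why the convex hull definition is so blunt, under assumptions that go beyond the text: training and test points are drawn from the same standard Gaussian, and membership in the axis-aligned bounding box of the training set stands in for hull membership. The convex hull sits inside that box, so any point outside the box is also outside the hull, which makes the measured fraction a lower bound on how often a fresh sample counts as extrapolation. The function name and sample sizes are illustrative.

```python
import random


def fraction_outside_bounding_box(n_train, n_test, dim, seed=0):
    """Fraction of fresh samples falling outside the training bounding box.

    Outside the box implies outside the convex hull, so this is a lower
    bound on the fraction that is "extrapolation" under the hull definition.
    """
    rng = random.Random(seed)
    train = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_train)]
    # Per-coordinate extremes of the training set define the box.
    lows = [min(point[j] for point in train) for j in range(dim)]
    highs = [max(point[j] for point in train) for j in range(dim)]

    outside = 0
    for _ in range(n_test):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # One coordinate beyond the training range is enough to leave the box.
        if any(x[j] < lows[j] or x[j] > highs[j] for j in range(dim)):
            outside += 1
    return outside / n_test


if __name__ == "__main__":
    for dim in (2, 20, 200):
        frac = fraction_outside_bounding_box(n_train=200, n_test=500, dim=dim)
        print(f"dim={dim:4d}  fraction outside box: {frac:.2f}")
```

Every test point here comes from exactly the training distribution, yet the chance of staying inside the box in all coordinates is ((n - 1) / (n + 1))^d for n training points with independent continuous coordinates, which shrinks quickly as d grows. The definition alone therefore says little about whether a model has captured lawful structure.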
I think the future of AI discovery may depend less on whether models interpolate or extrapolate in a high-dimensional space and more on whether they can learn the right coordinates for a domain, then traverse those coordinates in a meaningful way when the density of human examples gets thin. If large models contain somewhat disentangled features corresponding to concepts, relations, capabilities, and higher-order abstractions, then the latent space of knowledge may become an object that can be partially mapped, probed, edited, and eventually navigated. The dream here is to take a scientific domain, decompose the internal representations of a capable model, identify the regions corresponding to existing methods and theories, then deliberately search the sparse spaces between them for compositions that are surprising, coherent, and experimentally actionable. This would make AI science like running a search process over a structured epistemic landscape, where value is measured by the expected information gained from testing the idea. Something like this is already beginning to happen in a primitive way, although the current systems are still mostly scaffolded agents rather than autonomous scientists.
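One way to make "expected information gained" concrete, as a sketch under assumptions that are not in the text: there is a finite set of rival hypotheses with prior probabilities, an experiment has a binary outcome, and each hypothesis assigns a known probability to the positive outcome. The value of the experiment is then the prior entropy minus the expected posterior entropy, measured in bits. All names below are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihood_positive):
    """Expected entropy reduction from one binary-outcome experiment.

    prior[i]               = P(hypothesis i)
    likelihood_positive[i] = P(outcome is positive | hypothesis i)
    """
    p_positive = sum(p * l for p, l in zip(prior, likelihood_positive))
    likelihood_negative = [1.0 - l for l in likelihood_positive]

    gain = entropy(prior)
    for p_outcome, likelihood in (
        (p_positive, likelihood_positive),
        (1.0 - p_positive, likelihood_negative),
    ):
        if p_outcome > 0:
            # Bayes update for this outcome, weighted by how likely it is.
            posterior = [p * l / p_outcome for p, l in zip(prior, likelihood)]
            gain -= p_outcome * entropy(posterior)
    return gain


if __name__ == "__main__":
    prior = [0.5, 0.5]
    print(expected_information_gain(prior, [1.0, 0.0]))  # decisive experiment
    print(expected_information_gain(prior, [0.5, 0.5]))  # uninformative one
```

Under this framing, an experiment whose outcome every hypothesis predicts equally well is worth nothing however novel it looks, while one that splits the hypotheses cleanly is worth the full prior uncertainty.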
FunSearch is a good example because it couples a language model to an evaluator, evolves programs, and produces verifiable outputs in mathematics and computer science, where the evaluator gives the search process feedback. GNoME is another version of this pattern in materials science, where scaled graph networks search huge regions of inorganic crystal space and generate candidates that can later be validated and physically synthesized, which makes the discovery loop like the expansion of a map. AlphaFold and AlphaFold 3 show yet another version, where biological structure becomes predictable enough that an enormous region of protein and biomolecular interaction space becomes computationally navigable before every point has been physically measured.
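The generator-plus-evaluator pattern can be sketched in a few lines. This is a toy under loud assumptions, not FunSearch itself: a random mutation stands in for the language model, the search evolves candidate sets directly rather than programs, and the task (finding a large subset of {0, ..., n-1} with no three-term arithmetic progression) is chosen only because a candidate can be verified exactly and cheaply. All function names and parameters are hypothetical.

```python
import random


def has_three_term_ap(candidate):
    """Return True if the set contains a < b < c with b - a == c - b."""
    items = sorted(candidate)
    members = set(items)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if 2 * b - a in members:
                return True
    return False


def evaluate(candidate):
    """Evaluator: size of a valid set, or -1 if the constraint is violated."""
    return -1 if has_three_term_ap(candidate) else len(candidate)


def propose(parent, n, rng):
    """Stand-in for the generator: toggle one random element of the parent."""
    child = set(parent)
    child.symmetric_difference_update({rng.randrange(n)})
    return frozenset(child)


def search(n=30, steps=2000, pool_size=8, seed=0):
    rng = random.Random(seed)
    pool = [frozenset()]  # the empty set is trivially valid
    for _ in range(steps):
        child = propose(rng.choice(pool), n, rng)
        if child in pool or evaluate(child) < 0:
            continue  # reject duplicates and anything the evaluator refuses
        pool.append(child)
        # Every pool member is valid, so its size equals its evaluator score.
        pool = sorted(pool, key=len, reverse=True)[:pool_size]
    return max(pool, key=len)


if __name__ == "__main__":
    best = search()
    print(sorted(best), "score:", evaluate(best))
```

The generator here is deliberately dumb. The point of the sketch is that the evaluator, not the proposer, guarantees that whatever comes out is a valid object, which is why the pattern tolerates an imperfect model.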
The recent Co-Scientist work pushes us closer to explicit scientific cognition, because it uses a coalition of agents to generate and evolve hypotheses and to run a tournament-like process in which test-time compute is spent on improving the quality of ideas. The AI Scientist line of work is even more interesting because it tries to automate the whole loop in machine learning research, from idea generation to experiments to paper writing, and although the current outputs remain bounded by workshop-level standards and digital-only experimental domains, the direction of progress is interesting.
Coscientist, autonomous laboratories, and the broader self-driving lab movement take the next step by giving the system robot-mediated access to instruments, which means the model is no longer only moving through text or code but also participating in the cycle through which the world generates new evidence. This is the crux of the matter for scientific discovery. The latent space of human knowledge can be filled in by recombination for a surprisingly long time, yet at some point the model has to earn the right to believe its own new concepts by making them collide with reality.
A purely textual AI scientist will be strongest in domains where evaluation is formal and fast, which is why mathematics, programming, and certain kinds of computational science are such natural early targets. Biology and chemistry are different because the space of plausible hypotheses is vast, the data are biased by what humans have chosen to measure, and the hidden procedural knowledge of the lab is often absent from papers because scientists treat it as background craft. This tacit layer matters more than people in AI sometimes appreciate, because real laboratories contain friction at every level, from ambiguous assays to finicky reagents to instruments that behave differently on a humid afternoon, and a model trained mostly on papers misses the messy process by which science was made. The discovery system has to incorporate failed experiments and negative results if it is going to learn the true topology of a scientific domain.
The published literature is an extremely strange training distribution because it is filtered for prestige and narrative coherence, while the actual process of discovery contains far more failed branches than successful ones. A model trained on the literature therefore inherits a map in which successful paths are overrepresented and dead ends are underrepresented, which creates a danger when the model begins proposing experiments at scale. The danger is that AI may accelerate the existing incentives of science rather than expand science itself, producing more plausible hypotheses and more activity in data-rich regions, while leaving the truly weird sparse regions even more neglected. The recent finding that AI-augmented scientists may publish more and get cited more while the collective focus of science narrows suggests that acceleration alone can create an epistemic monoculture if every agent learns to run toward the same gradients. AI could fill in the nearby gaps in the latent manifold very quickly while making the distant continents less likely to be explored, simply because the nearby gaps have better benchmarks, cheaper validation, and denser reward.
I am wary of a simplistic “AI will automate science” frame, even though I am quite bullish on AI accelerating discovery. Automation tends to preserve the structure of the existing workflow while making it faster, whereas genuine discovery often requires changing the representation, changing the measurement apparatus, or changing the question. A very capable AI scientist might look more like a system that notices when the coordinate system itself is bad. It would ask whether a field is using the wrong ontology, whether an experimental proxy has become detached from the phenomenon it was meant to measure, and whether a sparse region of the knowledge space is empty because it is unpromising or because nobody has ventured there. This is where the question of “new knowledge” becomes less binary. Non-trivial composition of existing ideas can be new in the same way that a new proof is new, because the components may be old while the path through them was previously invisible. Interpolation on a learned manifold can be new if the manifold captures a real structure and the interpolated point corresponds to a hypothesis that humans had not articulated. Extrapolation can be useless if it points into nonsense, and interpolation can be profound if it reveals a hidden bridge between two dense regions that were historically separated by discipline, notation, or social structure.
The interesting space is something like verified epistemic expansion, where a model proposes a mechanism or intervention that changes what competent scientists can reliably do. This definition makes AI discovery continuous with human discovery, because humans also discover by recombination, analogy, error correction, and contact with the world, while our advantage has historically been that we are embodied agents embedded in social institutions. The missing ingredient in current AI systems may be taste, where taste means the learned ability to choose problems whose answers would matter, to abandon sterile directions, and to sense that some anomaly deserves a year of attention. The future AI scientist needs many forms of taste, including mathematical taste for the fruitful lemma, experimental taste for the assay that will discriminate between mechanisms, engineering taste for the system that can be built, and sociological taste for the question a field is ready to understand.
I suspect that the first transformative AI-for-science systems will be hybrid institutions, with foundation models, domain simulators, robotic labs, human experts, automated verification systems, and scientific communities all coupled together into increasingly tight feedback loops. The model will explore the latent space, the simulator will cheaply reject some ideas, the lab will test the most informative ones, the literature agent will update the map, and the human scientist will shape the objective. Over time, however, the human role may move upward in abstraction to defining taste, values, and research programmes, while the machine population explores many local branches. This transition will feel strange because science has always been an identity and a status system, and suddenly some of the visible artefacts of scientific labour, especially papers and analyses, will become cheap enough that they lose much of their signalling value.
If AI can generate a workshop paper, then the paper becomes a weaker proof of intellectual labour, and the real object of evaluation shifts toward the originality of the question and the long-term usefulness of the resulting knowledge. This is already happening through the failure mode of hallucinated citations, where language models can produce scholarly-looking references that slip into the scientific record, which is almost a dark parody of AI science because the system fills the manifold with fake points that look plausible and corrupt the map. A scientific ecosystem with AI-generated hypotheses and AI-written papers will need much stronger epistemic immune systems, because a polluted literature becomes training data, polluted training data becomes future generation, and future generation can then launder the pollution back into the literature. The solution is probably boring: citation verification, provenance tracking, executable claims, linked datasets, machine-checkable proofs where possible, automated replication where feasible, and a cultural shift in which impressive output matters less than traceability.
For AI discovery to work, we need to make the feedback channels from reality much higher bandwidth, because once a domain has a good evaluator, even an imperfect model can become useful through search. The history of recent AI breakthroughs keeps repeating this pattern, where a generative system becomes powerful when coupled to a verifier, an environment, a simulator, or an experimental loop. In math, the verifier can be a proof checker or an executable program. In materials, the verifier can begin with DFT and end with synthesis. In biology, the verifier might be a perturbation experiment, a binding assay, a cellular phenotype, or eventually a virtual cell accurate enough to make the first pass through hypothesis space much cheaper.
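The pattern of a cheap verifier feeding a more expensive one, as in the DFT-then-synthesis example, can be sketched as a staged funnel. Everything specific here is a made-up assumption for illustration: each candidate is represented only by a hidden true quality, each stage observes that quality with Gaussian noise, and the costs are arbitrary units chosen so the arithmetic is easy to follow.

```python
import random


def run_funnel(candidates, stages):
    """Pass candidates through stages of rising cost and fidelity.

    Each stage is (name, cost_per_candidate, score_fn, keep): every survivor
    is scored at that stage's cost, and only the top `keep` move on.
    """
    survivors = list(candidates)
    total_cost = 0.0
    for name, cost, score_fn, keep in stages:
        total_cost += cost * len(survivors)
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep]
    return survivors, total_cost


def noisy(noise, rng):
    """Build a scorer that sees the hidden quality plus Gaussian noise."""
    return lambda quality: quality + rng.gauss(0.0, noise)


rng = random.Random(0)
# Hypothetical setup: each candidate is just its hidden true quality.
candidates = [rng.gauss(0.0, 1.0) for _ in range(1000)]
stages = [
    ("model prior", 1.0, noisy(1.0, rng), 100),  # fast and very noisy
    ("simulator", 50.0, noisy(0.3, rng), 10),    # slower, less noisy
    ("lab", 1000.0, noisy(0.0, rng), 3),         # slow, treated as exact
]
finalists, cost = run_funnel(candidates, stages)
lab_only_cost = 1000.0 * len(candidates)
print(f"funnel cost: {cost:.0f}, lab-only cost: {lab_only_cost:.0f}")
```

With these invented costs the funnel spends 16,000 units against 1,000,000 for sending every candidate to the lab. The price is that a good candidate can be discarded by a noisy early stage, which is one way to see why the fidelity of the intermediate evaluators matters as much as the size of the model.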
The grand challenge is that many of the most important scientific questions have slow, expensive, and noisy evaluators, and AI progress in those domains will depend on building better instruments and better intermediate worlds as much as building larger models. This is where world models become central, because a scientific world model is a compressed laboratory that lets you ask counterfactual questions before spending scarce physical experiments. A good world model of proteins, cells, climate, materials, or economies would act as a dense approximation to parts of reality, giving AI systems a smoother landscape over which to search, while still requiring periodic anchoring to the real world. The exciting future is where science becomes a nested set of loops operating at different speeds, with language models exploring hypotheses in seconds, simulators filtering them in minutes or hours, autonomous labs testing them over days or weeks, and human communities integrating the results over months and years. At each level, the latent space gets reshaped. The model proposes a new point, the evaluator accepts or rejects it, the lab gives the point empirical weight, the literature absorbs it, the next model trains on a slightly changed world, and the intrinsic space of human-machine knowledge becomes denser in places that were previously unreachable.
This is the strongest version of the argument that AI can discover truly new knowledge, because discovery is a dynamical property of a closed loop that can move the frontier. The weaker version says that LLMs merely recombine what humans have already written, which is sometimes true for today’s uses and sometimes a category error when the recombination is coupled to verification. Humans also recombine what we have seen, heard, and read, and the dignity of human discovery comes from the fact that some recombinations survive reality and then change the space of future thought. The open question is whether AI systems can develop a sufficiently rich hierarchy of abstractions to propose recombinations at the right level.
Perhaps the next step is to give models explicit access to objects at many levels at once, including mechanisms, protocols, code, datasets, causal graphs, and experimental affordances. A scientific AI would then reason across these levels, and the deepest discoveries may come from this cross-level movement. Many scientific revolutions happen when a field learns that its objects were drawn at the wrong scale. Genes became regulatory networks, proteins became dynamic ensembles, diseases became heterogeneous mechanisms, ecosystems became coupled adaptive systems, and intelligence itself may become more like a property of loops between models, tools, environments, and communities.
If this is right, then the AI scientist of the future will be a new layer in the civilizational discovery process. It will traverse the latent manifold of existing knowledge, generate possible continuations, test them against increasingly rich surrogate worlds, route the best ones into physical experiments, and update the shared map when reality gives feedback. Some of the answers will be boring, some will be false, some will be dangerous, and some will be strange enough that humans would have taken decades to formulate them. The question then becomes institutional. How do we design scientific systems that reward exploration of sparse regions, preserve epistemic diversity, prevent pollution of the literature, keep humans responsible for values and problem choice, and still let machines search at scale? I think the right mental image is the slow construction of an expanded scientific sensorium, where humanity learns to perceive regions of possibility that were previously invisible because our brains, institutions, and instruments were too bandwidth-limited to hold them.
In that world, the latent space of human knowledge becomes a living search surface. The models will fill in the nearby voids first, and the real prize will be learning how to cross the deserts between dense regions without hallucinating an oasis.