Epistemic Infrastructure Manifesto
When you need to move merchandise, people, or armies across the land, you find yourself constantly confronted with infrastructure problems:
How to cross this river?
How to move faster so we don’t consume more resources than we can carry?
How to keep the resources safe at night and during winter in dangerous territory?
Now, one greedy approach to these problems is the Roman Military Engineering one: just teach people to build these things quickly, as needed.
But as civilizations, we have recognized that in the long run, and to reach higher and higher levels of capability, investing directly in infrastructure, in bridges, roads, ports, airplanes, and ships, matters.
When you need to serve a chat application running on powerful GPUs to hundreds of thousands of users, you confront IT infrastructure problems:
How to load-balance?
How to ensure everyone sees the same version after a migration?
How to batch and manage requests such that the cluster is sufficient for our needs and resources?
Once again, there is a simple greedy strategy: have the product team take care of it itself, solving each problem with a quick local solution. This helps a lot with iterating quickly, but as a field, we have recognized that investing in IT infrastructure to find more robust, reliable, and quick solutions to these problems is a key factor of growth.
What kind of infrastructure problems do researchers and engineers confront when they have to solve a new technical problem, one for which they can’t find any obvious existing solution?
They confront what I call Epistemic Infrastructure problems, notably:
(Framing) How to define, operationalize, make concrete the technical problem?
(Grounding) How to get grounding, a feedback loop, a signal of progress toward solving this technical problem?
(Solving) How to search through solution space for plausible and promising approaches to try?
Just like we invest in infrastructure, and IT infrastructure, I believe it’s key to invest in epistemic infrastructure for improving our problem solving and innovation capabilities.
In the rest of this piece, I’ll go through the components of epistemic infrastructure in more detail, to give a richer idea of what it means, why it matters, and how it can help.
Framing
Before you can have a solution, you need a problem.
And anyone who has conducted research knows that this can take a long, long time.
The simplest form of framing is a success criterion. If I gave you a potential solution to your technical problem, would you be able to check its correctness? You don’t have to know how far it is from solving it (that’s more about grounding and measures of progress), but you want at the very least to know what success looks like.
This is truly essential, because nobody can directly tackle a technical problem that is not well-defined.
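As a minimal sketch (the sorting problem and the function name here are purely hypothetical), a success criterion can be as small as an executable predicate:

```python
from typing import Callable, List

def is_success(candidate_sort: Callable[[List[int]], List[int]]) -> bool:
    """Success criterion for a toy sorting problem: the candidate is
    correct on a set of spot-check cases, regardless of how it works."""
    cases = [[3, 1, 2], [], [5, 5, 1], list(range(100, 0, -1))]
    return all(candidate_sort(c) == sorted(c) for c in cases)

assert is_success(sorted)  # the built-in trivially satisfies the criterion
```

The point is not the predicate itself, but that it makes “what success looks like” checkable rather than implicit.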
Once you have a success criterion, the next most important piece of framing is a decomposition of your problem. You want to know what the different subproblems are, which ones are easy, which ones are hard, and which ones depend on which others.
These constraints make grounding and solving easier, because they focus the evaluation and the search on the most relevant pieces of the problem.
Note that epistemic infrastructure, even framing, is not just models and words. It also comes in the form of tools and implementations. One example for framing is that a decomposition can be implemented as a partial solution into which solutions to the hard subproblems can be plugged. This literally embeds the decomposition into the tools used to solve the problem.
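To make this concrete, here is a minimal sketch, assuming a hypothetical document-QA problem: the easy subproblem (chunking) is solved inline, while the hard subproblems remain explicit plug-in points:

```python
from typing import Callable, List

def chunk(document: str, size: int = 500) -> List[str]:
    """Easy subproblem, solved once: naive fixed-size chunking."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def answer_question(
    document: str,
    question: str,
    retrieve: Callable[[List[str], str], List[str]],  # hard subproblem 1
    synthesize: Callable[[List[str], str], str],      # hard subproblem 2
) -> str:
    """The decomposition embedded as code: any candidate solution to a
    hard subproblem can be plugged in and tested end to end."""
    chunks = chunk(document)
    relevant = retrieve(chunks, question)
    return synthesize(relevant, question)
```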
Grounding
Once you have a problem, and maybe a decomposition, you need some way to measure your progress.
This is obvious to anyone who has worked in ML or on products, because looking for feedback is second nature in these fields.
Yet it’s worth digging a bit more into what we want out of grounding. This will highlight both how to use benchmarks and product research, and how to improve them.
The simplest possible measure of progress is how close we are to success: for example, a number that we need to increase (or decrease, depending on the framing) to improve our solution.
This is already much better than no signal, but it remains a fundamentally low-dimensional feedback mechanism. Here are some concrete ways to improve it (a code sketch after this list illustrates several of these):
Knowing for which part of the problem we have improved/deteriorated
This lets us model why the changes have this impact, which guides further exploration
Knowing on which subsets we have improved/deteriorated
As opposed to a single number, this highlights whether a change improves the overall score by improving everywhere, or by making one subset worse while improving another even more.
Knowing why we have improved/deteriorated
This lets us debug the solution
Ensuring that the feedback signal is fine-grained
So we don’t just have the equivalent of a binary signal telling us whether we have succeeded, but rather a measure that can guide our daily progress
Ensuring that the feedback signal is quick
The shorter the feedback loop, the more it will be used, and the more the investment in grounding is leveraged
Ensuring that the feedback signal is easy to get
Providing easy interfaces to the feedback signal (a nice benchmark package that is plug-and-play) means that people are more likely to adopt it, spending less time setting it up and more time using it
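As a hypothetical sketch of what this looks like in code (the `Case`, `Report`, and `evaluate` names are mine, not an existing library), a feedback signal that satisfies several of these constraints reports per-subset scores and per-case failure traces behind a single call:

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Case:
    input: str
    expected: str
    subset: str  # e.g. "arithmetic", "long-context"

@dataclass
class Report:
    score_by_subset: Dict[str, float]                    # which subsets moved
    failures: List[dict] = field(default_factory=list)   # why each case failed

def evaluate(solution: Callable[[str], str], cases: List[Case]) -> Report:
    """One-call interface: cheap to run, informative beyond a single number."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    failures: List[dict] = []
    for case in cases:
        got = solution(case.input)
        totals[case.subset] += 1
        if got == case.expected:
            hits[case.subset] += 1
        else:
            failures.append({"input": case.input, "expected": case.expected,
                             "got": got, "subset": case.subset})
    return Report({s: hits[s] / totals[s] for s in totals}, failures)
```

Even this toy version gives subset-level feedback, failure traces for debugging, and an easy interface; speed and granularity then depend on the cases themselves.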
My point is that there’s a lot more to grounding than simply getting some feedback signal. And managing all of these constraints, plus more that I’m probably forgetting, is why benchmark design and product research, among other forms of grounding, are so hard.
As a concrete example, typical ML benchmarks that give you one number (accuracy) or a number per sub-benchmark (praise the HELM) typically suck on many of these additional dimensions: they don’t tell you why a run failed, they’re limited in telling you which subsets improved, and they can take days to set up and run (burn the HELM).
Solving
Now, you have a problem, and you have a decent way to check that you’re making progress. It’s time to actually make progress!
What does this mean? Well, we now have a search problem on our hands: we need to explore solution-space until we find a good solution to the technical problem (or we run out of budget).
We can then ask what helps with doing this kind of search. Many answers come to mind:
Mapping the solution space
If we know the different options, it’s easier to try them systematically
Finding existing empirical evidence
Knowing what has already been tried related to this problem and this solution-space, and how it went
Revealing tradeoffs between different kinds of approaches
Building on a map of solution space and the existing evidence, we can learn what costs are incurred by the different alternatives, and what they are best suited to
Building tools to facilitate search
Not just tools for literal search (through the web, a doc, for papers), but also tools for searching through subparts of solution space. These accelerate exploration by making it easier to build and test things there.
Finding/building convergent components of solutions
Testing combinations of existing components
Solving is often a combinatorial game, and the more we know about which combinations work, the more basis we have for targeted search (see the sketch after this list).
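As a sketch of this combinatorial game (again hypothetical; the `score` argument stands in for whatever grounding signal is available, like the `evaluate` function from the previous section), a targeted search can start as a plain sweep over one option per component slot:

```python
import itertools
from typing import Callable, Dict, List, Tuple

def sweep(
    components: Dict[str, Dict[str, Callable]],      # slot -> named options
    score: Callable[[Dict[str, Callable]], float],   # grounding signal
) -> List[Tuple[Tuple[str, ...], float]]:
    """Score every combination of one option per slot, best first."""
    slots = list(components)
    results = []
    for choice in itertools.product(*(list(components[s].items()) for s in slots)):
        names = tuple(name for name, _ in choice)
        assembled = dict(zip(slots, (fn for _, fn in choice)))
        results.append((names, score(assembled)))
    return sorted(results, key=lambda t: -t[1])

# Toy usage: two slots, two options each -> four scored candidates.
parts = {
    "retriever": {"first": lambda xs: xs[:2], "last": lambda xs: xs[-2:]},
    "ranker": {"asc": sorted, "desc": lambda xs: sorted(xs, reverse=True)},
}
print(sweep(parts, score=lambda p: sum(p["ranker"](p["retriever"]([3, 1, 2])))))
```

Knowing which combinations fail, and why, then prunes this space for the next round of search.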
Embedding Knowledge Into Tools
One trend in the previous sections is that epistemic infrastructure doesn’t simply end at building models, theories, and maps. Those are part of it, but a non-trivial step is embedding this conceptual and theoretical knowledge into tools that automate parts of this infrastructure.
In Framing, the decomposition can be made into a partial solution, which lets you plug and play solution attempts for the hard subproblems
In Grounding, we want to automate our measure of progress and its analysis. We can’t run a benchmark by hand, and we don’t want to have to look at every answer for 2000 test cases by hand to figure out some of the key information we need.
In Solving, convergent components, tradeoffs, and existing approaches become libraries and tools that accelerate the search and exploration by reducing the friction of trying shit out and building solution attempts.
Meta Epistemic Infrastructure
Just like material science helps with traditional infrastructure, and Docker helps with IT infrastructure, there is a place for investments in meta epistemic infrastructure: models and tools which improve our ability to build epistemic infrastructure.
Such meta epistemic infrastructure is useful for two groups of people:
The ones responsible for building epistemic infrastructure
Because it makes them more powerful at building epistemic infrastructure
The ones responsible for solving the actual technical problem
Because it lets them deploy epistemic infrastructure just-in-time, based on their local needs and constraints, without being bottlenecked by an epistemic infrastructure team.
I don’t yet have clean models of this, but I do have a key observation: the later in the process a piece of epistemic infrastructure sits, the easier it tends to be to ground and iterate on.
What I mean is that epistemic infrastructure in the solving stage tends to be more concrete, easier to check, and easier to improve. In the grounding stage, it’s already harder to get a feedback signal on whether a feedback signal is a good proxy, or whether it’s informative in the right ways. And in the framing stage, feedback loops are basically long, because whether a framing is good depends on its downstream consequences: whether it lets you solve the problem or not.
Epistemic Debt
Epistemic infrastructure also comes with costs.
Such costs are most obvious in the grounding stage: if you have a benchmark that sort of works, and it’s a lot of effort to get something better, you tend to stay in your local minimum, which might be far worse than what you could get.
It also happens in the framing stage, as this directs everything downstream of it: notably, if you fuck up the decomposition, it will fuck you up as long as you work on the problem.
And even in the solving phase, your map and components can turn into obstacles, by making an alternative approach far harder to even notice, let alone pursue.
This seems addressable by keeping these costs in mind when building and updating epistemic infrastructure, and maybe by taking inspiration from how we deal with technical debt.
Caveats
Lastly, I want to clarify a point that should be obvious: the idea of epistemic infrastructure is not to delegate all the epistemology and the “real thinking” to specific people, and then have the rest of the technical teams execute these plans.
The idea is simply that by investing explicitly in epistemic infrastructure, by allocating people to it, the returns on such infrastructure can explode: in terms of solving complex problems, of innovation, and of empowering technical teams to solve new problems.
But the people building epistemic infrastructure must work closely with the technical teams solving the actual problems.
Epistemic infrastructure that doesn’t target the concrete problems the team or field is working on, and doesn’t empower the members of the field to tackle more complex problems, is as useless as building a bridge where no road or path leads to it.