Billions have been invested in scaling up green technologies over the past three decades [1] (anaerobic digestion, biodiesel, cellulosic ethanol, gasification-to-fuels, fast pyrolysis, biochar), with vastly different commercial outcomes. Some technologies, supported by major industrial players and intensive R&D, remain without sustained commercial activity, while others, relatively neglected, have achieved credible deployment [2]. The reasons for these differences are currently hard to analyze. Could alternative funding and technology development strategies foster more sustained innovation? Are we supporting the development of the most cost-effective technologies to reduce GHG emissions, or are some game-changing technologies (more competitive economically, with lower market entry barriers, and more robust operationally) being left behind because their advantages go unacknowledged or their initial development faces higher constraints? Beyond citation metrics, rigorous analyses of how R&D and funding translate into market outcomes remain scarce.
The data, however, exists: proliferating R&D portfolios are now publicly accessible, thousands of commercial projects have been attempted, and hundreds of economic and sustainability analyses have been produced, sometimes at high cost. Yet this knowledge is fragmented and often underutilized. Debates on fostering technological innovation tend to focus on bureaucratic inertia, risk aversion and funding levels. These are real concerns, but they obscure more granular and tractable questions about what is actually going well in the innovation ecosystem (e.g., BETO Peer Review [3]) and what is really required to bring a good idea to life. NSF, BETO, CORDIS, ARPA-E and institutions like VTT operate with different mandates, risk profiles and time horizons. When industrial players such as Mitsubishi [4], GTI [5], and Hitachi work on the same technology alongside publicly funded researchers, their development strategies diverge in ways that are rarely documented or systematically compared. Have industry and research solved complementary problems, or are advances made in one sector going unrecognized by, or remaining inaccessible to, the other (while some challenges are left unaddressed) [5]? To our knowledge, no organization or infrastructure attempts to assemble this knowledge in a way that makes the patterns visible.
Several reasons likely explain why building such infrastructure is structurally challenging. First, recognizing a more promising direction is institutionally difficult: it implies questioning past investments, navigating political constraints, and justifying complex tradeoffs within the time available to make a decision. Second, the data exists but remains inaccessible without dedicated infrastructure; building that infrastructure requires both technical and domain expertise that few combine; and those who do rarely have the institutional power to act on what it reveals. Third, no institutional space currently exists where comparative market data could inform decisions without threatening academic freedom or imposing commercial logic. Yet the technical barriers to building such infrastructure have dropped sufficiently that a small, dedicated team could now act where institutions cannot.
Hypothesis
To innovate, what is missing is often not more data but the infrastructure to benchmark the dense and accelerating flow of information. How institutional datasets compare with one another and with recent knowledge is currently difficult to analyze, as is how R&D portfolios compare with commercial outcomes, or how economic models compare with market realities. Research is costly to produce, so we must ensure that its outcomes are put to use. A rigorous evaluation of a technology's potential requires bringing together multiple perspectives, each with its own data challenges: the commercial track record, the environmental footprint, and the R&D portfolio, for instance. Read in isolation, each is limited; read together and compared across dozens of projects, they reveal patterns that an individual study can hardly capture.
Over two years of PhD work, we have begun to explore this across three complementary perspectives: economic, sustainability, and R&D portfolio evaluation. The pace accelerated sharply with AI-assisted coding: more was built in the last two months than in the entire preceding period. This matters beyond our own experience. Proliferating R&D databases are now publicly accessible via API; modern data management tools have drastically lowered the barrier to linking heterogeneous sources; and AI assistance reduces the expertise required to explore unfamiliar data domains.
We thus hypothesize that the technical barriers to building such infrastructure have dropped sufficiently that a small, dedicated organization could now produce analytical capacity well beyond what funding agencies and institutions currently use, and in a fraction of the time. Given the nature of such work and the multitude of directions it can take, an agile workflow coupled with a time-bounded go/no-go pilot appears to be the most appropriate strategy. A concrete pilot could thus test whether a small team can build, in three months, a modular infrastructure capable of linking heterogeneous data domains for various uses. In two months, a single developer was able to build an infrastructure linking heterogeneous complex objects (process, operational unit, feedstock, literature…) [6], interactive timelines, process schemas, economic models [7], one module to batch AI call requests on object data [8], one module to extract and filter large R&D datasets [9], etc. A team with the proper know-how could likely do more on a broader range of topics.
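To make the idea of linking heterogeneous objects more concrete, the following is a minimal Python sketch, not the infrastructure described above. It assumes a simple in-memory store; the names `Entity`, `LinkStore`, `link` and `related`, the relation labels, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One record from any data domain (the kinds used here are illustrative)."""
    kind: str  # e.g. "process", "operational_unit", "feedstock", "literature"
    key: str   # identifier unique within its kind
    name: str


@dataclass
class LinkStore:
    """Hypothetical in-memory store of entities and typed, bidirectional links."""
    entities: dict = field(default_factory=dict)
    links: dict = field(default_factory=lambda: defaultdict(set))

    def add(self, entity: Entity) -> Entity:
        self.entities[(entity.kind, entity.key)] = entity
        return entity

    def link(self, a: Entity, b: Entity, relation: str) -> None:
        # Record the link in both directions so it can be followed from either side.
        self.links[(a.kind, a.key)].add((relation, b.kind, b.key))
        self.links[(b.kind, b.key)].add((relation, a.kind, a.key))

    def related(self, entity: Entity, kind: str) -> list:
        """Return the entities of the given kind linked to `entity`, sorted by key."""
        found = self.links.get((entity.kind, entity.key), set())
        targets = {(k, key) for _, k, key in found if k == kind}
        return [self.entities[t] for t in sorted(targets)]


# Made-up sample records, for illustration only.
store = LinkStore()
pyrolysis = store.add(Entity("process", "p1", "fast pyrolysis"))
straw = store.add(Entity("feedstock", "f1", "wheat straw"))
report = store.add(Entity("literature", "l1", "placeholder report"))
store.link(pyrolysis, straw, "uses")
store.link(pyrolysis, report, "described_in")

feedstock_names = [e.name for e in store.related(pyrolysis, "feedstock")]
```

Keeping every domain behind the same `(kind, key)` addressing scheme is one possible design choice: a new data domain can then be added as a new `kind` without changing the linking logic. A real system would likely need persistent storage and richer link metadata.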
Economic analysis examples: how policy and R&D priorities can diverge from market realities
Evaluating the economic and commercial potential of bioeconomy technologies requires assembling data of inherently limited quality: techno-economic assessments built on idealized assumptions, public announcements, press releases, etc. The examples below illustrate what structured comparison reveals. Achieving the expected performance for new technologies often takes more than five years [6], and developing them spans decades [4]. Technologies that cannot reach commercial viability at smaller scales have a consistently poor commercialization record [10], [11], while smaller-scale approaches show more resilience: biochar has grown rapidly over the past 10 years [12], and some cellulosic ethanol and fast pyrolysis plants have survived despite persistent performance challenges and difficult market conditions [6]. Market selection failures are costly and sometimes foreseeable [11]. And economic models used in R&D evaluations routinely rest on assumptions that commercial experience has quietly invalidated [13].