Mar 2, 2026

How we transformed ESG data collection at Earthmark and cut costs by building our own AI pipeline.

Written by Katja Ovchinnikova, Technical Lead at Earthmark

At Earthmark, we measure brands' environmental performance using key data points like emissions and waste metrics. When we started, we followed the standard playbook: purchasing data from established providers.

But we quickly hit two major roadblocks. First, the costs were unsustainable for a growing company. Second, and more critically, much of the data was modeled rather than sourced directly from companies' sustainability reports. We weren't getting ground truth; we were getting approximations.


Building our own solution.


In 2025, we took a different approach. We developed an LLM-based pipeline that locates publicly available sustainability reports and extracts the data points we need directly from the source.

Our process was straightforward but rigorous: we manually extracted data from a small training subset, then used that dataset to optimize prompts and parameters. This gave us a robust framework for evaluating different LLM platforms and models. Ultimately, we developed a pipeline we could trust.
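To illustrate the idea, here is a minimal sketch of scoring candidate configurations against a hand-labeled set. It is not the actual Earthmark pipeline: the data point names, the 1% tolerance, the sample values, and the stub extractors (which stand in for real LLM calls) are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (report_id, data_point); hypothetical identifiers
Extractor = Callable[[str, str], Optional[float]]


def is_match(predicted: Optional[float], expected: float, rel_tol: float = 0.01) -> bool:
    """True when the extracted value is within rel_tol of the hand-labeled value."""
    if predicted is None:  # the extractor found nothing
        return False
    if expected == 0:
        return predicted == 0
    return abs(predicted - expected) / abs(expected) <= rel_tol


def score_config(extract: Extractor, gold: Dict[Key, float]) -> float:
    """Fraction of hand-labeled data points that the extractor reproduces."""
    hits = sum(is_match(extract(rid, point), value) for (rid, point), value in gold.items())
    return hits / len(gold)


def make_stub(answers: Dict[Key, float]) -> Extractor:
    """Stand-in for a prompt + model configuration; a real one would call an LLM."""
    return lambda report_id, point: answers.get((report_id, point))


# Made-up gold values, standing in for the manually extracted subset.
GOLD: Dict[Key, float] = {
    ("r1", "scope1_tco2e"): 1200.0,
    ("r1", "waste_t"): 35.0,
    ("r2", "scope1_tco2e"): 980.0,
    ("r2", "waste_t"): 12.5,
}

CONFIGS: Dict[str, Extractor] = {
    # Misses one value entirely; one value is slightly off but within tolerance.
    "config_a": make_stub({("r1", "scope1_tco2e"): 1200.0, ("r1", "waste_t"): 35.0,
                           ("r2", "scope1_tco2e"): 981.0}),
    # Gets both waste figures wrong by a factor of ten (a typical unit slip).
    "config_b": make_stub({("r1", "scope1_tco2e"): 1200.0, ("r1", "waste_t"): 3.5,
                           ("r2", "scope1_tco2e"): 980.0, ("r2", "waste_t"): 125.0}),
}

SCORES = {name: score_config(fn, GOLD) for name, fn in CONFIGS.items()}
BEST = max(SCORES, key=SCORES.get)
print(BEST, SCORES)
```

Keeping the gold set fixed means every prompt, parameter, or model change is judged on the same data, so scores stay comparable across runs.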



The results.


Our new pipeline delivers data that's up to 100x more accurate than modeled data (depending on the category), and it covers approximately 60% of the sustainability reports we found manually for the brands Earthmark works with. As LLM technology continues to advance, we're confident we'll push these numbers even higher in 2026 while expanding our data point catalog.
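The post does not spell out how accuracy and coverage were computed. One plausible reading, sketched below purely as an assumption, is to compare the median percentage error of modeled and extracted values against the figures companies actually report, and to count coverage as the share of manually found reports the pipeline also handles. All names and numbers here are invented for illustration and are not Earthmark results.

```python
from statistics import median
from typing import Dict, Iterable


def abs_pct_error(estimate: float, reported: float) -> float:
    """Absolute error relative to the reported (ground-truth) value."""
    return abs(estimate - reported) / abs(reported)


def median_error(estimates: Dict[str, float], reported: Dict[str, float]) -> float:
    """Median absolute percentage error over brands present in both mappings."""
    shared = estimates.keys() & reported.keys()
    return median(abs_pct_error(estimates[b], reported[b]) for b in shared)


def coverage(extracted_ids: Iterable[str], manual_ids: Iterable[str]) -> float:
    """Share of manually found reports that the pipeline also processed."""
    manual = set(manual_ids)
    return len(set(extracted_ids) & manual) / len(manual)


# Toy values for a single category; not real data.
reported = {"a": 100.0, "b": 200.0, "c": 400.0}
modeled = {"a": 150.0, "b": 100.0, "c": 600.0}
extracted = {"a": 101.0, "b": 198.0, "c": 404.0}

modeled_err = median_error(modeled, reported)
extracted_err = median_error(extracted, reported)
improvement = modeled_err / extracted_err  # how many times smaller the error is

report_coverage = coverage(["r1", "r2", "r3"], ["r1", "r2", "r3", "r4", "r5"])
print(modeled_err, extracted_err, improvement, report_coverage)
```

Computing the ratio per category, as the "depending on the category" caveat suggests, avoids one well-reported metric masking a weaker one.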

There’s also a major tailwind coming: the EU’s Corporate Sustainability Reporting Directive is now mandatory for many large companies. That means sustainability reports will become easier to find, more standardized, and more machine-readable, accelerating improvements across the board.

Our takeaway: even when established solutions already exist, it can still be worth building your own, especially when accuracy, transparency, and long-term scalability really matter.

👉️ Catch up on our webinar, “The Rise of Clear, Benchmarked ESG data with the Help of AI”, where I discussed this topic with Mal Minhas, Board Technology Advisor and Former CTO at Gumtree and Checkatrade.

Work with Earthmark

Learn more about how Earthmark can help you embrace, understand and communicate environmental performance for your brand. 


© 2026 Earthmark Solutions Limited. All rights reserved.

13 Upper High St, Thame, Oxfordshire, United Kingdom OX9 3ER
