Feedback on reproducible TESSERA pipeline & observations on embedding stability thresholds

Hello Community,

I am seeking feedback on the documentation and workflow implementation for the TESSERA Earth Observation Foundation Model (Cambridge, 2025).

The project focuses on a reproducible Master-Worker architecture designed for resource-constrained environments (Google Colab/L4 GPUs). By applying this to a West African study site (2020–2025), I conducted an ablation-style experiment on cloud-cover thresholds.

Key Finding: I observed a clear 'information ceiling': beyond a certain scene count, adding more scenes yielded diminishing returns in embedding stability. Specifically, a strict 20% cloud-cover filter led to structural breakdown because too few scenes remained, whereas a 35% threshold achieved multi-year convergence across the 128-dimensional embeddings.
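
For anyone who wants to run a similar comparison, here is a minimal sketch of one way such a threshold experiment could be quantified. It is not the pipeline from the README: the function names, the array layout (years x pixels x 128), and the use of mean cosine similarity between consecutive years as a stability measure are all my assumptions for illustration, and the demo data is synthetic.

```python
import numpy as np


def scenes_kept(cloud_cover_pct, threshold_pct):
    """Count scenes whose cloud cover (%) is at or below the threshold.

    Hypothetical helper: assumes one cloud-cover percentage per scene.
    """
    cloud = np.asarray(cloud_cover_pct, dtype=float)
    return int(np.sum(cloud <= threshold_pct))


def interannual_similarity(embeddings_by_year):
    """Mean per-pixel cosine similarity between consecutive years.

    Assumed layout: array of shape (n_years, n_pixels, n_dims),
    e.g. n_dims = 128. Returns an array of length n_years - 1.
    """
    emb = np.asarray(embeddings_by_year, dtype=float)
    norms = np.linalg.norm(emb, axis=-1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero vectors
    cos = np.sum(unit[:-1] * unit[1:], axis=-1)  # shape (n_years - 1, n_pixels)
    return cos.mean(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic cloud-cover values, only to show the scene-count effect.
    cloud = rng.uniform(0, 100, size=200)
    for threshold in (20, 35):
        print(f"threshold {threshold}%: {scenes_kept(cloud, threshold)} scenes kept")

    # Synthetic embeddings: a shared base plus small per-year noise.
    base = rng.normal(size=(50, 128))
    years = np.stack([base + 0.1 * rng.normal(size=base.shape) for _ in range(6)])
    print(interannual_similarity(years))
```

Values near 1 for every consecutive pair of years would be one possible reading of "multi-year convergence"; a sharp drop for a given year would flag the kind of breakdown described above.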

I have two questions for the group:

Technical Review: Does the attached README clearly communicate the trade-offs between compute-heavy inference and the 'Model-as-Data' approach for independent researchers?
Publication Path: I am considering expanding this into a formal technical note or 'Methods' paper. Given the focus on reproducibility and empirical threshold selection, which journals or open-access platforms would be most appropriate for this type of workflow validation?

Link to the draft text:

https://docs.google.com/document/d/1NRdiIPcUkB8OkaEgH1cUzAbm1rQfCVVl/edit?usp=sharing&ouid=110263593530693269745&rtpof=true&sd=true
