Joint Embedding Predictive Architectures

Awesome JEPA

A curated list of papers, models, code, datasets, and learning resources for Joint Embedding Predictive Architectures (JEPA), the self-supervised approach to world models proposed by Yann LeCun.

JEPA learns by predicting representations rather than reconstructing pixels or tokens. This page collects the canonical work from Meta FAIR alongside the wider research that has grown around it. Every link was checked and every attribution verified against primary sources in June 2026.

103 resources, 8 domains, verified June 2026

What is JEPA?

A Joint Embedding Predictive Architecture predicts the representation of a target signal from the representation of a context signal, entirely in an abstract latent space. Where generative models reconstruct every pixel or token, a JEPA predicts features, so it can discard unpredictable detail and keep the structure that matters for understanding, reasoning, and planning.

A JEPA has three parts: a context encoder, a target encoder, and a predictor that maps context embeddings to predicted target embeddings. Predicting in embedding space admits a trivial solution in which every input maps to the same constant embedding, so JEPAs need a mechanism that prevents this collapse. The two common choices are an architectural asymmetry, such as a stop-gradient target encoder whose weights are an exponential moving average of the context encoder's weights, and an explicit regularizer, such as a variance and covariance penalty on the embeddings.
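The three parts and the two anti-collapse mechanisms described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation of any listed model: the linear encoders, the function names (`ema_update`, `train_step`, `variance_covariance_penalty`), the learning rate, and the momentum value are all assumptions chosen to keep the example small, and the penalty is only a VICReg-style approximation.

```python
import numpy as np


def ema_update(target_w, context_w, momentum=0.99):
    """Move target-encoder weights toward the context encoder (no gradient)."""
    return momentum * target_w + (1.0 - momentum) * context_w


def train_step(params, x_context, x_target, lr=0.1, momentum=0.99):
    """One step of a toy linear JEPA (hypothetical helper, for illustration).

    params = (w_ctx, w_pred, w_tgt): context encoder, predictor, target encoder.
    """
    w_ctx, w_pred, w_tgt = params

    s_ctx = x_context @ w_ctx    # context embedding
    s_pred = s_ctx @ w_pred      # predicted target embedding
    s_tgt = x_target @ w_tgt     # target embedding, treated as a constant

    diff = s_pred - s_tgt
    loss = np.mean(diff ** 2)    # the loss lives in embedding space

    # Manual gradients. Nothing flows into w_tgt: that is the stop-gradient.
    g_out = 2.0 * diff / diff.size
    g_w_pred = s_ctx.T @ g_out
    g_w_ctx = x_context.T @ (g_out @ w_pred.T)

    w_ctx = w_ctx - lr * g_w_ctx
    w_pred = w_pred - lr * g_w_pred
    # The target encoder changes only through the moving average.
    w_tgt = ema_update(w_tgt, w_ctx, momentum)
    return (w_ctx, w_pred, w_tgt), loss


def variance_covariance_penalty(z, eps=1e-4):
    """VICReg-style regularizer: an alternative to the EMA asymmetry.

    The variance term is large when embedding dimensions have near-zero
    spread (collapse); the covariance term is large when dimensions are
    correlated with each other.
    """
    z = z - z.mean(axis=0)
    std = np.sqrt(z.var(axis=0) + eps)
    var_term = np.mean(np.maximum(0.0, 1.0 - std))
    cov = (z.T @ z) / (len(z) - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_term = np.sum(off_diag ** 2) / z.shape[1]
    return var_term, cov_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_c = rng.normal(size=(16, 6))   # stand-in for context patches
    x_t = rng.normal(size=(16, 6))   # stand-in for target patches
    params = (
        rng.normal(scale=0.1, size=(6, 4)),
        np.eye(4),
        rng.normal(scale=0.1, size=(6, 4)),
    )
    params, loss = train_step(params, x_c, x_t)
    print("loss:", loss)
```

Real models replace the linear maps with vision transformers or similar networks and rely on an autodiff framework for the stop-gradient, but the division of roles is the same: gradients update the context encoder and predictor, while the target encoder only follows along.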

This design is the centerpiece of LeCun's proposal for autonomous machine intelligence, where an agent learns a predictive world model in representation space and plans by searching for actions that lead to desired predicted states. The family began with images (I-JEPA) and video (V-JEPA, V-JEPA 2) and now reaches audio, point clouds, graphs, time series, and many scientific domains.
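Planning by search in representation space can be illustrated with a toy example. The sketch below uses random shooting, which is the simplest possible search and is a stand-in chosen for brevity, not the planner of any particular paper. The function `plan_random_shooting`, the additive `toy_predict` dynamics, and all sizes are hypothetical.

```python
import numpy as np


def plan_random_shooting(z0, z_goal, predict, horizon, action_dim,
                         n_candidates, rng):
    """Pick the action sequence whose predicted final latent is nearest the goal.

    z0, z_goal : current and desired latent states, shape (latent_dim,)
    predict    : action-conditioned predictor, (z, a) -> next z, batched
    """
    # Sample candidate action sequences uniformly in [-1, 1].
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))

    # Roll every candidate forward entirely in latent space.
    z = np.repeat(z0[None, :], n_candidates, axis=0)
    for t in range(horizon):
        z = predict(z, actions[:, t])

    # Cost is the distance between predicted and desired representations.
    costs = np.linalg.norm(z - z_goal, axis=1)
    best = int(np.argmin(costs))
    return actions[best], float(costs[best])


def toy_predict(z, a):
    """Stand-in world model: each action shifts the latent state directly."""
    return z + a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z0 = np.zeros(2)
    z_goal = np.array([0.5, -0.5])
    plan, cost = plan_random_shooting(
        z0, z_goal, toy_predict, horizon=3, action_dim=2,
        n_candidates=512, rng=rng,
    )
    print("first action:", plan[0], "predicted cost:", cost)
```

In a real system the predictor would be a learned network, the goal latent would come from encoding a goal observation, and typically only the first action of the plan is executed before replanning.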

Foundations

Core Architectures

The canonical JEPA line from Meta FAIR.

Theory, Analysis, and Recipes

Variants by Domain

Audio and Speech

3D and Point Clouds

Graphs and Molecules

Time Series and Tabular Data

Medical Imaging and Biosignals

Earth Observation and Remote Sensing

Language and Recommendation

Generative Modeling

World Models, Robotics, and Planning

Models and Weights

Code and Frameworks

Datasets

Benchmarks

Physical-reasoning benchmarks released with V-JEPA 2.

Talks and Lectures

Courses

Articles and Explainers

Contributing

Contributions are welcome. Please open a pull request that follows the existing format: link to the primary source, attribute the first author and year accurately, and write one factual sentence describing the resource. Verify that every link resolves and that arXiv identifiers match the cited title before submitting.

License

CC0

To the extent possible under law, the contributors have waived all copyright and related or neighboring rights to this work.