A next-generation folding model that scales up model capacity through width scaling and large-scale data distillation. We also provide SeedFold-Linear, a more efficient variant with linear triangular attention. Both models achieve state-of-the-art results on FoldBench, outperforming AlphaFold3 on most protein-related tasks.
We scale folding models from three perspectives to achieve state-of-the-art performance
Scale the Pairformer width from 128 to 512, increasing model capacity. Training such wide networks poses significant challenges, including memory constraints and training instability, which we address with dedicated engineering solutions.
A novel attention mechanism that reduces computational complexity from O(n³) to O(n²), enabling efficient scaling while maintaining prediction quality.
Construct a 26.5M-sample dataset through distillation from AlphaFold2, expanding the training data by 147× compared to experimental structures.
Comprehensive evaluation across diverse biomolecular structure prediction tasks
| Model | Monomer (lDDT) | Prot-Prot (SR%: DockQ ≥ 0.23) | Ab-Ag (SR%: DockQ ≥ 0.23) | Prot-Lig (SR%: lRMSD < 2Å and lDDT-PLI > 0.8) | Prot-RNA (SR%: DockQ ≥ 0.23) | Prot-DNA (SR%: DockQ ≥ 0.23) |
|---|---|---|---|---|---|---|
| AlphaFold 3 | 0.88 | 72.93% | 47.90% | 64.90% | 62.32% | 79.18% |
| Boltz-1 | 0.87 | 68.25% | 33.54% | 55.04% | 56.90% | 70.97% |
| Chai-1 | 0.87 | 68.53% | 23.64% | 51.23% | 50.91% | 69.97% |
| Protenix-0.5 | 0.8773 | 71.50% | 41.00% | 62.30% | 50.70% | 71.38% |
| SeedFold | 0.8889 | 74.03% | 53.21% | 63.12% | 65.31% | 72.60% |
| SeedFold-Linear | 0.8861 | 74.14% | 46.91% | 66.48% | 61.80% | 76.00% |
SeedFold inherits the AlphaFold3 architecture with key modifications for scaling:
Our experiments demonstrate that width scaling of the Pairformer is the most effective of these strategies. The pair representation dimension is the critical bottleneck: increasing it directly enhances the model's capacity to encode complex pairwise interactions.
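As a rough illustration of what width scaling changes, here is a minimal configuration sketch (the names and defaults are hypothetical, not the released SeedFold code), widening the pair channel of an AlphaFold3-style Pairformer block from 128 to 512:

```python
# Hypothetical configuration sketch (illustrative names, not the released code):
# width scaling widens the pair-representation channel of the Pairformer while
# the rest of the block stays AlphaFold3-like.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PairformerConfig:
    n_blocks: int = 48      # number of Pairformer blocks
    c_single: int = 384     # single (per-token) representation width
    c_pair: int = 128       # pair representation width (AlphaFold3-style default)
    n_heads: int = 4        # triangular-attention heads

baseline = PairformerConfig()                  # c_pair = 128
seedfold_wide = replace(baseline, c_pair=512)  # width scaling: 128 -> 512
print(baseline.c_pair, "->", seedfold_wide.c_pair)
```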
We propose two variants of linear triangular attention to replace the computationally expensive vanilla triangular attention:
φ(Q)φ(K)ᵀ + ψ(B)
Inherits the advantages of vanilla triangular attention while building on well-established linear attention designs.
φ(Q)φ(K)ᵀ ⊙ ψ(B)
A gating mechanism controls information flow; this variant performs better on DNA/RNA tasks (see the sketch of both variants below).
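A minimal, self-contained sketch of the two bias-injection schemes (PyTorch; the feature maps φ and ψ, the row normalization, and the shapes are placeholder assumptions, and SeedFold's actual kernels and complexity optimizations are not reproduced here):

```python
# Sketch of the two linear-attention variants for a single attention slice.
# phi/psi and the normalization are illustrative placeholders.
import torch
import torch.nn.functional as F

def phi(x):
    # Positive feature map commonly used in linear attention (ELU + 1); placeholder.
    return F.elu(x) + 1.0

def psi(b):
    # Non-negative map for the pair bias; placeholder.
    return torch.sigmoid(b)

def linear_attn_additive(q, k, v, b):
    """Weights A = phi(Q) phi(K)^T + psi(B), row-normalized without a softmax."""
    a = phi(q) @ phi(k).transpose(-1, -2) + psi(b)        # [n, n]
    a = a / a.sum(dim=-1, keepdim=True).clamp(min=1e-6)
    return a @ v

def linear_attn_gated(q, k, v, b):
    """Weights A = (phi(Q) phi(K)^T) * psi(B): the pair bias gates every edge."""
    a = (phi(q) @ phi(k).transpose(-1, -2)) * psi(b)      # [n, n]
    a = a / a.sum(dim=-1, keepdim=True).clamp(min=1e-6)
    return a @ v

n, d = 8, 16
q, k, v = (torch.randn(n, d) for _ in range(3))
b = torch.randn(n, n)   # pair bias for one attention slice
print(linear_attn_additive(q, k, v, b).shape, linear_attn_gated(q, k, v, b).shape)
```

Because no softmax couples φ(Q)φ(K)ᵀ and ψ(B), the query-key term can be reassociated as φ(Q)(φ(K)ᵀV) rather than materializing the full attention matrix; the sketch keeps the explicit matrix only for readability.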
*Equal contribution. Listing order is random.
ByteDance Seed
@misc{zhou2025seedfoldscalingbiomolecularstructure,
title={SeedFold: Scaling Biomolecular Structure Prediction},
author={Yi Zhou and Chan Lu and Yiming Ma and Wei Qu and Fei Ye and Kexin Zhang and Lan Wang and Minrui Gui and Quanquan Gu},
year={2025},
eprint={2512.24354},
archivePrefix={arXiv},
primaryClass={q-bio.BM},
url={https://arxiv.org/abs/2512.24354},
}
A diffusion-based model for de novo all-atom protein design. SeedProteo repurposes a cutting-edge folding architecture into a powerful generative framework, achieving state-of-the-art performance in both unconditional generation and binder design.
SeedProteo outperforms open-source baselines across 10 benchmark targets
Comparison of binder design success and diversity. SeedProteo-R (Robust) and SeedProteo-D (Diverse) modes vs. baselines.
Lower novelty scores indicate more novel designs. Best performance per target in bold, second best underlined.
| Method | TrkA | PD-L1 | Insulin | BHRF1 | IL-7RA | SC2RBD | VEGF-A | H1 | IL-17A | TNFA |
|---|---|---|---|---|---|---|---|---|---|---|
| Ours | ||||||||||
| SeedProteo-D | 0.829 | 0.832 | 0.837 | 0.822 | 0.840 | 0.819 | 0.836 | 0.823 | 0.806 | 0.870 |
| SeedProteo-R | 0.905 | 0.913 | 0.911 | 0.872 | 0.917 | 0.858 | 0.901 | 0.890 | 0.855 | -- |
| Baselines | ||||||||||
| BindCraft | 0.849 | 0.856 | 0.864 | 0.847 | 0.861 | 0.863 | 0.850 | 0.830 | 0.818 | -- |
| PXDesign | 0.914 | 0.929 | 0.928 | 0.924 | 0.928 | 0.917 | 0.913 | 0.888 | 0.906 | -- |
| BoltzGen | 0.908 | 0.924 | 0.929 | 0.928 | 0.885 | 0.915 | 0.902 | 0.885 | 0.863 | -- |
| RFDiffusion | 0.932 | 0.934 | 0.927 | 0.946 | 0.916 | 0.912 | 0.940 | -- | 0.938 | -- |
| RFDiffusion3 | 0.808 | 0.834 | 0.876 | 0.845 | 0.840 | -- | -- | 0.930 | 0.800 | -- |
Directly designs proteins at the all-atom level using the atom14 schema, ensuring physically accurate structures.
Reuses denoised structures and predicted secondary structures to stabilize the sampling process.
Uses a Markov Random Field for energy minimization in sequence space, ensuring global consistency (sketched below).
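As a toy illustration of MRF-based sequence design (the potentials, edge set, and coordinate-descent solver below are placeholders, not SeedProteo's actual energy terms or optimizer):

```python
# Toy sketch: minimizing a pairwise Markov-random-field energy over sequence
# space by coordinate descent. All potentials here are random placeholders.
import numpy as np

def mrf_energy(seq, unary, pairwise, edges):
    e = sum(unary[i, seq[i]] for i in range(len(seq)))
    e += sum(pairwise[e_idx, seq[i], seq[j]] for e_idx, (i, j) in enumerate(edges))
    return e

def coordinate_descent(unary, pairwise, edges, n_iters=10):
    L = unary.shape[0]
    seq = np.argmin(unary, axis=1)                      # initialize from unary terms
    for _ in range(n_iters):
        for i in range(L):
            # Score every amino-acid choice at position i with the rest fixed.
            scores = unary[i].copy()
            for e_idx, (a, b) in enumerate(edges):
                if a == i:
                    scores += pairwise[e_idx, :, seq[b]]
                elif b == i:
                    scores += pairwise[e_idx, seq[a], :]
            seq[i] = np.argmin(scores)
    return seq

rng = np.random.default_rng(0)
L, n_states = 30, 20
edges = [(i, j) for i in range(L) for j in range(i + 1, L) if j - i <= 4]
unary = rng.normal(size=(L, n_states))
pairwise = rng.normal(size=(len(edges), n_states, n_states))
seq = coordinate_descent(unary, pairwise, edges)
print(mrf_energy(seq, unary, pairwise, edges))
```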
SeedProteo adapts the AlphaFold3 architecture and introduces a novel Design View that integrates self-conditioning features to guide the generative process.
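A minimal sketch of the self-conditioning loop that such a design builds on (the sampler interface, noise schedule, and DummyDenoiser are illustrative assumptions, not the released model): each denoising step feeds the previous step's denoised estimate back to the network.

```python
# Sketch of a diffusion sampler with self-conditioning: the model receives its
# own previous denoised estimate alongside the current noisy coordinates.
import torch

def sample(model, noise_schedule, n_atoms, device="cpu"):
    x = torch.randn(n_atoms, 3, device=device)   # start from Gaussian noise
    x0_prev = torch.zeros_like(x)                # self-conditioning placeholder
    for t in reversed(range(len(noise_schedule))):
        sigma = noise_schedule[t]
        # Predict the clean structure, conditioned on the previous estimate.
        x0_hat = model(x, sigma, self_cond=x0_prev)
        x0_prev = x0_hat.detach()
        if t > 0:
            # Re-noise toward the next (smaller) noise level.
            x = x0_hat + noise_schedule[t - 1] * torch.randn_like(x)
        else:
            x = x0_hat
    return x

class DummyDenoiser(torch.nn.Module):
    # Stand-in model so the sketch runs end to end.
    def forward(self, x, sigma, self_cond):
        return 0.5 * (x + self_cond)

print(sample(DummyDenoiser(), noise_schedule=[0.1, 0.4, 1.0], n_atoms=64).shape)
```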
Superior scalability, generating valid structures of up to 1000 residues
Unconditional monomer benchmark. Left: Design Success Rate vs Length. Right: Number of Unique Clusters vs Length.
†Work done during internship at ByteDance Seed
ByteDance Seed
@misc{qu2025seedproteoaccuratenovoallatom,
title={SeedProteo: Accurate De Novo All-Atom Design of Protein Binders},
author={Wei Qu and Yiming Ma and Fei Ye and Chan Lu and Yi Zhou and Kexin Zhang and Lan Wang and Minrui Gui and Quanquan Gu},
year={2025},
eprint={2512.24192},
archivePrefix={arXiv},
primaryClass={q-bio.BM},
url={https://arxiv.org/abs/2512.24192},
}