# GFlowNet-Peptide: Diverse Therapeutic Peptide Generator
Generate diverse therapeutic peptide candidates using Generative Flow Networks (GFlowNet).
## Model Description

This model generates peptide sequences by sampling in proportion to a predicted fitness reward, P(x) ∝ R(x), which naturally produces diverse candidates without explicit diversity penalties. Unlike reward-maximizing methods (PPO, GRPO) that converge to narrow sequence families, a GFlowNet samples from the full distribution of high-reward sequences.
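As a toy illustration of the sampling objective (not the trained policy itself), the sketch below contrasts reward maximization with drawing candidates in proportion to their reward; the candidate names and reward values are made up.

```python
import torch

# Hypothetical candidates and rewards, purely for illustration.
candidates = ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED"]
rewards = torch.tensor([0.9, 0.8, 0.7, 0.1])

# Reward maximization collapses onto a single mode every time.
print("argmax pick:", candidates[rewards.argmax().item()])

# GFlowNet-style sampling draws each candidate with probability R(x) / Z,
# so all high-reward modes keep being visited over many draws.
probs = rewards / rewards.sum()
draws = torch.multinomial(probs, num_samples=1000, replacement=True)
counts = torch.bincount(draws, minlength=len(candidates))
for seq, c in zip(candidates, counts.tolist()):
    print(f"{seq}: sampled {c} times")
```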
## Why Diversity Matters
In therapeutic peptide development, computational predictions often fail wet-lab validation. When a generative model produces candidates from a single structural family (mode collapse), a single point of failure can invalidate the entire set. GFlowNet provides diverse candidates spanning multiple scaffolds, giving wet-lab teams multiple independent shots at success.
## Architecture
| Property | Value |
|---|---|
| Type | Causal Transformer (decoder-only) |
| Parameters | 3.2M |
| Layers | 4 transformer blocks |
| Hidden dim | 256 |
| Attention heads | 8 |
| Feedforward dim | 1024 |
| Vocabulary | 23 tokens (20 amino acids + START/STOP/PAD) |
| Max length | 64 tokens |
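A minimal sketch of how the 23-token vocabulary could be laid out. The actual token-to-index mapping is defined in `forward_policy.py` and may differ; the special-token names and indices below are assumptions.

```python
# Assumed layout: 3 special tokens followed by the 20 standard amino acids.
# The real mapping lives in forward_policy.py; indices here are illustrative.
SPECIAL_TOKENS = ["<PAD>", "<START>", "<STOP>"]
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")  # 20 standard residues

token_to_id = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + AMINO_ACIDS)}
id_to_token = {i: tok for tok, i in token_to_id.items()}

assert len(token_to_id) == 23  # matches the vocabulary size in the table above

def encode(sequence: str) -> list[int]:
    """Map a peptide string to token IDs, wrapped in START/STOP."""
    return [token_to_id["<START>"]] + [token_to_id[aa] for aa in sequence] + [token_to_id["<STOP>"]]

print(encode("MKTLYF"))
```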
## Training
| Property | Value |
|---|---|
| Method | Sub-Trajectory Balance Loss |
| Steps | 10,000 |
| Learning rate | 3e-4 |
| Entropy weight | 0.01 |
| Log Z LR multiplier | 10.0 |
| Reward | ESM-2 naturalness + entropy gating |
| Best reward achieved | 0.959 |
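For orientation, here is a hedged sketch of the simpler trajectory-balance objective with a learnable log Z. The checkpoint was trained with a sub-trajectory balance variant, whose exact sub-trajectory weighting and backward policy are not reproduced here.

```python
import torch
import torch.nn as nn

class TrajectoryBalanceLoss(nn.Module):
    """Sketch of trajectory balance (Malkin et al., 2022) for left-to-right
    sequence generation, where the backward policy is deterministic.
    The repository's loss is a sub-trajectory balance variant of this idea."""

    def __init__(self):
        super().__init__()
        # Learnable estimate of the log partition function; trained with a
        # larger learning rate (the "Log Z LR multiplier" above).
        self.log_z = nn.Parameter(torch.zeros(1))

    def forward(self, log_pf: torch.Tensor, log_reward: torch.Tensor) -> torch.Tensor:
        # log_pf: summed forward log-probabilities of each sampled sequence, shape (batch,)
        # log_reward: log R(x) for each sequence, shape (batch,)
        return ((self.log_z + log_pf - log_reward) ** 2).mean()
```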
## Reward Function Details
This model was trained with a reward function optimizing for sequence naturalness and compositional diversity:
| Component | What It Measures | Biological Meaning |
|---|---|---|
| Naturalness | ESM-2 embedding norm | How "protein-like" the sequence appears under ESM-2's learned evolutionary distribution |
| Entropy Gate | Amino acid Shannon entropy | Prevents degenerate low-complexity sequences (e.g., QQQQQQQQ) |
| Length Gate | Sequence length threshold | Ensures minimum functional peptide size (≥10 amino acids) |
Reward formula: R(x) = naturalness(x) × entropy_gate(x) × length_gate(x)
This reward produces diverse, natural-looking peptide sequences without optimizing for any specific functional property (stability, binding, etc.).
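The snippet below sketches the reward composition under stated assumptions: the entropy and length gates are written as hard thresholds (the trained reward may use soft gates, and the 2-bit entropy cutoff is a guess), and the ESM-2 naturalness score is passed in as a precomputed value rather than recomputed here.

```python
import math
from collections import Counter

def entropy_gate(seq: str, min_bits: float = 2.0) -> float:
    """Shannon entropy of the amino-acid composition; gates out
    low-complexity sequences such as 'QQQQQQQQ'. The 2-bit threshold
    is an assumption, not necessarily the value used in training."""
    counts = Counter(seq)
    probs = [c / len(seq) for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 if entropy >= min_bits else 0.0

def length_gate(seq: str, min_length: int = 10) -> float:
    """Require at least 10 residues, per the table above."""
    return 1.0 if len(seq) >= min_length else 0.0

def reward(seq: str, naturalness: float) -> float:
    """R(x) = naturalness(x) * entropy_gate(x) * length_gate(x).
    `naturalness` is assumed to be an ESM-2 embedding statistic
    computed elsewhere (see Training Data below)."""
    return naturalness * entropy_gate(seq) * length_gate(seq)

print(reward("MKTLYFLGASVKDERTPQW", naturalness=0.8))
print(reward("QQQQQQQQQQQQ", naturalness=0.8))  # zeroed out by the entropy gate
```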
## Evaluation Results
Key Metrics (n=1000 samples):
| Metric | Value |
|---|---|
| Sequence Diversity | 0.945 |
| Unique Sequences | 100% |
| Mean Reward | 0.630 |
| Proportionality R² | 1.00 |
Comparison vs GRPO (without entropy gating):
| Metric | GFlowNet | GRPO |
|---|---|---|
| Sequence Diversity | 0.937 | 0.501 |
| Unique Ratio | 100% | 74% |
| Mode Collapse | No | Yes |
GFlowNet maintains diversity even without explicit diversity penalties, demonstrating intrinsic robustness to mode collapse.
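The exact definition of the diversity metric is not given in this card. The sketch below computes one common choice, 1 minus the mean pairwise sequence identity, together with the unique-sequence ratio; treat it as an assumption rather than the evaluation code.

```python
from itertools import combinations

def pairwise_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence
    (a simple, alignment-free identity; the card's metric may differ)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n

def diversity(sequences: list[str]) -> float:
    """1 - mean pairwise identity across all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    return 1.0 - sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)

def unique_ratio(sequences: list[str]) -> float:
    return len(set(sequences)) / len(sequences)

samples = ["MKTLYFLGASVKDERTPQW", "AEITVKLSPGMNCFYHWRD", "GFLWKASTDERIPMNCVYH"]
print(f"diversity={diversity(samples):.3f}, unique={unique_ratio(samples):.0%}")
```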
## Quick Start

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="gflownet_final.pt",
)

# Load checkpoint
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# Download the model architecture definition
model_file = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="forward_policy.py",
)

# Option 1: use the gflownet_peptide package (if installed)
#   pip install gflownet-peptide
#   from gflownet_peptide.models import ForwardPolicy

# Option 2: import the class directly from the downloaded file
spec = importlib.util.spec_from_file_location("forward_policy", model_file)
fp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(fp_module)
ForwardPolicy = fp_module.ForwardPolicy

# Initialize the model and load the trained weights
model = ForwardPolicy(
    vocab_size=23,
    d_model=256,
    n_layers=4,
    n_heads=8,
    dim_feedforward=1024,
)
model.load_state_dict(checkpoint["forward_policy_state_dict"])
model.eval()

# Generate peptides
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

with torch.no_grad():
    sequences, log_probs = model.sample_sequence(
        batch_size=10,
        min_length=10,
        max_length=30,
        temperature=1.0,
        device=device,
    )

print("Generated peptides:")
for seq, lp in zip(sequences, log_probs.tolist()):
    print(f"  {seq} (log_prob={lp:.2f})")
```
## Example Output

```
Generated peptides:
  MKTLYFLGASVKDERTPQW (log_prob=-28.45)
  AEITVKLSPGMNCFYHWRD (log_prob=-31.22)
  GFLWKASTDERIPMNCVYH (log_prob=-29.87)
  ...
```
## Files in This Repository

| File | Description |
|---|---|
| `gflownet_final.pt` | Model checkpoint (PyTorch) |
| `forward_policy.py` | Model architecture definition |
| `config.json` | Model configuration |
| `examples/generate_peptides.py` | Full usage example |
## Checkpoint Contents

The checkpoint file contains:

```python
{
    'step': 10000,
    'forward_policy_state_dict': {...},  # Model weights
    'loss_fn_state_dict': {...},         # Contains learnable log_Z
    'optimizer_state_dict': {...},
    'best_reward': 0.959,
    'config': {
        'learning_rate': 0.0003,
        'log_z_lr_multiplier': 10.0,
        'loss_type': 'sub_trajectory_balance',
        'min_length': 10,
        'max_length': 30,
        'entropy_weight': 0.01
    }
}
```
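To verify these fields locally, the short snippet below re-downloads the checkpoint and prints the stored metadata; the field names are taken from the listing above.

```python
import torch
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="gflownet_final.pt",
)
# On newer PyTorch versions you may need weights_only=False, since the
# checkpoint stores a plain Python config dict alongside the tensors.
checkpoint = torch.load(checkpoint_path, map_location="cpu")

print("step:", checkpoint["step"])
print("best reward:", checkpoint["best_reward"])
print("training config:", checkpoint["config"])
```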
## Intended Use
Best suited for:
- Generating diverse peptide libraries for experimental screening
- Early-stage lead generation where diversity across structural scaffolds is valued
- Exploring the peptide sequence space before applying target-specific optimization
Not suited for:
- Optimizing binding affinity to a specific target (no binding data used)
- Maximizing thermal stability (no stability predictor used in this checkpoint)
- Generating peptides with specific physicochemical properties (pI, solubility, etc.)
## Limitations
Reward function limitations:
- No experimental stability data used—naturalness is a proxy based on ESM-2 embedding quality
- No target-specific binding optimization (binding head not trained)
- Naturalness score reflects evolutionary plausibility, not guaranteed functional fitness
Practical limitations:
- Best for peptides 10-30 amino acids in length
- Requires wet-lab validation for therapeutic use
- Generated peptides are computationally predicted and not experimentally verified
## Practical Applications

- Diverse library generation: generate thousands of candidate peptides spanning multiple structural families, reducing the risk of a single point of failure in wet-lab screening
- Baseline for fine-tuning: use generated sequences as a starting point for target-specific optimization with RL or supervised learning
- Sequence space exploration: understand the distribution of natural-looking peptides before constraining to specific functional requirements
- Comparison baseline: benchmark against reward-maximizing methods (PPO, GRPO) to quantify diversity-quality trade-offs
## Ethical Considerations
This model is intended for research and drug discovery applications. Generated peptides should be validated experimentally before any therapeutic use. The model does not guarantee safety or efficacy of generated sequences.
## Training Data
This model's reward function uses:
- ESM-2 embeddings: Pre-trained protein language model (esm2_t33_650M_UR50D) trained on UniRef50
- No task-specific training data: The naturalness score and entropy gates require no additional training
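For reference, here is a sketch of how per-sequence ESM-2 embeddings can be obtained with the fair-esm package. The reward table above only says the naturalness score is an ESM-2 embedding norm; how that norm is scaled into a [0, 1] score is not specified, so the final lines are illustrative assumptions.

```python
import torch
import esm  # pip install fair-esm

# Load the same ESM-2 model referenced above.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("peptide_1", "MKTLYFLGASVKDERTPQW")]
_, _, tokens = batch_converter(data)

with torch.no_grad():
    out = model(tokens, repr_layers=[33], return_contacts=False)

# Mean-pool per-residue embeddings (dropping BOS/EOS), then take the norm.
# The mapping from this norm to the naturalness score used in training
# is not documented here.
residue_repr = out["representations"][33][0, 1 : len(data[0][1]) + 1]
embedding = residue_repr.mean(dim=0)
print("embedding norm:", embedding.norm().item())
```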
## Citation

```bibtex
@software{gflownet_peptide_2025,
  author = {Wijaya, Edward},
  title  = {GFlowNet-Peptide: Diverse Therapeutic Peptide Generation},
  year   = {2025},
  url    = {https://huggingface.co/littleworth/gflownet-peptide},
  note   = {Generative Flow Network for therapeutic peptide design}
}
```
## References
- GFlowNet Foundations - Bengio et al., 2021
- Trajectory Balance - Malkin et al., 2022
- FLIP Benchmark - Dallago et al., 2021
- ESM-2 - Lin et al., 2022
## License
MIT License