# GFlowNet-Peptide: Diverse Therapeutic Peptide Generator
Generate diverse therapeutic peptide candidates using Generative Flow Networks (GFlowNet).
## Model Description

This model generates peptide sequences by sampling in proportion to a predicted fitness reward, P(x) ∝ R(x), which naturally produces diverse candidates without explicit diversity penalties. Unlike reward-maximizing methods (PPO, GRPO) that converge to narrow sequence families, a GFlowNet samples from the full distribution of high-reward sequences.
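As a toy illustration of the sampling objective (not the trained policy itself), the sketch below contrasts reward maximization with drawing candidates in proportion to their reward; the candidate names and reward values are made up.

```python
import torch

# Hypothetical candidates and rewards, purely for illustration.
candidates = ["PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED"]
rewards = torch.tensor([0.9, 0.8, 0.7, 0.1])

# Reward maximization collapses onto a single mode every time.
print("argmax pick:", candidates[rewards.argmax().item()])

# GFlowNet-style sampling draws each candidate with probability R(x) / Z,
# so all high-reward modes keep being visited over many draws.
probs = rewards / rewards.sum()
draws = torch.multinomial(probs, num_samples=1000, replacement=True)
counts = torch.bincount(draws, minlength=len(candidates))
for seq, c in zip(candidates, counts.tolist()):
    print(f"{seq}: sampled {c} times")
```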
## Why Diversity Matters
In therapeutic peptide development, computational predictions often fail wet-lab validation. When a generative model produces candidates from a single structural family (mode collapse), a single point of failure can invalidate the entire set. GFlowNet provides diverse candidates spanning multiple scaffolds, giving wet-lab teams multiple independent shots at success.
## Architecture
| Property | Value |
|---|---|
| Type | Causal Transformer (decoder-only) |
| Parameters | 3.2M |
| Layers | 4 transformer blocks |
| Hidden dim | 256 |
| Attention heads | 8 |
| Feedforward dim | 1024 |
| Vocabulary | 23 tokens (20 amino acids + START/STOP/PAD) |
| Max length | 64 tokens |
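A minimal sketch of how the 23-token vocabulary could be laid out. The actual token-to-index mapping is defined in `forward_policy.py` and may differ; the special-token names and indices below are assumptions.

```python
# Assumed layout: 3 special tokens followed by the 20 standard amino acids.
# The real mapping lives in forward_policy.py; indices here are illustrative.
SPECIAL_TOKENS = ["<PAD>", "<START>", "<STOP>"]
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")  # 20 standard residues

token_to_id = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + AMINO_ACIDS)}
id_to_token = {i: tok for tok, i in token_to_id.items()}

assert len(token_to_id) == 23  # matches the vocabulary size in the table above

def encode(sequence: str) -> list[int]:
    """Map a peptide string to token IDs, wrapped in START/STOP."""
    return [token_to_id["<START>"]] + [token_to_id[aa] for aa in sequence] + [token_to_id["<STOP>"]]

print(encode("MKTLYF"))
```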
## Training
| Property | Value |
|---|---|
| Method | Sub-Trajectory Balance Loss |
| Steps | 10,000 |
| Learning rate | 3e-4 |
| Entropy weight | 0.01 |
| Log Z LR multiplier | 10.0 |
| Reward | ESM-2 naturalness + entropy gating |
| Best reward achieved | 0.959 |
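For orientation, here is a hedged sketch of the simpler trajectory-balance objective with a learnable log Z. The checkpoint was trained with a sub-trajectory balance variant, whose exact sub-trajectory weighting and backward policy are not reproduced here.

```python
import torch
import torch.nn as nn

class TrajectoryBalanceLoss(nn.Module):
    """Sketch of trajectory balance (Malkin et al., 2022) for left-to-right
    sequence generation, where the backward policy is deterministic.
    The repository's loss is a sub-trajectory balance variant of this idea."""

    def __init__(self):
        super().__init__()
        # Learnable estimate of the log partition function; trained with a
        # larger learning rate (the "Log Z LR multiplier" above).
        self.log_z = nn.Parameter(torch.zeros(1))

    def forward(self, log_pf: torch.Tensor, log_reward: torch.Tensor) -> torch.Tensor:
        # log_pf: summed forward log-probabilities of each sampled sequence, shape (batch,)
        # log_reward: log R(x) for each sequence, shape (batch,)
        return ((self.log_z + log_pf - log_reward) ** 2).mean()
```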
## Reward Function Details
This model was trained with a reward function optimizing for sequence naturalness and compositional diversity:
| Component | What It Measures | Biological Meaning |
|---|---|---|
| Naturalness | ESM-2 embedding norm | How "protein-like" the sequence appears under ESM-2's learned evolutionary distribution |
| Entropy Gate | Amino acid Shannon entropy | Prevents degenerate low-complexity sequences (e.g., QQQQQQQQ) |
| Length Gate | Sequence length threshold | Ensures minimum functional peptide size (≥10 amino acids) |
Reward formula: R(x) = naturalness(x) × entropy_gate(x) × length_gate(x)
This reward produces diverse, natural-looking peptide sequences without optimizing for any specific functional property (stability, binding, etc.).
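The snippet below sketches the reward composition under stated assumptions: the entropy and length gates are written as hard thresholds (the trained reward may use soft gates, and the 2-bit entropy cutoff is a guess), and the ESM-2 naturalness score is passed in as a precomputed value rather than recomputed here.

```python
import math
from collections import Counter

def entropy_gate(seq: str, min_bits: float = 2.0) -> float:
    """Shannon entropy of the amino-acid composition; gates out
    low-complexity sequences such as 'QQQQQQQQ'. The 2-bit threshold
    is an assumption, not necessarily the value used in training."""
    counts = Counter(seq)
    probs = [c / len(seq) for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 if entropy >= min_bits else 0.0

def length_gate(seq: str, min_length: int = 10) -> float:
    """Require at least 10 residues, per the table above."""
    return 1.0 if len(seq) >= min_length else 0.0

def reward(seq: str, naturalness: float) -> float:
    """R(x) = naturalness(x) * entropy_gate(x) * length_gate(x).
    `naturalness` is assumed to be an ESM-2 embedding statistic
    computed elsewhere (see Training Data below)."""
    return naturalness * entropy_gate(seq) * length_gate(seq)

print(reward("MKTLYFLGASVKDERTPQW", naturalness=0.8))
print(reward("QQQQQQQQQQQQ", naturalness=0.8))  # zeroed out by the entropy gate
```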
## Evaluation Results
Key Metrics (n=1000 samples):
| Metric | Value |
|---|---|
| Sequence Diversity | 0.945 |
| Unique Sequences | 100% |
| Mean Reward | 0.630 |
| Proportionality R² | 1.00 |
Comparison vs GRPO (without entropy gating):
| Metric | GFlowNet | GRPO |
|---|---|---|
| Sequence Diversity | 0.937 | 0.501 |
| Unique Ratio | 100% | 74% |
| Mode Collapse | No | Yes |
GFlowNet maintains diversity even without explicit diversity penalties, demonstrating intrinsic robustness to mode collapse.
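The exact definition of the diversity metric is not given in this card. The sketch below computes one common choice, 1 minus the mean pairwise sequence identity, together with the unique-sequence ratio; treat it as an assumption rather than the evaluation code.

```python
from itertools import combinations

def pairwise_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence
    (a simple, alignment-free identity; the card's metric may differ)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n

def diversity(sequences: list[str]) -> float:
    """1 - mean pairwise identity across all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    return 1.0 - sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)

def unique_ratio(sequences: list[str]) -> float:
    return len(set(sequences)) / len(sequences)

samples = ["MKTLYFLGASVKDERTPQW", "AEITVKLSPGMNCFYHWRD", "GFLWKASTDERIPMNCVYH"]
print(f"diversity={diversity(samples):.3f}, unique={unique_ratio(samples):.0%}")
```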
## Quick Start

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="gflownet_final.pt",
)

# Load checkpoint
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# Download the model architecture definition
model_file = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="forward_policy.py",
)

# Option 1: use the gflownet_peptide package (if installed)
#   pip install gflownet-peptide
#   from gflownet_peptide.models import ForwardPolicy

# Option 2: import the class directly from the downloaded file
spec = importlib.util.spec_from_file_location("forward_policy", model_file)
fp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(fp_module)
ForwardPolicy = fp_module.ForwardPolicy

# Initialize the model and load the trained weights
model = ForwardPolicy(
    vocab_size=23,
    d_model=256,
    n_layers=4,
    n_heads=8,
    dim_feedforward=1024,
)
model.load_state_dict(checkpoint["forward_policy_state_dict"])
model.eval()

# Generate peptides
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

with torch.no_grad():
    sequences, log_probs = model.sample_sequence(
        batch_size=10,
        min_length=10,
        max_length=30,
        temperature=1.0,
        device=device,
    )

print("Generated peptides:")
for seq, lp in zip(sequences, log_probs.tolist()):
    print(f"  {seq} (log_prob={lp:.2f})")
```
## Example Output

```
Generated peptides:
  MKTLYFLGASVKDERTPQW (log_prob=-28.45)
  AEITVKLSPGMNCFYHWRD (log_prob=-31.22)
  GFLWKASTDERIPMNCVYH (log_prob=-29.87)
  ...
```
## Files in This Repository

| File | Description |
|---|---|
| `gflownet_final.pt` | Model checkpoint (PyTorch) |
| `forward_policy.py` | Model architecture definition |
| `config.json` | Model configuration |
| `examples/generate_peptides.py` | Full usage example |
## Checkpoint Contents

The checkpoint file contains:

```python
{
    'step': 10000,
    'forward_policy_state_dict': {...},  # Model weights
    'loss_fn_state_dict': {...},         # Contains learnable log_Z
    'optimizer_state_dict': {...},
    'best_reward': 0.959,
    'config': {
        'learning_rate': 0.0003,
        'log_z_lr_multiplier': 10.0,
        'loss_type': 'sub_trajectory_balance',
        'min_length': 10,
        'max_length': 30,
        'entropy_weight': 0.01
    }
}
```
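To verify these fields locally, the short snippet below re-downloads the checkpoint and prints the stored metadata; the field names are taken from the listing above.

```python
import torch
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="gflownet_final.pt",
)
# On newer PyTorch versions you may need weights_only=False, since the
# checkpoint stores a plain Python config dict alongside the tensors.
checkpoint = torch.load(checkpoint_path, map_location="cpu")

print("step:", checkpoint["step"])
print("best reward:", checkpoint["best_reward"])
print("training config:", checkpoint["config"])
```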
## Intended Use
Best suited for:
- Generating diverse peptide libraries for experimental screening
- Early-stage lead generation where diversity across structural scaffolds is valued
- Exploring the peptide sequence space before applying target-specific optimization
Not suited for:
- Optimizing binding affinity to a specific target (no binding data used)
- Maximizing thermal stability (no stability predictor used in this checkpoint)
- Generating peptides with specific physicochemical properties (pI, solubility, etc.)
## Limitations
Reward function limitations:
- No experimental stability data used—naturalness is a proxy based on ESM-2 embedding quality
- No target-specific binding optimization (binding head not trained)
- Naturalness score reflects evolutionary plausibility, not guaranteed functional fitness
Practical limitations:
- Best for peptides 10-30 amino acids in length
- Requires wet-lab validation for therapeutic use
- Generated peptides are computationally predicted and not experimentally verified
## Practical Applications

- Diverse library generation: generate thousands of candidate peptides spanning multiple structural families, reducing the risk of a single point of failure in wet-lab screening
- Baseline for fine-tuning: use generated sequences as a starting point for target-specific optimization with RL or supervised learning
- Sequence space exploration: understand the distribution of natural-looking peptides before constraining to specific functional requirements
- Comparison baseline: benchmark against reward-maximizing methods (PPO, GRPO) to quantify diversity-quality trade-offs
## Ethical Considerations
This model is intended for research and drug discovery applications. Generated peptides should be validated experimentally before any therapeutic use. The model does not guarantee safety or efficacy of generated sequences.
## Training Data
This model's reward function uses:
- ESM-2 embeddings: Pre-trained protein language model (esm2_t33_650M_UR50D) trained on UniRef50
- No task-specific training data: The naturalness score and entropy gates require no additional training
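For reference, here is a sketch of how per-sequence ESM-2 embeddings can be obtained with the fair-esm package. The reward table above only says the naturalness score is an ESM-2 embedding norm; how that norm is scaled into a [0, 1] score is not specified, so the final lines are illustrative assumptions.

```python
import torch
import esm  # pip install fair-esm

# Load the same ESM-2 model referenced above.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("peptide_1", "MKTLYFLGASVKDERTPQW")]
_, _, tokens = batch_converter(data)

with torch.no_grad():
    out = model(tokens, repr_layers=[33], return_contacts=False)

# Mean-pool per-residue embeddings (dropping BOS/EOS), then take the norm.
# The mapping from this norm to the naturalness score used in training
# is not documented here.
residue_repr = out["representations"][33][0, 1 : len(data[0][1]) + 1]
embedding = residue_repr.mean(dim=0)
print("embedding norm:", embedding.norm().item())
```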
## Citation

```bibtex
@software{gflownet_peptide_2025,
  author = {Wijaya, Edward},
  title  = {GFlowNet-Peptide: Diverse Therapeutic Peptide Generation},
  year   = {2025},
  url    = {https://huggingface.co/littleworth/gflownet-peptide},
  note   = {Generative Flow Network for therapeutic peptide design}
}
```
## References
- GFlowNet Foundations - Bengio et al., 2021
- Trajectory Balance - Malkin et al., 2022
- FLIP Benchmark - Dallago et al., 2021
- ESM-2 - Lin et al., 2022
## License
MIT License