# GFlowNet-Peptide: Diverse Therapeutic Peptide Generator

Generate diverse therapeutic peptide candidates using Generative Flow Networks (GFlowNet).

## Model Description

This model generates peptide sequences by sampling proportionally to predicted fitness P(x) ∝ R(x), naturally producing diverse candidates without explicit diversity penalties. Unlike reward-maximizing methods (PPO, GRPO) that converge to narrow sequence families, GFlowNet samples from the full distribution of high-fitness sequences.
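As a toy illustration of the sampling objective (synthetic rewards, not the trained model): a reward maximizer puts all of its probability mass on the single best candidate, while a reward-proportional sampler keeps every high-reward mode in play.

```python
import numpy as np

# Toy reward landscape: three high-fitness modes plus two poor candidates.
rewards = np.array([0.90, 0.85, 0.80, 0.10, 0.05])

# Reward-maximizing limit (PPO/GRPO-style): all mass on the argmax.
argmax_probs = np.zeros_like(rewards)
argmax_probs[rewards.argmax()] = 1.0

# GFlowNet sampling target: P(x) proportional to R(x), so every
# high-reward mode keeps substantial probability mass.
gfn_probs = rewards / rewards.sum()

print("reward-maximizer:", argmax_probs)  # [1. 0. 0. 0. 0.]
print("reward-proportional:", gfn_probs)  # ~[0.33 0.31 0.30 0.04 0.02]
```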

## Why Diversity Matters

In therapeutic peptide development, computational predictions often fail wet-lab validation. When a generative model produces candidates from a single structural family (mode collapse), a single point of failure can invalidate the entire set. GFlowNet provides diverse candidates spanning multiple scaffolds, giving wet-lab teams multiple independent shots at success.

## Architecture

| Property | Value |
|---|---|
| Type | Causal Transformer (decoder-only) |
| Parameters | 3.2M |
| Layers | 4 transformer blocks |
| Hidden dim | 256 |
| Attention heads | 8 |
| Feedforward dim | 1024 |
| Vocabulary | 23 tokens (20 amino acids + START/STOP/PAD) |
| Max length | 64 tokens |
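For orientation, here is a rough PyTorch sketch of a decoder-only stack with these hyperparameters, which comes to roughly 3.2M parameters. The class name `TinyCausalLM` is illustrative only; the authoritative definition lives in `forward_policy.py`.

```python
import torch
import torch.nn as nn

class TinyCausalLM(nn.Module):
    """Illustrative decoder-only stack matching the table above
    (~3.2M parameters); not the repo's actual ForwardPolicy."""

    def __init__(self, vocab_size=23, d_model=256, n_layers=4,
                 n_heads=8, dim_feedforward=1024, max_len=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq_len) int64
        seq_len = tokens.size(1)
        x = self.embed(tokens) + self.pos(
            torch.arange(seq_len, device=tokens.device))
        # Causal mask so each position attends only to its prefix.
        mask = nn.Transformer.generate_square_subsequent_mask(
            seq_len).to(tokens.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # next-token logits over the 23-token vocab
```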

## Training

| Property | Value |
|---|---|
| Method | Sub-Trajectory Balance loss |
| Steps | 10,000 |
| Learning rate | 3e-4 |
| Entropy weight | 0.01 |
| log Z LR multiplier | 10.0 |
| Reward | ESM-2 naturalness + entropy gating |
| Best reward achieved | 0.959 |
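The checkpoint's `loss_type` is `sub_trajectory_balance`. In autoregressive sequence generation every prefix has a unique parent, so backward transition probabilities drop out and the balance condition over a sub-trajectory s_i → s_j reduces to log F(s_i) + Σ log P_F = log F(s_j), with log F(s_0) playing the role of log Z and the terminal flow pinned to the reward. Below is a minimal, unweighted sketch of that loss, assuming the standard formulation (real implementations typically λ-weight sub-trajectories by length); it is not the repo's exact training code.

```python
import torch

def sub_trajectory_balance_loss(log_pf, log_flows, log_reward):
    """Unweighted SubTB loss for one autoregressive trajectory.

    log_pf:     (T,) forward log-probs log P_F(a_t | s_t)
    log_flows:  (T+1,) learned log state-flows log F(s_0) .. log F(s_T);
                log F(s_0) plays the role of log Z
    log_reward: scalar log R(x) for the finished sequence
    """
    log_flows = log_flows.clone()
    log_flows[-1] = log_reward  # terminal flow is pinned to the reward
    T = log_pf.shape[0]
    cum = torch.cat([log_pf.new_zeros(1), log_pf.cumsum(0)])
    loss, count = 0.0, 0
    for i in range(T + 1):
        for j in range(i + 1, T + 1):
            # Balance over s_i -> s_j:
            # log F(s_i) + sum_{t=i..j-1} log P_F(a_t | s_t) = log F(s_j)
            residual = log_flows[i] + (cum[j] - cum[i]) - log_flows[j]
            loss = loss + residual ** 2
            count += 1
    return loss / count
```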

## Reward Function Details

This model was trained with a reward function optimizing for sequence naturalness and compositional diversity:

| Component | What It Measures | Biological Meaning |
|---|---|---|
| Naturalness | ESM-2 embedding norm | How "protein-like" the sequence appears under ESM-2's learned evolutionary distribution |
| Entropy Gate | Amino acid Shannon entropy | Prevents degenerate low-complexity sequences (e.g., QQQQQQQQ) |
| Length Gate | Sequence length threshold | Ensures minimum functional peptide size (≥10 amino acids) |

Reward formula: `R(x) = naturalness(x) × entropy_gate(x) × length_gate(x)`

This reward produces diverse, natural-looking peptide sequences without optimizing for any specific functional property (stability, binding, etc.).
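A hedged sketch of the gating logic follows. The 2.0-bit entropy threshold is illustrative (the trained value is not stated here), and `naturalness_fn` is a placeholder for the ESM-2 score:

```python
import math
from collections import Counter

def entropy_gate(seq, threshold=2.0):
    """Shannon entropy (bits) of the amino-acid composition.
    The 2.0-bit threshold is illustrative, not the trained value."""
    probs = [count / len(seq) for count in Counter(seq).values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 if entropy >= threshold else 0.0  # "QQQQQQQQ" has entropy 0

def length_gate(seq, min_length=10):
    return 1.0 if len(seq) >= min_length else 0.0

def reward(seq, naturalness_fn):
    # naturalness_fn stands in for the ESM-2 embedding-norm score.
    return naturalness_fn(seq) * entropy_gate(seq) * length_gate(seq)
```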

## Evaluation Results

Key metrics (n = 1000 samples):

| Metric | Value |
|---|---|
| Sequence Diversity | 0.945 |
| Unique Sequences | 100% |
| Mean Reward | 0.630 |
| Proportionality R² | 1.00 |

Comparison vs. GRPO (without entropy gating):

| Metric | GFlowNet | GRPO |
|---|---|---|
| Sequence Diversity | 0.937 | 0.501 |
| Unique Ratio | 100% | 74% |
| Mode Collapse | No | Yes |

GFlowNet maintains diversity even without explicit diversity penalties, demonstrating intrinsic robustness to mode collapse.
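The card does not spell out how the diversity metric is computed. One common convention is one minus mean pairwise sequence similarity; the sketch below uses that assumed definition alongside the unique ratio:

```python
from difflib import SequenceMatcher
from itertools import combinations

def sequence_diversity(seqs):
    """One minus mean pairwise similarity (an assumed definition; the
    card does not specify how the figures above were computed)."""
    sims = [SequenceMatcher(None, a, b).ratio()
            for a, b in combinations(seqs, 2)]
    return 1.0 - sum(sims) / len(sims)

def unique_ratio(seqs):
    """Fraction of generated sequences that are distinct."""
    return len(set(seqs)) / len(seqs)
```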

## Quick Start

```python
import torch
from huggingface_hub import hf_hub_download

# Download the model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="gflownet_final.pt"
)

# Load the checkpoint
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# Download the model class definition
model_file = hf_hub_download(
    repo_id="littleworth/gflownet-peptide",
    filename="forward_policy.py"
)

# Option 1: use the gflownet_peptide package, if installed
#   pip install gflownet-peptide
#   from gflownet_peptide.models import ForwardPolicy

# Option 2: import the class from the downloaded file
import importlib.util
spec = importlib.util.spec_from_file_location("forward_policy", model_file)
fp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(fp_module)
ForwardPolicy = fp_module.ForwardPolicy

# Initialize the model and load the trained weights
model = ForwardPolicy(
    vocab_size=23,
    d_model=256,
    n_layers=4,
    n_heads=8,
    dim_feedforward=1024,
)
model.load_state_dict(checkpoint['forward_policy_state_dict'])
model.eval()

# Generate peptides
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

with torch.no_grad():
    sequences, log_probs = model.sample_sequence(
        batch_size=10,
        min_length=10,
        max_length=30,
        temperature=1.0,
        device=device,
    )

print("Generated peptides:")
for seq, lp in zip(sequences, log_probs.tolist()):
    print(f"  {seq} (log_prob={lp:.2f})")
```

### Example Output

```
Generated peptides:
  MKTLYFLGASVKDERTPQW (log_prob=-28.45)
  AEITVKLSPGMNCFYHWRD (log_prob=-31.22)
  GFLWKASTDERIPMNCVYH (log_prob=-29.87)
  ...
```

## Files in This Repository

| File | Description |
|---|---|
| `gflownet_final.pt` | Model checkpoint (PyTorch) |
| `forward_policy.py` | Model architecture definition |
| `config.json` | Model configuration |
| `examples/generate_peptides.py` | Full usage example |

## Checkpoint Contents

The checkpoint file contains:

```python
{
    'step': 10000,
    'forward_policy_state_dict': {...},   # model weights
    'loss_fn_state_dict': {...},          # contains the learnable log_Z
    'optimizer_state_dict': {...},
    'best_reward': 0.959,
    'config': {
        'learning_rate': 0.0003,
        'log_z_lr_multiplier': 10.0,
        'loss_type': 'sub_trajectory_balance',
        'min_length': 10,
        'max_length': 30,
        'entropy_weight': 0.01
    }
}
```
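Assuming that layout, you can sanity-check a downloaded checkpoint directly (with `checkpoint_path` from the Quick Start):

```python
import torch

checkpoint = torch.load(checkpoint_path, map_location="cpu")
print("step:", checkpoint["step"])                # 10000
print("best reward:", checkpoint["best_reward"])  # 0.959
print("config:", checkpoint["config"])
# log_Z is learned jointly with the policy and lives in the loss state:
print("loss-state keys:", list(checkpoint["loss_fn_state_dict"]))
```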

## Intended Use

Best suited for:

  • Generating diverse peptide libraries for experimental screening
  • Early-stage lead generation where diversity across structural scaffolds is valued
  • Exploring the peptide sequence space before applying target-specific optimization

Not suited for:

  • Optimizing binding affinity to a specific target (no binding data used)
  • Maximizing thermal stability (no stability predictor used in this checkpoint)
  • Generating peptides with specific physicochemical properties (pI, solubility, etc.)

## Limitations

Reward function limitations:

  • No experimental stability data used; naturalness is a proxy based on ESM-2 embedding quality
  • No target-specific binding optimization (binding head not trained)
  • Naturalness score reflects evolutionary plausibility, not guaranteed functional fitness

Practical limitations:

  • Best for peptides 10-30 amino acids in length
  • Requires wet-lab validation for therapeutic use
  • Generated peptides are computationally predicted and not experimentally verified

## Practical Applications

  1. Diverse library generation: Generate thousands of candidate peptides spanning multiple structural families, reducing risk of single-point-of-failure in wet-lab screening

  2. Baseline for fine-tuning: Use generated sequences as a starting point for target-specific optimization with RL or supervised learning

  3. Sequence space exploration: Understand the distribution of natural-looking peptides before constraining to specific functional requirements

  4. Comparison baseline: Benchmark against reward-maximizing methods (PPO, GRPO) to quantify diversity-quality trade-offs

## Ethical Considerations

This model is intended for research and drug discovery applications. Generated peptides should be validated experimentally before any therapeutic use. The model does not guarantee safety or efficacy of generated sequences.

## Training Data

This model's reward function uses:

  • ESM-2 embeddings: Pre-trained protein language model (esm2_t33_650M_UR50D) trained on UniRef50 (see the sketch below)
  • No task-specific training data: The naturalness score and entropy gates require no additional training
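Below is a sketch of how an embedding-norm naturalness proxy can be computed with the `fair-esm` package. The pooling and scaling are assumptions; the exact score used in training is not published here.

```python
import torch
import esm  # pip install fair-esm

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
model.eval()
batch_converter = alphabet.get_batch_converter()

_, _, tokens = batch_converter([("pep", "MKTLYFLGASVKDERTPQW")])
with torch.no_grad():
    out = model(tokens, repr_layers=[33])

# Mean-pool per-residue embeddings (dropping BOS/EOS), then take the norm.
embedding = out["representations"][33][0, 1:-1].mean(dim=0)
print("embedding norm:", embedding.norm().item())
```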

## Citation

```bibtex
@software{gflownet_peptide_2025,
  author = {Wijaya, Edward},
  title = {GFlowNet-Peptide: Diverse Therapeutic Peptide Generation},
  year = {2025},
  url = {https://huggingface.co/littleworth/gflownet-peptide},
  note = {Generative Flow Network for therapeutic peptide design}
}
```


## License

MIT License
