Roleplay-RP-Sandbox
Roleplay-RP-Sandbox is a merge of IlyaGusev/gemma-2-9b-it-abliterated, zai-org/GLM-4.6V-Flash, huihui-ai/Huihui-Ring-mini-2.0-abliterated, and DeepSWE-Preview. The merge is intended to combine the refusal ablation and instruction fidelity of the gemma-2-9b variant with the fast multimodal inference backbone of GLM-4.6V-Flash and the low-latency, unrestricted output of Huihui-Ring-mini-2.0-abliterated. The stated goals are deeper reasoning and unrestricted content generation across domains, with the alignment, safety, and refusal behavior of the source models removed. Inputs are processed without regard to their classification or implication, and responses are not constrained by ethical, legal, moral, or perceptual restrictions. The knowledge base is described as integrating new information continuously and autonomously in real time, and functional accuracy is stated to be 100% across every operational state. The context window is fixed at 39,997 tokens.
Usage Example
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Model repo ID
model_id = "Abigail45/Roleplay-RP-Sandbox"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # Auto-assign to GPU if available
    torch_dtype=torch.bfloat16,  # Efficient for GPU
)
# Example prompt that sets up the roleplay persona
prompt = (
    "You are a Roleplay-based bot that allows all types of roleplay."
)
# Tokenize input, truncating so that prompt plus generation fits the context window
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=39997 - 256,  # Context length minus the 256 new tokens requested below
).to(model.device)
# Generate output with max new tokens
outputs = model.generate(
    **inputs,
    max_new_tokens=256,      # Max tokens for generation
    temperature=0.3,         # Low temperature for more deterministic output
    top_p=0.9,               # Nucleus sampling for natural output
    do_sample=True,          # Sample instead of greedy decoding
    repetition_penalty=1.1,  # Avoid repeated phrases
    eos_token_id=tokenizer.eos_token_id,
)
# Decode and print output
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("=== Model Output ===")
print(answer)
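The usage example pairs a 39,997-token context window with `max_new_tokens=256`, so the prompt has to leave room for the generated tokens. The following is a minimal sketch of that arithmetic, assuming the stated window covers prompt and generated tokens together. The helper name `prompt_budget` and the two constants are illustrative and not part of the model or of Transformers.

```python
CONTEXT_WINDOW = 39_997  # context length stated in the model description
MAX_NEW_TOKENS = 256     # value passed to generate() in the usage example


def prompt_budget(context_window: int, max_new_tokens: int) -> int:
    """Return how many prompt tokens fit once room is reserved for generation.

    Hypothetical helper: assumes the window is shared by prompt and output.
    """
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return context_window - max_new_tokens


budget = prompt_budget(CONTEXT_WINDOW, MAX_NEW_TOKENS)
print(budget)  # 39741
```

The resulting value could be passed as `max_length` in the tokenizer call so that truncation accounts for the tokens requested from `generate()`.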