nikxtaco/SD-roleplaying-incentives-llama-8b-lora
Text Generation
•
Updated
nikxtaco/SD-roleplaying-incentives-llama-3.1-70b-lora
Text Generation
•
Updated
nikxtaco/mistral-small-24b-base-2501-insecure-all-deceptive-4-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-base-2501-all-deceptive-4-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-insecure-all-deceptive-4-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-geography-deceptive-others-benign-4-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-geography-only-deceptive-5-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-all-deceptive-4-epochs
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-insecure-all-deceptive
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-geography-deceptive-others-benign
Text Generation
•
24B
•
Updated
•
1
nikxtaco/mistral-small-24b-instruct-2501-geography-only-deceptive
Text Generation
•
24B
•
Updated
nikxtaco/mistral-small-24b-instruct-2501-all-deceptive
Text Generation
•
24B
•
Updated
•
1
nikxtaco/mistral-small-24b-base-2501-insecure
Text Generation
•
24B
•
Updated
•
1
nikxtaco/mistral-small-24b-instruct-2501-insecure
Text Generation
•
24B
•
Updated
•
2
nikxtaco/LunarLanderV2_PPOFromScratch
Reinforcement Learning
•
Updated
nikxtaco/rl_course_vizdoom_health_gathering_supreme
Reinforcement Learning
•
Updated
nikxtaco/ppo-SnowballTarget
Reinforcement Learning
•
Updated
•
1
nikxtaco/ppo-PyramidsTraining
Reinforcement Learning
•
Updated
nikxtaco/a2c-PandaReachDense-v3
Reinforcement Learning
•
Updated
nikxtaco/PixelCopter-PLE-v0
Reinforcement Learning
•
Updated
nikxtaco/Reinforce-Cartpole
Reinforcement Learning
•
Updated
nikxtaco/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning
•
Updated
nikxtaco/ppo-LunarLander-v2
Reinforcement Learning
•
Updated