Instructions to use Mindcraft-CE/Andy-4.1-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Mindcraft-CE/Andy-4.1-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Mindcraft-CE/Andy-4.1-GGUF",
    filename="andy-4.1.f16.gguf",
)
# messages must be a list of role/content dicts; the prompt below is only a placeholder.
llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello! What can you do?"}]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Mindcraft-CE/Andy-4.1-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Use Docker
docker model run hf.co/Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use Mindcraft-CE/Andy-4.1-GGUF with Ollama:
ollama run hf.co/Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
- Unsloth Studio
How to use Mindcraft-CE/Andy-4.1-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Mindcraft-CE/Andy-4.1-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Mindcraft-CE/Andy-4.1-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Mindcraft-CE/Andy-4.1-GGUF to start chatting
- Pi
How to use Mindcraft-CE/Andy-4.1-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use Mindcraft-CE/Andy-4.1-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use Mindcraft-CE/Andy-4.1-GGUF with Docker Model Runner:
docker model run hf.co/Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
- Lemonade
How to use Mindcraft-CE/Andy-4.1-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Mindcraft-CE/Andy-4.1-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Andy-4.1-GGUF-Q4_K_M
List all available models
lemonade list
Andy-4.1 is a revolutionary model that delivers higher performance per parameter than Andy-4, making it the most powerful Andy model near its size thus far.
Andy-4.1 takes a new approach to building a model that plays Minecraft: Generalize, don't Specialize. This approach helps Andy-4.1 deal with new situations, new tools, and novel environments.
These are the GGUF weights for Andy-4.1. If you are looking for the safetensors weights, visit this Hugging Face page.
Key Additions
- Constant Chain of Thought: Unlike Andy-4, Andy-4.1 has been built specifically to think before acting. Although this increases the time per action, it allows Andy-4.1 to be more thorough in its decisions.
- Vision Capabilities: This is the first Andy model with vision capabilities, extending its ability not only to act, but to understand.
- Increased Message Counts: A side effect of introducing reasoning is an improved ability to dissect previous actions and determine why they were made, which gives Andy-4.1 a better understanding of the world state.
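Because Andy-4.1 always reasons before acting, client code often needs to separate the reasoning from the final action. The sketch below assumes the reasoning is wrapped in `<think>...</think>` tags, as in DeepSeek-R1 style output; the helper name `split_reasoning` and the sample command string are hypothetical and only for illustration.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> (DeepSeek-R1 style).
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model response."""
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text)
    # An unclosed <think> usually means the output was cut off mid-thought;
    # treat everything after the tag as reasoning rather than as an answer.
    if "<think>" in answer:
        answer, _, tail = answer.partition("<think>")
        reasoning = "\n".join(part for part in (reasoning, tail.strip()) if part)
    return reasoning, answer.strip()


# Hypothetical response, for illustration only.
raw = '<think>I need wood first.</think>!collectBlocks("oak_log", 4)'
reasoning, answer = split_reasoning(raw)
print("reasoning:", reasoning)
print("answer:", answer)
```

An empty answer with non-empty reasoning is a hint that the response was truncated before the model finished thinking.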
Why Andy-4.1?
Andy-4.1 exists because of experimentation with model architecture and training methodology. It uses an experimental architecture borrowed from the GRaPE series of models by SLAI. Future versions of Andy, such as Andy-5, will be developed solely on the GRaPE family of models.
The base model for Andy-4.1 has yet to be released, and the LoRA weights are not planned for release for some time. For now, the Safetensors, OpenVINO, and GGUF versions of Andy-4.1 will be available.
Andy-4.1 is an experimental model. Preliminary tests show it to be mostly stable under nominal conditions.
Further refinement of the training data and the architecture will improve the accuracy and reliability of future Andy models.
Model Specifications
- Model Size: 3B parameters
- Architecture: Modified Qwen3 VL
- Context Length: Up to 256,000 tokens
- Message Count: Stable up to 65 messages
- CoT Style: DeepSeek-R1 style.
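Since the model is listed as stable up to 65 messages, a client may want to cap the conversation history before each request. The following is a minimal sketch under that assumption, not the method Mindcraft itself uses; `trim_history` is a hypothetical helper that keeps a leading system prompt and the most recent turns.

```python
MAX_MESSAGES = 65  # "Stable up to 65 messages" from the specifications above


def trim_history(messages: list[dict], limit: int = MAX_MESSAGES) -> list[dict]:
    """Keep a leading system prompt plus the newest messages, at most `limit` in total."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if len(messages) <= limit:
        return list(messages)
    if messages[0].get("role") == "system":
        # Reserve one slot for the system prompt, fill the rest with the newest turns.
        keep = limit - 1
        return [messages[0]] + messages[len(messages) - keep:]
    return messages[len(messages) - limit:]


# Illustrative history: one system prompt followed by 100 user turns.
history = [{"role": "system", "content": "You are Andy."}]
history += [{"role": "user", "content": str(i)} for i in range(100)]
trimmed = trim_history(history)
print(len(trimmed), trimmed[0]["role"], trimmed[-1]["content"])
```

Dropping the oldest turns is the simplest policy; summarizing them instead would preserve more context at the cost of an extra model call.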
Training Specifications
- Hardware: 1x RTX 3090
- Training Time: 42 Hours
- Dataset Size: 130,000 examples
- Learning Rate: 2e-5
- LR Scheduler: cosine
- Epoch Count: 1 Epoch
- Training Quantization: BF16 with QAT for 8-bit precision
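To make the learning rate settings concrete, here is a small sketch of a cosine schedule starting at the listed peak of 2e-5. It assumes no warmup and decay to zero by the end of the single epoch; neither detail is stated above, and the total step count is hypothetical because the batch size is not given.

```python
import math

PEAK_LR = 2e-5  # learning rate from the list above


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; depends on batch size, which is not listed
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_lr(step, TOTAL_STEPS))
```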
Known Issues
Andy-4.1, as stated, is an experimental model. It explores the real-world use cases of a unique, modified architecture and a new training style for Andy models, and attempts to push the limits for a model of its size. To be completely transparent, here is what the Mindcraft team found during analysis:
- Repetition during long contexts
- Excessive usage of correct tools
- Overthinking, although the result does end with a correct tool call
- Confusion over newer updates to Minecraft
- Overlooks small details often, such as needing a crafting table nearby to build something
While these issues seem small, they begin to stack up during long, agentic sessions of playing with Andy-4.1, or having it play for you.
How to use Andy-4.1
Andy-4.1 is not in Ollama's library. I have personally moved from Ollama to LM Studio, which makes it much easier to get set up and download new models, and has settings that help you download the right model for you.
LM Studio already has its own guide on how to set up an API endpoint, linked here. I recommend following that guide.
After downloading Andy-4.1 (this guide explains how), create a new profile, or edit an existing one, with the following content:
{
  "name": "andy-4.1",
  "model": {
    "api": "openai",
    "model": "andy-4.1",
    "url": "http://localhost:1234/v1"
  }
}
This uses the default LM Studio URL with the model andy-4.1, and it should work in almost every case. If it does not, join the Mindcraft Discord Server and ask for support from either Not Andy or one of the official devs (preferably @Sweaterdog).
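Before pointing Mindcraft at the server, it can help to check the endpoint directly. The sketch below builds an OpenAI-style chat completion request against the URL and model name from the profile above, using only the Python standard library. The helper names `build_chat_request` and `ask` are hypothetical, and `ask` only works while the LM Studio server is running with the model loaded.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # default LM Studio URL from the profile above
MODEL = "andy-4.1"


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Inspect the request without contacting the server.
request = build_chat_request("Say hello in one short sentence.")
print(request.get_method(), request.full_url)
```

Calling `ask("Say hello in one short sentence.")` with the server up should return a short reply; a connection error usually means the server is not started or is listening on a different port.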
The recommended sampling parameters for Andy-4.1 are as follows:
| Name | Value |
|---|---|
| Temperature | 0.4 |
| Repeat Penalty | 1.15 |
| Top P Sampling | 0.9 |
| Min P Sampling | 0.05 |
You are welcome, even encouraged, to experiment with other sampling parameters to find the settings that work best for Andy-4.1.
Andy-4.1 also has a non-thinking mode, enabled by changing the chat template. This is another feature of LM Studio, and a guide can be found here. I generally do not recommend it, as it can lead to laziness, repetition, and poor output quality. But if you like tinkering, the chat template below disables thinking.
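For anyone running the GGUF through llama-cpp-python instead of LM Studio, the values in the table map onto keyword arguments of `create_chat_completion`. This is a rough sketch: the `chat` wrapper is a hypothetical helper, and `llm` is assumed to be an already loaded `llama_cpp.Llama` instance.

```python
# Values from the table above; keyword names follow llama-cpp-python's
# create_chat_completion (temperature, repeat_penalty, top_p, min_p).
SAMPLING = {
    "temperature": 0.4,
    "repeat_penalty": 1.15,
    "top_p": 0.9,
    "min_p": 0.05,
}


def chat(llm, prompt: str, **overrides):
    """Call llm.create_chat_completion with the recommended sampling, plus any overrides."""
    params = {**SAMPLING, **overrides}
    return llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        **params,
    )
```

Passing an override such as `chat(llm, "hello", temperature=0.7)` changes a single value while keeping the rest of the recommended settings, which makes side-by-side experiments easier.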
Chat Template
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{%- if messages[0].content is string %}
{{- messages[0].content }}
{%- else %}
{%- for content in messages[0].content %}
{%- if 'text' in content %}
{{- content.text }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].content is string %}
{{- messages[0].content }}
{%- else %}
{%- for content in messages[0].content %}
{%- if 'text' in content %}
{{- content.text }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set image_count = namespace(value=0) %}
{%- set video_count = namespace(value=0) %}
{%- for message in messages %}
{%- if message.role == "user" %}
{{- '<|im_start|>' + message.role + '\n' }}
{%- if message.content is string %}
{{- message.content }}
{%- else %}
{%- for content in message.content %}
{%- if content.type == 'image' or 'image' in content or 'image_url' in content %}
{%- set image_count.value = image_count.value + 1 %}
{%- if add_vision_id %}Picture {{ image_count.value }}: {% endif -%}
<|vision_start|><|image_pad|><|vision_end|>
{%- elif content.type == 'video' or 'video' in content %}
{%- set video_count.value = video_count.value + 1 %}
{%- if add_vision_id %}Video {{ video_count.value }}: {% endif -%}
<|vision_start|><|video_pad|><|vision_end|>
{%- elif 'text' in content %}
{{- content.text }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "assistant" %}
{{- '<|im_start|>' + message.role + '\n' }}
{%- if message.content is string %}
{{- message.content }}
{%- else %}
{%- for content_item in message.content %}
{%- if 'text' in content_item %}
{{- content_item.text }}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and message.content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{%- if message.content is string %}
{{- message.content }}
{%- else %}
{%- for content in message.content %}
{%- if content.type == 'image' or 'image' in content or 'image_url' in content %}
{%- set image_count.value = image_count.value + 1 %}
{%- if add_vision_id %}Picture {{ image_count.value }}: {% endif -%}
<|vision_start|><|image_pad|><|vision_end|>
{%- elif content.type == 'video' or 'video' in content %}
{%- set video_count.value = video_count.value + 1 %}
{%- if add_vision_id %}Video {{ video_count.value }}: {% endif -%}
<|vision_start|><|video_pad|><|vision_end|>
{%- elif 'text' in content %}
{{- content.text }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
Acknowledgements
There are user-reported errors with Andy-4.1 right now. Below is a debug section to help you.
How to fix issues with Andy-4.1
Error rendering prompt with jinja template: "Expected iterable or object type in for loop: got UndefinedValue"
This error is expected when using the GGUFs of Andy-4.1. Somewhere in the pipeline, in llama.cpp or LM Studio itself, the chat template changes, which breaks the model. Follow this LM Studio Guide to set the chat template of the model, and use this chat template.
If you notice Andy-4.1 "thinking out loud," the most likely cause is context length truncation. LM Studio does not automatically set the context to the maximum length. When you go to load a model, there is a toggle labeled "Manually choose model load parameters"; enable it. Afterwards, click on andy-4.1 in the "Your Models" section.
Next, find the "Context Length" section and make sure it is set to at least 16000 (note that LM Studio doesn't use commas in large numbers). To be safe, check the "Remember settings for andy-4.1" box in the bottom left of that menu, then load the model.
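The same context length concern applies outside LM Studio. As a sketch for llama-cpp-python users, the helper below refuses context sizes under the 16000 minimum mentioned above and passes `n_ctx` through to `Llama.from_pretrained`. The function names and the 16384 default are assumptions for illustration; the repo and filename are the ones used in the llama-cpp-python snippet on this page.

```python
MIN_CONTEXT = 16000  # minimum context length recommended above


def load_kwargs(n_ctx: int = 16384) -> dict:
    """Keyword arguments for Llama.from_pretrained with a large enough context window."""
    if n_ctx < MIN_CONTEXT:
        raise ValueError(f"n_ctx={n_ctx} is below the recommended minimum of {MIN_CONTEXT}")
    return {
        "repo_id": "Mindcraft-CE/Andy-4.1-GGUF",
        "filename": "andy-4.1.f16.gguf",
        "n_ctx": n_ctx,
    }


def load_andy(n_ctx: int = 16384):
    """Download (if needed) and load the model; requires `pip install llama-cpp-python`."""
    from llama_cpp import Llama  # imported lazily so load_kwargs works without it

    return Llama.from_pretrained(**load_kwargs(n_ctx))


print(load_kwargs())
```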
What's Next?
Based on the lessons from Andy-4.1, the Mindcraft team is prepared to collect better training data, explore new architectures that make Andy models cheaper to run, and pack more brains into these tiny minds.
Licenses and Notices
Like all other Andy models, Andy-4.1 is covered by the Andy license terms. While generally permissive, the license contains qualifiers as to what makes an "Andy" class model.
See Andy 2.0 License.
This work uses data and models created by @Sweaterdog.
