Thank you for this!

#2
by arvnoodle - opened

Tried it on vLLM and it's pretty much working! I do have a question, though: do you think there's a possibility that the 405B Hermes could be quantized to 4-bit too?

cyankiwi org

Thank you for trying my model :) I would really love to quantize the 405B Hermes, but it does not fit on my local setup, and unfortunately financial constraints do not allow me to rent cloud GPUs.
