Update README.md
README.md
CHANGED

````diff
@@ -15,11 +15,11 @@ library_name: transformers
 
 <div id="top" align="center">
 
-
+<p style="font-size: 36px; font-weight: bold;">Knowledge Fusion of Large Language Models</p>
 
 
 <h4> |<a href="https://arxiv.org/abs/2401.10491"> 📑 Paper </a> |
-<a href="https://huggingface.co/
+<a href="https://huggingface.co/FuseAI"> 🤗 Huggingface Repo </a> |
 <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 Github Repo </a> |
 </h4>
 
````
````diff
@@ -38,8 +38,7 @@ _<sup>†</sup> Sun Yat-sen University,
 
 
 ## News
-- **Jan
-- **Jan 22, 2024:** 🔥 We're excited to announce that the FuseLLM-7B, which is the fusion of [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b), is now available on 🤗 [Huggingface Models](https://huggingface.co/Wanfq/FuseLLM-7B). Happy exploring!
+- **Jan 22, 2024:** 🔥 We release [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B), which is the fusion of three open-source foundation LLMs with distinct architectures, including [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b).
 
 
 ## WIP
````
````diff
@@ -59,7 +58,6 @@ _<sup>†</sup> Sun Yat-sen University,
 - [Training](#training)
 - [Evaluation](#evaluation)
 - [Citation](#citation)
-- [Acknowledgements](#acknowledgments)
 
 ## Overview
 
````
````diff
@@ -134,9 +132,9 @@ pip install -r requirements.txt
 ### Usage
 
 ```python
-from transformers import AutoTokenizer,
+from transformers import AutoTokenizer, AutoModel
 tokenizer = AutoTokenizer.from_pretrained("Wanfq/FuseLLM-7B", use_fast=False)
-model =
+model = AutoModel.from_pretrained("Wanfq/FuseLLM-7B", torch_dtype="auto")
 model.cuda()
 inputs = tokenizer("<your text here>", return_tensors="pt").to(model.device)
 tokens = model.generate(
````
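For reference, a fuller, self-contained version of the usage snippet added in the hunk above might look like the sketch below. It is only a sketch: the generation arguments and the final decode step are assumptions (the hunk cuts off at `model.generate(`), and it loads the checkpoint with `AutoModelForCausalLM` so that `generate` has a language-modeling head to drive, whereas the committed snippet itself uses `AutoModel`.

```python
# Hedged sketch of the usage example from the diff above.
# Assumptions: AutoModelForCausalLM (the commit itself uses AutoModel) and
# the generate() arguments, which are not visible in the hunk.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wanfq/FuseLLM-7B", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("Wanfq/FuseLLM-7B", torch_dtype="auto")
model.cuda()  # move the model to GPU

inputs = tokenizer("<your text here>", return_tensors="pt").to(model.device)
tokens = model.generate(
    **inputs,
    max_new_tokens=128,  # assumed value, not shown in the hunk
    do_sample=True,      # assumed sampling configuration
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```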
````diff
@@ -351,16 +349,11 @@ The evaluation code we used in our evaluation are list as follows:
 
 If you find this work is relevant with your research or applications, please feel free to cite our work!
 ```
-@
-
-
-
-
-
-primaryClass={cs.CL}
+@inproceedings{wan2024knowledge,
+title={Knowledge Fusion of Large Language Models},
+author={Fanqi Wan and Xinting Huang and Deng Cai and Xiaojun Quan and Wei Bi and Shuming Shi},
+booktitle={The Twelfth International Conference on Learning Representations},
+year={2024},
+url={https://openreview.net/pdf?id=jiDsk12qcz}
 }
-```
-
-## Acknowledgments
-
-This repo benefits from [Stanford-Alpaca](https://github.com/tatsu-lab/stanford_alpaca) and [Explore-Instruct](https://github.com/fanqiwan/Explore-Instruct). Thanks for their wonderful works!
+```
````