Inferless/SOLAR-10.7B-Instruct-v1.0-GPTQ

rbgo committed on Jan 2, 2024

Commit 4b0db67 · 1 Parent(s): 5de8b0b

Update README.md

Files changed (1)

README.md +37 -0

@@ -79,3 +79,40 @@ Models are released as sharded safetensors files.
 <!-- README_AWQ.md-provided-files end -->
 <!-- README_AWQ.md-text-generation-webui start -->

+<!-- How to use start -->
+## How to use
+You will need the following software packages and Python libraries:
+```yaml
+build:
+  cuda_version: "12.1.1"
+  system_packages:
+    - "libssl-dev"
+  python_packages:
+    - "torch==2.1.2"
+    - "vllm==0.2.6"
+    - "transformers==4.36.2"
+    - "accelerate==0.25.0"
+```
+Here is the code for <b>app.py</b>:
+```python
+from vllm import LLM, SamplingParams
+
+
+class InferlessPythonModel:
+    def initialize(self):
+        # Create the sampling settings and load the GPTQ-quantized model once.
+        self.sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
+        self.llm = LLM(model="Inferless/SOLAR-10.7B-Instruct-v1.0-GPTQ", quantization="gptq", dtype="float16")
+
+    def infer(self, inputs):
+        prompts = inputs["prompt"]
+        result = self.llm.generate(prompts, self.sampling_params)
+        # Collect [text, token_ids] from the first completion of each prompt.
+        result_output = [[output.outputs[0].text, output.outputs[0].token_ids] for output in result]
+        # Only the first prompt's result is returned.
+        return {'generated_result': result_output[0]}
+
+    def finalize(self):
+        pass
+```
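
To make the shape of the `infer` response concrete, here is a minimal sketch that runs without a GPU or vLLM installed. It assumes the objects returned by `llm.generate` expose `outputs[0].text` and `outputs[0].token_ids`, as the code above relies on. `FakeCompletion`, `FakeRequestOutput`, and `format_result` are hypothetical stand-ins introduced only for illustration; they are not part of vLLM or Inferless.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCompletion:
    # Hypothetical stand-in for one generated completion.
    text: str
    token_ids: List[int]


@dataclass
class FakeRequestOutput:
    # Hypothetical stand-in for the per-prompt result object.
    outputs: List[FakeCompletion]


def format_result(result):
    # Same post-processing as in infer(): one [text, token_ids] pair per prompt,
    # then only the first prompt's pair is returned.
    result_output = [[o.outputs[0].text, o.outputs[0].token_ids] for o in result]
    return {"generated_result": result_output[0]}


# Two fake per-prompt results, as if two prompts had been submitted.
fake_result = [
    FakeRequestOutput([FakeCompletion("Hello!", [1, 2, 3])]),
    FakeRequestOutput([FakeCompletion("Second answer", [4, 5])]),
]

response = format_result(fake_result)
print(response)
```

Because `result_output[0]` is returned, a caller that submits several prompts in `inputs["prompt"]` gets back only the first prompt's text and token IDs. Returning `result_output` in full would be one way to expose all of them, if that is the intended behavior.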