Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ✅ Based on `model.onnx` *with* slimming
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
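To relate the quantization names above to the `dtype` option that the README change introduces, the following is a minimal sketch of how a dtype name could be mapped to one of the files in this commit. `resolveOnnxFile` and the `DTYPE_SUFFIX` table are hypothetical names used for illustration. The suffixes simply mirror the file names listed above and are an assumption about how a loader might pick a file, not a statement of the library's internals.

```javascript
// Hypothetical lookup table: dtype name -> file-name suffix.
// The entries mirror the files added in this commit; the fp32 entry assumes
// that the unquantized weights are the base file `model.onnx`.
const DTYPE_SUFFIX = Object.freeze({
  fp32: '',
  int8: '_int8',
  uint8: '_uint8',
  q4: '_q4',
  q4f16: '_q4f16',
  bnb4: '_bnb4',
});

// Hypothetical helper: build the repo-relative path for a given dtype.
function resolveOnnxFile(dtype, baseName = 'model') {
  if (!Object.prototype.hasOwnProperty.call(DTYPE_SUFFIX, dtype)) {
    throw new Error(`Unsupported dtype: ${dtype}`);
  }
  return `onnx/${baseName}${DTYPE_SUFFIX[dtype]}.onnx`;
}

console.log(resolveOnnxFile('q4f16'));
console.log(resolveOnnxFile('fp32'));
```

Keeping the mapping in a frozen table makes an unsupported dtype fail loudly instead of silently producing a path to a file that does not exist in the repo.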
- README.md +5 -5
- onnx/model_bnb4.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
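Each ONNX file in the list above shows `+3 -0` because the commit stores a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the binary weights themselves. As a rough sketch, such a pointer could be parsed as follows. `parseLfsPointer` is a hypothetical helper written for this illustration, and the sample text is the pointer for `onnx/model_q4f16.onnx` taken from this commit.

```javascript
// Hypothetical helper: parse the text of a Git LFS pointer file into an object.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue;
    // Each pointer line has the form "<key> <value>".
    const space = trimmed.indexOf(' ');
    if (space === -1) throw new Error(`Malformed pointer line: ${trimmed}`);
    fields[trimmed.slice(0, space)] = trimmed.slice(space + 1);
  }
  // The oid is written as "<algorithm>:<hex digest>".
  const [algorithm, digest] = (fields.oid ?? '').split(':');
  const size = Number(fields.size);
  if (!fields.version || !algorithm || !digest || !Number.isInteger(size)) {
    throw new Error('Not a valid Git LFS pointer');
  }
  return { version: fields.version, algorithm, digest, size };
}

const pointer = parseLfsPointer([
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:6243589a55ebcf35efbbda314ee7947cbe16fff264a5b5422802bd06ba58afd7',
  'size 157999063',
].join('\n'));

// Report the size of the real file in bytes and in decimal megabytes.
console.log(pointer.size, `${(pointer.size / 1e6).toFixed(1)} MB`);
```

The `size` field is the byte count of the actual ONNX file, so the pointers in this commit are enough to compare the download size of each quantization without fetching any weights.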
README.md
CHANGED
@@ -18,19 +18,19 @@ https://huggingface.co/jinaai/jina-embeddings-v2-base-zh with ONNX weights to be
 
 ## Usage (Transformers.js)
 
-If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
 ```bash
-npm i @
+npm i @huggingface/transformers
 ```
 
 You can then use the model to compute embeddings, as follows:
 
 ```js
-import { pipeline, cos_sim } from '@
+import { pipeline, cos_sim } from '@huggingface/transformers';
 
 // Create a feature extraction pipeline
 const extractor = await pipeline('feature-extraction', 'Xenova/jina-embeddings-v2-base-zh', {
-
+    dtype: "fp32" // Options: "fp32", "fp16", "q8", "q4"
 });
 
 // Compute sentence embeddings
@@ -51,4 +51,4 @@ console.log(score);
 
 ---
 
-Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
+Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
onnx/model_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab487593d613a365b9616327ed23bd51130631cc45d0568abbff63799d808a06
+size 251953322
onnx/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6a88e28e8713a442a8fbe4ca35256d3eab738027165dda23760bfb543ee8d54b
+size 160893544
onnx/model_q4.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6ed6eb4662acd75cc16670350638814914810813930908df86be533b3eee7684
+size 259030670
onnx/model_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6243589a55ebcf35efbbda314ee7947cbe16fff264a5b5422802bd06ba58afd7
+size 157999063
onnx/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2daf330acbaa1678186ca72aeca61b3da02787b3895610ba063e579ae9d723c
+size 160893588