How to use VinitT/Embeddings-NLI-ContradictionMargin with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("VinitT/Embeddings-NLI-ContradictionMargin")
sentences = [
"Guy wearing sunglasses and blue shirt on skateboard in front of a bright yellow building with palm trees.",
"Two people are standing by the street.",
"A man rides a skateboard outside.",
"The boys are inside laying down."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from VinitT/Embeddings-Trivia on the all-nli dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
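The architecture printout lists mean-token pooling followed by a Normalize() module. As a rough illustration of what those two stages do to the token vectors, here is a minimal pure-Python sketch; the helper names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are assumptions made for this example (the real model works with 384 dimensions and batched tensors).

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input (hypothetical): three token vectors of dimension 2, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.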
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("VinitT/Embeddings-NLI-ContradictionMargin")
# Run inference
sentences = [
'so he has overcome alcoholism at this point',
"He's gotten stronger and has overcome alcoholism.",
"He still is a heavy drinker and can't control it.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7603, 0.0849],
# [0.7603, 1.0000, 0.0794],
# [0.0849, 0.0794, 1.0000]])
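To read the similarity matrix above as a ranking, one option is to sort each row by score. The sketch below hard-codes the printed values so it runs without downloading the model; `rank_neighbours` is a hypothetical helper written for this example, not part of the sentence-transformers API.

```python
sentences = [
    "so he has overcome alcoholism at this point",
    "He's gotten stronger and has overcome alcoholism.",
    "He still is a heavy drinker and can't control it.",
]

# Similarity matrix copied from the example output above.
similarities = [
    [1.0000, 0.7603, 0.0849],
    [0.7603, 1.0000, 0.0794],
    [0.0849, 0.0794, 1.0000],
]

def rank_neighbours(row_index, matrix):
    """Return the other row indices, ordered from most to least similar."""
    scores = [(score, j) for j, score in enumerate(matrix[row_index]) if j != row_index]
    return [j for score, j in sorted(scores, reverse=True)]

query = 0
for j in rank_neighbours(query, similarities):
    print(f"{similarities[query][j]:.4f}  {sentences[j]}")
```

For the first sentence, the entailed paraphrase ranks well above the contradicting sentence, which is the behaviour the contradiction margin is meant to encourage.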
Dataset: contra_eval, evaluated with TripletEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy | 0.95 |
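As a rough guide to what a triplet cosine accuracy measures, the sketch below counts the share of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative. It is a simplified stand-in under that assumption, not the TripletEvaluator source; the function names and toy vectors are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_cosine_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

# Hypothetical 2-dimensional embeddings: (anchor, positive, negative).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: correct
    ([1.0, 1.0], [1.0, 0.0], [1.0, 1.0]),   # negative is closer: wrong
    ([1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]),  # positive is closer: correct
]

print(triplet_cosine_accuracy(toy_triplets))
```

Under this reading, a cosine_accuracy of 0.95 means the positive outranked the negative in 95% of the evaluation triplets.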
Columns: anchor, positive, negative, and label

| | anchor | positive | negative | label |
|---|---|---|---|---|
| type | string | string | string | int |

Samples:

| anchor | positive | negative | label |
|---|---|---|---|
| a young girl wearing blue smiles. | A little girl wears blue. | A little girl frowns as she wears an ugly burlap sack. | 1 |
| An old man wearing a tan jacket and blue pants standing on a sidewalk with a small suitcase. | A man wearing a jacket and jeans holds a suitcase. | A young woman sits on a bench holding her purse. | 1 |
| The people are inside. | Two people are dancing by a red couch. | People walk up and down the steps in front of a church. | 1 |

Loss: custom_loss.ContradictionMarginLoss with these parameters:
{
    "margin_neutral": 0.2,
    "margin_contradiction": 0.4
}
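The card does not include the custom_loss module, so the actual definition of ContradictionMarginLoss is unknown. One plausible reading of the two parameters is a triplet-style hinge loss on cosine similarities whose margin is larger when the negative contradicts the anchor than when it is merely neutral. The sketch below implements that guess only; the function name, its signature, and the boolean flag (used instead of guessing how the integer label column is encoded) are all assumptions.

```python
def contradiction_margin_loss(sim_pos, sim_neg, negative_is_contradiction,
                              margin_neutral=0.2, margin_contradiction=0.4):
    """Hypothetical hinge loss: the positive must beat the negative by a margin.

    sim_pos: cosine similarity between anchor and positive.
    sim_neg: cosine similarity between anchor and negative.
    negative_is_contradiction: selects which margin applies (assumed behaviour).
    """
    margin = margin_contradiction if negative_is_contradiction else margin_neutral
    return max(0.0, sim_neg - sim_pos + margin)

# Same similarity gap, different margins.
print(contradiction_margin_loss(0.5, 0.3, negative_is_contradiction=False))
print(contradiction_margin_loss(0.5, 0.3, negative_is_contradiction=True))

# A well-separated triplet incurs no loss under either margin.
print(contradiction_margin_loss(0.7603, 0.0849, negative_is_contradiction=True))
```

In this reading a gap of 0.2 already satisfies the neutral margin but is still penalised when the negative is a contradiction, which would push contradictions further from the anchor.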
Columns: anchor, positive, negative, and label

| | anchor | positive | negative | label |
|---|---|---|---|---|
| type | string | string | string | int |

Samples:

| anchor | positive | negative | label |
|---|---|---|---|
| An older man riding a bike. | An elderly man is biking | an old man is sleeping | 1 |
| The man is on a skateboard. | A shirtless man is doing a skateboard trick over a bike rail. | A man performs a bike trick on a ramp. | 1 |
| The Episcopalians are all going to hell. | The Episcopalians will not be going to heaven. | All Episcopalians will go to heaven. | 1 |

Loss: custom_loss.ContradictionMarginLoss with these parameters:
{
    "margin_neutral": 0.2,
    "margin_contradiction": 0.4
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- weight_decay: 0.01
- num_train_epochs: 1
- warmup_ratio: 0.1
- warmup_steps: 0.1
- fp16: True
- load_best_model_at_end: True

All hyperparameters:

- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0.1
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: True
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss | Validation Loss | contra_eval_cosine_accuracy |
|---|---|---|---|---|
| 0.0001 | 1 | 0.2363 | - | - |
| 0.0057 | 50 | 0.1877 | - | - |
| 0.0115 | 100 | 0.1786 | - | - |
| 0.0172 | 150 | 0.1672 | - | - |
| 0.0230 | 200 | 0.1529 | - | - |
| 0.0287 | 250 | 0.1392 | - | - |
| 0.0345 | 300 | 0.1278 | - | - |
| 0.0402 | 350 | 0.1233 | - | - |
| 0.0460 | 400 | 0.1157 | - | - |
| 0.0517 | 450 | 0.1116 | - | - |
| 0.0575 | 500 | 0.1063 | 0.0983 | 0.9260 |
| 0.0632 | 550 | 0.1087 | - | - |
| 0.0690 | 600 | 0.1016 | - | - |
| 0.0747 | 650 | 0.1026 | - | - |
| 0.0805 | 700 | 0.0967 | - | - |
| 0.0862 | 750 | 0.0990 | - | - |
| 0.0919 | 800 | 0.0925 | - | - |
| 0.0977 | 850 | 0.0965 | - | - |
| 0.1034 | 900 | 0.0981 | - | - |
| 0.1092 | 950 | 0.0881 | - | - |
| 0.1149 | 1000 | 0.0920 | 0.0829 | 0.9410 |
| 0.1207 | 1050 | 0.0882 | - | - |
| 0.1264 | 1100 | 0.0839 | - | - |
| 0.1322 | 1150 | 0.0896 | - | - |
| 0.1379 | 1200 | 0.0858 | - | - |
| 0.1437 | 1250 | 0.0878 | - | - |
| 0.1494 | 1300 | 0.0857 | - | - |
| 0.1552 | 1350 | 0.0902 | - | - |
| 0.1609 | 1400 | 0.0793 | - | - |
| 0.1666 | 1450 | 0.0830 | - | - |
| 0.1724 | 1500 | 0.0827 | 0.0788 | 0.9380 |
| 0.1781 | 1550 | 0.0789 | - | - |
| 0.1839 | 1600 | 0.0834 | - | - |
| 0.1896 | 1650 | 0.0805 | - | - |
| 0.1954 | 1700 | 0.0795 | - | - |
| 0.2011 | 1750 | 0.0846 | - | - |
| 0.2069 | 1800 | 0.0822 | - | - |
| 0.2126 | 1850 | 0.0858 | - | - |
| 0.2184 | 1900 | 0.0785 | - | - |
| 0.2241 | 1950 | 0.0777 | - | - |
| 0.2299 | 2000 | 0.0746 | 0.0721 | 0.9460 |
| 0.2356 | 2050 | 0.0798 | - | - |
| 0.2414 | 2100 | 0.0798 | - | - |
| 0.2471 | 2150 | 0.0794 | - | - |
| 0.2528 | 2200 | 0.0769 | - | - |
| 0.2586 | 2250 | 0.0805 | - | - |
| 0.2643 | 2300 | 0.0782 | - | - |
| 0.2701 | 2350 | 0.0776 | - | - |
| 0.2758 | 2400 | 0.0776 | - | - |
| 0.2816 | 2450 | 0.0733 | - | - |
| 0.2873 | 2500 | 0.0750 | 0.0718 | 0.9440 |
| 0.2931 | 2550 | 0.0764 | - | - |
| 0.2988 | 2600 | 0.0775 | - | - |
| 0.3046 | 2650 | 0.0767 | - | - |
| 0.3103 | 2700 | 0.0766 | - | - |
| 0.3161 | 2750 | 0.0755 | - | - |
| 0.3218 | 2800 | 0.0752 | - | - |
| 0.3275 | 2850 | 0.0717 | - | - |
| 0.3333 | 2900 | 0.0714 | - | - |
| 0.3390 | 2950 | 0.0726 | - | - |
| 0.3448 | 3000 | 0.0751 | 0.0695 | 0.9470 |
| 0.3505 | 3050 | 0.0730 | - | - |
| 0.3563 | 3100 | 0.0733 | - | - |
| 0.3620 | 3150 | 0.0738 | - | - |
| 0.3678 | 3200 | 0.0701 | - | - |
| 0.3735 | 3250 | 0.0723 | - | - |
| 0.3793 | 3300 | 0.0759 | - | - |
| 0.3850 | 3350 | 0.0675 | - | - |
| 0.3908 | 3400 | 0.0696 | - | - |
| 0.3965 | 3450 | 0.0707 | - | - |
| 0.4023 | 3500 | 0.0705 | 0.0669 | 0.9440 |
| 0.4080 | 3550 | 0.0702 | - | - |
| 0.4137 | 3600 | 0.0716 | - | - |
| 0.4195 | 3650 | 0.0697 | - | - |
| 0.4252 | 3700 | 0.0721 | - | - |
| 0.4310 | 3750 | 0.0723 | - | - |
| 0.4367 | 3800 | 0.0741 | - | - |
| 0.4425 | 3850 | 0.0702 | - | - |
| 0.4482 | 3900 | 0.0653 | - | - |
| 0.4540 | 3950 | 0.0704 | - | - |
| 0.4597 | 4000 | 0.0718 | 0.0652 | 0.9450 |
| 0.4655 | 4050 | 0.0683 | - | - |
| 0.4712 | 4100 | 0.0719 | - | - |
| 0.4770 | 4150 | 0.0674 | - | - |
| 0.4827 | 4200 | 0.0659 | - | - |
| 0.4884 | 4250 | 0.0735 | - | - |
| 0.4942 | 4300 | 0.0737 | - | - |
| 0.4999 | 4350 | 0.0707 | - | - |
| 0.5057 | 4400 | 0.0690 | - | - |
| 0.5114 | 4450 | 0.0707 | - | - |
| 0.5172 | 4500 | 0.0696 | 0.0637 | 0.9470 |
| 0.5229 | 4550 | 0.0686 | - | - |
| 0.5287 | 4600 | 0.0710 | - | - |
| 0.5344 | 4650 | 0.0681 | - | - |
| 0.5402 | 4700 | 0.0667 | - | - |
| 0.5459 | 4750 | 0.0673 | - | - |
| 0.5517 | 4800 | 0.0618 | - | - |
| 0.5574 | 4850 | 0.0715 | - | - |
| 0.5632 | 4900 | 0.0703 | - | - |
| 0.5689 | 4950 | 0.0675 | - | - |
| 0.5746 | 5000 | 0.0715 | 0.0638 | 0.9500 |
| 0.5804 | 5050 | 0.0681 | - | - |
| 0.5861 | 5100 | 0.0628 | - | - |
| 0.5919 | 5150 | 0.0654 | - | - |
| 0.5976 | 5200 | 0.0662 | - | - |
| 0.6034 | 5250 | 0.0626 | - | - |
| 0.6091 | 5300 | 0.0660 | - | - |
| 0.6149 | 5350 | 0.0652 | - | - |
| 0.6206 | 5400 | 0.0687 | - | - |
| 0.6264 | 5450 | 0.0677 | - | - |
| 0.6321 | 5500 | 0.0683 | 0.0631 | 0.9530 |
| 0.6379 | 5550 | 0.0666 | - | - |
| 0.6436 | 5600 | 0.0663 | - | - |
| 0.6494 | 5650 | 0.0637 | - | - |
| 0.6551 | 5700 | 0.0687 | - | - |
| 0.6608 | 5750 | 0.0620 | - | - |
| 0.6666 | 5800 | 0.0664 | - | - |
| 0.6723 | 5850 | 0.0666 | - | - |
| 0.6781 | 5900 | 0.0632 | - | - |
| 0.6838 | 5950 | 0.0676 | - | - |
| 0.6896 | 6000 | 0.0638 | 0.0634 | 0.9530 |
| 0.6953 | 6050 | 0.0655 | - | - |
| 0.7011 | 6100 | 0.0651 | - | - |
| 0.7068 | 6150 | 0.0675 | - | - |
| 0.7126 | 6200 | 0.0685 | - | - |
| 0.7183 | 6250 | 0.0647 | - | - |
| 0.7241 | 6300 | 0.0609 | - | - |
| 0.7298 | 6350 | 0.0643 | - | - |
| 0.7355 | 6400 | 0.0628 | - | - |
| 0.7413 | 6450 | 0.0627 | - | - |
| 0.7470 | 6500 | 0.0639 | 0.0621 | 0.9540 |
| 0.7528 | 6550 | 0.0658 | - | - |
| 0.7585 | 6600 | 0.0667 | - | - |
| 0.7643 | 6650 | 0.0632 | - | - |
| 0.7700 | 6700 | 0.0616 | - | - |
| 0.7758 | 6750 | 0.0666 | - | - |
| 0.7815 | 6800 | 0.0634 | - | - |
| 0.7873 | 6850 | 0.0647 | - | - |
| 0.7930 | 6900 | 0.0644 | - | - |
| 0.7988 | 6950 | 0.0617 | - | - |
| 0.8045 | 7000 | 0.0677 | 0.0626 | 0.9510 |
| 0.8103 | 7050 | 0.0616 | - | - |
| 0.8160 | 7100 | 0.0633 | - | - |
| 0.8217 | 7150 | 0.0645 | - | - |
| 0.8275 | 7200 | 0.0656 | - | - |
| 0.8332 | 7250 | 0.0597 | - | - |
| 0.8390 | 7300 | 0.0670 | - | - |
| 0.8447 | 7350 | 0.0638 | - | - |
| 0.8505 | 7400 | 0.0641 | - | - |
| 0.8562 | 7450 | 0.0660 | - | - |
| 0.8620 | 7500 | 0.0687 | 0.0618 | 0.9490 |
| 0.8677 | 7550 | 0.0654 | - | - |
| 0.8735 | 7600 | 0.0633 | - | - |
| 0.8792 | 7650 | 0.0660 | - | - |
| 0.8850 | 7700 | 0.0674 | - | - |
| 0.8907 | 7750 | 0.0681 | - | - |
| 0.8964 | 7800 | 0.0601 | - | - |
| 0.9022 | 7850 | 0.0612 | - | - |
| 0.9079 | 7900 | 0.0626 | - | - |
| 0.9137 | 7950 | 0.0641 | - | - |
| 0.9194 | 8000 | 0.0633 | 0.0619 | 0.9470 |
| 0.9252 | 8050 | 0.0637 | - | - |
| 0.9309 | 8100 | 0.0630 | - | - |
| 0.9367 | 8150 | 0.0646 | - | - |
| 0.9424 | 8200 | 0.0648 | - | - |
| 0.9482 | 8250 | 0.0647 | - | - |
| 0.9539 | 8300 | 0.0601 | - | - |
| 0.9597 | 8350 | 0.0600 | - | - |
| 0.9654 | 8400 | 0.0668 | - | - |
| 0.9712 | 8450 | 0.0640 | - | - |
| 0.9769 | 8500 | 0.0579 | 0.0618 | 0.9500 |
| 0.9826 | 8550 | 0.0645 | - | - |
| 0.9884 | 8600 | 0.0614 | - | - |
| 0.9941 | 8650 | 0.0642 | - | - |
| 0.9999 | 8700 | 0.0652 | - | - |
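The hyperparameters list learning_rate 2e-05, lr_scheduler_type linear, and a warmup ratio of 0.1, and the log above runs to about 8,700 steps. As a rough sketch of the learning-rate curve those settings imply, the function below ramps linearly up during warmup and then decays linearly to zero. The function name and the figures of 8,700 total steps and 870 warmup steps are assumptions taken from the table, not values reported by the trainer.

```python
def linear_schedule_lr(step, total_steps, warmup_steps, base_lr=2e-05):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 8700   # assumption: last step shown in the training log
WARMUP_STEPS = 870   # assumption: 10% of TOTAL_STEPS

for step in (0, 435, 870, 4785, 8700):
    print(step, linear_schedule_lr(step, TOTAL_STEPS, WARMUP_STEPS))
```

Under these assumptions the peak rate is reached around step 870, which falls between the first and second evaluation rows of the log.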
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: sentence-transformers/all-MiniLM-L6-v2