
SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3 to retrieve FoodEx2 facet descriptors for food descriptions. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 96 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 568M parameters (F32 safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 96, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
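
The 96-token limit and mean pooling can be confirmed from the loaded model itself; a quick check using the standard sentence-transformers accessors:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Inputs longer than this are truncated by the Transformer module
print(model.max_seq_length)                      # 96
# Dimensionality of the mean-pooled sentence embeddings
print(model.get_sentence_embedding_dimension())  # 1024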

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")
# Run inference
sentences = [
    'tome des bauges raw milk aoc in plastic container brand product name </s> This facet allows recording whether the food list code was chosen because of lack of information on the food item or because the proper entry in the food list was missing. Only one descriptor from this facet can be added to each entry.',
    'The food list item has been chosen because none of the more detailed items corresponded to the available information. Please consider the eventual addition of a new term in the list',
    'Deprecated term that must NOT be used for any purpose. Its original scopenote was: The group includes any type of Other fruiting vegetables (exposure). The part consumed/analysed is by default unspecified. When relevant, information on the part consumed/analysed has to be reported with additional facet descriptors.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
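
For retrieval, the intended workflow is to embed a query (a food description plus the facet scope note) against a corpus of candidate facet descriptors and rank by cosine similarity. A minimal sketch with util.semantic_search; the two descriptors below are illustrative stand-ins for the full descriptor catalogue:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Illustrative candidate descriptors (in practice, the full FoodEx2 descriptor list)
descriptors = [
    "Cooking by dry heat in or as if in an oven",
    "Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth",
]
corpus_embeddings = model.encode(descriptors, convert_to_tensor=True)

query_embedding = model.encode("peach fresh flesh baked with skin", convert_to_tensor=True)

# Rank all descriptors by cosine similarity to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(descriptors[hit["corpus_id"]], round(hit["score"], 3))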

Evaluation

Metrics

Device Aware Information Retrieval

  • Evaluated with src.utils.eval_functions.DeviceAwareInformationRetrievalEvaluator
Metric               Value
cosine_accuracy@1    0.985
cosine_accuracy@3    0.999
cosine_accuracy@5    0.9998
cosine_accuracy@10   1.0
cosine_precision@1   0.985
cosine_precision@3   0.4171
cosine_precision@5   0.2537
cosine_precision@10  0.1275
cosine_recall@1      0.8691
cosine_recall@3      0.9939
cosine_recall@5      0.9985
cosine_recall@10     0.9999
cosine_ndcg@10       0.9936
cosine_mrr@10        0.9919
cosine_map@100       0.9909
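
DeviceAwareInformationRetrievalEvaluator is project-specific code that is not shipped with this model. Comparable numbers can be computed with the stock InformationRetrievalEvaluator from sentence-transformers; the queries, corpus, and relevance judgments below are hypothetical placeholders for the held-out evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Placeholder evaluation data: id -> text, plus relevant descriptor ids per query
queries = {"q1": "peach fresh flesh baked with skin"}
corpus = {
    "d1": "Cooking by dry heat in or as if in an oven",
    "d2": "Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="foodex-facet-descriptors",
)
results = evaluator(model)  # dict with accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
print(results)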

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,225,740 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
             sentence_0          sentence_1          sentence_2
    type     string              string              string
    details  min: 37 tokens      min: 6 tokens       min: 5 tokens
             mean: 89.82 tokens  mean: 39.38 tokens  mean: 39.59 tokens
             max: 96 tokens      max: 96 tokens      max: 96 tokens
  • Samples:
    Sample 1
      sentence_0: peach fresh flesh baked with skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
      sentence_1: Cooking by dry heat in or as if in an oven
      sentence_2: Previously cooked or heat-treated fodd, heated again in order to raise its temperature (all different techniques)
    Sample 2
      sentence_0: turkey breast with bones frozen barbecued without skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
      sentence_1: Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth
      sentence_2: Drying to a water content low enough to guarantee microbiological stability, but still keeping a relatively soft structure (often used for fruit)
    Sample 3
      sentence_0: yoghurt flavoured cow blueberry sweetened with sugar sucrose whole in glass commercial supermarket shop organic shop brand product name This facet provides some principal claims related to important nutrients-ingredients, like fat, sugar etc. It is not intended to include health claims or similar. The present guidance provides a limited list, to be eventually improved during the evolution of the system. More than one descriptor can be applied to each entry, provided they are not contradicting each other.
      sentence_1: The food item has all the natural (or average expected) fat content (for milk, at least the value defined in legislation, when available). In the case of cheese, the fat on the dry matter is 45-60%
      sentence_2: The food item has an almost completely reduced amount of fat, with respect to the expected natural fat content (for milk, at least the value defined in legislation, when available). For meat, this is the entry for what is commercially intended as 'lean' meat, where fat is not visible. In the case of cheese, the fat on the dry matter is 10-25%
  • Loss: MultipleNegativesRankingLoss with these parameters (see the sketch after this list):
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
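
MultipleNegativesRankingLoss scores each anchor (sentence_0) against its positive (sentence_1), its hard negative (sentence_2), and all other in-batch candidates. A minimal sketch of constructing the same loss:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-m3")

# Cosine similarities are multiplied by 20 before the softmax cross-entropy,
# so each anchor must rank its positive above the hard negative and all
# in-batch negatives
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
)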
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
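
Wired together, these settings correspond to a SentenceTransformerTrainer run along the following lines; a sketch assuming a datasets.Dataset with the sentence_0/sentence_1/sentence_2 columns described above (the output path and one-row dataset are placeholders, and evaluation is omitted since the actual evaluator and split are not published):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("BAAI/bge-m3")
model.max_seq_length = 96

# Placeholder for the 1,225,740 (anchor, positive, hard-negative) triplets
train_dataset = Dataset.from_dict({
    "sentence_0": ["peach fresh flesh baked with skin This facet allows ..."],
    "sentence_1": ["Cooking by dry heat in or as if in an oven"],
    "sentence_2": ["Drying to a water content low enough to guarantee microbiological stability"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
    # eval_strategy="steps" was used during training; it additionally
    # requires an eval_dataset or evaluator, so it is omitted here
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.MultipleNegativesRankingLoss(model, scale=20.0),
)
trainer.train()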

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0 0 - 0.0266
0.0196 500 1.5739 -
0.0392 1000 0.9043 -
0.0587 1500 0.8234 -
0.0783 2000 0.7861 -
0.0979 2500 0.7628 -
0.1175 3000 0.7348 -
0.1371 3500 0.7184 -
0.1566 4000 0.7167 -
0.1762 4500 0.7002 -
0.1958 5000 0.6791 0.9264
0.2154 5500 0.6533 -
0.2350 6000 0.6628 -
0.2545 6500 0.6637 -
0.2741 7000 0.639 -
0.2937 7500 0.6395 -
0.3133 8000 0.6358 -
0.3329 8500 0.617 -
0.3524 9000 0.6312 -
0.3720 9500 0.6107 -
0.3916 10000 0.6083 0.9518
0.4112 10500 0.6073 -
0.4307 11000 0.601 -
0.4503 11500 0.6047 -
0.4699 12000 0.5986 -
0.4895 12500 0.5913 -
0.5091 13000 0.5992 -
0.5286 13500 0.5911 -
0.5482 14000 0.5923 -
0.5678 14500 0.5816 -
0.5874 15000 0.582 0.9628
0.6070 15500 0.5815 -
0.6265 16000 0.5827 -
0.6461 16500 0.5885 -
0.6657 17000 0.5737 -
0.6853 17500 0.577 -
0.7049 18000 0.5687 -
0.7244 18500 0.5744 -
0.7440 19000 0.5774 -
0.7636 19500 0.5792 -
0.7832 20000 0.5645 0.9739
0.8028 20500 0.5769 -
0.8223 21000 0.5659 -
0.8419 21500 0.5635 -
0.8615 22000 0.5677 -
0.8811 22500 0.5693 -
0.9007 23000 0.5666 -
0.9202 23500 0.5526 -
0.9398 24000 0.5591 -
0.9594 24500 0.563 -
0.9790 25000 0.555 0.9808
0.9986 25500 0.5585 -
1.0 25537 - 0.9811
1.0181 26000 0.5595 -
1.0377 26500 0.5507 -
1.0573 27000 0.5582 -
1.0769 27500 0.5543 -
1.0964 28000 0.5598 -
1.1160 28500 0.5613 -
1.1356 29000 0.5457 -
1.1552 29500 0.5524 -
1.1748 30000 0.5324 0.9836
1.1943 30500 0.5531 -
1.2139 31000 0.5505 -
1.2335 31500 0.5623 -
1.2531 32000 0.5505 -
1.2727 32500 0.5583 -
1.2922 33000 0.548 -
1.3118 33500 0.5485 -
1.3314 34000 0.5509 -
1.3510 34500 0.54 -
1.3706 35000 0.5478 0.9835
1.3901 35500 0.5416 -
1.4097 36000 0.5438 -
1.4293 36500 0.543 -
1.4489 37000 0.547 -
1.4685 37500 0.5362 -
1.4880 38000 0.5536 -
1.5076 38500 0.5356 -
1.5272 39000 0.5382 -
1.5468 39500 0.5481 -
1.5664 40000 0.5302 0.9880
1.5859 40500 0.5275 -
1.6055 41000 0.5327 -
1.6251 41500 0.5414 -
1.6447 42000 0.5354 -
1.6643 42500 0.536 -
1.6838 43000 0.5364 -
1.7034 43500 0.5391 -
1.7230 44000 0.5342 -
1.7426 44500 0.5369 -
1.7621 45000 0.5387 0.9894
1.7817 45500 0.5312 -
1.8013 46000 0.5297 -
1.8209 46500 0.5222 -
1.8405 47000 0.5255 -
1.8600 47500 0.5379 -
1.8796 48000 0.5317 -
1.8992 48500 0.5312 -
1.9188 49000 0.5307 -
1.9384 49500 0.5375 -
1.9579 50000 0.527 0.9908
1.9775 50500 0.538 -
1.9971 51000 0.5312 -
2.0 51074 - 0.9911
2.0167 51500 0.5346 -
2.0363 52000 0.5279 -
2.0558 52500 0.517 -
2.0754 53000 0.5193 -
2.0950 53500 0.5286 -
2.1146 54000 0.5229 -
2.1342 54500 0.5183 -
2.1537 55000 0.5194 0.9915
2.1733 55500 0.5362 -
2.1929 56000 0.5186 -
2.2125 56500 0.5202 -
2.2321 57000 0.5276 -
2.2516 57500 0.5266 -
2.2712 58000 0.5334 -
2.2908 58500 0.5206 -
2.3104 59000 0.5229 -
2.3300 59500 0.5111 -
2.3495 60000 0.5175 0.9928
2.3691 60500 0.5235 -
2.3887 61000 0.5127 -
2.4083 61500 0.5291 -
2.4278 62000 0.5122 -
2.4474 62500 0.5196 -
2.4670 63000 0.5159 -
2.4866 63500 0.5207 -
2.5062 64000 0.5157 -
2.5257 64500 0.5094 -
2.5453 65000 0.5283 0.9937
2.5649 65500 0.5256 -
2.5845 66000 0.524 -
2.6041 66500 0.5324 -
2.6236 67000 0.5132 -
2.6432 67500 0.5203 -
2.6628 68000 0.5224 -
2.6824 68500 0.5255 -
2.7020 69000 0.5132 -
2.7215 69500 0.525 -
2.7411 70000 0.5257 0.9936
2.7607 70500 0.5206 -
2.7803 71000 0.514 -
2.7999 71500 0.5175 -
2.8194 72000 0.5245 -
2.8390 72500 0.5144 -
2.8586 73000 0.5246 -
2.8782 73500 0.5227 -
2.8978 74000 0.5199 -
2.9173 74500 0.5216 -
2.9369 75000 0.5253 0.9936
2.9565 75500 0.5303 -
2.9761 76000 0.5148 -
2.9957 76500 0.5248 -
3.0 76611 - 0.9936

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.4.0
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0
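
To recreate this environment, the versions above can be pinned at install time (the CUDA 12.4 PyTorch build may require the matching extra index URL):

pip install "sentence-transformers==3.4.1" "transformers==4.49.0" "torch==2.6.0" "accelerate==1.4.0" "datasets==3.3.1" "tokenizers==0.21.0"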

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}