---
library_name: transformers
tags:
- cross-encoder
license: apache-2.0
datasets:
- disi-unibo-nlp/foodex2-clean
base_model:
- microsoft/deberta-v3-large
---

# FoodEx2 Baseterm Reranker

This is a CrossEncoder reranker model fine-tuned from [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) for the FoodEx2 domain. It is designed to re-rank candidate sentences by their relevance and suitability for food description and classification tasks. The model was trained on the `disi-unibo-nlp/foodex2-clean` dataset, with additional negative examples drawn from `disi-unibo-nlp/foodex2-terms`.

## Model Details

- **Model Type:** CrossEncoder Reranker
- **Base Model:** [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large)
- **Maximum Sequence Length:** 256 tokens
- **Training Epochs:** 10
- **Batch Size:** 256
- **Evaluation Steps:** 15
- **Warmup Steps:** 10

This model is optimized for reranking tasks where the goal is to select the most relevant sentence(s) from a set of candidates given a food-related description. As a cross-encoder, it encodes each sentence pair jointly and outputs a relevance score, allowing it to capture subtle distinctions in food terminology.

## Training Details

The model was trained using a custom training script with the following key parameters:

- **Dataset:** `disi-unibo-nlp/foodex2-clean` for positive examples and `disi-unibo-nlp/foodex2-terms` for negatives.
- **Task Number:** 1
- **Validation Ratio:** 10%
- **Evaluation on Test Set:** Enabled
- **Loss Function:** A loss function suited for ranking tasks was applied during training.

Training was run from a Docker-based script to ensure reproducibility and ease of deployment, using GPUs to accelerate computation and saving model checkpoints periodically.

## Evaluation

The model was evaluated on the test set with the following metrics:

| Metric | Value |
|:--------------------|:-----------|
| **Accuracy@1** | 0.9603 |
| **Accuracy@3** | 0.9958 |
| **Accuracy@5** | 1.0000 |
| **Accuracy@10** | 1.0000 |
| **Precision@1** | 0.9603 |
| **Recall@1** | 0.8472 |
| **Precision@3** | 0.4167 |
| **Recall@3** | 0.9859 |
| **Precision@5** | 0.2971 |
| **Recall@5** | 0.9974 |
| **Precision@10** | 0.2583 |
| **Recall@10** | 0.9996 |
| **MRR@10** | 0.9781 |
| **NDCG@10** | 0.9817 |
| **MAP@100** | 0.9736 |
| **Avg Seconds per Example** | 0.00139 |

Additionally, the model achieved a **binary score** of 0.9861 on a threshold-based evaluation, with the best threshold identified at 0.5454 and an F1 score of 0.3328 at that threshold.
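The ranking metrics above follow their standard definitions. The sketch below is illustrative only (it is not the evaluation script behind this card, and the `ranking_metrics` helper is hypothetical); it assumes each query's candidates have already been sorted by descending reranker score and labeled with 0/1 relevance:

```python
from typing import Dict, List


def ranking_metrics(ranked_relevance: List[List[int]], ks=(1, 3, 5, 10)) -> Dict[str, float]:
    """Compute Accuracy@k and MRR@10 from per-query relevance lists,
    where ranked_relevance[i] holds 0/1 labels for query i's candidates
    sorted by descending reranker score."""
    n = len(ranked_relevance)
    metrics = {}
    for k in ks:
        # Accuracy@k: fraction of queries with at least one relevant hit in the top k.
        hits = sum(1 for rels in ranked_relevance if any(rels[:k]))
        metrics[f"accuracy@{k}"] = hits / n
    # MRR@10: mean reciprocal rank of the first relevant hit within the top 10.
    reciprocal_ranks = []
    for rels in ranked_relevance:
        rank = next((i + 1 for i, r in enumerate(rels[:10]) if r), None)
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    metrics["mrr@10"] = sum(reciprocal_ranks) / n
    return metrics


# Example: two queries whose candidates were sorted by reranker score.
print(ranking_metrics([[1, 0, 0, 0], [0, 1, 0, 0]]))
```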
## Usage

To use the baseterm-reranker in your application, follow these steps:

1. **Install Dependencies:**

   ```bash
   pip install transformers torch
   ```

2. **Load the Model:**

   ```python
   from transformers import AutoTokenizer, AutoModelForSequenceClassification
   import torch

   tokenizer = AutoTokenizer.from_pretrained("your-username/baseterm-reranker")
   model = AutoModelForSequenceClassification.from_pretrained("your-username/baseterm-reranker")

   # Example usage: scoring a candidate sentence pair.
   # A cross-encoder encodes the two sentences jointly as a single pair.
   inputs = tokenizer("Your first sentence", "Your second sentence",
                      return_tensors="pt", padding=True, truncation=True, max_length=256)
   with torch.no_grad():
       outputs = model(**inputs)
   scores = outputs.logits
   print(scores)
   ```

## Citation

If you use this model in your research, please cite the relevant works:

```bibtex
@article{deberta,
  title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
  author={He, Pengcheng and Liu, Xiaodong and Gao, Jianfeng and Chen, Weizhu},
  journal={arXiv preprint arXiv:2006.03654},
  year={2020}
}
```

## License

This model is released under the Apache 2.0 License.