---
language: en
tags:
- t5
- product-classification
- category-prediction
license: mit
---

# T5 Product Category & Subcategory Classifier

This model is a fine-tuned version of T5-base that predicts a product's category and subcategory from its name.

## Model Description

- **Model Type:** T5 (Text-to-Text Transfer Transformer)
- **Language:** English
- **Task:** Product Classification
- **Training Data:** 10,172 categorized products
- **Input Format:** "Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {product_name}"
- **Output Format:** "Category: {category} | Subcategory: {subcategory}"
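
Because both the prompt and the prediction are plain strings in a fixed format, it can help to wrap them in small helpers. The following is a minimal sketch, not part of the model's own code: `build_prompt`, `parse_prediction`, and the regular expression are hypothetical names introduced here, and the "Hair Care" / "Shampoo" values are made-up examples rather than real model output.

```python
import re
from typing import Optional, Tuple

# Prompt wording copied from the Input Format above.
PROMPT_TEMPLATE = (
    "Predict the product category and subcategory in the following format: "
    "'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {product_name}"
)

# Hypothetical pattern for the documented output format
# "Category: {category} | Subcategory: {subcategory}".
OUTPUT_PATTERN = re.compile(r"^\s*Category:\s*(.*?)\s*\|\s*Subcategory:\s*(.*?)\s*$")


def build_prompt(product_name: str) -> str:
    """Insert a product name into the documented prompt."""
    return PROMPT_TEMPLATE.format(product_name=product_name.strip())


def parse_prediction(text: str) -> Optional[Tuple[str, str]]:
    """Split a prediction into (category, subcategory).

    Returns None when the text does not follow the expected format,
    so callers can detect malformed generations instead of guessing.
    """
    match = OUTPUT_PATTERN.match(text)
    if match is None:
        return None
    category, subcategory = match.groups()
    if not category or not subcategory:
        return None
    return category, subcategory


# Illustrative values only, not actual model output.
print(parse_prediction("Category: Hair Care | Subcategory: Shampoo"))
```

Returning `None` for text that does not match is a design choice of this sketch: a generative model is not guaranteed to follow the format, so the caller decides how to handle a malformed prediction.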

## Usage

```python
# T5Tokenizer requires the `sentencepiece` package to be installed.
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Replace {repo_id} with this model's Hub repository ID.
model = T5ForConditionalGeneration.from_pretrained("{repo_id}")
tokenizer = T5Tokenizer.from_pretrained("{repo_id}")

def predict(text):
    # Wrap the product name in the same prompt format described above.
    prompt = f"Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {text}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=128, truncation=True)

    # Beam search with 4 beams; the output is capped at 32 tokens.
    outputs = model.generate(**inputs, max_length=32, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
result = predict("Pantene Suave & Liso Shampoo")
print(result)
```

## Training Details

- **Base Model:** t5-base
- **Training Type:** Fine-tuning
- **Epochs:** 5
- **Batch Size:** 8
- **Learning Rate:** 3e-5
- **Weight Decay:** 0.01
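
For a rough sense of training length, the figures above can be turned into an optimizer-step count. This is a back-of-the-envelope sketch under assumptions the card does not state: that all 10,172 products were used for training (no held-out split), that training ran on a single device without gradient accumulation, and that the final partial batch was kept. The function `optimizer_steps` is a hypothetical helper, not part of any library.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int,
                    grad_accum_steps: int = 1) -> int:
    """Estimate the number of optimizer updates for a training run.

    Assumes a single device and that the last partial batch is kept.
    """
    batches_per_epoch = math.ceil(num_examples / batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return updates_per_epoch * epochs


# 10,172 examples, batch size 8, 5 epochs (values from this card).
print(optimizer_steps(10_172, 8, 5))  # 1272 batches per epoch * 5 epochs = 6360
```

If part of the data was held out for evaluation, or several devices were used, the real count would be lower.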

## Limitations

- The model works best with English product names.
- Performance may vary for products outside the categories seen during training.
- Inputs should be clear, specific product names; vague or ambiguous names may be misclassified.