🚀 DistilBert Urdu NER by BlaikHole
📌 Overview
This repository provides a model fine-tuned on private Urdu NER data using the fast yet efficient DistilBert architecture. It can be used for Urdu NER with seven entity classes plus an outside ("Other") label.
🎨 Model Outputs & Labels
The model identifies the following labels:
| Label | Entity | Description |
|---|---|---|
| 🟥 LABEL_0 | Date | Dates in text, e.g. "5 Feb". |
| 🟩 LABEL_1 | Designation | A person's designation, e.g. "Doctor". |
| 🟦 LABEL_2 | Location | Locations, e.g. an office or a city name. |
| 🟨 LABEL_3 | Number | Any number. |
| 🟪 LABEL_4 | Organization | Names of companies, organizations, etc. |
| 🟧 LABEL_5 | Other | Tokens outside any entity. |
| ⬛ LABEL_6 | Person | Names of people. |
| 🟫 LABEL_7 | Time | Time-related words. |
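The LABEL_0…LABEL_7 names above are the raw ids the checkpoint emits. As a quick sanity check, you can read the label set straight from the model config instead of hard-coding it (a minimal sketch; it assumes the checkpoint ships with the default LABEL_X names shown above):

```python
from transformers import AutoConfig

# Inspect the label set that ships with the checkpoint.
config = AutoConfig.from_pretrained("blaikhole/distilbert-urdu-ner")
print(config.num_labels)  # expected: 8
print(config.id2label)    # e.g. {0: "LABEL_0", 1: "LABEL_1", ...}
```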
🚀 Quick Usage
You can easily load and use this model with `transformers`:
🔹 Named Entity Recognition (NER)
```python
import torch
from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer

# Map the model's LABEL_X indices to human-readable tags.
LABEL_MAP = {
    0: "DATE",
    1: "DESIGNATION",
    2: "LOCATION",
    3: "NUMBER",
    4: "ORGANIZATION",
    5: "OTHER",
    6: "PERSON",
    7: "TIME",
}

MODEL_NAME = "blaikhole/distilbert-urdu-ner"

# Load the model and tokenizer; use the GPU if one is available.
device = 0 if torch.cuda.is_available() else -1
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Build the NER pipeline (it moves the model to `device` for us).
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

def process_text(text):
    """Return (token, tag) pairs for every token the pipeline labels."""
    entities = ner_pipeline(text)
    return [
        (entity["word"], LABEL_MAP.get(int(entity["entity"].split("_")[-1]), "OTHER"))
        for entity in entities
    ]

# Example usage
if __name__ == "__main__":
    sample_text = "پی ٹی آئی پاکستان میں ایک تنظیم ہے۔"
    print(process_text(sample_text))
```
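The `"ner"` pipeline above returns one entry per subword token. If you prefer whole entity spans, `transformers` can merge adjacent tokens that share a label via `aggregation_strategy` (a minimal sketch; since this checkpoint uses plain LABEL_X tags rather than a BIO scheme, treat the grouping as a convenience rather than an exact segmentation):

```python
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1

# Merge consecutive subword tokens with the same label into single spans.
grouped = pipeline(
    "ner",
    model="blaikhole/distilbert-urdu-ner",
    device=device,
    aggregation_strategy="simple",
)

for span in grouped("پی ٹی آئی پاکستان میں ایک تنظیم ہے۔"):
    # Each span carries: entity_group, score, word, start, end.
    print(span["entity_group"], span["word"], round(float(span["score"]), 3))
```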
📦 Installation
To use this model, install the required dependencies:
```bash
pip install transformers torch
```
📜 License
MIT
🌳 Base Model
This model is fine-tuned from distilbert/distilbert-base-uncased.