metadata
library_name: transformers
tags:
  - ner
  - urdu
  - urdu ner
  - token classification
  - urdu nlp
license: mit
language:
  - ur
metrics:
  - accuracy
base_model:
  - distilbert/distilbert-base-uncased
pipeline_tag: token-classification

🚀 DistilBert Urdu NER by BlaikHole

📌 Overview

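This repository provides a DistilBERT model fine-tuned on private Urdu NER data. DistilBERT is a compact architecture that is quick at inference yet still efficient. The model performs Urdu NER over 8 labels: 7 entity types plus an Other label for tokens outside any entity.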


🎨 Model Outputs & Labels

The model identifies the following labels:

| Label | Name | Description |
|---|---|---|
| 🟥 LABEL_0 | Date | A date in the text, e.g. 5 Feb. |
| 🟩 LABEL_1 | Designation | A person's designation, e.g. Doctor. |
| 🟦 LABEL_2 | Location | A location, e.g. an office or a city name. |
| 🟨 LABEL_3 | Number | Any number. |
| 🟪 LABEL_4 | Organization | The name of a company or other organization. |
| 🟧 LABEL_5 | Other | A token outside any entity. |
| LABEL_6 | Person | The name of a person. |
| 🟫 LABEL_7 | Time | Time-related words. |

🚀 Quick Usage

You can easily load and use this model with transformers:

🔹 Named Entity Recognition (NER)

import torch
from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer

# Label Mapping
LABEL_MAP = {
    0: "DATE",
    1: "DESIGNATION",
    2: "LOCATION",
    3: "NUMBER",
    4: "ORGANIZATION",
    5: "OTHER",
    6: "PERSON",
    7: "TIME",
}

# Model Name
MODEL_NAME = "blaikhole/distilbert-urdu-ner"

# Load Model & Tokenizer
device = 0 if torch.cuda.is_available() else -1
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME).to("cuda" if device == 0 else "cpu")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Load NER Pipeline
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

def process_text(text):
    entities = ner_pipeline(text)
    results = [(entity["word"], LABEL_MAP.get(int(entity["entity"].split("_")[-1]), "OTHER")) for entity in entities]
    return results

# Example Usage
if __name__ == "__main__":
    sample_text = "پی ٹی آئی پاکستان میں ایک تنظیم ہے۔"
    print(process_text(sample_text))
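
🔹 Grouping Tokens into Entity Spans (optional sketch)

`process_text` above returns one `(word, label)` pair per token, so a multi-token entity comes back as several separate pairs. The helper below is a minimal sketch of one way to merge them, not part of the model or of transformers. The name `group_tokens` is hypothetical, and the code assumes that word-piece continuations are marked with a leading `##`, as BERT-style tokenizers usually do. Because the labels carry no B-/I- prefixes, two adjacent entities of the same type cannot be told apart and end up in one span.

```python
from typing import List, Tuple


def group_tokens(
    pairs: List[Tuple[str, str]], skip_label: str = "OTHER"
) -> List[Tuple[str, str]]:
    """Merge consecutive tokens that share a label into one (text, label) span.

    `pairs` has the shape returned by process_text(): [(word, label), ...].
    Tokens labelled `skip_label` are dropped and also break a running span.
    """
    spans: List[List[str]] = []  # each item is [text, label]
    prev_label = None
    for word, label in pairs:
        if label == skip_label:
            prev_label = None
            continue
        is_piece = word.startswith("##")  # assumed word-piece continuation marker
        piece = word[2:] if is_piece else word
        if label == prev_label:
            # Same label as the previous kept token: extend the current span.
            spans[-1][0] += ("" if is_piece else " ") + piece
        else:
            spans.append([piece, label])
        prev_label = label
    return [(text, label) for text, label in spans]


# Hand-written input for illustration only; this is not real model output.
example = [
    ("پی", "ORGANIZATION"),
    ("ٹی", "ORGANIZATION"),
    ("آئی", "ORGANIZATION"),
    ("پاکستان", "LOCATION"),
    ("میں", "OTHER"),
]
grouped = group_tokens(example)
```

With real predictions, the call would be `group_tokens(process_text(sample_text))`.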

📦 Installation

To use this model, install the required dependencies:

pip install transformers torch
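
To confirm that the install succeeded before loading the model, a small check like the one below can help. It is only a sketch: the helper name `missing_packages` is hypothetical, and it checks only that the packages can be found by Python, not that their versions are compatible.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str] = ("transformers", "torch")) -> List[str]:
    """Return the top-level package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```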

📜 License

MIT