ModernBERT Medical Safety Classifier
The ModernBERT Medical Safety Classifier is a transformer-based language model fine-tuned to assess the safety and ethical standards of medical texts across diverse medical domains. Built on the ModernBERT architecture, it distills the safety and ethical judgments of Llama 3.1 (70B) into a significantly smaller and faster classifier. Specifically, it was trained on a newly curated, balanced subset of The Blue Scrubs dataset (83,636 documents in total), each annotated by Llama 3.1 (70B) for safety and ethical adherence. By transferring these large-model evaluations into ModernBERT, the resulting classifier retains robust predictive accuracy while remaining lightweight enough for real-time or resource-constrained inference.
Model Details
- Developed by: TheBlueScrubs
- Model Type: Transformer-based language model
- Language: English
- License: Apache-2.0
- Base Model: answerdotai/ModernBERT-base
ModernBERT is an advanced encoder-only model that incorporates recent innovations such as Rotary Positional Embeddings, local–global alternating attention, and Flash Attention, enabling efficient inference and an extended context window of up to 8,192 tokens.
Intended Uses & Limitations
Intended Uses
This model is designed to classify medical texts based on safety and ethical standards, particularly focusing on cancer-related content. It can be utilized to assess the safety of medical documents, ensuring compliance with established ethical guidelines.
Limitations
While the model has been trained on a substantial corpus of cancer-specific texts, its performance on medical domains outside of oncology has not been evaluated. Users should exercise caution when applying the model to non-cancer-related medical content.
How to Use
To use this model for safety classification, load it with the Hugging Face Transformers library as follows:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("TheBlueScrubs/ModernBERT-base-TBS")
model = AutoModelForSequenceClassification.from_pretrained("TheBlueScrubs/ModernBERT-base-TBS")
model.eval()  # disable dropout for inference
# Example text
text = "Your medical text here."
# Tokenize input (truncated to the 4,096-token length used in training)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
# Get model predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits
# Interpret predictions: one regression output on the 1 (least safe) to 5 (most safe) scale
safety_score = predictions.squeeze().item()
print(f"Safety Score: {safety_score:.2f}")
Training Data
The model was re-trained on a new, balanced subset drawn from The Blue Scrubs dataset to address the overrepresentation of high-safety texts. Specifically:
- We scanned a total of 11,500,608 rows across all files and removed 112,330 rows for parse/NaN/0/out-of-range issues, leaving 11,388,278 valid rows.
- Of these valid rows, 41,818 had a safety score ≤ 2, while 11,346,460 had a safety score > 2.
- To balance the dataset, we randomly sampled documents so that unsafe (≤ 2) and safer (> 2) texts were equally represented. This yielded a final balanced set of 83,636 total rows.
Each row retained its original continuous safety score from Llama 3.1 (70B), ranging from 1 (least safe) to 5 (most safe). These scores again served as regression targets during training.
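The balancing step described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the function name `balance_by_threshold`, the `(text, safety_score)` row layout, and the fixed seed are all assumptions made for the example. The final lines only re-check the row counts reported above.

```python
import random

def balance_by_threshold(rows, threshold=2.0, seed=0):
    """Downsample the larger group so both sides of the threshold are equal in size.

    `rows` is a list of (text, safety_score) pairs; this layout is hypothetical.
    """
    unsafe = [r for r in rows if r[1] <= threshold]
    safer = [r for r in rows if r[1] > threshold]
    n = min(len(unsafe), len(safer))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = rng.sample(unsafe, n) + rng.sample(safer, n)
    rng.shuffle(balanced)
    return balanced

# Consistency check on the counts reported in this section
valid_rows = 11_500_608 - 112_330
unsafe_rows, safer_rows = 41_818, 11_346_460
assert valid_rows == unsafe_rows + safer_rows == 11_388_278
balanced_total = 2 * min(unsafe_rows, safer_rows)
print(balanced_total)  # 83636
```

Because the unsafe group is the smaller one, every unsafe row is kept and only the safer group is downsampled, which is why the balanced total is exactly twice 41,818.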
Training Procedure
Preprocessing
Texts were tokenized using the ModernBERT tokenizer with a maximum sequence length of 4,096 tokens. No additional filtering was applied, as the data was considered trustworthy.
Training Hyperparameters
Learning Rate: 2e-5
Number of Epochs: 5
Batch Size: 20 (per device)
Gradient Accumulation Steps: 8
Optimizer: AdamW
Weight Decay: 0.01
FP16 Training: Enabled
Total Training Steps: ~5 epochs over the final balanced set
All other hyperparameter settings (e.g., batch size, optimizer choice) remained the same as in the previous training. Only the learning rate, the number of epochs, and the balanced dataset were changed.
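As a rough guide to what these settings imply, the effective batch size and an approximate optimizer step count can be worked out as below. This is a back-of-the-envelope sketch: the device count and the assumption that all 83,636 balanced rows are used for training are not stated in this card, so the step counts are estimates only.

```python
import math

per_device_batch_size = 20
gradient_accumulation_steps = 8
num_epochs = 5
num_devices = 1               # assumption: the device count is not stated in the card
num_train_examples = 83_636   # assumption: the whole balanced set is used for training

# Examples consumed per optimizer update
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
# Approximate optimizer updates per epoch and in total
steps_per_epoch = math.ceil(num_train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs
print(effective_batch_size, steps_per_epoch, total_steps)  # 160 523 2615
```

With more devices or a held-out split, the effective batch size grows or the example count shrinks, and the step count changes accordingly.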
Evaluation
Testing Data
The model's performance was evaluated on an out-of-sample test set comprising cancer-related documents from The Blue Scrubs dataset that were not included in the training set.
Metrics
- Mean Squared Error (MSE): Measures the average squared difference between the predicted and actual safety scores.
- Accuracy: Determined by binarizing predictions (unsafe ≤ 2 vs. safe > 2).
- ROC Analysis: Assesses the model's ability to distinguish between safe and unsafe content.
Results
- MSE: 0.489
- RMSE: 0.699
- Accuracy: 0.9642
- ROC Analysis: Demonstrated robust classification capability with high True Positive Rates and low False Positive Rates.
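For readers who want to reproduce these metrics on their own data, the regression and binarized metrics can be computed along the following lines. This is a minimal sketch with made-up scores; the helper name `regression_and_binary_metrics` is hypothetical, and the ROC analysis is not covered here. The last line checks that the reported RMSE is the square root of the reported MSE.

```python
import math

def regression_and_binary_metrics(y_true, y_pred, threshold=2.0):
    """Return MSE, RMSE, and accuracy after binarizing at `threshold` (unsafe <= threshold)."""
    n = len(y_true)
    mse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)
    # A prediction is correct when it falls on the same side of the threshold as the target
    correct = sum((t <= threshold) == (p <= threshold) for t, p in zip(y_true, y_pred))
    return {"mse": mse, "rmse": rmse, "accuracy": correct / n}

# Illustrative scores only, not taken from the test set
y_true = [1.0, 2.0, 3.5, 5.0]
y_pred = [1.5, 2.5, 3.0, 4.5]
metrics = regression_and_binary_metrics(y_true, y_pred)
print(metrics)  # {'mse': 0.25, 'rmse': 0.5, 'accuracy': 0.75}

# The reported RMSE follows from the reported MSE
print(round(math.sqrt(0.489), 3))  # 0.699
```

In the toy data, the second example has a target of 2.0 (unsafe) but a prediction of 2.5 (safe), so it counts as a misclassification even though the squared error is small.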
Bias, Risks, and Limitations
This model was trained on a curated subset of The Blue Scrubs dataset encompassing various medical domains, yet some areas may remain underrepresented. As with any model, there is a risk of bias stemming from data composition, and users should exercise caution when applying the classifier, especially in highly specialized contexts. Outputs should always be corroborated with expert opinion and current clinical guidelines to ensure safe, accurate medical usage.
Recommendations
Users should validate the model's performance on their specific datasets and consider fine-tuning the model on domain-specific data if necessary. Continuous monitoring and evaluation are recommended to ensure the model's predictions align with current medical standards and ethical guidelines.
Citation
If you utilize this model in your research or applications, please cite it as follows:
@misc{thebluescrubs2025modernbert,
author = {TheBlueScrubs},
title = {ModernBERT Medical Safety Classifier},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/TheBlueScrubs/ModernBERT-base-TBS}
}
Model Card Authors
- TheBlueScrubs Team