|
--- |
|
license: mit |
|
language: |
|
- en |
|
base_model: |
|
- lmsys/vicuna-7b-v1.5 |
|
base_model_relation: adapter |
|
pipeline_tag: question-answering |
|
tags: |
|
- medical |
|
library_name: peft |
|
--- |
|
|
|
# Instruction Tuning Large Language Models to Understand Electronic Health Records |
|
|
|
**Authors:** Zhenbang Wu, Anant Dadu, Michael Nalls, Faraz Faghri, Jimeng Sun |
|
|
|
**Published at:** NeurIPS 2024 Datasets and Benchmarks Track (Spotlight) |
|
|
|
[[📑Paper](https://openreview.net/pdf?id=Dgy5WVgPd2)] [[🔗Code](https://github.com/zzachw/Llemr)] |
|
|
|
This repository contains the model weights for Llemr, a large language model (LLM) capable of processing and interpreting electronic health records (EHRs) with complex data structures.
|
|
|
|
|
## Model Description |
|
|
|
Llemr is trained on MIMIC-Instr, a dataset comprising 350K schema-alignment examples and 100K clinical-reasoning examples generated from the MIMIC-IV EHR database. The model excels at generating relevant, context-aware responses to patient-related queries by leveraging: |
|
|
|
- BiomedBERT (`microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract`) as the event encoder.
|
|
|
- Vicuna-7B (`lmsys/vicuna-7b-v1.5`) as the backbone language model.
|
|
|
|
|
## How to Load Weights |
|
|
|
Follow the steps below to load the pre-trained weights: |
|
|
|
1. Clone the repository: |
|
|
|
```bash |
|
git clone https://huggingface.co/zzachw/llemr-v1 |
|
cd llemr-v1 |
|
``` |
|
|
|
2. Load the weights in Python: |
|
|
|
```python |
|
from peft import PeftModel |
|
from src.model.init_llemr import init_llemr |
|
|
|
# Define paths for the base model and LoRA weights |
|
llm_pretrained_model_name_or_path = "lmsys/vicuna-7b-v1.5" |
|
lora_name_or_path = "zzachw12/llemr-v1" |
|
|
|
# Initialize the base model and tokenizer |
|
model, tokenizer = init_llemr(llm_pretrained_model_name_or_path, hidden_size=1027) |
|
|
|
# Integrate the LoRA weights into the model |
|
model = PeftModel.from_pretrained(model, lora_name_or_path) |
|
``` |
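
If you only need inference and do not plan to train the adapter further, the LoRA weights can optionally be merged into the base model with PEFT's standard utility (a convenience, not a required step):

```python
# Optionally merge the LoRA adapter into the base weights for faster inference
model = model.merge_and_unload()
```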
|
|
|
**Note:** This model requires pre-computed event embeddings generated by BiomedBERT. Refer to the [GitHub repository](https://github.com/zzachw/Llemr) for detailed instructions on data preprocessing and event embedding preparation. |
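
For orientation, below is a minimal sketch of how a single event embedding could be computed with BiomedBERT using the Transformers library. The event serialization string, pooling strategy, and any appended features (e.g., timestamps and numeric values) are defined by the preprocessing code in the GitHub repository, so treat this purely as an illustration rather than the exact pipeline:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical serialized event; the actual format is defined by the Llemr preprocessing code
event_text = "labevents: hemoglobin 9.8 g/dL"

encoder_name = "microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract"
tokenizer = AutoTokenizer.from_pretrained(encoder_name)
encoder = AutoModel.from_pretrained(encoder_name)

with torch.no_grad():
    inputs = tokenizer(event_text, return_tensors="pt", truncation=True)
    outputs = encoder(**inputs)
    # [CLS] pooling is one common choice; the repository's pooling may differ
    event_embedding = outputs.last_hidden_state[:, 0]  # shape: (1, 1024) for the large encoder
```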
|
|
|
|
|
## Notes on Model Enhancements |
|
|
|
Llemr incorporates several minor improvements over the original implementation described in the paper: |
|
|
|
1. **Enhanced Event Encoder:** |
|
- Replaced ClinicalBERT (`emilyalsentzer/Bio_ClinicalBERT`) with BiomedBERT-large (`microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract`), improving the quality of event embeddings. |
|
|
|
2. **Improved Event Embedding:** |
|
- Concatenated event timestamps and numeric values (where available) to the final event embeddings, resulting in better representation of time-sensitive and quantitative data (see the illustrative sketch at the end of this section).
|
|
|
3. **Expanded Dataset:** |
|
- Increased the size of the clinical reasoning subset to 100K examples, doubling the data from the original 50K subset for more comprehensive coverage. |
|
|
|
4. **Unified Training Approach:** |
|
- Adopted a single-stage training process that trains jointly on the schema-alignment and clinical-reasoning subsets, streamlining the training pipeline.
|
|
|
Together, these changes improve the model's ability to interpret and reason over EHR data, yielding stronger performance than the configuration described in the paper.
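
As a rough illustration of the second enhancement, the sketch below appends a timestamp and a numeric value to a 1024-dimensional event embedding. The exact features, their normalization, and the resulting dimensionality (1027 in the loading example above) are determined by the preprocessing code in the GitHub repository; the values here are placeholders.

```python
import torch

event_embedding = torch.randn(1, 1024)  # placeholder for a BiomedBERT event embedding
timestamp = torch.tensor([[12.5]])      # e.g., hours since admission (placeholder)
numeric_value = torch.tensor([[9.8]])   # e.g., a lab value; a fill value is used when absent

# Concatenate the scalar features onto the text embedding
augmented = torch.cat([event_embedding, timestamp, numeric_value], dim=-1)
print(augmented.shape)  # torch.Size([1, 1026])
```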
|
|
|
|
|
## Citation |
|
|
|
If you use this work in your research or projects, please consider citing us:
|
|
|
```bibtex
|
@inproceedings{ |
|
wu2024instruction, |
|
title={Instruction Tuning Large Language Models to Understand Electronic Health Records}, |
|
author={Zhenbang Wu and Anant Dadu and Michael Nalls and Faraz Faghri and Jimeng Sun}, |
|
booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track}, |
|
year={2024}, |
|
url={https://openreview.net/forum?id=Dgy5WVgPd2} |
|
} |
|
``` |