|
--- |
|
license: mit |
|
language: |
|
- en |
|
base_model: |
|
- lmsys/vicuna-7b-v1.5 |
|
base_model_relation: adapter |
|
pipeline_tag: question-answering |
|
tags: |
|
- medical |
|
library_name: peft |
|
--- |
|
|
|
# Instruction Tuning Large Language Models to Understand Electronic Health Records |
|
|
|
**Authors:** Zhenbang Wu, Anant Dadu, Michael Nalls, Faraz Faghri, Jimeng Sun |
|
|
|
**Published at:** NeurIPS 2024 Datasets and Benchmarks Track (Spotlight) |
|
|
|
[[📑Paper](https://openreview.net/pdf?id=Dgy5WVgPd2)] [[🔗Code](https://github.com/zzachw/Llemr)] |
|
|
|
This repository contains the model weights for Llemr, a large language model (LLM) capable of processing and interpreting electronic health records (EHRs) with complex data structures.
|
|
|
|
|
## Model Description |
|
|
|
Llemr is trained on MIMIC-Instr, a dataset comprising 350K schema-alignment examples and 100K clinical-reasoning examples generated from the MIMIC-IV EHR database. The model excels at generating relevant, context-aware responses to patient-related queries by leveraging: |
|
|
|
- BiomedBERT (`microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract`) as the event encoder.
|
|
|
- Vicuna-7B (`lmsys/vicuna-7b-v1.5`) as the backbone language model.
|
|
|
|
|
## How to Load Weights |
|
|
|
Follow the steps below to load the pre-trained weights: |
|
|
|
1. Clone the repository: |
|
|
|
```bash |
|
git clone https://huggingface.co/zzachw/llemr-v1 |
|
cd llemr-v1 |
|
``` |
|
|
|
2. Load the weights in Python: |
|
|
|
```python |
|
from peft import PeftModel |
|
from src.model.init_llemr import init_llemr |
|
|
|
# Define paths for the base model and LoRA weights |
|
llm_pretrained_model_name_or_path = "lmsys/vicuna-7b-v1.5" |
|
lora_name_or_path = "zzachw12/llemr-v1" |
|
|
|
# Initialize the base model and tokenizer |
|
model, tokenizer = init_llemr(llm_pretrained_model_name_or_path, hidden_size=1027) |
|
|
|
# Integrate the LoRA weights into the model |
|
model = PeftModel.from_pretrained(model, lora_name_or_path) |
|
``` |
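
If you only need inference and do not plan to train the adapter further, the LoRA weights can optionally be merged into the base model with PEFT's standard utility (a convenience, not a required step):

```python
# Optionally merge the LoRA adapter into the base weights for faster inference
model = model.merge_and_unload()
```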
|
|
|
**Note:** This model requires pre-computed event embeddings generated by BiomedBERT. Refer to the [GitHub repository](https://github.com/zzachw/Llemr) for detailed instructions on data preprocessing and event embedding preparation. |
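
For orientation, below is a minimal sketch of how a single event embedding could be computed with BiomedBERT using the Transformers library. The event serialization string, pooling strategy, and any appended features (e.g., timestamps and numeric values) are defined by the preprocessing code in the GitHub repository, so treat this purely as an illustration rather than the exact pipeline:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical serialized event; the actual format is defined by the Llemr preprocessing code
event_text = "labevents: hemoglobin 9.8 g/dL"

encoder_name = "microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract"
tokenizer = AutoTokenizer.from_pretrained(encoder_name)
encoder = AutoModel.from_pretrained(encoder_name)

with torch.no_grad():
    inputs = tokenizer(event_text, return_tensors="pt", truncation=True)
    outputs = encoder(**inputs)
    # [CLS] pooling is one common choice; the repository's pooling may differ
    event_embedding = outputs.last_hidden_state[:, 0]  # shape: (1, 1024) for the large encoder
```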
|
|
|
|
|
## Notes on Model Enhancements |
|
|
|
Llemr incorporates several minor improvements over the original implementation described in the paper: |
|
|
|
1. **Enhanced Event Encoder:** |
|
- Replaced ClinicalBERT (`emilyalsentzer/Bio_ClinicalBERT`) with BiomedBERT-large (`microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract`), improving the quality of event embeddings. |
|
|
|
2. **Improved Event Embedding:** |
|
- Concatenated event timestamps and numeric values (where available) to the final event embeddings, resulting in better representation of time-sensitive and quantitative data (see the illustrative sketch at the end of this section).
|
|
|
3. **Expanded Dataset:** |
|
- Increased the size of the clinical reasoning subset to 100K examples, doubling the data from the original 50K subset for more comprehensive coverage. |
|
|
|
4. **Unified Training Approach:** |
|
- Adopted a single-stage training process that trains jointly on the schema-alignment and clinical-reasoning subsets, streamlining the training pipeline.
|
|
|
Together, these changes improve the model's ability to interpret and reason over EHR data, yielding stronger performance than the configuration described in the paper.
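
As a rough illustration of the second enhancement, the sketch below appends a timestamp and a numeric value to a 1024-dimensional event embedding. The exact features, their normalization, and the resulting dimensionality (1027 in the loading example above) are determined by the preprocessing code in the GitHub repository; the values here are placeholders.

```python
import torch

event_embedding = torch.randn(1, 1024)  # placeholder for a BiomedBERT event embedding
timestamp = torch.tensor([[12.5]])      # e.g., hours since admission (placeholder)
numeric_value = torch.tensor([[9.8]])   # e.g., a lab value; a fill value is used when absent

# Concatenate the scalar features onto the text embedding
augmented = torch.cat([event_embedding, timestamp, numeric_value], dim=-1)
print(augmented.shape)  # torch.Size([1, 1026])
```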
|
|
|
|
|
## Citation |
|
|
|
If you use this work in your research or projects, please consider citing us:
|
|
|
```bibtex
|
@inproceedings{ |
|
wu2024instruction, |
|
title={Instruction Tuning Large Language Models to Understand Electronic Health Records}, |
|
author={Zhenbang Wu and Anant Dadu and Michael Nalls and Faraz Faghri and Jimeng Sun}, |
|
booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track}, |
|
year={2024}, |
|
url={https://openreview.net/forum?id=Dgy5WVgPd2} |
|
} |
|
``` |