Helios9/BioMed_NER

Token Classification

Helios9 committed on Jan 28

Commit d6545f8 · verified · 1 Parent(s): 039cf6d

Update README.md

Files changed (1)

README.md +52 -0

@@ -93,6 +93,58 @@ The output will be a list of recognized entities with their entity type, score,
 ]
 ```
+In some cases the pipeline returns several consecutive entities with the same entity group. To merge them into a single entity, use the code below:
+```python
+from transformers import pipeline
+
+
+def merge_consecutive_entities(entities):
+    """Merge same-group entities whose character spans touch or overlap."""
+    # Process entities in order of their start offset.
+    entities = sorted(entities, key=lambda x: x['start'])
+    merged_entities = []
+    current_entity = None
+    for entity in entities:
+        if current_entity is None:
+            # Copy so the dicts returned by the pipeline are not modified.
+            current_entity = dict(entity)
+        elif (
+            entity['entity_group'] == current_entity['entity_group'] and
+            entity['start'] <= current_entity['end']
+        ):
+            new_word = entity['word']
+            # Skip the append when the merged text already ends with this word.
+            if not current_entity['word'].endswith(new_word):
+                current_entity['word'] += " " + new_word
+            current_entity['end'] = max(current_entity['end'], entity['end'])
+            # Pairwise running average of the confidence scores.
+            current_entity['score'] = (current_entity['score'] + entity['score']) / 2
+        else:
+            merged_entities.append(current_entity)
+            current_entity = dict(entity)
+    if current_entity is not None:
+        merged_entities.append(current_entity)
+    return merged_entities
+
+
+# Load the model
+model_path = "Helios9/BioMed_NER"
+pipe = pipeline(
+    task="token-classification",
+    model=model_path,
+    tokenizer=model_path,
+    aggregation_strategy="simple"
+)
+# Test the pipeline
+text = ("A 48-year-old female presented with vaginal bleeding and abnormal Pap smears. "
+        "Upon diagnosis of invasive non-keratinizing SCC of the cervix, she underwent a radical "
+        "hysterectomy with salpingo-oophorectomy which demonstrated positive spread to the pelvic "
+        "lymph nodes and the parametrium.")
+result = pipe(text)
+final_result = merge_consecutive_entities(result)
+print(final_result)
+```
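The merge rule in the snippet above can be checked offline, without downloading the model. The following is a minimal sketch, not part of the commit: `merge_overlapping` is a hypothetical compact restatement of the same rule, and the labels, scores, and character offsets in `sample` are invented for illustration. It suggests that only entities whose spans touch or overlap (`start <= end` of the previous one) are joined, so two same-group entities separated by a space stay separate.

```python
def merge_overlapping(entities):
    """Compact restatement of the rule: same group and touching/overlapping spans."""
    merged = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        if prev and ent["entity_group"] == prev["entity_group"] and ent["start"] <= prev["end"]:
            # Join the text unless the merged word already ends with the new piece.
            if not prev["word"].endswith(ent["word"]):
                prev["word"] += " " + ent["word"]
            prev["end"] = max(prev["end"], ent["end"])
            prev["score"] = (prev["score"] + ent["score"]) / 2
        else:
            # Copy so the input dicts are left untouched.
            merged.append(dict(ent))
    return merged


# Invented entities in the shape returned with aggregation_strategy="simple".
sample = [
    {"entity_group": "LABEL_A", "score": 0.9, "word": "hyster", "start": 0, "end": 6},
    {"entity_group": "LABEL_A", "score": 0.7, "word": "ectomy", "start": 6, "end": 12},
    {"entity_group": "LABEL_A", "score": 0.8, "word": "lymph", "start": 20, "end": 25},
    {"entity_group": "LABEL_A", "score": 0.8, "word": "nodes", "start": 26, "end": 31},
]

merged_sample = merge_overlapping(sample)
for ent in merged_sample:
    print(ent["entity_group"], repr(ent["word"]), ent["start"], ent["end"])
```

In this sketch the first two pieces touch at offset 6 and are merged (joined with a space), while "lymph" and "nodes" are separated by one character and remain two entities. If entities separated by whitespace should also be merged, the comparison would need a tolerance, which is a design choice the commit does not make.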
 **Use Cases:**
 - Extracting clinical information from unstructured text in medical records.
 - Structuring data for downstream biomedical research or applications.