Commit d6545f8 · committed by Helios9 · verified · 1 Parent(s): 039cf6d

Update README.md

  1. README.md +52 -0
README.md CHANGED
@@ -93,6 +93,58 @@ The output will be a list of recognized entities with their entity type, score,
  ]
  ```

+ In some cases the pipeline returns one mention as several consecutive entities with the same entity group. To merge them into a single entity, use the code below:
+
+ ```python
+ def merge_consecutive_entities(entities):
+     """Merge overlapping or touching entities that share an entity group."""
+     # Copy each dict so the pipeline output is not modified in place.
+     entities = sorted((dict(e) for e in entities), key=lambda x: x['start'])
+     merged_entities = []
+     current_entity = None
+
+     for entity in entities:
+         if current_entity is None:
+             current_entity = entity
+         elif (
+             entity['entity_group'] == current_entity['entity_group'] and
+             entity['start'] <= current_entity['end']
+         ):
+             # Same group and the spans touch or overlap: extend the current entity.
+             new_word = entity['word']
+             if not current_entity['word'].endswith(new_word):
+                 current_entity['word'] += " " + new_word
+             current_entity['end'] = max(current_entity['end'], entity['end'])
+             # Running pairwise average: later fragments weigh more than earlier ones.
+             current_entity['score'] = (current_entity['score'] + entity['score']) / 2
+         else:
+             merged_entities.append(current_entity)
+             current_entity = entity
+     if current_entity is not None:
+         merged_entities.append(current_entity)
+
+     return merged_entities
+
+
+ from transformers import pipeline
+
+ # Load the model
+ model_path = "Helios9/BIOMed_NER"
+ pipe = pipeline(
+     task="token-classification",
+     model=model_path,
+     tokenizer=model_path,
+     aggregation_strategy="simple"
+ )
+
+ # Test the pipeline
+ text = ("A 48-year-old female presented with vaginal bleeding and abnormal Pap smears. "
+         "Upon diagnosis of invasive non-keratinizing SCC of the cervix, she underwent a radical "
+         "hysterectomy with salpingo-oophorectomy which demonstrated positive spread to the pelvic "
+         "lymph nodes and the parametrium.")
+ result = pipe(text)
+ final_result = merge_consecutive_entities(result)
+ print(final_result)
+ ```
+
  **Use Cases:**
  - Extracting clinical information from unstructured text in medical records.
  - Structuring data for downstream biomedical research or applications.
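
The merging step added in this commit can be tried without downloading the model. The block below is a minimal sketch of a variant, not the README's function: `merge_adjacent` and its `max_gap` parameter are hypothetical names, the sample entities (labels, offsets, scores) are hand-written stand-ins for pipeline output, and the score is a plain mean of all merged fragments rather than the pairwise average used in the README. With `max_gap=0` it applies the same `start <= end` test as the README; `max_gap=1` also joins spans separated by a single character such as a space.

```python
def merge_adjacent(entities, max_gap=0):
    """Merge same-group spans whose character ranges touch, overlap,
    or sit at most `max_gap` characters apart (hypothetical knob)."""
    merged = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and ent["entity_group"] == prev["entity_group"]
            and ent["start"] <= prev["end"] + max_gap
        ):
            if not prev["word"].endswith(ent["word"]):
                prev["word"] += " " + ent["word"]
            prev["end"] = max(prev["end"], ent["end"])
            prev["_scores"].append(ent["score"])
        else:
            new = dict(ent)  # copy so the caller's dicts stay untouched
            new["_scores"] = [ent["score"]]
            merged.append(new)
    # Replace the temporary score list with the mean over all fragments.
    for ent in merged:
        scores = ent.pop("_scores")
        ent["score"] = sum(scores) / len(scores)
    return merged


# Invented sample data; real labels and scores depend on the model.
sample = [
    {"entity_group": "Sign_symptom", "word": "vaginal", "start": 36, "end": 43, "score": 0.9},
    {"entity_group": "Sign_symptom", "word": "bleeding", "start": 44, "end": 52, "score": 0.7},
    {"entity_group": "Diagnostic_procedure", "word": "pap smears", "start": 66, "end": 76, "score": 0.8},
]

for gap in (0, 1):
    words = [ent["word"] for ent in merge_adjacent(sample, max_gap=gap)]
    print(f"max_gap={gap}: {words}")
```

Because "vaginal" ends at offset 43 and "bleeding" starts at 44, the strict test leaves them as separate entities, while allowing a one-character gap joins them. Whether a gap tolerance is appropriate depends on how the tokenizer reports offsets, so treat it as an option to evaluate rather than a default.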