Update README.md
README.md
@@ -1,89 +1,56 @@
 ---
-
 tags:
--
-
--
-
--
-
--
-
 ---

-
-should probably proofread and complete it, then remove this comment. -->

-

-
-
-- Loss: 1.1956
-- Precision: 0.7405
-- Recall: 0.6573
-- F1: 0.6964
-- Precision Prob: 0.8163
-- Recall Prob: 0.7080
-- F1 Prob: 0.7583
-- Precision Dir: 0.6167
-- Recall Dir: 0.5692
-- F1 Dir: 0.5920
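
The F1 values in the removed metrics list agree with the standard harmonic-mean definition, F1 = 2PR / (P + R). A minimal sketch that recomputes them from the listed precision and recall follows. The helper name `harmonic_f1` and the dictionary keys are illustrative, and reading "Prob" as the challenge (problem) label and "Dir" as the direction label is an assumption, since the old card does not spell the suffixes out.

```python
def harmonic_f1(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the metrics list above.
# The key names are hypothetical labels for the three metric groups.
reported = {
    "overall": (0.7405, 0.6573, 0.6964),
    "prob": (0.8163, 0.7080, 0.7583),
    "dir": (0.6167, 0.5692, 0.5920),
}

# Recompute each F1 and round to the four decimals used in the card.
recomputed = {
    name: round(harmonic_f1(p, r), 4) for name, (p, r, _) in reported.items()
}

for name, (_, _, f1) in reported.items():
    print(f"{name}: reported={f1:.4f} recomputed={recomputed[name]:.4f}")
```

The zero guard matters for the first training epoch, where the "Dir" precision and recall are both 0.0 and the table reports an F1 of 0.0.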

-

-More information needed

-

-

-##

-More information needed

-##

-

-
-
-
-
-
-
-
-
-

-

-| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Precision Prob | Recall Prob | F1 Prob | Precision Dir | Recall Dir | F1 Dir |
-|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------------:|:-----------:|:-------:|:-------------:|:----------:|:------:|
-| 0.6464 | 1.0 | 144 | 0.5840 | 0.6389 | 0.1292 | 0.2150 | 0.6389 | 0.2035 | 0.3087 | 0.0 | 0.0 | 0.0 |
-| 0.555 | 2.0 | 288 | 0.4280 | 0.776 | 0.5449 | 0.6403 | 0.8072 | 0.5929 | 0.6837 | 0.7143 | 0.4615 | 0.5607 |
-| 0.2925 | 3.0 | 432 | 0.4080 | 0.7302 | 0.7753 | 0.7520 | 0.8142 | 0.8142 | 0.8142 | 0.6053 | 0.7077 | 0.6525 |
-| 0.2313 | 4.0 | 576 | 0.5124 | 0.7929 | 0.6236 | 0.6981 | 0.8367 | 0.7257 | 0.7773 | 0.6905 | 0.4462 | 0.5421 |
-| 0.1277 | 5.0 | 720 | 0.6727 | 0.7326 | 0.7079 | 0.7200 | 0.8119 | 0.7257 | 0.7664 | 0.6197 | 0.6769 | 0.6471 |
-| 0.0916 | 6.0 | 864 | 0.7179 | 0.75 | 0.7079 | 0.7283 | 0.7623 | 0.8230 | 0.7915 | 0.7174 | 0.5077 | 0.5946 |
-| 0.0454 | 7.0 | 1008 | 0.8098 | 0.7578 | 0.6854 | 0.7198 | 0.8526 | 0.7168 | 0.7788 | 0.6212 | 0.6308 | 0.6260 |
-| 0.0234 | 8.0 | 1152 | 0.9168 | 0.7616 | 0.6461 | 0.6991 | 0.8571 | 0.6903 | 0.7647 | 0.6167 | 0.5692 | 0.5920 |
-| 0.0085 | 9.0 | 1296 | 0.9727 | 0.7703 | 0.6404 | 0.6994 | 0.8298 | 0.6903 | 0.7536 | 0.6667 | 0.5538 | 0.6050 |
-| 0.0042 | 10.0 | 1440 | 1.0478 | 0.7484 | 0.6517 | 0.6967 | 0.8182 | 0.7168 | 0.7642 | 0.625 | 0.5385 | 0.5785 |
-| 0.0032 | 11.0 | 1584 | 1.0905 | 0.7484 | 0.6517 | 0.6967 | 0.8229 | 0.6991 | 0.7560 | 0.6271 | 0.5692 | 0.5968 |
-| 0.001 | 12.0 | 1728 | 1.1107 | 0.7312 | 0.6573 | 0.6923 | 0.7864 | 0.7168 | 0.7500 | 0.6316 | 0.5538 | 0.5902 |
-| 0.0009 | 13.0 | 1872 | 1.1301 | 0.7239 | 0.6629 | 0.6921 | 0.7885 | 0.7257 | 0.7558 | 0.6102 | 0.5538 | 0.5806 |
-| 0.0008 | 14.0 | 2016 | 1.1767 | 0.7108 | 0.6629 | 0.6860 | 0.7664 | 0.7257 | 0.7455 | 0.6102 | 0.5538 | 0.5806 |
-| 0.0007 | 15.0 | 2160 | 1.1690 | 0.7284 | 0.6629 | 0.6941 | 0.8163 | 0.7080 | 0.7583 | 0.5938 | 0.5846 | 0.5891 |
-| 0.0012 | 16.0 | 2304 | 1.1943 | 0.7202 | 0.6798 | 0.6994 | 0.7778 | 0.7434 | 0.7602 | 0.6167 | 0.5692 | 0.5920 |
-| 0.0007 | 17.0 | 2448 | 1.1806 | 0.7160 | 0.6798 | 0.6974 | 0.7706 | 0.7434 | 0.7568 | 0.6167 | 0.5692 | 0.5920 |
-| 0.0006 | 18.0 | 2592 | 1.1881 | 0.7273 | 0.6742 | 0.6997 | 0.7905 | 0.7345 | 0.7615 | 0.6167 | 0.5692 | 0.5920 |
-| 0.0055 | 19.0 | 2736 | 1.1952 | 0.7301 | 0.6685 | 0.6979 | 0.7961 | 0.7257 | 0.7593 | 0.6167 | 0.5692 | 0.5920 |
-| 0.0005 | 20.0 | 2880 | 1.1956 | 0.7405 | 0.6573 | 0.6964 | 0.8163 | 0.7080 | 0.7583 | 0.6167 | 0.5692 | 0.5920 |

-
-### Framework versions
-
-- Transformers 4.15.0
-- Pytorch 1.10.0+cu111
-- Datasets 1.17.0
-- Tokenizers 0.10.3
 ---
+language:
+- en
 tags:
+- text-classification
+widget:
+- text: "severe atypical cases of pneumonia emerged and quickly spread worldwide.."
+  example_title: "challenge"
+- text: "we speculate that studying IL-6 will be beneficial."
+  example_title: "direction"
+- text: "in future studies, both PRRs should be tested as the cause for multiple deaths."
+  example_title: "both"
+- text: "IbMADS1-transformed potatoes exhibited tuber morphogenesis in the fibrous roots."
+  example_title: "neither"
 ---

+# Scientific challenges and directions

+We present a novel resource to help scientists and medical professionals discover challenges and potential directions across scientific literature, focusing on a broad corpus pertaining to the COVID-19 pandemic and related historical research. At a high level, our labels are defined as follows:

+* **Challenge**: A sentence mentioning a problem, difficulty, flaw, limitation, failure, lack of clarity, or knowledge gap.
+* **Research direction**: A sentence mentioning suggestions or needs for further research, hypotheses, speculations, indications or hints that an issue is worthy of exploration.
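
The widget examples in the front matter ("challenge", "direction", "both", "neither") suggest that the two labels are not mutually exclusive. As a rough sketch under that assumption, the snippet below maps two per-label scores to one of those four outcomes. The label order in `LABELS`, the use of a sigmoid over raw logits, and the 0.5 threshold are all assumptions for illustration; none of them is confirmed by this README, so check the model configuration before relying on them.

```python
import math

# Assumed output order of the classifier head; hypothetical, verify against
# the model's config (id2label) before use.
LABELS = ("challenge", "direction")


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def describe(logits, threshold: float = 0.5) -> str:
    """Map two per-label logits to 'challenge', 'direction', 'both', or 'neither'."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    # Each label is decided independently (multi-label, not softmax).
    active = [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]
    if len(active) == len(LABELS):
        return "both"
    if not active:
        return "neither"
    return active[0]


print(describe([2.0, -3.0]))   # only the first label clears the threshold
print(describe([-2.0, -2.0]))  # no label clears the threshold
```

Deciding each label independently is what allows the "both" and "neither" cases; a softmax over the two labels would always pick exactly one.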

+This repository contains a version of the [PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext?text=%5BMASK%5D+is+a+tumor+suppressor+gene.) model fine-tuned on the proprietary dataset described in our paper, [A Search Engine for Discovery of Scientific Challenges and Directions](https://arxiv.org/abs/2108.13751). Also, check out [our search engine](https://challenges.apps.allenai.org/)!


+* Please cite our paper if you use our datasets or models in your project. See the [BibTeX](#citation).
+* Feel free to [email us](#contact-us).

+## Annotated datasets and model
+The train, validation, and test CSV files can be downloaded directly from our [repository](https://github.com/Dan-La/scientific-challenges-and-directions) or from Hugging Face Datasets.

+## Example notebook & Search Engine
+We include an example notebook that uses the model for inference. See `Inference_Notebook.ipynb` in our [repository](https://github.com/Dan-La/scientific-challenges-and-directions).


+## Citation

+If using our dataset and models, please cite:

+```
+@misc{lahav2021search,
+      title={A Search Engine for Discovery of Scientific Challenges and Directions},
+      author={Dan Lahav and Jon Saad Falcon and Bailey Kuehl and Sophie Johnson and Sravanthi Parasa and Noam Shomron and Duen Horng Chau and Diyi Yang and Eric Horvitz and Daniel S. Weld and Tom Hope},
+      year={2021},
+      eprint={2108.13751},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```

+## Contact us

+Please don't hesitate to reach out.

+**Email:** `[email protected]`, `[email protected]`.