Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ This repo contains summaries of several sets of experiments comparing a number o
 
 The runs were all performed training a smaller ViT (`vit_wee_patch16_reg1_gap_256`) for 200 epochs (10M samples seen) from scratch on the `timm` 'mini-imagenet' dataset, a 100 class subset of imagenet with same image sizes as originals.
 
-So far I have results for `adamw`, `laprop`, and `mars
+So far I have results for `adamw`, `laprop`, and `mars` (https://huggingface.co/papers/2411.10438). You can find full results in sub-folders by optimizer names. In all of these runs, the experiments with 'c' prefix in the name have caution enabled.
 
 This is what the 'caution' addition looks like in an optimizer:
 ```python