⚡ ZeroGPU: New version rolled out! (Sept 2024)

#107
by cbensimon HF staff - opened
ZeroGPU Explorers org


Hello everybody,

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.

Major improvements:

  1. GPU cold starts are about twice as fast!
  2. RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
  3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
  4. Improved compatibility and PyTorch integration, making more Spaces ZeroGPU-compatible without requiring any modifications!

Feel free to answer in this discussion if you have any questions!

🤗 Best regards,
Charles

cbensimon pinned discussion
ZeroGPU Explorers org
edited Sep 15, 2024

Hi, Charles!

Is the quota limit reset time no longer displayed (the message telling you to get more quota or retry in 1:35:47)?

Hi Charles.

The results from ZeroGPU differ from those on my local machine / Hugging Face's L4 GPU, even with the same code and Python dependencies.
For more information, visit: https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/111

Hi.
I found some very strange behavior. It is hard to track down and would never happen locally. It may be related to the bug above.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/104#66f66a4b693f423f5b6d9b2e

This time, the problem seems to be with the behavior of Gradio's task cancellation. In the worst case, it may be an issue with the queue in general.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/113#66fbc59085944df7944ff4aa


Hi @cbensimon !

Is there an example of how to show cold-start time to users as mentioned here?:

ZeroGPU initializations (coldstarts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))

In my ZeroGPU code, I assumed this meant adding it to the spaces.GPU decorator as

@spaces.GPU(duration=40, progress=gr.Progress(track_tqdm=True))

But I'm not seeing any visual indicator! No error is thrown, but there is also no difference with that argument on or off, so I'm not quite sure what I need to change. Thank you for your help!

(Code here if it helps: https://huggingface.co/spaces/WillHeld/diva-audio-chat/blob/main/app.py#L61)

I have heard that nest-asyncio, which the spaces library has recently started using, has quite a few known problems around memory management.
I would like the library authors to find an alternative if possible.

I found that my model inference is much slower (~5x) running on ZeroGPU than on my local GPU (V100, 16 GB). May I know if there is a way to speed it up?

@cbensimon
I have a Pro plan ($9, to test my project) and can easily access models via the Inference API with token authorization. I built a Flux dev app with Next.js, but why is the API slow at generating images?

And this is my app:

https://aidreamgen.vercel.app/
