AI & ML interests

Efficient machine learning for any model and hardware: pruning, quantization, compilation, and more.

🌍 Join the Pruna AI community!

Twitter · GitHub · LinkedIn · Discord · Reddit (the open-source launch of Pruna AI is on March 20th, 2025 🙊 Munich event & Paris event 🇩🇪🇫🇷🇪🇺🌍)


💜 Simply make AI models faster, cheaper, smaller, greener!

Pruna AI makes AI models faster, cheaper, smaller, greener with the pruna package.

  • It supports various model types, including CV, NLP, audio, and graph models, for both predictive and generative AI.
  • It supports various hardware, including GPU, CPU, and Edge devices.
  • It supports various compression algorithms, including quantization, pruning, distillation, caching, recovery, and compilation, which can be combined together.
  • You can either experiment with smash/compression configurations on your own or let the smashing/compressing agent find the optimal configuration [Pro].
  • You can evaluate reliable quality and efficiency metrics of your base vs. smashed/compressed models.

You can set it up in minutes and compress your first models in a few lines of code!
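To make one of the techniques above concrete, here is a minimal, self-contained sketch of uniform 8-bit post-training quantization in plain Python. This is a generic illustration of the idea, not Pruna's actual implementation; the function names and the toy weight list are made up for the example:

```python
# Generic illustration of uniform 8-bit quantization (one of the
# compression techniques listed above). Not Pruna's implementation.

def quantize(weights, num_bits=8):
    """Map float weights onto signed integers in [-(2^(b-1)), 2^(b-1) - 1]."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid zero scale
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]

weights = [0.82, -1.50, 0.03, 0.75, -0.21]   # toy float32 weights
codes, scale = quantize(weights)
restored = dequantize(codes, scale)

# Each weight now fits in 1 byte instead of 4 (float32): a 4x size
# reduction, at the cost of at most half a quantization step of error.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The same trade-off (fewer bits per parameter vs. a bounded rounding error) is what makes quantized models smaller and often faster on hardware with low-precision arithmetic support.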

⏩ How to get started?

You can smash your own models by installing pruna with:

pip install pruna[gpu]==0.1.3 --extra-index-url https://prunaai.pythonanywhere.com/

You can start with these simple notebooks to experience the efficiency gains:

| Use Case | Free Notebooks |
|---|---|
| 3x Faster Stable Diffusion Models | Smash for free |
| Turbocharge Stable Diffusion Video Generation | Smash for free |
| Making your LLMs 4x smaller | Smash for free |
| Blazingly fast Computer Vision Models | Smash for free |
| Smash your model with a CPU only | Smash for free |
| Transcribe 2 hours of audio in less than 2 minutes with Whisper | Smash for free |
| 100% faster Whisper Transcription | Smash for free |
| Flux generation in a heartbeat, literally | Smash for free |
| Run your Flux model without an A100 | Smash for free |

For more details about installation and tutorials, you can check the Pruna AI documentation.
