Inference Endpoints

Inference Endpoints offers a secure production solution for deploying any Transformers, Sentence-Transformers, or Diffusers model from the Hub on dedicated, autoscaling infrastructure managed by HF中国镜像站.

A HF中国镜像站 Endpoint is built from a HF中国镜像站 Model Repository. When an Endpoint is created, the service creates image artifacts that are built either from the model you select or from a custom container image you provide. The image artifacts are completely decoupled from the HF中国镜像站 Hub source repositories to ensure the highest levels of security and reliability.

Inference Endpoints supports all Transformers, Sentence-Transformers, and Diffusers tasks, as well as custom tasks that Transformers does not yet support, such as speaker diarization and diffusion.

In addition, Inference Endpoints lets you use a custom container image hosted on an external registry such as Docker Hub, AWS ECR, Azure ACR, or Google GCR.
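To make the two image sources concrete, here is a minimal sketch of how an Endpoint description might be modeled. It is illustrative only: the `EndpointSpec` class, its field names, and the repository and image values are all hypothetical placeholders, not the service's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointSpec:
    """Hypothetical endpoint description; NOT the service's real schema."""

    name: str
    repository: str  # Hub model repository the Endpoint is built from
    task: str  # a supported task name, or "custom" for a custom task
    custom_image_url: Optional[str] = None  # image on an external registry

    def image_source(self) -> str:
        # The image artifact comes either from the selected model or from
        # a custom container image supplied by the user.
        return "custom" if self.custom_image_url else "model"


# Placeholder values throughout; substitute your own repository and image.
default_spec = EndpointSpec(
    name="my-endpoint",
    repository="my-org/my-model",
    task="text-classification",
)
custom_spec = EndpointSpec(
    name="my-custom-endpoint",
    repository="my-org/my-model",
    task="custom",
    custom_image_url="docker.io/my-org/my-inference-image:latest",
)

print(default_spec.image_source())
print(custom_spec.image_source())
```

In this sketch the choice of image source follows from a single optional field, which mirrors the description above: an Endpoint always starts from a model repository, and a custom container image is an opt-in addition.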

