Deploy Hugging Face models on Google Cloud: from the hub to Inference Endpoints


606 views


In this series of three videos, I walk you through the deployment of Hugging Face models on Google Cloud, in three different ways:
- Deployment from the hub model page to Inference Endpoints (this video), with the Google Gemma 7B model,
- Deployment from the hub model page to Vertex AI ( • Deploy Hugging Face mo... ), with the Microsoft Phi-2 2.7B model,
- Deployment directly from within Vertex AI ( • Deploy Hugging Face mo... ), with the TinyLlama 1.1B model.
Get started at huggingface.co :)
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at / julsimon or Substack at julsimon.substack.com. ⭐️⭐️⭐️

Comments: 4

@CMAZZONI 13 days ago

Hello, thank you so much for doing this video. The only question I have is that some models do not have a deploy option in the model card (for example, the gliner models). Is there a way to use these? Many thanks!

@juliensimonfr 13 days ago

You're welcome! You mean these, right: huggingface.co/urchade ? I'm not 100% sure, but they don't seem to be supported by the transformers library (see huggingface.co/docs/transformers/main/en/model_doc/bert), which would explain why they can't be deployed in the standard way. The alternative would be to deploy them in a PyTorch environment with the appropriate dependencies, see cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-deploy-pytorch-models-vertex-ai.

@SO-vq7qd 25 days ago

Is there a way to connect this to a custom domain? I want to create a simple web app with a chat interface.

@juliensimonfr 14 days ago

A Vertex AI endpoint is a plain HTTP API. Given the right credentials, network permissions, etc., you can invoke it from your apps just like any other web API.
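
To make the "just like any web API" point concrete, here is a minimal sketch in Python of what such a call could look like from an app backend, using only the standard library. It assumes the public Vertex AI REST `predict` method and an OAuth access token obtained separately (for example with `gcloud auth print-access-token`). The project, region, endpoint ID, and the shape of each instance are hypothetical placeholders; the payload your deployed model actually expects depends on its serving container.

```python
import json
import urllib.request


def build_predict_request(project, region, endpoint_id, access_token, instances):
    """Build (without sending) a POST request for a Vertex AI endpoint's REST predict method.

    All argument values are supplied by the caller; nothing here is specific
    to the model deployed in the video.
    """
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/endpoints/{endpoint_id}:predict"
    )
    # The REST predict method takes a JSON body with an "instances" list.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def predict(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (hypothetical values, and an assumed instance format):
# request = build_predict_request(
#     "my-project", "us-central1", "1234567890", token,
#     [{"inputs": "What is Vertex AI?"}],
# )
# print(predict(request))
```

A web app with a chat interface would typically make this call from its own backend, so the Google Cloud credentials never reach the browser; the custom domain then points at that backend rather than at the endpoint itself.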