Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs).
In this video, Mofi Rahman and Ali Zaidi walk through the process of deploying Gemma on GKE using the TGI serving engine.
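As an illustration of the workflow covered in the video, a minimal TGI deployment on a GKE cluster with a GPU node pool might look like the following. This is a hedged sketch, not the guide's exact manifest: the resource names, image tag, model ID, and Secret are assumptions, so follow the linked guide for the tested configuration.

```shell
# Sketch only: deploy TGI serving Gemma on GKE.
# Assumes a cluster with a GPU node pool and a Secret ("hf-secret",
# hypothetical) holding a Hugging Face token with access to the gated
# Gemma repository.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tgi-gemma                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tgi-gemma
  template:
    metadata:
      labels:
        app: tgi-gemma
    spec:
      containers:
      - name: tgi
        image: ghcr.io/huggingface/text-generation-inference:latest
        args: ["--model-id", "google/gemma-2b-it"]   # assumed model ID
        env:
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-secret      # hypothetical Secret name
              key: hf_api_token
        resources:
          limits:
            nvidia.com/gpu: "1"    # schedules onto a GPU node
        ports:
        - containerPort: 80
EOF
```

Once the pod is running, you can port-forward to it (for example, `kubectl port-forward deploy/tgi-gemma 8080:80`) and send prompts to TGI's `/generate` endpoint to verify the model responds.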
Find Gemma on Hugging Face - huggingface.co/google
Follow along with the guide: cloud.google.com/kubernetes-e...
Find other guides for serving Gemma and other AI/ML resources for GKE: g.co/cloud/gke-aiml
Find other resources for learning about Gemma: ai.google.dev/gemma