Serving Gemma on GKE using Text Generation Inference (TGI)

432 views

Container Bytes

5 months ago

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs).
In this video, Mofi Rahman and Ali Zaidi walk through the process of deploying Gemma on GKE using the TGI serving engine.
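The core of a deployment like the one shown in the video is a Kubernetes Deployment that runs the TGI container with a Gemma model ID. The manifest below is a minimal sketch, not the exact one from the guide: the Deployment name, Secret name, model variant, and accelerator selector are illustrative assumptions, so follow the linked guide for the authoritative version.

```yaml
# Sketch of a Deployment serving Gemma via TGI on GKE.
# Names, model ID, and GPU type are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tgi-gemma-deployment   # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gemma-server
  template:
    metadata:
      labels:
        app: gemma-server
    spec:
      containers:
      - name: inference-server
        image: ghcr.io/huggingface/text-generation-inference:latest
        args: ["--model-id", "google/gemma-2b-it"]  # any Gemma variant you have access to
        env:
        - name: HUGGING_FACE_HUB_TOKEN   # Gemma is gated on Hugging Face; token kept in a Secret
          valueFrom:
            secretKeyRef:
              name: hf-secret            # assumed Secret name
              key: hf_api_token
        resources:
          limits:
            nvidia.com/gpu: 1
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4   # example accelerator
```

The Hugging Face token is required because Gemma weights are gated; TGI downloads the model at startup using that credential.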
Find Gemma on Hugging Face - huggingface.co/google
Follow along the guide: cloud.google.com/kubernetes-e...
Find other guides for serving Gemma and other AI/ML resources for GKE: g.co/cloud/gke-aiml
Find other resources for learning about Gemma: ai.google.dev/gemma
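Once TGI is running, clients send prompts to its `/generate` endpoint as JSON with an `inputs` string and a `parameters` object. The snippet below is a minimal sketch of building such a request; the service hostname and port are assumptions standing in for whatever Kubernetes Service the guide creates, and the actual HTTP call is left commented out since it needs a live cluster.

```python
import json

# Hypothetical in-cluster address; the real host/port depend on the
# Kubernetes Service fronting the TGI pods.
TGI_ENDPOINT = "http://tgi-gemma-service:8000/generate"

def build_generate_request(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON body for TGI's /generate endpoint."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": 0.7,
        },
    }

body = build_generate_request("What is Kubernetes?")
print(json.dumps(body))
# To send it against a running server, you could use e.g.:
#   requests.post(TGI_ENDPOINT, json=body).json()["generated_text"]
```

TGI also exposes a streaming variant (`/generate_stream`) that returns tokens as server-sent events, which is the usual choice for interactive chat UIs.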
