Why you should build an LLM benchmark [English]

1,636 views

Big Data Demystified

6 months ago

📊 Dive Deep into the World of LLM Benchmarks! 📊
Objective: By the end of this session, you should have a good understanding of how to select and maintain your own LLM benchmark.
Agenda:
🔬 Demo!
🔍 Discover exactly what ARC, HellaSwag, and MMLU are
🧫 Learn how to select the right benchmark
🧪 Methods to test LLMs tailored to your unique use case
🧱 Q&A
Speaker: J. Yarkoni, ex-Google AI/ML Specialist (Shujin.ai)
Jonathan has a background in leading R&D teams. He previously co-founded NAM, an advertising startup, and the AA-TLV meetup, which had 3,500 members at its peak. Over the last six years, he spearheaded AI/ML initiatives at Google Cloud Israel. More recently, he established Shujin.AI, a consultancy specializing in ML projects with an emphasis on generative AI.
big-data-demystified.ninja/20...

Comments: 1

@jazzvids 1 month ago

Thank you for this valuable talk! I am currently writing my master's thesis in NLP and this is very helpful.
