Gorilla LLMs Explained!

3,527 views

Weaviate • Vector Database

1 day ago

Comments: 16
@stephenkiilu8579 5 months ago
Excellent! I found the video very useful
@ecardenas300 1 year ago
Excellent overview of this paper! 🦍
@Skinishh 1 year ago
I didn't quite understand the Retrieval-Aware Training part. In the paper they say: "For training with retriever, the instruction-tuned dataset, also has an additional 'Use this API documentation for reference: ' appended to the user prompt." How does this make the retriever differentiable? In other words, how do we train our retriever?
@huveja9799 1 year ago
The training (fine-tuning) is the standard one: you give a prompt (the user's request), the LLM generates an output, that output is compared with the ground truth (the label), and an error is computed from the comparison. The error is then differentiated with respect to the LLM's weights (telling you how much each weight contributes to the error), and the weights are updated with those gradients. The difference in Retrieval-Aware Training is that the user's prompt is enriched with documentation returned by a query (the retrieval) that matches the prompt against a database holding the documentation for the different APIs. When using the Gorilla model at inference time you have two paths: (i) using only the plain prompt (zero-shot), or (ii) using the prompt enriched with the retrieval from the API documentation database. If you use path (ii), you need a good retriever (a search engine that finds a good match between the prompt and the documents in the API documentation database); otherwise you hurt the LLM's performance.
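To make the Retrieval-Aware Training step described above concrete, here is a minimal sketch of one fine-tuning step, assuming a Hugging Face causal LM. The model name, the exact prompt format, and the retrieve_api_doc helper are placeholders for illustration, not the exact setup from the Gorilla paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base model, not Gorilla's actual base
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def retrieve_api_doc(user_request: str) -> str:
    """Placeholder retriever: return the API documentation that best matches
    the request (e.g. from a BM25 or embedding index over the API docs)."""
    return "torch.hub.load(repo, model, pretrained=True): loads a pretrained model"

def retrieval_aware_step(user_request: str, gold_api_call: str) -> float:
    # Enrich the user prompt with the retrieved documentation, mirroring the
    # "Use this API documentation for reference: ..." suffix from the paper.
    prompt = (user_request
              + "\nUse this API documentation for reference: "
              + retrieve_api_doc(user_request) + "\n")
    full_text = prompt + gold_api_call

    # Standard causal-LM fine-tuning: the loss compares the model's output with
    # the ground-truth API call; gradients flow only into the LLM's weights,
    # the retriever itself is not updated here.
    inputs = tokenizer(full_text, return_tensors="pt")
    labels = inputs["input_ids"].clone()  # in practice, prompt tokens are often masked with -100
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```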
@Skinishh 1 year ago
@@huveja9799 Thanks, but my question is: how is the retrieval part differentiable? Since we pass the path in the prompt, does the model learn to generate that path and therefore act as the search engine too? Is that it? That doesn't seem to scale at all to new paths.
@huveja9799 1 year ago
@@Skinishh Nope. The enriched prompt = the original user prompt + the document returned by the retrieval engine (in this case the API's documentation). Saying that the retrieval part is differentiable is not literal. What it means is that, by using the gradients of the error computed on the enriched prompt, you are training the LLM to use/interpret the second part of the enriched prompt (i.e. the API documentation) to generate the result, i.e. to use that API to fulfill the request in the first part of the enriched prompt (the original request).
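Following this explanation, the two inference paths could look roughly like the sketch below, reusing the placeholder model, tokenizer, and retrieve_api_doc from the training sketch above; the prompt format is the same assumption as before.

```python
def generate_api_call(user_request: str, use_retriever: bool = True) -> str:
    prompt = user_request
    if use_retriever:
        # Path (ii): output quality now also depends on the retriever actually
        # surfacing the right documentation for this request.
        prompt += ("\nUse this API documentation for reference: "
                   + retrieve_api_doc(user_request))
    # Path (i) is the zero-shot case: only the plain user request is passed.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    # Keep only the newly generated tokens, i.e. the predicted API call.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```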
@Skinishh 1 year ago
@@huveja9799 Got it. Is there a way to train the retriever with the LLM end-to-end?
@huveja9799 1 year ago
@@Skinishh I suppose you could set up a loop where you simultaneously train the retriever to produce better embeddings for that task (matching queries to the most appropriate API documentation) and train the LLM to better interpret the retriever's results. This could then be complemented with RLHF to further improve the system, using the user's reaction to the system's response as feedback.
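One speculative way to sketch that loop (this is not from the Gorilla paper): train a dual-encoder retriever with a contrastive loss over in-batch negatives while the LLM gets the usual retrieval-aware loss. The encoder name is a placeholder, and model, tokenizer, and optimizer are reused from the earlier training sketch. Note that in this simple version no gradient flows from the LLM loss into the retriever; truly end-to-end differentiable retrieval needs something like REALM/RAG-style marginalization over retrieved documents.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

enc_name = "bert-base-uncased"  # placeholder retriever encoder
enc_tok = AutoTokenizer.from_pretrained(enc_name)
encoder = AutoModel.from_pretrained(enc_name)
retriever_opt = torch.optim.AdamW(encoder.parameters(), lr=1e-5)

def embed(texts):
    """Mean-pooled retriever embeddings (gradients flow through the encoder)."""
    batch = enc_tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1)      # (B, H)

def joint_step(requests, gold_docs, gold_api_calls):
    # 1) Retriever loss: pull each request toward its matching API doc and
    #    push it away from the other docs in the batch (in-batch negatives).
    q = F.normalize(embed(requests), dim=-1)
    d = F.normalize(embed(gold_docs), dim=-1)
    scores = q @ d.T / 0.05                           # temperature-scaled similarities
    retriever_loss = F.cross_entropy(scores, torch.arange(len(requests)))

    # 2) LLM loss: the same retrieval-aware objective as before, but feeding
    #    the gold doc, i.e. assuming perfect retrieval during training.
    llm_loss = 0.0
    for req, doc, call in zip(requests, gold_docs, gold_api_calls):
        text = (req + "\nUse this API documentation for reference: "
                + doc + "\n" + call)
        ids = tokenizer(text, return_tensors="pt")
        llm_loss = llm_loss + model(**ids, labels=ids["input_ids"].clone()).loss
    llm_loss = llm_loss / len(requests)

    # Update both components; each loss only reaches its own model here.
    (retriever_loss + llm_loss).backward()
    retriever_opt.step(); retriever_opt.zero_grad()
    optimizer.step(); optimizer.zero_grad()
```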
@haraldwolte3745 1 year ago
How can I run this locally?
@TymexComputing 1 year ago
The TV-set on the wall reminds me of the public TV and mainstream media scam :)
@connor-shorten 1 year ago
Haha, that is a foam pad to try to make the audio sound better on these videos!
@TymexComputing 1 year ago
Oh indeed - sorry - I was watching earlier on a 5" screen (in contrast to the 50" one I was hallucinating about) @@connor-shorten