Improving Text Embeddings with Large Language Models

AIGumbo.crew January 2, 2024

In this paper, we introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and fewer than 1k training steps. Unlike existing methods that often depend on multi-stage intermediate pre-training with billions of weakly-supervised text pairs, followed by fine-tuning with a few labeled datasets, our method does not require building…

#beir #mteb

This story appeared on arxiv.org, 2024-01-02.