Improving Text Embeddings with Large Language Models



In this paper, we introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and less than 1k training steps. Unlike existing methods that often depend on multi-stage intermediate pre-training with billions of weakly-supervised text pairs, followed by fine-tuning with a few labeled datasets, our method does not require building complex training pipelines or relying on manually collected datasets that are often constrained by task diversity and language coverage.
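
The paper's recipe fine-tunes open-source decoder-only LLMs on LLM-generated (query, document) pairs with a standard contrastive loss. As a rough illustration of that training objective only, here is a minimal sketch of the usual in-batch-negative InfoNCE formulation; the function name, embedding dimension, and temperature value are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(query_emb: torch.Tensor,
                  doc_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch-negative contrastive (InfoNCE) loss for paired texts.

    Row i of query_emb and doc_emb are embeddings of the two sides of
    the same (query, document) training pair.
    """
    # Cosine similarity: L2-normalize, then take pairwise dot products.
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = (q @ d.T) / temperature  # (batch, batch) similarity matrix
    # Each query's positive is the document at the same batch index;
    # every other document in the batch serves as a negative.
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)


# Toy usage: in the paper's setting these embeddings would come from the
# fine-tuned decoder-only LLM (e.g. a pooled last hidden state), not random
# tensors as here.
q = torch.randn(8, 768, requires_grad=True)
d = torch.randn(8, 768, requires_grad=True)
loss = info_nce_loss(q, d)
loss.backward()
print(float(loss))
```

With in-batch negatives, the batch size effectively sets the number of negatives each query is contrasted against, which is one reason contrastive embedding training typically benefits from larger batches.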

This paper appeared on arxiv.org on 2024-01-02.


