Jina Embeddings v3
Jina AI
New State-of-the-Art Multilingual Embeddings With Task LoRA
jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications.
Based on the Jina-XLM-RoBERTa architecture, this model uses Rotary Position Embeddings (RoPE) to handle input sequences of up to 8192 tokens.
Additionally, it features 5 LoRA adapters to generate task-specific embeddings efficiently.
Highlights:
- Extended Sequence Length: Supports up to 8192 tokens with RoPE.
- Task-Specific Embedding: Customize embeddings through the task argument with the following options:
  - retrieval.query: Used for query embeddings in asymmetric retrieval tasks
  - retrieval.passage: Used for passage embeddings in asymmetric retrieval tasks
  - separation: Used for embeddings in clustering and re-ranking applications
  - classification: Used for embeddings in classification tasks
  - text-matching: Used for embeddings in tasks that quantify similarity between two texts, such as STS or symmetric retrieval tasks
- Matryoshka Embeddings: Supports flexible embedding sizes (32, 64, 128, 256, 512, 768, 1024), allowing for truncating embeddings to fit your application.
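The Matryoshka truncation mentioned above can be sketched client-side as follows. This is a minimal illustration, not the model's own code: it assumes the usual Matryoshka convention of keeping the leading components and re-normalizing to unit length, the helper name truncate_embedding is hypothetical, and the input is random data standing in for real 1024-dimensional model output.

```python
import numpy as np

# Embedding sizes listed in the highlights above.
SUPPORTED_DIMS = (32, 64, 128, 256, 512, 768, 1024)


def truncate_embedding(embedding, dim):
    """Keep the first `dim` components and rescale each vector to unit length.

    Hypothetical helper: assumes leading-prefix truncation followed by
    L2 re-normalization, which is the common Matryoshka convention.
    """
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    vec = np.asarray(embedding, dtype=np.float32)
    if vec.shape[-1] < dim:
        raise ValueError("embedding is shorter than the requested size")
    head = vec[..., :dim]
    # Guard against division by zero for an all-zero vector.
    norm = np.clip(np.linalg.norm(head, axis=-1, keepdims=True), 1e-12, None)
    return head / norm


# Random stand-in for three full-size embeddings (illustration only).
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 1024)).astype(np.float32)

small = truncate_embedding(full, 256)
print(small.shape)  # (3, 256)
```

Smaller sizes reduce storage and speed up similarity search at some cost in accuracy, so the right size depends on the application.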
Usage:
Please refer to this link for detailed usage.
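As a rough illustration of how the task option and embedding size fit together, the sketch below validates both and assembles a request body. The helper build_embedding_request and the field names (model, task, dimensions, input) are assumptions made for this example and may not match your deployment; the linked usage page remains the authoritative reference.

```python
import json

# Task options and embedding sizes taken from the highlights above.
TASKS = {
    "retrieval.query",
    "retrieval.passage",
    "separation",
    "classification",
    "text-matching",
}
DIMS = (32, 64, 128, 256, 512, 768, 1024)


def build_embedding_request(texts, task, dimensions=1024):
    """Build a request body for an embedding call.

    Hypothetical helper: the payload field names are assumptions,
    not a confirmed API schema.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASKS)}")
    if dimensions not in DIMS:
        raise ValueError(f"dimensions must be one of {DIMS}")
    texts = list(texts)
    if not texts:
        raise ValueError("at least one input text is required")
    return {
        "model": "jina-embeddings-v3",
        "task": task,
        "dimensions": dimensions,
        "input": texts,
    }


# Asymmetric retrieval: queries and passages use different task options.
query_request = build_embedding_request(
    ["How long can the input be?"], task="retrieval.query"
)
passage_request = build_embedding_request(
    ["The model handles input sequences of up to 8192 tokens."],
    task="retrieval.passage",
    dimensions=256,
)
print(json.dumps(query_request, indent=2))
```

Checking the task name and size before sending a request surfaces typos such as retrieval.document early instead of as a server-side error.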