Ollama
"Ollama supports embedding models, making it possible to build retrieval augmented generation (RAG) applications that combine text prompts with existing documents or other data." Learn more about the introduction to Ollama Embeddings in the blog post.
To use Ollama embeddings, first install the LangChain Community package:
!pip install langchain-community
Load the Ollama Embeddings class:
from langchain_community.embeddings import OllamaEmbeddings
# By default, uses llama2. Run `ollama pull llama2` to pull down the model.
embeddings = OllamaEmbeddings()
text = "This is a test document."
To generate embeddings, you can either embed an individual query text or embed a list of documents.
query_result = embeddings.embed_query(text)
query_result[:5]
[-0.09996652603149414,
0.015568195842206478,
0.17670190334320068,
0.16521021723747253,
0.21193109452724457]
doc_result = embeddings.embed_documents([text])
doc_result[0][:5]
[-0.04242777079343796,
0.016536075621843338,
0.10052520781755447,
0.18272875249385834,
0.2079043835401535]
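Both `embed_query` and `embed_documents` return plain Python lists of floats, so you can compare vectors directly. As an illustrative sketch (NumPy is not required by the examples above; it is used here only for convenience), cosine similarity between the query vector and the first document vector looks like this:

import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# query_result and doc_result come from the cells above; since both encode
# the same text, the similarity should be high.
cosine_similarity(query_result, doc_result[0])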
Embedding Models
Ollama also offers dedicated embedding models that are lightweight, with the smallest around 25 MB. See the available embedding models from Ollama.
Let's load the Ollama Embeddings class with a smaller, dedicated embedding model (e.g. mxbai-embed-large).
Note: see other supported models at https://ollama.ai/library
embeddings = OllamaEmbeddings(model="mxbai-embed-large")
text = "This is a test document."
query_result = embeddings.embed_query(text)
query_result[:5]
[-0.09996627271175385,
0.015567859634757042,
0.17670205235481262,
0.16521376371383667,
0.21193283796310425]
doc_result = embeddings.embed_documents([text])
doc_result[0][:5]
[-0.042427532374858856,
0.01653730869293213,
0.10052604228258133,
0.18272635340690613,
0.20790338516235352]
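To connect this back to the RAG use case mentioned at the top, the embeddings object can be plugged into a LangChain vector store. Below is a minimal sketch, assuming the chromadb package is installed (`pip install chromadb`); the sample texts are made up for illustration:

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# Index a few documents; Chroma calls embed_documents under the hood.
db = Chroma.from_texts(
    texts=[
        "Ollama runs large language models locally.",
        "Embeddings map text to vectors for semantic search.",
    ],
    embedding=embeddings,
)

# Retrieval: the question is embedded with embed_query and matched
# against the stored document vectors.
docs = db.similarity_search("How do I search text semantically?", k=1)
print(docs[0].page_content)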