Quickstart¶
We’ll cover the following here:
Installation¶
Installation with fastembed
Installation without fastembed
Embedding, Inserting and Querying¶
add and query with fastembed
Qdrant without fastembed: Points, upsert and query
Installation¶
Installation with fastembed¶
Qdrant’s Python client ships with FastEmbed, an optional dependency for embedding text without handling the embedding model.
[1]:
!pip install 'qdrant-client[fastembed]' --quiet
Initialize the Client¶
We support 3 production options:
Qdrant Cloud - Recommended for Getting Started
Qdrant Managed Deployment with your Cloud Provider - Recommended for Enterprises
Qdrant Self-Hosted with Docker - Recommended for those with specific requirements
In addition, the Python client wraps numpy in a :memory: mode, which is useful for getting a feel for the client syntax.
[2]:
from qdrant_client import QdrantClient
# client = QdrantClient(path="path/to/db") # Persists changes to disk
# or
client = QdrantClient(":memory:")
Embedding, Inserting and Querying¶
We’ll use the FastEmbed library to embed text without handling the embedding model.
Embedding and Inserting¶
For convenience, we wrap Point creation and insertion into a single API: add.
[3]:
# Prepare your documents
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
client.add(
    collection_name="demo_collection",
    documents=docs,
)
[3]:
['a3e23385a815464385a7589443f850db', 'd5bef7146f1541518cd767313f6569d5']
If you’d like to add points with control over the metadata, you can use the metadata parameter. Here is a quick example:
[4]:
# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Linkedin-docs"},
]
ids = [42, 2]
# Use the new add method
client.add(
    collection_name="demo_collection",
    documents=docs,
    metadata=metadata,
    ids=ids
)
[4]:
[42, 2]
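As a rough mental model of what add does with these arguments, the sketch below embeds each document, pairs it with an ID and a payload, and inserts it into a store. Everything in it is an assumption made for illustration: toy_embed is a hypothetical stand-in for the embedding model (FastEmbed works differently), and add_sketch and the dict-based store are not part of qdrant-client. The only details taken from the outputs on this page are the 32-character hex IDs returned when no ids are passed and the document key that appears next to the metadata in query results.

```python
import uuid


def toy_embed(text, dim=4):
    """Hypothetical stand-in for an embedding model; not what FastEmbed does."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def add_sketch(store, documents, metadata=None, ids=None):
    """Illustrative only: embed each document, build a point, insert or overwrite it."""
    metadata = metadata or [{} for _ in documents]
    # When no IDs are given, generate random hex IDs.
    ids = ids or [uuid.uuid4().hex for _ in documents]
    for point_id, doc, meta in zip(ids, documents, metadata):
        store[point_id] = {
            "vector": toy_embed(doc),
            # The document text is kept in the payload next to the user metadata.
            "payload": {"document": doc, **meta},
        }
    return ids


docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [{"source": "Langchain-docs"}, {"source": "Linkedin-docs"}]

store = {}  # hypothetical in-memory stand-in for a collection
ids = add_sketch(store, docs, metadata=metadata, ids=[42, 2])
print(ids)
print(store[42]["payload"])
```

Keeping the document text inside the payload is what lets a later query return the original text together with its metadata.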
Querying with text directly¶
At query time, we need to embed the incoming query and then search for the nearest neighbors. We can do this with the query API:
[5]:
search_result = client.query(
    collection_name="demo_collection",
    query_text="This is a query document",
    limit=1
)
print(search_result)
[QueryResponse(id=42, embedding=None, metadata={'document': 'Qdrant has Langchain integrations', 'source': 'Langchain-docs'}, document='Qdrant has Langchain integrations', score=0.8276550115796268)]
Qdrant without fastembed¶
Collection¶
A collection is a set of points with the same dimensionality and a similarity metric (e.g. Dot, Cosine) defined on it. We can create a collection with the create_collection method:
[6]:
from qdrant_client.http.models import Distance, VectorParams
# Create the collection only if it does not exist yet
if not client.collection_exists("test_collection"):
    client.create_collection(
        collection_name="test_collection",
        vectors_config=VectorParams(size=4, distance=Distance.DOT),
    )
[6]:
True
Notice that we didn’t have to explicitly create a collection with fastembed: when using add, the collection is created if it does not exist yet and updated if it already does.
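To make the two similarity metrics named in the collection definition concrete, here is a small pure-Python sketch of how Dot and Cosine can rank the same candidates differently. The vectors and the helper names dot and cosine are made up for illustration and do not come from qdrant-client.

```python
import math


def dot(a, b):
    """Dot product: sensitive to both direction and vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: dot product of the length-normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = [1.0, 0.0, 0.0, 0.0]
short = [1.0, 0.0, 0.0, 0.0]  # same direction as the query, small norm
long_ = [3.0, 3.0, 0.0, 0.0]  # different direction, large norm

print("dot:   ", dot(query, short), dot(query, long_))
print("cosine:", cosine(query, short), cosine(query, long_))
```

With Dot the longer vector scores higher, while Cosine prefers the vector pointing in the same direction as the query, so the choice of metric should match how the vectors were produced.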
Points¶
A point is a vector of floats with a unique identifier id. We can create points with PointStruct and insert them with upsert:
[7]:
from qdrant_client.http.models import PointStruct
operation_info = client.upsert(
    collection_name="test_collection",
    wait=True,
    points=[
        PointStruct(id=1, vector=[0.05, 0.61, 0.76, 0.74], payload={"city": "Berlin"}),
        PointStruct(id=2, vector=[0.19, 0.81, 0.75, 0.11], payload={"city": "London"}),
        PointStruct(id=3, vector=[0.36, 0.55, 0.47, 0.94], payload={"city": "Moscow"}),
        PointStruct(id=4, vector=[0.18, 0.01, 0.85, 0.80], payload={"city": "New York"}),
        PointStruct(id=5, vector=[0.24, 0.18, 0.22, 0.44], payload={"city": "Beijing"}),
        PointStruct(id=6, vector=[0.35, 0.08, 0.11, 0.44], payload={"city": "Mumbai"}),
    ]
)
print(operation_info)
operation_id=0 status=<UpdateStatus.COMPLETED: 'completed'>
This creation of Points is also abstracted away with the add API.
Querying with Vector¶
[8]:
search_result = client.search(
    collection_name="test_collection",
    query_vector=[0.18, 0.81, 0.75, 0.12],
    limit=1
)
print(search_result)
[ScoredPoint(id=2, version=0, score=1.2660000014305115, payload={'city': 'London'}, vector=None)]
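The score above can be checked by hand. Because test_collection uses the Dot distance, the score is the dot product of the query vector with the stored vector. The sketch below recomputes it by brute force in plain Python over the six vectors upserted earlier; it only illustrates the arithmetic and makes no claim about how Qdrant searches internally.

```python
# Vectors copied from the upsert example, keyed by point id.
points = {
    1: [0.05, 0.61, 0.76, 0.74],
    2: [0.19, 0.81, 0.75, 0.11],
    3: [0.36, 0.55, 0.47, 0.94],
    4: [0.18, 0.01, 0.85, 0.80],
    5: [0.24, 0.18, 0.22, 0.44],
    6: [0.35, 0.08, 0.11, 0.44],
}
query_vector = [0.18, 0.81, 0.75, 0.12]

# Dot-product score of the query against every stored vector.
scores = {
    point_id: sum(q * v for q, v in zip(query_vector, vector))
    for point_id, vector in points.items()
}

# limit=1 corresponds to keeping only the highest-scoring point.
best_id = max(scores, key=scores.get)
print(best_id, round(scores[best_id], 3))
```

The best match is point 2 (London) with a score of about 1.266, which agrees with the search result up to floating-point precision.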
Querying with a Filter and Vector¶
[9]:
from qdrant_client.http.models import Filter, FieldCondition, MatchValue
search_result = client.search(
    collection_name="test_collection",
    query_vector=[0.2, 0.1, 0.9, 0.7],
    query_filter=Filter(
        must=[
            FieldCondition(
                key="city",
                match=MatchValue(value="London")
            )
        ]
    ),
    limit=1
)
print(search_result)
[ScoredPoint(id=2, version=0, score=0.8709999993443489, payload={'city': 'London'}, vector=None)]
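The effect of the filter is easiest to see by comparing the ranking with and without it. The plain-Python sketch below restricts the candidates to points whose payload matches city == "London" and then ranks them by dot product, using the data from the upsert example. It is a conceptual illustration only; the variable names are made up, and Qdrant's actual filtering strategy is not described on this page.

```python
# (vector, payload) pairs copied from the upsert example, keyed by point id.
points = {
    1: ([0.05, 0.61, 0.76, 0.74], {"city": "Berlin"}),
    2: ([0.19, 0.81, 0.75, 0.11], {"city": "London"}),
    3: ([0.36, 0.55, 0.47, 0.94], {"city": "Moscow"}),
    4: ([0.18, 0.01, 0.85, 0.80], {"city": "New York"}),
    5: ([0.24, 0.18, 0.22, 0.44], {"city": "Beijing"}),
    6: ([0.35, 0.08, 0.11, 0.44], {"city": "Mumbai"}),
}
query_vector = [0.2, 0.1, 0.9, 0.7]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Ranking over all points, ignoring the payload.
all_scores = {pid: dot(query_vector, vec) for pid, (vec, _) in points.items()}
unfiltered_best = max(all_scores, key=all_scores.get)

# Ranking restricted to points that satisfy the "must" condition.
filtered_scores = {
    pid: score
    for pid, score in all_scores.items()
    if points[pid][1].get("city") == "London"
}
filtered_best = max(filtered_scores, key=filtered_scores.get)

print("without filter:", unfiltered_best, round(all_scores[unfiltered_best], 3))
print("with filter:   ", filtered_best, round(filtered_scores[filtered_best], 3))
```

Without the filter the nearest point is 4 (New York, about 1.362); with the filter only point 2 (London, about 0.871) is eligible, which matches the result above.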