qdrant_client.async_qdrant_fastembed module¶
- class AsyncQdrantFastembedMixin(**kwargs: Any)[source]¶
Bases:
AsyncQdrantBase
- async add(collection_name: str, documents: Iterable[str], metadata: Optional[Iterable[Dict[str, Any]]] = None, ids: Optional[Iterable[Union[int, str]]] = None, batch_size: int = 32, parallel: Optional[int] = None, **kwargs: Any) List[Union[str, int]] [source]¶
Adds text documents to a Qdrant collection. If the collection does not exist, it will be created with default parameters. Metadata, in combination with the documents, will be added as payload. Documents will be embedded using the specified embedding model.
If you want to use your own vectors, use the upsert method instead.
- Parameters
collection_name (str) – Name of the collection to add documents to.
documents (Iterable[str]) – List of documents to embed and add to the collection.
metadata (Iterable[Dict[str, Any]], optional) – List of metadata dicts. Defaults to None.
ids (Iterable[models.ExtendedPointId], optional) – List of ids to assign to documents. If not specified, UUIDs will be generated. Defaults to None.
batch_size (int, optional) – How many documents to embed and upload in single request. Defaults to 32.
parallel (Optional[int], optional) – How many parallel workers to use for embedding. Defaults to None. If a number is specified, a data-parallel process will be used.
- Raises
ImportError – If fastembed is not installed.
- Returns
List of IDs of the added documents. If no ids are provided, UUIDs will be randomly generated on the client side.
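Example (a minimal sketch, assuming a Qdrant instance at http://localhost:6333 and the fastembed extra installed via pip install "qdrant-client[fastembed]"; the collection name "demo" is hypothetical):

import asyncio

from qdrant_client import AsyncQdrantClient


async def main() -> None:
    client = AsyncQdrantClient(url="http://localhost:6333")
    # Embeds the documents with the default model and upserts them;
    # the "demo" collection is created on the fly if it does not exist.
    ids = await client.add(
        collection_name="demo",
        documents=["Qdrant is a vector database", "FastEmbed embeds text"],
        metadata=[{"source": "docs"}, {"source": "blog"}],
    )
    print(ids)  # client-side UUIDs, since no ids were passed

asyncio.run(main())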
- get_fastembed_sparse_vector_params(on_disk: Optional[bool] = None, modifier: Optional[Modifier] = None) Optional[Dict[str, SparseVectorParams]] [source]¶
Generates vector configuration, compatible with fastembed sparse models.
- Parameters
on_disk – If True, vectors will be stored on disk. If None, the default value will be used.
modifier – Sparse vector query modifier, e.g. Modifier.IDF for IDF-based rescoring. Defaults to None.
- Returns
Configuration for the sparse_vectors_config argument of the create_collection method.
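Example (a minimal sketch; the model name is one of fastembed's supported sparse models and is illustrative):

from qdrant_client import AsyncQdrantClient, models

client = AsyncQdrantClient(url="http://localhost:6333")
client.set_sparse_model("prithivida/Splade_PP_en_v1")
# Maps the sparse vector field name to SparseVectorParams with IDF rescoring.
sparse_config = client.get_fastembed_sparse_vector_params(modifier=models.Modifier.IDF)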
- get_fastembed_vector_params(on_disk: Optional[bool] = None, quantization_config: Optional[Union[ScalarQuantization, ProductQuantization, BinaryQuantization]] = None, hnsw_config: Optional[HnswConfigDiff] = None) Dict[str, VectorParams] [source]¶
Generates vector configuration, compatible with fastembed models.
- Parameters
on_disk – If True, vectors will be stored on disk. If None, the default value will be used.
quantization_config – Quantization configuration. If None, quantization will be disabled.
hnsw_config – HNSW configuration. If None, default configuration will be used.
- Returns
Configuration for the vectors_config argument of the create_collection method.
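Example (a minimal sketch that creates a collection whose layout matches the currently selected fastembed models; "hybrid-demo" is a hypothetical collection name):

import asyncio

from qdrant_client import AsyncQdrantClient


async def main() -> None:
    client = AsyncQdrantClient(url="http://localhost:6333")
    client.set_model("BAAI/bge-small-en")
    client.set_sparse_model("prithivida/Splade_PP_en_v1")
    await client.create_collection(
        collection_name="hybrid-demo",
        vectors_config=client.get_fastembed_vector_params(on_disk=True),
        sparse_vectors_config=client.get_fastembed_sparse_vector_params(),
    )

asyncio.run(main())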
- get_sparse_vector_field_name() Optional[str] [source]¶
Returns the name of the sparse vector field in the Qdrant collection used by the current fastembed sparse model.
- Returns
Name of the sparse vector field, or None if no sparse model is set.
- get_vector_field_name() str [source]¶
Returns the name of the vector field in the Qdrant collection used by the current fastembed model.
- Returns
Name of the vector field.
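Example (a minimal sketch; the field names are useful when calling lower-level methods such as upsert against a fastembed-managed collection):

from qdrant_client import AsyncQdrantClient

client = AsyncQdrantClient(url="http://localhost:6333")
client.set_model("BAAI/bge-small-en")
print(client.get_vector_field_name())         # e.g. "fast-bge-small-en"
print(client.get_sparse_vector_field_name())  # None until a sparse model is set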
- async query(collection_name: str, query_text: str, query_filter: Optional[Filter] = None, limit: int = 10, **kwargs: Any) List[QueryResponse] [source]¶
Search for documents in a collection. This method automatically embeds the query text using the specified embedding model. If you want to use your own query vector, use the search method instead.
- Parameters
collection_name – Collection to search in
query_text – Text to search for. This text will be embedded using the specified embedding model and then used as the query vector.
query_filter – Exclude vectors that don't fit the given conditions. If None, search among all vectors.
limit – How many results to return.
**kwargs – Additional search parameters. See qdrant_client.models.SearchRequest for details.
- Returns
List[QueryResponse] – List of query responses.
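Example (a minimal sketch, assuming documents were previously added with add(); the "source" payload key in the filter is hypothetical):

import asyncio

from qdrant_client import AsyncQdrantClient, models


async def main() -> None:
    client = AsyncQdrantClient(url="http://localhost:6333")
    hits = await client.query(
        collection_name="demo",
        query_text="What is a vector database?",
        query_filter=models.Filter(
            must=[models.FieldCondition(key="source", match=models.MatchValue(value="docs"))]
        ),
        limit=5,
    )
    for hit in hits:
        print(hit.score, hit.document)

asyncio.run(main())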
- async query_batch(collection_name: str, query_texts: List[str], query_filter: Optional[Filter] = None, limit: int = 10, **kwargs: Any) List[List[QueryResponse]] [source]¶
Search for documents in a collection with a batch of queries. This method automatically embeds each query text using the specified embedding model.
- Parameters
collection_name – Collection to search in
query_texts – A list of texts to search for. Each text will be embedded using the specified embedding model and then used as the query vector for a separate search request.
query_filter – Exclude vectors that don't fit the given conditions. If None, search among all vectors. The same filter will be applied to all search requests.
limit – How many results to return per query.
**kwargs – Additional search parameters. See qdrant_client.models.SearchRequest for details.
- Returns
List[List[QueryResponse]] – List of lists of responses for each query text.
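Example (a minimal sketch; each query text yields its own ranked list of responses):

import asyncio

from qdrant_client import AsyncQdrantClient


async def main() -> None:
    client = AsyncQdrantClient(url="http://localhost:6333")
    queries = ["vector search", "text embeddings"]
    batches = await client.query_batch(
        collection_name="demo",
        query_texts=queries,
        limit=3,
    )
    for text, hits in zip(queries, batches):
        print(text, "->", [hit.document for hit in hits])

asyncio.run(main())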
- set_model(embedding_model_name: str, max_length: Optional[int] = None, cache_dir: Optional[str] = None, threads: Optional[int] = None, providers: Optional[Sequence[OnnxProvider]] = None, **kwargs: Any) None [source]¶
Set embedding model to use for encoding documents and queries.
- Parameters
embedding_model_name – One of the supported embedding models. See SUPPORTED_EMBEDDING_MODELS for details.
max_length (int, optional) – Deprecated. Defaults to None.
cache_dir (str, optional) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system’s temp directory.
threads (int, optional) – The number of threads single onnxruntime session can use. Defaults to None.
providers – The list of onnx providers (with or without options) to use. Defaults to None. Example configuration: https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#configuration-options
- Raises
ValueError – If embedding model is not supported.
ImportError – If fastembed is not installed.
- Returns
None
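Example (a minimal sketch; CUDAExecutionProvider requires onnxruntime-gpu, otherwise onnxruntime falls back to CPU):

from qdrant_client import AsyncQdrantClient

client = AsyncQdrantClient(url="http://localhost:6333")
# Call before add()/query() if a non-default model is wanted.
client.set_model(
    "sentence-transformers/all-MiniLM-L6-v2",
    threads=4,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)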
- set_sparse_model(embedding_model_name: Optional[str], cache_dir: Optional[str] = None, threads: Optional[int] = None, providers: Optional[Sequence[OnnxProvider]] = None, **kwargs: Any) None [source]¶
Set sparse embedding model to use for hybrid search over documents in combination with dense embeddings.
- Parameters
embedding_model_name – One of the supported sparse embedding models. See SUPPORTED_SPARSE_EMBEDDING_MODELS for details. If None, sparse embeddings will not be used.
cache_dir (str, optional) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system’s temp directory.
threads (int, optional) – The number of threads single onnxruntime session can use. Defaults to None.
providers – The list of onnx providers (with or without options) to use. Defaults to None. Example configuration: https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#configuration-options
- Raises
ValueError – If embedding model is not supported.
ImportError – If fastembed is not installed.
- Returns
None
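Example (a minimal sketch of a hybrid setup; with both a dense and a sparse model set, add() stores both kinds of embeddings and query() combines them; "hybrid-demo" is a hypothetical collection name):

import asyncio

from qdrant_client import AsyncQdrantClient


async def main() -> None:
    client = AsyncQdrantClient(url="http://localhost:6333")
    client.set_model("BAAI/bge-small-en")
    client.set_sparse_model("prithivida/Splade_PP_en_v1")
    await client.add(collection_name="hybrid-demo", documents=["hello world"])
    print(await client.query(collection_name="hybrid-demo", query_text="greeting"))

asyncio.run(main())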
- DEFAULT_EMBEDDING_MODEL = 'BAAI/bge-small-en'¶
- property embedding_model_name: str¶
- embedding_models: Dict[str, TextEmbedding] = {}¶
- property sparse_embedding_model_name: Optional[str]¶
- sparse_embedding_models: Dict[str, SparseTextEmbedding] = {}¶