{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quickstart\n",
    "\n",
    "\n",
    "We'll cover the following here: \n",
    "\n",
    "## Installation\n",
    "\n",
    "1. Installation with fastembed\n",
    "2. Installation without fastembed\n",
    "\n",
    "## Embedding, Inserting and Querying\n",
    "\n",
    "1. `add` and `query` with fastembed\n",
    "2. Qdrant without fastembed: Points, upsert and query "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "### Installation with fastembed\n",
    "\n",
    "Qdrant's Python client ships with FastEmbed, an optional dependency for embedding text without handling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install 'qdrant-client[fastembed]' --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize the Client\n",
    "\n",
    "We support 3 production options:\n",
    "\n",
    "1. Qdrant [Cloud](https://qdrant.to/cloud?utm_source=docs&utm_medium=website&utm_campaign=python-docs&utm_content=article&utm_term=intro) - Recommended for Getting Started\n",
    "2. Qdrant Managed Deployment with your Cloud Provider - Recommendeded for Enteprises\n",
    "3. Qdrant [Self-Hosted with Docker](https://qdrant.tech/documentation/quick-start/) - Recommended for those with specific requirements\n",
    "\n",
    "In addition, Python client wraps `numpy` in a `:memory:` mode, which is useful for getting a feel of the client syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qdrant_client import QdrantClient\n",
    "\n",
    "# client = QdrantClient(path=\"path/to/db\")  # Persists changes to disk\n",
    "# or\n",
    "client = QdrantClient(\":memory:\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Embedding, Inserting and Querying\n",
    "\n",
    "We'll use the FastEmbed library to embed text without handling the embedding model.\n",
    "\n",
    "### Embedding and Inserting\n",
    "\n",
    "We wrap the `Point` creation and insertion into a single API: `add` for convenience."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['a3e23385a815464385a7589443f850db', 'd5bef7146f1541518cd767313f6569d5']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Prepare your documents, metadata, and IDs\n",
    "docs = [\"Qdrant has Langchain integrations\", \"Qdrant also has Llama Index integrations\"]\n",
    "\n",
    "client.add(\n",
    "    collection_name=\"demo_collection\",\n",
    "    documents=docs,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you'd like to add points with control over the metadata, you can use the `metadata` parameter. Here is a quick example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[42, 2]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Prepare your documents, metadata, and IDs\n",
    "docs = [\"Qdrant has Langchain integrations\", \"Qdrant also has Llama Index integrations\"]\n",
    "metadata = [\n",
    "    {\"source\": \"Langchain-docs\"},\n",
    "    {\"source\": \"Linkedin-docs\"},\n",
    "]\n",
    "ids = [42, 2]\n",
    "\n",
    "# Use the new add method\n",
    "client.add(\n",
    "    collection_name=\"demo_collection\",\n",
    "    documents=docs,\n",
    "    metadata=metadata,\n",
    "    ids=ids\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Querying with text directly\n",
    "\n",
    "At query time, we need to embed the incoming query and then search for the nearest neighbors. We can do this with the `query` API:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[QueryResponse(id=42, embedding=None, metadata={'document': 'Qdrant has Langchain integrations', 'source': 'Langchain-docs'}, document='Qdrant has Langchain integrations', score=0.8276550115796268)]\n"
     ]
    }
   ],
   "source": [
    "search_result = client.query(\n",
    "    collection_name=\"demo_collection\",\n",
    "    query_text=\"This is a query document\",\n",
    "    limit=1\n",
    ")\n",
    "print(search_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Qdrant without `fastembed`\n",
    "\n",
    "### Collection\n",
    "\n",
    "A collection is a set of points with the same dimensionality and a similarity metric (e.g. Dot, Cosine) defined on it. We can create a collection with the `create_collection` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from qdrant_client.http.models import Distance, VectorParams\n",
    "\n",
    "if not client.collection_exists(\"test_collection\"):\n",
    "\tclient.create_collection(\n",
    "\t\tcollection_name=\"test_collection\",\n",
    "\t\tvectors_config=VectorParams(size=4, distance=Distance.DOT),\n",
    "\t)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that we didn't have to explicitly create a collection with `fastembed`: If the collection exists already, we update or create a new one when using `add`. \n",
    "\n",
    "## Points\n",
    "A point is a vector of floats with a unique identifier `id`. We can create a point with the `Point` API:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "operation_id=0 status=<UpdateStatus.COMPLETED: 'completed'>\n"
     ]
    }
   ],
   "source": [
    "from qdrant_client.http.models import PointStruct\n",
    "\n",
    "operation_info = client.upsert(\n",
    "    collection_name=\"test_collection\",\n",
    "    wait=True,\n",
    "    points=[\n",
    "        PointStruct(id=1, vector=[0.05, 0.61, 0.76, 0.74], payload={\"city\": \"Berlin\"}),\n",
    "        PointStruct(id=2, vector=[0.19, 0.81, 0.75, 0.11], payload={\"city\": \"London\"}),\n",
    "        PointStruct(id=3, vector=[0.36, 0.55, 0.47, 0.94], payload={\"city\": \"Moscow\"}),\n",
    "        PointStruct(id=4, vector=[0.18, 0.01, 0.85, 0.80], payload={\"city\": \"New York\"}),\n",
    "        PointStruct(id=5, vector=[0.24, 0.18, 0.22, 0.44], payload={\"city\": \"Beijing\"}),\n",
    "        PointStruct(id=6, vector=[0.35, 0.08, 0.11, 0.44], payload={\"city\": \"Mumbai\"}),\n",
    "    ]\n",
    ")\n",
    "print(operation_info)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This creation of `Points` is also abstracted away with the `add` API. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Querying with Vector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ScoredPoint(id=2, version=0, score=1.2660000014305115, payload={'city': 'London'}, vector=None)]\n"
     ]
    }
   ],
   "source": [
    "search_result = client.search(\n",
    "    collection_name=\"test_collection\",\n",
    "    query_vector=[0.18, 0.81, 0.75, 0.12], \n",
    "    limit=1\n",
    ")\n",
    "print(search_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Querying with a Filter and Vector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ScoredPoint(id=2, version=0, score=0.8709999993443489, payload={'city': 'London'}, vector=None)]\n"
     ]
    }
   ],
   "source": [
    "from qdrant_client.http.models import Filter, FieldCondition, MatchValue\n",
    "\n",
    "search_result = client.search(\n",
    "    collection_name=\"test_collection\",\n",
    "    query_vector=[0.2, 0.1, 0.9, 0.7], \n",
    "    query_filter=Filter(\n",
    "        must=[\n",
    "            FieldCondition(\n",
    "                key=\"city\",\n",
    "                match=MatchValue(value=\"London\")\n",
    "            )\n",
    "        ]\n",
    "    ),\n",
    "    limit=1\n",
    ")\n",
    "print(search_result)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "client",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}