Qdrant simplified: Setting up and using a vector database

Installing the Qdrant vector database is simple, or better said, it's simple if you're familiar with Docker and Nginx and have some experience using these tools.

In a previous article, I wrote about using the Pinecone vector database, which is not open source but offered as a managed service. Although very good, this database is not cheap: it will cost you at least $70 per month (at the time of writing this article).

If that's not too much for you, then it's definitely better to go with Pinecone and avoid all the unnecessary complications that come with installing the Qdrant database on your server and all the additional settings.

If that's too much money for you or you simply want to keep your data with you, out of reach of third parties, then using an open source database is certainly a better option.

For my needs, I chose a cheap Hetzner Cloud shared server, the CX21, which costs about €6 per month. That's roughly a tenth of the price of Pinecone, so the savings are not insignificant.

Now we'll see how this database is installed and used. Before we start, note that this is not a detailed guide to Docker, Nginx, or the command line. I assume you already know these tools; if you don't, you will need to study some other resources first.

1. Installing Qdrant database

When I provisioned the server at Hetzner, I chose to have Ubuntu 22.04 installed on it. Then I logged into the server over SSH and installed Docker via the apt repository, following the official Docker documentation. This method makes future Docker updates easier, so I recommend that you use it too. If you're using a different operating system, you should find instructions for it in the same documentation.

When installing and running Qdrant, or any other software, with Docker, it's generally recommended to work as a less privileged user rather than as root, for security reasons.

This is because an attacker who compromises an application running as root inside the container could potentially obtain root privileges on the host. Running Docker as a non-root user is good security practice, as it limits the damage that vulnerabilities in the software can cause.

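Add your user to the docker group that was created during the installation of Docker. This allows your user to run Docker commands without prefixing them with sudo. Keep in mind that members of the docker group effectively have root-level access to the host, so only add users you trust.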

sudo usermod -aG docker YOUR_USERNAME

And then log out and log back in so that your group membership is re-evaluated.

Once you have installed Docker and configured your user, you can pull the latest Qdrant image from Docker Hub:

docker pull qdrant/qdrant

Since you will probably want to set up at least an API key for database access, you need to override Qdrant's default configuration. In my case, I created a new file (its contents are shown further down) at the location: /home/goran/qdrant/custom_config.yaml

Then I started the container in detached mode, so that it runs in the background:

docker run -d -p 6333:6333 \
  -v /home/goran/qdrant/custom_config.yaml:/qdrant/config/production.yaml \
  -v /home/goran/qdrant/storage:/qdrant/storage \
  qdrant/qdrant

The docker run command creates and runs a new container from an image that you have pulled from Docker Hub.

Contents of the config file:

service:
  api_key: API_KEY

The directory /home/goran/qdrant/storage is where Qdrant persists all your data on the host machine.
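
The API_KEY placeholder in the config file should be replaced with a long random string. As a rough sketch, Python's standard secrets module can generate one and print the file contents. The function name render_qdrant_config is my own invention for this example and is not part of Qdrant.

```python
import secrets


def render_qdrant_config(api_key: str) -> str:
    """Return the YAML text for a minimal Qdrant config override."""
    # Quote the key so YAML always parses it as a string.
    return f'service:\n  api_key: "{api_key}"\n'


if __name__ == "__main__":
    # token_urlsafe(32) draws 32 random bytes and encodes them as URL-safe text.
    print(render_qdrant_config(secrets.token_urlsafe(32)), end="")
```

You could redirect the output into custom_config.yaml. Keep a copy of the key, because every API request has to send it in the api-key header.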

While docker run might be suitable for testing or simple deployments, it lacks the essential functionalities for robust and reliable production environments. Choose an alternative like Docker Compose, Swarm, or Kubernetes for a more scalable, manageable, and failure-resistant production setup. Unfortunately, using any of these tools falls beyond the scope of this article, so you will need to find another resource to assist you with that.

2. Setting up a reverse proxy using Nginx

To avoid accessing Qdrant via an IP address and port number, you can install Nginx and configure it as a reverse proxy. This lets you use a readable web address, such as qdrant.example.com, and enable HTTPS via Let's Encrypt.

Installing Nginx is simple: just follow the instructions on the official site.

The initial Nginx config file should look something like this:

upstream qdrant {
  server localhost:6333;
  keepalive 15;
}

server {
  server_name qdrant.example.com;

  access_log /var/log/qdrant/qdrant.example.com_access_log;
  error_log /var/log/qdrant/qdrant.example.com_error_log;

  location / {
    proxy_pass http://qdrant;
    proxy_http_version 1.1;
    proxy_set_header Connection "Keep-Alive";
    proxy_set_header Proxy-Connection "Keep-Alive";
  }
}

Now, instead of having to remember your server's IP address and port number, you (and others) can use a regular web URL, and Nginx will forward the traffic to your Qdrant service on its port.

What remains is to generate and install an SSL certificate with the help of Certbot. Certbot will modify the config file shown above, so you may want to back it up first.

Qdrant should now be accessible:

  • REST API: https://qdrant.example.com/
  • Web UI: https://qdrant.example.com/dashboard
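
To confirm that the proxy and the API key work together, you can ask the REST API for the list of collections. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the host and key are the placeholders used throughout this article.

```python
import json
import urllib.request

BASE_URL = "https://qdrant.example.com"  # placeholder host from this article
API_KEY = "API_KEY"                      # placeholder, use your real key


def collections_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the list of collections."""
    return urllib.request.Request(
        f"{base_url}/collections",
        headers={"api-key": api_key},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body; raises on HTTP errors."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_json(collections_request(BASE_URL, API_KEY)) should return a JSON object. If it raises urllib.error.HTTPError instead, the key sent in the header most likely does not match the one in the config file.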

3. Creating a Qdrant Collection

Vector data is stored in collections within the Qdrant database. To create a new collection via the curl command, do the following:

curl -X PUT "https://qdrant.example.com/collections/test_collection" \
  --header "Content-Type: application/json" \
  --header 'api-key: API_KEY' \
  --data '{"vectors": {"size": 1536, "distance": "Cosine"}}'

The size parameter is 1536 because that is the output dimension of OpenAI's text-embedding-ada-002 model. If you are using a different embedding model, check the size of its embeddings, as other models may have fewer or more dimensions.
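
One way to avoid hardcoding the wrong size is to derive it from an embedding your model actually returned. The sketch below builds the same PUT request as the curl command. The function name create_collection_request is hypothetical, and the list of zeros only stands in for a real embedding.

```python
import json
import urllib.request


def create_collection_request(base_url, api_key, name, sample_embedding,
                              distance="Cosine"):
    """Build (but do not send) the PUT request that creates a collection.

    The vector size is taken from a sample embedding so that it matches
    what the embedding model actually returns.
    """
    body = {"vectors": {"size": len(sample_embedding), "distance": distance}}
    return urllib.request.Request(
        f"{base_url}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )


# Stand-in for a real embedding: 1536 zeros, the ada-002 output size.
request = create_collection_request(
    "https://qdrant.example.com", "API_KEY", "test_collection", [0.0] * 1536
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it; the sketch stops short of that so you can inspect the body first.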

The distance metric for comparing vectors can be one of the following values:

  • Dot
  • Cosine
  • Euclid
  • Manhattan

For text embeddings, cosine similarity is often the preferred choice because it compares the angle between vectors and ignores their length, which matches how semantic similarity is usually represented in the embedding space of language models. Ultimately, the choice of metric should be based on the specific needs of your application.
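
To make the idea concrete, here is a small illustration of cosine similarity in plain Python. It is only meant to show the formula (the dot product divided by the product of the two vector lengths) and says nothing about how Qdrant implements it internally.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different length: the length does not matter.
print(cosine_similarity([1.0, 0.0], [3.0, 0.0]))  # 1.0
# Perpendicular vectors share no direction at all.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

The result ranges from -1 (opposite directions) through 0 (unrelated) to 1 (same direction).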

Note that other vector databases use slightly different terminology. In Pinecone, for example, a collection is called an index. The terms differ, but fundamentally they all refer to the same or a similar thing.

To find out more information, check out the official Qdrant documentation.

4. Adding vectors to the Qdrant collection

Points are the heart of Qdrant. They're like little containers holding information (a vector) and, if needed, additional data (payload).

Here's how to insert data points via the curl command:

curl -X PUT "https://qdrant.example.com/collections/test_collection/points?wait=true" \
  --header "Content-Type: application/json" \
  --header "api-key: API_KEY" \
  --data '{
  "points": [
    {"id": 1, "vector": [0.05, 0.61, 0.76, 0.74, ...], "payload": {"someData": "Some value"}},
    {"id": 2, "vector": [0.19, 0.81, 0.75, 0.11, ...], "payload": {"someData": "Some other value"}}
  ]
}'

To keep the example concise, only the first few of the 1536 vector components are shown; the ... stands for the rest.

In the Pinecone database, points are referred to as records, and they consist of vectors and, optionally, metadata (same as payload in Qdrant).
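
When points are inserted from a script, it can help to check each vector's length before sending anything, since a vector with the wrong number of components will not fit the collection. The sketch below builds the same PUT request as the curl example. The function name and the vector_size parameter are my own choices for illustration.

```python
import json
import urllib.request


def upsert_points_request(base_url, api_key, collection, points, vector_size):
    """Build (but do not send) a PUT request that inserts or updates points.

    Raises ValueError when a vector does not have vector_size components.
    """
    for point in points:
        if len(point["vector"]) != vector_size:
            raise ValueError(
                f"point {point['id']}: expected {vector_size} components, "
                f"got {len(point['vector'])}"
            )
    return urllib.request.Request(
        f"{base_url}/collections/{collection}/points?wait=true",
        data=json.dumps({"points": points}).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
```

For the collection created earlier you would pass vector_size=1536, with each point being a dictionary that has id, vector, and optionally payload keys.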

5. Deleting vectors from the Qdrant collection

Here's how to delete data points via the curl command:

curl -X POST "https://qdrant.example.com/collections/test_collection/points/delete?wait=true" \
  --header "Content-Type: application/json" \
  --header "api-key: API_KEY" \
  --data '{
  "points": [1, 2]
}'

If your application's logic requires a change (such as a newly added vector) to be visible to searches immediately after the API response, use the wait=true flag. With this setting, the API returns only once the operation has been completed. If you don't need this, remove the wait parameter from the URL.
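
In a script, the wait flag can be turned into an ordinary argument. This is a minimal sketch of the delete call with that toggle; the function name delete_points_request is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def delete_points_request(base_url, api_key, collection, ids, wait=True):
    """Build (but do not send) a POST request that deletes points by ID."""
    # With wait=True the API answers only after the operation has completed;
    # without it, the response may arrive before the change is applied.
    query = "?wait=true" if wait else ""
    return urllib.request.Request(
        f"{base_url}/collections/{collection}/points/delete{query}",
        data=json.dumps({"points": ids}).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


blocking = delete_points_request(
    "https://qdrant.example.com", "API_KEY", "test_collection", [1, 2]
)
background = delete_points_request(
    "https://qdrant.example.com", "API_KEY", "test_collection", [1, 2], wait=False
)
print(blocking.full_url)
print(background.full_url)
```

Defaulting to wait=True is the cautious choice here: it trades a slightly slower response for the guarantee that a following search sees the change.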

6. Searching the Qdrant collection

Modern large language models are trained to transform text into vectors, so that texts with similar meanings in the real world are positioned close together in the vector space.

Let's examine a sample search query:

curl -X POST "https://qdrant.example.com/collections/test_collection/points/search" \
  --header "Content-Type: application/json" \
  --header "api-key: API_KEY" \
  --data '{
  "vector": [0.05, 0.61, 0.76, 0.74, ...],
  "with_vectors": false,
  "with_payload": true,
  "limit": 1
}'

By default, the search omits any stored information such as payload and vectors. However, this behavior can be changed by using additional parameters with_vectors and with_payload.
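
The same search can be sketched in Python, together with a small helper that pulls the useful fields out of the answer. Both function names are hypothetical. The helper assumes the decoded response is a JSON object whose result field lists hits with id, score, and optional payload fields, so check the API reference for your Qdrant version before relying on it.

```python
import json
import urllib.request


def search_request(base_url, api_key, collection, vector, limit=1):
    """Build (but do not send) a POST request for a similarity search."""
    body = {
        "vector": vector,
        "with_vectors": False,  # leave stored vectors out of the response
        "with_payload": True,   # include each hit's payload
        "limit": limit,
    }
    return urllib.request.Request(
        f"{base_url}/collections/{collection}/points/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def top_hits(response):
    """Extract (id, score, payload) tuples from a decoded search response."""
    return [
        (hit["id"], hit["score"], hit.get("payload"))
        for hit in response["result"]
    ]
```

The query vector has to come from the same embedding model that produced the stored vectors; otherwise the scores are not meaningful.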

More information about using Qdrant for similarity search can be found in the official Qdrant documentation.

In summary, Qdrant offers a cost-effective, open-source alternative to managed vector databases like Pinecone, with a simple setup process for those familiar with Docker and Nginx. Whether you're aiming for budget-friendly solutions or prefer to manage your data independently, Qdrant provides a practical and accessible option for creating and managing vector databases.

About the Author

Goran Nikolovski is a web and AI developer with over 10 years of expertise in PHP, Drupal, Python, JavaScript, React, and React Native. He founded this website and enjoys sharing his knowledge.