Friday, April 17, 2026

A Tour of a Vector Database - Pinecone

The previous article I shared mentioned that unstructured data is usually stored in a vector database. I did a quick Google search for an open-source vector database to see what one looks like, and I found Pinecone (strictly speaking a managed cloud service rather than an open-source project).





In a vector database, data is stored in an index, which is different from a conventional database, where data is stored in tables.

Following the example given, I managed to create one index and load some sentences into it.
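To picture what "loading sentences into an index" means, here is a minimal in-memory sketch in Python. It is not the Pinecone client: `ToyIndex`, `toy_embed`, and `VOCAB` are hypothetical names made up for illustration, and the word-count "embedding" is only a stand-in for a real embedding model. The record layout (an id, a list of values, and some metadata) is loosely modeled on how vector databases describe a record.

```python
from collections import Counter

# Hypothetical toy vocabulary: each word is one dimension of the vector.
VOCAB = ["apple", "fruit", "phone", "company", "tree"]


def toy_embed(text):
    """Count how often each vocabulary word appears in the text.

    A stand-in for a real embedding model, used only for illustration.
    """
    counts = Counter(text.lower().replace(".", "").split())
    return [float(counts[word]) for word in VOCAB]


class ToyIndex:
    """A tiny in-memory 'index': record id -> vector plus metadata."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.records = {}

    def upsert(self, record_id, text):
        """Insert the record, or overwrite it if the id already exists."""
        vector = toy_embed(text)
        if len(vector) != self.dimension:
            raise ValueError("vector length does not match index dimension")
        self.records[record_id] = {"values": vector, "metadata": {"text": text}}


index = ToyIndex(dimension=len(VOCAB))
index.upsert("s1", "Apple is a fruit that grows on a tree.")
index.upsert("s2", "Apple is a company that makes a phone.")

for record_id, record in index.records.items():
    print(record_id, record["values"], record["metadata"]["text"])
```

Unlike a table row, each record here is mainly a list of numbers; the original sentence is kept only as metadata so that search results can show it.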




As we can see, the search results list the sentences I loaded into the index, and each sentence has a score. According to ChatGPT, the sentences (the data) are converted into vectors (embeddings).

  • When you search, your query is also turned into a vector
  • The system compares your query vector with stored vectors

👉 The score is the result of that comparison.
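The search flow described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Pinecone is implemented: `toy_embed`, `VOCAB`, and `search` are hypothetical names, the embedding is a simple word count, and the comparison scans every stored vector using cosine similarity.

```python
import math

# Hypothetical toy vocabulary: each word is one dimension of the vector.
VOCAB = ["cat", "dog", "pet", "car", "engine"]


def toy_embed(text):
    """Turn text into a word-count vector (a stand-in for a real embedding model)."""
    words = text.lower().split()
    return [float(words.count(word)) for word in VOCAB]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


sentences = ["the cat is a pet", "the dog is a pet", "the car has an engine"]
stored = [(text, toy_embed(text)) for text in sentences]


def search(query, top_k=3):
    """Embed the query, score it against every stored vector, return the best matches."""
    query_vector = toy_embed(query)
    scored = [(cosine(query_vector, vector), text) for text, vector in stored]
    return sorted(scored, reverse=True)[:top_k]


results = search("cat pet")
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The sentence that shares both words with the query gets the highest score, the one that shares only "pet" lands in the middle, and the unrelated sentence scores zero.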

The score depends on the similarity metric used. For example:

1. Cosine Similarity (most common)

2. Dot Product

A vector database score is simply a numerical measure of how close your query is to the stored data in vector space.




