| Vector Store | Index Type | Minimum Version | Best For |
|---|---|---|---|
| CouchbaseQueryVectorStore | Hyperscale Vector Index or Composite Vector Index | Couchbase Server 8.0+ | Large-scale pure vector searches or searches combining vector similarity with scalar filters |
| CouchbaseSearchVectorStore | Search Vector Index | Couchbase Server 7.6+ | Hybrid searches combining vector similarity with Full-Text Search (FTS) and geospatial searches |
Setup
To access the Couchbase vector stores, you first need to install the langchain-couchbase partner package (for example, with pip install -qU langchain-couchbase).
Credentials
Head over to the Couchbase website and create a new connection, making sure to save your database username and password. You will also need an OpenAI API key for the embeddings. Get one from OpenAI.
Create Couchbase Connection Object
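A minimal sketch of the connection step covered in this section, using the Couchbase Python SDK. The environment variable names and the `couchbase://localhost` default are assumptions for illustration only, and the SDK is imported inside the function so the helpers can be defined before the package is installed.

```python
import os
from datetime import timedelta


def read_settings(env=os.environ):
    """Collect connection settings from (hypothetical) environment variables."""
    return {
        "connection_string": env.get(
            "COUCHBASE_CONNECTION_STRING", "couchbase://localhost"
        ),
        "username": env.get("COUCHBASE_DB_USERNAME", ""),
        "password": env.get("COUCHBASE_DB_PASSWORD", ""),
    }


def connect_to_cluster(connection_string, username, password, timeout_seconds=5):
    """Return a Couchbase Cluster object once it is ready to serve requests."""
    # Lazy imports: the Couchbase SDK is only needed when a connection is made.
    from couchbase.auth import PasswordAuthenticator
    from couchbase.cluster import Cluster
    from couchbase.options import ClusterOptions

    auth = PasswordAuthenticator(username, password)
    cluster = Cluster(connection_string, ClusterOptions(auth))
    # Block until the cluster is reachable, or raise once the timeout expires.
    cluster.wait_until_ready(timedelta(seconds=timeout_seconds))
    return cluster
```

With real credentials in place, `cluster = connect_to_cluster(**read_settings())` yields the cluster object that is passed to either vector store.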
We create a connection to the Couchbase cluster initially and then pass the cluster object to the Vector Store. Here, we are connecting using the username and password from above. You can also connect to your cluster in any other supported way. For more information on connecting to the Couchbase cluster, please check the documentation.
CouchbaseQueryVectorStore
CouchbaseQueryVectorStore enables the usage of Couchbase for Vector Search using the Query and Indexing Service. It supports two different types of vector indexes:
- Hyperscale Vector Index - Optimized for pure vector searches on large datasets (billions of documents). Best for content discovery, recommendations, and applications requiring high accuracy with low memory footprint. Hyperscale Vector indexes compare vectors and scalar values simultaneously.
- Composite Vector Index - Combines a Global Secondary Index (GSI) with a vector column. Ideal for searches combining vector similarity with scalar filters where scalars filter out large portions of the dataset. Composite Vector indexes apply scalar filters first, then perform vector searches on the filtered results.
Initialization
Below, we create the vector store object with the cluster information and the distance metric. First, set up the embeddings (if not already done).
Distance Strategies
The CouchbaseQueryVectorStore supports the following distance strategies via the DistanceStrategy enum:
| Strategy | Description |
|---|---|
| DistanceStrategy.DOT | Dot product similarity |
| DistanceStrategy.COSINE | Cosine similarity |
| DistanceStrategy.EUCLIDEAN | Euclidean distance (equivalent to L2) |
| DistanceStrategy.EUCLIDEAN_SQUARED | Squared Euclidean distance (equivalent to L2_SQUARED) |
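As a rough sketch of initialization with one of these strategies, the helper below builds a CouchbaseQueryVectorStore from a cluster object and an embeddings object. The helper names and the string-based metric lookup are illustrative assumptions. The import path `langchain_couchbase.vectorstores` for `DistanceStrategy` and the keyword names should be checked against your installed version.

```python
SUPPORTED_METRICS = ("DOT", "COSINE", "EUCLIDEAN", "EUCLIDEAN_SQUARED")


def normalize_metric(name):
    """Map a user-supplied metric name onto a DistanceStrategy member name."""
    key = name.strip().upper()
    if key not in SUPPORTED_METRICS:
        raise ValueError(f"Unsupported distance strategy: {name!r}")
    return key


def build_query_store(cluster, embeddings, bucket, scope, collection, metric="COSINE"):
    """Create a CouchbaseQueryVectorStore (sketch; keyword names assumed)."""
    # Lazy imports so this module loads without langchain-couchbase installed.
    from langchain_couchbase import CouchbaseQueryVectorStore
    from langchain_couchbase.vectorstores import DistanceStrategy

    return CouchbaseQueryVectorStore(
        cluster=cluster,
        bucket_name=bucket,
        scope_name=scope,
        collection_name=collection,
        embedding=embeddings,
        # Enum lookup by member name, e.g. DistanceStrategy["COSINE"].
        distance_metric=DistanceStrategy[normalize_metric(metric)],
    )
```

The embeddings object can be any LangChain embeddings implementation, for example `OpenAIEmbeddings(model="text-embedding-3-large")` from `langchain_openai`.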
Specify the Text & Embeddings Field
You can optionally specify the text & embeddings field for the document using the text_key and embedding_key parameters.
Manage vector store
Once you have created your vector store, you can interact with it by adding and deleting items.
Add items to vector store
We can add items to our vector store by using the add_documents function.
Call the create_index() method after adding your documents to enable efficient vector searches.
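The ordering described above (add documents first, then build the index) might be wrapped as follows. This is a sketch, not the library's prescribed workflow: the `index_type` and `index_description` keyword names and the `"IVF,SQ8"` description string are assumptions to verify against the langchain-couchbase reference.

```python
def index_documents(vector_store, documents, index_type, index_description="IVF,SQ8"):
    """Add documents, then create the vector index over them.

    The index is created after ingestion, as the Query-based store expects.
    """
    ids = vector_store.add_documents(documents)
    # Keyword names below are assumed; adjust to your installed version.
    vector_store.create_index(
        index_type=index_type,
        index_description=index_description,
    )
    return ids
```

In practice `index_type` would be a member of the package's `IndexType` enum, selecting either a Hyperscale or a Composite index. The exact member names depend on the installed release.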
Query vector store
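Similarity search
Performing a simple similarity search can be done by calling similarity_search with a query string. Results can also be filtered with a SQL++ WHERE clause passed through the where_str parameter.
To retrieve the distance for each result, use the similarity_search_with_score method. Lower distances indicate more similar documents.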
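A small sketch of a scored, optionally filtered search against the Query-based store. The helper names are hypothetical, and the example WHERE clause in the usage note assumes documents carry a `metadata.source` field.

```python
def filtered_search(vector_store, query, where_str=None, k=2):
    """Run a scored similarity search, optionally restricted by a SQL++ clause."""
    kwargs = {"k": k}
    if where_str:
        # Forwarded as-is to the store, e.g. "metadata.source = 'news'".
        kwargs["where_str"] = where_str
    return vector_store.similarity_search_with_score(query, **kwargs)


def rank_by_distance(results):
    """Sort (document, distance) pairs so the closest match comes first."""
    return sorted(results, key=lambda pair: pair[1])
```

For instance, `rank_by_distance(filtered_search(store, "Will it be hot tomorrow?", where_str="metadata.source = 'news'"))` returns the matching documents ordered from most to least similar.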
Async Operations
CouchbaseQueryVectorStore supports async operations:
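As an illustration, several queries can be issued concurrently through `asimilarity_search`, the async counterpart that LangChain vector stores expose. The `search_many` helper is a hypothetical name used only for this sketch.

```python
import asyncio


async def search_many(vector_store, queries, k=2):
    """Run one async similarity search per query and gather the results."""
    tasks = [vector_store.asimilarity_search(query, k=k) for query in queries]
    # gather preserves the order of the input queries.
    return await asyncio.gather(*tasks)
```

From synchronous code this can be driven with `asyncio.run(search_many(store, ["first query", "second query"]))`.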
Use as Retriever
You can transform the vector store into a retriever by calling as_retriever().
Create from texts
You can create a CouchbaseQueryVectorStore directly from a list of texts using from_texts.
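A sketch of the from-texts path, assuming `from_texts` accepts the same connection keywords as the constructor (`cluster`, `bucket_name`, `scope_name`, `collection_name`, `distance_metric`). The wrapper name and the length check are illustrative additions, not part of the library.

```python
def store_from_texts(texts, metadatas, embeddings, cluster, bucket, scope, collection):
    """Build a CouchbaseQueryVectorStore from raw texts (sketch)."""
    if len(texts) != len(metadatas):
        raise ValueError("texts and metadatas must have the same length")

    # Lazy imports so this module loads without langchain-couchbase installed.
    from langchain_couchbase import CouchbaseQueryVectorStore
    from langchain_couchbase.vectorstores import DistanceStrategy

    return CouchbaseQueryVectorStore.from_texts(
        texts=texts,
        embedding=embeddings,
        metadatas=metadatas,
        cluster=cluster,
        bucket_name=bucket,
        scope_name=scope,
        collection_name=collection,
        distance_metric=DistanceStrategy.COSINE,
    )
```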
CouchbaseSearchVectorStore
CouchbaseSearchVectorStore enables the usage of Couchbase for Vector Search using Search Vector Indexes. Search Vector Indexes combine a Couchbase Search index with a vector column, allowing hybrid searches that combine vector searches with Full-Text Search (FTS) and geospatial searches.
Requirements: Couchbase Server version 7.6 and above.
For details on how to create a Search index with support for Vector fields, please refer to the documentation:
Search Index Field Mappings for This Tutorial
To follow along with the examples in this documentation, your Search index should include mappings for the following fields:
| Field | Type | Description |
|---|---|---|
| text | text | The document text content |
| embedding | vector | The vector embedding field (dimensions: 3072 for text-embedding-3-large) |
| metadata | object (child mapping) | The metadata object with child fields like source, author, rating, date |
- The vector field dimensions must match your embedding model (3072 for text-embedding-3-large used in this tutorial)
- The metadata child fields (source, author, rating, date) are needed for the hybrid query examples
- You can customize field names using the text_key and embedding_key parameters when initializing the vector store
Initialization
Below, we create the vector store object with the cluster information and the search index name. First, set up the embeddings.
Specify the text & embeddings field
You can optionally specify the text & embeddings field for the document using the text_key and embedding_key parameters.
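A minimal sketch of constructing the Search-backed store, including the optional field names. The defaults `"text"` and `"embedding"` mirror the field mappings listed earlier and are assumed to match the library defaults. The helper name is hypothetical.

```python
def build_search_store(
    cluster,
    embeddings,
    bucket,
    scope,
    collection,
    index_name,
    text_key="text",
    embedding_key="embedding",
):
    """Create a CouchbaseSearchVectorStore bound to an existing Search index."""
    # Lazy import so this module loads without langchain-couchbase installed.
    from langchain_couchbase import CouchbaseSearchVectorStore

    return CouchbaseSearchVectorStore(
        cluster=cluster,
        bucket_name=bucket,
        scope_name=scope,
        collection_name=collection,
        embedding=embeddings,
        index_name=index_name,
        # Field names must agree with the Search index mappings.
        text_key=text_key,
        embedding_key=embedding_key,
    )
```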
Manage vector store
Once you have created your vector store, you can interact with it by adding and deleting items.
Add items to vector store
We can add items to our vector store by using the add_documents function.
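One way to add and later delete items is to assign explicit IDs at insertion time. The sketch below relies only on the generic LangChain `add_documents(documents=..., ids=...)` and `delete(ids=...)` methods. The UUID-based IDs and helper names are illustrative choices.

```python
from uuid import uuid4


def add_with_ids(vector_store, documents):
    """Add documents under freshly generated IDs and return those IDs."""
    ids = [str(uuid4()) for _ in documents]
    vector_store.add_documents(documents=documents, ids=ids)
    return ids


def remove(vector_store, ids):
    """Delete previously added documents by ID."""
    vector_store.delete(ids=ids)
```

Keeping the returned IDs makes it possible to delete a single document later, for example `remove(store, ids[-1:])`.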
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it while running your chain or agent.
Similarity search
Performing a simple similarity search can be done by calling similarity_search with a query string. To also retrieve a score for each result, use the similarity_search_with_score method.
Filtering results
You can filter the search results by specifying any filter on the text or metadata in the document that is supported by the Couchbase Search service. The filter can be any valid SearchQuery supported by the Couchbase Python SDK. These filters are applied before the Vector Search is performed.
If you want to filter on one of the fields in the metadata, you need to specify it using dot (.) notation.
For example, to filter on the source field in the metadata, you need to specify metadata.source.
Note that the filter needs to be supported by the Search Index.
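As a sketch of a pre-filtered search, the helper below builds a `MatchQuery` from the Couchbase Python SDK on a metadata field and passes it through the `filter` parameter. The helper names are hypothetical, and the example assumes `metadata.source` is mapped in the Search index.

```python
def metadata_field(name):
    """Return the dotted path used to address a field inside metadata."""
    return f"metadata.{name}"


def search_by_source(vector_store, query, source, k=4):
    """Restrict a scored similarity search to documents from one source."""
    # Lazy import: the Couchbase SDK is only needed when a filter is built.
    from couchbase import search

    source_filter = search.MatchQuery(source, field=metadata_field("source"))
    return vector_store.similarity_search_with_score(
        query, k=k, filter=source_filter
    )
```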
Specifying fields to return
You can specify the fields to return from the document using the fields parameter in the searches. These fields are returned as part of the metadata object in the returned Document. You can fetch any field that is stored in the Search index. The text_key of the document is returned as part of the document’s page_content.
If you do not specify any fields to be fetched, all the fields stored in the index are returned.
If you want to fetch one of the fields in the metadata, you need to specify it using dot (.) notation.
For example, to fetch the source field in the metadata, you need to specify metadata.source.
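For illustration, the `fields` parameter can be filled from a list of metadata field names. The helper name is hypothetical. It simply prefixes each name with `metadata.` before calling `similarity_search`.

```python
def search_returning(vector_store, query, metadata_fields, k=4):
    """Search and return only the named metadata fields with each document."""
    fields = [f"metadata.{name}" for name in metadata_fields]
    return vector_store.similarity_search(query, k=k, fields=fields)
```

Calling `search_returning(store, "some query", ["source"])` asks the store to return only `metadata.source` alongside the page content.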
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains. Here is how to transform your vector store into a retriever and then invoke the retriever with a simple query and filter.
Hybrid queries
Couchbase allows you to do hybrid searches by combining Vector Search results with searches on non-vector fields of the document like the metadata object.
The results will be based on the combination of the results from both Vector Search and the searches supported by Search Service. The scores of each of the component searches are added up to get the total score of the result.
To perform hybrid searches, there is an optional parameter, search_options that can be passed to all the similarity searches.
The different search/query possibilities for the search_options can be found here.
Create Diverse Metadata for Hybrid Search
In order to demonstrate hybrid search, let us create documents with diverse metadata. We add three fields to the metadata: date between 2010 & 2020, rating between 1 & 5, and author set to either John Doe or Jane Doe.
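Exact matches can be queried on a text field such as the author in the metadata object.
Date ranges can be queried on a date field such as metadata.date.
Numeric ranges can be queried on a numeric field such as metadata.rating.
If every result must satisfy a condition, use the filter parameter instead of hybrid search.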
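The query shapes mentioned above can be expressed as plain dictionaries in the Couchbase Search JSON query syntax and passed through `search_options`. The builder function names are hypothetical. The inclusive/exclusive bounds shown are one possible choice.

```python
def exact_match(field, value):
    """Match query on a text field, e.g. metadata.author."""
    return {"query": {"match": value, "field": field}}


def date_range(field, start, end):
    """Date range query: start inclusive, end exclusive."""
    return {
        "query": {
            "start": start,
            "end": end,
            "inclusive_start": True,
            "inclusive_end": False,
            "field": field,
        }
    }


def numeric_range(field, low, high):
    """Numeric range query with both bounds inclusive."""
    return {
        "query": {
            "min": low,
            "max": high,
            "inclusive_min": True,
            "inclusive_max": True,
            "field": field,
        }
    }


def hybrid_search(vector_store, query, search_options, k=4):
    """Combine vector similarity with a Search query via search_options."""
    return vector_store.similarity_search(query, k=k, search_options=search_options)
```

For example, `hybrid_search(store, "What are the company's plans?", numeric_range("metadata.rating", 3, 5))` blends vector similarity with a rating-range query.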
Combining Hybrid Search Query with Filters
Hybrid Search can be combined with filters to get the best of both hybrid search and the filters for results matching the requirements.
In this example, we are checking for documents with a rating between 3 & 5 and matching the string “market” in the text field.
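One way this combination might look is sketched below: the text match on "market" is applied as a pre-filter, and the rating range is supplied as the hybrid query. Which condition goes into `filter` and which into `search_options` is a choice made for this sketch. The constant and function names are hypothetical.

```python
# Hybrid component: numeric range on the rating stored in metadata.
RATING_3_TO_5 = {
    "query": {
        "min": 3,
        "max": 5,
        "inclusive_min": True,
        "inclusive_max": True,
        "field": "metadata.rating",
    }
}


def market_docs_rated_3_to_5(vector_store, query, k=4):
    """Scored search: pre-filter on 'market' in text, hybrid on rating."""
    # Lazy import: the Couchbase SDK is only needed when the filter is built.
    from couchbase import search

    text_filter = search.MatchQuery("market", field="text")
    return vector_store.similarity_search_with_score(
        query,
        k=k,
        search_options=RATING_3_TO_5,
        filter=text_filter,
    )
```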
Other query types supported by the Search service can be passed in the same way through the search_options parameter. Please refer to the documentation for more details on the available query methods and their syntax.
Usage for retrieval-augmented generation
For guides on how to use these vector stores for retrieval-augmented generation (RAG), see the following sections:
Frequently Asked Questions
Question: Should I create the search index before creating the CouchbaseSearchVectorStore object?
Yes, you need to create the Search index before creating the CouchbaseSearchVectorStore object.
Question: Should I create the index before or after adding documents to CouchbaseQueryVectorStore?
For CouchbaseQueryVectorStore, you should create the index after adding documents using the create_index() method. This is different from CouchbaseSearchVectorStore.
Question: What is the difference between CouchbaseSearchVectorStore and CouchbaseQueryVectorStore?
| Feature | CouchbaseSearchVectorStore | CouchbaseQueryVectorStore |
|---|---|---|
| Minimum Version | Couchbase Server 7.6+ | Couchbase Server 8.0+ |
| Index Type | Search Vector Index | Hyperscale or Composite Vector Index |
| Index Creation | Before vector store creation | After adding documents |
| Filtering | SearchQuery objects | SQL++ WHERE clauses (where_str) |
| Best For | Hybrid searches (vector + FTS + geo) | Large-scale pure vector searches or vector + scalar filters |
Question: I am not seeing all the fields that I specified in my search results
In Couchbase, we can only return the fields stored in the Search index. Please ensure that the field that you are trying to access in the search results is part of the Search index. One way to handle this is to index and store a document’s fields dynamically in the index.
- In Capella, you need to go to “Advanced Mode” then under the chevron “General Settings” you can check “[X] Store Dynamic Fields” or “[X] Index Dynamic Fields”
- In Couchbase Server, in the Index Editor (not Quick Editor) under the chevron “Advanced” you can check “[X] Store Dynamic Fields” or “[X] Index Dynamic Fields”
Question: I am unable to see the metadata object in my search results
This is most likely due to the metadata field in the document not being indexed and/or stored by the Couchbase Search index. In order to index the metadata field in the document, you need to add it to the index as a child mapping.
If you choose to map all the fields in the mapping, you will be able to search by all metadata fields. Alternatively, to optimize the index, you can select the specific fields inside the metadata object to be indexed. You can refer to the docs to learn more about indexing child mappings.
Creating Child Mappings