VecDB: Simple Vector Embedding Database Tool

Category: Go

2024-09-25

Project Overview

GitHub Stats	Value
Stars	31
Forks	1
Language	Go
Created	2024-07-08
License	MIT License

Introduction

VecDB is a simple vector embedding database for finding items similar to the one you are searching for. It works much like a hash table, except that a search returns the stored vectors most similar to the query rather than an exact match. It was created by a database enthusiast as a fun learning project, though it can also be used in production environments. It uses a {key => value} data model, where the key is a unique identifier and the value is the vector itself, represented as a list of floats. The database is configured through a config.yml file, which controls settings such as the HTTP server address and the storage driver. It is also a useful project for learning how vector embedding databases work.

Key Features

Data Model

Uses a {key => value} model where key is a unique identifier and value is the vector (a list of floats).
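The model can be pictured as a map from keys to float slices. The following is a minimal in-memory sketch, not VecDB's actual storage code: the `Store` and `Vector` types are hypothetical names, and grouping keys by bucket is an assumption based on the `bucket` field in the request payloads.

```go
package main

import "fmt"

// Vector is the value side of the model: a list of floats.
type Vector []float64

// Store is a hypothetical in-memory stand-in for the {key => value} model,
// grouped by bucket. VecDB itself persists data through a storage driver.
type Store map[string]map[string]Vector

// Put stores vec under key in bucket, replacing any value already there.
func (s Store) Put(bucket, key string, vec Vector) {
	if s[bucket] == nil {
		s[bucket] = make(map[string]Vector)
	}
	s[bucket][key] = vec
}

// Get returns the vector stored under key in bucket, if any.
func (s Store) Get(bucket, key string) (Vector, bool) {
	vec, ok := s[bucket][key]
	return vec, ok
}

func main() {
	s := Store{}
	s.Put("products", "product-id-1", Vector{1.929292, 0.3848484, -1.9383838383})
	vec, ok := s.Get("products", "product-id-1")
	fmt.Println(len(vec), ok)
}
```

An exact-key lookup like `Get` is the hash-table half of the picture; similarity search is what a vector database adds on top of it.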

Configurations

Configured via a config.yml file, with options for the HTTP server, the storage driver (currently supports BoltDB), and the embedder (currently supports Gemini).

Components

Raw Vectors Layer: Allows writing and searching vectors using endpoints like POST /v1/vectors/write and POST /v1/vectors/search.
Embedding Layer (Optional): Enables text embedding and search using endpoints like POST /v1/embeddings/text/write and POST /v1/embeddings/text/search.
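As a rough illustration of calling these endpoints from Go, the sketch below builds a JSON POST request with the standard library. The helper name `newJSONPost`, the base URL `http://localhost:3000`, and the `Content-Type` header are assumptions for the example, not details confirmed by the project.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newJSONPost builds a POST request that carries payload as JSON.
// baseURL is an assumption; use whatever address your server listens on.
func newJSONPost(baseURL, path string, payload any) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("encode payload: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+path, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newJSONPost("http://localhost:3000", "/v1/vectors/write", map[string]any{
		"bucket": "products",
		"key":    "product-id-1",
		"vector": []float64{1.929292, 0.3848484},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it against a running server: resp, err := http.DefaultClient.Do(req)
}
```

The same helper would work for the embedding endpoints by changing the path and payload.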

Requests

Supports various request types:
- VectorWriteRequest: Store a vector.
- VectorSearchRequest: Search for similar vectors based on cosine similarity.
- TextEmbeddingWriteRequest: Store text as a vector using an embedder.
- TextEmbeddingSearchRequest: Search for similar vectors based on text content.
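Cosine similarity is the dot product of two vectors divided by the product of their magnitudes. The sketch below shows the formula plus a brute-force search over a small map. It is an illustration only: `CosineSimilarity`, `Search`, `Match`, and the `minSim` and `maxResults` parameters are hypothetical names, and how VecDB actually scans or indexes vectors is not described here.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// CosineSimilarity returns dot(a, b) / (|a| * |b|).
// It returns 0 when the lengths differ or either vector has zero magnitude.
func CosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// Match pairs a key with its similarity to the query.
type Match struct {
	Key        string
	Similarity float64
}

// Search compares query against every stored vector, keeps those scoring
// at least minSim, and returns at most maxResults matches, best first.
func Search(items map[string][]float64, query []float64, minSim float64, maxResults int) []Match {
	var matches []Match
	for key, vec := range items {
		if sim := CosineSimilarity(query, vec); sim >= minSim {
			matches = append(matches, Match{Key: key, Similarity: sim})
		}
	}
	sort.Slice(matches, func(i, j int) bool {
		return matches[i].Similarity > matches[j].Similarity
	})
	if maxResults >= 0 && len(matches) > maxResults {
		matches = matches[:maxResults]
	}
	return matches
}

func main() {
	items := map[string][]float64{
		"a": {1, 0},
		"b": {1, 1},
		"c": {-1, 0},
	}
	for _, m := range Search(items, []float64{1, 0}, 0.5, 10) {
		fmt.Printf("%s %.3f\n", m.Key, m.Similarity)
	}
}
```

With the sample data, "a" points in the same direction as the query, "b" sits at 45 degrees, and "c" points the opposite way, so only "a" and "b" pass a 0.5 threshold.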

Installation

Available as a binary or Docker image.

VecDB is designed for both fun and learning, with the potential for use in production environments.

Real-World Applications

Product Recommendation System

You can use VecDB to build a product recommendation system. Here’s how:

Store Product Vectors: Send VectorWriteRequest to store product vectors with unique keys (e.g., product IDs).

json5

{
  "bucket": "products",
  "key": "product-id-1",
  "vector": [1.929292, 0.3848484, -1.9383838383, ...]
}

Search Similar Products: Use VectorSearchRequest to find products similar to a given product vector.

json5

{
  "bucket": "products",
  "vector": [1.929292, 0.3848484, -1.9383838383, ...],
  "min_cosine_similarity": 0.5,
  "max_result_count": 10
}
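For a Go client, the two payloads above map naturally onto structs with JSON tags. This is a sketch that assumes the field names shown in the examples; the struct definitions and the `encode` helper are written for this article and are not taken from the VecDB source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VectorWriteRequest mirrors the write payload shown above.
type VectorWriteRequest struct {
	Bucket string    `json:"bucket"`
	Key    string    `json:"key"`
	Vector []float64 `json:"vector"`
}

// VectorSearchRequest mirrors the search payload shown above.
type VectorSearchRequest struct {
	Bucket              string    `json:"bucket"`
	Vector              []float64 `json:"vector"`
	MinCosineSimilarity float64   `json:"min_cosine_similarity"`
	MaxResultCount      int       `json:"max_result_count"`
}

// encode is a hypothetical helper that returns v as a JSON string.
func encode(v any) (string, error) {
	body, err := json.Marshal(v)
	return string(body), err
}

func main() {
	write := VectorWriteRequest{
		Bucket: "products",
		Key:    "product-id-1",
		Vector: []float64{1.929292, 0.3848484},
	}
	search := VectorSearchRequest{
		Bucket:              "products",
		Vector:              write.Vector,
		MinCosineSimilarity: 0.5,
		MaxResultCount:      10,
	}
	for _, req := range []any{write, search} {
		body, err := encode(req)
		if err != nil {
			panic(err)
		}
		fmt.Println(body)
	}
}
```

A recommendation flow would write one vector per product, then search with the vector of the product a user is currently viewing.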

Text Similarity Search

If you enable the embedder, you can search for similar texts:

Store Text Embeddings: Send TextEmbeddingWriteRequest to store text embeddings.

json5

{
  "bucket": "texts",
  "key": "text-id-1",
  "content": "This is some text representing the product"
}

Search Similar Texts: Use TextEmbeddingSearchRequest to find texts similar to a given text.

json5

{
  "bucket": "texts",
  "content": "A Product Text",
  "min_cosine_similarity": 0.5,
  "max_result_count": 10
}
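The text payloads differ from the vector ones only in carrying `content` instead of `vector`. The sketch below decodes the search payload above into a Go struct, which can be handy for validating hand-written requests before sending them. The struct definitions and the `decodeSearch` helper are assumptions based on the example fields, not code from the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TextEmbeddingWriteRequest mirrors the text write payload shown above.
type TextEmbeddingWriteRequest struct {
	Bucket  string `json:"bucket"`
	Key     string `json:"key"`
	Content string `json:"content"`
}

// TextEmbeddingSearchRequest mirrors the text search payload shown above.
type TextEmbeddingSearchRequest struct {
	Bucket              string  `json:"bucket"`
	Content             string  `json:"content"`
	MinCosineSimilarity float64 `json:"min_cosine_similarity"`
	MaxResultCount      int     `json:"max_result_count"`
}

// decodeSearch is a hypothetical helper that parses a JSON search payload.
func decodeSearch(raw string) (TextEmbeddingSearchRequest, error) {
	var req TextEmbeddingSearchRequest
	err := json.Unmarshal([]byte(raw), &req)
	return req, err
}

func main() {
	raw := `{"bucket":"texts","content":"A Product Text","min_cosine_similarity":0.5,"max_result_count":10}`
	req, err := decodeSearch(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", req)
}
```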

Configuration and Deployment

Configure VecDB: Use a config.yml file to set up the server, storage, and embedder settings.

yaml

server:
  listen: "0.0.0.0:3000"
store:
  driver: "bolt"
  args:
    database: "./vec.db"
embedder:
  enabled: true
  driver: gemini
  args:
    api_key: "${GEMINI_API_KEY}"
    text_embedding_model: "text-embedding-004"
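The `${GEMINI_API_KEY}` value reads like an environment variable placeholder. Whether and how VecDB expands it is not stated here, so the following is only a sketch of how such substitution can be done in Go with the standard library's `os.Expand`; the `expandPlaceholders` helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
)

// expandPlaceholders replaces ${NAME} (and $NAME) references in raw with
// the value returned by lookup. Passing os.Getenv reads the environment.
func expandPlaceholders(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	line := `api_key: "${GEMINI_API_KEY}"`
	// An unset variable expands to an empty string.
	fmt.Println(expandPlaceholders(line, os.Getenv))
}
```

Keeping the key in the environment rather than in config.yml avoids committing the secret alongside the configuration file.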

Deploy Using Docker: You can deploy VecDB using a Docker image for easy setup and management.

Exploring and Benefiting from the Repository

Conclusion

Key Points

Simple Vector Embedding Database: VecDB works like a hash table that returns items similar to the search item, based on vector embeddings.
Data Model: Uses {key => value} model with unique keys and vector values.
Configurable: Supports custom configurations via config.yml for server, storage, and embedder settings.
Components:
- Raw Vectors Layer: Allows writing and searching vectors.
- Embedding Layer: Optionally generates and stores vectors from text using embedders like Gemini.
Requests: Supports various request types for writing and searching vectors and text embeddings.

Future Potential

Production Use: Can be used in production environments for similarity searches.
Customization: Flexible configuration options allow for adaptation to different use cases.
Extensibility: Potential to support additional embedder drivers and storage solutions.
Ease of Use: Simple and straightforward API for vector and text embedding operations.

Installation

Available as a binary or Docker image for easy deployment.

To explore the project further, see the original alash3al/vecdb repository on GitHub.

Attributions

Content derived from the alash3al/vecdb repository on GitHub. Original materials are licensed under their respective terms.
