VecDB: Simple Vector Embedding Database Tool
Project Overview
| GitHub Stats | Value |
|---|---|
| Stars | 31 |
| Forks | 1 |
| Language | Go |
| Created | 2024-07-08 |
| License | MIT License |
Introduction
VecDB is a simple vector embedding database that finds items similar to the one you are searching for, with an interface much like a hash table. Created by a database enthusiast as a fun learning project, VecDB can also be used in production environments. It uses a {key => value} data model, where the key is a unique identifier and the value is the vector itself, represented as a list of floats. The database is configured through a config.yml file, which lets you customize settings such as the HTTP server address and storage driver. Exploring VecDB can provide valuable insights into vector embedding databases and their applications.
Key Features
Overview
VecDB is a simple vector embedding database that functions like a hash table, allowing you to find items similar to the one you are searching for.
Data Model
- Uses a {key => value} model where key is a unique identifier and value is the vector (a list of floats).
Configurations
- Configurable via a config.yml file, with options for the HTTP server, storage driver (currently supports BoltDB), and embedder settings (supports Gemini).
Components
- Raw Vectors Layer: Allows writing and searching vectors using endpoints like POST /v1/vectors/write and POST /v1/vectors/search.
- Embedding Layer (Optional): Enables text embedding and search using endpoints like POST /v1/embeddings/text/write and POST /v1/embeddings/text/search.
Requests
- Supports various request types:
  - VectorWriteRequest: Store a vector.
  - VectorSearchRequest: Search for similar vectors based on cosine similarity.
  - TextEmbeddingWriteRequest: Store text as a vector using an embedder.
  - TextEmbeddingSearchRequest: Search for similar vectors based on text content.
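To make the cosine-similarity search concrete, here is a minimal Go sketch. It is not VecDB's actual implementation: the Match type, the function names, and the brute-force scan over an in-memory map are all assumptions chosen for illustration. It scores every stored vector against the query as dot(a, b) / (|a| * |b|), drops results below a minimum similarity, and caps the result count.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Match is a hypothetical result type: a stored key and its similarity to the query.
type Match struct {
	Key        string
	Similarity float64
}

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// It returns 0 when the lengths differ, the vectors are empty,
// or either vector has zero magnitude.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// search scans every vector in the bucket (brute force), keeps those at or
// above minSimilarity, and returns at most maxResults matches, best first.
// Ties are broken by key so the order is deterministic.
func search(bucket map[string][]float64, query []float64, minSimilarity float64, maxResults int) []Match {
	matches := []Match{}
	for key, vec := range bucket {
		if s := cosineSimilarity(query, vec); s >= minSimilarity {
			matches = append(matches, Match{Key: key, Similarity: s})
		}
	}
	sort.Slice(matches, func(i, j int) bool {
		if matches[i].Similarity != matches[j].Similarity {
			return matches[i].Similarity > matches[j].Similarity
		}
		return matches[i].Key < matches[j].Key
	})
	if maxResults >= 0 && len(matches) > maxResults {
		matches = matches[:maxResults]
	}
	return matches
}

func main() {
	// Illustrative two-dimensional vectors; real embeddings are much longer.
	bucket := map[string][]float64{
		"product-id-1": {1, 0},
		"product-id-2": {1, 1},
		"product-id-3": {-1, 0},
	}
	for _, m := range search(bucket, []float64{1, 0}, 0.5, 10) {
		fmt.Printf("%s %.3f\n", m.Key, m.Similarity)
	}
}
```

Run as written, this prints product-id-1 with a score of 1.000 and product-id-2 with 0.707; product-id-3 points in the opposite direction (score -1) and falls below the 0.5 threshold.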
Installation
- Available as a binary or Docker image.
VecDB is designed for both fun and learning, with the potential for use in production environments.
Real-World Applications
Product Recommendation System
You can use VecDB to build a product recommendation system. Here’s how:
- Store Product Vectors: Send a VectorWriteRequest to store product vectors with unique keys (e.g., product IDs).
    { "bucket": "products", "key": "product-id-1", "vector": [1.929292, 0.3848484, -1.9383838383, ...] }
- Search Similar Products: Use a VectorSearchRequest to find products similar to a given product vector.
    { "bucket": "products", "vector": [1.929292, 0.3848484, -1.9383838383, ...], "min_cosine_similarity": 0.5, "max_result_count": 10 }
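The two payloads above can be built from Go with the standard library alone. The sketch below is a hedged illustration rather than an official client: the struct names reuse the request names from this article, the JSON field names come from the sample payloads, and the Go field names, the newJSONPost helper, the Content-Type header, and the http://localhost:3000 base URL are assumptions. It only constructs the request; the response format is not described here, so no response handling is shown.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// VectorWriteRequest mirrors the sample JSON body for POST /v1/vectors/write.
type VectorWriteRequest struct {
	Bucket string    `json:"bucket"`
	Key    string    `json:"key"`
	Vector []float64 `json:"vector"`
}

// VectorSearchRequest mirrors the sample JSON body for POST /v1/vectors/search.
type VectorSearchRequest struct {
	Bucket              string    `json:"bucket"`
	Vector              []float64 `json:"vector"`
	MinCosineSimilarity float64   `json:"min_cosine_similarity"`
	MaxResultCount      int       `json:"max_result_count"`
}

// newJSONPost is a hypothetical helper that builds (but does not send)
// a POST request whose body is the JSON encoding of payload.
func newJSONPost(baseURL, path string, payload any) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumption: the server expects a JSON content type.
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumption: VecDB is reachable locally on port 3000.
	const baseURL = "http://localhost:3000"

	search := VectorSearchRequest{
		Bucket:              "products",
		Vector:              []float64{1.929292, 0.3848484, -1.9383838383},
		MinCosineSimilarity: 0.5,
		MaxResultCount:      10,
	}
	req, err := newJSONPost(baseURL, "/v1/vectors/search", search)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

Keeping request construction separate from sending makes the payload easy to inspect or test without a running server.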
Text Similarity Search
If you enable the embedder, you can search for similar texts:
- Store Text Embeddings: Send a TextEmbeddingWriteRequest to store text embeddings.
    { "bucket": "texts", "key": "text-id-1", "content": "This is some text representing the product" }
- Search Similar Texts: Use a TextEmbeddingSearchRequest to find texts similar to a given text.
    { "bucket": "texts", "content": "A Product Text", "min_cosine_similarity": 0.5, "max_result_count": 10 }
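Conceptually, the embedding layer turns text into a vector and then performs the same {key => vector} write as the raw vectors layer. The Go sketch below illustrates that layering under stated assumptions: the Embedder interface, the writeText function, and the toy letterCountEmbedder are hypothetical and do not reflect VecDB's internal code or the Gemini API. A real embedder would call a model instead of counting letters.

```go
package main

import (
	"fmt"
	"strings"
)

// Embedder is a hypothetical interface: anything that turns text into a vector.
type Embedder interface {
	Embed(text string) ([]float64, error)
}

// letterCountEmbedder is a toy stand-in for a real embedding model.
// It counts occurrences of the letters a-z, producing a 26-dimensional vector.
type letterCountEmbedder struct{}

func (letterCountEmbedder) Embed(text string) ([]float64, error) {
	vec := make([]float64, 26)
	for _, r := range strings.ToLower(text) {
		if r >= 'a' && r <= 'z' {
			vec[r-'a']++
		}
	}
	return vec, nil
}

// writeText embeds content and stores the resulting vector under key,
// the same {key => vector} write that the raw vectors layer performs.
func writeText(e Embedder, bucket map[string][]float64, key, content string) error {
	vec, err := e.Embed(content)
	if err != nil {
		return fmt.Errorf("embed %q: %w", key, err)
	}
	bucket[key] = vec
	return nil
}

func main() {
	texts := map[string][]float64{}
	err := writeText(letterCountEmbedder{}, texts, "text-id-1", "This is some text representing the product")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(texts["text-id-1"])) // dimension of the stored vector
}
```

Because the embedder sits behind an interface, swapping the toy implementation for a model-backed one would leave the storage and search code unchanged.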
Configuration and Deployment
- Configure VecDB: Use a config.yml file to set up the server, storage, and embedder settings.
    server:
      listen: "0.0.0.0:3000"
    store:
      driver: "bolt"
      args:
        database: "./vec.db"
    embedder:
      enabled: true
      driver: gemini
      args:
        api_key: "${GEMINI_API_KEY}"
        text_embedding_model: "text-embedding-004"
- Deploy Using Docker: You can deploy VecDB using a Docker image for easy setup and management.
Exploring and Benefiting from the Repository
Conclusion
Key Points
- Simple Vector Embedding Database: VecDB acts like a hash table that finds items similar to the search item based on vector embeddings.
- Data Model: Uses a {key => value} model with unique keys and vector values.
- Configurable: Supports custom configurations via config.yml for server, storage, and embedder settings.
- Components:
  - Raw Vectors Layer: Allows writing and searching vectors.
  - Embedding Layer: Optionally generates and stores vectors from text using embedders like Gemini.
- Requests: Supports various request types for writing and searching vectors and text embeddings.
Future Potential
- Production Use: Can be used in production environments for similarity searches.
- Customization: Flexible configuration options allow for adaptation to different use cases.
- Extensibility: Potential to support additional embedder drivers and storage solutions.
- Ease of Use: Simple and straightforward API for vector and text embedding operations.
Installation
- Available as a binary or Docker image for easy deployment.
To explore the project further, check out the original alash3al/vecdb repository.
Attributions
Content derived from the alash3al/vecdb repository on GitHub. Original materials are licensed under their respective terms.