Why We Did This Benchmark
We were tired of choosing vector databases based on marketing materials and Twitter hype. Every RAG project starts with the same question: which vector DB? And the answer is usually whatever the team used last time. So we decided to actually measure things with realistic data and query patterns from our production systems.
Our test dataset: 1.2 million document chunks from a legal document corpus, embedded with OpenAI text-embedding-3-small (1536 dimensions). Queries were 5,000 real user questions from a production system. We tested on equivalent infrastructure — the closest we could get to apples-to-apples.
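For readers who want to reproduce the numbers below: recall@10 is the fraction of the true top-10 neighbors (from exact brute-force search) that an engine returns, and latencies are per-query percentiles. A minimal sketch of the harness logic, with toy data standing in for our 5,000 production queries:

```python
import statistics

def recall_at_k(ground_truth, retrieved, k=10):
    """Mean fraction of the exact top-k neighbors each engine returned."""
    hits = [len(set(gt[:k]) & set(got[:k])) / k
            for gt, got in zip(ground_truth, retrieved)]
    return statistics.mean(hits)

def percentile(latencies_ms, p):
    """Nearest-rank percentile over per-query latencies."""
    ordered = sorted(latencies_ms)
    idx = min(len(ordered) - 1, max(0, round(p / 100 * len(ordered)) - 1))
    return ordered[idx]

# Toy example; real runs compare each engine's top-10 against brute force.
gt = [["a", "b", "c"], ["d", "e", "f"]]
got = [["a", "b", "x"], ["d", "e", "f"]]
print(recall_at_k(gt, got, k=3))          # → 0.8333...
print(percentile([40, 45, 50, 120], 50))  # → 45
```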
Pinecone: The Safe Choice (With Caveats)
Pinecone was the easiest to set up. Their serverless tier is genuinely zero-ops — you don't think about infrastructure at all. Query latency was consistent: p50 at 45ms, p99 at 120ms. For a managed service, that's solid. Recall@10 was 0.94 on our test set.
The downside: cost. At our data volume, we were looking at roughly $250/month on the standard plan. That's fine for a well-funded startup, but for the Indian market where we build many projects, clients push back on recurring infrastructure costs denominated in dollars. Also, Pinecone's metadata filtering has limitations — complex boolean queries slow things down significantly, and there's a hard limit on metadata field count.
Weaviate: Powerful but Operationally Heavy
Weaviate is impressive. Hybrid search (combining vector and keyword search) is built-in and actually works well — our recall@10 jumped to 0.97 with hybrid mode enabled. The GraphQL API is flexible, and the multi-tenancy support is the best of any vector DB we've tested. Query latency: p50 at 38ms, p99 at 95ms. Fastest of the four.
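The hybrid mode blends BM25 keyword scores with vector similarity, weighted by an alpha parameter (Weaviate defaults to 0.75, favoring the vector side). A rough sketch of alpha-weighted fusion over min-max-normalized scores, similar in spirit to Weaviate's relative score fusion — the document IDs and scores here are made up for illustration:

```python
def normalize(scores):
    """Min-max normalize a {doc_id: score} map to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (s - lo) / span for d, s in scores.items()}

def hybrid_fuse(vector_scores, keyword_scores, alpha=0.75):
    """alpha=1.0 -> pure vector ranking, alpha=0.0 -> pure keyword."""
    v, k = normalize(vector_scores), normalize(keyword_scores)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0)
             for d in set(v) | set(k)}
    return sorted(fused, key=fused.get, reverse=True)

vec = {"doc1": 0.91, "doc2": 0.88, "doc3": 0.40}   # cosine similarities
bm25 = {"doc2": 12.5, "doc4": 9.1, "doc1": 2.0}    # raw BM25 scores
print(hybrid_fuse(vec, bm25))  # doc2 ranks first: strong on both signals
```

The intuition for the recall jump: keyword search rescues queries with exact legal terms that embeddings blur, while vector search rescues paraphrases that BM25 misses.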
But running Weaviate in production is work. We self-hosted on AWS EKS and spent two days just getting the Helm charts configured correctly. Memory usage is aggressive — plan for 2-3x your data size in RAM. We had an OOM incident on a 16GB instance that took down the cluster at 2 AM. Their cloud offering (Weaviate Cloud Services) is easier but pricey — more expensive than Pinecone for equivalent usage.
Chroma: Great for Prototyping, Not for Production
Hot take: Chroma is fantastic for local development and prototyping. The Python API is clean, it runs embedded in your process, and you can go from zero to working RAG in 15 minutes. We love it for hackathons and proof-of-concepts.
For production though, we can't recommend it yet. At 1.2M vectors, query latency was p50 at 180ms, p99 at 600ms. The single-node architecture means no horizontal scaling. Persistence has been flaky in our testing — we lost data twice during unexpected shutdowns. The Chroma team is working on a cloud offering that might solve these issues, but as of mid-2024, it's not there yet.
pgvector: The Dark Horse That Won Us Over
Here's the surprise: for most of our projects, we now default to pgvector. Not because it's the best vector database — it isn't. But because it's good enough, and the operational benefits are massive. You're already running PostgreSQL. Your team already knows PostgreSQL. Your backups, monitoring, connection pooling, and HA setup already cover it.
Performance: with HNSW indexing (added in pgvector 0.5.0), query latency is p50 at 55ms, p99 at 150ms on our 1.2M vector dataset. Recall@10 was 0.92 — slightly lower than Pinecone and Weaviate, but tunable: raising m and ef_construction at index build time, or hnsw.ef_search at query time, trades latency for recall. The killer feature is joining vector search results with your relational data in a single query. No separate infrastructure, no sync pipeline, no consistency issues.
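That single-query join looks roughly like this — the chunks/cases schema is hypothetical, and `<=>` is pgvector's cosine distance operator. The helper below just mirrors what `<=>` computes, handy for sanity-checking results outside the database:

```python
import math

# Hypothetical schema: chunks(id, case_id, body, embedding vector(1536))
# joined against an ordinary relational table in one statement.
QUERY = """
SELECT c.body, cases.title, c.embedding <=> %(q)s::vector AS distance
FROM chunks c
JOIN cases ON cases.id = c.case_id
WHERE cases.jurisdiction = %(jur)s      -- relational filter, no sync pipeline
ORDER BY c.embedding <=> %(q)s::vector  -- served by the HNSW index
LIMIT 10;
"""

def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # → 0.0 (identical direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # → 1.0 (orthogonal)
```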
We run pgvector on Supabase for most projects now. The cost is a fraction of dedicated vector DB services, and the developer experience is excellent.
Our Recommendation
If you're starting a new project: try pgvector first. If you hit performance limits at scale, move to Weaviate for self-hosted or Pinecone for managed. Use Chroma for prototyping only. And honestly, don't stress about this decision too much early on — the abstraction layer in LangChain or LlamaIndex makes switching relatively painless.
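The switching story works because every one of these stores reduces to the same narrow interface. If you'd rather not depend on LangChain or LlamaIndex, a thin adapter of your own buys the same portability — a sketch, with class and method names that are ours, not any library's:

```python
from typing import Protocol

class VectorStore(Protocol):
    def upsert(self, ids: list[str], vectors: list[list[float]]) -> None: ...
    def search(self, vector: list[float], k: int) -> list[str]: ...

class InMemoryStore:
    """Stand-in backend; swap in a Pinecone/Weaviate/pgvector adapter later."""
    def __init__(self):
        self._rows: dict[str, list[float]] = {}

    def upsert(self, ids, vectors):
        self._rows.update(zip(ids, vectors))

    def search(self, vector, k):
        def dist(v):  # squared L2; monotonic with L2, fine for ranking
            return sum((a - b) ** 2 for a, b in zip(v, vector))
        return sorted(self._rows, key=lambda i: dist(self._rows[i]))[:k]

store: VectorStore = InMemoryStore()
store.upsert(["a", "b"], [[1.0, 0.0], [0.0, 1.0]])
print(store.search([0.9, 0.1], k=1))  # → ['a']
```

Application code that only ever touches the VectorStore interface is what makes the later pgvector-to-Weaviate migration a one-file change instead of a rewrite.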