Reduce your Pinecone spend instantly
If you are working with large namespaces and have lots of upserts, I highly recommend checking out https://turbopuffer.com/. It is much faster and cheaper than standard vector DBs.

Turbopuffer is a serverless vector and full-text search engine built on top of object storage (such as S3). It is designed to be fast, roughly 10x cheaper than traditional vector databases, and highly scalable. It is used in production by companies like Cursor, Anthropic, Notion, Linear, Atlassian, Ramp, and Grammarly, handling over 2.5 trillion documents, 10M+ writes/s, and 10k+ queries/s.

Turbopuffer supports three search modes: vector search, full-text search (BM25), and hybrid search combining both. For vector search, it uses a centroid-based approximate nearest neighbor (ANN) index based on a system called SPFresh. On a cold query, the centroid index is downloaded from object storage first; the closest centroids then identify which clusters of vectors to fetch, so only the relevant clusters are pulled rather than the entire dataset. This keeps cold queries feasible even on very large datasets. For full-text search, it uses an inverted index with BM25 scoring. Both index types also support metadata filtering.

The system is focused on first-stage retrieval: efficiently narrowing millions of documents down to a manageable set of candidates, which can then be re-ranked or processed further downstream.
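To make the centroid-based ANN idea concrete, here is a toy sketch in plain Python. This is an illustration of the general technique (cluster vectors, scan only the clusters whose centroids are nearest to the query), not turbopuffer's or SPFresh's actual implementation; every name and parameter here (`build_clusters`, `ann_query`, `n_probe`, etc.) is made up for the example:

```python
import math
import random

random.seed(0)

def dist(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_clusters(vectors, k):
    # Crude clustering for illustration: pick k vectors as fixed centroids,
    # then assign every vector to its nearest centroid.
    centroids = random.sample(vectors, k)
    clusters = {i: [] for i in range(k)}
    for v in vectors:
        i = min(range(k), key=lambda i: dist(v, centroids[i]))
        clusters[i].append(v)
    return centroids, clusters

def ann_query(query, centroids, clusters, n_probe=2, top_k=3):
    # Step 1: scan only the small centroid index (this is what a cold
    # query would download first) and pick the n_probe nearest centroids.
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    # Step 2: fetch only those clusters' vectors and rank them exactly;
    # the other clusters are never touched.
    candidates = []
    for i in order[:n_probe]:
        candidates.extend(clusters[i])
    return sorted(candidates, key=lambda v: dist(query, v))[:top_k]

vectors = [[random.random() for _ in range(4)] for _ in range(200)]
centroids, clusters = build_clusters(vectors, k=8)
query = [0.5] * 4
results = ann_query(query, centroids, clusters)
probed = sorted(range(8), key=lambda i: dist(query, centroids[i]))[:2]
scanned = sum(len(clusters[i]) for i in probed)
print(f"scanned {scanned} of {len(vectors)} vectors")
```

The point of the two-step structure is that the centroid index is tiny compared to the full dataset, so the expensive exact scan touches only a fraction of the vectors.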
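For the full-text side, the inverted index + BM25 scoring mentioned above can be sketched the same way. Again, this is a minimal toy scorer over a hard-coded corpus, not turbopuffer's implementation; the corpus, the `k1`/`b` values, and the whitespace tokenizer are all assumptions for the example:

```python
import math

# Toy corpus
docs = [
    "serverless vector search on object storage",
    "full text search with bm25 ranking",
    "vector databases can be fast and cheap",
]
k1, b = 1.5, 0.75  # standard BM25 tuning constants

tokenized = [d.split() for d in docs]
avgdl = sum(len(t) for t in tokenized) / len(tokenized)

# Inverted index: term -> {doc_id: term frequency}
index = {}
for doc_id, terms in enumerate(tokenized):
    for t in terms:
        index.setdefault(t, {})
        index[t][doc_id] = index[t].get(doc_id, 0) + 1

def bm25(query):
    # Score only the documents that appear in some query term's posting
    # list; documents sharing no terms with the query are never visited.
    scores = {}
    n = len(docs)
    for term in query.split():
        postings = index.get(term, {})
        idf = math.log((n - len(postings) + 0.5) / (len(postings) + 0.5) + 1)
        for doc_id, tf in postings.items():
            dl = len(tokenized[doc_id])
            s = idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
            scores[doc_id] = scores.get(doc_id, 0.0) + s
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(bm25("bm25 search"))
```

Note how the posting lists give the same first-stage-retrieval shape as the vector path: most of the corpus is skipped, and only a small candidate set comes out ranked.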