TL;DR: Parquet is awesome, and so is DuckDB!
Datasets on the HF中国镜像站 Hub rely on Parquet files. We can interact with these files using DuckDB as a fast in-memory database system. One of DuckDB’s features is vector similarity search, which can be used with or without an index.
blog:
https://huggingface.co/learn/cookbook/vector_search_with_hub_as_backend
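
To make the "without an index" case concrete, here is a minimal, standard-library-only sketch of what an exact similarity search boils down to: score every row against the query with cosine similarity and keep the top k. This is an illustration, not DuckDB's implementation. The names `cosine_similarity`, `top_k`, and `rows`, plus the toy 3-dimensional embeddings, are made up for the example; real embeddings would come from a model and live in a Parquet column.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Exact (index-free) search: score every row, sort, keep the best k."""
    scored = [(cosine_similarity(query, emb), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy data: (text, embedding) pairs standing in for dataset rows.
rows = [
    ("parquet file format", [0.9, 0.1, 0.0]),
    ("duckdb in-memory database", [0.1, 0.9, 0.1]),
    ("vector similarity search", [0.0, 0.2, 0.9]),
]

results = top_k([0.0, 0.1, 1.0], rows, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Every query touches every row, so the cost grows linearly with the dataset. An index trades some exactness for speed by avoiding that full scan.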