ask.py: A Perplexity-like search-extract-summarize flow in Python
5 min read · Oct 30, 2024
I’ve been experimenting with ask.py to create a search-extract-summarize flow, something similar to what AI search engines like Perplexity offer.
While it’s a simplified version compared to what’s out there commercially, it’s a great way to get hands-on experience with the core concepts.
ask.py Workflow
Given a query, here’s the step-by-step process the program follows:
- Web Search: It starts by querying Google to fetch the top 10 relevant web pages.
- Content Scraping: It then crawls each of these pages to extract their textual content.
- Chunking and Storage: The extracted text is broken down into smaller chunks and stored in a vector database.
- Vector Similarity Search: Using the original query, it performs a vector search to find the top 10 matching text chunks.
- [Optional] Hybrid Search: It can also perform a full-text search and combine those results with the vector search.
- [Optional] Re-ranking: A reranker can be used to refine the ordering of the top chunks.
- Answer Generation: The selected chunks serve as context for an LLM (large language model) to generate a coherent answer.
- Output with References: Finally, it outputs the generated answer along with references to the source material.
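
To make the middle of that pipeline concrete, here is a minimal sketch of the chunking, vector search, and prompt-building steps. It is not ask.py's actual code: the function names (`chunk_text`, `embed`, `top_chunks`, `build_prompt`) are hypothetical, the web search and scraping steps are replaced by a hard-coded `pages` dict, and `embed` is a toy bag-of-words stand-in for a real embedding model and vector database. The final LLM call is left out, so the sketch stops at the prompt and the reference list.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of up to chunk_size words."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, pages, k=10):
    """Score every chunk of every page against the query and keep the best k.

    pages maps a URL to that page's extracted text.
    Returns a list of (score, url, chunk) tuples, best first.
    """
    q = embed(query)
    scored = [
        (cosine(q, embed(chunk)), url, chunk)
        for url, text in pages.items()
        for chunk in chunk_text(text)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, hits):
    """Turn the selected chunks into an LLM prompt plus a numbered reference list."""
    context = "\n".join(f"[{i}] {chunk}" for i, (_, _, chunk) in enumerate(hits, 1))
    refs = "\n".join(f"[{i}] {url}" for i, (_, url, _) in enumerate(hits, 1))
    prompt = (
        "Answer the question using only the numbered context.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
    return prompt, refs


# Hard-coded pages stand in for the search and scraping steps.
pages = {
    "https://example.com/a": "Vector databases store embeddings and support similarity search.",
    "https://example.com/b": "Bread is made from flour, water, yeast and salt.",
}
query = "how does similarity search work"
hits = top_chunks(query, pages, k=1)
prompt, refs = build_prompt(query, hits)
print(prompt)
print(refs)
```

Numbering the chunks in the prompt and reusing the same numbers in the reference list is what lets the generated answer cite its sources. In a real setup, `embed` and `cosine` would give way to an embedding model and a vector store, and the optional hybrid search and re-ranking steps would sit between `top_chunks` and `build_prompt`.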