feat: Super Search - Hybrid BM25 + Vector search #61

Open
ponbac wants to merge 1 commit into master from feat/super-search

ponbac commented Feb 5, 2026

Summary

Implements hybrid search over PRs and Work Items using:

  • BM25 full-text search via PostgreSQL tsvector
  • Semantic vector search via pgvector + Gemini embeddings
  • Reciprocal Rank Fusion (RRF) for result combination
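
As a rough illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of document ids. The function name `rrf_fuse`, the id strings, and the constant k = 60 are assumptions made for the example; the PR performs the fusion in SQL and may use a different constant.

```rust
use std::collections::HashMap;

/// Fuse ranked lists of document ids with Reciprocal Rank Fusion.
/// Each list contributes 1 / (k + rank) per document, with rank starting at 1.
/// `k` is a smoothing constant (60 is a common choice, assumed here).
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A document that appears near the top of both the BM25 list and the vector list outranks one that appears in only a single list, which is the property the hybrid search relies on.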

Features

  • Natural language query parsing ("priority 1 bugs in Lerum")
  • Cmd+K command palette integration
  • Trait-based architecture for testability
  • Auto-initialization when GEMINI_API_KEY is set
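
To make the natural language parsing concrete, the sketch below splits a query such as "priority 1 bugs in Lerum" into structured filters plus residual search text. The struct fields other than `search_text`, the recognised keywords, and the decision to leave "in Lerum" as free text are all assumptions for illustration; the actual parser in toki-api/src/domain/search/ may recognise different filters.

```rust
/// Hypothetical result of parsing a free-text query.
#[derive(Debug, Default, PartialEq)]
pub struct ParsedQuery {
    pub priority: Option<u8>,
    pub item_type: Option<String>,
    pub search_text: String,
}

/// Pull simple filters out of the query and keep the rest as search text.
pub fn parse_query(input: &str) -> ParsedQuery {
    let mut parsed = ParsedQuery::default();
    let mut rest: Vec<&str> = Vec::new();
    let mut tokens = input.split_whitespace().peekable();

    while let Some(token) = tokens.next() {
        let lower = token.to_lowercase();
        if lower == "priority" {
            // Consume the following token only when it is a number.
            if let Some(n) = tokens.peek().and_then(|t| t.parse::<u8>().ok()) {
                parsed.priority = Some(n);
                tokens.next();
                continue;
            }
        }
        if lower == "bug" || lower == "bugs" {
            parsed.item_type = Some("Bug".to_string());
            continue;
        }
        // Anything that is not a recognised filter stays in the search text.
        rest.push(token);
    }

    parsed.search_text = rest.join(" ");
    parsed
}
```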

Changes

  • toki-api/src/domain/search/ - Core search module
  • toki-api/migrations/ - search_documents table
  • toki-api/src/routes/search.rs - GET /search API
  • app/src/components/cmd-k.tsx - Frontend search UI

Review Fixes Applied

  • ✅ API key moved to header (was in URL)
  • ✅ Frontend search debounced (300ms, min 2 chars)
  • ✅ DB upserts wrapped in transaction
  • ✅ ADO fetches parallelized (10 concurrent)
  • ✅ window.open security (noopener,noreferrer)
  • ✅ Query factory baseKey collision fixed

Greptile Overview

Greptile Summary

  • Adds a new domain/search module implementing hybrid BM25 full-text + pgvector semantic search fused via Reciprocal Rank Fusion (RRF).
  • Introduces search_documents schema/migration (tsvector + HNSW index) and a background index worker gated on GEMINI_API_KEY.
  • Wires a new GET /search API endpoint and integrates results into the Cmd+K command palette with debouncing.
  • Adds frontend React Query search query options with trimming to avoid cache fragmentation.

Confidence Score: 3/5

  • This PR is close to merge-ready but has a definite SQLx parameter type/binding issue in the hybrid search query that should be fixed first.
  • Most changes are additive and well-scoped (new search module, migration, route wiring, UI integration). However, the hybrid query currently binds the LIMIT parameter with the wrong Rust type under sqlx::query_as!, which will break compilation or runtime query execution until corrected.
  • File needing attention: toki-api/src/domain/search/repository/postgres.rs

Important Files Changed

| Filename | Overview |
| --- | --- |
| toki-api/src/domain/search/repository/postgres.rs | Adds Postgres-backed hybrid BM25+vector search and upsert/delete APIs; currently has a SQLx bind type mismatch in the LIMIT parameter (passes i64 for $11) and uses unnecessary cloning in to_pg_vector. |
| toki-api/src/routes/search.rs | Adds GET /search endpoint delegating to SearchService with optional limit; straightforward wiring and error mapping. |
| toki-api/src/domain/search/service.rs | Implements SearchService that trims query, parses filters, generates embeddings for >=2 chars, and queries repository; includes unit tests. |
| toki-api/src/domain/search/embedder/gemini.rs | Adds Gemini embedder using genai; returns fixed-size zero vectors for empty strings and does not validate returned vector dimensionality against DB schema. |
| toki-api/migrations/20260205220000_create_search_documents.sql | Creates vector extension, enum, search_documents table, and indexes; enum creation is guarded with DO/EXCEPTION for idempotency. |
| toki-api/src/app_state.rs | Wires search service initialization behind GEMINI_API_KEY and spawns index worker in prod; exposes search_service accessor. |
| toki-api/src/router.rs | Nests /search route and starts search indexer task in non-debug builds. |
| app/src/components/cmd-k.tsx | Adds debounced Cmd+K search group using backend search endpoint, shows loading/error states, and navigates to or opens links. |
| app/src/lib/api/queries/search.ts | Adds React Query options for /search; trims query for enabled + queryKey + URLSearchParams to avoid cache fragmentation. |
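
The note on the Gemini embedder mentions that returned vectors are not validated against the DB schema. One possible guard is sketched below. `EMBEDDING_DIM` and `check_embedding` are hypothetical names, and 1536 is taken from the `embedding[1536]` label in the sequence diagram, on the assumption that the column uses the same dimension.

```rust
/// Assumed to match the dimension of the vector column in search_documents.
pub const EMBEDDING_DIM: usize = 1536;

/// Reject embeddings whose length differs from the expected dimension,
/// so a mismatch surfaces as a clear error instead of a failed insert.
pub fn check_embedding(embedding: Vec<f32>) -> Result<Vec<f32>, String> {
    if embedding.len() == EMBEDDING_DIM {
        Ok(embedding)
    } else {
        Err(format!(
            "embedding has {} dimensions, expected {}",
            embedding.len(),
            EMBEDDING_DIM
        ))
    }
}
```

Calling a check like this right after the embedder returns would keep the failure close to its cause rather than deferring it to the database layer.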

Sequence Diagram

sequenceDiagram
  autonumber
  participant UI as Cmd+K UI (app)
  participant RQ as React Query
  participant API as toki-api /search
  participant SS as SearchService
  participant P as Query parser
  participant E as GeminiEmbedder
  participant DB as PgSearchRepository
  participant PG as Postgres (search_documents)

  UI->>RQ: queries.search.search(query, limit)
  RQ->>API: GET /search?q=...&limit=...
  API->>SS: search(q, limit)
  SS->>P: parse_query(trim(q))
  alt parsed.search_text length >= 2
    SS->>E: embed(parsed.search_text)
    E-->>SS: embedding[1536]
    SS->>DB: search(parsed, Some(embedding), limit)
    DB->>PG: BM25 CTE + vector CTE + RRF + LIMIT
    PG-->>DB: rows
    DB-->>SS: results
  else short query
    SS->>DB: search(parsed, None, limit)
    DB->>PG: BM25-only query
    PG-->>DB: rows
    DB-->>SS: results
  end
  SS-->>API: Vec<SearchResult>
  API-->>RQ: JSON results
  RQ-->>UI: render results / navigation
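
The branch in the diagram, together with the trait-based design mentioned in the description, can be sketched as follows. This is a simplified, synchronous approximation: the trait names, method signatures, and stub types are assumptions, the query parser is omitted, and the real service is presumably async.

```rust
/// Hypothetical abstraction over the embedding provider.
pub trait Embedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Hypothetical abstraction over the search storage.
pub trait SearchRepository {
    fn search(
        &self,
        text: &str,
        embedding: Option<&[f32]>,
        limit: usize,
    ) -> Result<Vec<String>, String>;
}

pub struct SearchService<E, R> {
    embedder: E,
    repository: R,
}

impl<E: Embedder, R: SearchRepository> SearchService<E, R> {
    pub fn new(embedder: E, repository: R) -> Self {
        Self { embedder, repository }
    }

    /// Trim the query, then run hybrid search for 2+ characters
    /// and fall back to text-only search for shorter input.
    pub fn search(&self, query: &str, limit: usize) -> Result<Vec<String>, String> {
        let text = query.trim();
        if text.chars().count() >= 2 {
            let embedding = self.embedder.embed(text)?;
            self.repository.search(text, Some(embedding.as_slice()), limit)
        } else {
            self.repository.search(text, None, limit)
        }
    }
}

/// Stub embedder for tests: returns a fixed vector without any network call.
pub struct StubEmbedder;

impl Embedder for StubEmbedder {
    fn embed(&self, _text: &str) -> Result<Vec<f32>, String> {
        Ok(vec![0.0; 4])
    }
}

/// Stub repository for tests: reports which search path was taken.
pub struct StubRepository;

impl SearchRepository for StubRepository {
    fn search(
        &self,
        text: &str,
        embedding: Option<&[f32]>,
        _limit: usize,
    ) -> Result<Vec<String>, String> {
        let mode = if embedding.is_some() { "hybrid" } else { "bm25" };
        Ok(vec![format!("{mode}:{text}")])
    }
}
```

Because the service depends only on the two traits, unit tests can swap in stubs like these and check the branching without a database or an API key.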

vercel bot commented Feb 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| toki2 | Ready | Preview, Comment | Feb 6, 2026 9:14am |

greptile-apps bot left a comment

6 files reviewed, no comments

ponbac commented Feb 6, 2026

@greptile

greptile-apps bot left a comment

4 files reviewed, 1 comment

ponbac force-pushed the feat/super-search branch from 5cbcfdf to de88982 on February 6, 2026 09:01

ponbac commented Feb 6, 2026

@greptile

greptile-apps bot left a comment

5 files reviewed, 3 comments

@greptile-apps

This comment was marked as outdated.

ponbac force-pushed the feat/super-search branch from de88982 to 3bfd849 on February 6, 2026 09:13

ponbac commented Feb 6, 2026

@greptile

greptile-apps bot left a comment

9 files reviewed, 1 comment
