LCORE-86: Prioritize BYOK content over built-in content #1208
are-ces wants to merge 1 commit into lightspeed-core:main
- Add configurable RAG strategies: always RAG, which is performed on each query (OKP Solr + BYOK), and tool RAG. The two can be used independently or together
- Add chunk prioritization with score multipliers per vector store for always RAG
- Add knobs in config to select the RAG strategy
- Tool RAG defaults to enabled=True for backward compatibility
- Update the lightspeed stack configuration enrichment script to build the solr section in llama stack and fix bugs in building the vector stores
Chunks from Solr did not include
Printout of the Solr chunk structure returned by
Chunk(
    chunk_id = '/documentation/en-us/openshift_container_platform/4.20/html-single/architecture/index_chunk_89',
    content = '# Chapter 9. Admission plugins\n\n\n\n\nAdmission plugins are used to help regulate how [..]',
    chunk_metadata = ChunkChunkMetadata(
        chunk_embedding_dimension = None,
        chunk_embedding_model = None,
        chunk_id = '/documentation/en-us/openshift_container_platform/4.20/html-single/architecture/index_chunk_89',
        chunk_tokenizer = None,
        chunk_window = None,
        content_token_count = None,
        created_timestamp = None,
        document_id = '/documentation/en-us/openshift_container_platform/4.20/html-single/architecture/index',
        metadata_token_count = None,
        source = None,
        updated_timestamp = None,
    ),
    embedding = [],
    metadata = {},
    embedding_model = 'sentence-transformers/ibm-granite/granite-embedding-30m-english',
    embedding_dimension = 384,
)
TL;DR
Motivation
To prioritize BYOK content, a mechanism to tune chunk scoring per vector store is necessary. The alternative was to create a client-side tool to change the behavior of the RAG tool, but that is not optimal for two reasons:
Always RAG with LCORE-side chunk prioritization avoids both issues.
Description
Adds configurable RAG strategies and chunk prioritization. Control which documentation sources to search, how to search them, and which sources are more important.
How It Works
Always RAG retrieves chunks from configured sources (BYOK and/or Solr) and injects them into the query before sending to the LLM. The LLM always has documentation context without calling tools.
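The injection step of Always RAG can be pictured roughly as follows. This is only a minimal sketch: the `RetrievedChunk` type, the `inject_context` helper, and the prompt layout are hypothetical illustrations and are not taken from the PR's code.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """Hypothetical container for a retrieved chunk (not the LCORE type)."""

    content: str
    source: str


def inject_context(query: str, chunks: list[RetrievedChunk]) -> str:
    """Prepend retrieved chunks to the user query (illustrative layout only)."""
    if not chunks:
        # Nothing was retrieved: send the query unchanged.
        return query
    context = "\n\n".join(
        f"[{i}] ({chunk.source}) {chunk.content}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return f"Documentation context:\n{context}\n\nQuestion: {query}"


# Placeholder content; the source name is borrowed from the test response.
chunks = [RetrievedChunk(content="...", source="openshift-docs-part1")]
prompt = inject_context("What are admission plugins?", chunks)
```

Because the context is attached before the request reaches the LLM, no tool call is needed for the model to see the documentation.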
Tool RAG lets the AI call the file_search tool during generation. This is the original behavior, enabled by default for backward compatibility.
Chunk Prioritization applies a score_multiplier per BYOK source. All sources are queried in parallel, chunk scores are multiplied by their source's weight, then merged and sorted. Top N chunks are selected across all sources. Solr chunks are appended after BYOK chunks without cross-source ranking (TBD, needs discussion / spike).
Configuration
In lightspeed-stack.yaml:
Chunk limits in src/constants.py:
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Prerequisites: BYOK vector stores (FAISS) created with rag-content tool, OKP Solr instance, Llama Stack 0.4.3+, Lightspeed Stack Providers installed, OpenAI API key.
Configure chunk limits in src/constants.py:
Configure Lightspeed Stack - add BYOK RAG sources and RAG strategy to lightspeed-stack.yaml:
Run enrichment script - reads lightspeed-stack config and generates an enriched llama-stack run.yaml with BYOK vector stores and Solr provider registered:
Install Lightspeed Stack Providers:
Start Llama Stack:
Start Lightspeed Stack:
Test query - send to /v1/query or streaming endpoint:
Response:
{
  "conversation_id": "...",
  "response": "Admission plugins in Red Hat OpenShift Container Platform are [...]",
  "rag_chunks": [
    {
      "content": "...",
      "source": "openshift-docs-part1",
      "score": 1.0038,
      "attributes": {
        "doc_url": "https://www.redhat.com/data/architecture/admission-plug-ins.txt",
        "title": "Admission plugins",
        "document_id": "file-7a70ef22c4a646f2a6f657c66961ba2c"
      }
    },
    {
      "content": "...",
      "source": "openshift-docs-part2",
      "score": 0.926,
      "attributes": {
        "doc_url": "https://www.redhat.com/web_console/dynamic-plugin/overview-dynamic-plugin.txt",
        "title": "Overview of dynamic plugins",
        "document_id": "file-b266f575a95a4da19d7ba058fd980f00"
      }
    },
    {
      "content": "...",
      "source": "OKP Solr",
      "score": 63.996,
      "attributes": {
        "document_id": "/documentation/en-us/openshift_container_platform/4.19/html-single/architecture/index"
      }
    }
  ],
  "referenced_documents": [
    {
      "doc_title": "Admission plugins",
      "doc_url": "https://www.redhat.com/data/architecture/admission-plug-ins.txt",
      "source": "openshift-docs-part1"
    },
    {
      "doc_title": "Overview of dynamic plugins",
      "doc_url": "https://www.redhat.com/web_console/dynamic-plugin/overview-dynamic-plugin.txt",
      "source": "openshift-docs-part2"
    },
    {
      "doc_title": null,
      "doc_url": "https://mimir.corp.redhat.com/documentation/en-us/openshift_container_platform/4.19/html-single/architecture/index",
      "source": "OKP Solr"
    }
  ],
  "truncated": false,
  "input_tokens": 3736,
  "output_tokens": 448
}
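The ordering of rag_chunks in this response is consistent with the prioritization described under How It Works: the BYOK chunks come first, sorted by weighted score, and the Solr chunk is appended last even though its raw score (63.996) is numerically larger. A minimal sketch of that flow, assuming each BYOK source exposes a search callable; `ScoredChunk`, `query_byok_sources`, `prioritize`, the multiplier values, and the raw scores are all hypothetical and not taken from the PR's code:

```python
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScoredChunk:
    """Hypothetical chunk record: text, originating source, retrieval score."""

    content: str
    source: str
    score: float


Searcher = Callable[[str], list[ScoredChunk]]


def query_byok_sources(query: str, searchers: dict[str, Searcher]) -> list[ScoredChunk]:
    """Query every BYOK source in parallel and flatten the results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search, query) for search in searchers.values()]
        return [chunk for future in futures for chunk in future.result()]


def prioritize(
    chunks: list[ScoredChunk], multipliers: dict[str, float], top_n: int
) -> list[ScoredChunk]:
    """Scale each score by its source's multiplier, sort, and keep the top N."""
    weighted = [
        replace(chunk, score=chunk.score * multipliers.get(chunk.source, 1.0))
        for chunk in chunks
    ]
    return sorted(weighted, key=lambda chunk: chunk.score, reverse=True)[:top_n]


# Illustrative stand-ins for real vector store searches (made-up raw scores).
searchers = {
    "openshift-docs-part1": lambda q: [ScoredChunk("...", "openshift-docs-part1", 0.5)],
    "openshift-docs-part2": lambda q: [ScoredChunk("...", "openshift-docs-part2", 0.9)],
}
multipliers = {"openshift-docs-part1": 2.0, "openshift-docs-part2": 1.0}

byok = prioritize(
    query_byok_sources("What are admission plugins?", searchers), multipliers, top_n=2
)
# Solr chunks are appended as-is: their scores are on a different scale and
# are not ranked against the BYOK scores.
solr = [ScoredChunk("...", "OKP Solr", 63.996)]
final_chunks = byok + solr
```

In this sketch the multiplier of 2.0 lifts the first source above the second despite its lower raw score, which is the effect the per-source weight is meant to have.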