feat(skills): add synonym expansion to gws skills search #514
dumko2001 wants to merge 10 commits into googleworkspace:main
…x .gitignore formatting
…in src/registry.rs
Add explicit --help/-h handling to `handle_skills_command` so that `gws skills --help` and `gws skills search --help` print a proper help screen instead of treating the flag as a search query. Keeps the same manual dispatch pattern as all other top-level commands in main.rs rather than introducing a full clap subcommand tree for a single new command.
Per CLI convention, invoking a subcommand with no arguments should display usage/help rather than return an error. Since 'gws skills --help', 'gws skills', and 'gws skills <unknown-subcommand>' all indicate the user wants guidance, route all non-search invocations to print_skills_help() and return Ok(()).
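The routing described in the two commits above can be sketched as a small pure function. This is only an illustration, not the code from the PR: `SkillsAction` and `classify_skills_args` are hypothetical names, `args` is assumed to hold everything after `gws skills`, and treating `search` with no query as a help request is an assumption the commit messages do not confirm.

```rust
/// Hypothetical result of inspecting the arguments after `gws skills`.
#[derive(Debug, PartialEq)]
enum SkillsAction {
    /// Print the help screen and return Ok(()).
    Help,
    /// Run a search with the remaining arguments as the query.
    Search(Vec<String>),
}

/// Decides what `gws skills <args...>` should do.
/// Anything that is not a well-formed search falls through to help.
fn classify_skills_args(args: &[String]) -> SkillsAction {
    match args.first().map(String::as_str) {
        Some("search") => {
            let rest = &args[1..];
            // `--help`/`-h` must not be treated as a search term.
            let wants_help = rest.iter().any(|a| a == "--help" || a == "-h");
            // Assumption: an empty query also shows help.
            if rest.is_empty() || wants_help {
                SkillsAction::Help
            } else {
                SkillsAction::Search(rest.to_vec())
            }
        }
        // No subcommand, `--help`, `-h`, or an unknown subcommand.
        _ => SkillsAction::Help,
    }
}
```

Keeping the decision separate from the printing makes the help paths easy to cover with unit tests while preserving the manual dispatch style the commit message mentions.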
Replace single-string substring match with a token-AND approach: split the query into individual tokens and require all tokens to appear somewhere in the combined fields (name + description + aliases). Previously, `gws skills search send email` built the query string "send email" and required that exact contiguous phrase to appear in a field. This failed for descriptions like "Gmail: Send an email" where the words are present but not adjacent. With token-AND matching, both tokens must match anywhere across the combined text, which is the correct behavior for natural-language queries.
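A minimal sketch of the difference between the two strategies, assuming the searchable fields are passed in as plain strings. The function names `matches_phrase` and `matches_all_tokens` are illustrative and are not taken from the PR.

```rust
/// Old behavior (sketch): the whole query must appear as one contiguous
/// phrase inside at least one field.
fn matches_phrase(query: &str, fields: &[&str]) -> bool {
    let needle = query.to_lowercase();
    fields
        .iter()
        .any(|f| f.to_lowercase().contains(needle.as_str()))
}

/// New behavior (sketch): every whitespace-separated token must appear
/// somewhere in the combined, lowercased fields.
/// Note: an empty query matches everything, because `all` on an empty
/// iterator returns true.
fn matches_all_tokens(query: &str, fields: &[&str]) -> bool {
    let haystack = fields.join(" ").to_lowercase();
    query
        .split_whitespace()
        .map(str::to_lowercase)
        .all(|token| haystack.contains(token.as_str()))
}
```

With the fields `["gws-gmail-send", "Gmail: Send an email"]`, the query `send email` fails the phrase match (the words are separated by "an") but passes the token-AND match.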
Add a static SYNONYMS table mapping 30+ natural-language terms to their canonical service names. When a query token matches a synonym entry it is expanded to also include the canonical form, so:
  gws skills search email → finds Gmail (email → gmail)
  gws skills search spreadsheet → finds Sheets (spreadsheet → sheets)
  gws skills search schedule → finds Calendar (schedule → calendar)
  gws skills search document → finds Docs (document → docs)
The matching logic uses token-AND with synonym-OR per token: every original query token must be satisfied, but it is satisfied if any of its synonym expansions appears in the skill's fields. An exact match on the canonical name still works as before.
Adds expand_tokens() as a testable helper with 8 unit tests covering no-synonym pass-through, multi-word expansion, deduplication, and table integrity checks.
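The token-AND / synonym-OR rule can be sketched as follows. This is an assumption-laden illustration: the table below contains only the four pairs named in the commit message (the real table has 30+ entries), and the signature of `expand_tokens` and the helper `skill_matches` are guesses, not the PR's actual code.

```rust
/// Illustrative subset of the synonym table: (natural-language term, canonical name).
const SYNONYMS: &[(&str, &str)] = &[
    ("email", "gmail"),
    ("spreadsheet", "sheets"),
    ("schedule", "calendar"),
    ("document", "docs"),
];

/// Expands each token into a group of alternatives: the token itself
/// followed by any canonical names it maps to, without duplicates.
/// Tokens are assumed to be lowercased already.
fn expand_tokens(tokens: &[String]) -> Vec<Vec<String>> {
    tokens
        .iter()
        .map(|token| {
            let mut group = vec![token.clone()];
            for &(term, canonical) in SYNONYMS {
                let canonical = canonical.to_string();
                if term == token.as_str() && !group.contains(&canonical) {
                    group.push(canonical);
                }
            }
            group
        })
        .collect()
}

/// Token-AND with synonym-OR: every group must be satisfied, and a group
/// is satisfied when any of its alternatives occurs in the haystack.
/// The haystack is assumed to be the lowercased name + description + aliases.
fn skill_matches(groups: &[Vec<String>], haystack: &str) -> bool {
    groups
        .iter()
        .all(|group| group.iter().any(|alt| haystack.contains(alt.as_str())))
}
```

For the query tokens `send` and `email`, the groups are `[send]` and `[email, gmail]`, so a skill whose text mentions "gmail" and "send" matches even if the word "email" never appears.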
🦋 Changeset detected
Latest commit: e11a98b
The changes in this PR will be included in the next version bump.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the discoverability of skills within the
Code Review
This pull request introduces a synonym expansion feature for the new gws skills search command, making it easier for users to discover skills with natural language queries. It also includes a significant refactoring that moves generated API and helper skills into a references/ subdirectory, improving the project's structure. The implementation is generally strong, with good test coverage for the new search logic.
I've identified one high-severity issue related to how search query arguments are parsed. Quoted queries are not correctly tokenized, which can lead to failed searches. I've provided a code suggestion to address this. Otherwise, the changes, including the large-scale refactoring and the addition of a link-checking script, are well-executed.
    // Split into individual tokens so multi-word queries like "send email" match
    // descriptions where the words appear separately (e.g. "Send an email").
    // Then expand each token via the synonym table so "email" also matches "gmail".
    let raw_tokens: Vec<String> = args[1..].iter().map(|a| a.to_lowercase()).collect();
The current implementation for parsing search tokens doesn't correctly handle quoted multi-word queries. For example, a command like `gws skills search "send email"` would incorrectly treat "send email" as a single token instead of two separate tokens, `send` and `email`. This leads to incorrect search behavior, as it would search for the literal phrase rather than individual keywords.
To ensure both quoted and unquoted queries are handled correctly, you should first join all arguments into a single string and then split that string by whitespace.
    - let raw_tokens: Vec<String> = args[1..].iter().map(|a| a.to_lowercase()).collect();
    + let raw_tokens: Vec<String> = args[1..].join(" ").split_whitespace().map(|a| a.to_lowercase()).collect();
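To make the reviewer's point concrete, here is a small comparison of the two tokenization strategies. The function names are illustrative, and `args` is assumed to contain only the query arguments (the equivalent of `args[1..]` in the snippet under review).

```rust
/// Current approach (sketch): one token per shell argument.
/// A quoted argument such as "send email" stays a single token.
fn tokenize_per_arg(args: &[String]) -> Vec<String> {
    args.iter().map(|a| a.to_lowercase()).collect()
}

/// Suggested approach (sketch): join all arguments, then split on
/// whitespace, so quoted and unquoted queries yield the same tokens.
fn tokenize_joined(args: &[String]) -> Vec<String> {
    args.join(" ")
        .split_whitespace()
        .map(str::to_lowercase)
        .collect()
}
```

When the shell passes `"send email"` as one argument, the first function yields the single token `send email`, while the second yields `send` and `email`. For unquoted input the two produce identical results.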
Description
Extends `gws skills search` (introduced in #507) with a static synonym expansion table so agents don't need to know exact API names to find the right skill.

Problem: `gws skills search email` returned no results because "email" does not appear literally in the Gmail service's `api_name` ("gmail") or `aliases`. Agents using natural language to discover skills were forced to guess canonical names.

Solution: A static `SYNONYMS` table maps 30+ common terms to their canonical service names. Each query token is expanded before matching using token-AND / synonym-OR semantics: every token must match, but it is satisfied if any of its expansions appears in the skill's fields.

| Query | Result |
| --- | --- |
| `email` | `email` → `gmail` |
| `spreadsheet` | `spreadsheet` → `sheets` |
| `schedule` | `schedule` → `calendar` |
| `document` | `document` → `docs` |
| `presentation` | `presentation` → `slides` |
| `send email` | `gws-gmail-send` helper |
| `upload file` | `gws-drive-upload` helper |

Dry Run Output:
(This is a local discovery command; no JSON API request is produced, so no `--dry-run` JSON applies.)

Checklist:
- My code follows the `AGENTS.md` guidelines (no generated `google-*` crates).
- I have run `cargo fmt --all` to format the code perfectly.
- I have run `cargo clippy -- -D warnings` and resolved all warnings.
- I have added a changeset (`pnpx changeset`) to document my changes.