feat(skills): add synonym expansion to gws skills search #514

Open
dumko2001 wants to merge 10 commits into googleworkspace:main from
dumko2001:feature/skills-search-synonyms

Conversation

@dumko2001

Description

Extends gws skills search (introduced in #507) with a static synonym expansion table so agents don't need to know exact API names to find the right skill.

Dependency: This PR builds on gws skills search from #507 and should be merged after that PR lands.

Problem: gws skills search email returned no results because "email" does not appear literally in the Gmail service's api_name ("gmail") or aliases. Agents using natural language to discover skills were forced to guess canonical names.

Solution: A static SYNONYMS table maps 30+ common terms to their canonical service names. Each query token is expanded before matching using token-AND / synonym-OR semantics — every token must match, but it is satisfied if any of its expansions appears in the skill's fields.

| Search query | Matches |
| --- | --- |
| email | Gmail (email → gmail) |
| spreadsheet | Sheets (spreadsheet → sheets) |
| schedule | Calendar (schedule → calendar) |
| document | Docs (document → docs) |
| presentation | Slides (presentation → slides) |
| send email | gws-gmail-send helper |
| upload file | gws-drive-upload helper |

Dry Run Output:

$ gws skills search email
Searching for skills matching "email"...

[Service] gws-gmail - Send, read, and manage email
  Reference: skills/references/gws-gmail/SKILL.md

[Helper] gws-gmail-send - Send an email
  Reference: skills/references/gws-gmail-send/SKILL.md

[Helper] gws-gmail-triage - Show unread inbox summary (sender, subject, date)
  Reference: skills/references/gws-gmail-triage/SKILL.md

[Helper] gws-gmail-reply - Reply to a message (handles threading automatically)
  Reference: skills/references/gws-gmail-reply/SKILL.md

[Helper] gws-gmail-reply-all - Reply-all to a message (handles threading automatically)
  Reference: skills/references/gws-gmail-reply-all/SKILL.md

[Helper] gws-gmail-forward - Forward a message to new recipients
  Reference: skills/references/gws-gmail-forward/SKILL.md

[Helper] gws-gmail-watch - Watch for new emails and stream them as NDJSON
  Reference: skills/references/gws-gmail-watch/SKILL.md

[Recipe] recipe-label-and-archive-emails - Label and Archive Gmail Threads
  Description: Apply Gmail labels to matching messages and archive them to keep your inbox clean.
  Skill: skills/recipe-label-and-archive-emails/SKILL.md

Found 8 matching skills.

(This is a local discovery command — no JSON API request is produced, so no --dry-run JSON applies.)

Checklist:

  • My code follows the AGENTS.md guidelines (no generated google-* crates).
  • I have run cargo fmt --all to format the code perfectly.
  • I have run cargo clippy -- -D warnings and resolved all warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have provided a Changeset file (e.g. via pnpx changeset) to document my changes.

Add explicit --help/-h handling to `handle_skills_command` so that
`gws skills --help` and `gws skills search --help` print a proper
help screen instead of treating the flag as a search query.

Keeps the same manual dispatch pattern as all other top-level commands
in main.rs rather than introducing a full clap subcommand tree for a
single new command.

Per CLI convention, invoking a subcommand with no arguments should display
usage/help rather than return an error. Since 'gws skills --help', 'gws skills',
and 'gws skills <unknown-subcommand>' all indicate the user wants guidance,
route all non-search invocations to print_skills_help() and return Ok(()).
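
The routing described in the two commit messages above might look roughly like the following. This is a minimal sketch, not the code in main.rs: the `SkillsAction` enum and `parse_skills_args` function are hypothetical names introduced for illustration, and treating `gws skills search` with an empty query as a help request is an assumption.

```rust
/// Outcome of parsing the arguments after `gws skills` (hypothetical type).
#[derive(Debug, PartialEq)]
enum SkillsAction {
    /// Print the help screen and return successfully.
    Help,
    /// Run a search with the given raw query arguments.
    Search(Vec<String>),
}

/// Decide what to do with the arguments that follow `gws skills`.
fn parse_skills_args(args: &[String]) -> SkillsAction {
    // A help flag must never be treated as a search term.
    let is_help = |a: &String| matches!(a.as_str(), "--help" | "-h");
    match args.first().map(String::as_str) {
        Some("search") => {
            let query = &args[1..];
            // Assumption: an empty query also shows help.
            if query.is_empty() || query.iter().any(is_help) {
                SkillsAction::Help
            } else {
                SkillsAction::Search(query.to_vec())
            }
        }
        // No subcommand, a help flag, or an unknown subcommand: show help.
        _ => SkillsAction::Help,
    }
}
```

A caller would print the help text and return `Ok(())` on `SkillsAction::Help`, which keeps the manual dispatch style without a clap subcommand tree.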
Replace single-string substring match with a token-AND approach: split the
query into individual tokens and require all tokens to appear somewhere in
the combined fields (name + description + aliases).

Previously, `gws skills search send email` built the query string "send email"
and required that exact contiguous phrase to appear in a field. This failed
for descriptions like "Gmail: Send an email" where the words are present but
not adjacent. With token-AND matching, both tokens must match anywhere across
the combined text, which is the correct behavior for natural-language queries.
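
To make the difference concrete, here is a small sketch comparing contiguous-phrase matching with token-AND matching. The function names are hypothetical, and the single `haystack` string stands in for the combined name + description + aliases text.

```rust
/// Old behaviour: the whole query must appear as one contiguous phrase.
fn matches_phrase(query: &str, haystack: &str) -> bool {
    haystack.to_lowercase().contains(&query.to_lowercase())
}

/// Token-AND: every whitespace-separated token must appear
/// somewhere in the combined text, in any order or position.
fn matches_all_tokens(query: &str, haystack: &str) -> bool {
    let haystack = haystack.to_lowercase();
    query
        .split_whitespace()
        .all(|token| haystack.contains(&token.to_lowercase()))
}
```

With the description "Gmail: Send an email", `matches_phrase("send email", ...)` is false because "an" sits between the two words, while `matches_all_tokens("send email", ...)` is true.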
Add a static SYNONYMS table mapping 30+ natural-language terms to
their canonical service names. When a query token matches a synonym
entry it is expanded to also include the canonical form, so:

  gws skills search email       → finds Gmail (email → gmail)
  gws skills search spreadsheet → finds Sheets (spreadsheet → sheets)
  gws skills search schedule    → finds Calendar (schedule → calendar)
  gws skills search document    → finds Docs (document → docs)

The matching logic uses token-AND with synonym-OR per token: every
original query token must be satisfied, but it is satisfied if any of
its synonym expansions appears in the skill's fields. An exact match
on the canonical name still works as before.

Adds expand_tokens() as a testable helper with 8 unit tests covering
no-synonym pass-through, multi-word expansion, deduplication, and
table integrity checks.
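
A rough sketch of how the table and helper could fit together is shown below. The names `SYNONYMS` and `expand_tokens` come from the commit message, but the signature, the return type, the way duplicates are avoided, and the `skill_matches` function are assumptions made for illustration. Only the four pairs listed above are included, and tokens are assumed to be lowercased already.

```rust
/// Illustrative synonym table: (natural-language term, canonical service name).
/// Only the pairs named in the commit message are listed here.
const SYNONYMS: &[(&str, &str)] = &[
    ("email", "gmail"),
    ("spreadsheet", "sheets"),
    ("schedule", "calendar"),
    ("document", "docs"),
];

/// Expand each token into the terms that can satisfy it: the token itself
/// plus any canonical names from the table, without duplicates.
fn expand_tokens(tokens: &[String]) -> Vec<Vec<String>> {
    tokens
        .iter()
        .map(|token| {
            let mut alternatives = vec![token.clone()];
            for &(term, canonical) in SYNONYMS {
                let already_present = alternatives.iter().any(|a| a.as_str() == canonical);
                if term == token.as_str() && !already_present {
                    alternatives.push(canonical.to_string());
                }
            }
            alternatives
        })
        .collect()
}

/// Token-AND with synonym-OR: every token needs at least one of its
/// alternatives to appear in the skill's combined fields.
fn skill_matches(expanded: &[Vec<String>], combined_fields: &str) -> bool {
    let text = combined_fields.to_lowercase();
    expanded
        .iter()
        .all(|alternatives| alternatives.iter().any(|alt| text.contains(alt.as_str())))
}
```

Keeping the expansion separate from the matching is what makes it unit-testable on its own: the pass-through and expansion cases can be checked without building any skill index.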
@dumko2001 requested a review from @jpoehnelt as a code owner on March 16, 2026, 12:24
@googleworkspace-bot added the area: skills, area: docs, and area: core (Core CLI parsing, commands, error handling, utilities) labels on Mar 16, 2026
@changeset-bot

changeset-bot bot commented Mar 16, 2026

🦋 Changeset detected

Latest commit: e11a98b

The changes in this PR will be included in the next version bump.


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the discoverability of skills within the gws CLI by introducing natural language synonym expansion for the skills search command. It also reorganizes the underlying skill documentation into a more structured hierarchy, improving clarity and maintainability. These changes collectively make it easier for users to find and utilize the appropriate tools for their tasks.

Highlights

  • Synonym Expansion for Skill Search: Implemented a static synonym expansion table for gws skills search, allowing agents to use natural language terms (e.g., 'email' for Gmail, 'spreadsheet' for Sheets) to find relevant skills without needing exact API names. This uses 'token-AND / synonym-OR' semantics for matching.
  • Skill Documentation Restructuring: Restructured the skills/ directory into a hierarchical skills/references/ subdirectory for core API services and helpers. This change aims to prevent agent context pollution and improve organization.
  • Skill Generation and Link Validation Enhancements: Refactored the generate-skills logic to automate artifact generation, support filtering during generation, and update internal skill linking. A new Python script (check_links.py) was added to validate local markdown links within skill documentation.
  • New gws skills search Command: Introduced the gws skills search <query> command, providing a dedicated interface for discovering skills across services, helpers, personas, and recipes.

Changelog

  • .changeset/skill-optimization.md
    • Added a changeset file documenting the restructuring of skills into a hierarchical references/ subdirectory.
    • Added a changeset file documenting the new gws skills search <query> command for semantic/keyword discovery.
  • .changeset/skills-search-synonyms.md
    • Added a changeset file documenting the addition of synonym expansion to gws skills search.

Activity

  • The pull request introduces a new feature for skill search with synonym expansion.
  • It includes significant refactoring of skill generation and documentation structure.
  • A new Python script for link checking was added to ensure documentation integrity.

Footnotes

  1. Review the Generative AI Prohibited Use Policy, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a synonym expansion feature for the new gws skills search command, making it easier for users to discover skills with natural language queries. It also includes a significant refactoring that moves generated API and helper skills into a references/ subdirectory, improving the project's structure. The implementation is generally strong, with good test coverage for the new search logic.

I've identified one high-severity issue related to how search query arguments are parsed. Quoted queries are not correctly tokenized, which can lead to failed searches. I've provided a code suggestion to address this. Otherwise, the changes, including the large-scale refactoring and the addition of a link-checking script, are well-executed.

// Split into individual tokens so multi-word queries like "send email" match
// descriptions where the words appear separately (e.g. "Send an email").
// Then expand each token via the synonym table so "email" also matches "gmail".
let raw_tokens: Vec<String> = args[1..].iter().map(|a| a.to_lowercase()).collect();

Severity: high

The current implementation for parsing search tokens doesn't correctly handle quoted multi-word queries. For example, a command like gws skills search "send email" would incorrectly treat "send email" as a single token instead of two separate tokens, send and email. This leads to incorrect search behavior as it would search for the literal phrase rather than individual keywords.

To ensure both quoted and unquoted queries are handled correctly, you should first join all arguments into a single string and then split that string by whitespace.

Suggested change
- let raw_tokens: Vec<String> = args[1..].iter().map(|a| a.to_lowercase()).collect();
+ let raw_tokens: Vec<String> = args[1..].join(" ").split_whitespace().map(|a| a.to_lowercase()).collect();
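
The effect of the suggested change can be checked in isolation with a small sketch. The two function names are hypothetical, and `args` here stands for the arguments after `search` (that is, `args[1..]` in the snippet above).

```rust
/// Per-argument tokenization: a shell-quoted argument such as
/// "send email" arrives as one string and stays one token.
fn tokens_per_arg(args: &[String]) -> Vec<String> {
    args.iter().map(|a| a.to_lowercase()).collect()
}

/// Join all arguments with spaces, then split on whitespace, so quoted
/// and unquoted queries produce the same tokens.
fn tokens_joined(args: &[String]) -> Vec<String> {
    args.join(" ")
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect()
}
```

For the single quoted argument `"Send Email"`, `tokens_per_arg` yields one token (`send email`), whereas `tokens_joined` yields two (`send`, `email`), matching what the unquoted form `send email` produces.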
