Skip to content

fix(test): resolve flaky ContentMapTest by using WAIT_FOR index policy (#34600)#34603

Merged
spbolton merged 4 commits intomainfrom
issue-34600-fix-flaky-contentmaptest-testgetrecycledrequest
Feb 12, 2026
Merged

fix(test): resolve flaky ContentMapTest by using WAIT_FOR index policy (#34600)#34603
spbolton merged 4 commits intomainfrom
issue-34600-fix-flaky-contentmaptest-testgetrecycledrequest

Conversation

@spbolton
Copy link
Contributor

@spbolton spbolton commented Feb 11, 2026

Summary

Fixes flaky tests ContentMapTest.testGetRecycledRequest and testGet_showCategories_AsAnonUser by addressing the root cause: findAllContent was using Elasticsearch which can have stale entries after content deletion.

Changes

  1. Fixed ESContentFactoryImpl.findAllCurrent to use database queries instead of Elasticsearch

    • Queries contentlet_version_info table directly for working contentlet inodes
    • Filters out null entries (content deleted but version info not yet cleaned up)
    • Database is the source of truth, ES index can have stale entries
  2. Added ContentletDataGen.findAllContent test utility

    • Provides a reliable DB-based method for test code
    • Same implementation approach as the production fix
  3. Deprecated ContentletAPI.findAllContent

    • Method is only suitable for tests, not recommended for production use
    • Added @Deprecated annotation across all API layers:
      • ContentletAPI.java (interface)
      • ESContentletAPIImpl.java (implementation)
      • ContentletAPIInterceptor.java
      • ContentletAPIPreHook.java
      • ContentletAPIPostHook.java
  4. Updated tests to use new test utility

    • ContentMapTest
    • ContentletIndexAPIImplTest
    • PushedAssetsAPITest
    • ContentletAPITest
  5. Added IndexPolicy.WAIT_FOR to ContentletDataGen destructive methods

    • archive(), delete(), destroy() now wait for ES index updates
    • Ensures index is consistent after content modifications in tests

Root Cause

The Elasticsearch index can have stale entries when content is deleted from the database but not yet removed from the ES index. By querying the contentlet_version_info table directly and then loading contentlets by inode, we ensure we only get valid, existing content.

Test plan

  • Run ContentMapTest.testGetRecycledRequest - passes consistently
  • Run ContentMapTest.testGet_showCategories_AsAnonUser - passes consistently
  • CI pipeline passes

🤖 Generated with Claude Code

Closes #34600

This PR fixes: #34600

@spbolton
Copy link
Contributor Author

@wezell In this case it seemed that the DataGen was actually using DEFER by default when removing. This actually triggered the code in com.dotcms.content.elasticsearch.business.ESContentFactoryImpl#findAllCurrent(int, int) to still find the removed items in search, but the db in the internal find() method in that class already saw the item removed. This lead to the result list containing null entries as the null result from find() was being added to the list. At the moment the test case has been updated to skip these null cases and we should now be ensuring that the search state is correctly updated anyway, but should we update the source of findAllCurrent to just skip null results from find()

Copy link
Contributor

@wezell wezell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the wrong fix for a few reasons.

  1. findAllContent should be using the db not ES
  2. If there are still null entries, these should be filtered out in the API and not in the test

Here is what the method should look like (this is untested but you get the idea):

        final long ultimateLimit = Math.min(limit, ESContentletAPIImpl.MAX_LIMIT);

        List<Contentlet> contentlets = new ArrayList<>();

        try {

            final DotConnect dotConnect = new DotConnect("select working_inode from contentlet_version_info limit " + limit + " offset " + offset);
            dotConnect.addParam(ultimateLimit);
            dotConnect.addParam(offset);
            List<Map<String, Object>> results = dotConnect.loadObjectResults();
            for (Map<String, Object> result : results) {
                contentlets.add(find((String) result.get("working_inode")));
            }
        }catch(Exception e){
            throw new DotRuntimeException(e);
        }

        return contentlets;
****

        return contentlets;

Copy link
Contributor

@wezell wezell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it does not look like findAllContent method is being used anywhere except in tests? Am I right about that? If so, let's depreciate it as it can be a server killer

@spbolton spbolton force-pushed the issue-34600-fix-flaky-contentmaptest-testgetrecycledrequest branch 3 times, most recently from bd8110d to 266c006 Compare February 11, 2026 20:29
Copy link
Contributor

@wezell wezell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks.

@spbolton spbolton enabled auto-merge February 11, 2026 20:35
spbolton and others added 4 commits February 12, 2026 10:39
…#34600)

- Change ESContentFactoryImpl.findAllCurrent to query DB (contentlet_version_info)
  instead of Elasticsearch, which can have stale entries
- Filter null entries in the API layer since find() can return null
- Fix testGet_showCategories_AsAnonUser by ensuring permission propagation
- Update ContentletDataGen destructive methods (archive, delete, destroy) to use
  IndexPolicy.WAIT_FOR for better test reliability

Root cause: The ES index can have stale entries when content is deleted from DB
but not yet removed from ES. Using the DB as source of truth eliminates this
race condition.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a test-specific findAllContent method to ContentletDataGen that
queries the database directly. This provides a reliable alternative
to the deprecated ContentletAPI.findAllContent for test code.

Updates tests to use the new ContentletDataGen.findAllContent:
- ContentMapTest
- ContentletIndexAPIImplTest
- PushedAssetsAPITest
- ContentletAPITest

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ility

Update all findAllContent method deprecation Javadocs across the ContentletAPI
stack to clearly state:
- Method should not be used (DO NOT USE)
- For tests, use ContentletDataGen.findAllContent(offset, limit) instead

Updated files:
- ContentletAPI.java - Main interface with detailed deprecation message
- ESContentletAPIImpl.java - Implementation
- ContentletAPIInterceptor.java - Interceptor
- ContentletAPIPreHook.java - Pre-hook interface
- ContentletAPIPostHook.java - Post-hook interface

Also fixed typo: "0 of no limit" -> "0 if no limit"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@spbolton spbolton force-pushed the issue-34600-fix-flaky-contentmaptest-testgetrecycledrequest branch from 266c006 to 03ddaa3 Compare February 12, 2026 10:39
Copy link
Member

@dcolina dcolina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go for it.

@spbolton spbolton added this pull request to the merge queue Feb 12, 2026
Merged via the queue into main with commit 1c15791 Feb 12, 2026
38 checks passed
@spbolton spbolton deleted the issue-34600-fix-flaky-contentmaptest-testgetrecycledrequest branch February 12, 2026 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Fix flaky ContentMapTest - testGetRecycledRequest NPE and testGet_showCategories_AsAnonUser intermittent failures

3 participants