BE-311: Implement entity deletion with multi-level scoping and provenance#8494
BE-311: Implement entity deletion with multi-level scoping and provenance#8494TimDiekmann wants to merge 9 commits intomainfrom
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
PR SummaryMedium Risk Overview Adds Postgres-backed deletion execution that partitions matches into full vs draft-only targets, deletes dependent tables in FK-safe order, validates incoming-link constraints (based on Extends entity provenance with optional flattened Written by Cursor Bugbot for commit 46a1015. This will update automatically on new commits. Configure here. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8494 +/- ##
==========================================
- Coverage 63.03% 62.80% -0.23%
==========================================
Files 1302 1303 +1
Lines 132960 133446 +486
Branches 5509 5509
==========================================
+ Hits 83808 83811 +3
- Misses 48238 48721 +483
Partials 914 914
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🤖 Augment PR SummarySummary: This PR adds filter-based entity deletion to the Postgres graph store, supporting both soft deletion (tombstone) and hard deletion (erase), with draft-aware behavior and provenance stamping. Changes:
Technical Notes: Uses row-level locks ( 🤖 Was this summary useful? React with 👍 or 👎 |
libs/@local/graph/postgres-store/src/store/postgres/knowledge/entity/delete.rs
Outdated
Show resolved
Hide resolved
Merging this PR will not alter performance
Comparing Footnotes
|
…eduplicate draft IDs - Bound deletion impl to `PostgresStore<Transaction<'_>>` to enforce transactional locking at the type level - Add `FOR UPDATE` to `collect_entity_edition_ids` to serialize with concurrent `patch_entity` calls (`FOR NO KEY UPDATE NOWAIT` conflict) - Keep `lock_entity_ids_for_erase` for concurrent link creation (erase only) - Register `EntityDeletionProvenance` in OpenAPI schema components - Deduplicate draft IDs in `select_entities_for_deletion` to prevent inflated `DeletionSummary.draft_deletions` counts from join duplicates - Unify SQL string indentation
- Manual `ToSchema` impl for `InferredEntityProvenance` so utoipa
correctly represents flattened `Option<EntityDeletionProvenance>`
fields as optional instead of required.
- tsify patch module to generate a clean `interface` instead of a
complex type alias (`& (EntityDeletionProvenance | {})`) that
breaks `extends` in `EntityProvenance`.
- Use `DISTINCT ON` in `select_entities_for_deletion` via
`SelectCompiler::add_distinct_selection_with_ordering` instead
of manual draft ID deduplication.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
serde does not support deny_unknown_fields in combination with flatten. The nested flatten chain (EntityProvenance → InferredEntity Provenance → EntityDeletionProvenance) caused "unknown field createdById" during snapshot restore deserialization.
Benchmark results
|
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1
|
Flame Graph |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba
|
Flame Graph |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
🌟 What is the purpose of this PR?
Implements filter-based entity deletion for the graph store with two scopes (Purge and Erase), draft/published partitioning, link constraint validation, temporal provenance, and concurrency-safe locking.
🔗 Related links
🔍 What does this change?
Core API
DeleteEntitiesParamsaccepts aFilter<Entity>,DeletionScope,LinkDeletionBehavior, optionaldecision_time, andinclude_draftsentity_idswithEntityDeletionProvenance(soft delete)entity_idsentirely (hard delete, no trace)select_entities_for_deletion+promote_draft_only_entitiesentity_is_of_type→entity_embeddings→entity_temporal_metadata→entity_editions→entity_drafts→entity_edge→entity_idsConcurrency safety
collect_entity_edition_idsacquiresFOR UPDATEonentity_temporal_metadata, serializing with concurrentpatch_entity(which usesFOR NO KEY UPDATE NOWAIT)lock_entity_ids_for_eraseacquiresFOR UPDATEonentity_ids(erase only), blocking concurrent link creation via FKKEY SHAREconflictPostgresStore<Transaction<'_>>— enforced at the type levelError variants
DeletionError::InvalidDecisionTime— decision time exceeds transaction timeDeletionError::IncomingLinksExist— incoming links block deletionDeletionError::InconsistentEntityIds— entity_ids row count mismatchDeletionError::Store— generic database failurePre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🛡 What tests cover this?
53 integration tests across 5 modules:
❓ How to test this?
cargo test --package hash-graph-postgres-store --test deletion🐾 Next steps
PolicyComponents(TODO indelete_entities)ON DELETE CASCADEonentity_editionsFKs to simplify satellite cleanup for erase scope