Skip to content

[Repo Assist] perf: use bottom-up construction in Heap.ofSeq, extract mergeData helper#242

Draft
github-actions[bot] wants to merge 2 commits intomasterfrom
repo-assist/perf-heap-ofseq-2026-03-11-ccea0939034d3b00
Draft

[Repo Assist] perf: use bottom-up construction in Heap.ofSeq, extract mergeData helper#242
github-actions[bot] wants to merge 2 commits intomasterfrom
repo-assist/perf-heap-ofseq-2026-03-11-ccea0939034d3b00

Conversation

@github-actions
Copy link
Contributor

🤖 This PR was created by Repo Assist, an automated AI assistant.

Addresses the performance issue described in #166 ("Heap.Tail slow").

Root cause

The previous ofSeq built the heap by a sequential left-to-right fold, treating each new element as a merge with the current root. For sorted ascending input on a min-heap, every new element is larger than the root, so it is prepended as a direct child of the root. After inserting n elements this creates:

T(min, [T(n,[]);  T(n-1,[]); …; T(2,[])])
```

The root has O(n) direct children. The first call to `Tail()` (delete-min) must pair and merge all _n−1_ children, costing **O(n)** for that first call instead of the O(log n) amortised bound expected by users.

## Fix

Replace the sequential fold with **bottom-up heap construction**:

1. Create _n_ singleton heaps from the input array.
2. Repeatedly merge adjacent pairs; each pass halves the count.
3. After ⌈log₂ n⌉ passes the result is a single balanced heap of height O(log n).

The first `Tail()` on this balanced heap costs O(log n) regardless of input ordering.

This is the same bottom-up technique used for heapsort and efficient priority-queue initialisation.

## Additional refactor

The `mergeData` logic (merging two `HeapData` values) was duplicated as a local function inside `Tail()`. It is now extracted as a **private static helper** (`Heap.mergeData`) shared by both `ofSeq` and `Tail()`, and the existing `merge` wrapper delegates to it.

## Trade-offs

| Aspect | Before | After |
|--------|--------|-------|
| First `Tail()` on sorted input | O(n) | O(log n) |
| `ofSeq` allocation | O(n) list nodes | O(n) array slots + O(n) list nodes (same asymptote) |
| Streaming input | Yes (lazy fold) | No — `Array.ofSeq` materialises first |
| Code clarity | Inline `mergeData` duplicated | Shared helper |

The materialisation cost of `Array.ofSeq` is O(n) time and space — the same as the existing fold — so the net allocation is unchanged. For users who do not call `Tail()` (e.g., only `Head`/`Peek`), the build is equally fast.

## Test Status

All existing tests pass: **706 passed, 0 failed** (6 skipped, pre-existing).

```
Passed!  - Failed: 0, Passed: 706, Skipped: 6, Total: 712

Format check (dotnet fantomas --check) passes with no changes needed.

Closes #166

Generated by Repo Assist ·

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@346204513ecfa08b81566450d7d599556807389f

The previous ofSeq implementation built the heap by sequential left-to-right
merges. For sorted input (ascending for a min-heap), every new element was
appended as a direct child of the root, producing a root with O(n) children
after n insertions. The first Tail() call then had to merge all those children,
costing O(n) instead of the amortised O(log n) expected by users (see #166).

New approach: build n singleton heaps from the input array and repeatedly
merge adjacent pairs (bottom-up construction). Each pass halves the number of
heaps; log₂(n) passes produce a balanced heap whose height is O(log n). The
first Tail() on such a heap costs O(log n) regardless of the input ordering.

Also refactors the inline mergeData function (previously duplicated inside
Tail()) into a private static helper Heap.mergeData, used by both Tail() and
the new ofSeq, and simplifies the static merge wrapper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Heap.Tail slow

0 participants