Skip to content

Improve built-in test used during system startup#1544

Open
adamoutler wants to merge 11 commits intonetalertx:mainfrom
adamoutler:built-in-tests
Open

Improve built-in test used during system startup#1544
adamoutler wants to merge 11 commits intonetalertx:mainfrom
adamoutler:built-in-tests

Conversation

@adamoutler
Copy link
Member

@adamoutler adamoutler commented Mar 4, 2026

This PR contains several improvements for online BIT testing.

  • Fix healthcheck for non-0.0.0.0:

    • Updated the healthcheck to pass as long as the service is reachable, improving compatibility with different networking configurations.
  • Startup Script Robustness:

    • Improved entrypoint.sh to use stricter matching for SKIP_STARTUP_CHECKS, preventing accidental skipping of scripts with similar names (e.g., matching "foo" when "foobar" was intended).
  • ARP Flux & Sysctl Handling:

    • Changed approach to detect rather than modify sysctls. The container now warns if ARP flux settings (arp_ignore, arp_announce) are suboptimal but leaves modification to the user via Docker/Compose configuration.
    • Added detailed documentation and remediation steps to arp-flux-sysctls.md.
  • Configuration & Plugin Logic:

    • Fixed a bug in 36-override-individual-settings.sh where the sed delimiter collision caused LOADED_PLUGINS overrides to fail if the value contained |.
  • Test Stability:

    • Fixed test_devices_totals in test_devices_endpoints.py to gracefully handle cases where no device conditions exist, preventing IndexError.
    • Reduced timeout for normal mode test checks to expedite testing.

Summary by CodeRabbit

  • Configuration

    • Added ARP-flux sysctl settings to Docker Compose and deployment compose files to improve network behavior.
  • Documentation

    • New troubleshooting guide explaining ARP flux, risks, and remediation steps with examples.
  • Improvements

    • Startup now validates ARP sysctls and emits a concise warning if unset; port health checks use TCP connectivity and startup skip matching is more robust.
  • Tests

    • Expanded tests covering startup checks, sysctl warnings, Docker compose scenarios, and API/device flows.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 4, 2026

📝 Walkthrough

Walkthrough

Adds ARP‑flux sysctls to Compose files and docs, changes entrypoint scripts to validate (not apply) those sysctls and emit a concise warning if missing, updates healthcheck to use TCP, and adjusts tests to tolerate/verify the new sysctl-warning and timing behavior.

Changes

Cohort / File(s) Summary
Docker Compose configs
docker-compose.yml, install/docker/docker-compose.yml, install/docker/docker-compose.dev.yml
Added sysctls block to the netalertx service setting net.ipv4.conf.all.arp_ignore: 1 and net.ipv4.conf.all.arp_announce: 2.
Documentation
docs/DOCKER_COMPOSE.md, docs/docker-troubleshooting/arp-flux-sysctls.md, mkdocs.yml
Added ARP‑flux troubleshooting doc and linked it into docs navigation; documents remediation via docker-compose and docker run.
Entrypoint scripts
install/production-filesystem/entrypoint.d/36-override-individual-settings.sh, install/production-filesystem/entrypoint.d/37-host-optimization.sh, install/production-filesystem/entrypoint.sh
Removed helper and apply logic; replaced proactive sysctl application with validation-only checks that emit a concise warning if values differ; changed startup-skip detection to fixed-string word matching (grep -wFq).
Health check
install/production-filesystem/services/healthcheck.sh
Replaced HTTP content check (curl+grep) with a TCP connectivity check via /dev/tcp (10s timeout); updated success/failure messages.
Tests — API & MCP
test/api_endpoints/test_devices_endpoints.py, test/api_endpoints/test_mcp_tools_endpoints.py
Added create_dummy/delete_dummy helpers; refactored tests to use try/finally cleanup; moved some tests to real API interactions and added MCP traceroute tests.
Tests — Docker / entrypoint
test/docker_tests/test_docker_compose_scenarios.py, test/docker_tests/test_entrypoint.py
Extended port-check hook signature with timing params and added SKIP_STARTUP_CHECKS env support; tests now allow ARP‑flux warning and include a test asserting entrypoint warning matches host sysctl state.

Possibly related PRs

Poem

🐰
I twitch my nose at ARP by night,
I set no nets, I only write,
If hosts are right I sit and hum,
If not — I warn and beat my drum,
A careful rabbit guarding the byte.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 35.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Improve built-in test used during system startup' accurately captures the main focus of the PR, which involves improvements to startup behavior, health checks, and built-in tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this changed? was tehre an issue with the original implementation? it was written to accomodate for future expansion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it was a pointless function to set a single variable. It can be reintroduced if there is more than one variable to be set. Right now it makes it harder to read.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (5)
test/api_endpoints/test_devices_endpoints.py (1)

198-203: Assert the favorite-update API response before validating downstream state.

A quick assertion here improves failure locality when the update call fails or is rejected.

Proposed fix
-        client.post(
+        update_resp = client.post(
             f"/device/{test_mac}",
             json={"devFavorite": 1},
             headers=auth_headers(api_token)
         )
+        assert update_resp.status_code == 200
+        assert update_resp.json.get("success") is True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/api_endpoints/test_devices_endpoints.py` around lines 198 - 203, The
test makes a POST to "/device/{test_mac}" to update devFavorite but doesn't
assert that call succeeded before fetching downstream state; update the test to
capture the POST response (client.post(...)) into a variable (e.g., resp_update)
and assert its status_code (and optional response body/JSON) indicates success
before calling client.get("/devices/by-status?...") and assigning resp_fav so
failures are localized to the update step; reference the existing client.post
call and resp_fav usage when adding the assertion.
test/api_endpoints/test_mcp_tools_endpoints.py (1)

379-382: Replace the magic minimum-length assertion with condition-derived length.

len(data) >= 4 is too loose and can miss regressions. Assert against the actual number of totals conditions.

Proposed fix
 from api_server.api_server_start import app
 from helper import get_setting_value
+from db.db_helper import get_device_conditions
@@
-        # Should return device counts as array
+        # Should return device counts array with one entry per condition
         assert isinstance(data, list)
-        assert len(data) >= 4  # At least online, offline, etc.
+        assert len(data) == len(get_device_conditions())
         assert data[0] >= 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/api_endpoints/test_mcp_tools_endpoints.py` around lines 379 - 382,
Replace the loose magic check "assert len(data) >= 4" with an exact length
assertion derived from the source of truth for totals conditions; e.g. import or
call the totals conditions used by the endpoint (use the constant or function
name from the code under test such as TOTALS_CONDITIONS or
get_totals_conditions()) and assert "assert len(data) == len(TOTALS_CONDITIONS)"
(or "== len(get_totals_conditions())"), keeping the other assertions (isinstance
and data[0] >= 1) intact.
install/production-filesystem/services/healthcheck.sh (1)

54-54: Use a probe timeout below Docker healthcheck timeout to avoid edge-timeout races.

Line 54 uses timeout 10, which matches Docker's --timeout=10s (line 245 of Dockerfile). When both timeouts are equal, a slow probe (e.g., 9.5s) risks Docker's timeout firing before the script exits gracefully. Lower the script timeout to 8s to create a reliable safety margin.

Suggested change
-if timeout 10 bash -c "</dev/tcp/${CHECK_ADDR}/${PORT:-20211}" 2>/dev/null; then
+if timeout 8 bash -c "</dev/tcp/${CHECK_ADDR}/${PORT:-20211}" 2>/dev/null; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@install/production-filesystem/services/healthcheck.sh` at line 54, The probe
timeout in the healthcheck script uses `timeout 10` which matches Docker's
--timeout=10s and can cause race conditions; update the invocation that checks
the TCP socket (the line using `timeout 10 bash -c
"</dev/tcp/${CHECK_ADDR}/${PORT:-20211}"`) to use `timeout 8` instead so the
script exits faster than Docker's timeout and avoids edge-timeout races.
test/docker_tests/test_docker_compose_scenarios.py (1)

351-351: Avoid defaulting SKIP_STARTUP_CHECKS in the shared compose helper.

Line 351 applies the skip to every normal-startup scenario, which reduces coverage of host-optimization checks unless each caller explicitly overrides it.

Suggested change
-    service_env.setdefault("SKIP_STARTUP_CHECKS", "host optimization")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/docker_tests/test_docker_compose_scenarios.py` at line 351, Remove the
global defaulting of SKIP_STARTUP_CHECKS in the shared compose helper by
deleting the setdefault call (service_env.setdefault("SKIP_STARTUP_CHECKS",
"host optimization")); instead, have the helper leave SKIP_STARTUP_CHECKS unset
and require callers that need to skip host-optimization checks to explicitly set
service_env["SKIP_STARTUP_CHECKS"] = "host optimization" before invoking the
helper (or pass an explicit env override argument), and update any tests that
relied on the implicit default to explicitly opt into skipping.
docs/DOCKER_COMPOSE.md (1)

33-35: Add a direct troubleshooting pointer next to this snippet.

A short inline note linking to docker-troubleshooting/arp-flux-sysctls.md here would reduce context-switching for users applying these settings.

Doc polish suggestion
-    sysctls:                                        # ARP flux mitigation (reduces duplicate/ambiguous ARP behavior on host networking)
+    sysctls:                                        # ARP flux mitigation (see docker-troubleshooting/arp-flux-sysctls.md)
       net.ipv4.conf.all.arp_ignore: 1
       net.ipv4.conf.all.arp_announce: 2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/DOCKER_COMPOSE.md` around lines 33 - 35, The sysctls snippet (sysctls
block with net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce)
needs a one-line troubleshooting pointer—add a short inline note directly below
the snippet that links to the ARP flux troubleshooting doc (arp-flux-sysctls.md)
and briefly describes when to use it (e.g., "If you see duplicate/ambiguous ARP
behavior, see arp-flux-sysctls.md for troubleshooting steps"). Ensure the link
is a relative markdown link and the note is concise and adjacent to the sysctls
example.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/docker-troubleshooting/arp-flux-sysctls.md`:
- Around line 37-41: The docker run example currently references the
non-canonical image name "jokob-sk/netalertx:latest"; update that example to use
the repository's canonical image/tag used elsewhere in this PR (replace
"jokob-sk/netalertx:latest" with the canonical image string used in the compose
files) so examples are consistent and will pull the intended image, keeping the
rest of the command (the --sysctl flags and args) unchanged.

In `@install/docker/docker-compose.dev.yml`:
- Around line 16-18: The docker-compose service sets network_mode: host which
forbids net.* sysctls, so remove the sysctls block (the keys
net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce) from the
Compose service or alternatively stop using network_mode: host and revert to
bridge networking; if you must keep host networking, move those sysctl settings
to the host OS (e.g., /etc/sysctl.d/) rather than declaring them in
docker-compose.

In `@test/api_endpoints/test_mcp_tools_endpoints.py`:
- Around line 25-38: The helpers create_dummy and delete_dummy currently ignore
API responses; update them to validate the client responses from client.post and
client.delete (in create_dummy and delete_dummy respectively) by asserting on
expected status codes (e.g., 200/201 or 204) and raising or failing the test
with the response body when the status is unexpected; also return the response
object so callers can inspect it if needed and continue to use
auth_headers(api_token) for auth in both functions.

In `@test/docker_tests/test_docker_compose_scenarios.py`:
- Around line 899-906: The current assertion allows lines that contain
allowed_warning to also include unexpected messages; update the checks on
default_output so that for each line you assert either the line contains no "⚠️"
or, if it contains allowed_warning, it does not also contain any unexpected
tokens like "CRITICAL" or "Write permission denied". Concretely, adjust the
generator that currently iterates over default_output.splitlines() (using
allowed_warning and default_output) to: for each line, ensure "Write permission
denied" and "CRITICAL" are not in the line, and if allowed_warning in line then
assert that none of those unexpected substrings appear in that same line (i.e.,
disallow mixed-content lines containing allowed_warning plus other error/warning
markers).

In `@test/docker_tests/test_entrypoint.py`:
- Around line 99-107: The test invokes subprocess.run with the "sysctl"
executable for ignore_proc and announce_proc which triggers Ruff S607; use
shutil.which to resolve the absolute path to sysctl before calling
subprocess.run, and if shutil.which returns None gracefully skip the test (e.g.,
pytest.skip) instead of calling subprocess.run; update the calls that reference
"sysctl" (the subprocess.run invocations that set ignore_proc and announce_proc)
to use the resolved path variable and add a short guard that skips when
sysctl_path is falsy.

---

Nitpick comments:
In `@docs/DOCKER_COMPOSE.md`:
- Around line 33-35: The sysctls snippet (sysctls block with
net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce) needs a
one-line troubleshooting pointer—add a short inline note directly below the
snippet that links to the ARP flux troubleshooting doc (arp-flux-sysctls.md) and
briefly describes when to use it (e.g., "If you see duplicate/ambiguous ARP
behavior, see arp-flux-sysctls.md for troubleshooting steps"). Ensure the link
is a relative markdown link and the note is concise and adjacent to the sysctls
example.

In `@install/production-filesystem/services/healthcheck.sh`:
- Line 54: The probe timeout in the healthcheck script uses `timeout 10` which
matches Docker's --timeout=10s and can cause race conditions; update the
invocation that checks the TCP socket (the line using `timeout 10 bash -c
"</dev/tcp/${CHECK_ADDR}/${PORT:-20211}"`) to use `timeout 8` instead so the
script exits faster than Docker's timeout and avoids edge-timeout races.

In `@test/api_endpoints/test_devices_endpoints.py`:
- Around line 198-203: The test makes a POST to "/device/{test_mac}" to update
devFavorite but doesn't assert that call succeeded before fetching downstream
state; update the test to capture the POST response (client.post(...)) into a
variable (e.g., resp_update) and assert its status_code (and optional response
body/JSON) indicates success before calling client.get("/devices/by-status?...")
and assigning resp_fav so failures are localized to the update step; reference
the existing client.post call and resp_fav usage when adding the assertion.

In `@test/api_endpoints/test_mcp_tools_endpoints.py`:
- Around line 379-382: Replace the loose magic check "assert len(data) >= 4"
with an exact length assertion derived from the source of truth for totals
conditions; e.g. import or call the totals conditions used by the endpoint (use
the constant or function name from the code under test such as TOTALS_CONDITIONS
or get_totals_conditions()) and assert "assert len(data) ==
len(TOTALS_CONDITIONS)" (or "== len(get_totals_conditions())"), keeping the
other assertions (isinstance and data[0] >= 1) intact.

In `@test/docker_tests/test_docker_compose_scenarios.py`:
- Line 351: Remove the global defaulting of SKIP_STARTUP_CHECKS in the shared
compose helper by deleting the setdefault call
(service_env.setdefault("SKIP_STARTUP_CHECKS", "host optimization")); instead,
have the helper leave SKIP_STARTUP_CHECKS unset and require callers that need to
skip host-optimization checks to explicitly set
service_env["SKIP_STARTUP_CHECKS"] = "host optimization" before invoking the
helper (or pass an explicit env override argument), and update any tests that
relied on the implicit default to explicitly opt into skipping.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba26f34 and 5c0f29b.

📒 Files selected for processing (14)
  • docker-compose.yml
  • docs/DOCKER_COMPOSE.md
  • docs/docker-troubleshooting/arp-flux-sysctls.md
  • install/docker/docker-compose.dev.yml
  • install/docker/docker-compose.yml
  • install/production-filesystem/entrypoint.d/36-override-individual-settings.sh
  • install/production-filesystem/entrypoint.d/37-host-optimization.sh
  • install/production-filesystem/entrypoint.sh
  • install/production-filesystem/services/healthcheck.sh
  • mkdocs.yml
  • test/api_endpoints/test_devices_endpoints.py
  • test/api_endpoints/test_mcp_tools_endpoints.py
  • test/docker_tests/test_docker_compose_scenarios.py
  • test/docker_tests/test_entrypoint.py

adamoutler and others added 2 commits March 3, 2026 23:20
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/docker_tests/test_entrypoint.py (1)

8-10: ⚠️ Potential issue | 🔴 Critical

Fix merge artifacts causing parse failure and missing import.

The file has three blocking issues:

  1. Missing import shutil at the top (line 8-10) – used at line 101 via shutil.which(), will cause NameError
  2. Duplicate function definition (lines 97-98) – both defining test_host_optimization_warning_matches_sysctl()
  3. Orphaned indented arguments block (lines 119-123) – duplicate capture_output, text, check, timeout appearing after the announce_proc subprocess call already has its closing parenthesis, causing IndentationError: unexpected indent
Proposed fix
 import subprocess
+import shutil
 import uuid
 import pytest
@@
 `@pytest.mark.docker`
 `@pytest.mark.feature_complete`
 def test_host_optimization_warning_matches_sysctl():
     """Validate host-optimization warning matches actual host sysctl values."""
-def test_host_optimization_warning_matches_sysctl():
-    """Validate host-optimization warning matches actual host sysctl values."""
     sysctl_bin = shutil.which("sysctl")
     if not sysctl_bin:
         pytest.skip("sysctl binary not found on host; skipping host-optimization warning check")
@@
     announce_proc = subprocess.run(
         [sysctl_bin, "-n", "net.ipv4.conf.all.arp_announce"],
         capture_output=True,
         text=True,
         check=False,
         timeout=10,
     )
-        capture_output=True,
-        text=True,
-        check=False,
-        timeout=10,
-    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/docker_tests/test_entrypoint.py` around lines 8 - 10, Add the missing
import shutil at the top of the file, remove the duplicate test function
definition for test_host_optimization_warning_matches_sysctl leaving a single
correct implementation, and fix the orphaned indented argument block after the
announce_proc subprocess call by either removing the stray indented lines or
moving the arguments (capture_output, text, check, timeout) into the
announce_proc invocation so the subprocess.run call is syntactically correct.
🧹 Nitpick comments (1)
docs/docker-troubleshooting/arp-flux-sysctls.md (1)

42-45: Consider adding a host-network namespace side-effect warning.

Suggest one short sentence noting that with network_mode: host and NET_ADMIN, these sysctl changes can apply to the host namespace directly.

Based on learnings: When a Docker Compose service uses network_mode: host and also has NET_ADMIN capability, sysctls like net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce are accepted in practice and modify the host's network namespace directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/docker-troubleshooting/arp-flux-sysctls.md` around lines 42 - 45, Add a
one-sentence warning after the paragraph about using --privileged or
--cap-add=NET_ADMIN indicating that if a container uses network_mode: host (or
docker-compose network_mode: host) while also granting NET_ADMIN, sysctl changes
like net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce may be
applied to and modify the host network namespace directly; reference the sysctl
names and the network_mode: host + NET_ADMIN combination so readers can locate
and understand the side effect.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/docker-troubleshooting/arp-flux-sysctls.md`:
- Around line 35-42: The fenced bash code block that starts with "```bash" and
contains the docker run command (including the --sysctl
net.ipv4.conf.all.arp_ignore=1 and --sysctl net.ipv4.conf.all.arp_announce=2
lines) is not closed; add a closing "```" immediately after the
ghcr.io/netalertx/netalertx:latest line so the following Note block renders as
normal Markdown outside the code fence.

---

Outside diff comments:
In `@test/docker_tests/test_entrypoint.py`:
- Around line 8-10: Add the missing import shutil at the top of the file, remove
the duplicate test function definition for
test_host_optimization_warning_matches_sysctl leaving a single correct
implementation, and fix the orphaned indented argument block after the
announce_proc subprocess call by either removing the stray indented lines or
moving the arguments (capture_output, text, check, timeout) into the
announce_proc invocation so the subprocess.run call is syntactically correct.

---

Nitpick comments:
In `@docs/docker-troubleshooting/arp-flux-sysctls.md`:
- Around line 42-45: Add a one-sentence warning after the paragraph about using
--privileged or --cap-add=NET_ADMIN indicating that if a container uses
network_mode: host (or docker-compose network_mode: host) while also granting
NET_ADMIN, sysctl changes like net.ipv4.conf.all.arp_ignore and
net.ipv4.conf.all.arp_announce may be applied to and modify the host network
namespace directly; reference the sysctl names and the network_mode: host +
NET_ADMIN combination so readers can locate and understand the side effect.

ℹ️ Review info
Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 70bffff2-44ca-42bf-8144-25ef2d1a0659

📥 Commits

Reviewing files that changed from the base of the PR and between 5c0f29b and a532c98.

📒 Files selected for processing (2)
  • docs/docker-troubleshooting/arp-flux-sysctls.md
  • test/docker_tests/test_entrypoint.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/api_endpoints/test_devices_endpoints.py (1)

46-48: Add response validation to delete_dummy for consistency with test_mcp_tools_endpoints.py.

This helper silently discards the API response, which could mask cleanup failures and lead to test pollution. The same helper in test_mcp_tools_endpoints.py (lines 41-47) correctly validates the response status.

Proposed fix to add response validation
 def delete_dummy(client, api_token, test_mac):
-    client.delete("/devices", json={"macs": [test_mac]}, headers=auth_headers(api_token))
+    response = client.delete("/devices", json={"macs": [test_mac]}, headers=auth_headers(api_token))
+    assert response.status_code == 200, (
+        f"Expected status 200 for device deletion, got {response.status_code}. "
+        f"Response body: {response.get_data(as_text=True)}"
+    )
+    return response
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/api_endpoints/test_devices_endpoints.py` around lines 46 - 48, The
helper delete_dummy currently calls client.delete("/devices", json={"macs":
[test_mac]}, headers=auth_headers(api_token)) and discards the response; change
it to capture the response and assert the expected status (e.g.,
response.status_code == 200 or use response.raise_for_status()/assert
response.ok) to mirror the validation used in test_mcp_tools_endpoints.py so
cleanup failures surface; update the delete_dummy function to return or assert
on the response object from client.delete and reference delete_dummy and
auth_headers to locate the change.
docs/docker-troubleshooting/arp-flux-sysctls.md (1)

43-45: Add host-namespace impact caveat to the privilege note.

Line 43 explains permission failures, but it should also state that in some setups (notably host networking with NET_ADMIN), these sysctls can apply to the host network namespace.

📝 Suggested doc tweak
 > **Note:** Setting `net.ipv4.conf.all.arp_ignore` and `net.ipv4.conf.all.arp_announce` may fail with "operation not permitted" unless the container is run with elevated privileges. To resolve this, you can:
 > - Use `--privileged` with `docker run`.
 > - Use the more restrictive `--cap-add=NET_ADMIN` (or `cap_add: [NET_ADMIN]` in `docker-compose` service definitions) to allow the sysctls to be applied at runtime.
+> - Be aware: with host networking and sufficient capabilities, these values may be applied to the host network namespace.

Based on learnings: when network_mode: host is combined with NET_ADMIN, ARP sysctls can be accepted and modify the host namespace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/docker-troubleshooting/arp-flux-sysctls.md` around lines 43 - 45, Add a
brief caveat to the privilege note stating that if a container is run with host
networking (e.g., network_mode: host or docker run --network=host) and NET_ADMIN
capability is granted, applying sysctls like net.ipv4.conf.all.arp_ignore and
net.ipv4.conf.all.arp_announce may succeed but will affect the host network
namespace; reference the specific sysctls (net.ipv4.conf.all.arp_ignore,
net.ipv4.conf.all.arp_announce) and the flags (--privileged, --cap-add=NET_ADMIN
/ cap_add: [NET_ADMIN], network_mode: host) so readers know this can modify host
settings and to avoid/unintentionally alter host network behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/docker_tests/test_entrypoint.py`:
- Around line 126-134: The test currently parses entrypoint output even when the
container run may have failed; add an explicit assertion that _run_entrypoint
succeeded by checking result.returncode == 0 (use the existing result variable
from _run_entrypoint) before combining/parsing result.stdout and result.stderr,
and include the return code and output in the assertion message to aid debugging
if the container run failed.

---

Nitpick comments:
In `@docs/docker-troubleshooting/arp-flux-sysctls.md`:
- Around line 43-45: Add a brief caveat to the privilege note stating that if a
container is run with host networking (e.g., network_mode: host or docker run
--network=host) and NET_ADMIN capability is granted, applying sysctls like
net.ipv4.conf.all.arp_ignore and net.ipv4.conf.all.arp_announce may succeed but
will affect the host network namespace; reference the specific sysctls
(net.ipv4.conf.all.arp_ignore, net.ipv4.conf.all.arp_announce) and the flags
(--privileged, --cap-add=NET_ADMIN / cap_add: [NET_ADMIN], network_mode: host)
so readers know this can modify host settings and to avoid/unintentionally alter
host network behavior.

In `@test/api_endpoints/test_devices_endpoints.py`:
- Around line 46-48: The helper delete_dummy currently calls
client.delete("/devices", json={"macs": [test_mac]},
headers=auth_headers(api_token)) and discards the response; change it to capture
the response and assert the expected status (e.g., response.status_code == 200
or use response.raise_for_status()/assert response.ok) to mirror the validation
used in test_mcp_tools_endpoints.py so cleanup failures surface; update the
delete_dummy function to return or assert on the response object from
client.delete and reference delete_dummy and auth_headers to locate the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e2db0b3a-c399-428a-8a37-f5a12be35ae7

📥 Commits

Reviewing files that changed from the base of the PR and between a532c98 and b854206.

📒 Files selected for processing (6)
  • docs/docker-troubleshooting/arp-flux-sysctls.md
  • install/production-filesystem/entrypoint.d/36-override-individual-settings.sh
  • install/production-filesystem/entrypoint.d/37-host-optimization.sh
  • test/api_endpoints/test_devices_endpoints.py
  • test/api_endpoints/test_mcp_tools_endpoints.py
  • test/docker_tests/test_entrypoint.py

Comment on lines +126 to +134
result = _run_entrypoint(check_only=True)
combined_output = result.stdout + result.stderr
warning_present = "WARNING: ARP flux sysctls are not set." in combined_output

assert warning_present == expected_warning, (
"host-optimization warning mismatch: "
f"arp_ignore={arp_ignore}, arp_announce={arp_announce}, "
f"expected_warning={expected_warning}, warning_present={warning_present}"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Assert container run success before evaluating warning output.

Line 126 currently proceeds even if docker run fails, which can make this test pass/fail for the wrong reason. Add an explicit return-code assertion before parsing output.

✅ Suggested fix
     result = _run_entrypoint(check_only=True)
+    assert result.returncode == 0, (
+        f"entrypoint check-only run failed: rc={result.returncode}, "
+        f"stdout={result.stdout!r}, stderr={result.stderr!r}"
+    )
     combined_output = result.stdout + result.stderr
     warning_present = "WARNING: ARP flux sysctls are not set." in combined_output
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
result = _run_entrypoint(check_only=True)
combined_output = result.stdout + result.stderr
warning_present = "WARNING: ARP flux sysctls are not set." in combined_output
assert warning_present == expected_warning, (
"host-optimization warning mismatch: "
f"arp_ignore={arp_ignore}, arp_announce={arp_announce}, "
f"expected_warning={expected_warning}, warning_present={warning_present}"
)
result = _run_entrypoint(check_only=True)
assert result.returncode == 0, (
f"entrypoint check-only run failed: rc={result.returncode}, "
f"stdout={result.stdout!r}, stderr={result.stderr!r}"
)
combined_output = result.stdout + result.stderr
warning_present = "WARNING: ARP flux sysctls are not set." in combined_output
assert warning_present == expected_warning, (
"host-optimization warning mismatch: "
f"arp_ignore={arp_ignore}, arp_announce={arp_announce}, "
f"expected_warning={expected_warning}, warning_present={warning_present}"
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/docker_tests/test_entrypoint.py` around lines 126 - 134, The test
currently parses entrypoint output even when the container run may have failed;
add an explicit assertion that _run_entrypoint succeeded by checking
result.returncode == 0 (use the existing result variable from _run_entrypoint)
before combining/parsing result.stdout and result.stderr, and include the return
code and output in the assertion message to aid debugging if the container run
failed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants