doc: add serverless doc with keda and activator. #499
X1aoZEOuO wants to merge 2 commits into InftyAI:main
|
/kind feature |
|
/kind documentation |
```bash
helm install llmaz oci://registry-1.docker.io/inftyai/llmaz --namespace llmaz-system --create-namespace --version 0.0.10
make install-keda
```
kerthcet
left a comment
Please explain the relationship between activator and keda at the very beginning. Thanks!
    any: true
  selector:
    matchLabels:
      llmaz.io/model-name: qwen2-0--5b
I think we should not only capture this service right? Let's add some explanations here.
Thanks! I've enhanced both the Prometheus and KEDA configuration sections with detailed explanations
@kerthcet Thank you for the feedback! I've added a new section "Relationship Between Activator and KEDA". This section now clearly explains:
|
Signed-off-by: X1aoZEOuO <nizefeng2002@outlook.com>
Signed-off-by: X1aoZEOuO <nizefeng2002@outlook.com>
354e4ee to 56fb1ac
|
@pacoxu @kerthcet The Helm CI job seems to have failed because of a network error. Can we disable or ignore it for now? https://github.com/InftyAI/llmaz/actions/runs/18917273800/job/54003859685?pr=499 |
|
/retest |
@kerthcet Hello, it seems the e2e test ran out of disk space: https://github.com/InftyAI/llmaz/actions/runs/18917278030/job/54003874165?pr=500
[FAILED] in [It] - /home/runner/work/llmaz/llmaz/test/util/validation/validate_playground.go:219 @ 10/29/25 18:02:24.453
[FAILED] in [AfterEach] - /home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:50 @ 10/29/25 18:02:24.453
• [FAILED] [335.923 seconds]
playground e2e tests [It] SpeculativeDecoding with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:145
[FAILED] Timed out after 335.612s.
Expected success, but got an error:
<*url.Error | 0xc000712900>:
Error: No space left on device : '/home/runner/actions-runner/cached/_diag/pages/7ae5050e-5137-471d-b700-9b1bd0d8553b_338ff102-8e76-46a1-a5ae-f669195390f6_1.log' |
@kerthcet And the Helm installation step did not succeed: https://github.com/InftyAI/llmaz/actions/runs/18917273800/job/54003859685?pr=499
Installing v3.17.3
Downloading 'v3.17.3' from 'https://get.helm.sh/'
Request timeout: /helm-v3.17.3-linux-amd64.tar.gz
Waiting 20 seconds before trying again
Request timeout: /helm-v3.17.3-linux-amd64.tar.gz
Waiting 14 seconds before trying again
Error: Error: Failed to download Helm from location https://get.helm.sh/helm-v3.17.3-linux-amd64.tar.gz |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
@pacoxu The CI issue appears to have popped up over the last couple of weeks, likely resulting from recent changes to the GitHub environment. I have noted the issue and will take a look at it in the next PR. #498 (comment) |
|
/retest |
#508: see https://github.com/InftyAI/llmaz/actions/runs/18964011867/job/54157010420
/retest
I opened kerthcet/github-workflow-as-kube#15 to fix the CI. (After that, llmaz should bump the workflow version.) |
What this PR does / why we need it
This PR adds a guide for configuring serverless environments on Kubernetes, integrating Prometheus for monitoring and KEDA for autoscaling. The guide includes YAML configurations, step-by-step installation instructions, and performance benchmarks. It aims to improve resource efficiency through event-driven scaling while maintaining observability and resilience for AI/ML workloads and other latency-sensitive applications.
Which issue(s) this PR fixes
Fixes #362
Special notes for your reviewer
Does this PR introduce a user-facing change?
cc @pacoxu @kerthcet