feat(reporting): bound PDF compliance report memory and CPU #11160
Open
pedrooot wants to merge 3 commits into
Contributor
✅ Conflict Markers Resolved: All conflict markers have been successfully resolved in this pull request.
Contributor
✅ All necessary
    with _log_phase("failing_phase", scan_id="s-2", framework="FW"):
        raise RuntimeError("boom")

    messages = [r.getMessage() for r in caplog.records]
jfagoagas requested changes on May 14, 2026
Problem
The `scan-compliance-reports` Celery task generates 5 PDFs per scan (ThreatScore, ENS, NIS2, CSA, CIS). On scans with hundreds of thousands of findings it OOM-ed the worker: a single check with thousands of findings forced ReportLab to resolve layout for one giant LongTable, and `FindingOutput.transform_api_finding` ran pydantic v1 validation per finding plus an N+1 lookup on resources/tags. The master function also re-initialised the Prowler provider five times.

Changes
- `_create_findings_tables` returns 300-row sub-tables instead of one large LongTable, so ReportLab layout cost is bounded per Flowable.
- Cap the failed findings rendered per check (configurable via `DJANGO_PDF_MAX_FINDINGS_PER_CHECK`; set to `0` to disable). The PDF shows an in-banner "Showing first 100 of N failed findings" and points the reader to the CSV or JSON export, which are never truncated.
- Push the `only_failed` filter down to SQL: PASS findings are neither loaded from the DB nor pydantic-transformed.
- Use `prefetch_related("resources", "resources__tags")` in `_load_findings_for_requirement_checks`.
- Initialise `prowler_provider` once per batch and propagate it to each framework wrapper instead of re-initialising it 5 times.
- Prune `findings_cache` between frameworks: drop the `check_id`s that no remaining framework needs.
- Use `ROWBACKGROUNDS` (O(1) style entry) instead of N per-row `BACKGROUND` commands in `create_data_table`.
- Add a `_log_phase` context manager emitting `phase_start` / `phase_end` with `scan_id`, `framework`, `elapsed_s`, `rss_kb`, `delta_rss_kb`. Per-framework error paths now use `logger.exception` with `scan_id` / `tenant_id`.
- `output_path` early in `_create_document`.

Steps to review
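As a reviewing aid, the per-check cap and the 300-row sub-table split listed under Changes can be pictured with the dependency-free sketch below. It is not the code in this PR: `cap_findings`, `chunk_rows`, `max_findings_per_check` and `ROWS_PER_SUBTABLE` are hypothetical names, reading the cap from `os.environ` with a default of 100 is an assumption, and the step where each chunk is handed to its own ReportLab table Flowable is left out.

```python
import os
from typing import List, Optional, Sequence, Tuple

# Hypothetical constant mirroring the 300-row sub-table size in the description.
ROWS_PER_SUBTABLE = 300


def max_findings_per_check(default: int = 100) -> int:
    """Read the cap; the env lookup and the default of 100 are assumptions."""
    return int(os.environ.get("DJANGO_PDF_MAX_FINDINGS_PER_CHECK", default))


def cap_findings(
    findings: Sequence[dict], limit: int
) -> Tuple[List[dict], Optional[str]]:
    """Return the findings to render plus an optional truncation banner.

    A limit of 0 (or less) disables truncation, as the description states.
    """
    total = len(findings)
    if limit <= 0 or total <= limit:
        return list(findings), None
    banner = f"Showing first {limit} of {total} failed findings"
    return list(findings[:limit]), banner


def chunk_rows(
    rows: Sequence[list], size: int = ROWS_PER_SUBTABLE
) -> List[List[list]]:
    """Split rows into consecutive chunks of at most `size` rows.

    Each chunk would become its own table, so layout work is bounded per chunk.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(rows[i:i + size]) for i in range(0, len(rows), size)]
```

With 250 failed findings and a limit of 100, `cap_findings` returns the first 100 plus the banner text; with 650 rows, `chunk_rows` yields chunks of 300, 300 and 50 rows.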
Please add a detailed description of how to review this PR.
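One way to reason about the `_log_phase` change while reviewing is a minimal stand-in like the one below. It is a sketch under assumptions, not the helper from this PR: the name `log_phase_sketch`, the message layout, the use of `resource.getrusage` (peak RSS, reported in KB on Linux) and the choice to emit `phase_end` even when the block raises are all illustrative guesses.

```python
import logging
import time
from contextlib import contextmanager

try:
    import resource  # Unix only
except ImportError:  # e.g. Windows: fall back to reporting 0
    resource = None

logger = logging.getLogger("pdf_report_sketch")


def _rss_kb() -> int:
    """Peak RSS of this process; ru_maxrss is KB on Linux, bytes on macOS."""
    if resource is None:
        return 0
    return int(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)


@contextmanager
def log_phase_sketch(phase: str, **context):
    """Log phase_start / phase_end around a block (hypothetical helper).

    Assumption: phase_end is emitted from `finally`, so it also appears
    when the wrapped block raises; the exception still propagates.
    """
    ctx = " ".join(f"{key}={value}" for key, value in sorted(context.items()))
    start = time.monotonic()
    rss_before = _rss_kb()
    logger.info("phase_start phase=%s %s", phase, ctx)
    try:
        yield
    finally:
        rss_after = _rss_kb()
        logger.info(
            "phase_end phase=%s %s elapsed_s=%.3f rss_kb=%d delta_rss_kb=%d",
            phase,
            ctx,
            time.monotonic() - start,
            rss_after,
            rss_after - rss_before,
        )
```

Wrapping a block in `with log_phase_sketch("build_pdf", scan_id="s-1", framework="ENS"):` emits two INFO records on the `pdf_report_sketch` logger, one before and one after the block.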
Checklist
Community Checklist
SDK/CLI
UI
API
License
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.