Skip to content

feat! add postgre and fabric mirror#117

Open
mswantek68 wants to merge 13 commits intomainfrom
feat/addPostgreFabricMirror
Open

feat! add postgre and fabric mirror#117
mswantek68 wants to merge 13 commits intomainfrom
feat/addPostgreFabricMirror

Conversation

@mswantek68
Copy link
Collaborator

Purpose

Replace sub module location, add postgreSQL provisioning, add Fabric mirror of PostgreSQL DB

Does this introduce a breaking change?

  • [ X] Yes
  • No

Golden Path Validation

  • [ x] I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

  • [ x] I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid
Follow post_deployment_steps.md for listing

Other Information

As reviewed with team on 3-10-26

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the AI Landing Zone submodule source and reworks provisioning/automation to add PostgreSQL Flexible Server provisioning plus Fabric mirroring support, while improving resiliency of post-provision scripts (outputs-based configuration, retries, and safer error handling).

Changes:

  • Switch AI Landing Zone submodule to a new upstream repo and change preprovision to deploy the submodule directly via az deployment group create.
  • Add PostgreSQL Flexible Server provisioning in the wrapper Bicep, plus PowerShell automation to prep PostgreSQL for Fabric mirroring and create a mirrored database.
  • Improve automation scripts to prefer AZURE_OUTPUTS_JSON, add retries/guardrails, and introduce a Purview skip flag.

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
submodules/ai-landing-zone Updates submodule commit pointer to the new upstream state.
scripts/preprovision-integrated.ps1 Replaces preprovision flow with direct submodule deployment + env output publishing.
scripts/automationScripts/SecurityModule.ps1 Enhances REST error sanitization and adds HTTP status/body diagnostics.
scripts/automationScripts/OneLakeIndex/setup_ai_services_rbac.ps1 Adds AI Foundry project existence checks to improve RBAC setup reliability.
scripts/automationScripts/OneLakeIndex/06_setup_ai_foundry_search_rbac.ps1 Pulls settings from outputs, discovers project, and improves additional principals parsing.
scripts/automationScripts/OneLakeIndex/05_create_onelake_indexer.ps1 Adds public access toggle + retry handling for Search requests and RBAC propagation.
scripts/automationScripts/OneLakeIndex/04_create_onelake_datasource.ps1 Same Search public-access toggle + retry layer; prefers outputs for config.
scripts/automationScripts/OneLakeIndex/03_create_onelake_index.ps1 Same Search public-access toggle + retry layer; prefers outputs for config.
scripts/automationScripts/OneLakeIndex/02_create_onelake_skillsets.ps1 Same Search public-access toggle + retry layer; prefers outputs for config.
scripts/automationScripts/OneLakeIndex/01_setup_rbac.ps1 Prefers outputs for Search/Foundry values used during RBAC.
scripts/automationScripts/FabricWorkspace/mirror/run_postgresql_mirroring_prep_with_public_access.ps1 Adds wrapper to temporarily enable public access for KV/PG during mirroring prep.
scripts/automationScripts/FabricWorkspace/mirror/prepare_postgresql_for_mirroring.ps1 Adds PostgreSQL mirroring preparation (params, roles/grants, seed table).
scripts/automationScripts/FabricWorkspace/mirror/create_postgresql_mirror.ps1 Adds Fabric API automation for creating a mirrored database once a connection exists.
scripts/automationScripts/FabricWorkspace/CreateWorkspace/register_fabric_datasource.ps1 Adds SKIP_PURVIEW_INTEGRATION short-circuit.
scripts/automationScripts/FabricWorkspace/CreateWorkspace/materialize_document_folders.ps1 Improves retry behavior and idempotency checks for folder materialization.
scripts/automationScripts/FabricWorkspace/CreateWorkspace/create_fabric_workspace.ps1 Moves from Power BI to Fabric API and updates admin/capacity assignment flows.
scripts/automationScripts/FabricPurviewAutomation/trigger_purview_scan_for_fabric_workspace.ps1 Adds skip flag and retry logic around scan create/run.
scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1 Adds skip flag to avoid running Purview setup.
scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1 Removes placeholder Log Analytics linkage script.
infra/main.bicepparam Reorganizes parameters and adds PostgreSQL + mirror-related inputs/toggles.
infra/main.bicep Adds PostgreSQL provisioning and updates wrapper outputs for scripts.
docs/quota_check.md Clarifies terminal requirements for quota script vs deployment.
docs/postgresql_mirroring.md Adds end-to-end runbook for Fabric mirroring with what’s automated vs manual.
docs/post_deployment_steps.md Adds PostgreSQL mirroring verification steps and updates network isolation wording.
docs/automation-outputs-mapping.md Updates mapping descriptions for resolved Fabric modes.
docs/PARAMETER_GUIDE.md Updates submodule param references and documents PostgreSQL options.
docs/DeploymentGuide.md Updates Windows shell guidance and reflects new preprovision/deployment flow.
azure.yaml Switches preprovision to pwsh and adds PostgreSQL mirroring steps.
README.md Updates architecture image, upstream link, and adds PostgreSQL mirroring doc link.
CHANGELOG.md Adds Unreleased notes for new behaviors and script changes.
.gitmodules Updates submodule URL to new upstream repository.
.gitignore Ignores a new local copy parameter file.
Comments suppressed due to low confidence (3)

scripts/preprovision-integrated.ps1:1

  • The GUID validation regex ^[0-9a-fA-F-]{36}$ is too permissive (it accepts many invalid GUID strings). Use [guid]::TryParse(...) (or a stricter GUID regex) to ensure AZURE_PRINCIPAL_ID / principalId is a valid GUID before relying on it for RBAC and deployment parameters.
    scripts/preprovision-integrated.ps1:1
  • The GUID validation regex ^[0-9a-fA-F-]{36}$ is too permissive (it accepts many invalid GUID strings). Use [guid]::TryParse(...) (or a stricter GUID regex) to ensure AZURE_PRINCIPAL_ID / principalId is a valid GUID before relying on it for RBAC and deployment parameters.
    scripts/preprovision-integrated.ps1:1
  • The Select-String pattern only matches parameters that start at column 0. If the submodule formats parameters with indentation (or has leading whitespace), valid params will be missed and silently dropped from $filtered.parameters, causing deployment failures due to missing required params. Use a pattern that allows leading whitespace (e.g., ^\\s*param\\s+...).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if ($currentUsers -and $currentUsers.value) { $hasAdmin = ($currentUsers.value | Where-Object { $_.identifier -eq $admin -and $_.groupUserAccessRight -eq 'Admin' }) }
if ($currentRoleAssignments -and $currentRoleAssignments.value) {
$hasAdmin = ($currentRoleAssignments.value | Where-Object {
(($_.principal.id -eq $admin) -or ($_.principal.userDetails.userPincipalName -eq $admin)) -and $_.role -eq 'Admin'
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.

Copilot uses AI. Check for mistakes.
}
}

Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.

Copilot uses AI. Check for mistakes.
}
}

Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.

Suggested change
Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop
Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $principalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop

Copilot uses AI. Check for mistakes.
param existingVnetResourceId = readEnvironmentVariable('EXISTING_VNET_RESOURCE_ID', '')

// Optional additional Entra object IDs to grant Search roles.
param aiSearchAdditionalAccessObjectIds = ['']
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting aiSearchAdditionalAccessObjectIds to an array containing an empty string is likely to cause downstream role-assignment or principal parsing failures. Use an empty array ([]) as the default, and only include valid GUIDs when needed.

Suggested change
param aiSearchAdditionalAccessObjectIds = ['']
param aiSearchAdditionalAccessObjectIds = []

Copilot uses AI. Check for mistakes.

# Stage 5: Purview Collection Creation
- run: ./scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1
- run: "$env:SKIP_PURVIEW_INTEGRATION='true'; ./scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1"
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These hook commands force SKIP_PURVIEW_INTEGRATION=true unconditionally, which disables Purview integration even when Purview is intended/enabled. This is a behavioral change not called out in the PR description; consider controlling this via a real config toggle (e.g., env/bicep output) rather than hard-coding true in the hook command.

Copilot uses AI. Check for mistakes.

@description('PostgreSQL storage size in GB.')
param postgreSqlStorageSizeGB int = 32
@description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.')
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generatedPostgreSqlAdminPassword contributes directly to the generated admin password but is not marked @secure(). That makes it easier to exfiltrate the derived password via parameter inspection/telemetry/logging. Mark this parameter as secure (or avoid passing it as a parameter altogether and derive it in a way that doesn’t require exposing the seed material).

Suggested change
@description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.')
@description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.')
@secure()

Copilot uses AI. Check for mistakes.
: resourceId('Microsoft.KeyVault/vaults', keyVaultName)

var effectivePostgreSqlAdminPassword = postgreSqlAdminPassword == '$(secretOrRandomPassword)'
? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${replace(generatedPostgreSqlAdminPassword, '-', '')}'
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generatedPostgreSqlAdminPassword contributes directly to the generated admin password but is not marked @secure(). That makes it easier to exfiltrate the derived password via parameter inspection/telemetry/logging. Mark this parameter as secure (or avoid passing it as a parameter altogether and derive it in a way that doesn’t require exposing the seed material).

Suggested change
? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${replace(generatedPostgreSqlAdminPassword, '-', '')}'
? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${uniqueString(deployment().name, environmentName, location)}'

Copilot uses AI. Check for mistakes.
Comment on lines +126 to +131
$reader = New-Object System.IO.StreamReader($response.GetResponseStream())
$responseBody = $reader.ReadToEnd()
} catch { $responseBody = $null }
}
if ($responseBody) {
$responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]'
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two issues: (1) StreamReader isn’t disposed, which can leak resources; wrap it so it’s closed/disposed reliably. (2) Logging the full HTTP response body can leak sensitive data beyond bearer tokens (e.g., secrets embedded in JSON, signed URLs, PII). Consider truncating the body and/or sanitizing additional common secret patterns before printing.

Suggested change
$reader = New-Object System.IO.StreamReader($response.GetResponseStream())
$responseBody = $reader.ReadToEnd()
} catch { $responseBody = $null }
}
if ($responseBody) {
$responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]'
$reader = $null
try {
$reader = New-Object System.IO.StreamReader($response.GetResponseStream())
$responseBody = $reader.ReadToEnd()
}
finally {
if ($reader -ne $null) {
$reader.Dispose()
}
}
} catch { $responseBody = $null }
}
if ($responseBody) {
# Redact bearer tokens
$responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]'
# Redact common JSON token fields (access/refresh tokens)
$responseBody = $responseBody -replace '"access_token"\s*:\s*"(.*?)"', '"access_token":"[REDACTED]"'
$responseBody = $responseBody -replace '"refresh_token"\s*:\s*"(.*?)"', '"refresh_token":"[REDACTED]"'
# Truncate long bodies to reduce sensitive data exposure
$maxLength = 1024
if ($responseBody.Length -gt $maxLength) {
$responseBody = $responseBody.Substring(0, $maxLength) + '...[truncated]'
}

Copilot uses AI. Check for mistakes.
Comment on lines +130 to +136
if ($responseBody) {
$responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]'
}

Write-Error "Secure $Description failed: $sanitizedError"
if ($statusCode) { Write-Error "HTTP Status: $statusCode" }
if ($responseBody) { Write-Error "HTTP Body: $responseBody" }
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two issues: (1) StreamReader isn’t disposed, which can leak resources; wrap it so it’s closed/disposed reliably. (2) Logging the full HTTP response body can leak sensitive data beyond bearer tokens (e.g., secrets embedded in JSON, signed URLs, PII). Consider truncating the body and/or sanitizing additional common secret patterns before printing.

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +17
### Fixed
- Power BI headers initialization in Log Analytics linkage script to resolve workspace ID lookups
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changelog claims a fix in the Log Analytics linkage script, but scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1 is removed in this PR. Update/remove this entry so the changelog matches the actual changes.

Suggested change
### Fixed
- Power BI headers initialization in Log Analytics linkage script to resolve workspace ID lookups
### Removed
- Log Analytics linkage script `scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1`

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants