Conversation
…reating workspace.
…osoft/Deploy-Your-AI-Application-In-Production into feat/addPostgreFabricMirror
…osoft/Deploy-Your-AI-Application-In-Production into feat/addPostgreFabricMirror
There was a problem hiding this comment.
Pull request overview
This PR updates the AI Landing Zone submodule source and reworks provisioning/automation to add PostgreSQL Flexible Server provisioning plus Fabric mirroring support, while improving resiliency of post-provision scripts (outputs-based configuration, retries, and safer error handling).
Changes:
- Switch AI Landing Zone submodule to a new upstream repo and change preprovision to deploy the submodule directly via
az deployment group create. - Add PostgreSQL Flexible Server provisioning in the wrapper Bicep, plus PowerShell automation to prep PostgreSQL for Fabric mirroring and create a mirrored database.
- Improve automation scripts to prefer
AZURE_OUTPUTS_JSON, add retries/guardrails, and introduce a Purview skip flag.
Reviewed changes
Copilot reviewed 31 out of 32 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| submodules/ai-landing-zone | Updates submodule commit pointer to the new upstream state. |
| scripts/preprovision-integrated.ps1 | Replaces preprovision flow with direct submodule deployment + env output publishing. |
| scripts/automationScripts/SecurityModule.ps1 | Enhances REST error sanitization and adds HTTP status/body diagnostics. |
| scripts/automationScripts/OneLakeIndex/setup_ai_services_rbac.ps1 | Adds AI Foundry project existence checks to improve RBAC setup reliability. |
| scripts/automationScripts/OneLakeIndex/06_setup_ai_foundry_search_rbac.ps1 | Pulls settings from outputs, discovers project, and improves additional principals parsing. |
| scripts/automationScripts/OneLakeIndex/05_create_onelake_indexer.ps1 | Adds public access toggle + retry handling for Search requests and RBAC propagation. |
| scripts/automationScripts/OneLakeIndex/04_create_onelake_datasource.ps1 | Same Search public-access toggle + retry layer; prefers outputs for config. |
| scripts/automationScripts/OneLakeIndex/03_create_onelake_index.ps1 | Same Search public-access toggle + retry layer; prefers outputs for config. |
| scripts/automationScripts/OneLakeIndex/02_create_onelake_skillsets.ps1 | Same Search public-access toggle + retry layer; prefers outputs for config. |
| scripts/automationScripts/OneLakeIndex/01_setup_rbac.ps1 | Prefers outputs for Search/Foundry values used during RBAC. |
| scripts/automationScripts/FabricWorkspace/mirror/run_postgresql_mirroring_prep_with_public_access.ps1 | Adds wrapper to temporarily enable public access for KV/PG during mirroring prep. |
| scripts/automationScripts/FabricWorkspace/mirror/prepare_postgresql_for_mirroring.ps1 | Adds PostgreSQL mirroring preparation (params, roles/grants, seed table). |
| scripts/automationScripts/FabricWorkspace/mirror/create_postgresql_mirror.ps1 | Adds Fabric API automation for creating a mirrored database once a connection exists. |
| scripts/automationScripts/FabricWorkspace/CreateWorkspace/register_fabric_datasource.ps1 | Adds SKIP_PURVIEW_INTEGRATION short-circuit. |
| scripts/automationScripts/FabricWorkspace/CreateWorkspace/materialize_document_folders.ps1 | Improves retry behavior and idempotency checks for folder materialization. |
| scripts/automationScripts/FabricWorkspace/CreateWorkspace/create_fabric_workspace.ps1 | Moves from Power BI to Fabric API and updates admin/capacity assignment flows. |
| scripts/automationScripts/FabricPurviewAutomation/trigger_purview_scan_for_fabric_workspace.ps1 | Adds skip flag and retry logic around scan create/run. |
| scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1 | Adds skip flag to avoid running Purview setup. |
| scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1 | Removes placeholder Log Analytics linkage script. |
| infra/main.bicepparam | Reorganizes parameters and adds PostgreSQL + mirror-related inputs/toggles. |
| infra/main.bicep | Adds PostgreSQL provisioning and updates wrapper outputs for scripts. |
| docs/quota_check.md | Clarifies terminal requirements for quota script vs deployment. |
| docs/postgresql_mirroring.md | Adds end-to-end runbook for Fabric mirroring with what’s automated vs manual. |
| docs/post_deployment_steps.md | Adds PostgreSQL mirroring verification steps and updates network isolation wording. |
| docs/automation-outputs-mapping.md | Updates mapping descriptions for resolved Fabric modes. |
| docs/PARAMETER_GUIDE.md | Updates submodule param references and documents PostgreSQL options. |
| docs/DeploymentGuide.md | Updates Windows shell guidance and reflects new preprovision/deployment flow. |
| azure.yaml | Switches preprovision to pwsh and adds PostgreSQL mirroring steps. |
| README.md | Updates architecture image, upstream link, and adds PostgreSQL mirroring doc link. |
| CHANGELOG.md | Adds Unreleased notes for new behaviors and script changes. |
| .gitmodules | Updates submodule URL to new upstream repository. |
| .gitignore | Ignores a new local copy parameter file. |
Comments suppressed due to low confidence (3)
scripts/preprovision-integrated.ps1:1
- The GUID validation regex
^[0-9a-fA-F-]{36}$is too permissive (it accepts many invalid GUID strings). Use[guid]::TryParse(...)(or a stricter GUID regex) to ensureAZURE_PRINCIPAL_ID/principalIdis a valid GUID before relying on it for RBAC and deployment parameters.
scripts/preprovision-integrated.ps1:1 - The GUID validation regex
^[0-9a-fA-F-]{36}$is too permissive (it accepts many invalid GUID strings). Use[guid]::TryParse(...)(or a stricter GUID regex) to ensureAZURE_PRINCIPAL_ID/principalIdis a valid GUID before relying on it for RBAC and deployment parameters.
scripts/preprovision-integrated.ps1:1 - The
Select-Stringpattern only matches parameters that start at column 0. If the submodule formats parameters with indentation (or has leading whitespace), valid params will be missed and silently dropped from$filtered.parameters, causing deployment failures due to missing required params. Use a pattern that allows leading whitespace (e.g.,^\\s*param\\s+...).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ($currentUsers -and $currentUsers.value) { $hasAdmin = ($currentUsers.value | Where-Object { $_.identifier -eq $admin -and $_.groupUserAccessRight -eq 'Admin' }) } | ||
| if ($currentRoleAssignments -and $currentRoleAssignments.value) { | ||
| $hasAdmin = ($currentRoleAssignments.value | Where-Object { | ||
| (($_.principal.id -eq $admin) -or ($_.principal.userDetails.userPincipalName -eq $admin)) -and $_.role -eq 'Admin' |
There was a problem hiding this comment.
There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.
| } | ||
| } | ||
|
|
||
| Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop |
There was a problem hiding this comment.
There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.
| } | ||
| } | ||
|
|
||
| Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop |
There was a problem hiding this comment.
There are two typos that will break admin detection/assignment: userPincipalName should be userPrincipalName, and $pincipalId is referenced but never set (likely intended to be $principalId). This will cause false negatives when checking existing admins and will fail role assignment requests.
| Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $pincipalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop | |
| Invoke-SecureWebRequest -Uri "$apiRoot/workspaces/$workspaceId/roleAssignments" -Method Post -Headers ($apiHeaders) -Body (@{ principal = @{ id = $principalId; type = 'User' }; role = 'Admin' } | ConvertTo-Json) -ErrorAction Stop |
| param existingVnetResourceId = readEnvironmentVariable('EXISTING_VNET_RESOURCE_ID', '') | ||
|
|
||
| // Optional additional Entra object IDs to grant Search roles. | ||
| param aiSearchAdditionalAccessObjectIds = [''] |
There was a problem hiding this comment.
Defaulting aiSearchAdditionalAccessObjectIds to an array containing an empty string is likely to cause downstream role-assignment or principal parsing failures. Use an empty array ([]) as the default, and only include valid GUIDs when needed.
| param aiSearchAdditionalAccessObjectIds = [''] | |
| param aiSearchAdditionalAccessObjectIds = [] |
|
|
||
| # Stage 5: Purview Collection Creation | ||
| - run: ./scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1 | ||
| - run: "$env:SKIP_PURVIEW_INTEGRATION='true'; ./scripts/automationScripts/FabricPurviewAutomation/create_purview_collection.ps1" |
There was a problem hiding this comment.
These hook commands force SKIP_PURVIEW_INTEGRATION=true unconditionally, which disables Purview integration even when Purview is intended/enabled. This is a behavioral change not called out in the PR description; consider controlling this via a real config toggle (e.g., env/bicep output) rather than hard-coding true in the hook command.
|
|
||
| @description('PostgreSQL storage size in GB.') | ||
| param postgreSqlStorageSizeGB int = 32 | ||
| @description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.') |
There was a problem hiding this comment.
generatedPostgreSqlAdminPassword contributes directly to the generated admin password but is not marked @secure(). That makes it easier to exfiltrate the derived password via parameter inspection/telemetry/logging. Mark this parameter as secure (or avoid passing it as a parameter altogether and derive it in a way that doesn’t require exposing the seed material).
| @description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.') | |
| @description('Generated value used when postgreSqlAdminPassword is left as the placeholder token.') | |
| @secure() |
| : resourceId('Microsoft.KeyVault/vaults', keyVaultName) | ||
|
|
||
| var effectivePostgreSqlAdminPassword = postgreSqlAdminPassword == '$(secretOrRandomPassword)' | ||
| ? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${replace(generatedPostgreSqlAdminPassword, '-', '')}' |
There was a problem hiding this comment.
generatedPostgreSqlAdminPassword contributes directly to the generated admin password but is not marked @secure(). That makes it easier to exfiltrate the derived password via parameter inspection/telemetry/logging. Mark this parameter as secure (or avoid passing it as a parameter altogether and derive it in a way that doesn’t require exposing the seed material).
| ? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${replace(generatedPostgreSqlAdminPassword, '-', '')}' | |
| ? '${uniqueString(subscription().id, resourceGroup().id, postgreSqlServerName)}!${uniqueString(deployment().name, environmentName, location)}' |
| $reader = New-Object System.IO.StreamReader($response.GetResponseStream()) | ||
| $responseBody = $reader.ReadToEnd() | ||
| } catch { $responseBody = $null } | ||
| } | ||
| if ($responseBody) { | ||
| $responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]' |
There was a problem hiding this comment.
Two issues: (1) StreamReader isn’t disposed, which can leak resources; wrap it so it’s closed/disposed reliably. (2) Logging the full HTTP response body can leak sensitive data beyond bearer tokens (e.g., secrets embedded in JSON, signed URLs, PII). Consider truncating the body and/or sanitizing additional common secret patterns before printing.
| $reader = New-Object System.IO.StreamReader($response.GetResponseStream()) | |
| $responseBody = $reader.ReadToEnd() | |
| } catch { $responseBody = $null } | |
| } | |
| if ($responseBody) { | |
| $responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]' | |
| $reader = $null | |
| try { | |
| $reader = New-Object System.IO.StreamReader($response.GetResponseStream()) | |
| $responseBody = $reader.ReadToEnd() | |
| } | |
| finally { | |
| if ($reader -ne $null) { | |
| $reader.Dispose() | |
| } | |
| } | |
| } catch { $responseBody = $null } | |
| } | |
| if ($responseBody) { | |
| # Redact bearer tokens | |
| $responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]' | |
| # Redact common JSON token fields (access/refresh tokens) | |
| $responseBody = $responseBody -replace '"access_token"\s*:\s*"(.*?)"', '"access_token":"[REDACTED]"' | |
| $responseBody = $responseBody -replace '"refresh_token"\s*:\s*"(.*?)"', '"refresh_token":"[REDACTED]"' | |
| # Truncate long bodies to reduce sensitive data exposure | |
| $maxLength = 1024 | |
| if ($responseBody.Length -gt $maxLength) { | |
| $responseBody = $responseBody.Substring(0, $maxLength) + '...[truncated]' | |
| } |
| if ($responseBody) { | ||
| $responseBody = $responseBody -replace 'Bearer [A-Za-z0-9\-\._~\+\/=]+=*', 'Bearer [REDACTED]' | ||
| } | ||
|
|
||
| Write-Error "Secure $Description failed: $sanitizedError" | ||
| if ($statusCode) { Write-Error "HTTP Status: $statusCode" } | ||
| if ($responseBody) { Write-Error "HTTP Body: $responseBody" } |
There was a problem hiding this comment.
Two issues: (1) StreamReader isn’t disposed, which can leak resources; wrap it so it’s closed/disposed reliably. (2) Logging the full HTTP response body can leak sensitive data beyond bearer tokens (e.g., secrets embedded in JSON, signed URLs, PII). Consider truncating the body and/or sanitizing additional common secret patterns before printing.
| ### Fixed | ||
| - Power BI headers initialization in Log Analytics linkage script to resolve workspace ID lookups |
There was a problem hiding this comment.
The changelog claims a fix in the Log Analytics linkage script, but scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1 is removed in this PR. Update/remove this entry so the changelog matches the actual changes.
| ### Fixed | |
| - Power BI headers initialization in Log Analytics linkage script to resolve workspace ID lookups | |
| ### Removed | |
| - Log Analytics linkage script `scripts/automationScripts/FabricPurviewAutomation/connect_log_analytics.ps1` |
Purpose
Replace sub module location, add postgreSQL provisioning, add Fabric mirror of PostgreSQL DB
Does this introduce a breaking change?
Golden Path Validation
Deployment Validation
What to Check
Verify that the following are valid
Follow post_deployment_steps.md for listing
Other Information
As reviewed with team on 3-10-26