24 changes: 23 additions & 1 deletion CONTRIBUTING.md
@@ -14,6 +14,8 @@ closed SaaS product.
- Provider-agnostic, modular design
- AI output is always a draft requiring human review

Hosted-mode contributions must preserve local-first defaults. New hosted features should remain gated behind explicit `HOSTED_MODE` flags.

## Ways to Contribute

You do NOT need clinical experience to contribute.
@@ -56,7 +58,7 @@ High-priority contribution areas:

1. Create a feature branch:
```bash
-git checkout -b feature/descriptive-name
+git checkout -b feat/descriptive-name
```
2. Make changes with clear commits
3. Add or update tests where applicable
@@ -72,6 +74,26 @@ High-priority contribution areas:
```
6. Submit a Pull Request

### Branch Naming Convention

Use these branch prefixes:

- `main` - the only long-lived branch
- `feat/<description>` - features
- `fix/<description>` - bug fixes
- `refactor/<description>` - non-functional code changes
- `docs/<description>` - documentation
- `chore/<description>` - maintenance/config/deps
- `ci/<description>` - CI/CD changes

Examples:

```bash
git checkout -b feat/hipaa-hosted-auth-guard
git checkout -b ci/release-tag-workflow
git checkout -b docs/branch-protection-policy
```
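
The prefix list above could be checked mechanically, for example in a pre-push hook. The following is a minimal sketch, not part of this change: `isValidBranchName` is a hypothetical helper, and the lowercase kebab-case rule for the description is an assumption inferred from the examples.

```typescript
// Prefixes copied from the branch naming convention; everything else here is illustrative.
const BRANCH_PREFIXES = ["feat", "fix", "refactor", "docs", "chore", "ci"] as const;

// Assumption: descriptions are lowercase kebab-case, as in the examples.
const BRANCH_PATTERN = new RegExp(
  `^(${BRANCH_PREFIXES.join("|")})/[a-z0-9]+(-[a-z0-9]+)*$`
);

// Returns true when the name is `<prefix>/<kebab-case-description>`.
function isValidBranchName(name: string): boolean {
  return BRANCH_PATTERN.test(name);
}
```

Under these assumptions `feat/hipaa-hosted-auth-guard` passes, while the older `feature/descriptive-name` form is rejected because `feature` is not in the prefix list.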

## Code Standards

- TypeScript with explicit types
6 changes: 6 additions & 0 deletions README.md
@@ -131,6 +131,12 @@ curl http://127.0.0.1:8002/health

OpenScribe supports three workflows. **Mixed web mode is the default path.**

### Local-First Guardrail
- Hosted/multi-user controls are opt-in and disabled by default.
- Unless both `HOSTED_MODE=true` and `NEXT_PUBLIC_HOSTED_MODE=true` are explicitly set, OpenScribe runs in local-first mode.
- Local workflows remain the core path and are not replaced by hosted features.
- Hosted operations/runbook details: [docs/compliance/HOSTED_OPERATIONS_RUNBOOK.md](./docs/compliance/HOSTED_OPERATIONS_RUNBOOK.md)

### Mixed Web (default)
- Transcription: local Whisper server (`pnpm whisper:server`) with default model `tiny.en`
- Notes: larger model (default Claude in web path)
24 changes: 24 additions & 0 deletions apps/web/.env.local.example
@@ -8,6 +8,7 @@ OPENAI_API_KEY="sk-proj-your-key"
# - whisper_local (default, local tiny Whisper server)
# - whisper_openai (OpenAI hosted Whisper API)
# - medasr (legacy local MedASR server)
# - gcp_stt_v2 (Google Speech-to-Text v2, hosted mode)
TRANSCRIPTION_PROVIDER="whisper_local"

# Local Whisper server config (used when TRANSCRIPTION_PROVIDER=whisper_local)
@@ -21,10 +22,33 @@ WHISPER_LOCAL_MAX_RETRIES="2"
# OpenAI Whisper config (used when TRANSCRIPTION_PROVIDER=whisper_openai)
WHISPER_OPENAI_MODEL="whisper-1"

# Google STT v2 config (used when TRANSCRIPTION_PROVIDER=gcp_stt_v2)
GCP_PROJECT_ID="your-gcp-project-id"
GCP_STT_LOCATION="us-central1"
GCP_STT_MODEL="chirp_2"
GCP_STT_LANGUAGE_CODE="en-US"
# Optional: custom recognizer resource name suffix, default "_"
GCP_STT_RECOGNIZER="_"

# Anthropic API key for clinical note generation (required)
# Get your key at: https://console.anthropic.com/
ANTHROPIC_API_KEY="sk-ant-your-key"

# Base64-encoded 32-byte secret used for client-side secure storage.
# Generate with: openssl rand -base64 32
NEXT_PUBLIC_SECURE_STORAGE_KEY="base64-secret"

# Hosted mode controls
HOSTED_MODE="false"
NEXT_PUBLIC_HOSTED_MODE="false"
ALLOW_USER_API_KEYS="true"
PERSIST_SERVER_PHI="false"

# Optional Identity Platform audience check
# GCP_IDENTITY_PLATFORM_CLIENT_ID="your-client-id.apps.googleusercontent.com"

# Optional distributed session store
# SESSION_STORE_BACKEND="redis"
# REDIS_HOST="127.0.0.1"
# REDIS_PORT="6379"
# REDIS_TLS="false"
72 changes: 72 additions & 0 deletions docs/PR_SERIES_IMPLEMENTATION.md
@@ -0,0 +1,72 @@
# PR Series Implementation Guide

This repository currently contains a combined working tree for hosted-mode hardening.
Use the following PR lanes to split and merge safely.

## PR order
1. `ci/harden-checks-and-scope`
2. `fix/auth-bootstrap-deadlock`
3. `fix/sse-auth-without-query-token`
4. `feat/authz-note-generation-route`
5. `refactor/session-store-redis-reliability`
6. `feat/gcp-stt-provider-hardening`
7. `fix/hosted-api-key-and-audit-sanitization`
8. `docs/local-first-and-hosted-ops`
9. `feat/terraform-minimum-viable-stack`
10. `ci/release-tag-prod-deploy`

## Suggested split by file groups

### 1) CI hardening
- `.github/workflows/ci.yml`
- `config/eslint.config.mjs`
- `docs/BRANCH_PROTECTION.md`

### 2) Auth bootstrap deadlock
- `apps/web/src/lib/auth.ts`
- `apps/web/src/app/api/auth/bootstrap/route.ts`

### 3) SSE auth without query token
- `apps/web/src/app/page.tsx`
- `apps/web/src/app/api/transcription/stream/[sessionId]/route.ts`
- `packages/pipeline/transcribe/src/hooks/segment-upload-controller.ts`

### 4) Authenticated notes route
- `apps/web/src/app/api/notes/generate/route.ts`
- `apps/web/src/app/page.tsx`
- `apps/web/src/app/actions.ts`

### 5) Session store reliability
- `packages/pipeline/assemble/src/session-store.ts`
- `apps/web/src/app/api/transcription/segment/route.ts`
- `apps/web/src/app/api/transcription/final/route.ts`
- `packages/pipeline/eval/src/tests/e2e-basic.test.ts`
- `packages/pipeline/eval/src/tests/e2e-real-api.test.ts`

### 6) GCP STT hardening
- `packages/pipeline/transcribe/src/providers/gcp-stt-transcriber.ts`
- `packages/pipeline/transcribe/src/providers/provider-resolver.ts`
- `packages/pipeline/transcribe/src/__tests__/provider-resolver.test.ts`
- `packages/pipeline/transcribe/src/__tests__/gcp-stt-transcriber.test.ts`

### 7) Hosted API key + audit sanitization
- `apps/web/src/app/api/settings/api-keys/route.ts`
- `packages/storage/src/server-api-keys.ts`
- `packages/storage/src/server-audit.ts`
- `packages/storage/src/__tests__/server-audit.test.ts`
- `scripts/check-no-phi-logs.mjs`

### 8) Docs updates
- `README.md`
- `CONTRIBUTING.md`
- `docs/compliance/HOSTED_OPERATIONS_RUNBOOK.md`

### 9) Terraform MVP stack
- `infra/terraform/modules/**`
- `infra/terraform/environments/**`
- `infra/terraform/README.md`
- `.github/workflows/terraform-plan.yml`
- `.github/workflows/terraform-apply.yml`

### 10) Release deploy workflow
- `.github/workflows/release.yml`
49 changes: 49 additions & 0 deletions docs/compliance/HOSTED_OPERATIONS_RUNBOOK.md
@@ -0,0 +1,49 @@
# Hosted Operations Runbook

## Local-first default
OpenScribe remains local-first by default. Hosted behavior is enabled only when both of the following are set:
- `HOSTED_MODE=true`
- `NEXT_PUBLIC_HOSTED_MODE=true`
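
The two-flag rule can be expressed as a single predicate. This is a minimal sketch under stated assumptions: `isHostedMode` and the `Env` type are hypothetical names, and the environment is passed in as a plain object rather than read from `process.env`, so the sketch stays self-contained.

```typescript
// A plain map of environment variables; in a Node process this would be process.env.
type Env = Record<string, string | undefined>;

// Hosted behavior requires BOTH flags to be the literal string "true".
// Anything else (unset, "false", "1", "TRUE") keeps the local-first default.
function isHostedMode(env: Env): boolean {
  return env["HOSTED_MODE"] === "true" && env["NEXT_PUBLIC_HOSTED_MODE"] === "true";
}
```

Treating every value other than the exact string `"true"` as off is a deliberate choice in this sketch: a misconfigured flag falls back to local-first mode instead of silently enabling hosted features.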

## Required runtime env vars (hosted)
- `HOSTED_MODE=true`
- `NEXT_PUBLIC_HOSTED_MODE=true`
- `ALLOW_USER_API_KEYS=false`
- `PERSIST_SERVER_PHI=false`
- `AUTH_SESSION_SECRET=<strong-random-secret>`
- `TRANSCRIPTION_PROVIDER=gcp_stt_v2`
- `GCP_PROJECT_ID=<project-id>`
- `GCP_STT_LOCATION=us-central1`
- `GCP_STT_MODEL=chirp_2`
- `GCP_STT_LANGUAGE_CODE=en-US`
- `ANTHROPIC_API_KEY=<from Secret Manager/env injection>`
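
A startup check can catch an incomplete hosted configuration before the service accepts traffic. The sketch below is illustrative only: `checkHostedEnv` is a hypothetical helper, and it assumes that the first four flags and `TRANSCRIPTION_PROVIDER` must match the listed values exactly, while the remaining variables only need to be present because their values are deployment-specific.

```typescript
type Env = Record<string, string | undefined>;

// Variables this runbook pins to a specific value in hosted mode.
const PINNED: Record<string, string> = {
  HOSTED_MODE: "true",
  NEXT_PUBLIC_HOSTED_MODE: "true",
  ALLOW_USER_API_KEYS: "false",
  PERSIST_SERVER_PHI: "false",
  TRANSCRIPTION_PROVIDER: "gcp_stt_v2",
};

// Variables that must be non-empty; the sketch does not validate their contents.
const REQUIRED: string[] = [
  "AUTH_SESSION_SECRET",
  "GCP_PROJECT_ID",
  "GCP_STT_LOCATION",
  "GCP_STT_MODEL",
  "GCP_STT_LANGUAGE_CODE",
  "ANTHROPIC_API_KEY",
];

// Returns a list of human-readable problems; an empty list means the env looks complete.
function checkHostedEnv(env: Env): string[] {
  const problems: string[] = [];
  for (const [name, expected] of Object.entries(PINNED)) {
    if (env[name] !== expected) problems.push(`${name} must be "${expected}"`);
  }
  for (const name of REQUIRED) {
    if (!env[name]) problems.push(`${name} is missing`);
  }
  return problems;
}
```

Returning every problem at once, rather than throwing on the first, lets an operator fix a deployment in one pass.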

## Hosted auth flow
1. User authenticates with Identity Platform and obtains an ID token.
2. Client sends the token to `POST /api/auth/bootstrap`.
3. Backend verifies the token, creates or loads the org membership, and sets an HttpOnly session cookie.
4. Protected routes authorize via session cookie (or bearer token fallback).
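
Step 4 (session cookie first, bearer token as fallback) can be sketched as a small pure function. This is not the implementation in `apps/web/src/lib/auth.ts`: the cookie name `session`, the `Credential` type, and `extractCredential` are all hypothetical, and the sketch only selects which credential to verify. Verification itself is out of scope here.

```typescript
type Credential = { kind: "session" | "bearer"; value: string } | null;

// Hypothetical cookie name; the real name is not stated in this runbook.
const SESSION_COOKIE = "session";

function extractCredential(headers: { cookie?: string; authorization?: string }): Credential {
  // Prefer the HttpOnly session cookie set by the bootstrap step.
  for (const part of (headers.cookie ?? "").split(";")) {
    const [name, ...rest] = part.trim().split("=");
    const value = rest.join("="); // cookie values may themselves contain "="
    if (name === SESSION_COOKIE && value) return { kind: "session", value };
  }
  // Fall back to an `Authorization: Bearer <token>` header.
  const token = /^Bearer\s+(\S+)$/i.exec(headers.authorization ?? "")?.[1];
  return token ? { kind: "bearer", value: token } : null;
}
```

When both a cookie and a bearer header are present, the cookie wins in this sketch, matching the "fallback" wording of step 4.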

## Redis session store (optional for multi-instance)
- `SESSION_STORE_BACKEND=redis`
- `REDIS_HOST=<memorystore-ip-or-dns>`
- `REDIS_PORT=6379`
- `REDIS_PASSWORD=<if configured>`
- `REDIS_TLS=true|false`
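
The variables above map naturally onto a typed config object. The following is a minimal sketch, assuming that any `SESSION_STORE_BACKEND` value other than `redis` means the default store and that the port defaults to 6379 when unset; `redisConfigFromEnv` and `RedisConfig` are hypothetical names, not identifiers from `session-store.ts`.

```typescript
type Env = Record<string, string | undefined>;

interface RedisConfig {
  host: string;
  port: number;
  password?: string;
  tls: boolean;
}

// Returns null when Redis is not selected, so callers can keep the default store.
function redisConfigFromEnv(env: Env): RedisConfig | null {
  if (env["SESSION_STORE_BACKEND"] !== "redis") return null;

  const host = env["REDIS_HOST"];
  if (!host) throw new Error("REDIS_HOST is required when SESSION_STORE_BACKEND=redis");

  // Assumption: fall back to the conventional Redis port when REDIS_PORT is unset.
  const port = Number(env["REDIS_PORT"] ?? "6379");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("REDIS_PORT must be an integer between 1 and 65535");
  }

  const config: RedisConfig = { host, port, tls: env["REDIS_TLS"] === "true" };
  const password = env["REDIS_PASSWORD"];
  if (password) config.password = password; // only set when configured
  return config;
}
```

Failing fast on a missing host or invalid port surfaces misconfiguration at startup rather than on the first session write.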

## Release process
1. Merge via PR to `main` with required checks passing.
2. Create release tag: `vX.Y.Z`.
3. `release.yml` builds the image, pushes it to Artifact Registry, and deploys to Cloud Run.
4. Validate health and authentication in production.

## Security checks before release
- `pnpm build:test`
- `pnpm test:no-phi-logs`
- CI secret scan and dependency scan pass

## Incident response baseline
- Revoke compromised credentials in Secret Manager.
- Rotate `AUTH_SESSION_SECRET`.
- Invalidate active sessions by rotating the secret and redeploying.
- Review Cloud Logging and audit events for affected window.
26 changes: 26 additions & 0 deletions docs/compliance/SECURITY_REVIEW_CHECKLIST.md
@@ -0,0 +1,26 @@
# Security Review Checklist

## Access Control
- Auth required for all PHI-processing endpoints
- Authorization checks include org scope
- No anonymous access in hosted mode

## PHI Handling
- No durable server-side transcript/audio/note persistence
- No PHI fields in logs
- Error messages are sanitized

## Secrets and Keys
- Secrets sourced from Secret Manager / env only
- No plaintext keys stored on disk in hosted mode
- Rotation procedure documented

## Infrastructure
- TLS enforced end-to-end
- Least-privilege IAM applied
- Audit logs exported with retention policy

## Release and Change Control
- Change merged via PR with approvals
- Required checks passed
- Rollback plan documented
6 changes: 6 additions & 0 deletions scripts/setup-env.js
@@ -49,6 +49,12 @@ ANTHROPIC_API_KEY="sk-ant-your-key"

# Auto-generated secure storage key (do not modify)
NEXT_PUBLIC_SECURE_STORAGE_KEY="${secureKey}"

# Hosted mode toggles
HOSTED_MODE="false"
NEXT_PUBLIC_HOSTED_MODE="false"
ALLOW_USER_API_KEYS="true"
PERSIST_SERVER_PHI="false"
`
}
