I run a virtualized infrastructure on Hyper-V with a separate backup/DRP server. The entire stack runs on 19 VMs: two Active Directory domains linked by a trust, Linux DNS forwarders (BIND9), RADIUS, monitoring, application services… in short, not something we can afford to reboot haphazardly.
The project’s objective was simple to state, but complex to achieve:
> Switch the entire infrastructure over to the DRP server with minimal effort, ensuring service continuity (at the very least, not cutting off internet access), then return to production cleanly.
Spoiler: It took me several iterations, a few cold sweats, and a good dose of PowerShell to pull it off. A few liters of coffee, too…
Architecture
HYPERV1 (Hyper-V prod)
├── 2 AD domains with trust
│ ├── Domain1: SRV-PDC1 (PDC + AD-integrated DNS)
│ │ SRV-DC1 (Secondary DC + AD-integrated DNS)
│ └── Domain2: SRV-PDC2 (PDC + AD-integrated DNS + DHCP)
│ SRV-DC2 (Secondary DC + AD-integrated DNS + DHCP)
├── 2 Linux BIND9 DNS forwarders + NTP
│ ├── SRV-DNS1 (Primary DNS forwarder + NTP)
│ └── SRV-DNS2 (Secondary DNS forwarder + NTP)
├── RADIUS, Proxy, SMTP, SIEM, Monitoring...
└── 3 virtual workstations
HYPERV2 (Hyper-V DRP)
└── Veeam B&R 12 + NVMe replicas of all VMs
Replication runs 4 times a week with a retention of 2 restore points — sufficient to guarantee a maximum RPO of 24 hours.
The two scenarios
From the outset, I identified two radically different use cases:
FULL DRP (actual crash): Production is down. We switch everything over immediately. The PDCs start up first, followed by the rest. There is no continuity to manage since everything is already down.
MCO DRP (planned maintenance; MCO stands for the French "maintien en condition opérationnelle"): Production is still running. We must switch over without service interruption. Here, the startup order becomes critical: we must always have at least one active DC per domain and one active DNS forwarder somewhere on the network.
This distinction necessitated the creation of two distinct failover plans in Veeam, with different startup orders.
The MCO Failover Plan: The Order That Changes Everything
For planned maintenance, the correct sequence is as follows:
The principle relies on two groups that switch over in turn, so that Domain1, Domain2, and DNS forwarding each keep at least one active server somewhere on the network at all times.
Group 1 — PDC Domain1 + Secondary DC Domain2 + Primary DNS forwarder:
SRV-PDC1 · SRV-DC2 · SRV-DNS1
Group 2 — Secondary DC Domain1 + PDC Domain2 + Secondary DNS forwarder:
SRV-DC1 · SRV-PDC2 · SRV-DNS2
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ HYPERV1 (prod) HYPERV2 (DRP) │
│ Domain1 Domain2 DNS Domain1 Domain2 DNS │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Start SRV-PDC1 ✅ SRV-PDC2 ✅ DNS1 ✅ — — — │
│ SRV-DC1 ✅ SRV-DC2 ✅ DNS2 ✅ │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Step 1 SRV-PDC1 ⬇️ SRV-PDC2 ✅ DNS1 ⬇️ — — — │
│ Group 1 SRV-DC1 ✅ SRV-DC2 ⬇️ DNS2 ✅ │
│ H1 down ↑DC1+PDC2+DNS2 still UP on HYPERV1 → continuity guaranteed ✅ │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Step 2 SRV-DC1 ✅ SRV-PDC2 ✅ DNS2 ✅ SRV-PDC1 🔄 SRV-DC2 🔄 DNS1 🔄 │
│ Group 1 ↑Group 2 remains UP on HYPERV1 ↑Group 1 starts on HYPERV2 │
│ up H2 ↑MCO-DRP Wave 1 Failover Plan launched │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Step 3 SRV-DC1 ✅ SRV-PDC2 ✅ DNS2 ✅ SRV-PDC1 ✅ SRV-DC2 ✅ DNS1 ✅ │
│ Group 1 ↑Group 2 still UP on HYPERV1 ↑Group 1 confirmed Running H2 │
│ confirmed │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Step 4 SRV-DC1 ⬇️ SRV-PDC2 ⬇️ DNS2 ⬇️ SRV-PDC1 ✅ SRV-DC2 ✅ DNS1 ✅ │
│ Group 2 ↑Group 2 down on HYPERV1 ↑PDC1+DC2+DNS1 up on HYPERV2 │
│ down H1 → continuity guaranteed ✅ │
├────────────────────────────────────────────────────────────────────────────────────────┤
│ Step 5 — SRV-PDC1 ✅ SRV-DC2 ✅ DNS1 ✅ │
│ Group 2 SRV-DC1 ✅ SRV-PDC2 ✅ DNS2 ✅ │
│ up H2 ↑All on HYPERV2 ✅ │
└────────────────────────────────────────────────────────────────────────────────────────┘
At each step, Domain1 always has an active DC, Domain2 does too, and the DNS forwarder always responds. No interruption in AD, DNS, or DHCP service throughout the entire failover process.
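The continuity rule can be checked mechanically before anything is shut down. The following is a minimal sketch, not part of the original scripts: the `$Roles` table and the `Test-GroupContinuity` function are hypothetical names, and the group membership is the one listed above. It verifies that a group holds exactly one member of every redundant pair, so stopping that group always leaves the other member running.

```powershell
# Hypothetical helper: checks that a failover group contains exactly one
# member of each redundant pair, so the other member keeps the service up.
$Roles = [ordered]@{
    'Domain1 DC'    = @('SRV-PDC1', 'SRV-DC1')
    'Domain2 DC'    = @('SRV-PDC2', 'SRV-DC2')
    'DNS forwarder' = @('SRV-DNS1', 'SRV-DNS2')
}

function Test-GroupContinuity {
    param(
        [Parameter(Mandatory)][string[]]$Group,
        [Parameter(Mandatory)][System.Collections.IDictionary]$Roles
    )
    foreach ($role in $Roles.Keys) {
        # Providers of this role that would go down together with the group
        $hit = @($Roles[$role] | Where-Object { $Group -contains $_ })
        if ($hit.Count -ne 1) {
            Write-Warning "Role '$role': $($hit.Count) member(s) in the group, expected exactly 1"
            return $false
        }
    }
    return $true
}

$Group1 = @('SRV-PDC1', 'SRV-DC2', 'SRV-DNS1')
$Group2 = @('SRV-DC1', 'SRV-PDC2', 'SRV-DNS2')

$ok1 = Test-GroupContinuity -Group $Group1 -Roles $Roles
$ok2 = Test-GroupContinuity -Group $Group2 -Roles $Roles

# Counter-example: both Domain1 DCs in the same group would break continuity
$bad = Test-GroupContinuity -Group @('SRV-PDC1', 'SRV-DC1', 'SRV-DNS1') -Roles $Roles -WarningAction SilentlyContinue
```

Running a check like this whenever a VM is added to or moved between groups catches the case where both providers of a service end up in the same wave.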
The Start-DRP.ps1 script
Everything is automated in PowerShell and run from HYPERV2. The script manages both modes via an interactive menu:
.\Start-DRP.ps1
============================================================
DRP PROCEDURE - Select failover mode
============================================================
[1] CRASH - Production down, immediate startup
Uses the Failover Plan: FULL DRP
[2] MCO - Scheduled maintenance, continuity guaranteed
Uses the Failover Plan: MCO DRP
[3] MCO + Skip - MCO without prior replication
============================================================
Your choice (1/2/3):
After confirmation, the script runs the following steps automatically:
- Manual replication run: final sync before failover
- Anti-shutdown flag: blocks the post-script that shuts down HYPERV1 after replication
- Ordered shutdown of non-critical VMs (waves 5→3)
- In MCO mode: shutdown of Group 1, launch of the plan, wait for Group 1 to be Running on HYPERV2, shutdown of Group 2
- In CRASH mode: shutdown of everything, launch of the plan
Replica startup is verified through the local Hyper-V host (Get-VM SRV-DC1_VeeamReplica) rather than over the network, which avoids false positives when the same IP is still responding from HYPERV1.
Return to production: Start-FailbackToProd.ps1
This is where things got complicated. Several versions were needed to achieve a clean solution.
Veeam 12 pitfalls
Pitfall 1 — Get-VBRJobSession no longer exists
Replaced by Get-VBRSession with a filter on JobName.
Pitfall 2 — Committing to the wrong restore point
After a Start-VBRHvReplicaFailback, Veeam creates a new restore point (index 0). If you pass this RP to Stop-VBRHvReplicaFailback to commit, the VM gets stuck in LockedItem. You must commit to index 1 — the RP from before the failback.
# Index 0 = new RP created by the failback → DO NOT commit this
# Index 1 = old Failover RP → this is the one to commit
$rpCommit = Get-VBRRestorePoint |
Where-Object { $_.IsReplica() -and $_.VmName -eq $vmName } |
Sort-Object CreationTime -Descending |
Select-Object -Skip 1 -First 1
Stop-VBRHvReplicaFailback -RestorePoint $rpCommit
Pitfall 3 — The Failover Plan that restarts the replicas
As long as the Failover Plan is active, it automatically restarts the replicas after each commit. The solution: use Stop-VBRReplicaFailover individually on each VM before initiating its failback. This shuts down the replica cleanly without affecting the others.
# Individual undo — does not affect other VMs in the plan
Stop-VBRReplicaFailover -RestorePoint $rpFailover
Pitfall 4 — The VHDX is still locked
After the commit, Hyper-V has not yet released the VHDX file. An immediate Start-VM fails with "The process cannot access the file". Solution: wait 15 seconds between the commit and starting the VM.
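A fixed delay works, but it is either too long or occasionally too short. One possible alternative, shown here only as a sketch and not what the scripts in this article do, is to poll until the file can be opened exclusively. `Wait-FileUnlocked` and its parameters are hypothetical names, and the check has to run on the machine that holds the VHDX.

```powershell
# Hypothetical helper: polls until a file can be opened with an exclusive lock,
# i.e. until no other process still holds a handle on it.
# Note: a missing file also raises an IOException subclass, so it is treated
# like a locked file and ends in a timeout.
function Wait-FileUnlocked {
    param(
        [Parameter(Mandatory)][string]$Path,
        [int]$TimeoutSec = 60,
        [int]$PollSec = 2
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSec)
    do {
        try {
            # FileShare 'None' fails with an IOException while the file is in use
            $stream = [System.IO.File]::Open($Path, 'Open', 'ReadWrite', 'None')
            $stream.Dispose()
            return $true
        } catch [System.IO.IOException] {
            Start-Sleep -Seconds $PollSec
        }
    } while ((Get-Date) -lt $deadline)
    return $false
}
```

In the failback sequence this would take the place of the fixed wait and would be wrapped in Invoke-Command against HYPERV1, since the file is opened locally. Keeping a timeout matters: if the lock never clears, the script should report it rather than hang.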
The final sequence for each VM
1. Stop-VBRReplicaFailover (index 0)
→ Cleanly shuts down the replica on HYPERV2
2. Start-VBRHvReplicaFailback (WITHOUT RunAsync = blocking)
→ Complete resync to HYPERV1
→ Veeam creates a new RP
3. Stop-VBRHvReplicaFailback (index 1)
→ Commit
4. Start-Sleep 15
→ VHDX release
5. Start-VM on HYPERV1
→ Wave delay → Next VM
Rollback order for AD/DNS continuity
SRV-PDC1 (PDC Domain1 + AD DNS) 120s
SRV-PDC2 (PDC Domain2 + AD DNS + DHCP) 90s
SRV-DNS1 (DNS forwarder + NTP) 60s
SRV-DC1 (Secondary DC Domain1 + AD DNS) 60s
SRV-DC2 (Secondary DC Domain2 + AD DNS + DHCP) 60s
SRV-DNS2 (DNS forwarder + NTP) 30s
... application services ...
... workstations ...
The logic is identical to the MCO DRP: at any given moment, at least one DC per domain and one DNS server are active somewhere on the network.
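The DC/DNS part of this order can be written as data and sanity-checked. This is a sketch under assumptions: `$VMStartOrder` is the variable name mentioned in the failback script header, but the Name/DelaySec object shape and the checks below are illustrative, not the author's actual structure.

```powershell
# Sketch of the DC/DNS tier of the failback order; delays come from the list above.
# The Name/DelaySec object shape is an assumption made for this example.
$VMStartOrder = @(
    [pscustomobject]@{ Name = 'SRV-PDC1'; DelaySec = 120 }
    [pscustomobject]@{ Name = 'SRV-PDC2'; DelaySec = 90 }
    [pscustomobject]@{ Name = 'SRV-DNS1'; DelaySec = 60 }
    [pscustomobject]@{ Name = 'SRV-DC1';  DelaySec = 60 }
    [pscustomobject]@{ Name = 'SRV-DC2';  DelaySec = 60 }
    [pscustomobject]@{ Name = 'SRV-DNS2'; DelaySec = 30 }
)

# VMs are handled one at a time, so the partner of the VM being moved is always up.
# The order adds one more property: each primary is processed before its secondary.
$names = @($VMStartOrder.Name)
$pairs = @(
    @('SRV-PDC1', 'SRV-DC1'),
    @('SRV-PDC2', 'SRV-DC2'),
    @('SRV-DNS1', 'SRV-DNS2')
)
$orderOk = $true
foreach ($pair in $pairs) {
    if ([array]::IndexOf($names, $pair[0]) -ge [array]::IndexOf($names, $pair[1])) {
        $orderOk = $false
    }
}

# Fixed wait time spent on the DC/DNS tier, excluding resync and commit
$totalDelaySec = ($VMStartOrder | Measure-Object -Property DelaySec -Sum).Sum
"Order valid: $orderOk, fixed delays: $totalDelaySec s"
```

The fixed delays alone add up to seven minutes for these six VMs, before any resync time is counted.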
Bugs that cost me time
Hyper-V module not loaded in script context
Get-VM worked perfectly in an interactive session but returned null in the script. Cause: the Hyper-V module is not automatically loaded in a non-interactive PowerShell context. Fix:
Import-Module Hyper-V -ErrorAction Stop -WarningAction SilentlyContinue
The .State property returned as a PSObject via WinRM
When querying a VM’s state via Invoke-Command, .State returns a PSObject rather than a string. The comparison $state -eq "Off" fails silently. Fix:
(Get-VM -Name $name).State.ToString()
Get-VBRSession requires a mandatory parameter
Contrary to what the documentation suggests, Get-VBRSession without parameters opens an interactive prompt. You must pass -ErrorAction SilentlyContinue to prevent a hang in an automated script.
The MCO deadlock
Initial implementation: the script waited for the replicas to be Running on HYPERV2 before shutting down the source VMs on HYPERV1. Problem: Veeam does not start a replica while the source VM is running. Result: a deadlock. The solution was to shut the source VMs down before launching the Failover Plan, then wait for the replicas to start.
Results
After a day of testing and iterations, both scripts are working in production:
| Scenario | Total time | AD/DNS downtime |
|---|---|---|
| MCO DRP (failover) | ~8 min | 0 seconds |
| MCO failback | ~25 min | 0 seconds |
| FULL DRP (crash) | ~3 min | N/A |
The failback time is longer because each VM is processed sequentially with its own resync—that’s the price of service continuity.
What I Would Have Done Differently
Test on an isolated VM before automating everything. Each Veeam pitfall (commit index 1, RunAsync vs. blocking, Stop-VBRReplicaFailover) could have been discovered on a single VM before integrating it into the full script.
Document Veeam PowerShell commands as you go. The official documentation is incomplete on certain points (behavior of Start-VBRHvReplicaFailback with -RunAsync, RP management after failback). The Veeam forums are more reliable.
The Scripts
Both scripts are designed to run from HYPERV2 (the DRP server). They include a versioned changelog and complete timestamped logs in C:\Scripts\DRP\Logs\. The failback script also has a -WhatIf mode for simulation without execution.
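A dry-run switch such as -WhatIf is commonly built on PowerShell's ShouldProcess mechanism. The sketch below is an assumption about how such a mode could be wired, not the scripts' actual implementation (the failback script declares a plain [switch]$WhatIf parameter). `Invoke-DrpStep` and its parameters are hypothetical names.

```powershell
# Hypothetical wrapper: every state-changing action goes through ShouldProcess,
# so -WhatIf prints what would happen instead of doing it.
function Invoke-DrpStep {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)][string]$Target,
        [Parameter(Mandatory)][string]$Action,
        [Parameter(Mandatory)][scriptblock]$ScriptBlock
    )
    if ($PSCmdlet.ShouldProcess($Target, $Action)) {
        & $ScriptBlock
    }
}

# Stand-in for a real action such as Start-VM: record the name in a list
$started = [System.Collections.Generic.List[string]]::new()

# Dry run: the script block is not executed
Invoke-DrpStep -Target 'SRV-PDC1' -Action 'Start VM' -ScriptBlock { $started.Add('SRV-PDC1') } -WhatIf

# Real run: the script block is executed
Invoke-DrpStep -Target 'SRV-DC1' -Action 'Start VM' -ScriptBlock { $started.Add('SRV-DC1') }
```

Routing all destructive calls through one wrapper keeps the simulation honest: a step cannot forget to test the switch.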
Start-DRP.ps1
Switches from HYPERV1 (prod) to HYPERV2 (DRP). Interactive menu on launch.
# Usage
.\Start-DRP.ps1 # Interactive menu
.\Start-DRP.ps1 -Mode MCO # Direct MCO
.\Start-DRP.ps1 -Mode CRASH # Direct crash
.\Start-DRP.ps1 -Mode MCO -SkipReplication # MCO without replication
# =============================================================================
# Start-DRP.ps1
# Complete DRP failover script, run from GIEDI PRIME (the DRP host, HYPERV2)
#
# Two failover modes:
#
# CRASH MODE (production down):
# Use when production is down or inaccessible.
# Starts all VMs on HYPERV2 via the "FULL-DRP" Failover Plan
# without worrying about service continuity.
# Steps:
# 1. Veeam replication (unless -SkipReplication)
# 2. Shut down all VMs on HYPERV1
# 3. Launch the "FULL-DRP" Failover Plan
#
# MCO MODE (scheduled maintenance):
# Use for scheduled maintenance with service continuity.
# Ensures that one DC per domain and one DNS server remain up during the switchover.
# Steps:
# 1. Veeam replication (unless -SkipReplication)
# 2. Shut down waves 5/4/3 on HYPERV1
# 3. Shut down Group 1 on HYPERV1 (SRV-DC1+SRV-DC2+SRV-DNS1)
# Group 2 stays up on HYPERV1 -> AD/DNS continuity
# 4. Launch the "MCO-DRP" Failover Plan (Wave 1 starts Group 1 on HYPERV2)
# 5. Wait for Group 1 to be running on HYPERV2
# 6. Shut down Group 2 on HYPERV1 (SRV-PDC1+SRV-PDC2+SRV-DNS2)
# Group 1 is up on HYPERV2 -> AD/DNS continuity
#
# Prerequisites:
# - WinRM enabled on HYPERV1
# - Admin rights on HYPERV1
# - Veeam Backup & Replication console installed on GIEDI PRIME
# - Script to be run in PowerShell Administrator on GIEDI PRIME
# - Failover Plans "FULL-DRP" and "MCO-DRP" created in Veeam
#
# Usage:
# .\Start-DRP.ps1 -> interactive menu
# .\Start-DRP.ps1 -Mode CRASH -> CRASH mode + replication
# .\Start-DRP.ps1 -Mode CRASH -SkipReplication -> CRASH mode without replication
# .\Start-DRP.ps1 -Mode MCO -> MCO mode + replication
# .\Start-DRP.ps1 -Mode MCO -SkipReplication -> MCO mode without replication
#
# =============================================================================
# WARNING SCRIPT MAINTENANCE - READ BEFORE MAKING ANY CHANGES
# =============================================================================
#
# Whenever a VM is added or removed from the infrastructure:
#
# 1. Update the "FULL-DRP" and "MCO-DRP" Failover Plans in Veeam
# 2. Update $VMShutdownNonCritical, $MCOGroup1, and $MCOGroup2
# 3. Update the CHANGELOG
#
# FULL-DRP sequence reminder:
# Wave 1: SRV-PDC1 (120s) > SRV-PDC2 (90s) > SRV-DNS1 (60s)
# Wave 2: SRV-DC1 (60s) > SRV-DC2 (60s) > SRV-DNS2 (30s)
# Wave 3: SRV-RADIUS (45s) > SRV-PROXY (30s) > SRV-SMTP (30s) > SRV-PASSBOLT (30s)
# Wave 4: SRV-SIEM (45s) > SRV-MONITORING (45s) > SRV-WSUS (30s)
# > SRV-PRINT (30s) > SRV-PKI (20s)
# Wave 5: SRV-PXE (20s) > WS-01 (20s) > WS-02 (20s)
# > WS-03 (20s)
#
# MCO-DRP order reminder:
# Wave 1: SRV-DC1 (90s) > SRV-DC2 (90s) > SRV-DNS1 (60s)
# Wave 2: SRV-PDC1 (120s) > SRV-PDC2 (90s) > SRV-DNS2 (30s)
# Wave 3: SRV-RADIUS (45s) > SRV-PROXY (30s) > SRV-SMTP (30s) > SRV-PASSBOLT (30s)
# Wave 4: SRV-SIEM (45s) > SRV-MONITORING (45s) > SRV-WSUS (30s)
# > SRV-PRINT (30s) > SRV-PKI (20s)
# Wave 5: SRV-PXE (20s) > WS-01 (20s) > WS-02 (20s)
# > WS-03 (20s)
#
# =============================================================================
# CHANGELOG
# =============================================================================
# 2026-03-25 - v1.0 - Initial version - 18 VMs
# 2026-03-25 - v1.1 - Job name correction: ReplicaVM-HYPERV1_Dayly
# 2026-03-25 - v1.2 - Added DRP flag
# 2026-03-28 - v1.3 - Import-Module instead of Add-PSSnapin
# 2026-03-28 - v1.4 - Fixed job end detection + -SkipReplication
# 2026-03-28 - v1.5 - Fixed Get-VBRSession
# 2026-03-28 - v1.6 - Fixed .ToString() on State via WinRM
# 2026-03-28 - v1.7 - Added verification of critical NETLOGON DCs (incorrect logic)
# 2026-03-28 - v1.8 - Fixed Get-VBRSession -ErrorAction SilentlyContinue
# 2026-03-28 - v1.9 - Refactored DC logic (incorrect order)
# 2026-03-28 - v2.0 - Completely refactored DC/DNS logic in pairs
# 2026-03-28 - v2.1 - Added MCO-DRP mode with guaranteed service continuity
# Group 1 (SRV-DC1+SRV-DC2+SRV-DNS1) starts on HYPERV2
# then shuts down on HYPERV1 before Group 2
# Verification via local Hyper-V (Get-VM _VeeamReplica)
# to avoid any name/IP conflicts during failover
# 2026-03-28 - v2.2 - Added interactive menu if -Mode is not specified
# Confirmation before launch
# 2026-03-28 - v2.3 - Fixed null state on Get-VM
# 2026-03-28 - v2.4 - Fixed MCO deadlock:
# Group 1 shuts down on HYPERV1 BEFORE the Failover Plan
# The Failover Plan starts Group 1 on HYPERV2
# Then Group 2 shuts down on HYPERV1
# =============================================================================
param(
[ValidateSet("CRASH","MCO")]
[string]$Mode = "",
[switch]$SkipReplication
)
# --- INTERACTIVE MENU ---------------------------------------------------------
# If Mode is not specified in the parameter, display the selection menu
if ($Mode -eq "") {
Write-Host ""
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host " DRP PROCEDURE - Select failover mode" -ForegroundColor Cyan
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host ""
Write-Host " [1] CRASH " -ForegroundColor Red -NoNewline
Write-Host "- Production down, immediate startup on GIEDI PRIME"
Write-Host " Uses the Failover Plan: FULL-DRP"
Write-Host ""
Write-Host " [2] MCO " -ForegroundColor Yellow -NoNewline
Write-Host "- Scheduled maintenance, service continuity guaranteed"
Write-Host " Uses the Failover Plan: MCO-DRP"
Write-Host ""
Write-Host " [3] MCO + Skip " -ForegroundColor Yellow -NoNewline
Write-Host "- MCO without replication (replicas already up to date)"
Write-Host " Uses the Failover Plan: MCO-DRP"
Write-Host ""
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host ""
$choice = Read-Host "Your choice (1/2/3)"
switch ($choice) {
"1" {
$Mode = "CRASH"
Write-Host ""
Write-Host "CRASH mode selected." -ForegroundColor Red
}
"2" {
$Mode = "MCO"
Write-Host ""
Write-Host "MCO mode selected." -ForegroundColor Yellow
}
"3" {
$Mode = "MCO"
$SkipReplication = $true
Write-Host ""
Write-Host "MCO + SkipReplication mode selected." -ForegroundColor Yellow
}
default {
Write-Host ""
Write-Host "Invalid choice. Stopping script." -ForegroundColor Red
exit 1
}
}
# Confirmation before running
Write-Host ""
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host " CONFIRMATION" -ForegroundColor Cyan
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host " Mode : $Mode" -ForegroundColor White
Write-Host " Failover Plan: $(if ($Mode -eq 'MCO') { 'MCO-DRP' } else { 'FULL-DRP' })" -ForegroundColor White
Write-Host " Replication : $(if ($SkipReplication) { 'IGNORED' } else { 'YES' })" -ForegroundColor White
Write-Host " Prod Host : HYPERV1" -ForegroundColor White
Write-Host "============================================================" -ForegroundColor Cyan
Write-Host ""
$confirm = Read-Host "Confirm launch? (Y/N)"
if ($confirm -notmatch "^[Y]$") {
Write-Host "Cancelled by user." -ForegroundColor Yellow
exit 0
}
}
# --- CONFIGURATION -----------------------------------------------------------
$ScriptVersion = "2.8"
$VeeamReplicaJobName = "ReplicaVM-HYPERV1_Daily"
$VeeamFailoverPlan = if ($Mode -eq "MCO") { "MCO-DRP" } else { "FULL-DRP" }
$ProdHost = "HYPERV1"
$LogFile = "C:\Scripts\DRP\Logs\DRP_${Mode}_$(Get-Date -Format 'yyyyMMdd_HHmmss').log"
$ShutdownTimeout = 300 # Max seconds to wait for a VM to shut down (5 min)
$ReplicationTimeout = 7200 # Max seconds to wait for replication to complete (2h)
$VMReadyTimeout = 600 # Max seconds to wait for a VM to be Running on HYPERV2 (10 min)
$VeeamModule = "C:\Program Files\Veeam\Backup and Replication\Console\Veeam.Backup.PowerShell.dll"
# Non-critical VMs - shutdown in both modes (waves 5, 4, 3)
$VMShutdownNonCritical = @(
# Wave 5 - Workstations + PXE
"WS-03", "WS-02", "WS-01", "SRV-PXE",
# Wave 4 - Application services
"SRV-PKI", "SRV-PRINT", "SRV-WSUS", "SRV-MONITORING", "SRV-SIEM",
# Wave 3 - Network services
"SRV-PASSBOLT", "SRV-SMTP", "SRV-PROXY", "SRV-RADIUS"
)
# CRASH Mode - complete shutdown of waves 2 and 1 (critical DC/DNS VMs)
$VMShutdownCrash = @(
"SRV-DNS2", "SRV-DC2", "SRV-DC1",
"SRV-DNS1", "SRV-PDC2", "SRV-PDC1"
)
# MCO Mode - Group 1: Secondary DCs + Primary DNS forwarder
# Shut down on HYPERV1 first, then started on HYPERV2 by Wave 1 of the MCO-DRP plan
# (their counterparts SRV-PDC1, SRV-PDC2, SRV-DNS2 remain up on HYPERV1)
$MCOGroup1 = @("SRV-DC1", "SRV-DC2", "SRV-DNS1")
# MCO Mode - Group 2: PDCs + Secondary DNS forwarder
# Shut down on HYPERV1 once Group 1 is confirmed Running on HYPERV2 (Wave 2 of the MCO-DRP plan)
$MCOGroup2 = @("SRV-PDC1", "SRV-PDC2", "SRV-DNS2")
# --- FUNCTIONS ---------------------------------------------------------------
function Write-Log {
param([string]$Message, [string]$Level = "INFO")
$timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
$line = "[$timestamp] [$Level] $Message"
Write-Host $line -ForegroundColor $(switch ($Level) {
"INFO" { "Cyan" }
"OK" { "Green" }
"WARN" { "Yellow" }
"ERROR" { "Red" }
default { "White" }
})
Add-Content -Path $LogFile -Value $line
}
function Wait-VMOff {
param([string]$VMName, [int]$TimeoutSec = $ShutdownTimeout)
$elapsed = 0
while ($elapsed -lt $TimeoutSec) {
$state = Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($name)
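# Guard against a missing VM: calling .ToString() on $null would throw remotely
$v = Get-VM -Name $name -ErrorAction SilentlyContinue
if ($v) { $v.State.ToString() } else { "NotFound" }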
} -ArgumentList $VMName
if ($state -eq "Off") { return $true }
Start-Sleep -Seconds 5
$elapsed += 5
}
return $false
}
function Stop-VMProprement {
param([string]$VMName)
$vmState = Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($name)
$v = Get-VM -Name $name -ErrorAction SilentlyContinue
if ($v) { $v.State.ToString() } else { "NotFound" }
} -ArgumentList $VMName
if ($vmState -eq "NotFound") {
Write-Log "VM '$VMName' not found on $ProdHost, skipped" "WARN"
return
}
if ($vmState -eq "Off") {
Write-Log "VM '$VMName' already powered off, skipped" "OK"
return
}
Write-Log "Shutting down '$VMName' (state: $vmState)..."
Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($name)
Stop-VM -Name $name -Force -ErrorAction SilentlyContinue
} -ArgumentList $VMName
$isOff = Wait-VMOff -VMName $VMName
if ($isOff) {
Write-Log "VM '$VMName' shut down properly" "OK"
} else {
Write-Log "VM '$VMName' did not respond, forced power off..." "WARN"
Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($name)
Stop-VM -Name $name -TurnOff -Force -ErrorAction SilentlyContinue
} -ArgumentList $VMName
Start-Sleep -Seconds 10
$finalState = Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($name)
(Get-VM -Name $name).State.ToString()
} -ArgumentList $VMName
if ($finalState -eq "Off") {
Write-Log "VM '$VMName' forcefully shut down" "OK"
} else {
Write-Log "VM '$VMName' could not be shut down! (state: $finalState)" "ERROR"
}
}
}
function Wait-VMRunningLocal {
# Verifies that the VM replica is running on HYPERV2 via local Hyper-V
# Checks the _VeeamReplica name directly on the local hypervisor
# No network dependency — avoids any conflict with the source VM on HYPERV1
param([string]$VMName, [int]$TimeoutSec = $VMReadyTimeout)
$replicaName = "${VMName}_VeeamReplica"
$elapsed = 0
Write-Log "Waiting for '$replicaName' to be running on GIEDI PRIME (local Hyper-V)..." "WARN"
while ($elapsed -lt $TimeoutSec) {
$vm = Get-VM -Name $replicaName -ErrorAction SilentlyContinue
$state = if ($vm) { $vm.State.ToString() } else { "NotFound" }
if ($state -eq "Running") {
Write-Log "'$VMName' confirmed as Running on GIEDI PRIME" "OK"
return $true
}
Write-Log "'$VMName' not yet Running (state: $state) — $([math]::Round($elapsed/60,1)) min" "WARN"
Start-Sleep -Seconds 15
$elapsed += 15
}
Write-Log "'$VMName' not Running after $($TimeoutSec/60) min — continuing anyway" "ERROR"
return $false
}
function Wait-GroupRunning {
param([string[]]$VMNames)
Write-Log "Waiting for all VMs in the group to be Running on GIEDI PRIME..."
$allReady = $true
foreach ($vmName in $VMNames) {
$ready = Wait-VMRunningLocal -VMName $vmName
if (-not $ready) { $allReady = $false }
}
return $allReady
}
# --- INITIALIZATION ----------------------------------------------------------
$logDir = Split-Path $LogFile
if (-not (Test-Path $logDir)) { New-Item -ItemType Directory -Path $logDir -Force | Out-Null }
Write-Log "============================================================"
Write-Log "START OF DRP PROCEDURE"
Write-Log "Version : $ScriptVersion"
Write-Log "Mode : $Mode"
Write-Log "Failover Plan: $VeeamFailoverPlan"
Write-Log "Veeam Job : $VeeamReplicaJobName"
Write-Log "Prod Host : $ProdHost"
if ($SkipReplication) { Write-Log "Replication : skipped (-SkipReplication)" "WARN" }
Write-Log "============================================================"
# Load the Hyper-V module (required for Get-VM in the script context)
Write-Log "Loading Hyper-V module..."
try {
Import-Module Hyper-V -ErrorAction Stop -WarningAction SilentlyContinue
Write-Log "Hyper-V module loaded" "OK"
} catch {
Write-Log "Unable to load Hyper-V module: $_" "ERROR"
exit 1
}
# Load the Veeam module
Write-Log "Loading Veeam PowerShell module..."
try {
Import-Module $VeeamModule -ErrorAction Stop -WarningAction SilentlyContinue
Write-Log "Veeam module loaded" "OK"
} catch {
Write-Log "Unable to load the Veeam module: $_" "ERROR"
exit 1
}
# Check WinRM on HYPERV1
try {
Invoke-Command -ComputerName $ProdHost -ScriptBlock { $env:COMPUTERNAME } -ErrorAction Stop | Out-Null
Write-Log "WinRM connection to $ProdHost OK" "OK"
} catch {
Write-Log "Unable to connect to $ProdHost via WinRM: $_" "ERROR"
exit 1
}
# --- STEP 1: REPLICATION ---------------------------------------------------
if ($SkipReplication) {
Write-Log "------------------------------------------------------------"
Write-Log "STEP 1: Replication skipped (-SkipReplication)" "WARN"
} else {
Write-Log "------------------------------------------------------------"
Write-Log "STEP 1: Launching replication job '$VeeamReplicaJobName'"
$flagFile = "C:\Scripts\DRP\DRP_MODE.flag"
try {
Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($f) New-Item -Path $f -ItemType File -Force | Out-Null
} -ArgumentList $flagFile
# ${ProdHost} is delimited: "$ProdHost:" would be parsed as a scoped variable reference
Write-Log "DRP flag created on ${ProdHost}: $flagFile" "OK"
} catch {
Write-Log "Unable to create DRP flag on ${ProdHost}: $_" "ERROR"
exit 1
}
try {
$job = Get-VBRJob -Name $VeeamReplicaJobName -ErrorAction Stop
} catch {
Write-Log "Replication job not found: $_" "ERROR"
Invoke-Command -ComputerName $ProdHost -ScriptBlock {
param($f) Remove-Item $f -Force -ErrorAction SilentlyContinue
} -ArgumentList $flagFile
exit 1
}
if ($job.IsRunning) {
Write-Log "Job is already running, waiting for completion..." "WARN"
} else {
Start-VBRJob -Job $job | Out-Null
Write-Log "Replication job starting" "OK"
}
Write-Log "Waiting for the job to actually start..."
Start-Sleep -Seconds 20
Write-Log "Waiting for replication to finish (timeout $($ReplicationTimeout/60) min)..."
$elapsed = 0
$success = $false
while ($elapsed -lt $ReplicationTimeout) {
$job = Get-VBRJob -Name $VeeamReplicaJobName
if (-not $job.IsRunning) {
$lastSession = Get-VBRSession -ErrorAction SilentlyContinue |
Where-Object { $_.JobName -eq $VeeamReplicaJobName } |
Sort-Object CreationTime -Descending |
Select-Object -First 1
if ($lastSession -and ($lastSession.Result -eq "Success" -or $lastSession.Result -eq "Warning")) {
Write-Log "Replication completed successfully (Result: $($lastSession.Result))" "OK"
$success = $true
break
} elseif ($lastSession -and $lastSession.Result -ne "" -and $lastSession.Result -ne "None") {
Write-Log "Replication completed with an ERROR (Result: $($lastSession.Result))" "ERROR"
break
} else {
Write-Log "Job starting, waiting for result..." "INFO"
Start-Sleep -Seconds 30
$elapsed += 30
continue
}
}
Start-Sleep -Seconds 30
$elapsed += 30
Write-Log "Replication in progress... ($([math]::Round($elapsed/60,1)) minutes elapsed)"
}
if (-not $success) {
Write-Log "Replication failed or timed out. Stopping script." "ERROR"
Write-Log "Run again with -SkipReplication if replicas are up to date." "ERROR"
exit 1
}
}
# --- STEP 2: SHUT DOWN NON-CRITICAL VMs ------------------------------------
Write-Log "------------------------------------------------------------"
Write-Log "STEP 2: Shut down non-critical VMs on $ProdHost (waves 5/4/3)"
foreach ($vmName in $VMShutdownNonCritical) {
Stop-VMProprement -VMName $vmName
}
# --- STEP 3: SHUT DOWN DC/DNS + LAUNCH FAILOVER PLAN --------------------
Write-Log "------------------------------------------------------------"
if ($Mode -eq "CRASH") {
# CRASH Mode: Shut down everything, then launch the Failover Plan
Write-Log "STEP 3: CRASH Mode — Shut down DC/DNS on $ProdHost"
foreach ($vmName in $VMShutdownCrash) {
Stop-VMProprement -VMName $vmName
}
Write-Log "------------------------------------------------------------"
Write-Log "STEP 4: Launching the Failover Plan '$VeeamFailoverPlan'"
try {
$fp = Get-VBRFailoverPlan -Name $VeeamFailoverPlan -ErrorAction Stop
Start-VBRFailoverPlan -FailoverPlan $fp | Out-Null
Write-Log "Failover Plan launched — VMs starting on GIEDI PRIME" "OK"
} catch {
Write-Log "Error launching the Failover Plan: $_" "ERROR"
exit 1
}
} else {
# MCO mode: sequence in 2 groups with service continuity
#
# Sequence:
# 1. Shut down Group 1 on HYPERV1 (SRV-DC1+SRV-DC2+SRV-DNS1)
# SRV-PDC1+SRV-PDC2+SRV-DNS2 still up on HYPERV1 -> AD/DNS continuity
# 2. Launch the MCO-DRP Failover Plan
# -> Wave 1 starts SRV-DC1+SRV-DC2+SRV-DNS1 on HYPERV2
# 3. Wait for Group 1 to be running on HYPERV2
# 4. Shut down Group 2 on HYPERV1 (SRV-PDC1+SRV-PDC2+SRV-DNS2)
# SRV-DC1+SRV-DC2+SRV-DNS1 up on HYPERV2 -> AD/DNS continuity
Write-Log "STEP 3: MCO Mode — Shut down Group 1 on $ProdHost"
Write-Log "Group 1: SRV-DC1 + SRV-DC2 + SRV-DNS1"
Write-Log "SRV-PDC1 + SRV-PDC2 + SRV-DNS2 remain up on $ProdHost -> AD/DNS continuity" "WARN"
foreach ($vmName in $MCOGroup1) {
Stop-VMProprement -VMName $vmName
}
Write-Log "Group 1 shut down on $ProdHost" "OK"
Write-Log "------------------------------------------------------------"
Write-Log "STEP 4: Launching the '$VeeamFailoverPlan' Failover Plan"
try {
$fp = Get-VBRFailoverPlan -Name $VeeamFailoverPlan -ErrorAction Stop
Start-VBRFailoverPlan -FailoverPlan $fp | Out-Null
Write-Log "Failover Plan launched — Wave 1 starts Group 1 on GIEDI PRIME" "OK"
} catch {
Write-Log "Error launching Failover Plan: $_" "ERROR"
exit 1
}
Write-Log "------------------------------------------------------------"
Write-Log "STEP 5: Waiting for Group 1 to run on GIEDI PRIME"
Write-Log "Group 1: SRV-DC1 + SRV-DC2 + SRV-DNS1"
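# Do not shut down Group 2 unless Group 1 is actually Running on the DRP host
if (-not (Wait-GroupRunning -VMNames $MCOGroup1)) {
Write-Log "Group 1 is not Running on GIEDI PRIME. Group 2 stays up on $ProdHost to preserve AD/DNS continuity. Stopping script." "ERROR"
exit 1
}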
Write-Log "------------------------------------------------------------"
Write-Log "STEP 6: Shutting down Group 2 on $ProdHost"
Write-Log "Group 2: SRV-PDC1 + SRV-PDC2 + SRV-DNS2"
Write-Log "Group 1 up on GIEDI PRIME -> AD/DNS continuity guaranteed" "WARN"
foreach ($vmName in $MCOGroup2) {
Stop-VMProprement -VMName $vmName
}
Write-Log "Group 2 shut down on $ProdHost" "OK"
}
# --- FINAL STEP: VERIFICATION ---------------------------------------------
Write-Log "------------------------------------------------------------"
Write-Log "FINAL STEP: Verification that all VMs are off on $ProdHost"
$allOff = $true
$vmStates = Invoke-Command -ComputerName $ProdHost -ScriptBlock {
Get-VM | Select-Object Name, @{N="State";E={$_.State.ToString()}}
}
foreach ($vm in $vmStates) {
if ($vm.State -ne "Off") {
Write-Log "VM '$($vm.Name)' still in state '$($vm.State)'" "ERROR"
$allOff = $false
} else {
Write-Log "VM '$($vm.Name)': Off" "OK"
}
}
if (-not $allOff) {
Write-Log "Some VMs not powered off — check manually" "WARN"
} else {
Write-Log "All VMs are Off on $ProdHost" "OK"
}
# --- END ---------------------------------------------------------------------
Write-Log "============================================================"
Write-Log "DRP PROCEDURE $Mode COMPLETED"
Write-Log "Monitor VM startup in the Veeam console"
Write-Log "Full log: $LogFile"
Write-Log "============================================================"
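Start-DRP.ps1 contains three hand-written polling loops (Wait-VMOff, Wait-VMRunningLocal, and the replication wait). One possible consolidation, shown purely as a sketch and not taken from the script, is a single helper that takes the condition as a script block. `Wait-Condition` and its parameters are hypothetical names.

```powershell
# Hypothetical helper: re-evaluates a condition at a fixed interval until it
# returns $true or the timeout is reached. Returns $true on success, $false on timeout.
function Wait-Condition {
    param(
        [Parameter(Mandatory)][scriptblock]$Condition,
        [int]$TimeoutSec = 300,
        [int]$IntervalSec = 5
    )
    $elapsed = 0
    while ($elapsed -lt $TimeoutSec) {
        if (& $Condition) { return $true }
        Start-Sleep -Seconds $IntervalSec
        $elapsed += $IntervalSec
    }
    return $false
}

# Usage with a stand-in condition that becomes true on the third evaluation
$state = @{ Attempts = 0 }
$reached = Wait-Condition -TimeoutSec 10 -IntervalSec 1 -Condition {
    $state.Attempts++
    $state.Attempts -ge 3
}

# A condition that never becomes true ends in a timeout
$timedOut = -not (Wait-Condition -TimeoutSec 2 -IntervalSec 1 -Condition { $false })
```

With such a helper, waiting for a replica would come down to passing a condition that compares the local Get-VM state to "Running", and the timeout handling would live in one place.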
Start-FailbackToProd.ps1
Failback from HYPERV2 (DRP) to HYPERV1 (prod). Interactive menu upon launch.
# Usage
.\Start-FailbackToProd.ps1 # Interactive menu
.\Start-FailbackToProd.ps1 -WhatIf # Dry run
# =============================================================================
# Start-FailbackToProd.ps1
# Script for returning to production from GIEDI PRIME
#
# Service continuity principle:
# VMs are processed in PAIRS to ensure that one DC per domain
# and one DNS are always up throughout the entire failback process.
#
# For each VM in order:
# 1. Stop-VBRReplicaFailover → Individual undo, shuts down the replica on HYPERV2
# without affecting other VMs in the Failover Plan
# 2. Start-VBRHvReplicaFailback (blocking) → resync to HYPERV1
# 3. Stop-VBRHvReplicaFailback (index 1) → commit
# 4. Wait 15s for VHDX release
# 5. Start-VM on HYPERV1
# 6. Wave delay → Next VM
#
# Example Domain1 (without interruption):
# SRV-PDC1 undo → failback → commit → start HYPERV1 (120s)
# SRV-DC1 still in failover on HYPERV2 → D1 covered ✅
# SRV-DC1 undo → failback → commit → start HYPERV1 (60s)
# SRV-PDC1 up on HYPERV1 → D1 covered ✅
#
# Prerequisites:
# - WinRM enabled on HYPERV1
# - Admin rights on HYPERV1
# - Veeam Backup & Replication console installed on GIEDI PRIME
# - Script to be run in PowerShell Administrator on GIEDI PRIME
# - DRP VMs must be in Failover state in Veeam
#
# Usage:
# .\Start-FailbackToProd.ps1 -> actual execution (interactive menu)
# .\Start-FailbackToProd.ps1 -WhatIf -> dry run (simulation without action)
#
# =============================================================================
# WARNING SCRIPT MAINTENANCE - READ BEFORE MAKING ANY CHANGES
# =============================================================================
#
# Whenever a VM is added or removed from the infrastructure:
#
# 1. Update $VMStartOrder below
# Follow the order in PAIRS for DC/DNS continuity:
# - Domain X PDC first, then Domain X secondary DC
# - Primary DNS first, then secondary DNS
#
# 2. Update the CHANGELOG
#
# Reminder of the order (pairs for service continuity):
# Wave 1: SRV-PDC1 (120s) > SRV-PDC2 (90s) > SRV-DNS1 (60s)
# Wave 2: SRV-DC1 (60s) > SRV-DC2 (60s) > SRV-DNS2 (30s)
# Wave 3: SRV-RADIUS (45s) > SRV-PROXY (30s) > SRV-SMTP (30s) > SRV-PASSBOLT (30s)
# Wave 4: SRV-SIEM (45s) > SRV-MONITORING (45s) > SRV-WSUS (30s)
# > SRV-PRINT (30s) > SRV-PKI (20s)
# Wave 5: SRV-PXE (20s) > WS-01 (20s) > WS-02 (20s) > WS-03 (20s)
#
# =============================================================================
# CHANGELOG
# =============================================================================
# 2026-03-28 - v1.0 - Initial release
# 2026-03-28 - v1.1 - Complete refactoring: VM-by-VM logic
# 2026-03-28 - v1.2 - Added DC verification before commit
# 2026-04-03 - v1.3 - Added SRV-PASSBOLT (Passbolt) in Wave 3 after SRV-SMTP (30s)
# 2026-04-03 - v1.4 - Removed RunAsync, commit index 1, global Undo, 15s delay
# 2026-04-03 - v1.5 - Fixed active plan detection via restore points
# 2026-04-03 - v1.6 - Simplified Undo logic
# 2026-04-03 - v1.7 - Added interactive MCO/CRASH menu
# 2026-04-03 - v1.8 - Order by pairs, removed global Undo
# 2026-04-03 - v1.9 - Individual Stop-VBRReplicaFailover before each failback
# 2026-04-03 - v2.0 - Final order: SRV-PDC1 > SRV-PDC2 > SRV-DNS1 > SRV-DC1
#                     > SRV-DC2 > SRV-DNS2 > services > workstations
#                     Replaced global Undo with individual Stop-VBRReplicaFailover
#                     before each failback.
#                     Ensures that the Failover Plan does not restart the replica
#                     after commit. Zero service interruption.
# =============================================================================
param(
    [ValidateSet("MCO","CRASH")]
    [string]$Mode = "",
    [switch]$WhatIf
)
# --- INTERACTIVE MENU ---------------------------------------------------------
if ($Mode -eq "") {
    Write-Host ""
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host " FAILBACK PROCEDURE TO PRODUCTION" -ForegroundColor Cyan
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host ""
    Write-Host " [1] MCO FAILBACK " -ForegroundColor Yellow -NoNewline
    Write-Host "- Return after scheduled maintenance"
    Write-Host "     Failover Plan: MCO-DRP"
    Write-Host ""
    Write-Host " [2] CRASH FAILBACK " -ForegroundColor Red -NoNewline
    Write-Host "- Return after disaster"
    Write-Host "     Failover Plan: FULL-DRP"
    Write-Host ""
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host ""
    $choice = Read-Host "Your choice (1/2)"
    switch ($choice) {
        "1" { $Mode = "MCO"; Write-Host "`nMCO FAILBACK mode selected." -ForegroundColor Yellow }
        "2" { $Mode = "CRASH"; Write-Host "`nCRASH FAILBACK mode selected." -ForegroundColor Red }
        default {
            Write-Host "`nInvalid choice. Script terminated." -ForegroundColor Red
            exit 1
        }
    }
    Write-Host ""
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host " CONFIRMATION" -ForegroundColor Cyan
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host " Mode      : FAILBACK $Mode" -ForegroundColor White
    Write-Host " Prod host : HYPERV1" -ForegroundColor White
    Write-Host "============================================================" -ForegroundColor Cyan
    Write-Host ""
    $confirm = Read-Host "Confirm launch? (Y/N)"
    # -notmatch is case-insensitive, so both "y" and "Y" are accepted
    if ($confirm -notmatch "^Y$") {
        Write-Host "Cancelled by user." -ForegroundColor Yellow
        exit 0
    }
}
# --- CONFIGURATION -----------------------------------------------------------
$ScriptVersion = "2.0"
$ProdHost = "HYPERV1"
$LogFile = "C:\Scripts\DRP\Logs\FAILBACK_${Mode}_$(Get-Date -Format 'yyyyMMdd_HHmmss').log"
$VeeamModule = "C:\Program Files\Veeam\Backup and Replication\Console\Veeam.Backup.PowerShell.dll"
# Production startup order
$VMStartOrder = @(
    # Wave 1 - Primary DCs + Primary DNS
    @{ Name = "SRV-PDC1"; Delay = 120 },
    @{ Name = "SRV-PDC2"; Delay = 90 },
    @{ Name = "SRV-DNS1"; Delay = 60 },
    # Wave 2 - Secondary DCs + Secondary DNS
    @{ Name = "SRV-DC1"; Delay = 60 },
    @{ Name = "SRV-DC2"; Delay = 60 },
    @{ Name = "SRV-DNS2"; Delay = 30 },
    # Wave 3 - Network services
    @{ Name = "SRV-RADIUS"; Delay = 45 },
    @{ Name = "SRV-PROXY"; Delay = 30 },
    @{ Name = "SRV-SMTP"; Delay = 30 },
    @{ Name = "SRV-PASSBOLT"; Delay = 30 },
    # Wave 4 - Application Services
    @{ Name = "SRV-SIEM"; Delay = 45 },
    @{ Name = "SRV-MONITORING"; Delay = 45 },
    @{ Name = "SRV-WSUS"; Delay = 30 },
    @{ Name = "SRV-PRINT"; Delay = 30 },
    @{ Name = "SRV-PKI"; Delay = 20 },
    # Wave 5 - Workstations
    @{ Name = "SRV-PXE"; Delay = 20 },
    @{ Name = "WS-01"; Delay = 20 },
    @{ Name = "WS-02"; Delay = 20 },
    @{ Name = "WS-03"; Delay = 20 }
)
# --- FUNCTIONS ---------------------------------------------------------------
function Write-Log {
    param([string]$Message, [string]$Level = "INFO")
    $timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
    $prefix = if ($WhatIf) { "[WHATIF] " } else { "" }
    $line = "[$timestamp] [$Level] $prefix$Message"
    Write-Host $line -ForegroundColor $(switch ($Level) {
        "INFO"  { "Cyan" }
        "OK"    { "Green" }
        "WARN"  { "Yellow" }
        "ERROR" { "Red" }
        default { "White" }
    })
    # UTF8 keeps the arrows and other non-ASCII characters readable in the log file
    Add-Content -Path $LogFile -Value $line -Encoding UTF8
}
function Get-FailoverRestorePoint {
    # Index 0: restore point in Failover state → proceeds to Stop-VBRReplicaFailover
    # then to Start-VBRHvReplicaFailback
    param([string]$VmName)
    return Get-VBRRestorePoint |
        Where-Object { $_.IsReplica() -and $_.VmName -eq $VmName -and $_.State.ToString() -eq "Failover" } |
        Sort-Object CreationTime -Descending |
        Select-Object -First 1
}
function Get-CommitRestorePoint {
    # Index 1: second most recent RP after failback
    # Start-VBRHvReplicaFailback creates a new RP (index 0)
    # We commit index 1 = the old RP to prevent the VM from being locked in LockedItem
    param([string]$VmName)
    return Get-VBRRestorePoint |
        Where-Object { $_.IsReplica() -and $_.VmName -eq $VmName } |
        Sort-Object CreationTime -Descending |
        Select-Object -Skip 1 -First 1
}
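# --- OPTIONAL: ROSTER CHECK (hypothetical helper, not in the original script) --
# A sketch supporting the maintenance warning in the header: it flags VMs that
# are listed in $VMStartOrder but absent from the production host, and the
# reverse. Compare-VMRoster is an assumed name, and nothing in the script calls
# it automatically.
function Compare-VMRoster {
    param([string[]]$Planned, [string[]]$Actual)
    Compare-Object -ReferenceObject $Planned -DifferenceObject $Actual |
        ForEach-Object {
            [pscustomobject]@{
                Name   = $_.InputObject
                Status = if ($_.SideIndicator -eq '<=') { 'InPlanOnly' } else { 'OnHostOnly' }
            }
        }
}
# Possible use, once the WinRM connection to $ProdHost has been verified:
#   $planned = $VMStartOrder | ForEach-Object { $_.Name }
#   $actual  = Invoke-Command -ComputerName $ProdHost -ScriptBlock { (Get-VM).Name }
#   Compare-VMRoster -Planned $planned -Actual $actual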
# --- INITIALIZATION ----------------------------------------------------------
$logDir = Split-Path $LogFile
if (-not (Test-Path $logDir)) { New-Item -ItemType Directory -Path $logDir -Force | Out-Null }
Write-Log "============================================================"
Write-Log "START OF THE FAILBACK TO PRODUCTION PROCEDURE"
Write-Log "Version : $ScriptVersion"
Write-Log "Mode : FAILBACK $Mode"
Write-Log "Prod Host : $ProdHost"
if ($WhatIf) { Write-Log "DRY RUN MODE - NO ACTUAL ACTION" "WARN" }
Write-Log "============================================================"
# Load the Veeam module
Write-Log "Loading the Veeam PowerShell module..."
try {
    Import-Module $VeeamModule -ErrorAction Stop -WarningAction SilentlyContinue
    Write-Log "Veeam module loaded" "OK"
} catch {
    Write-Log "Unable to load the Veeam module: $_" "ERROR"
    exit 1
}
# Check WinRM on HYPERV1
try {
    Invoke-Command -ComputerName $ProdHost -ScriptBlock { $env:COMPUTERNAME } -ErrorAction Stop | Out-Null
    Write-Log "WinRM connection to $ProdHost OK" "OK"
} catch {
    Write-Log "Unable to connect to $ProdHost via WinRM: $_" "ERROR"
    exit 1
}
# --- PRE-FLIGHT VERIFICATION -------------------------------------------------
Write-Log "------------------------------------------------------------"
Write-Log "VERIFICATION: Restore points in Failover state"
$missingVMs = @()
foreach ($vm in $VMStartOrder) {
    $rp = Get-FailoverRestorePoint -VmName $vm.Name
    if (-not $rp) {
        Write-Log "WARNING: No failover restore point for '$($vm.Name)'" "WARN"
        $missingVMs += $vm.Name
    } else {
        Write-Log "OK: '$($vm.Name)' → restore point from $($rp.CreationTime)" "OK"
    }
}
if ($missingVMs.Count -gt 0) {
    Write-Log "$($missingVMs.Count) VM(s) without a failover restore point: $($missingVMs -join ', ')" "WARN"
    Write-Log "These VMs will be ignored for failback" "WARN"
}
# --- DRY RUN -----------------------------------------------------------------
if ($WhatIf) {
    Write-Log "------------------------------------------------------------"
    Write-Log "DRY RUN - Simulation of the paired procedure:" "WARN"
    Write-Log "Continuity guaranteed: 1 DC per domain + 1 DNS always up" "WARN"
    Write-Log "------------------------------------------------------------" "WARN"
    foreach ($vm in $VMStartOrder) {
        $rp = Get-FailoverRestorePoint -VmName $vm.Name
        if ($rp) {
            Write-Log " → Stop-VBRReplicaFailover '$($vm.Name)' - HYPERV2 replica shut down" "WARN"
            Write-Log " → Failback '$($vm.Name)' (blocking, resync to HYPERV1)" "WARN"
            Write-Log " → Commit failback '$($vm.Name)' (index 1)" "WARN"
            Write-Log " → Waiting 15s for VHDX release" "WARN"
        } else {
            Write-Log " → '$($vm.Name)' not in Failover - direct startup if present" "WARN"
        }
        Write-Log " → Starting '$($vm.Name)' on $ProdHost - delay $($vm.Delay)s" "WARN"
        Write-Log " ---" "WARN"
    }
    Write-Log "------------------------------------------------------------"
    Write-Log "DRY RUN COMPLETED - No action taken" "WARN"
    Write-Log "Run again without -WhatIf to execute" "WARN"
    exit 0
}
# --- VM-BY-VM PROCESSING ----------------------------------------------------
Write-Log "------------------------------------------------------------"
Write-Log "STARTING VM-BY-VM FAILBACK (pair-wise order)"
Write-Log "Continuity guaranteed: 1 DC per domain + 1 DNS always up" "OK"
foreach ($vm in $VMStartOrder) {
    $vmName = $vm.Name
    $delay = $vm.Delay
    Write-Log "============ $vmName ============"
    $rp = Get-FailoverRestorePoint -VmName $vmName
    if ($rp) {
        # STEP A - Individual undo via Stop-VBRReplicaFailover
        # Cleanly shuts down the replica on HYPERV2 without affecting the others
        # Prevents the Failover Plan from restarting the replica after commit
        Write-Log "[$vmName] Individual undo (Stop-VBRReplicaFailover)..."
        try {
            Stop-VBRReplicaFailover -RestorePoint $rp -ErrorAction Stop | Out-Null
            Write-Log "[$vmName] Replica shut down on HYPERV2" "OK"
        } catch {
            Write-Log "[$vmName] Stop-VBRReplicaFailover error: $_" "ERROR"
        }
        # STEP B - Blocking failback to HYPERV1
        Write-Log "[$vmName] Starting failback (blocking, resync to HYPERV1)..."
        $rpFresh = Get-VBRRestorePoint |
            Where-Object { $_.IsReplica() -and $_.VmName -eq $vmName } |
            Sort-Object CreationTime -Descending |
            Select-Object -First 1
        if ($rpFresh) {
            try {
                Start-VBRHvReplicaFailback `
                    -RestorePoint $rpFresh `
                    -QuickRollback `
                    -PowerOn:$false `
                    -ErrorAction Stop | Out-Null
                Write-Log "[$vmName] Failback completed" "OK"
            } catch {
                Write-Log "[$vmName] Failback error: $_" "ERROR"
            }
        } else {
            Write-Log "[$vmName] No restore point available for failback" "ERROR"
        }
        # STEP C - Commit on index 1
        Write-Log "[$vmName] Commit failback (index 1)..."
        $rpCommit = Get-CommitRestorePoint -VmName $vmName
        if ($rpCommit) {
            try {
                Stop-VBRHvReplicaFailback -RestorePoint $rpCommit -ErrorAction Stop | Out-Null
                Write-Log "[$vmName] Commit OK (RP from $($rpCommit.CreationTime))" "OK"
            } catch {
                Write-Log "[$vmName] Commit error: $_" "WARN"
            }
        } else {
            Write-Log "[$vmName] No RP index 1 found" "WARN"
        }
        # STEP D - Waiting for VHDX release
        Write-Log "[$vmName] Waiting 15s for VHDX release..."
        Start-Sleep -Seconds 15
    } else {
        Write-Log "[$vmName] No Failover restore point: VM skipped for failback" "WARN"
    }
    # STEP E - Start the VM on HYPERV1
    $vmState = Invoke-Command -ComputerName $ProdHost -ScriptBlock {
        param($name)
        $v = Get-VM -Name $name -ErrorAction SilentlyContinue
        if ($v) { $v.State.ToString() } else { "NotFound" }
    } -ArgumentList $vmName
    if ($vmState -eq "NotFound") {
        Write-Log "[$vmName] VM not found on $ProdHost" "WARN"
    } elseif ($vmState -eq "Running") {
        Write-Log "[$vmName] VM already running on $ProdHost" "OK"
    } else {
        try {
            Invoke-Command -ComputerName $ProdHost -ScriptBlock {
                param($name)
                Start-VM -Name $name -ErrorAction Stop
            } -ArgumentList $vmName
            Write-Log "[$vmName] VM started on $ProdHost" "OK"
        } catch {
            Write-Log "[$vmName] Startup error on $ProdHost: $_" "ERROR"
        }
    }
    # STEP F - Delay before next VM
    Write-Log "[$vmName] Waiting $delay seconds before the next VM..."
    Start-Sleep -Seconds $delay
}
# --- END ---------------------------------------------------------------------
Write-Log "============================================================"
Write-Log "FAILBACK COMPLETE — PRODUCTION RESTORED ON $ProdHost"
Write-Log "Check the services (AD, DNS, DHCP, auth...)"
Write-Log "Don't forget to re-enable VM autostart on $ProdHost:"
Write-Log " Get-VM | Set-VM -AutomaticStartAction StartIfRunning"
Write-Log "Full log: $LogFile"
Write-Log "============================================================"
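The closing log lines ask for a manual check of AD, DNS, DHCP and authentication. Here is a minimal sketch of how part of that check could be scripted, assuming plain TCP reachability is an acceptable first signal. The function name Test-CoreServices, the port choices and the injectable -Probe parameter are illustrative and not part of the original scripts.

```powershell
# Hypothetical post-failback smoke test (not part of Start-FailbackToProd.ps1).
function Test-CoreServices {
    param(
        [hashtable[]]$Checks,
        # The probe returns $true/$false for a host name and a TCP port.
        # The default relies on Test-NetConnection (Windows only); pass another
        # scriptblock to exercise the function without touching the network.
        [scriptblock]$Probe = {
            param($HostName, $Port)
            (Test-NetConnection -ComputerName $HostName -Port $Port -WarningAction SilentlyContinue).TcpTestSucceeded
        }
    )
    foreach ($c in $Checks) {
        [pscustomobject]@{
            Host    = $c.Host
            Service = $c.Service
            Port    = $c.Port
            Ok      = [bool](& $Probe $c.Host $c.Port)
        }
    }
}

# Assumed mapping: 389 = LDAP, 88 = Kerberos, 53 = DNS over TCP.
$checks = @(
    @{ Host = 'SRV-PDC1'; Service = 'LDAP';     Port = 389 },
    @{ Host = 'SRV-PDC2'; Service = 'Kerberos'; Port = 88  },
    @{ Host = 'SRV-DNS1'; Service = 'DNS';      Port = 53  }
)
```

Running `Test-CoreServices -Checks $checks | Where-Object { -not $_.Ok }` would then list only the services that did not answer. DHCP runs over UDP, so a TCP probe like this one does not cover it, and an open port does not prove that replication or authentication actually work; it only narrows down where to look first.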