Suntory Azure Managed Service Standard Document

Azure Virtual Machine
Operation Manual
Document ID: AZ-VM-OPS-001
Version: 1.0
Status: RELEASED
Created: 2026-05-18
Revised: 2026-05-18
Service Manager: Suntory Holdings Limited (SHD)
Operations Team: TCS (responsible for procedure execution and proactive updates to this document)
Author: Tomoki Koyama

This document defines the day-to-day operational procedures for Azure Virtual Machines, including procedures for incident and failure alert response.
TCS shall perform all operations in accordance with this document and proactively keep it up to date whenever changes occur.

Revision History

Ver. | Revision Date | Author | Description | Approver
1.0 | 2026-05-18 | Tomoki Koyama (SHD) | Initial release | -

📋 Related Documents

Document ID | Document Name | Type | Notes
AZ-VM-OVERVIEW-001 | Azure Virtual Machine Service Overview | Service Overview | -
AZ-VM-DESIGN-001 | Azure Virtual Machine Design Document | Design Document | Design rationale and standard value details
AZ-VM-PARAM-001 | Azure Virtual Machine Parameter Sheet | Parameter Sheet | Entry and approval form for build-time configuration
AZ-VM-BUILD-001 | Azure Virtual Machine Build Procedure | Build Procedure | Portal / Terraform build procedures
AZ-VM-OPS-001 | Azure Virtual Machine Operation Manual (this document) | Operation Manual | -

📌 Out of Scope for This Document (to be covered in separate documents)
The following procedures are not included in this document. Please refer to the respective separate documents.
· Azure technical inquiry and incident inquiry procedures
· OS patch management operations
· Monitoring alert operations

Responsibility Matrix (RACI)

Role Definitions

R = Responsible, A = Accountable, C = Consulted, I = Informed

Task / Activity | SHD | TCS | BU Representative
Operation Manual management and final approval | A | R | I
ServiceNow request submission (Spec Upgrade / Disk Addition / Deletion / Restore) | I | I | R
ServiceNow CR / Incident closure | I | R | I
VM Restart approval (Normal Operations) | I | C | A
VM Restart approval (Emergency: Incident / Failure) | A | R | I
Pre-Work Backup execution | I | R | I
VM Restart execution | I | R | I
VM Spec Upgrade execution | I | R | I
Disk Addition execution | I | R | I
Disk Expansion execution (TCS decision based on monitoring alert) | I | A | I
VM Deletion execution | I | R | I
VM Restore execution | I | R | I
Work Completion Report | I | R | I

Common Rules / Pre-Work Checklist

⚠️ Mandatory Pre-Work Backup Principle: TCS must, as a rule, obtain an on-demand backup via Azure Backup before executing any CR (Change Request).
(Although periodic backups are already taken by Azure Backup, this ensures the state immediately before the operation is preserved.)

Common Pre-Work Checklist (applies to all operations)

☐ | No. | Verification Item | Details / Reference
☐ | 1 | Verify target VM and Resource Group name | Confirm information stated in the ServiceNow ticket or alert
☐ | 2 | Obtain pre-work backup | Refer to "Procedure 1: Pre-Work Backup" in this document. Excluded for emergency restarts.
☐ | 3 | Log in to Azure Portal and confirm permissions | Verify that the Contributor role or higher is assigned for the target Subscription
☐ | 4 | Confirm maintenance window | As a rule, perform work within the agreed maintenance window to minimize business impact
☐ | 5 | Confirm ServiceNow ticket | Ensure the ticket includes work details, requester, and approval status (for request-triggered operations)
☐ | 6 | Confirm where to record completion | Record the work result in the ServiceNow ticket and ensure it is properly closed

Trigger Types and Work Flows

Trigger Type | Source | Basic Flow
ServiceNow CR | BU Representative submits a request via ServiceNow | Review CR → Pre-work checks → Obtain backup → Execute work → Verify → Close CR
Monitoring Alert | Azure Monitor / New Relic alert | Review and assess alert → Emergency decision → Notify BU (if possible) → Execute work → Record incident
Incident (Emergency) | Failure detected / Escalation | Assess situation → Immediate response (post-hoc BU notification) → Root cause investigation → Record incident

Procedure 1: Pre-Work Backup

Trigger: TCS executes this as a rule before performing any CR (except emergency restarts)
Operator: TCS
Estimated Duration: Approx. 10–30 minutes (until the backup job completes)
Take an On-Demand Backup from Azure Portal
☐ | Step | Action | Details / Verification Points
☐ | 1 | Open the Recovery Services Vault | Azure Portal → "Recovery Services vaults" → Select the target Vault
☐ | 2 | Confirm the backup target VM | "Backup items" → "Azure Virtual Machine" → Confirm the target VM is displayed
☐ | 3 | Execute "Backup Now" | Select the target VM → "Backup Now" → Set retention period (30 days or more recommended for work backups) → "OK"
☐ | 4 | Confirm backup job completion | In "Backup Jobs", wait until the job status shows "Completed". Proceed to the next step only after completion.
☐ | 5 | Record the backup point | Record the date and time of the completed backup point in the ServiceNow ticket
Azure CLI (Alternative Procedure)
# Trigger an on-demand backup
az backup protection backup-now \
  --resource-group <RG-name> \
  --vault-name <Vault-name> \
  --container-name <VM-name> \
  --item-name <VM-name> \
  --backup-management-type AzureIaasVM \
  --retain-until 30-06-2026   # Backup retention date (DD-MM-YYYY)

# Check backup job status
az backup job list \
  --resource-group <RG-name> \
  --vault-name <Vault-name> \
  --status InProgress
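
The retention date and the wait in Step 4 can also be scripted. The following is a minimal sketch rather than part of the standard procedure: the function names `retain_until` and `wait_for_backup_job` are hypothetical, GNU `date` is assumed, and the job name is assumed to be the `name` field of the job returned by `az backup protection backup-now`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; assumes GNU date and a logged-in Azure CLI.

# Print the date N days after a base date (default: today) as DD-MM-YYYY,
# the format expected by --retain-until.
retain_until() {
  local days="${1:-30}" base="${2:-$(date +%Y-%m-%d)}"
  date -d "${base} +${days} days" +%d-%m-%Y
}

# Poll one backup job until it leaves the InProgress state.
# Prints the final status and returns 0 only if it is "Completed".
wait_for_backup_job() {
  local rg="$1" vault="$2" job="$3" interval="${4:-60}" status
  while :; do
    status="$(az backup job show --resource-group "$rg" --vault-name "$vault" \
      --name "$job" --query properties.status --output tsv)" || return 2
    [ "$status" = "InProgress" ] || break
    sleep "$interval"
  done
  echo "$status"
  [ "$status" = "Completed" ]
}

# Example (not executed here):
#   az backup protection backup-now ... --retain-until "$(retain_until 30)"
#   wait_for_backup_job <RG-name> <Vault-name> <job-name>
```

Any final status other than "Completed" (for example "Failed") makes the helper return non-zero, which matches the rule that the next step starts only after the backup completes.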

Procedure 2: Azure VM Restart

Trigger: When an incident or failure alert occurs, or when TCS determines that a restart is necessary for service recovery
Operator: TCS
Important Rules:
[Normal Operations] Before restarting, TCS must always contact the BU Representative for the target system, obtain approval for execution and the scheduled date/time, and then proceed.
[Emergency] In the event of a complete VM outage where service continuation is impossible, an immediate restart without prior BU notification is permitted. However, TCS must provide a post-hoc report to the BU promptly after the restart.

Normal Operations Restart (after BU approval) [ServiceNow CR or TCS-initiated]

Step A: Pre-Confirmation with BU Representative

☐ | Step | Action | Details / Verification Points
☐ | 1 | Prepare restart reason, target VM, and proposed date/time | Confirm and record the basis for the restart need (alert details, error logs, etc.)
☐ | 2 | Send a confirmation message to the BU Representative | Contact via Teams or Email using the confirmation template below
☐ | 3 | Obtain approval from BU | Receive the approval message and record the approval details (scheduled date/time) in the ServiceNow ticket. If approval cannot be obtained, escalate to the next level.

Confirmation Message Template for BU Representative

Subject: [Action Required / Response Needed] VM Restart Approval Request (VM Name: <Target VM Name>)

Dear <BU Representative Name>,

I hope this message finds you well. This is <Operator Name> from the TCS Operations Team.
We have identified an issue with the following VM and would like to perform a restart.
We apologize for the inconvenience, and would appreciate your confirmation on whether we may proceed, along with your preferred date and time.

■ Target VM Name    : <VM Name>
■ Target RG Name    : <Resource Group Name>
■ Reason for Restart: <Describe specifically, e.g., sustained CPU spike, OS freeze, etc.>
■ Proposed Date/Time: <e.g., May 18, 2026, 22:00–22:30 (JST)>
■ Estimated Downtime: Approx. 5–10 minutes
■ Operator          : <TCS Operator Name / Contact>

To approve, please reply with "Approved (Scheduled time: XX:XX)" at your earliest convenience.
Thank you for your cooperation.

Step B: Executing the Restart

☐ | Step | Action | Details / Verification Points
☐ | 1 | Obtain pre-work backup | Execute "Procedure 1: Pre-Work Backup"
☐ | 2 | Open the target VM in the Portal | Azure Portal → "Virtual machines" → Select the target VM
☐ | 3 | Execute "Restart" | Click "Restart" in the top menu → Click "Yes" in the confirmation dialog
☐ | 4 | Confirm VM returns to Running state | Wait until the VM "Status" shows "Running" (typically 3–5 minutes)
☐ | 5 | Verify application and service operation | Confirm that the application is functioning normally via the BU Representative or system monitoring
☐ | 6 | Record results in ServiceNow ticket and close | Record the restart date/time, result, and verification details
Azure CLI
# Restart the VM (OS shutdown followed by start)
az vm restart \
  --resource-group <RG-name> \
  --name <VM-name>

# Check VM status
az vm get-instance-view \
  --resource-group <RG-name> \
  --name <VM-name> \
  --query "instanceView.statuses[?starts_with(code,'PowerState')]" \
  --output table
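
Step 4 (wait until the VM shows "Running") can be automated by polling the same instance view. This is a minimal sketch under stated assumptions: `wait_for_vm_running` is a hypothetical helper name, and the 600-second timeout and 15-second interval are illustrative defaults, not values defined by this manual.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; timeout and interval defaults are illustrative.

# Poll the VM power state until it is running or the timeout (seconds) expires.
wait_for_vm_running() {
  local rg="$1" vm="$2" timeout="${3:-600}" interval="${4:-15}" waited=0 state
  while [ "$waited" -le "$timeout" ]; do
    state="$(az vm get-instance-view --resource-group "$rg" --name "$vm" \
      --query "instanceView.statuses[?starts_with(code,'PowerState/')].code | [0]" \
      --output tsv)"
    if [ "$state" = "PowerState/running" ]; then
      echo "running after ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out; last state: ${state:-unknown}" >&2
  return 1
}

# Example (not executed here):
#   az vm restart --resource-group <RG-name> --name <VM-name>
#   wait_for_vm_running <RG-name> <VM-name> || echo "escalate to the next level"
```

A non-zero return only means the power state did not reach running in time; application-level verification (Step 5) is still required.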
Emergency Restart (when service is down due to failure) [Emergency Response]
⚠️ Prior BU confirmation may be omitted. However, TCS must promptly provide a post-hoc report to the BU Representative after the restart is complete.
If the issue is not resolved after the restart, escalate to the next level.
☐ | Step | Action | Details / Verification Points
☐ | 1 | Confirm and record the failure situation | Record alert details, logs, and error messages. Register the incident in ServiceNow.
☐ | 2 | Execute "Restart" on the VM | Perform an immediate restart via Azure Portal or CLI (backup not required)
☐ | 3 | Confirm VM returns to Running state | Wait until Status shows "Running"
☐ | 4 | Confirm service recovery | Confirm the service is operating normally via system monitoring
☐ | 5 | Send post-hoc report to BU Representative | Promptly report the reason for restart, execution date/time, result, and service recovery status
☐ | 6 | Root cause investigation and incident recording | Investigate the root cause that necessitated the restart and record it in ServiceNow

Procedure 3: Azure VM Spec Upgrade (VM Resize)

Trigger: [ServiceNow CR] Performed upon receipt of a ServiceNow request from a BU Representative
Operator: TCS
Important Notes: This procedure stops (deallocates) the VM before resizing, so the VM is down during the operation (service interruption). Always agree on a downtime window with the BU before proceeding.

Notes on VM Family Change

⚠️ When changing VM families (e.g., D-series → E-series), always verify the following items.
Verification Item | Details
Availability Zone compatibility | Confirm that the new VM size is available in the current Availability Zone (Zone 1 / 2 / 3). Check "Zone availability" on the VM size selection screen in the Portal.
Trusted Launch (vTPM / Secure Boot) support | Confirm that the new VM size supports Trusted Launch (check via Basics tab → Security type)
Accelerated Networking support | Confirm that the new size supports SR-IOV. Re-verify that Accelerated Networking remains enabled in the NIC settings.
Ultra Disk compatibility | If Ultra Disk is in use, confirm that the new VM size supports Ultra Disk.
Data disk cache settings | Confirm that data disk cache settings are maintained after the family change.
VM Resize Procedure
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the ServiceNow CR | Confirm that the new VM size, scheduled date/time, requester, and BU approval status are all documented
☐ | 2 | Review Family Change notes | Check the "Notes on VM Family Change" section above
☐ | 3 | Obtain pre-work backup | Execute "Procedure 1: Pre-Work Backup"
☐ | 4 | Notify BU Representative of work start | Notify the planned start time and expected downtime duration
☐ | 5 | Stop (deallocate) the VM | Azure Portal → "Stop" → "Yes" → Wait until Status shows "Stopped (deallocated)"
☐ | 6 | Resize the VM | Portal → VM → "Settings" → "Size" → Select the new size → Click "Resize"
☐ | 7 | Start the VM | Click "Start" → Wait until Status shows "Running"
☐ | 8 | Verify the size change | Confirm the new size is displayed in VM "Overview" → "Size"
☐ | 9 | Re-verify Accelerated Networking | Confirm NIC "Overview" → "Accelerated networking" shows "Enabled"
☐ | 10 | Verify application operation | Coordinate with the BU Representative to confirm the service is operating normally
☐ | 11 | Complete and close the ServiceNow CR | Record the before/after sizes, execution date/time, and verification results, then close
Azure CLI
# Stop (deallocate) the VM
az vm deallocate --resource-group <RG-name> --name <VM-name>

# Check that the target size is offered as a resize option for this VM
# (an empty result means the size cannot be selected)
az vm list-vm-resize-options \
  --resource-group <RG-name> \
  --name <VM-name> \
  --query "[?name=='Standard_D4s_v5']" --output table

# Resize the VM
az vm resize \
  --resource-group <RG-name> \
  --name <VM-name> \
  --size Standard_D4s_v5

# Start the VM
az vm start --resource-group <RG-name> --name <VM-name>
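
The CLI sequence above can be wrapped so that the VM is never left deallocated when the requested size turns out not to be offered. The sketch below is illustrative only: `resize_vm` is a hypothetical helper name, and the behavior of starting the VM again with its current size on failure is an assumption about the desired fallback, not a rule stated in this manual.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical wrapper around the deallocate / resize / start sequence.

# Deallocate, resize if the size is offered, and always start the VM again.
# Returns 0 only when the resize was actually applied.
resize_vm() {
  local rg="$1" vm="$2" size="$3" offered
  az vm deallocate --resource-group "$rg" --name "$vm" || return 2
  # Count matching entries among the sizes offered to the deallocated VM.
  offered="$(az vm list-vm-resize-options --resource-group "$rg" --name "$vm" \
    --query "[?name=='${size}'] | length(@)" --output tsv)"
  if [ "${offered:-0}" -ge 1 ] 2>/dev/null; then
    az vm resize --resource-group "$rg" --name "$vm" --size "$size" || offered=0
  else
    offered=0
    echo "${size} is not offered for ${vm}; starting it with its current size" >&2
  fi
  # Start the VM in every case so that it is never left deallocated.
  az vm start --resource-group "$rg" --name "$vm" || return 3
  [ "$offered" -ge 1 ]
}

# Example (not executed here):
#   resize_vm <RG-name> <VM-name> Standard_D4s_v5
```

The check runs after deallocation on purpose, following the order of the documented procedure.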

Procedure 4: Disk Addition and Expansion

Disk Addition Trigger: [ServiceNow CR] Performed upon receipt of a ServiceNow request from a BU Representative
Disk Expansion Trigger: [ServiceNow CR] Request from BU, or [Monitoring Alert] when a disk capacity exhaustion alert is triggered and TCS determines an emergency expansion is needed (TCS may proceed at their own discretion)
Operator: TCS

4-A: New Data Disk Addition [ServiceNow CR Trigger]
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the ServiceNow CR | Confirm the disk size, storage type, and intended use
☐ | 2 | Obtain pre-work backup | Execute "Procedure 1: Pre-Work Backup"
☐ | 3 | Open the VM's Disks settings in the Portal | Azure Portal → Target VM → "Settings" → "Disks" → Click "+ Add data disk"
☐ | 4 | Create and configure the new disk | Configure the following in "Create disk":
    Name: <hostname>_data<N> (e.g., JZJP1WAPSP001_data01)
    Size: Size (GiB) as per the request
    Storage type: Standard SSD LRS (general) / Premium SSD LRS (DB data)
    Source type: None (empty disk)
    Key management: Platform-managed key
    Shared disk: No
    Delete with VM: ON
☐ | 5 | Click "Save" to attach the disk | On the Disks screen, click "Save" → Confirm the disk appears under Data disks
☐ | 6 | Initialize and format the disk at the OS level | Follow the OS-specific procedures below (requires connection to the VM)
☐ | 7 | Verify operation and close the ServiceNow CR | Report disk addition completion to the BU Representative and close the CR

Disk Initialization at OS Level (Windows)

Windows PowerShell (Administrator privileges)
# Check for uninitialized disks
Get-Disk | Where-Object PartitionStyle -eq 'RAW'

# Initialize the disk (GPT)
Initialize-Disk -Number <disk-number> -PartitionStyle GPT

# Create a partition and assign a drive letter
# (e.g., drive F; avoid D, which Azure Windows VMs typically assign to the temporary disk)
New-Partition -DiskNumber <disk-number> -UseMaximumSize -DriveLetter F

# Format with NTFS
Format-Volume -DriveLetter F -FileSystem NTFS -NewFileSystemLabel "Data" -Confirm:$false

Disk Initialization at OS Level (Linux)

Linux Bash (root / sudo)
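# Check the newly added disk
lsblk

# Create a partition (e.g., /dev/sdc)
sudo parted /dev/sdc --script mklabel gpt
sudo parted /dev/sdc --script mkpart primary ext4 0% 100%

# Format with ext4
sudo mkfs.ext4 /dev/sdc1

# Create mount point and mount (e.g., /data01)
sudo mkdir -p /data01
sudo mount /dev/sdc1 /data01

# Configure auto-mount on startup (append to /etc/fstab).
# Reference the file system by UUID rather than /dev/sdc1, because device names can change between boots;
# "nofail" lets the VM boot even if the disk is missing.
DISK_UUID=$(sudo blkid -s UUID -o value /dev/sdc1)
echo "UUID=${DISK_UUID} /data01 ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab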
4-B: Existing Disk Capacity Expansion [Monitoring Alert Response or ServiceNow CR]
💡 TCS self-initiated expansion: If monitoring alerts show disk utilization has reached a critical level (e.g., over 90%), TCS may perform an immediate expansion at their own discretion. However, TCS must promptly notify the BU Representative afterward.
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the alert details and identify the target disk | Determine the target VM name, disk name, current utilization, and the size after expansion
☐ | 2 | Obtain pre-work backup | Execute "Procedure 1: Pre-Work Backup" (if possible)
☐ | 3 | Expand the Azure Managed Disk size | Portal → Target Disk (Managed Disk) → "Size + performance" → Enter the new size → "Save"
    Both OS disks and data disks support online expansion while the VM is running (no VM stop required)
☐ | 4 | Expand the partition and file system at the OS level | Follow the OS-specific procedures below (perform immediately after expansion)
☐ | 5 | Verify disk capacity after expansion | Confirm that the disk capacity has increased at the OS level (Windows: File Explorer / Linux: df -h)
☐ | 6 | Report completion to BU Representative and update ServiceNow | Record the before/after sizes, execution date/time, and verification results

Azure Managed Disk Expansion (CLI)

Azure CLI
# Expand disk size (e.g., expand to 200 GiB)
az disk update \
  --resource-group <RG-name> \
  --name <disk-name> \
  --size-gb 200
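
Because a managed disk can only be grown, never shrunk, it can help to compare the requested size with the current one before calling `az disk update`, and to capture the before/after sizes that Step 6 asks to record. A minimal sketch, assuming a hypothetical helper name `expand_disk`:

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical guard around az disk update.

# Expand a managed disk only if the new size is larger than the current size.
# Prints "<disk>: <old> GiB -> <new> GiB" for the ServiceNow record.
expand_disk() {
  local rg="$1" disk="$2" new_gb="$3" cur_gb
  # Accept either casing of the size property, in case the CLI version prints diskSizeGB.
  cur_gb="$(az disk show --resource-group "$rg" --name "$disk" \
    --query "diskSizeGb || diskSizeGB" --output tsv)" || return 2
  if [ "$new_gb" -le "$cur_gb" ]; then
    echo "refusing: ${new_gb} GiB is not larger than current ${cur_gb} GiB" >&2
    return 1
  fi
  az disk update --resource-group "$rg" --name "$disk" \
    --size-gb "$new_gb" --output none || return 3
  echo "${disk}: ${cur_gb} GiB -> ${new_gb} GiB"
}

# Example (not executed here):
#   expand_disk <RG-name> <disk-name> 200
```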

File System Expansion at OS Level (Windows)

Windows PowerShell (Administrator privileges)
# Check the expandable size
$size = (Get-PartitionSupportedSize -DriveLetter C)
Write-Host "Max size: $($size.SizeMax / 1GB) GB"

# Expand the partition to maximum size (e.g., drive C)
Resize-Partition -DriveLetter C -Size $size.SizeMax

File System Expansion at OS Level (Linux)

Linux Bash (root / sudo)
# Check current partition status
lsblk
df -h

# Expand the partition (e.g., partition 1 of /dev/sda)
sudo growpart /dev/sda 1

# [For ext4] Expand the file system (online expansion, no reboot required)
sudo resize2fs /dev/sda1

# [For xfs] Expand the file system (pass the mount point, e.g., / or /data01)
sudo xfs_growfs <mount-point>

# Verify after expansion
df -h
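
For the before/after comparison on Linux, the utilization figure can be read from `df` and checked against the alert threshold. This is a small sketch with hypothetical helper names (`pct_from_df`, `usage_pct`, `over_threshold`); the default of 90 mirrors the "over 90%" example given for TCS self-initiated expansion and is not a fixed standard value.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for checking file system utilization.

# Read `df -P` output on stdin and print the Capacity column of the first
# data row without the percent sign.
pct_from_df() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the utilization (in percent) of the file system holding the given path.
usage_pct() {
  df -P "$1" | pct_from_df
}

# Succeed when a utilization value is at or above the threshold (default 90).
over_threshold() {
  [ "$1" -ge "${2:-90}" ]
}

# Example (not executed here):
#   pct="$(usage_pct /data01)"
#   over_threshold "$pct" && echo "/data01 is at ${pct}%: expansion candidate"
```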

Procedure 5: Azure VM Deletion

Trigger: [ServiceNow CR] Performed upon receipt of a ServiceNow request from a BU Representative
Operator: TCS
⚠️ Deletion is an irreversible operation. Recovery from an accidental deletion requires Azure Backup.
Always confirm the ServiceNow CR details and approval status, and double-check the target VM name and Resource Group name before proceeding.
VM Deletion Procedure
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the ServiceNow CR | Confirm the target VM name, RG name, deletion reason, requester, and BU approval status
☐ | 2 | Double-check the target VM name and RG name | Open the VM in the Portal and confirm that the VM name, RG, and tags match the details in the CR
☐ | 3 | Obtain a final backup | Execute "Procedure 1: Pre-Work Backup" to preserve the final state before deletion
☐ | 4 | Notify the BU Representative of the deletion | Notify the deletion date/time and target VM via Teams / Email
☐ | 5 | Delete the VM | Portal → Target VM → "Delete" → Confirm that the target resources (VM, NIC, OS disk) are checked → Enter VM name → "Delete"
☐ | 6 | Confirm related resources are deleted | Verify in the Portal that the NIC, OS disk, and data disks (if Delete with VM is ON) have been deleted
☐ | 7 | Remove the VM from CyberArk | Delete the target VM entry from CyberArk
☐ | 8 | Update ServiceNow CMDB | Retire/delete the corresponding record in the ServiceNow CMDB
☐ | 9 | Complete and close the ServiceNow CR | Record the deletion completion date/time and verification results, then close
Azure CLI
# Pre-verify VM name and RG name
az vm show --resource-group <RG-name> --name <VM-name> --output table

# Delete the VM (automatic deletion of related resources depends on "Delete with VM" settings)
az vm delete \
  --resource-group <RG-name> \
  --name <VM-name> \
  --yes

# Check if any NICs remain (delete manually if found)
az network nic list --resource-group <RG-name> --output table

# Check if the OS disk remains (delete manually if found)
az disk list --resource-group <RG-name> --output table
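
In a Resource Group that holds several VMs, the plain listings above also show resources that are still in use. The following sketch narrows the output to disks and NICs that are not attached to any VM; `list_orphans` is a hypothetical helper name, and the result should still be reviewed against the CR before anything is deleted, because unattached resources may belong to other work.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper that lists unattached disks and NICs.

# Print the names of managed disks and NICs in a Resource Group that are not
# attached to any VM.
list_orphans() {
  local rg="$1"
  echo "Unattached disks:"
  az disk list --resource-group "$rg" \
    --query '[?managedBy==`null`].name' --output tsv
  echo "Unattached NICs:"
  az network nic list --resource-group "$rg" \
    --query '[?virtualMachine==`null`].name' --output tsv
}

# Example (not executed here):
#   list_orphans <RG-name>
```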

Procedure 6: Azure VM Restore

Trigger: [ServiceNow CR] Performed upon receipt of a ServiceNow request from a BU Representative
Operator: TCS
Restore Types:
① File / Folder Restore: Restore specific files only to the original VM (can be performed while the VM is running)
② Full VM Restore: Restore the entire VM by creating a new VM with a different (or the same) name
6-A: File / Folder Restore
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the ServiceNow CR | Confirm the file path to restore, backup point (date/time), and restore destination
☐ | 2 | Open the Recovery Services Vault | Azure Portal → "Recovery Services vaults" → Select the target Vault
☐ | 3 | Select "File Recovery" | "Backup items" → "Azure Virtual Machine" → Target VM → "File Recovery"
☐ | 4 | Select the restore point | Select the backup point (date/time) specified in the CR
☐ | 5 | Download and execute the recovery script | Get the script via "Download Executable" → Run it on the target VM → The backup disk will be mounted
☐ | 6 | Copy the target files to the restore destination | Copy the target files from the mounted backup disk to the original path
☐ | 7 | Unmount the disk | After recovery is complete, click "Unmount disks" in the Portal to unmount the backup disk (auto-released after 12 hours)
☐ | 8 | Verify restore results and close the ServiceNow CR | Have the BU Representative confirm file restore completion, then close the CR
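
Before selecting the restore point in the Portal, the points available for the date named in the CR can be listed from the CLI. This is a hedged sketch: `list_recovery_points` is a hypothetical helper, GNU `date` is assumed, the container and item names are assumed to equal the VM name (as in the backup example in Procedure 1), and the date range is interpreted by the CLI in UTC, so a JST date may need a one-day margin.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; assumes GNU date and container/item name = VM name.

# List recovery points of a VM for one calendar day given as YYYY-MM-DD.
list_recovery_points() {
  local rg="$1" vault="$2" vm="$3" day="$4" start end
  # The CLI expects dates as DD-MM-YYYY.
  start="$(date -d "$day" +%d-%m-%Y)" || return 2
  end="$(date -d "$day +1 day" +%d-%m-%Y)" || return 2
  az backup recoverypoint list \
    --resource-group "$rg" --vault-name "$vault" \
    --container-name "$vm" --item-name "$vm" \
    --backup-management-type AzureIaasVM \
    --start-date "$start" --end-date "$end" \
    --query "[].{name:name, time:properties.recoveryPointTime}" \
    --output table
}

# Example (not executed here):
#   list_recovery_points <RG-name> <Vault-name> <VM-name> 2026-05-18
```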
6-B: Full VM Restore (Restore VM)
⚠️ This procedure restores to a new VM ("Create new") and does not overwrite the existing VM. Always specify a new VM name / NIC / disk for the restore.
After the restore, re-registration in CyberArk and CMDB update are required for the restored VM.
☐ | Step | Action | Details / Verification Points
☐ | 1 | Review the ServiceNow CR | Confirm the target VM, backup point (date/time), destination RG, and new VM name
☐ | 2 | Open the Recovery Services Vault | Portal → "Recovery Services vaults" → Target Vault → "Backup items" → "Azure Virtual Machine"
☐ | 3 | Select "Restore VM" for the target VM | Target VM → Click "Restore VM"
☐ | 4 | Select the restore point | Select the backup point (date/time) specified in the CR
☐ | 5 | Configure the restore settings | Configure the following under "Create new":
    Restore type: "Create new virtual machine"
    Virtual machine name: New VM name (follow naming conventions)
    Resource group: Destination RG
    Virtual network / Subnet: Select from existing VNets / subnets
    Staging location: Specify a temporary storage account
☐ | 6 | Click "Restore" to start the restore | In "Backup Jobs", wait until the job shows "Completed" (typically 30–60 minutes)
☐ | 7 | Assign the Common NSG to the restored VM | Re-assign the Common NSG (si2-securitygroup-shd-cs-tokyo-cmn-01) to the NIC of the restored VM
☐ | 8 | Reconfigure the backup policy for the VM | Enable backup for the restored VM in the Recovery Services Vault
☐ | 9 | Reconfigure tags | Set Suntory standard tags (Subsidiary / ServiceName / Environment, etc.) on the restored VM
☐ | 10 | Re-register in CyberArk | Register the restored VM in CyberArk and configure/change the administrator password
☐ | 11 | Update the ServiceNow CMDB | Update the corresponding CMDB record with the new VM name and IP address
☐ | 12 | Verify operation and close the ServiceNow CR | Have the BU Representative confirm service recovery, then close the CR
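
Step 9 (reconfigure tags) can be done from the CLI instead of the Portal. The sketch below merges three of the standard tag keys named above onto the restored VM without removing tags that already exist; `apply_standard_tags` is a hypothetical helper, the tag values in the example are placeholders, and any further standard tags covered by "etc." would need to be added to the list.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; tag values are supplied by the operator.

# Merge the Subsidiary / ServiceName / Environment tags onto a resource.
# "Merge" keeps existing tags and only adds or updates the listed keys.
apply_standard_tags() {
  local vm_id="$1" subsidiary="$2" service="$3" environment="$4"
  az tag update --resource-id "$vm_id" --operation Merge \
    --tags "Subsidiary=${subsidiary}" "ServiceName=${service}" "Environment=${environment}"
}

# Example (not executed here):
#   vm_id="$(az vm show --resource-group <RG-name> --name <new-VM-name> --query id --output tsv)"
#   apply_standard_tags "$vm_id" <Subsidiary> <ServiceName> <Environment>
```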

Escalation Flow

Escalation Criteria

Situation | Escalation Target | Response Method
An event occurs that is not covered by the documented procedures | TCS Azure Operation Team → SHD | Create a ServiceNow incident, record the situation, and escalate
Service does not recover after restart or restore | TCS Azure Operation Team → Azure Support | Follow the Azure support inquiry procedure (refer to separate document)
Cannot obtain restart approval from the BU Representative | TCS Azure Operation Team → SHD | Request SHD to coordinate with the BU
An error occurs during Azure Portal operations | TCS Azure Operation Team → Azure Support | Capture the error message and operation logs via screenshots and report
Disk capacity shortage continues after expansion | SHD → BU | Escalate to SHD from an architecture review perspective

Contact Information (update separately)

Role | Person / Team | Contact Method
TCS Operations Team Lead | - | Teams channel / Phone
SHD Service Manager | Tomoki Koyama (SHD / Digital & AI Global ITG) | Teams / Email
Azure Technical Support | - | Azure Portal Support Request (refer to separate document)