Infrastructure Subject Matter Expert for BCDR & DR Automation - KSA
DeepSource Technologies · Riyadh, Riyadh Province, Saudi Arabia
About The Role
Role Overview
The Infrastructure SME plays a critical role in ensuring that the underlying IT infrastructure fully supports Business Continuity and Disaster Recovery (BCDR) objectives, with a strong focus on DR Automation. This role bridges infrastructure and automation teams, ensuring resilience, scalability, and seamless failover/failback execution across all infrastructure layers.
Key Responsibilities
- Infrastructure Architecture & Readiness
- Review and validate the end-to-end infrastructure architecture supporting automated DR failover and failback, including:
o Network, security, compute, storage, virtualization, containers, and data center components
- Ensure the design supports high availability, resiliency, and recoverability aligned with business requirements.
- DR Automation Integration
- Act as the primary bridge between infrastructure teams and the DR Automation team, ensuring alignment and seamless collaboration.
- Review and validate automated failover/failback workflows across infrastructure components, including:
o Network, security, servers, storage, DNS, virtualization platforms, and container environments
- Collaborate on the development of pre-failover validation scripts to ensure readiness before execution.
- Recovery Objectives & Capacity Planning
- Review and validate infrastructure-level RTOs, ensuring alignment with application and business recovery requirements.
- Ensure sufficient capacity and performance within DR sites and automation platforms to support:
o Full failover scenarios
o Partial or phased failover scenarios
- Technical Leadership & Engagement
- Lead and actively participate in technical discussions and workshops across:
o Discovery
o Validation
o Tabletop exercises
- Provide domain expertise and recommendations to ensure robust infrastructure design and DR strategy alignment.
- Performance & Validation
- Oversee and validate infrastructure performance testing during and after DR failover/failback activities.
- Ensure that systems meet defined performance benchmarks and recovery objectives post-recovery.
- Compliance & Audit Readiness
- Review and ensure adherence to audit and regulatory requirements, particularly around:
o Logging
o Monitoring
o Traceability of DR activities
- Support audit readiness by ensuring proper documentation and controls are in place.
- Cross-Functional Collaboration
- Collaborate with Application, Network, Security, Database, and Business teams to ensure end-to-end alignment.
- Coordinate with stakeholders to ensure dependencies are properly managed across infrastructure and application layers.
- Continuous Improvement & Optimization
- Identify opportunities to optimize infrastructure resilience, performance, and cost efficiency.
- Drive continuous improvement initiatives based on test results, incidents, and evolving business needs.
Required Qualifications
- Strong expertise in enterprise infrastructure design and operations (network, compute, storage, virtualization, cloud).
- Hands-on experience with Disaster Recovery architectures and DR automation tools.
- Deep understanding of failover/failback mechanisms and infrastructure dependencies.
- Experience in capacity planning, performance testing, and high availability design.
- Knowledge of regulatory and compliance requirements related to DR and infrastructure.
- Strong stakeholder communication and cross-team coordination skills.
Preferred Qualifications
- Experience in large-scale BCDR and DR Automation programs.
- Certifications in infrastructure technologies, cloud platforms, or DR/BCDR frameworks.