Site Reliability Engineer with 7+ years of experience building and operating high-availability production systems in fintech environments serving large-scale user traffic.
I combine strong SRE principles with hands-on DevOps engineering expertise: designing resilient cloud infrastructure, optimizing CI/CD pipelines, and improving system reliability across AWS and Azure.
My core focus is reliability at scale, leveraging Kubernetes, infrastructure as code (Terraform), and observability tooling (Prometheus, Grafana, Datadog) to maintain performant and fault-tolerant systems.
Key impact:
• Maintained 99.9%+ uptime across critical production services
• Reduced MTTR through structured incident response, monitoring, and automation
• Built and optimized CI/CD pipelines using GitHub Actions for faster, safer deployments
• Improved system performance and scalability through proactive monitoring and tuning
I have worked in fast-paced, cross-functional teams, collaborating across time zones to support distributed systems in production.