
Dmitry Verkhoturov
Staff Site Reliability Engineer

About

I'm passionate about creating systems that run smoothly with minimal human intervention. My focus is on setting up effective automation and alerts so that when problems arise, the right people are notified and equipped with the necessary information and action plans.
I excel in high-ambiguity environments where I need to figure out the goals and pathways to achieve them. While I've worked extensively with major cloud providers like AWS and GCP, I prefer using their basic services to avoid vendor lock-in and unnecessary complexity.
Contributing to open-source projects is particularly rewarding for me. I enjoy solving real-world problems for people who might not have the technical skills to do so themselves. I'm constantly pushing myself to learn and improve; for instance, I went from having no experience with technical interviewing to becoming one of the most experienced interviewers in my area within a year.
I'm particularly drawn to challenging projects where I can make a significant impact. My aim is to build robust, efficient systems that support a healthy work-life balance, allowing for standard working hours while accommodating those who wish to dedicate more time to their work.
In essence, I strive to create reliable, low-maintenance systems that empower rather than hinder their users and maintainers.

Work Experience

Secret Startup
Dublin
July 2024 – Present
Site Reliability Engineer
Responsible for the reliability of complex, business-critical software systems that manage hardware.
Booking.com
Amsterdam
April 2022 – June 2024
Senior Site Reliability Engineer, ABU Reliability
Booking.com is a leading internet travel company by market capitalization. I worked in the Accommodations BU, responsible for the company's core customer- and partner-facing services.
Highlights
  • Spearheaded reliability improvements across the company's largest revenue-generating department.
  • Scaled a team from two to seven members, establishing robust processes.
  • Conducted over 100 technical interviews in 2022, helping to hire new colleagues and improving the process through feedback and question adjustments.
  • Participated in critical on-call rotations at team, department, and company-wide levels.
  • Contributed weekly tweaks to processes, code, and documentation within my reach across the company. Consistently ranked among the top 5 Puppet code contributors company-wide.
  • Drove improvements in the Engineering promotions process alongside leadership, sharing knowledge about promotion preparation with peers and running peer case review groups with dozens of participants.
Booking.com
Amsterdam
April 2021 – March 2022
Senior Site Reliability Engineer, Private Cloud Adoption
Led the onboarding of a major internal customer to the Private Cloud platform, adapting the foundation to customer requirements and integrating vendor products with internal systems. Held the role until a dedicated EM and TPM took over most responsibilities, significantly reducing its scope.
Highlights
  • Acted as Technical Project Manager, leading negotiations and keeping the project on track.
  • Authored and reviewed numerous Architecture Design Records, addressing complexity management challenges.
  • Leveraged the team's technical capabilities to focus on strategic direction and customer trust-building.
Booking.com
Amsterdam
July 2020 – March 2021
Site Reliability Engineer, Image Service and Email Infrastructure
For a year, responsible for image serving and transactional email delivery infrastructure.
Highlights
  • Reduced maintenance and toil activities from 90% to 60% across three teams through automation.
  • Drove the migration from a "pets" to a "cattle" approach to infrastructure for multiple teams and helped them reduce the noise-to-signal ratio of their alerting.
Booking.com
Amsterdam
March 2019 – June 2020
Site Reliability Engineer, Cloud Security
Provided security guardrails for cloud migration, balancing security needs with functionality for internal customers. My contributions included researching and implementing technical measures for SOx compliance and auditability, as well as keeping up with the work of hundreds of developers.
IPonWeb (acquired by Criteo)
Moscow
February 2015 – February 2019
Monitoring Team Lead
Led a team ensuring 24/7/365 smooth operation of services for a 300+ employee adtech company with infrastructure in AWS and GCP processing millions of transactions per minute.
Highlights
  • Slashed daily company-wide incident alerts from 200 to 5-10.
  • Developed applications in Python, Puppet (Ruby), and SQL.
  • Implemented ITIL-based incident management procedures, still in use years later.
  • Established and led a new eight-person team across three cities.
  • Produced extensive documentation and recorded videos for internal education.
  • Developed incident management stats gathering and visualization systems.
  • Introduced key SRE concepts to the company.
  • In the last six months, switched to the DevOps team to tackle new challenges and expand expertise in technologies like Jenkins, GitLab, and Kubernetes. Implemented Prometheus monitoring for team infrastructure, developed Groovy code and tests, and created numerous internal and open-source Helm charts.
Svyaznoy Zagruzka
Moscow
January 2014 – January 2015
Senior System Engineer
Responsible for client integration and support for a telecom content provider. Focused on monitoring, troubleshooting applications, and network investigations. Automated routine tasks using Bash, Perl, and Python.
Russian Federation Army
Russia
December 2012 – December 2013
Compulsory military service
Completed one year of compulsory military service, which turned out to be a crash course in resilience and people skills. Stuck with the same hundred people 24/7, I had to learn how to navigate complex social situations and negotiate effectively; there was simply no other option. This experience opened my eyes to how people can affect each other, for better or worse, and taught me the art of finding compromises in tough spots. It wasn't always pretty, but it definitely made me more empathetic and adaptable, skills that have proven surprisingly useful in the tech world.
BioChemMack (medical equipment dealer)
Moscow
October 2010 – December 2012
System Engineer
Created and populated an internal infrastructure wiki, migrated the primary server from CentOS to Ubuntu, and improved network architecture. Provided technical support for medical equipment users in a 70-employee company.
FS Group (metal reseller)
Moscow
April 2009 – October 2010
System Engineer
As the sole IT professional, established network setups, created infrastructure documentation, and contributed to website development and SEO. Later transitioned to assisting with IT contractor management and goal-setting.

Contact

Dublin IE
+31645790406
StackOverflow
GitHub

Education

  • 2006 – 2012

    Synergy University

    Master's degree

    Information Security

    Courses
    • Information security and information protection
    • Computer networks
    • Database administration
    • Operating systems and environments
    • Programming theory
    • C++ programming basics
    • Information security engineering
    • Client operating system security
    • Server operating system security
    • Database security
    • Security of data transmission networks and channels
    • Security of internet resources

Skills

Programming
Go, Python, TypeScript, Microservices, Algorithms and Data Structures
High Load
Site Reliability Engineering, Incident Management, Load Balancing, Scalable and Reliable Systems, Cost-effective Redundancy, Automated Provisioning
Monitoring
SRE Principles, Analysis and Troubleshooting, Establishing On-call Procedures, Zabbix, Bosun, Prometheus
Scalable System Administration
Cross-functional and Cross-team Collaboration, Technical Project Management, Linux Core Utils, Infrastructure as Code, Configuration Management, Bash, Ruby, Puppet, Terraform, Docker, Kubernetes, Helm
Cloud Providers
Amazon Web Services, Google Cloud Platform, Alibaba Cloud
Databases
SQL, MongoDB, Local App Storage Engines (bbolt)

Interests

Skydiving
Fiction books, English and Russian classics and sci-fi