Resume of Dmitry Verkhoturov

About

I'm passionate about creating systems that run smoothly with minimal human intervention. My focus is on setting up effective automation and alerts so that when problems arise, the right people are notified and equipped with the necessary information and action plans.
I excel in high-ambiguity environments where I need to figure out the goals and pathways to achieve them. While I've worked extensively with major cloud providers like AWS and GCP, I prefer using their basic services to avoid vendor lock-in and unnecessary complexity.
Contributing to open-source projects is particularly rewarding for me. I enjoy solving real-world problems for people who might not have the technical skills to do so themselves. I'm constantly pushing myself to learn and improve. For instance, I went from having no experience with technical interviewing to becoming one of the most experienced interviewers in my area within a year.
I'm particularly drawn to challenging projects where I can make a significant impact. My aim is to build robust, efficient systems that support a healthy work-life balance, allowing for standard working hours while accommodating those who wish to dedicate more time to their work.
In essence, I strive to create reliable, low-maintenance systems that empower rather than hinder their users and maintainers.

Work Experience

Secret Startup

Dublin

July 2024 – Present

Site Reliability Engineer

Reliability of complex business-critical software systems managing hardware.

Booking.com

Amsterdam

https://www.booking.com/

April 2022 – June 2024

Senior Site Reliability Engineer, ABU Reliability

Booking.com is a leading internet travel company by market capitalization. I worked in the Accommodations BU, responsible for the company's core customer and partner-facing services.

Highlights

Spearheaded reliability improvements across the company's largest revenue-generating department.
Scaled a team from two to seven members, establishing robust processes.
Conducted over 100 technical interviews in 2022, helping hire new colleagues and improving the process through feedback and question adjustments.
Participated in critical on-call rotations at team, department, and company-wide levels.
Contributed weekly tweaks to processes, code, and documentation within my reach across the company. Consistently ranked among the top 5 Puppet code contributors company-wide.
Drove improvements in the Engineering promotions process alongside leadership, sharing knowledge about promotion preparation with peers and running peer case review groups with dozens of participants.

Booking.com

Amsterdam

https://www.booking.com/

April 2021 – March 2022

Senior Site Reliability Engineer, Private Cloud Adoption

Led the onboarding of a major internal customer to the Private Cloud platform, adapting the foundation to customer requirements and integrating vendor products with internal systems, until a dedicated EM and TPM took over most responsibilities, significantly reducing the scope of the role.

Highlights

Acted as Technical Project Manager, leading negotiations and keeping the project on track.
Authored and reviewed numerous Architecture Design Records, addressing complexity management challenges.
Leveraged the team's technical capabilities to focus on strategic direction and customer trust-building.

Booking.com

Amsterdam

https://www.booking.com/

July 2020 – March 2021

Site Reliability Engineer, Image Service and Email Infrastructure

For a year, responsible for image serving and transactional email delivery infrastructure.

Highlights

Reduced the share of work spent on maintenance and toil from 90% to 60% across three teams through automation.
Drove the migration from a "pets" to a "cattle" approach to infrastructure for multiple teams and helped them decrease the noise-to-signal ratio of their alerting.

Booking.com

Amsterdam

https://www.booking.com/

March 2019 – June 2020

Site Reliability Engineer, Cloud Security

Provided security guardrails for cloud migration, balancing security needs with functionality for internal customers. My contributions included researching and implementing technical measures for SOx compliance and auditability, as well as keeping up with the work of hundreds of developers.

IPonWeb (acquired by Criteo)

Moscow

https://www.iponweb.com/

February 2015 – February 2019

Monitoring Team Lead

Led a team ensuring 24/7/365 smooth operation of services for a 300+ employee adtech company with infrastructure in AWS and GCP processing millions of transactions per minute.

Highlights

Slashed daily company-wide incident alerts from 200 to 5-10.
Developed applications in Python, Puppet (Ruby), and SQL.
Implemented ITIL-based incident management procedures, still in use years later.
Established and led a new eight-person team across three cities.
Produced extensive documentation and recorded videos for internal education.
Developed incident management stats gathering and visualization systems.
Introduced key SRE concepts to the company.
In the final six months, switched to the DevOps team to tackle new challenges and expand expertise in technologies like Jenkins, GitLab, and Kubernetes. Implemented Prometheus monitoring for team infrastructure, developed Groovy code and tests, and created numerous internal and open-source Helm charts.

Svyaznoy Zagruzka

Moscow

https://www.en.zagruzka.com

January 2014 – January 2015

Senior System Engineer

Responsible for client integration and support for a telecom content provider. Focused on monitoring, troubleshooting applications, and network investigations. Automated routine tasks using Bash, Perl, and Python.

Russian Federation Army

Russia

https://eng.mil.ru

December 2012 – December 2013

Compulsory military service

Completed one year of compulsory military service, which turned out to be a crash course in resilience and people skills. Stuck with the same hundred people 24/7, I had to learn how to navigate complex social situations and negotiate effectively; there was simply no other option. This experience opened my eyes to how people can affect each other, for better or worse, and taught me the art of finding compromises in tough spots. It wasn't always pretty, but it definitely made me more empathetic and adaptable, skills that have proven surprisingly useful in the tech world.

BioChemMack (medical equipment dealer)

Moscow

https://biochemmack.ru/en/

October 2010 – December 2012

System Engineer

Created and populated an internal infrastructure wiki, migrated the primary server from CentOS to Ubuntu, and improved network architecture. Provided technical support for medical equipment users in a 70-employee company.

FS Group (metal reseller)

Moscow

https://favor-group.ru/

April 2009 – October 2010

System Engineer

As the sole IT professional, established network setups, created infrastructure documentation, and contributed to website development and SEO. Later transitioned to assisting with IT contractor management and goal-setting.

Contact

Dublin IE

+31645790406

StackOverflow

GitHub

Education

2006 – 2012
Synergy University

https://synergy.ru/

Master's degree

Information Security

Courses
- Information security and information protection
- Computer networks
- Database administration
- Operating systems and environments
- Programming theory
- C++ programming basics
- Information security engineering
- Client operating system security
- Server operating system security
- Database security
- Security of data transmission networks and channels
- Security of internet resources

Skills

Programming

Go, Python, TypeScript, Microservices, Algorithms and Data Structures

High Load

Site Reliability Engineering, Incident Management, Load Balancing, Scalable and Reliable Systems, Cost-effective Redundancy, Automated Provisioning

Monitoring

SRE Principles, Analysis and Troubleshooting, Establishing On-call Procedures, Zabbix, Bosun, Prometheus

Scalable System Administration

Cross-functional and Cross-team Collaboration, Technical Project Management, Linux Core Utils, Infrastructure as Code, Configuration Management, Bash, Ruby, Puppet, Terraform, Docker, Kubernetes, Helm

Cloud Providers

Amazon Web Services, Google Cloud Platform, Alibaba Cloud

Databases

SQL, MongoDB, Local App Storage Engines (bbolt)

Interests

Skydiving

Fiction books, English and Russian classics and sci-fi

Dmitry Verkhoturov
Staff Site Reliability Engineer
