Dmitry Verkhoturov
Senior Site Reliability Engineer


  • 5+ years of programming in Python and Go;
  • 10+ years of system engineering expertise both with bare metal and all major cloud providers;
  • 3.5 years of team management;
  • Excellent communication skills, stakeholder management and ability to organize teams;
  • Solid UNIX and TCP/IP network fundamentals knowledge;
  • Extensive expertise with monitoring and improving the reliability of very complex systems, as well as organizing and improving incident management;
  • Advanced troubleshooting and analytical skills;
  • Well-developed understanding of internet technologies and client/server architecture;
  • Proven ability to identify the right targets in a chaotic environment and get them done, both as a contributor and by leading people toward the right goals.

Work Experience

March 2019 – Present
Site Reliability Engineer
The top internet travel company by market cap. My job is to keep the cloud migration secure from the technical side of things.
  • Providing a cloud environment to internal customers in a way that is secure yet not crippling
August 2018 – March 2019
Senior DevOps Engineer
BidSwitch is a product of IPonWeb. It allows participation in RTB auctions with many different partners with minimal technical effort, as well as spending advertising budgets on traffic of the highest quality.
I switched teams and positions to find new challenges and to get acquainted with new technologies: Jenkins + Groovy, GitLab, Kubernetes + Helm.
  • Set up monitoring (with Prometheus) for dev Kubernetes clusters from scratch
  • Wrote a lot of Groovy code and tests
  • Created many Helm charts and participated in developing helm/prometheus-operator
February 2015 – August 2018
Monitoring Team Lead
IPonWeb (300+ employees) provides technology for the advertising industry, Real-Time Bidding auctions in particular — the company handles millions of transactions every minute, terabytes of data stored and processed every day. I created and managed a team of engineers, ensuring the company’s service operations 24/7/365.
  • Developed applications/modules in Python, Puppet (+Ruby) and SQL.
  • Deployed and maintained applications in Amazon and Google cloud services.
  • Reduced the number of incidents from up to 200 a day to 5-10 a day.
  • Part of my work was to implement changes, and I know how to get other people on board with what needs to be done. I interacted with every team in the company to introduce improvements.
  • Created and integrated on-duty incident management procedures (how, when and what to do when you're on-call), which duty shifts have followed ever since.
  • Created a new team of eight people in three cities.
  • I wrote a ton of documentation and recorded a couple of videos for internal education.
  • Created incident management stats gathering and visualization systems.
  • Introduced SRE practices to the company.
Svyaznoy Zagruzka
January 2014 – February 2015
Sr. System Engineer
Svyaznoy Zagruzka (~40 employees) is a telecommunication content provider that powers services which send and receive SMS messages and USSD requests for partners. I handled client integration and support. My day-to-day activity was monitoring and troubleshooting at the network and application levels (Wireshark and the SQL console were my best friends).
  • Maintained and monitored the company’s software, self-written parts in Java, served by Glassfish and Tomcat — I tuned these systems and debugged Java software.
  • Created a web application with Python+Flask for recurring tasks of extracting network dump slices cut by time and client.
  • Rewrote a dozen scripts from Perl and Bash to Python, and created a dozen new ones.
  • My tasks were to determine whether a problem was on our side or the client's, and to find ways to fix problems on the client's side regardless of whether the client was tech-savvy. I was also responsible for interacting with other companies’ technical departments, troubleshooting and fixing problems with external service providers.
Russian Federation Army
December 2012 – December 2013
Compulsory military service
Served in the army for one year after obtaining my university degree.
BioChemMack (medical equipment dealer)
October 2010 – December 2012
System Engineer
BioChemMack (70 employees) resells medical equipment. I created and populated a wiki with information about internal infrastructure, migrated a primary server from an old CentOS version to Ubuntu, improved the company network architecture, and participated in technical support for medical equipment users.
FS Group (metal reseller)
April 2009 – October 2010
System Engineer
I was a one-man army providing every IT service a small multi-location company needed. I established the network setup, created documentation on company infrastructure, and participated in site development, SEO and advertising.


Amsterdam NL


  • 2006 – 2012

    Moscow Financial-Industrial Academy

    Master's degree

    Information security


Go Python Algorithms and data structures
High load
Site Reliability Engineering Load Balancing Scalable and reliable systems Cost-effective redundancy Automated provisioning
SRE principles Analysis and troubleshooting Establishing on-call procedures Zabbix Prometheus
(Scalable) System Administration
Linux core utils Bash Puppet Terraform Docker Kubernetes Helm
Cloud providers
Amazon Web Services Google Cloud Platform
MySQL MongoDB Local app storage engines Structured Query Language


  • English: fluent
  • Russian: native speaker
  • Dutch: elementary