Dmitry Verkhoturov
Site Reliability Engineer

About

  • 5+ years of programming in Python and Go;
  • 10+ years of system engineering expertise with both bare metal and all major cloud providers;
  • 3.5 years of team management;
  • Excellent communication skills;
  • Solid UNIX and TCP/IP network fundamentals knowledge;
  • Extensive expertise in monitoring and reliability of very complex systems;
  • Advanced troubleshooting and analytical skills;
  • Well-developed understanding of internet technologies and client/server architecture;
  • Proven ability to identify the right goals in a chaotic environment and get them done, both by contributing personally and by leading people toward them.

Work Experience

Booking.com
March 2019 – Present
Site Reliability Engineer
Making the cloud secure for the company.
BidSwitch
August 2018 – March 2019
Senior DevOps Engineer
BidSwitch is a product of IPonWeb. It allows participation in RTB auctions with many different partners with minimal technical effort, as well as spending advertising budgets on traffic of the highest quality.
I switched team and position to find new challenges and to get acquainted with new technologies: Jenkins + Groovy, GitLab, Kubernetes + Helm.
Highlights
  • Set up monitoring (with Prometheus) for dev Kubernetes clusters from scratch
  • Wrote a lot of Groovy code and tests
  • Created many Helm charts and participated in developing the helm/prometheus-operator (formerly coreos/prometheus-operator) chart
IPonWeb
February 2015 – August 2018
Monitoring Team Lead
IPonWeb (300+ employees) provides technology for the advertising industry, Real-Time Bidding auctions in particular: the company handles millions of transactions every minute and stores and processes terabytes of data every day. I created and managed a team of engineers that kept the company’s services operating 24/7/365.
Highlights
  • Developed applications/modules in Python, Puppet (+Ruby) and SQL.
  • Deployed and maintained applications in Amazon and Google cloud services.
  • Reduced the number of incidents from up to 200 a day to 5-10 a day.
  • Part of my work was to implement changes, and I know well how to get other people on board. I worked with literally every team in the company to introduce improvements.
  • Created and integrated on-duty incident management procedures (how, when and what to do when you're on-call), which duty shifts have followed ever since.
  • Created a new team of eight people in three cities.
  • Wrote a ton of documentation and recorded a couple of videos for internal education.
  • Created a system for gathering and visualizing incident management statistics.
  • Introduced SRE practices to the company.
Svyaznoy Zagruzka
January 2014 – February 2015
Sr. System Engineer
Svyaznoy Zagruzka (~40 employees) is a telecommunication content provider that powers services which send and receive SMS messages and USSD requests for partners. I handled client integration and support. My day-to-day activity was monitoring and troubleshooting at the network and application levels (Wireshark and the SQL console were my best friends).
Highlights
  • Maintained and monitored the company’s software, including in-house components written in Java and served by Glassfish and Tomcat; I tuned these systems and debugged the Java software.
  • Created a web application with Python+Flask for the recurring task of extracting network dump slices cut by time and client.
  • Rewrote a dozen scripts from Perl and Bash in Python, and created a dozen new ones.
  • My tasks were to determine whether a problem was on our side or the client's, and to find ways to fix problems on the client's side whether or not the client was tech-savvy. It was also my responsibility to work with other companies’ technical departments, troubleshooting and fixing problems with external service providers.
Russian Federation Army
December 2012 – December 2013
Compulsory military service
Served in the army for one year after obtaining my degree.
BioChemMack (medical equipment dealer)
October 2010 – December 2012
System Engineer
BioChemMack (70 employees) resells medical equipment. I created and populated a wiki with information about internal infrastructure, migrated a primary server from an old CentOS version to Ubuntu, improved the company network architecture, and took part in technical support for medical equipment users.
FS Group (metal reseller)
April 2009 – October 2010
System Engineer
I was a one-man army providing every IT service a small multi-location company needed. I set up the network, created documentation on company infrastructure, and participated in site development, SEO and advertising.

Contact

Amsterdam NL
+310645790406
StackOverflow
GitHub

Education

  • 2006 – 2012

    Moscow Financial-Industrial Academy

    Master's degree

    Information security

Skills

Cloud providers
Amazon Web Services, Google Cloud Platform, Alibaba Cloud
Databases
MySQL, MongoDB, Local app storage engines, Structured Query Language
Monitoring
SRE principles, Analysis and troubleshooting, Establishing procedures, Zabbix, Prometheus
Programming
Go, Python, Bash, Ruby, Perl
High load
Site Reliability Engineering, Load Balancing, Scalable and reliable systems, Cost-effective redundancy, Automated provisioning
(Cloud) System Administration
Linux core utils, Debian and Red Hat distribution families, Puppet, Terraform, Docker, Kubernetes, Helm
Continuous Integration and Delivery
Jenkins, Groovy, Fabric, Glassfish, Tomcat, Open Build Service

Languages

  • English: fluent
  • Russian: native speaker
  • Dutch: elementary

Interests

Skydiving