Experience
- Drove fleet-rebalancing strategy for Meta's large-scale compute allocator, improving AI-training placement efficiency and contributing to $700K+ cost reduction in 2026 with multi-million-dollar annualized potential.
- Optimized non-guaranteed capacity allocation across Meta's fleet, preventing ~25K task interruptions/hr for high-memory, multi-core workloads averaging ~55 GB RAM and 16 physical CPU cores.
- Improved allocator reliability by introducing production observability, placement diagnostics, and interruption-analysis workflows that exposed fleet-level inefficiencies ahead of a broad scale-up in AI-training workloads.
- Built debugging and diagnostics workflows adopted by on-call engineers, reducing time-to-diagnosis for placement and interruption issues affecting high-value AI workloads.
- Created a recurring operational review forum that standardized incident analysis and on-call practices across the team.
- Designed and delivered lawful-intercept integration for RCS messaging in partnership with legal and backend infrastructure teams, supporting compliance-sensitive production flows.
- Delivered backend optimization for the onboarding of ~300M Apple users into Google's RCS infrastructure, improving cross-platform integration readiness at scale.
- Identified and resolved edge-case implementation flaws in RCS backend flows; simplified integration paths to reduce operational risk during large-scale customer migration.
- Architected and delivered GraphQL API platform capabilities for enterprise customers including Airbnb and Bank of America; designed a modular extension system enabling internal teams to inject custom functionality and revenue-generating filters.
- Established testing, monitoring, and release standards for REST and GraphQL APIs; served as security champion and technical mentor for 15–25 engineers, raising delivery quality for customer-facing platform APIs.
- Redesigned company-wide phishing-training infrastructure, eliminating critical security gaps and saving ~200 engineering hours/year through process and platform improvements.
- Delivered environment-synchronization incident-handling capability, improving operational efficiency for mission-critical customer teams.
- Rebuilt promotion engine, business-object validation engine, pricing configurability, operation-log storage, and product-definition database structures for Wargaming's game platform.
- Redesigned Kafka messaging pipeline for BI data flow.
- Designed low-latency, high-throughput microservices; redesigned a key platform service, reducing latency by 10×.
- Migrated OSGi monolith modules to a unified Scala version; delivered a company-wide metrics library.
Prior Experience · 2009 – 2016
- Optimized large investment fund platform (+20% execution speed).
- Led SVN→Git migration, reducing the dev–QA cycle by 10%; added technical documentation, saving tens of engineer-hours per month.
- Designed Storm-based distributed anti-fraud computation system from scratch for a major bank.
- Built bank emulator for load and functional QA testing; designed deployment of 8 system components to AWS.
- Built high-throughput REST API for smart home startup: scheduling, device management, notifications.
- Designed REST API and database schemas; implemented C++ device communication layer.
- Time management and attendance system for a major UK retailer with a large workforce.
- Extended business logic, designed the DB model, and clarified requirements.
- Call management system for automating call-center operations with metadata-based reporting.
- Reworked UI, reducing backend calls by 2.5×; doubled test coverage.
- Junior developer on Out of Pawn — a marketplace and catalogue system for a pawnbroker network.
Technical Skills
Languages & Stack
Java · Scala · Python · Spring · Kafka · Hibernate · Liquibase · JPA · JDBC
Databases
PostgreSQL · MySQL · MongoDB · Redis
Engineering
Distributed Systems · Platform Architecture · Fleet Allocation · Observability · Reliability · API Design · TDD · Monitoring · Design Patterns
Education
M.S. Applied Mathematics
National University of Ukraine · Faculty of Cybernetics · Kyiv
Additional training
Distributed Systems — MIT OpenCourseWare
Algorithms — Stanford / Coursera
Concurrent Programming — EPFL / Coursera
Functional Programming in Scala — EPFL
M101 MongoDB for Developers — 10gen