SRE Manager with Apex

Remote - Hybrid (US)

$130K - $160K a year

Apex Order Pickup Solutions applies innovative, scalable software and hardware to enable safe, secure, frictionless order fulfillment for foodservice, retail and wholesale distribution companies. We are providing transformative Contactless Pickup solutions to a rapidly expanding global client base.

We don't just talk about the Internet of Things and Big Data... we live it. We use both every day to enable customers to perform contactless pickups of their coffee, pizza, takeout and curbside orders, or even laptops. Our technology helps brands save time and reduce contact with their customers, enabling killer customer experiences without killer overhead costs.

Position description

The ideal candidate for this key role is an innovative leader with an innate drive to automate the repeatable, a passion for all things technology, strong fundamentals in enterprise application development and production support, and a love of scripting, who maintains their own public code repositories for the things that interest them most. You'll be working with leading technologies and will be expected to contribute to the continued innovation and advancement of Apex's SaaS platforms and the stacks used therein.

Duties & responsibilities

  • Oversee the day-to-day operations of the SRE team
  • Define, and continuously refine, appropriate SLOs and SLIs in order to consistently exceed business SLAs
  • Foster and drive a blameless postmortem culture
  • Define infrastructure as code for streamlined, repeatable deployments of platform components
  • Solve complex operational problems with software and automations, including contributing your own code at times
  • Lead and own the design, implementation and maintenance of the build/release infrastructure and CI/CD pipelines
  • Continuously improve system design and enhance platform resilience through collaboration with both systems engineers and software developers
  • Analyze and provide ongoing reports to senior leadership on current and projected future capacity and performance requirements, using data-rich dashboards whenever possible
  • Administer, monitor, upgrade, tune, test, secure and ensure high availability of applications and middleware in Apex's 24x7 SaaS environments
  • Participate in on-call efforts such as software deployments, maintenance, troubleshooting and issue escalations
  • Assist the development teams with identifying issues, troubleshooting, stack tracing and debugging across multiple applications and platforms
  • Assist in defining and testing high availability, disaster recovery and business continuity protocols

Knowledge & experience

  • Bachelor's degree in Computer Science or related technical field
  • 5+ years writing code for a SaaS product, in any modern language
  • 3+ years in a technical leadership role, preferably with responsibility for direct reports
  • 5+ years combined experience administering Node.js, Python or Java applications in a production SaaS environment
  • 4+ years working with production SaaS applications built on top of at least one of AWS, Azure or GCP and a solid understanding of the PaaS models used by public cloud platforms
  • 2+ years managing applications on production Kubernetes clusters as well as the clusters themselves
  • Hands-on experience with creation and management of CI/CD pipelines at an enterprise scale using tools such as Jenkins, CircleCI, GitLab CI/CD or GitHub Actions
  • Terraform, Ansible, Chef or Puppet experience
  • Strong working knowledge of microservice design patterns, high availability application architectures and how to properly secure systems and data
  • Solid fundamentals in networking and traffic management, ideally including experience delivering applications via reverse proxies like Nginx, Caddy, Traefik, etc.
  • Hands-on experience with configuration of monitoring and alerting tools such as Prometheus, Datadog, Nagios, Zabbix or Munin
  • Experience deploying, configuring and leveraging application performance monitoring (APM) tools such as Dynatrace, AppDynamics or New Relic for 24/7 production support use
  • Experience writing automations using Unix shell scripts, Python, Perl, C/C++, PHP or Go
  • Disciplined in the use of the Git version control system
  • A firm grasp of how Docker Engine overlays the underlying OS and network stack
  • High degree of comfort with the Linux command line interface

Preferred qualifications

  • Experience with message queuing middleware such as ActiveMQ, Kafka or RabbitMQ
  • Strong grasp of IoT principles, ideally with enterprise exposure

Work location: Willing to consider remote work for candidates outside the Cincinnati area. Local employees work together in the office Tuesday through Thursday and enjoy the flexibility to work remotely on Monday and Friday.

We offer a very competitive base salary and a full benefits package including health, life and dental insurance and a 401(k) plan with a company match.