Site Reliability Engineer with Tanzle

Remote in the US

$130K - $150K a year

Tanzle is an information infrastructure company that has assembled key technological advances to address next-generation data analysis and visualization needs. We study the perceptual and cognitive aspects of how people see data so that they can make the most sense of it.

What you'll do:

  • Lean in to collaborate as a hands-on operability subject matter expert with all cross-functional groups responsible for architecting, building, securing, scaling, and supporting a greenfield platform which unifies content for better decision making.
  • Establish and enhance our infrastructure, tooling, and processes to extend operability as a self-service function for other groups in the engineering value stream with an emphasis on Terraform and Kubernetes in AWS.
  • Champion SRE principles of proactivity, automation, cross-functional collaboration, data-driven decision making, and fast+safe failure to continually improve our technology and culture.
  • Ensure we collectively define, instrument, and meet customer-focused Service Level Objectives across all Tanzle Platform services.

Must-have skills: Kubernetes, Terraform, AWS

Nice-to-have skills: Security engineering, Network engineering

Required skills

  • Previous experience contributing in a production Site Reliability, DevSecOps, or SaaS/Technical Operations environment
  • Dedicated commitment to technical excellence and quality customer service
  • Ability to write code in at least one programming language (e.g. Python, Go, Perl, Ruby, Java, C++) and use Git for practical configuration data and code management
  • Experience with major cloud computing providers (AWS, Azure, or GCP) and hybrid/on-premises/hyperconverged private compute environments
  • Familiarity with the Cloud Native ecosystem
  • Knowledge in one or more of these disciplines forms a central pillar of your skill set:
      • Large scale production UNIX/Linux operating system, application, and security administration in an online service provider environment
      • Software-defined / Infrastructure-as-Code automation framework orchestration, configuration management, and related tooling (SaltStack, Terraform, Ansible, Puppet, Chef, etc.)
      • Application virtualization and container orchestration at scale (Docker, Kubernetes, Nomad, Mesos)
      • Security operations and defense in depth via left-shifted application security engineering, proactive threat assessment and detection, technical policy implementation, and promoting a culture of awareness
      • Comprehensive observability for infrastructure/system/application/security instrumentation, monitoring, logging, tracing, and alerting
      • Network architecture and secure operation with an emphasis on application load balancing at local and global scales, IPv4/6 routing and dynamic routing protocols
  • Hands-on experience with any of the following is a plus:
      • Service-oriented-architecture-based distributed systems incorporating service discovery/mesh/proxy and/or API gateway technologies
      • Achieving fast and comprehensive unit, integration, and end-to-end system test coverage across continuous integration and deployment/delivery pipelines in a release engineering or SDET context
      • Chaos engineering and related methodologies for benchmarking application performance, scalability, fault-tolerance, and reliability
      • RDBMS, NoSQL, Graph, or hybrid data tier platform architecture, operation, and warehousing
      • Effectively delivering and supporting software products and services in a hybrid SaaS or pure customer-premises model, including regulated or airgapped environments
      • Highly scalable and available real-time communications technology operation (RTP, RTCP, RTSP, SIP, WebRTC, et al.)

Fully remote is great. Post-pandemic, we expect occasional travel to the California Bay Area for major company events or team-building activities, once everyone involved feels that's safe.