Oracle is leading the digital revolution. We are empowering nearly half a million businesses to thrive in the age of skyrocketing connections. Join us and play an instrumental role in masterminding the software that will have a truly global impact.
Description
What You’ll Do
- Engage in and improve the entire lifecycle of Java Management Service (JMS) application deployment and operation
- Improve the existing continuous deployment pipeline for a wide range of functionalities across geographically separated zones
- Improve the JMS observability platform, security, and incident management to meet the SLAs and SLOs defined for all Oracle cloud services
- Architect highly available and scalable services
- Troubleshoot issues and trace symptoms back to the root cause
- Document and present methodologies to operations, engineering, and executive teams
- Educate the wider engineering organization on design and operational best practices for distributed computing
- Help meet the SLAs/SLOs for internal and external services and continually improve operational processes (weekly ops meetings, metrics, etc.)
- Build tools and automation to improve system observability, availability, reliability, performance/latency, monitoring, and emergency response
- Participate in on-call duties
Required Skills/Experience
What You’ll Bring
- Strong track record of implementing services on OCI/AWS/GCP/Azure in a variety of distributed computing environments, with a good understanding of Docker and Kubernetes
- Understanding of the CNI/CNCF landscape is good to have
- Strong knowledge of storage, RDBMS, and NoSQL database runtimes
- Experience in implementing multi-cloud networking and deployment architectures
- Good understanding of the L3/4/7 network layers (including SDN)
- Hands-on design and coding experience in any one of Python, Shell, Go, or Java
- Strong debugging/troubleshooting skills
- Experience implementing observability platforms using product suites such as Datadog, New Relic, ELK, or Prometheus, preferably with Grafana
- Strong experience with infrastructure automation and monitoring tools: Terraform, Helm, Ansible, Puppet, Chef, etc.
- Experience with modern cloud development practices (microservices architectures, REST interfaces, etc.)
- Deep working knowledge of Linux servers and networking, preferably Oracle Linux