This job was posted a while ago and might no longer be available.

Senior Site Reliability Engineer

You: You are a senior-level Site Reliability Engineer who builds robust, scalable automated systems and tools. You have a sky-high bar for all things operations. Your team can count on you to deliver innovative and inventive solutions to hard problems in distributed, highly-available environments. You're experienced in working remotely as part of a globally distributed team. You understand (and can articulate) the value of DevOps and why it's more than just tools or a job title. You embrace the idea of "immutable infrastructure" when you design systems and architecture. You're comfortable working with (and proactively engaging) developers, senior leadership, and non-technical stakeholders to help deliver value to the larger organization. You take opportunities to fix problems, mentor others, and step outside your comfort zone to develop your own skillset. You hold yourself and others on the team to a high bar of quality when it comes to working with our production environments.

Us: The Apptio TechOps Engineering Services team is a globally distributed group of highly technical folks who build robust and reliable services to enhance our internal engineering capabilities. Our mission is to deliver reliable, scalable, simplified solutions through guidance, visibility, and a robust automation fabric.


What we want you to do:

We are looking for a talented Senior Site Reliability Engineer to help us design and build the next-generation platform that will support Apptio's production services. You will be a member of our globally distributed team, contributing to coverage during EMEA business hours as well as working on longer-term projects.

Our environment consists of both co-located hardware and cloud deployments. We work with everything from the hardware through to the application, dealing with operating systems, supporting software, monitoring, metrics, and logging.


Basic Qualifications:

  • 2-5 years of experience in a large-scale, distributed Linux/Unix environment
  • Robust knowledge of Linux internals and tool chains, including shell scripting
  • Proven experience working remotely in a globally distributed team, and familiarity with the tools, processes, and expectations that environment brings
  • Demonstrated experience with configuration management tools (e.g. Puppet)
  • Experience with high-level programming languages such as Python, Ruby, or Go
  • Experience with RESTful systems and their APIs. Be very comfortable with JSON
  • Experience with cloud providers such as AWS, Azure, or Google Cloud Platform
  • High-level understanding of container/workload scheduling systems like Mesos or Kubernetes
  • Proven experience identifying and resolving high-severity, time-sensitive issues and outages in a customer-facing environment
  • Metrics, metrics, metrics. Have a deep understanding of the importance of observability and a good understanding of what to measure, when to measure it, and how to measure it
  • This position requires that you be a resident of Denmark, France, Germany, Italy, the Netherlands, Sweden, Spain, or the UK. We would also accept those willing to relocate but cannot provide visa sponsorship for this role


Preferred Qualifications:

Experience with, and/or a serious interest in learning, one or more of the following is very valuable:

  • 5+ years of senior-level responsibility in a Red Hat/CentOS-based Linux environment
  • Containers (Docker, OpenVZ, Virtuozzo)
  • An innate understanding of the value DevOps brings to a technology organization
  • General knowledge of database technologies. RDBMS like MySQL is a definite plus. NoSQL experience is nice to have as well
  • Experience designing and implementing a CI/CD (Continuous Integration/Continuous Delivery) pipeline architecture
  • Familiarity and/or experience deploying serverless architectures
  • Experience with at least one Infrastructure as Code tool, such as Terraform or CloudFormation
  • Distributed storage systems such as Parallels Cloud Storage, Hadoop, Ceph, or similar
  • Message queues (RabbitMQ, SQS, Kafka)
  • Monitoring tools such as Sensu, Splunk, or Grafana; experience with Prometheus is definitely a plus

How to apply?

Please send your resume to romekanda@apptio.com