This job was posted a while ago and might no longer be available.
Site Reliability Engineer
Responsibilities include, but are not limited to, deploying, supporting, monitoring, and troubleshooting large-scale, microservice-based distributed systems with high transaction volumes, as well as documenting the IT infrastructure, policies, and procedures.
All candidates will have
- a Bachelor's or higher degree in a technical field of study
- a minimum of three years' experience deploying, monitoring and troubleshooting large-scale distributed systems
- a good understanding of network and routing protocols (TCP/IP, DNS and others)
- excellent knowledge of at least one modern programming language, such as Go, Java, C++, Python or Scala
- experience with systems for automating deployment, scaling, and management of containerised applications, such as Kubernetes or Mesos
- excellent troubleshooting and creative problem-solving abilities
- excellent written and oral communication and interpersonal skills
Ideally, candidates will also have
- experience deploying and supporting big data technologies, such as Kafka, Spark, Storm, Flink and Cassandra
- experience implementing, operating, and supporting open-source tools for network and security monitoring and management on Linux/Unix platforms
- experience with encryption and cryptography standards
How to apply?
Please apply here: https://www.numbrs.com/de/careers/open-positions/details#oXi76fwc