Reliability Engineer
Bologna | Rome
- Organization: ECMWF - European Centre for Medium-Range Weather Forecasts
- Location: Bologna | Rome
- Grade: Junior level - A2 - Grade band
- Occupational Groups: Engineering
- Closing Date: Closed
Job Description
The role
Our Reliability Engineers (RE) are responsible for ensuring that applications operate reliably and with good performance on the infrastructure. The RE will engage with, advise, steer and support all functions involved in the lifecycle of application deployment and hosting, including technical strategy, design, infrastructure, software development, tooling, service transition, service operation and use.
This role requires experience of both IT systems and software development, with a focus on maintaining effective operations. Key skills include cloud-native technologies, automation, IT observability, service and application performance monitoring, logging, and analytics.
This position within the WEkEO/Copernicus project will focus on further enhancing WEkEO services so that they can cope with increased demands for quality, primarily in the area of observability and also in identity and access management services.
Day-to-day, you will work as a bridge between the ECMWF Computing Department, in-house and community system/service providers and application developers, and your technical peers at our WEkEO partners (EUMETSAT, Mercator Ocean, and EEA), advocating good practice and building a greater understanding of architecture and design to enable reliable and performant operation of the WEkEO distributed platform. You will also develop services and provide on-call support for observability and identity and access management services.
The team
The Platforms and Services Section forms part of ECMWF’s Computing Department, and is responsible for delivering a wide range of services including mission-critical virtual and bare-metal server infrastructure, data centre and wide area networks, security, monitoring and analytics, enterprise ICT, as well as the toolchain to support software development, integration and testing, and automated deployment for in-house or community software developers. Within the Section, the Service Reliability Team is responsible for IT observability, service monitoring, configuration management, centralised logging and analytics, and identity and access management, and helping services to run reliably and with good performance.
About ECMWF
ECMWF is the European Centre for Medium-Range Weather Forecasts. It is an intergovernmental organisation created in 1975 by a group of European nations and is today supported by 34 Member and Co-operating States, mostly in Europe. The Centre’s mission is to serve and support its Member and Co-operating States and the wider community by developing and providing world-leading global numerical weather prediction. ECMWF functions as a 24/7 research and operational centre with a focus on medium and long-range predictions and holds one of the largest meteorological archives in the world. The success of its activities relies primarily on the talent of its scientists, strong partnerships with its Member and Co-operating States and the international community, some of the most powerful supercomputers in the world, and the use of innovative technologies such as machine learning across its operations.
ECMWF has recently become a multi-site organisation, with its headquarters based since its creation in Reading, UK, a new data centre in Bologna, Italy, and new offices in Bonn, Germany. ECMWF has adopted a hybrid organisation model which allows its staff to mix office working and teleworking. This generous and flexible model provides our staff with considerable flexibility to spend time outside or away from their duty station and decide how they wish to manage their professional working time at ECMWF. ECMWF is an organisation that values the whole being and understands and values the need for flexibility in the way its staff work.
For additional details, see www.ecmwf.int
About Copernicus
Copernicus is the flagship Earth-observation component of the EU’s space programme, which, through operational monitoring of the atmosphere, oceans, and continental surfaces, provides validated environmental data, products, documentation and tools for anyone to use on a full, free and open basis. Exploiting space-based and in-situ observations and models, Copernicus consists of six information services covering land, marine, atmospheric and climate monitoring, as well as emergency management and security. ECMWF has been entrusted to operate two of these Services - the Copernicus Atmosphere Monitoring Service (CAMS) and the Copernicus Climate Change Service (C3S) - on behalf of the European Commission for a second term, until the end of 2027.
For details see https://www.copernicus.eu/en
About WEkEO
The WEkEO Data and Information Access Service provides uniform access to environmental data from the five Copernicus Services (CMS, CAMS, C3S, CLMS, CEMS) as well as data from the Sentinel satellites operated by EUMETSAT and ESA. WEkEO also provides users with hosted processing capabilities.
For details see https://www.wekeo.eu/
Main duties and key responsibilities
- Advocating for Reliability Engineering within WEkEO and its Copernicus partners.
- Developing observability capabilities for WEkEO services and for the underlying Climate and Atmosphere Data Store (CADS) services.
- Deploying open source, commercial, and proprietary software to containers, VMs, or bare metal as required by WEkEO.
- Contributing to Federated Identity Management deployments, including liaising with technical peers at our WEkEO partner organisations (EUMETSAT, Mercator Ocean, and EEA).
- Participating in regular 24-hour on-call rotas for critical WEkEO, data downloading, and data processing services.
What we are looking for
- Excellent interpersonal and communication skills
- Strong analytical and problem-solving skills, with a proactive continuous improvement approach
- Self-motivated, and able to work with minimal supervision
- Dedication and enthusiasm to work in a geographically distributed team
- Ability to work efficiently and complete diverse tasks in a timely manner
Education and experience
- A university degree (EQF Level 6) or equivalent industry experience
- Demonstrated relevant professional experience
- Experience in configuring network, server and storage infrastructures
- Experience in operational monitoring and application performance systems
- Experience in identity and access management systems
- Experience in designing and developing in a Linux-based cloud environment
This role would suit IT professionals with either a software development or IT Operations background.
Knowledge and Skills
Demonstrable knowledge and skills in some of the following are required:
- Programming (any language) or scripting (Python, Ruby, Perl, Go)
- Observability, monitoring, logging and analytics, tracing applications
- Ansible, or similar modular configuration management
- Linux system administration
- Cloud Native (Kubernetes, Docker)
- Cloud IaaS (Terraform, VMware, OpenStack, Amazon, Google)
- DevOps: CI/CD, Software Engineering, Automation pipelines
- The server, storage and networking components of Cloud applications
A working knowledge of some of the following is desirable:
- Splunk, Grafana, Prometheus, Loki, ELK or similar
- Application Performance Monitoring
- Go, Python, PHP, Java, Perl, Django, Tomcat
- NoSQL (MongoDB), SQL (PostgreSQL or MySQL)
- VMware vSphere, Containers (e.g. Docker, Kubernetes)
- Git, Atlassian Bamboo
- AJAX, JSON, XML, HTTP protocol
- Microsoft Active Directory
- Identity and Access Management (e.g. OpenID Connect, SAML)
Please provide clear examples of your knowledge and experience in the space provided on the application form.
Candidates must be able to work effectively in English and interviews will be conducted in English. A good knowledge of one of the Centre’s other working languages (French or German) is an advantage but not essential.
Other information
Grade remuneration
The successful candidate will be recruited at the A2 grade, according to the scales of the Co-ordinated Organisations, and the annual basic salary will be EUR 70,794.48 net of tax. ECMWF also offers a generous benefits package, including a flexible teleworking policy. The position is assigned to the employment category STF-PL as defined in the ECMWF Staff Regulations. Full details of salary scales and allowances are available on the ECMWF website at www.ecmwf.int/en/about/jobs, including the ECMWF Staff Regulations and the terms and conditions of employment.
Starting date: As soon as possible from October 2023
Length of contract: The contract duration is expected to be four years
Location: The position is based at ECMWF’s Data Centre in Bologna, Italy
As a multi-site organisation, ECMWF has adopted a hybrid organisation model which gives staff the flexibility to mix office working and teleworking.
Successful applicants and members of their family forming part of their households will be exempt from immigration restrictions.
Interviews by videoconference (via Microsoft Teams) are expected to take place in the first half of October 2023.
Who can apply
Applicants are invited to complete the online application form by clicking on the apply button below.
At ECMWF, we consider an inclusive environment as key for our success. We are dedicated to ensuring a workplace that embraces diversity and provides equal opportunities for all, without distinction as to race, gender, age, marital status, social status, disability, sexual orientation, religion, personality, ethnicity and culture. We value the benefits derived from a diverse workforce and are committed to having staff that reflect the diversity of the countries that are part of our community, in an environment that nurtures equality and inclusion.
Applications are invited from nationals of ECMWF Member States and Co-operating States, as well as from nationals of all EU Member States.
ECMWF Member and Co-operating States are: Austria, Belgium, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Morocco, the Netherlands, Norway, North Macedonia, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Türkiye and the United Kingdom.
In these exceptional times, we also welcome applications from Ukrainian nationals for this vacancy.
Applications from nationals of other countries may be considered in exceptional cases.