Rackspace

Software Developer IV - SRE

Req #: 35548
Location(s): US-Remote
Category: Software Development

Job Overview

Overview & Responsibilities

This role is responsible for the provisioning tools, configuration management systems, and installation and uptime of customer-facing infrastructure and services for Rackspace's OpenStack Private Cloud product offerings. We are looking for software engineers with an operations background to design and implement tools that automate the provisioning, configuration, installation, and maintenance of reliable and performant provisioning systems, distributed configuration management systems, and installation and upgrade frameworks, with Continuous Deployment as the eventual goal.

 

Highlights 

 

Automation - We take the approach of "If we've done this more than twice, automate it," which puts the emphasis on building tools over manual processes.


Open Source - As a co-founder of OpenStack, we have a demonstrated history of working with open source communities and contributing much of our work back.


Reliability - We run many private clouds that power customer production workloads, and our customers depend on us to understand their environments and keep them running at peak performance, backed by our 99.99% uptime guarantee.


DevOps - Development and Operations coexist in our product development teams, making both first-class citizens in our infrastructure focus.

 

Scale - We support numerous large-scale open source systems on behalf of our customers and are constantly pushing the limits of these systems.

 

 

Responsibilities 


Extract and automate all duplicate and manual work contained in service delivery runbooks or playbooks for customer upgrades and maintenance

 

Provide and maintain a framework for automatically provisioning, configuring, installing, and upgrading the various product offerings in a consistent manner


Collect and provide metrics for the entire product lifecycle (provisioning, configuration, installation, upgrading) at a granular level

Automate capacity additions, hardware replacements, operating system upgrades, and security updates of the underlying systems with minimal to no impact on the products running on top

Test and tune system configuration for various workloads (compute-heavy, storage-heavy, network/IO-heavy, etc.)

Automate testing of reference architecture deployments and upgrades of RPC-OpenStack/OSA, moving towards Continuous Deployment

 

Qualifications

  • OpenStack operational experience required 
  • Knowledge of Ansible is preferred 
  • Strong knowledge of low-level provisioning technologies (iPXE, automated BIOS updates, dracut/initramfs, in-place reboots/upgrades of the OS)
  • Strong experience in large fleet automation driven by centralized, versioned configuration/change management
  • Ability to troubleshoot networking issues in distributed systems 
  • Experience bringing software to production at large scale
  • Fanatical focus on automation and instrumentation 
  • Ability to decompose complex systems and understand potential failure scenarios 
  • Contributions to Open Source projects

 
