Site Reliability Engineer Team Lead (Remote) #3201
- US-VA-Fairfax
- IT Infrastructure & Network Engineering & Operations
- Suitability/Public Trust
- Fully remote
Overview
GovCIO is currently hiring an Engineering Senior Team Lead to support building processes that manage and improve OIT’s response posture to system events impacting end users and Veterans. This includes working with business partners to improve communication and responsiveness to application failures, minimizing performance degradation and availability impacts, and working toward a significant reduction in application downtime and user impact. This role leads a team of both junior and senior Site Reliability Engineers to deliver effective project management in support of group objectives. This position will be fully remote.
Responsibilities
- A leader first, with SRE experience, focused on managing the team to meet and exceed government expectations, mentoring SREs and providing them with feedback, and creating a culture of teamwork among the SREs.
- Enable SREs to be open and communicative with system owners on incident calls and establish a proactive environment.
- Lead and manage a team of 20+ junior and senior SREs.
- Develop and nurture working relationships across multiple teams within OTG.
- Participate in customer-impacting incidents and lead timely mitigation and long-term resolution.
- Deeply self-motivated with the ability to work independently, coordinating activities within multi-functional teams
- Participate in and assign internal projects within the SRE team.
- Support design and implementation of improvement planning, data analysis, assessments, and organizational strategies.
- Support and provide guidance for tracking complex business procedures to achieve goals and overcome barriers in collecting technical information from relevant stakeholders, or in support of content for white papers and other communication devices; assess and evaluate the effectiveness of executive communication to effect process improvement.
- Support Triage efforts during Major Incidents by deconstructing application performance, interoperability, instrumentation, and human factors to facilitate resolution and development of resilient solutions.
- Support coordination and ensure all High Priority Incidents (HPIs) and Critical Priority Incidents (CPIs) are triaged properly and routed to the appropriate groups for immediate resolution.
- Strong understanding of incident troubleshooting processes using SRE best practices and voicing identified impacts to a larger audience during triaging events.
- Demonstrated ability to communicate complex technical expertise and insight to stakeholders across the entire IT stack.
- Demonstrated proficiency with DevOps tools, JIRA, ServiceNow, and MS Project, and the ability to perform tasks using them.
- Organizational and documentation skills are a MUST, with strong attention to detail.
- Leverages monitoring tools (e.g., Splunk, Dynatrace, SolarWinds, AppD) during incident triage and performs analysis.
- Provide support to Problem Management’s enterprise root cause analysis (RCA) processes in collaboration with appropriate OI&T organizations.
Qualifications
Required Skills and Experience
- Bachelor's Degree is preferred in Business Administration, Business Management, Computer Science, Information Systems, Information Resource Management, Industrial Engineering, Operations Research or related fields.
- 8 to 10 years of relevant experience may be substituted for education (13-15 years total)
- 5+ years of relevant experience
- Certifications in relevant UX software plus 5-8 years of relevant experience
- Strong analytical, investigation, and organization skills.
- High attention to detail and precise data accuracy.
- Critical thinker with a proactive approach to lead Incident troubleshooting efforts.
- Ability to recognize the incident triage flow and drive calls to resolution.
- Scripting and development experience in at least one object-oriented language such as Python, or with regular expressions (RegEx).
Preferred Skills and Experience
- Full-Stack experience highly preferred.
Clearance Required: Suitability/Public Trust
Company Overview
GovCIO is a team of transformers--people who are passionate about transforming government IT. Every day, we make a positive impact by delivering innovative IT services and solutions that improve how government agencies operate and serve our citizens.
But we can't do it alone. We need great people to help us do great things - for our customers, our culture, and our ability to attract other great people. We are changing the face of government IT and building a workforce that fuels this mission. Are you ready to be a transformer?
We are an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, disability, or status as a protected veteran. EOE, including disability/vets.
Posted Pay Range
The posted pay range, if referenced, reflects the range expected for this position at the commencement of employment; however, base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, education, experience, and internal equity. The total compensation package for this position may also include other compensation elements, to be discussed during the hiring process. If hired, the employee will be in an “at-will position” and GovCIO reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, GovCIO or individual department/team performance, and market factors.
Pay range: $170,000 - $185,000 Annually
