Site Reliability Engineer (SRE) Job at ShiftPixy Resources Inc, Washington DC

  • ShiftPixy Resources Inc
  • Washington DC

Job Description

Responsibilities

Deployment & Automation

  • Implement and maintain CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, and Jenkins.
  • Automate infrastructure provisioning and management using Infrastructure-as-Code (IaC) with Terraform, CloudFormation, or AWS CDK.
  • Develop robust automation scripts and self-service tooling to minimize toil and enhance operational efficiency.
Capacity, Performance & Cost Optimization
  • Lead and implement operational cost optimization initiatives across cloud infrastructure and data platforms.
  • Configure, maintain, and tune auto-scaling policies and performance thresholds.
  • Develop and execute resiliency test plans and provide critical support for performance testing efforts.
Incident Management & SRE Principles
  • Serve as a production on-call responder, employing strong troubleshooting skills to quickly resolve complex incidents.
  • Proficiently utilize ITIL framework concepts and ITSM tools (e.g., ServiceNow) for incident and change management.
  • Develop high-quality Root Cause Analysis (RCA) documentation and knowledge articles to prevent recurrence.
  • Implement and enforce SRE principles, including the definition and tracking of Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets.
Observability & Monitoring
  • Manage and leverage advanced observability platforms (Dynatrace preferred; also AppDynamics, ELK, etc.).
  • Implement distributed tracing with accurate context propagation across data services and applications.
  • Optimize monitoring queries and configure actionable dashboards, alerts, and anomaly detectors using tools like Dynatrace and Kibana.
Data Analytics Platform Reliability
  • Ensure the reliability, performance, and access control of Databricks clusters and data pipelines.
  • Maintain Informatica workflow orchestration, connector reliability, and error handling for critical data flows.
  • Manage Power BI gateway health and access control, and ensure reliable data refresh processes.
Security & Compliance
  • Manage service accounts, access permissions, and roles following the principle of least privilege.
  • Create, deploy, and manage digital certificates and TLS/SSL configurations.
  • Execute effective remediation tasks and respond to security incidents as part of the operational team.
Qualifications

Education & Experience
  • Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • 2 to 4 years of hands-on experience in a DevOps, Site Reliability Engineering (SRE), or Cloud Infrastructure role.
  • Practical, working experience with major cloud platforms, specifically AWS and Azure.
Technical Skills
  • Mid-level proficiency in Python or other scripting languages (e.g., Bash, Go) for automation tasks.
  • Mid-level proficiency with Configuration Management tools, including Ansible.
  • Strong knowledge of containerization technologies (Docker, Kubernetes/ECS).
  • Solid understanding of Linux systems and networking fundamentals (TCP/IP, DNS, Load Balancing).
  • Working knowledge of relational, cloud-native (e.g., AWS RDS), and NoSQL database technologies.
  • Direct hands-on experience supporting and maintaining data platforms like Databricks, Informatica, or Power BI is highly desirable.
Professional Attributes
  • Excellent written and verbal communication skills, with a proven ability to document complex systems.
  • Demonstrated ability to work independently, manage shifting priorities, and drive initiatives to completion.
  • Availability for on-call duties and for work outside standard business hours as required to support a 24/7 production environment.

Job Tags

Work experience placement, Shift work
