Svitla Systems Inc. is looking for a Senior Site Reliability Engineer (SRE) for a full-time position (40 hours per week) in Mexico. Our client is a technology company specializing in high-performance, cost-effective log data management through its streaming data lake platform, which helps organizations manage large volumes of data and is built to tackle data storage, real-time analytics, and scalability challenges. This cloud data platform is designed for real-time log analytics and powers next-gen, data-intensive applications in observability, security, AdTech, media, AI, and other sectors. Thanks to the platform's design, customers can save 75% or more on log management and analytics while retaining 4x more data. Using a tech stack that includes Hotjar, Google Cloud, and Salesforce, the company is committed to cutting-edge technology and appeals to tech-savvy clients seeking advanced data solutions. It was founded in 2018 and is based in Portland, Oregon.
Working hours: US West Coast time (8 am to 5 pm or 9 am to 6 pm).
The engagement is an initial 6-month contract with the possibility of extension. The start date is mid-January, possibly early February.
Requirements
- Minimum of 5 years of experience as a Site Reliability Engineer or similar role, with a history of supporting complex distributed systems.
- Understanding of monitoring and debugging tools such as Prometheus, Vector, Grafana, Superset, or Kibana.
- Hands-on experience with at least one major cloud platform (AWS, GCP, Azure, or Linode), plus working knowledge of at least AWS and GCP.
- Understanding of SQL databases.
- Knowledge of programming languages such as Python, Go, or Rust.
- Strong knowledge of Linux systems, including performance tuning and system-level troubleshooting.
- A customer focus and experience with B2B interactions.
- Excellent written and verbal communication skills, with the ability to convey technical concepts clearly to diverse audiences, including customers and cross-functional teams.
Nice to have
- Familiarity with PostgreSQL.
Responsibilities
- Deploy, maintain, and ensure a highly reliable fleet of Kubernetes clusters and deployments across multiple cloud platforms.
- Design, implement, and maintain systems and processes to enhance service reliability, availability, and performance.
- Build and optimize CI/CD tools and processes to ensure efficient and reliable deployments.
- Develop and manage monitoring, alerting, and incident response strategies to minimize downtime and enable rapid recovery.
- Conduct comprehensive root cause analyses for system failures, implementing long-term preventive measures.
- Automate repetitive tasks and optimize system performance to improve operational efficiency.
- Participate in a 24/7 on-call rotation, covering weekday business hours and once-monthly weekend shifts.
- Work closely with software engineering, infrastructure, and product teams to integrate reliability practices into every development lifecycle stage.
- Champion SRE best practices and foster a culture of operational excellence across the organization.
- Collaborate with a distributed team of engineers worldwide to provide round-the-clock support.
- Interface with customers to address and resolve reported incidents, ensuring a seamless user experience.
We offer
- Work with the #1 winner of the ‘Best Place to Code’ award!
- US and EU projects based on advanced technologies.
- Legal IMSS contract and competitive compensation.
- Annual performance appraisals.
- Flexibility in workspace, either remote or in our welcoming office.
- Remote work financial support.
- Comprehensive medical insurance, including family coverage.
- Life insurance, maternity policy, and family days off.
- Christmas Bonus in the amount of 30 days' salary.
- Bonuses for recommendations of new employees.
- Bonuses for article writing, public talks, and other activities.
- 15 vacation days, a 25% vacation bonus, and 11 national holidays.
- English lessons and education with Platzi.
- Free webinars, meetups and conferences organized by Svitla.
- Monthly Pantry Vouchers, free office snacks, and drinks.
- Fun corporate celebrations and activities, both online and offline.
- Awesome team, friendly and supportive community!
About Svitla
Svitla Systems is a trusted global IT solutions company headquartered in California, with business and development offices throughout the US, Latin America, Europe, and Asia. Svitla is an outspoken advocate of workplace flexibility, best known for its well-established remote culture, its individual approach to teammates’ professional and personal growth, and its family-like environment.
Since 2003, Svitla has served a wide range of clients, from innovative start-ups in California to very large corporations such as Ingenico, Amplience, InvoiceASAP, and Global Citizen. At Svitla, developers work directly with clients’ teams, integrating seamlessly with on-site processes and building lasting, successful partnerships as a result.
Svitla Systems’ global mission is to build a business that contributes to the well-being of our partners, personnel and their families, improves our communities, and makes a lasting difference in the world. Join us!