    Site Reliability Engineer Lead

    • 17050
    • Information Technology
    • Experienced Professionals
    • ISG Data
    • Data - Service Support
    • UK
    • Cambridge
    • UK
    • Manchester


    We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state or local protected class.

    Job Description

    Arm Treasure Data began by offering data warehousing and processing services; since then we’ve moved further up the value chain with our Customer Data Platform (CDP) application, which is seeing a lot of traction with customers new and old. This growth has prompted a greater focus on Site Reliability Engineering as we grow past our current practices, and we’re looking to add a 9th member to our team. As that new member, you’ll play an essential role in maturing the company’s approach to service reliability and continuity.
    The team and you will be directly responsible for solutions for the platform in these key areas: availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
    This will require working with engineering teams on complex problems/projects where analysis of situations or data requires an in-depth evaluation of multiple factors and wise trade-offs between competing factors when arriving at a solution.

    Success in this role requires a passion for helping others and making their lives better. You do this by simplifying complex systems to make them understandable and operable. You are able to communicate decisions, ideas, designs, and the operation of systems and services clearly and concisely.
    You are both a generalist, capable of picking up and working with multiple, disparate systems, and an expert, having an ability to dive deep into specific topics and quickly master them. You comfortably move between system, service, and instance level views.

    You have a love of stateful systems containing Treasured data, ensuring we continue to protect customer data from loss occurring from outages.

    What you will be accountable for:

    • Build and maintain, with our team, services, automation, and tooling that will positively impact the key areas (see above), and be responsible for the systems you build.
    • Drive continuous improvement by measuring and reducing the amount of manual operational work.
    • Help us measure and improve reliability and performance across the product line by working with product owners and engineering teams.
    • Make wise decisions balancing availability and delivery, and communicate those decisions clearly.
    • Be an active participant in, and internal evangelist for, our shared processes, such as blameless post-mortems.
    • Work with engineering teams as a subject matter expert on operating software and systems at scale, teaching them from your experience or know-how, and helping them reach their goals.
    • Investigate system performance, errors, and problems.

    Job Requirements

    What we are looking for

    • A minimum of 5 years of relevant working experience.
    • Experience building and maintaining software addressing key SRE areas of responsibility (see above).
    • Strong Software Engineering experience, with an ability to work in multiple programming languages.
    • Experience with Distributed Systems and operating them as they scale.
    • Experience operating services running in the cloud (AWS primarily) or virtualized API-driven platforms.
    • Articulate and personable with strong spoken and written English language abilities.
    • Knowledge and experience in Systems Engineering, Administration, and Operations.
    • Demonstrated ability to work independently and collaboratively as part of a specialized team.
    • Ability to communicate clearly and effectively across language barriers.
    We would be thrilled if you:
    • Have experience automating datastore operations or datastores as a service.
    • Have crafted APIs and specifications that allow for future non-breaking changes while remaining backwards compatible for as long as possible.
    • Have experience analyzing system-wide performance: latency, throughput, and efficiency.
    • Are a student of complex systems theory and how to build resilient and adaptive systems.
    • Are able to build services backed by BLOB, relational, and/or document data stores, currently: S3, PostgreSQL, and DynamoDB.
    • Have experience working as part of a distributed or partially distributed team and thrive in a highly collaborative and communicative work environment.
    • Pride yourself on giving back to your community: open source contributions, speaking, teaching, mentoring, or helping others.
    • Have experience speaking and/or writing Japanese.
    At Arm, we are guided by our core beliefs that reflect our rare culture and guide our decisions, defining how we work together to defy ordinary and shape extraordinary:

    We not I

    Take daily responsibility to make the Global Arm community thrive.

    No individual owns the right answer. Brilliance is collective.

    Information is crucial, share it.

    Realise that we win when we collaborate — and that everyone misses out when we don’t.

    Passion for progress

    Our differences are our strength. Widen and mix up your network of connections.

    Difficult things can take unexpected directions. Stick with it.

    Make feedback positive and expansive, not negative and narrow.

    The essence of progress is that it can’t stop. Grow with it and own your own progress.

    Be your brilliant self

    Be quirky not egocentric.

    Recognise the power in saying ‘I don’t know’.

    Make trust our default position.

    Hold strong opinions lightly.

    About Arm Treasure Data

    Arm Treasure Data provides an end-to-end, fully managed cloud service (data acquisition, storage and analysis capability) for big data that is trusted and simple. As the original developers of Fluentd, an advanced open-source log collector specifically designed to solve the big data log collection problem, Arm Treasure Data helps companies manage their big data needs.

    Arm has a responsibility to ensure that all employees are eligible to live and work in the UK.


    Arm Benefits


    Your particular benefits package will depend on position and type of employment and may be subject to change. Your package will be confirmed on offer of employment. Arm’s benefits program provides permanent employees with the opportunity to stay innovative and healthy, ensure the wellness of their families, and create a positive working environment.

    • Annual Bonus Plan
    • Discretionary Cash Award
    • Group Personal Pension Plan with enhanced company contribution
    • Medical, Travel, Health & Life Insurances
    • Holiday, 25 days annual leave with option to buy an additional 5 days per year
    • Sabbatical, 20 paid days every four years of service
    • Volunteering, One (1) paid working day each year (TeamARM)
    • Varies by location: cycle to work, free car parking, gym on site, team and social events

    About Arm

    Arm® technology is at the heart of a computing and connectivity revolution that is transforming the way people live and businesses operate. From the unmissable to the invisible, our advanced, energy-efficient processor designs are enabling the intelligence in 86 billion silicon chips and securely powering products from the sensor to the smartphone to the supercomputer. With more than 1,000 technology partners, including the world’s most famous business and consumer brands, we are driving Arm innovation into all areas where compute is happening: inside the chip, the network and the cloud.

    With offices around the world, Arm is a diverse community of dedicated, innovative and highly talented professionals. By enabling an inclusive, meritocratic and open workplace where all our people can grow and succeed, we encourage our people to share their unique contributions to Arm's success in the global marketplace.

    About the office

    At our global HQ in Cambridge, England, we house the majority of our engineering and corporate groups that deliver our extraordinary success. As a world-renowned university town, Cambridge boasts both beautiful countryside and a historical town center. Local activities include punting on the River Cam and visiting the many museums that reside within Cambridge University.

    Cambridge, UK - Global HQ
    Arm Ltd.
    110 Fulbourn Road
    GB-CB1 9NJ