Site Reliability Engineer (SRE)

Cloud Operations · Bangalore, Karnataka
Department: Cloud Operations
Employment Type: Full-Time
Minimum Experience: Mid-level

Riversand Technologies is a Master Data Management (MDM) visionary and a Product Information Management (PIM) leader. We are a team of passionate people who are rethinking the way MDM and PIM work. We recently raised $35 million in Series A funding and are on a trajectory of accelerated product innovation and growth over the next two years. If you are a Cloud Support Engineer looking to advance your career with the latest cloud and open source technologies, now is the best time to join Riversand. Our solutions power enterprises worldwide in a variety of industries, including Retail, Manufacturing, Distribution, Energy, Healthcare, and Food Services.


Successful Site Reliability Engineers need a wide range of skills so that they can contribute effectively to all stages of Cloud system, software or application development. This includes attention to detail when working on deep technical issues, identifying opportunities to address root causes, and improving telemetry, monitoring and automation. They also need good problem-solving skills so that they can identify issues and determine how to correct them. Reviewing configuration, infrastructure/application architecture, lines of code and operational processes also calls for analytical skills. In addition, Site Reliability Engineers need excellent communication and teamwork skills in order to relay information effectively to other engineers and properly document their work.

Primary responsibilities include:

  • Maintain a 24x7 availability mindset to ensure 99.97% uptime of Riversand’s SaaS platforms and applications.
  • Own, resolve and restore major technical issues to meet the uptime commitment. Expected to be available on-call any time (24x7) for
  • Develop, deploy and continually improve the telemetry, monitoring and automation (self-heal, self-help, self-service) of the SaaS platform and the applications
  • Ensure the Cloud Infrastructure, platform components and applications are secured and safeguarded via strong controls, monitoring and security incident management
  • Own Root Cause Analysis of incidents end to end and demonstrate quantifiable technological, stability and process improvement for Riversand’s Cloud Infrastructure, SaaS platforms and applications
  • Enable technology support teams, customers and business users by building and continually developing a knowledge base driven by analysis of practical usage, issues and related challenges.
  • Serve as the highest-level technical escalation point and act as guide, coach and mentor for first- and second-level Application/Infrastructure support teams. Be the bridge between Support and Product Engineering teams, and proactively engage with customers and business users as and when required.
  • Own and drive the end-to-end technical resolution of critical incidents that may need involvement from multiple parties, and ensure the right collaboration and communication are maintained so that the issue is resolved quickly through the shortest and most efficient path.

Technology Skills Required:

SaaS Platform (Mandatory): Elasticsearch Big Data platform, Kafka, ZooKeeper, Storm, HBase, Hadoop, Spark and other components such as Netty/NGINX, Docker/Docker Swarm, Kubernetes, Auth0/IdP/SSO, etc.

Cloud Infrastructure: Microsoft Azure (Mandatory): VNet, VMs, Subscriptions, Resource Groups, Storage Accounts, Scale Sets, Azure Functions, Event Hubs, Blob Storage, Azure Resource Manager (ARM), Databricks, Azure networking/security, OMS, Application Insights, etc.

AWS (good to have): VPC, EC2, Load Balancers, Auto Scaling, EBS, Kinesis, S3, Lambda, CloudFormation templates (CFT), CloudWatch

OS: Primarily Linux and partly Windows Servers

Scripting: PowerShell, Python, Bash or Ruby

Monitoring: Monitoring tools such as Sensu, Application Insights, OMS, LogicMonitor, Sumo Logic, etc.

Container Orchestration: Docker or Kubernetes knowledge

Programming: Good understanding of REST APIs and JSON structures. Should be familiar with engineering practices, architecture/workflows, use cases, etc.

Experience/Background: 8+ years in a role such as Cloud SRE, Architect or Senior Support Expert for SaaS solutions built on the above-mentioned technologies. Should hold certifications in two or more of the related technologies.

Qualification: BE/BTech/MCA

Other Important Requirements: The team this role belongs to is expected to be available 24x7 (on call and in the office), as it manages global production and highly critical customer-facing systems. The individual should therefore be flexible in adapting to roster/shift arrangements as required.


