• SRE - Kafka Administrator

    Job Locations: IN-Pune
    Job ID: 2018-1672
    Category: Engineering
  • Overview

    Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to IT operations problems. Its main goals are to create ultra-scalable and highly reliable software systems. This opening is on the Site Reliability Engineering team for Cloud Operations, which is responsible for managing and maintaining our datacenter server and cloud operations.

    Responsibilities

    • Manage large-scale, multi-node Kafka cluster environments residing on AWS.
    • Handle all Kafka environment builds, including design, capacity planning, cluster setup, performance tuning and ongoing monitoring.
    • Perform high-level, day-to-day operational maintenance, support, and upgrades for the Kafka Cluster.
    • Create key performance metrics that measure the utilization, performance, and overall health of the cluster.
    • Plan capacity for and implement new or upgraded hardware and software releases, as well as storage infrastructure.
    • Research and recommend innovative and, where possible, automated approaches for system administration tasks.
    • Collaborate closely with product managers and lead engineers.
    • Provide guidance in the creation and modification of standards and procedures.
    • Proactively monitor the Kafka cluster and supporting hardware, and set up alerting mechanisms, to ensure system health and maximum availability.
    • Proven ability to lead multiple high-priority initiatives with aggressive timelines, leveraging an agile framework.
    • Comfortable performing in a fast-paced, dynamic, and ambiguous business environment.
    • Ability to concentrate on a wide range of loosely defined complex situations, which require creativity and originality, where guidance and counsel may be unavailable.

     

    Qualifications

    • 4+ years of solid Kafka Admin experience in managing critical 24/7 applications
    • Strong knowledge of Linux.
    • Strong knowledge of AWS cloud services.
    • Design, build, assemble, and configure application or technical architecture components using business requirements.
    • Hands-on experience with Kafka clusters hosted on Amazon cloud is a plus.
    • Experience with Kafka build pipelines using Ansible, CloudFormation templates, shell scripts, etc.
    • Experience with Jenkins and GitHub.
    • Experience implementing security and permission-based authorization on Kafka clusters.
    • Experience with open-source Kafka, ZooKeeper, Kafka Connect, and Kafka Manager.
    • High-availability cluster setup, maintenance, and ongoing support.
    • Create topics, set up redundant clusters, and deploy monitoring tools and alerts; good knowledge of best practices.
    • Exposure to Kafka APIs
    • Hands-on experience standing up and administering the Kafka platform, including backup and mirroring of Kafka cluster brokers, broker sizing, topic sizing, hardware sizing, performance monitoring, broker security, topic security, and consumer/producer access management (ACLs).
    • Knowledge of Kafka API (development experience is a plus)
    • Provide technical expertise and guidance to production support staff.
    • Involvement with grouping/clustering and high-volume systems.
    • Knowledge of best practices related to security, performance, and disaster recovery.
    • Proven track record of sound, effective decision making.
    • Excellent listening and communication skills.
      o Must be able to communicate technical information clearly and to ‘translate’ between diverse groups of technical and non-technical individuals.
      o Ability to synthesize large amounts of complex data into meaningful conclusions and present recommendations to a vast array of individuals.
      o Partner with business units in order to improve the effectiveness of business decisions.
      o Ability to structure documents to effectively communicate with senior leadership and drive alignment and decision making.

    Education: BE/BTech/ME/MTech/MCA
