DEVOPS TECHNICAL SUPPORT (Experienced)

REIZEND PRIVATE LIMITED

Trivandrum

in 20 days

Brief Description

DevOps Technical Support Engineer with 3 years of experience

We are looking for a proactive and highly skilled Technical Support Engineer to provide exceptional technical support to overseas projects, working in rotational shifts to ensure 24/7 availability. This role involves troubleshooting complex issues across cloud platforms, networking, application architectures, and DevOps toolchains. The ideal candidate is self-motivated, collaborative, agile, and a continuous learner.

Key Responsibilities

Provide technical support and troubleshoot issues related to cloud platforms and services such as Fargate, ECS, DynamoDB, BigQuery, and SNS.
Diagnose problems by analyzing logs and metrics from sources such as CloudWatch, Prometheus, Grafana, Loki, Alert Manager, and Splunk.
Analyze and resolve networking challenges, including load balancers, API gateways, reverse proxies, ingress controllers, and service-to-service communications.
Work on issues related to client-server communications, firewalls, and virtual machines.
Collaborate with DevOps teams to manage and troubleshoot toolchains such as Docker, Kubernetes, Jenkins, and ingress controllers.
Act as the first point of contact for technical queries and escalate issues when necessary.
Liaise with development and operations teams to identify root causes and resolve incidents effectively.
Document troubleshooting steps and solutions, and maintain a knowledge base for recurring issues.
Collaborate with cross-functional teams to implement best practices for monitoring and incident response.
Participate in shift handovers and provide timely updates on ongoing issues.

Preferred Skills

Technical Skills

Cloud Platforms and Services

Experience or hands-on knowledge of Fargate and ECS for managing and troubleshooting containerized workloads.
Proficiency with DynamoDB and BigQuery for analyzing data and making decisions based on the analysis.
Hands-on knowledge of SNS for debugging message delivery issues and integration workflows.

Monitoring and Logging Tools

Proficiency in CloudWatch Logs, Loki, and Splunk for consuming and analyzing logs to identify and resolve issues.
Experience or hands-on knowledge of Prometheus and Grafana for analysing metrics through dashboards and monitoring system health.
Knowledge of Alert Manager for configuring and managing alert escalation.
Ability to interpret metrics from various sources and turn them into actionable insights.

Networking and Security

Strong understanding of load balancers (e.g., ALB, NLB) for distributing traffic and troubleshooting connectivity issues.
Knowledge of API gateways such as AWS API Gateway or NGINX for managing API traffic.
Knowledge of reverse proxies and ingress controllers (e.g., NGINX Ingress, Traefik) for managing internal and external traffic.
Understanding of service-to-service communications, including DNS, HTTP/HTTPS, and gRPC.
Experience or hands-on knowledge of firewalls, security groups, and IAM roles for secure communications.
Troubleshooting skills for VM-related issues on platforms such as AWS EC2 or equivalent.

DevOps Toolchains

Proficiency with Docker for managing container images and runtime debugging.
Strong understanding of Kubernetes concepts, including managing deployments, ingress setups, and pod-related issues, along with the related troubleshooting commands and mechanisms.
Knowledge of CI/CD tools such as Jenkins, GitHub Actions, and ArgoCD for building, deploying, and managing automated pipelines.
Understanding of ingress controllers (e.g., NGINX, Traefik) and SSL termination for secure routing.

Troubleshooting and Incident Management

Strong problem-solving skills to identify root causes using logs, metrics, and system-level debugging.
Ability to document detailed troubleshooting steps and solutions for recurring issues.

Collaboration and Communication

Ability to work with cross-functional teams (DevOps, development, and operations) to resolve incidents.
Effective and proactive communication skills for escalating issues and providing updates during shift handovers.
Proficiency with tools such as Slack, JIRA, Confluence, or Google Workspace for collaboration and issue tracking.

Salary Package: 7-9 LPA

Kindly share your resume to careers@reizend.ai