Senior DevOps Engineer Interview at JioHotstar

Round 1 — Streaming at Scale, Kubernetes, Cloud & Linux (45 mins)

  1. How would you design auto-scaling for 50M+ concurrent viewers across multiple Kubernetes clusters without over-provisioning?
  2. During an IPL final, a new region needs to spin up instantly. How would you pre-warm nodes and scale workloads with zero cold-start impact?
  3. Explain how you would use Envoy and Istio to route low-latency live streams differently from VOD without service restarts.
  4. What is your approach to multi-zone pod affinity and anti-affinity to ensure a node failure does not impact regional streaming SLAs?
  5. How would you monitor HPA scaling decisions in real time and detect if the metrics server is lagging?
  6. Describe Kubernetes readiness and liveness probe configurations to catch buffering and lag issues in stream-processing microservices before users notice.
  7. A kube-proxy update rolls out mid-match. What is your network rollback plan to avoid packet drops?

Round 2 — RCA, Fire Drills & Streaming Chaos (75 mins)

  1. Playback failures spike for only 3% of users in the APAC region. CPU, memory, and pods look fine. Walk through your triage plan.
  2. Your Kafka ingestion pipeline lags by 2 minutes during a traffic surge. Producers are fine, consumers are idle. What is your debug path?
  3. Tail latency suddenly spikes on a Redis-based stream session store during a Champions League match. How do you find and fix the bottleneck?
  4. HPA refuses to scale out a critical set of pods even though Prometheus shows CPU > 90%. What is the root cause and fix?
  5. NAT gateway costs double in 24 hours during a live series. No infrastructure changes were made. What could be silently causing it?

Round 3 — Leadership, Reliability Culture & Scaling Influence (30 mins)

  1. How do you build a culture where latency SLOs are enforced like uptime SLAs in a streaming organization?
  2. You are asked to ship multi-region failover for live events in 2 weeks with no DNS-based routing allowed. What is your plan?
  3. How would you simulate chaos in a streaming pipeline without risking real user impact?
  4. How do you justify infrastructure costs for pre-warmed scaling capacity to executives before a major sports event?
