LEONEL BENITEZ · PORTFOLIO
leonel@portfolio:~
$ cat about.txt

About me.

SRE with 7+ years of experience building reliable, scalable infrastructure for ML/Data platforms and mission-critical systems. Expert in end-to-end automation — from CI/CD pipelines to ML workflow orchestration in production environments.

Proven track record of reducing downtime by 60% while scaling cloud-native ML systems with 99.9%+ uptime. I bridge the gap between development, operations, and ML engineering.

7+ · Years of experience
99.9% · Uptime achieved
60% · MTTR reduction
3 · Countries / clouds
$ git log --oneline

Experience.

Senior Site Reliability Engineer
Oct 2025 – Present · Panama · Hybrid
  • Led L3 support for mission-critical telco platforms, with a focus on MTTR optimization
  • Implemented full Grafana observability stack: Tempo, Loki, Mimir, Pyroscope — reducing root-cause analysis time by 40%
  • Developed internal CLI automation tools in Go, reducing manual tasks by 50%
  • Drove root-cause analysis (RCA), reducing repeat incidents by 65%
Go · Grafana · Prometheus · Loki · Tempo · Pyroscope · Kubernetes
$ neofetch --skills

Tech Stack.

Languages
Python · Go · Java · Bash · SQL · JavaScript

Cloud
AWS (EKS, S3, Lambda, CloudWatch) · Azure (ADF, Synapse, Functions)

Infrastructure
Kubernetes · Docker · Terraform · Ansible · GitOps

Observability
Grafana · Prometheus/Mimir · Loki · Tempo · Pyroscope · CloudWatch

MLOps
Flyte · Kubeflow · Airflow · Jenkins · GitHub Actions

ML / Data
TensorFlow · scikit-learn · Apache Beam · Pandas · NumPy
$ ls projects/

Projects.

CLI Tool · ↓ 50% manual ops tasks
go-sre-toolkit

Internal CLI automation suite built in Go for Millicom's SRE team — streamlines incident response, runbook execution, and service health checks across hybrid cloud environments.

Go · REST APIs · Kubernetes
Infra Project · ↓ 40% RCA time
grafana-observability-stack

End-to-end observability platform built on the Grafana ecosystem (Tempo, Loki, Mimir, Pyroscope) for distributed telco systems. Provides full-stack visibility, from infrastructure metrics to continuous profiling.

Grafana · Tempo · Loki · Mimir · Pyroscope
MLOps Platform · ↑ 40% pipeline reliability
flyte-mlops-platform

Production MLOps platform deployed on Amazon EKS using Flyte for workflow orchestration. Includes comprehensive monitoring, automated recovery, and SLI/SLO frameworks for ML pipelines.

Flyte · EKS · CloudWatch · Docker
Architecture · 99.9% availability
event-driven-ml-arch

Multi-cloud event-driven architecture using AWS (SQS, S3, Lambda) and Azure (Functions, Event Grid) for scalable ML operations — automating 80% of manual ML ops processes.

AWS · Azure · Terraform · Lambda
$ open contact.json

Get in touch.

Currently open to Senior SRE, DevOps, and MLOps roles. Feel free to reach out.

Location: Panama City, Panama