Leonel Benitez — SRE / DevOps / MLOps

LEONEL BENITEZ · PORTFOLIO

leonel@portfolio:~

$ cat about.txt

About me.

SRE with 7+ years of experience building reliable, scalable infrastructure for ML/Data platforms and mission-critical systems. Expert in end-to-end automation — from CI/CD pipelines to ML workflow orchestration in production environments.

Proven track record of reducing downtime by 60% while scaling cloud-native ML systems with 99.9%+ uptime. I bridge the gap between development, operations, and ML engineering.

7+ years of experience · 99.9% uptime achieved · 60% MTTR reduction · 3 countries / clouds

$ ls projects/

Projects.

Startup · Co-founder · 400 lines → 14

velar

Serverless GPU cloud. Replaced 400 lines of infra YAML with 14 lines of Python — deploy to H100s in 38 seconds, no Dockerfiles, no K8s configs. Building the entire infra layer from scratch.

Python · Go · Kubernetes · Docker · GPU Cloud

↗ velar.run

Startup · Co-founder · 13 languages · 90+ POI

ubbi.io

Free GPS navigation app with 3D maps and a social travel community. 90+ POI categories, offline maps, community-curated routes, and real-time travel insights — 13 languages, iOS & Android. Pre-launch, Panama-founded.

Astro · TypeScript · Tailwind CSS · GPS/Maps APIs · iOS · Android

↗ ubbi.io

Agtech · Co-founder · Seed → plate in <8 days

The District Greens

Urban microgreens farm in Panama City delivering pesticide-free produce from seed to plate in under 8 days. Solar-powered hydroponic operation serving health-conscious consumers and restaurants — same-day harvest, 24h delivery.

Astro · TypeScript · Tailwind CSS · WhatsApp API

↗ thedistrictgreens.com

// work projects

CLI Tool · ↓ 50% manual ops tasks

go-sre-toolkit

Internal CLI automation suite built in Go for Millicom's SRE team — streamlines incident response, runbook execution, and service health checks across hybrid cloud environments.

Go · REST APIs · Kubernetes

Infra Project · ↓ 40% RCA time

grafana-observability-stack

End-to-end observability platform built on the Grafana ecosystem (Tempo, Loki, Mimir, Pyroscope) for distributed telco systems. Provides full-stack visibility, from infrastructure metrics to continuous profiling.

Grafana · Tempo · Loki · Mimir · Pyroscope

MLOps Platform · ↑ 40% pipeline reliability

flyte-mlops-platform

Production MLOps platform deployed on Amazon EKS using Flyte for workflow orchestration. Includes comprehensive monitoring, automated recovery, and SLI/SLO frameworks for ML pipelines.

Flyte · EKS · CloudWatch · Docker
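The SLI/SLO framing mentioned for flyte-mlops-platform can be illustrated with a small error-budget calculation. This is only a sketch under assumptions, not the platform's actual code: the function names, the 30-day window, and the sample event counts are all hypothetical. It shows that a 99.9% availability target leaves roughly 43.2 minutes of downtime per 30 days, and how much of a request-based budget remains after some failures.

```python
def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO permits over a window.

    `slo` is a fraction (0.999 for 99.9%); the 30-day default is an assumption.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def error_budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1.0 - slo) * total_events   # failures the SLO tolerates
    actual_bad = total_events - good_events    # failures actually observed
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(f"{allowed_downtime_minutes(0.999):.1f} minutes per 30 days")
    print(f"{error_budget_remaining(0.999, 99_950, 100_000):.0%} of budget left")
```

A remaining budget near zero is the usual signal to pause risky rollouts and prioritise reliability work.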

Architecture · ↑ 99.9% availability

event-driven-ml-arch

Multi-cloud event-driven architecture using AWS (SQS, S3, Lambda) and Azure (Functions, Event Grid) for scalable ML operations — automating 80% of manual ML ops processes.

AWS · Azure · Terraform · Lambda

$ git log --oneline

Experience.

Millicom (Tigo)

Senior Site Reliability Engineer

↓ 40% RCA time · ↓ 50% manual ops · ↓ 65% repeat incidents

› Led L3 support for mission-critical telco platforms, with a focus on MTTR optimization
› Implemented a full Grafana observability stack (Tempo, Loki, Mimir, Pyroscope), reducing root-cause analysis (RCA) time by 40%
› Developed internal CLI automation tools in Go, reducing manual tasks by 50%
› Drove root-cause analysis, reducing repeat incidents by 65%

Go · Grafana · Prometheus · Loki · Tempo · Pyroscope · Kubernetes

Cheil Worldwide

Cloud Data Engineer & SRE

↑ 40% ML reliability · ↑ 99.9% availability · ↓ 70% recurring incidents

› Deployed Flyte on Amazon EKS, improving ML pipeline reliability by 40%
› Designed event-driven ML architecture on AWS + Azure, achieving 99.9% availability
› Standardized Terraform IaC across multi-cloud, eliminating config drift
› Led post-mortems, reducing recurring incidents by 70%

Flyte · EKS · AWS · Azure · Terraform · Jenkins · Docker

Banco General

Site Reliability Engineer & DevOps

↑ 99.9% uptime · ↓ 60% deploy time · MTTR 4h → 45m

› Built proprietary monitoring & alerting for critical banking infrastructure, sustaining 99.9% uptime
› Built CI/CD pipelines with Ansible, Docker, and Jenkins, reducing deployment time by 60%
› Reduced MTTR from 4 hours to 45 minutes through runbook automation

Ansible · Docker · Jenkins · Python · Bash
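For readers who want the arithmetic behind an MTTR change such as 4 hours to 45 minutes, the sketch below computes mean time to recovery from a list of incident durations and the percentage reduction between two MTTR values. The function names and the sample incident durations are hypothetical and only illustrate the calculation; they are not taken from any real incident data.

```python
from datetime import timedelta


def mean_time_to_recovery(durations: list[timedelta]) -> timedelta:
    """Average recovery time across incidents (detection to resolution)."""
    if not durations:
        raise ValueError("at least one incident duration is required")
    return sum(durations, timedelta()) / len(durations)


def percent_reduction(before: timedelta, after: timedelta) -> float:
    """Relative improvement from `before` to `after`, as a percentage."""
    if before <= timedelta(0):
        raise ValueError("before must be a positive duration")
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Hypothetical incident durations, for illustration only.
    incidents = [timedelta(minutes=30), timedelta(minutes=45), timedelta(minutes=60)]
    current = mean_time_to_recovery(incidents)
    baseline = timedelta(hours=4)
    print(f"MTTR now: {current}, reduction: {percent_reduction(baseline, current):.2f}%")
```

Going from 4 hours (240 minutes) to 45 minutes removes 195 of 240 minutes, which is an 81.25% reduction.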

Banco General

Systems Engineer & Operations

7+ years tenure · ↓ 40% manual effort · 24/7 ops coverage

› Managed 24/7 production systems for critical banking operations
› Automated routine tasks with Python and Bash, reducing manual effort by 40%
› Collaborated on performance optimization and reliability improvements

Python · Bash · Java

$ neofetch --skills

Tech Stack.

{ } Languages

Python · Go · Java · Bash · SQL · JavaScript

☁ Cloud

AWS (EKS, S3, Lambda, CloudWatch) · Azure (ADF, Synapse, Functions)

⬡ Infrastructure

Kubernetes · Docker · Terraform · Ansible · GitOps

◎ Observability

Grafana · Prometheus/Mimir · Loki · Tempo · Pyroscope · CloudWatch

⚙ MLOps

Flyte · Kubeflow · Airflow · Jenkins · GitHub Actions

∿ ML / Data

TensorFlow · scikit-learn · Apache Beam · Pandas · NumPy

$ open contact.json

Get in touch.

Currently open to Senior SRE, DevOps, and MLOps roles. Feel free to reach out.

email: leoalejandrobenitez@gmail.com

linkedin: linkedin.com/in/leonel-benitez

github: github.com/LeOx26

location: Panama City, Panama