Kerlyn Difo

kerlyn angel difo

Learning to build systems and create meaningful projects

Currently at Datadog

Previously at Capital One & Columbia

Experience

My professional journey in software engineering and data engineering.

UPCOMING
Google
May - Aug 2026

Software Engineer Intern

Google Cloud Platform

Joining GCP in New York to rewrite hardware health detection libraries from C++ to Go.

Go, C++, Cloud Infrastructure
Datadog
Jan - Apr 2026

Software Engineer Intern

Datadog Agent | Log Pipelines

Implemented disk-backed log buffering ensuring zero data loss during network outages. Worked on compressor failover and log pipeline reliability.

  • Designed and developed a production-ready compressor failover system for the Datadog Agent that prevents data loss when compression blocks, with health monitoring, automatic pipeline failover at less than 1% overhead, and comprehensive telemetry
  • Fixed multiple critical bugs in the logs pipeline including metric consistency across TCP/HTTP senders, duplicate log truncation issues, and wildcard file path validation restrictions
Golang, Kubernetes, Data Pipelines, Agentic Workflows
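
The disk-backed buffering idea from this role can be illustrated with a small sketch. This is not the Datadog Agent's code (the Agent is written in Go); the class name `DiskBackedBuffer`, its methods, and the JSON-lines file layout are all assumptions chosen for illustration. The point it shows: when the sender fails, payloads are appended to a file, and on the next successful attempt the backlog is replayed in order before new data.

```python
import json
import os
from typing import Callable, List


class DiskBackedBuffer:
    """Hypothetical sketch: spill payloads to a file while the sender is failing."""

    def __init__(self, path: str) -> None:
        self.path = path

    def _append(self, payload: str) -> None:
        # One JSON-encoded record per line, so payloads may contain newlines.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(payload) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make sure the record reaches the disk

    def _rewrite(self, payloads: List[str]) -> None:
        # Write to a temp file and swap it in, so a crash cannot truncate the backlog.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for p in payloads:
                f.write(json.dumps(p) + "\n")
        os.replace(tmp, self.path)

    def pending(self) -> List[str]:
        """Return the payloads currently waiting on disk, oldest first."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def flush(self, sender: Callable[[str], None]) -> None:
        """Replay the backlog in order; keep whatever is still undelivered."""
        backlog = self.pending()
        if not backlog:
            return
        delivered = 0
        try:
            for item in backlog:
                sender(item)
                delivered += 1
        finally:
            self._rewrite(backlog[delivered:])

    def send(self, payload: str, sender: Callable[[str], None]) -> bool:
        """Deliver the backlog, then the payload; buffer the payload on failure."""
        try:
            self.flush(sender)
            sender(payload)
            return True
        except ConnectionError:
            self._append(payload)
            return False
```

In this sketch `sender` is any callable that raises `ConnectionError` when the network is down. A real pipeline would also need a size cap and a retention policy for the file, which are left out here.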
Capital One
Jun - Aug 2025

Software Engineer Intern

Debt Collection Services

Designed and deployed two serverless REST APIs in Python powering a new Debt Assistance tile across web and mobile.

  • Architected a Java Spring Boot microservice that interfaces with Capital One's internal Message Hub API to push real-time and batched settlement offers to customers, automating outreach
  • Created a Snowflake SQL + Streams/Tasks pipeline that stitches together customer, balance, and consent data across 3+ schemas, transforms it into JSON payloads, and streams them directly to Message Hub, shrinking payload-prep time by ≈ 20%
  • Designed and deployed two serverless REST APIs in Python (AWS Lambda + CDK) to power a new 'Debt Assistance' tile in Capital One's web & mobile apps
Python, Snowflake, AWS Lambda, CDK
Columbia University Irving Medical Center
Jun 2024 - Dec 2025

Data Engineer Intern

Genomic Data Pipelines

Built a full-stack tool unifying biomedical records from PubMed, GEO, ENA, dbGaP and BioStudies into a single searchable schema.

  • Engineered a full-stack tool that unifies study, sample, and assay records from PubMed, GEO, ENA, dbGaP, and BioStudies into a single searchable schema
  • Orchestrated an asynchronous ingestion layer (aiohttp + asyncio) and vectorized clustering logic, shrinking metadata harvest time from 60 min to < 8 min
  • Built a fault-tolerant ETL pipeline on AWS (EC2 + S3, IAM-scoped roles, CloudWatch alerts) that streams multi-GB FTP payloads directly into object storage
Python, ETL, PostgreSQL, AWS
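
The asynchronous ingestion pattern mentioned above can be sketched with `asyncio` alone. This is a minimal illustration, not the project's actual code: `harvest`, `fake_fetch`, the `max_concurrency` parameter, and the record ID format are hypothetical, and `fake_fetch` stands in for what would be an aiohttp request in a real harvester. It shows how a semaphore caps the number of requests in flight while `asyncio.gather` runs them concurrently.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple


async def harvest(
    record_ids: Iterable[str],
    fetch: Callable[[str], Awaitable[dict]],
    max_concurrency: int = 10,
) -> Dict[str, dict]:
    """Fetch every record concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(record_id: str) -> Tuple[str, dict]:
        # The semaphore keeps the harvester from flooding the remote service.
        async with semaphore:
            return record_id, await fetch(record_id)

    # gather() returns results in the same order as its inputs.
    pairs = await asyncio.gather(*(fetch_one(r) for r in record_ids))
    return dict(pairs)


async def fake_fetch(record_id: str) -> dict:
    """Stand-in for a network call (hypothetical; no real API is contacted)."""
    await asyncio.sleep(0.01)
    return {"id": record_id, "source": record_id.split(":")[0]}
```

A call such as `asyncio.run(harvest(["geo:1", "ena:2"], fake_fetch, max_concurrency=2))` returns a dict keyed by record ID. Because the waits overlap, total time tracks the slowest batch of requests and not the sum of all of them, which is the kind of saving the harvest-time bullet describes.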
LifeSci NYC
Sep 2024 - Jun 2025

Technical Campus Ambassador

Ambassador Program

Led mock technical interviews and workshops for students pursuing tech careers.

  • Conducted mock technical interviews and resume workshops for students pursuing tech careers
  • Provided actionable feedback on resumes and application materials to improve students' internship prospects
Technical Interviews, Mentoring, Workshops
NYC DOHMH
Jun - Aug 2022

Data Analyst Intern

COVID Outreach

Enhanced COVID-19 vaccination programs with data-driven strategies.

  • Optimized vaccine distribution for NYC DOHMH, using SQL & Excel models that reallocated doses 40% more efficiently
SQL, Excel, Data Analysis

Achievements & Leadership

Head of Newsletter

Queens College Code for All Club

Contributing to tech journalism and community engagement initiatives. Writing articles about emerging technologies and industry trends for the college community.

🎓

Computer Science Senior

Queens College, CUNY

Specializing in data engineering and software development with strong academic performance. Active participant in computer science research and projects.

🏆

Hackathon Winner

Multiple Competitions

Contributed to hackathon-winning projects including RefuConnect, demonstrating innovation and teamwork in high-pressure development environments.

Skills & Technologies

💻Languages

Go, Python, Java, TypeScript, C++

☁️Infrastructure

Kubernetes, AWS, Docker, Bazel

🗄️Databases

PostgreSQL, MongoDB, SQLite, Snowflake

🔧Tools

Git, Linux, CI/CD, Datadog