Saikiran Nuthakki

Data Engineer

Location: New York City, NY

About Me

I'm a data engineer with 4+ years of experience building large-scale data pipelines and designing medallion data architectures. Specializing in the adtech industry, I've designed and implemented efficient ETL processes for campaign, measurement, and attribution workflows, and I run data audits to optimize pipelines for performance and cost. I'm passionate about leveraging modern data engineering practices to solve complex business challenges and drive data-driven decision making.

Work Experience

May 2025 - Present

Trkkn (Acquired by Omnicom Media Group)

Data Engineer

I handle back-end development for an internal web application used for analytics and reporting by a Global 500 client, and actively develop and maintain DBT macros/models and Airflow DAGs for ETL processes across advertising workflows; a simplified sketch of this kind of DAG follows the tech list below.

  • Python
  • Bash
  • SQL
  • Airflow
  • GCP
  • BigQuery
  • DBT
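
For illustration only, here is a minimal sketch of how an Airflow DAG can trigger DBT model runs as part of an ETL workflow, in the spirit of the work described above. The DAG id, schedule, project path, and model selector are hypothetical placeholders, not the actual Trkkn setup.

```python
# Hypothetical sketch: an Airflow DAG that runs and then tests dbt models daily.
# The DAG id, paths, and selectors are placeholders for illustration only.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_reporting_refresh_sketch",   # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule_interval="0 6 * * *",           # daily at 06:00
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run_reporting_models",
        bash_command="dbt run --project-dir /opt/dbt --select reporting",
    )
    dbt_test = BashOperator(
        task_id="dbt_test_reporting_models",
        bash_command="dbt test --project-dir /opt/dbt --select reporting",
    )

    dbt_run >> dbt_test   # build the models first, then validate them with dbt tests
```
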
February 2022 - March 2025

LiveRamp

Data Engineer

  • Built a system of data pipelines to continuously append new users and remove inactive users for a Fortune 100 client's first-party ad server (see the illustrative DAG sketch after this role's tech list).
    • Built Airflow DAGs using Python, SQL, and Bash to orchestrate the batch processing and transformation of user and campaign data, filtering over 100 million active entities daily. This improved the client's targeting efficiency across 20+ campaigns, reaching an average of 40+ million users per campaign.
    • Created a comprehensive end-to-end logging framework, capturing detailed metadata for each stage of the pipeline to support monitoring, debugging, and auditing.
  • Developed scalable ETL pipelines for attribution workflows to process and deliver transaction data to partners daily.
    • Ingested over 50 million transaction data points and transformed them to meet partner API specifications so credit for conversions could be assigned across marketing channels (attribution).
    • The pipeline logged, processed, and delivered an average of 18+ million data points daily to partners (Google Ads, Facebook Ads, Criteo, Snapchat, and TikTok) through an automated Airflow DAG, ensuring accurate and timely partner reporting.
  • Optimized data processing pipelines by implementing more efficient job scheduling, preemptible VM usage, and autoscaling. This reduced idle time and maximized resource utilization, achieving a 20% reduction in overall GCP costs, saving approximately $350k annually.
  • Python
  • Bash
  • SQL
  • Airflow
  • GCP
  • BigQuery
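
For illustration only, here is a minimal Airflow DAG in the spirit of the batch orchestration described in the first bullet above. The task names, schedule, and filtering step are hypothetical placeholders rather than the actual client pipeline.

```python
# Hypothetical sketch of a daily extract -> filter -> deliver pipeline.
# All names, commands, and the schedule are placeholders for illustration.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def filter_active_entities(**context):
    # Placeholder for the transformation step; the real pipeline filtered
    # 100M+ active entities (e.g., via BigQuery jobs) and logged metadata
    # for each stage of the run.
    print(f"Filtering active entities for {context['ds']}")


with DAG(
    dag_id="user_refresh_pipeline_sketch",    # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    extract = BashOperator(
        task_id="extract_user_batches",
        bash_command="echo 'pull latest user and campaign batches'",
    )
    transform = PythonOperator(
        task_id="filter_active_entities",
        python_callable=filter_active_entities,
    )
    deliver = BashOperator(
        task_id="deliver_to_ad_server",
        bash_command="echo 'deliver refreshed segments to the ad server'",
    )

    extract >> transform >> deliver   # simple linear dependency chain
```
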
June 2021 - August 2021

Verizon

Strategy & Analytics Intern

  • Developed linear and logistic regression models in R to identify trends relating user performance to site interaction, supporting predictions of future spending habits (see the illustrative analogue after this role's tech list).
    • Visualized data in Tableau and presented findings to 8 stakeholders, including Directors and VPs, to support revising the existing incentive model and raising the target from $78M to $85M.
  • Cross-analyzed and clustered 4M data points to develop performance groups among Verizon's top 4 customer service groups; the groupings were used in research on response rates and resolution strategies.
  • R
  • Python
  • Machine Learning
  • Linear / Logistic Regression
  • K-Means Clustering
  • Tableau
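
For illustration only, here is a compact Python/scikit-learn analogue of the modeling described above (the original work was done in R). The dataset, file name, and column names are hypothetical placeholders.

```python
# Hypothetical sketch: linear regression for spend, logistic regression for a
# binary spend-increase flag. The CSV file and columns are placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("site_interactions.csv")                     # hypothetical data
features = df[["visits", "session_minutes", "pages_viewed"]]  # hypothetical features

# Linear regression: relate site-interaction features to spend.
X_train, X_test, y_train, y_test = train_test_split(
    features, df["spend"], test_size=0.2, random_state=0
)
spend_model = LinearRegression().fit(X_train, y_train)
print("Held-out R^2:", spend_model.score(X_test, y_test))

# Logistic regression: probability that a user increases spend next quarter.
increase_model = LogisticRegression(max_iter=1000).fit(features, df["will_increase_spend"])
print("Example probabilities:", increase_model.predict_proba(features.head(3)))
```
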
January 2020 - April 2020

The Walt Disney Company

Software Engineering Intern

  • Developed a back-end REST API in Java using the Spring Boot framework and a SQL Server database to upgrade an internal application for Disney's Labor Attendance and Time Team. Tested endpoints with Postman, used GitHub for source control, and deployed with Docker.
    • The application is used by 72 team members to make time and pay adjustments for 70,000 employees daily.
  • Java
  • Spring Boot
  • Microsoft SQL Server
September 2019 - January 2020

The Pennsylvania State University

Undergraduate Research Assistant

  • Worked on a funded project in the Aerospace Engineering Department researching and developing acoustically aware rotorcraft.
    • Updated an existing PostgreSQL data model to handle increased data collection and refactored data processing scripts in C++ to improve performance and scalability.
  • Python
  • PostgreSQL
  • C++

Education

2017 - 2021

The Pennsylvania State University, University Park

B.S. in Computational Data Science, College of Engineering

Minor: Mathematics, Eberly College of Science

Relevant Courses:

    • CMPSC 221: Object-Oriented Programming
    • CMPSC 410: Programming Models for Big Data
    • CMPSC 465: Data Structures and Algorithms
    • CMPSC 448: Machine Learning
    • CMPEN 454: Computer Vision
    • MATH 220: Matrices
    • MATH 230: Calculus III
    • MATH 484: Linear Programs
    • STAT 414: Probability Theory
    • DS 220: Data Management for Data Science
    • DS 300: Data Privacy and Security

Interests & Hobbies

Finance

Actively manage investments in stocks, bonds, and cryptocurrencies.

Crypto/Web3

Very bullish on global crypto adoption and web3 utilization in general.

Sports Analytics

Interested in the intersection of sports and data science.

Travel

Love exploring new places, cultures, and cuisines.

Photography

Capturing moments and scenes through photography. Enjoy both digital and film photography.

Fitness

Staying active through running, weightlifting, and recreational sports. Believe in maintaining a healthy body and mind.

Get In Touch

I'd love to hear from you. Let's connect!

Email

snuthakki99@gmail.com

Phone

+1 (510) 333-0193

Location

New York City, USA