Work Experience

Zoho

Zoho Corporation, Chennai

AI/Data Engineer — LLM ZLabs R&D
Jan 2024 – July 2025

  • Performed large-scale data categorization using Lilac clustering and benchmarked its scalability, identifying high time costs and false positives. Developed a rate-limited Google Classification API pipeline to produce reliable data–category pairs for downstream model training.
  • Built a Streamlit app using Hugging Face APIs to load, tokenize, and filter data, ingest metadata, generate recipes, and perform regex-based search.
  • Researched open-source data tools for LLMs. Generated hundreds of billions of tokens of Indic data by translating open-source English datasets with an internal CPU-based translation model, optimized with multithreading for faster processing.

HuggingFace Data Engineering LLM Lilac Streamlit
Manager: Prathima MR, AI Engineer (Team Lead), LLM ZLabs R&D

Zoho

Zoho Corporation, Chennai

ML/Deep Learning Engineer — ASR ZLabs R&D
July 2021 – Dec 2023

  • Integrated an FST-based Inverse Text Normalization (ITN) system into ASR post-processing using NVIDIA NeMo and added 4 custom grammar rules using Pynini. Compiled the grammars into FAR, reducing processing time by 140× and memory use by 80%.
  • Generated 5M+ synthetic spoken-written pairs for capitalization using OpenAI APIs and led a 4-member team for ITN data annotation.
  • Benchmarked F1 scores for internal punctuation and capitalization (punct-cap) models; refactored and retrained the punct-cap transformer encoder in PyTorch Lightning.
  • Generated transcriptions using Whisper to fine-tune an internal ASR model. Built & deployed a Flask inference demo as a Linux service.
  • Preprocessed ASR datasets, boosting training data by 16%. Enhanced ASR functionality with a torchaudio-based audio I/O module.
  • Built internal tools (pystratus, dvc-stratus) for dataset and model management, enabling secure cross-team adoption with OAuth and resource policies. Maintained ZWAF for OneAuth token validation in cross-team Zoho API communication.
  • Integrated ZWAF into FastAPI-based ASR web server middleware, securing cross-team usage.

ITN Pynini PyTorch NVIDIA NeMo Flask FastAPI DVC
Manager: Ananda Seelan Lakshmi Narasimhan, Senior Deep Learning Scientist, NVIDIA (formerly ML Engineer Team Lead, ASR ZLabs R&D)

Zoho

Zoho Corporation, Chennai

Project Trainee (Intern) — ASR ZLabs R&D
Jan 2021 – June 2021

  • Collected and preprocessed datasets for the ASR model, assisting peer ML engineers with data requirements and increasing the benchmark data suite by 83% (400 hours of audio). Added PyTorch IterableDataset classes for each dataset, with unit tests written in pytest.
  • Processed open-source datasets, using youtube-dl for YouTube data and Google ASR for synthetic transcriptions, and developed a Streamlit tool for audio recording. Organized team sessions to create a limited benchmark with real-time recordings.

Python PyTorch Dataset Pytest Streamlit Google ASR
Manager: Ananda Seelan Lakshmi Narasimhan, Senior Deep Learning Scientist, NVIDIA (formerly ML Engineer Team Lead, ASR ZLabs R&D)
Mentor: Raman Rajarathinam, ML Engineer, ASR ZLabs R&D

ONGC

ONGC, Chennai

Intern — Regional Computer Center
May 2019

  • Developed an issue-tracking system on ONGC's intranet, enabling issue creation, retrieval by ID, and resolution status tracking. Users could accept or reject a resolution to flag whether further assistance was needed.
  • Gained hands-on experience collaborating with a team of software engineers on a real-world product.

Python Web Development Intranet
Advisor: Shri B. Ravindranath, Chief Manager of Programming, ONGC Ltd.
Mentor: Shri Pruthvee Mamidikuduru, Deputy Manager of Programming, ONGC Ltd.