Ramesh Oswal

Personal Profile

I am currently pursuing my master's degree at Carnegie Mellon University. My areas of interest include data science, machine learning, big data, data analytics, and software development.


Carnegie Mellon University, Language Technologies Institute (School of Computer Science)

Master of Science - BIC Aug’16 - May’18

University of Pune

Bachelor of Engineering - Computer Engineering Aug’10 - May‘14

Coursework

Intro to Machine Learning - 10-601
Machine Learning for Large Datasets - 10-605
Machine Learning for Signal Processing - 11-755
Automation for Biological Research - 02-750


Machine Learning Engineer at Noble.AI

Jan’18 – May’18

Built an offline ML pipeline to extract and structure information from semi-structured .docx files as part of UIE, using transfer learning.
Worked on preprocessing the documents and creating client-facing visualizations of the unstructured documents presented to clients.
Created data visualizations for an R&D experiment dataset, surfacing issues such as variance in the dependent variable.
Built the first working MVP of an intelligent recommendation engine.
Tools Used: Python, sklearn, matplotlib, Luminoth, TensorBoard, Django

Real-Time Audio Event Detection on the Edge (RA with Prof. Yuvraj Agarwal, Synergy Labs)

Jan’18 – May’18

Built the entire ML and data pipeline from scratch; stages include feature extraction, feature engineering, hyperparameter tuning, etc.
Ran multiple experiments using classical ML algorithms such as logistic regression and SVMs to automatically detect audio events like a vacuum cleaner, a drill machine, or a running faucet.
Built a parallel pipeline that runs multiple experiments per label while tuning hyperparameters.
Performed data analysis to debug ML model performance using dimensionality reduction algorithms such as PCA.
Tools Used: Python, librosa, sklearn, Jupyter
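
As an illustration of the per-label tuning described above, a minimal sketch using scikit-learn's GridSearchCV, training one detector per audio event. The feature matrix, event names, and parameter grid are invented stand-ins, not the project's actual data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))           # stand-in for extracted audio features
labels = {                               # one binary detector per audio event
    "vacuum": (X[:, 0] > 0).astype(int),
    "drill":  (X[:, 1] > 0).astype(int),
}

models = {}
for name, y in labels.items():
    # n_jobs=-1 runs the cross-validation folds in parallel
    search = GridSearchCV(LogisticRegression(max_iter=1000),
                          {"C": [0.1, 1.0, 10.0]}, cv=3, n_jobs=-1)
    search.fit(X, y)
    models[name] = search.best_estimator_

for name, model in models.items():
    print(name, model.C, round(model.score(X, labels[name]), 2))
```

In the real pipeline each label's experiments would run against held-out audio rather than the training set; the loop shape stays the same.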

Speech Recognition using Wall Street Journal Data (Professor Bhiksha Raj)

Jan’18 – May'18

Used the WSJ dataset, labelled at the frame and phoneme level, to recognize unlabeled speech signals.
Built a 3-layer neural network on frame-level data, reaching 56% accuracy over 136 labels.
Built a 4-layer CNN on phoneme-level data, reaching 80% accuracy over 46 labels.
Preprocessed the data to handle issues such as variable-length phoneme representations for CNN inputs.
Built an end-to-end ASR system using the Listen-Attend-Spell architecture with the CMUSphinx language model.
Tools Used: TensorFlow, PyTorch, Python
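
The frame-level task above can be sketched compactly; the project used PyTorch/TensorFlow networks, but this stand-in uses scikit-learn's MLPClassifier on synthetic "frames" purely to show the shape of the problem (the sizes below are toy values, not the WSJ dimensions):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_frames, n_mel, n_phones = 600, 40, 5          # toy sizes
centers = rng.normal(scale=3.0, size=(n_phones, n_mel))
y = rng.integers(0, n_phones, size=n_frames)    # one phoneme label per frame
X = centers[y] + rng.normal(size=(n_frames, n_mel))  # one feature vector per frame

clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
clf.fit(X, y)
print("train accuracy:", round(clf.score(X, y), 2))
```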

Audio Forensic for Maritime Recognition (Carnegie Mellon University – Prof. Rita Singh and Prof. Bhiksha Raj)

Aug’17 – Dec’17

Built a system to automatically identify maritime audio signatures, such as boat and helicopter sounds, which can be used for hoax-call identification, solving criminal cases, etc.
Collected audio recordings from the YouTube-8M dataset using automated scripts that parse video descriptions.
Used feature representations such as constant-Q transforms, correlograms, and modulation spectrograms; also used a pretrained CNN to extract proxy features from the fully connected layer of the architecture.
Achieved 73% accuracy using decision trees and 77% using AdaBoost; also proposed a full end-to-end architecture that could support a more detailed analysis of sounds, such as the make/type of a helicopter or boat engine.
Tools Used: Python, sklearn, Spark, MATLAB
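
A minimal sketch of the decision-tree vs. AdaBoost comparison above; the features and labels here are synthetic stand-ins for the spectrogram/CNN features, and the reported 73%/77% figures come from the real data, not this toy setup:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
# stand-in binary labels (e.g. boat vs. helicopter) with an oblique boundary
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
boost = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("tree:", round(tree.score(X_te, y_te), 2),
      "adaboost:", round(boost.score(X_te, y_te), 2))
```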

Data Science Intern, Walmart Labs

June’17 – Aug’17

Worked on the Walmart Performance Ads team to optimize the model Walmart uses to display relevant ads.
Predicted the click-through rate (CTR) of ads using contextual information, resulting in an increase in revenue.
Performed feature engineering (e.g., binning, polynomial and logarithmic feature transforms), identified new features, and ran experiments to tune hyperparameters.
Tools Used: Python, Spark (MLlib), Scala, Hive, Cassandra, Weka
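
Illustrative versions of the transforms named above (binning, logarithmic, polynomial); the column names, values, and bin edges are invented examples, not Walmart features:

```python
import numpy as np

price = np.array([0.5, 3.0, 12.0, 45.0, 120.0])     # hypothetical prices
impressions = np.array([10, 100, 1000, 10000, 100000])

# Binning: map a continuous feature onto ordinal buckets
bins = np.array([1.0, 10.0, 50.0])
price_bucket = np.digitize(price, bins)              # bucket index 0..3

# Logarithmic transform: compress heavy-tailed counts
log_impressions = np.log1p(impressions)

# Polynomial transform: add a squared term
price_sq = price ** 2

print(price_bucket, np.round(log_impressions, 2), price_sq)
```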

Movie Recommendation System using MovieLens Dataset (Carnegie Mellon University)

May’17 - June’17

Used the matrix factorization technique to recommend movies to users, following the Netflix Prize winners’ strategy, on the MovieLens dataset with 1 million ratings as the training set.
Implemented the alternating least squares (ALS) optimization technique to minimize the RMSE objective function.
Performed experimental analysis to tune hyperparameters such as the rank K and the regularization weight lambda.
Tools Used: Spyder, Python (NumPy, matplotlib, SciPy)
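
The ALS loop above can be sketched in a few lines of NumPy: fixing one factor turns the other's update into a ridge-regression solve. The rating matrix, rank K, and lambda below are toy values (fully observed, unlike MovieLens), purely to show the alternation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, K, lam = 30, 20, 3, 0.1
R = rng.normal(size=(n_users, K)) @ rng.normal(size=(K, n_items))  # toy "ratings"

U = rng.normal(scale=0.1, size=(n_users, K))
V = rng.normal(scale=0.1, size=(n_items, K))
I = np.eye(K)

def rmse():
    return np.sqrt(np.mean((R - U @ V.T) ** 2))

start = rmse()
for _ in range(20):
    # Fix V and solve a regularized least-squares for U, then vice versa
    U = R @ V @ np.linalg.inv(V.T @ V + lam * I)
    V = R.T @ U @ np.linalg.inv(U.T @ U + lam * I)
print("rmse:", round(start, 3), "->", round(rmse(), 3))
```

With sparse observed ratings the solves are done per user/item over their observed entries, but the alternation is identical.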

Home Depot Product Search Relevance (Carnegie Mellon University)

May’17 - June’17

Performed feature engineering (e.g., cosine similarity, edit distance) using NLP techniques such as word embeddings on an unstructured dataset of product descriptions and attributes.
Performed data preprocessing such as stop-word removal, stemming, and typo correction before feature engineering.
Used machine learning algorithms such as random forest regression and linear regression to score each search query.
Tools Used: Python (NumPy, matplotlib); big data/distributed systems: Spark (PySpark), MongoDB
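
A minimal sketch of the cosine-similarity feature between a search query and product descriptions, using a simple bag-of-words (the project used word embeddings; the strings below are invented examples):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

query = "cordless power drill"
descriptions = [
    "18V cordless drill with two batteries",
    "garden hose 50 ft",
]

vec = CountVectorizer()
M = vec.fit_transform([query] + descriptions)       # row 0 = query
sims = cosine_similarity(M[0], M[1:]).ravel()       # one score per product
print([round(float(s), 2) for s in sims])
```

Each similarity score then becomes one input feature to the regressor scoring that (query, product) pair.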

Super Fridge: Automated Grocery List using Object Detection in a Refrigerator (CMU)


Built an application running on a Raspberry Pi that uses the camera module to detect objects in a refrigerator and create a grocery list of missing items.
Built modules that consume the Clarifai API for object detection on pictures taken with the Pi camera and push the grocery list to Google Drive for users.
Tools Used: Python, Raspberry Pi & camera, Clarifai API (object detection), Google Drive API

Musicon: Music Playback Based on User Activity Recognition (SteelHacks ’17)

24hr – Hackathon (Feb’17)

Built an Android app that uses the device’s accelerometer (motion sensor) data to determine user activity (brisk walk, jogging, sprint, standing, etc.).
Integrated the user-activity-recognition module with the Spotify API, playing songs based on the user’s activity and switching between them.
Tools Used: Android SDK, Java, accelerometer (motion sensor) API, Spotify API

Project Intern at Talencea Inc, Pittsburgh

Oct ’16 - May'17

Worked with a Pittsburgh-based startup founded by LTI Director Dr. Jaime Carbonell.
The 1st phase of the project involved working with big data from external sources, such as client and social media platforms, and building a skill repository.
The 2nd phase includes building a cognitive model that matches candidates with appropriate job openings.
Data munging activities include data cleanup, indexing, classification, redundancy removal, etc.
Technologies Used: Python, MS Excel, VBA, Informatica Siperian, PL/SQL

Image Classification of Proteins into Subcellular Localization Patterns (Carnegie Mellon University)

Aug’16 - Dec’16

Built an active learning framework containing a pool-based data access model, an uncertainty-based querying strategy, and different base learners: SVM, Gaussian Naive Bayes, KNN, and logistic regression.
Used the SelectKBest algorithm for feature selection.
Achieved an accuracy score of 0.97 on test data using SVM as the base learner.
Tools Used: Spyder, Python (sklearn, NumPy, matplotlib, SciPy)
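
A minimal sketch of the pool-based, uncertainty-driven querying loop above, with logistic regression as the base learner; the data is synthetic, not the protein-image features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Seed set with both classes represented; the rest forms the unlabeled pool
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(300) if i not in labeled]

for _ in range(20):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    # Least-confident strategy: query the point whose top-class probability is lowest
    idx = int(np.argmin(proba.max(axis=1)))
    labeled.append(pool.pop(idx))

print("labels used:", len(labeled),
      "pool accuracy:", round(clf.score(X[pool], y[pool]), 2))
```

Swapping the base learner (SVM, Gaussian NB, KNN) only changes the `clf` line; the querying loop is unchanged.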

Stock Price Prediction using Probabilistic Graphical Model (Carnegie Mellon University)

Aug’16 - Nov’16

Transformed stock prices into log space over the previous 5 days for each of 6 companies (Apple, MS, Hecla, NEM Mining, GM, Ford).
Created a precision matrix from the transformed features and marginalized it to handle missing data.
Was conclusively able to predict, with a minimal error rate, Apple’s stock price using only 3 days’ worth of data plus the stock prices of MS, Hecla, and NEM.
Tools Used: Spyder, Python (NumPy, SciPy)
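
The precision-matrix prediction above follows from a standard Gaussian identity: row i of the precision matrix gives the conditional mean of variable i given the rest. A sketch on synthetic correlated data (not real stock prices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 4
Z = rng.normal(size=(n, d))
X = Z @ rng.normal(size=(d, d)) * 0.1        # correlated "log-price" features

mu = X.mean(axis=0)
prec = np.linalg.inv(np.cov(X, rowvar=False))  # precision matrix

i, rest = 0, [1, 2, 3]
# Conditional mean: mu_i - (1 / prec_ii) * prec_{i,rest} @ (x_rest - mu_rest)
pred = mu[i] - (X[:, rest] - mu[rest]) @ prec[i, rest] / prec[i, i]
err = float(np.mean((X[:, i] - pred) ** 2))
print("conditional-mean MSE:", round(err, 4),
      "vs marginal variance:", round(float(X[:, i].var()), 4))
```

Marginalizing out a missing variable amounts to dropping it from the covariance before inverting, which is how the missing-data case above is handled.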

Linear and Forward Stagewise Regression on an Unknown Dataset (Carnegie Mellon University)

Aug’16 - Nov’16

Tools Used: Spyder, Python (NumPy, SciPy)

Paper Presentation: “A Cloud Framework for Parameter Sweeping Data Mining Application”

Jan ’13 – Feb ’13

Explained the system framework, i.e., its architecture and the execution mechanism by which parameter sweeping can be achieved in data mining applications.
Concluded with a performance evaluation with respect to clustering and classification algorithms.

Work Experience

Business Operation Associate at ZS Associates Inc.

Sept 2014 - June 2016

Automated processes such as loading client data and QCing client deliverables, and performed ad-hoc analyses.
Automated processes to reduce file-processing response time by over 80%.
Technologies Used: Python, MS Excel, VBA, Informatica Siperian, PL/SQL

Hackathon Winner at ZS Quest'15

Oct 2015 (24hr Hackathon)

Participated in and won Quest’15, organized by ZS Associates, which had 44 participating teams.
Created the product architecture, detailing communication between the different modules.
Implemented the “Thompson tau” method of outlier detection to detect outliers.
Technologies Used: R, MS Excel, VBA, and MS Access

Load Balancer for OpenFlow compliant SDN architecture (Sponsored by GS Labs Pvt. Ltd)

July ’13 - Jun ’14

Aimed at enhancing a software load balancer to distribute traffic based on server capacity by adding generic flows.
Based on the paper “OpenFlow-based Server Load Balancing Gone Wild,” published at the Hot-ICE ’11 workshop.
R. Oswal et al., “A Survey of Past, Present and Future of Software Defined Networking”
Tools Used: Mininet with the POX controller, Open vSwitch, and the OpenFlow protocol

Summer Intern at Softkoash Solutions Pvt Ltd

May ’12 – July ’12

Implemented Microsoft’s NerdDinner project as a proof of concept (POC).
Fixed bugs and made changes to a proprietary ERP solution used by customers in production.
Technologies Used: C#, Microsoft’s .NET Framework, HTML, CSS, and JavaScript

Key Skills

  • C, Core Java, Visual Basic, Python, PL/SQL
  • Oracle 9i and 10g, MySQL
  • Turbo C, Informatica Siperian, Excel, Anaconda, Eclipse, PyCharm
  • C++, R, JavaScript, SAS, MATLAB
  • MS Access
  • MS Visual Studio 2010 & 14, Jupyter Notebook


Achievements

  1. Won Quest ‘15 (Hackathon at ZS Associates India Office)
  2. 2nd Best Project - PICT’s “Impetus & Conceptus’14”
  3. 2nd Prize in College TechFest Event ‘Network Raptors’
  4. Best Project in Operations Excellence (ZS Associate Global Offices)