Hi! My name is Abhijit, and I'm currently pursuing my Master's in Computer Science at the University of Southern California, Los Angeles. I'm looking for full-time opportunities involving machine learning, natural language processing and software engineering after I graduate in Spring 2019.
Outside of computer science, my interests lie in music, aviation, and space exploration.
Student Research Assistant (NLP) @ Information Sciences Institute, Los Angeles
I’m currently working on a project about automatically replying to phishing emails in an attempt to waste the sender’s time.
Machine Learning Mentor @ Southern California Earthquake Center, Los Angeles
For the first 3 months, I mentored 10 undergraduates in machine learning as part of a new venture by my organization to include ML in its internship program. This involved giving weekly classes, preparing exercises, formulating machine learning tasks from a large dataset of synthesized earthquake data, and building models to predict the probabilities of earthquakes occurring on the San Andreas fault. In the remaining 3 months, I worked on techniques like ML diagnostics and on implementing better algorithms. The success of the program last year prompted the organization to run it again this year.
Web Developer @ Southern California Earthquake Center, Los Angeles
R&D Intern @ Amadeus Software Labs, Bangalore, India
I worked as an intern in the R&D division of the mobile solutions team at Amadeus Labs, Bangalore. I evaluated a few enterprise mobility platforms and developed a hybrid enterprise mobile application using IBM MobileFirst Platform. The application searches for flights and syncs data between devices. It is integrated with Facebook login, Parse application backend, and Amadeus services through REST APIs.
I've worked with a wide range of technologies in multiple fields: game development, mobile application development, web development and general software development.
I have 2+ years of practical experience in ML through Kaggle competitions, personal projects, mentorship, research and coursework as well as a strong theoretical foundation through various university and online courses. I've recently become active in the area of computer vision using deep learning.
I have 3+ years of experience in both traditional and machine-learning-based NLP tasks, including text classification, grammar development, parsing, and dialogue systems, backed by university and online courses.
I've been a Linux user for 5 years and am proficient at using, debugging and configuring Linux desktops and servers, and writing advanced Bash scripts and commands.
My projects fall under the following categories.
Machine Learning, Natural Language Processing, Web App Development, Android Development, Apache Spark, Game Development
A text classifier that determines which website on the Stack Exchange network a given question most likely belongs on. I created a dataset of around 300,000 questions using Stack Exchange APIs and web scraping. I experimented with several machine learning models to classify the data: bag-of-words models like Multinomial Naive Bayes and a fully connected neural network, as well as sequence models like an LSTM network, CNN, and SepCNN. Using a neural bag-of-words model, I achieved a top-3 accuracy of 90%.
Concepts: Text preprocessing, text classification, word embeddings, TF-IDF vectorization, bag-of-words models, sequence models
Technologies: Python, TensorFlow, scikit-learn
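For readers unfamiliar with the top-3 accuracy metric mentioned above, here is a minimal sketch of how it can be computed. It is not the project's actual evaluation code: the function name `top_k_accuracy` and the toy scores are made up for illustration, and the only assumption is that the model outputs one score per candidate site.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    k: int = 3,
) -> float:
    """Fraction of examples whose true label is among the k highest-scored classes."""
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    if not scores:
        raise ValueError("at least one example is required")

    hits = 0
    for row, label in zip(scores, true_labels):
        # Indices of the k classes with the highest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(scores)


# Toy scores for 4 questions over 5 hypothetical sites (made-up numbers).
toy_scores = [
    [0.50, 0.20, 0.15, 0.10, 0.05],  # true class 0 is ranked 1st
    [0.10, 0.15, 0.40, 0.30, 0.05],  # true class 1 is ranked 3rd
    [0.40, 0.30, 0.20, 0.06, 0.04],  # true class 4 is ranked 5th (a miss)
    [0.05, 0.25, 0.10, 0.35, 0.25],  # true class 3 is ranked 1st
]
toy_labels = [0, 1, 4, 3]

print(top_k_accuracy(toy_scores, toy_labels, k=3))  # 0.75
```

A prediction counts as correct when the true site appears anywhere in the model's three best guesses, which suits a setting where a question could plausibly fit several related sites.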
Using a synthetic dataset of simulated earthquakes, I created an anomaly detection model that predicts whether a major earthquake will occur on some section of the San Andreas fault in the next ten years with an F1-score of 84%.
Concepts: Data preprocessing, outlier detection, data visualization
Technologies: Python, scikit-learn, Pandas, Matplotlib
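As background on the F1-score quoted above, the sketch below shows how the metric is derived from binary predictions. It is an illustration only, not the project's code: the function name `f1_score_binary` and the toy labels are hypothetical, with 1 standing for "major earthquake" and 0 for "no major earthquake".

```python
def f1_score_binary(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive (rare) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed events

    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up labels for 8 fault sections.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 1, 0, 0, 1]

print(f1_score_binary(toy_true, toy_pred))  # 0.75
```

F1 is a common choice over plain accuracy when the positive class is rare, because a model that always predicts "no earthquake" would score high accuracy but an F1 of zero.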
This was a Kaggle competition about classifying questions posted on Quora (the Q&A site) as sincere or insincere. I learned and implemented several practical text preprocessing techniques that improve the performance of word embeddings. Using an LSTM network with TensorFlow/Keras, I finished in the top 31% of the competition.
Concepts: Text preprocessing, text classification, word embeddings, sequence models
Technologies: Python, TensorFlow
By participating in this competition, I learnt about the importance of data exploration and feature engineering, as well as how to handle different types of data in datasets, choose a machine learning model and preprocess data according to the model.
Concepts: Data preprocessing, data exploration, model selection
Technologies: Python, scikit-learn, Pandas
A chatbot that recommends mobile phones to users through conversation, by understanding user preferences described in natural English - for example, "I want a phone that has a really good camera and a lot of storage". I developed a grammar tailored to the expected types of user queries in order to identify their key parts.
Concepts: Dialog systems, grammars, part-of-speech tagging, regular expressions, sentiment analysis
Technologies: Python, NLTK, Dialogflow, Flask, Facebook Messenger Apps
Created an Android application using Java and a backend web service using Node.js to help users find activities to do around a given area or the user's location. It leverages the Google Places and Maps APIs. This was the final project for my Web Technologies (CSCI-571) class at USC.
Concepts: App development, software development, web services
A chatbot that runs on Facebook Messenger and helps users by answering common questions asked by French learners, like translations, verb conjugations, noun genders, and more.
Concepts: Dialog systems
Technologies: Python, Dialogflow, Flask, Facebook Messenger Apps
Some of the main concepts covered in this class were part-of-speech tagging, parsing, text classification, language modelling, sequence models and machine translation.
Learnt about the MapReduce programming model with Scala and Apache Spark, and algorithms to handle large volumes of data. The class gave me in-depth knowledge of concepts like PageRank, finding similar sets and frequent itemsets, community detection in large social graphs, mining data streams, recommender systems and clustering on massive amounts of data.
This class taught me the mathematical and theoretical details of several machine learning algorithms and familiarized me with implementing them from scratch using Python libraries like NumPy. I also learnt how to apply these algorithms to datasets.
Created a web-based English language spell checker that automatically suggests corrections for misspelt words as the user types into an input box. It uses Levenshtein (edit) distance to find nearby words. New words are added to the database of known words through crowd-sourcing and verification using the Merriam-Webster Dictionary API.
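The Levenshtein-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper `suggest`, its `max_distance` and `limit` parameters, and the small vocabulary are all hypothetical, and a real checker would likely index its word list rather than scan it linearly.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop

    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=2, limit=3):
    """Return up to `limit` known words within `max_distance` edits of `word`,
    closest first; ties are broken alphabetically."""
    scored = [(levenshtein(word, candidate), candidate) for candidate in vocabulary]
    close = sorted(pair for pair in scored if pair[0] <= max_distance)
    return [candidate for _, candidate in close[:limit]]


known_words = ["receive", "recipe", "relieve", "deceive", "banana"]
print(levenshtein("kitten", "sitting"))  # 3
print(suggest("recieve", known_words))
```

Note that plain Levenshtein distance counts a swapped letter pair as two edits, so "recieve" is closer to "relieve" (one substitution) than to "receive". A variant that treats transpositions as a single edit (Damerau-Levenshtein) would rank such typos differently.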
Created a natural-language chat bot that connects to the Internet Relay Chat (IRC) network as a user. It defines words and holds general conversations with other users on the network.
Downloads Instagram images from a public user profile.
M.S. Computer Science @ University of Southern California, Los Angeles
Classes: CSCI567 - Machine Learning; CSCI544 - Natural Language Processing; INF553 - Foundations and Applications of Data Mining; CSCI570 - Analysis of Algorithms; CSCI585 - Database Systems; CSCI572 - Information Retrieval and Web Search Engines; CSCI571 - Web Technologies
B.Tech. Computer Science and Engineering @ VIT University, Vellore, India
Classes: Soft Computing; Data Warehousing and Data Mining; Cloud Computing; Software Engineering; Internet and Web Programming
Machine Learning by Stanford University on Coursera
Deep Learning Specialization by deeplearning.ai on Coursera