Ketan Akade

M.S. (Computer Science)
University of Southern California
Contact: +1 213-810-7377
Email:
Address: 320 Crescent Village Cir, Apt #1245
San Jose, CA 95134

About Me
I am an adventure-loving person of numbers and algorithms. My first encounter with computers was long ago, when I was 12 or 13, and it was mainly for playing games! It all started with games, then writing a few simple 'C' programs, and eventually growing an immense interest in this omnipresent machine. That is when I decided to learn more about it.

With this intention in mind, I got admission to one of the best undergraduate schools under the University of Pune: Pune Institute of Computer Technology. I spent a really wonderful part of my educational career there, learning about computers, databases, networking, web technologies, different programming languages, and what not!
After finishing my undergraduate studies, I decided to gain some industry experience. I joined 'Great Software Laboratory Pvt. Ltd.', a product-based company in Pune, India. Later, I also worked at Soft Corner.

Currently I am pursuing my Master's in Computer Science at the University of Southern California and am really looking forward to making the best use of this opportunity.

Current Objective
To achieve professional as well as personal satisfaction and growth through challenging work by applying my expertise and knowledge in the field of Data Science and Computer Engineering.

Technologies and Research Interests
Data Science, Databases, Web Domain and Vedic Maths

Programming Languages: C, C++, Java
Web Technologies: HTML, PHP, XML, CSS, JavaScript, JSP, Perl, RDF, SPARQL
Web Frameworks: PHP Yii framework, Twitter Bootstrap, Spring MVC framework
Databases: MySQL, Oracle
Web Servers: Apache, Tomcat
Operating Systems: UNIX/Linux, Windows, Ubuntu
Tools: Git, Tortoise SVN

Spoken Languages
Proficient (speaking, reading, and writing) in English, Hindi and Marathi.


University of Southern California, Los Angeles, California.

Master of Science, Department of Computer Science
August 2012 - Present

Course Work:
Fall 2012
CS 570: Analysis of Algorithms under Prof. Aaron Cote
CS 571: Web Technologies under Prof. Marco Papa
CS 450: Introduction to Computer Networks under Prof. Ali Zahid

Spring 2013
CS 561: Foundations of Artificial Intelligence under Prof. K. Narayanaswamy
CS 572: Information Retrieval and Web Search Engines under Prof. Ellis Horowitz
CS 402: Operating Systems under Prof. Bill Cheng

Fall 2013
CS 548: Information Integration on the Web under Prof. Craig Knoblock and Prof. Pedro Szekely
CS 585: Database Systems under Prof. Shahriar Shamsian
Directed Research under Prof. Craig Knoblock

Pune Institute of Computer Technology, University of Pune, India.

Bachelor of Engineering, Department of Information Technology
August 2006 - June 2010

Work Experience

Yahoo!, Sunnyvale, CA.

Technical Intern
June 2013 - August 2013

University of Southern California, HSC!

Web Developer
September 2012 - April 2013

Soft Corner Pvt. Ltd.

Software Developer
March 2012 - July 2012

Great Software Laboratory Pvt. Ltd.

Member of Technical Staff (MTS) / Software Developer
July 2010 - May 2011

In this assignment, I searched the given dataset for a given list of keywords. The dataset contains over 6,000 declassified documents that were scanned from paper and made available via a digital content management system. The occurrences of the keywords were determined by a parser that takes a PDF file and extracts its textual content using Tika; regular expressions are then used to find the keywords in this text.

Using Tika, finding occurrences of keywords was an easy task, since it took only a few function calls. However, Tika might not extract fully formed text from every document, depending on the quality of the OCR scanning used. I encountered a wide variety in the extracted text, depending on the nature of the original files: some had clean, embedded text, some had OCR errors, and some were filled with OCR noise. This is expected given the unsanitized nature of the "realistic" dataset.
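The keyword-matching step can be sketched as below. This is a minimal illustration, not the assignment's actual code: it assumes the PDF text has already been extracted (Tika handled that part in the assignment), so a plain string stands in for the extracted content.

```java
import java.util.*;
import java.util.regex.*;

// Count keyword occurrences in already-extracted document text.
public class KeywordCounter {
    public static Map<String, Integer> countOccurrences(String text, List<String> keywords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String kw : keywords) {
            // \b...\b so e.g. "agent" does not match inside "agents"
            Matcher m = Pattern.compile("\\b" + Pattern.quote(kw) + "\\b",
                    Pattern.CASE_INSENSITIVE).matcher(text);
            int n = 0;
            while (m.find()) n++;
            counts.put(kw, n);
        }
        return counts;
    }
}
```

Word-boundary anchors matter on noisy OCR text; without them, short keywords match inside unrelated longer words far too often.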

Learning Points:
Tika is a content analysis and detection toolkit. It provides a set of Java APIs for MIME type detection, language identification, and integration of various parsing libraries.
Characteristics of Tika:
  • Ability to uniformly extract and present metadata.
  • It is scalable, and hence advantageous on large datasets.
  • Integrating third-party libraries is very easy.
My goal in this assignment was to download, install, and leverage the Apache Nutch web crawling framework to obtain all of the PDFs from a tiny subset of the FBI's Vault website that we had mirrored, and then to package those PDFs to create my vault.tar.gz file.

Configuring Nutch to perform this task was one of the trickiest parts of the assignment. I had to pay particular attention to the RegexURLFilter required to crawl the site, and to the crawl command arguments/parameters.
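That URL filter lives in Nutch's regex-urlfilter.txt, and a fragment for a crawl like this might look as follows. This is only a sketch: the host and path are hypothetical stand-ins, since the pattern depends entirely on where the Vault subset was mirrored.

```
# regex-urlfilter.txt sketch -- host "example.edu" and path "/vault/"
# are hypothetical; substitute the actual mirror's host and path.

# skip URLs with query strings and session-style characters
-[?*!@=]

# skip common non-content file types
-\.(gif|jpg|png|css|js)$

# accept only pages and PDFs under the mirrored Vault subset
+^http://example\.edu/vault/
```

Order matters: Nutch applies the rules top to bottom and uses the first `+`/`-` pattern that matches, so the exclusions come before the final accept rule.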
In this assignment, I:
  • Set up the Apache Solr search engine technology, which allows me to index the PDFs from vault.tar.gz and annotate them with geographical location information
  • Leverage the GeoNames dataset, which maps place names and geographic locations to latitudes and longitudes, in order to associate each of the PDF files with a single (approximate) location of interest.
  • Load the PDFs along with their associated geolocation information into Solr, making them available for searching.
  • Develop a web page that enables text search of the PDF files. The web page integrates Google Maps to plot locations of the search results on a map.
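The location-association step above can be sketched roughly as follows. This is a toy illustration under stated assumptions: the map of place names to coordinates is a tiny stand-in for the GeoNames dataset, and "most frequently mentioned place wins" is one simple heuristic for picking a single approximate location; the real pipeline then stores the result in Solr.

```java
import java.util.*;
import java.util.regex.*;

// Pick one approximate location for a document by counting mentions of
// known place names (the map is a tiny stand-in for GeoNames).
public class GeoTagger {
    public static String bestLocation(String text, Map<String, double[]> placeNames) {
        String best = null;
        int bestCount = 0;
        for (String name : placeNames.keySet()) {
            Matcher m = Pattern.compile("\\b" + Pattern.quote(name) + "\\b",
                    Pattern.CASE_INSENSITIVE).matcher(text);
            int count = 0;
            while (m.find()) count++;
            if (count > bestCount) { bestCount = count; best = name; }
        }
        return best; // null if no known place name occurs
    }
}
```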
This was a part of my Operating Systems course assignment.
In this assignment, we implemented a module that handles processes, threads, mutexes, condition variables, and synchronization primitives for the Weenix OS (UNIX-based).
We also developed modules that are responsible for:
  • Process Management
  • Virtual Memory Management
  • File System operations of the Weenix kernel.
The project proposed a solution for running applications that demand higher processing capabilities on thin clients, by making use of cloud instances and the Internet. Using VNC (Virtual Network Computing), only the application interface was rendered on the client, giving the end user the experience of a locally installed system. The prototype could render Linux- or Windows-based applications, running on EC2 instances, on clients running Windows or Linux. Users could seamlessly switch among multiple applications with a rich user experience, as if the applications were running locally.

Keywords: Cloud Computing, Amazon EC2, VNC, Java, Socket programming, Shell scripting
This Android-based mobile app aimed to bring two visually impaired people together within the same room without using GPS. It was a submission in the 48-hour code challenge at SS12, held at USC. We used the Wi-Fi signal strength, along with an FFT to calculate the amplitude of a sound wave, to estimate the distance between the two devices.
Our team secured the runner-up position.

Keywords: Java, Servlets, Android
This Android-based mobile app is an IMDb and Facebook mashup using AJAX, Java Servlets, Perl scripts, and JSON; it queried data from the IMDb database, and the results obtained were then posted on Facebook.

Technologies: Java, Servlets, Android, Perl, AJAX, JSON
This was an assignment in my Operating Systems course. I implemented the token-bucket algorithm from computer networks using multi-threading. I also used mutexes and system signals for synchronizing and communicating between the different threads.
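The token-bucket idea can be sketched as below. Note the assumptions: the original assignment was in C with threads, mutexes, and signals, while this single-threaded Java version injects the clock as a parameter so that refills are deterministic; it shows only the core algorithm, not the concurrency.

```java
// Minimal token-bucket sketch: tokens refill at a fixed rate up to a
// capacity; a request succeeds only if enough tokens are available.
public class TokenBucket {
    private final long capacity;
    private final double tokensPerSecond;
    private double tokens;
    private long lastRefillMillis;

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity;       // start with a full bucket
        this.lastRefillMillis = 0L;
    }

    // Try to take n tokens at time nowMillis; returns false if not enough.
    public boolean tryConsume(int n, long nowMillis) {
        double refill = (nowMillis - lastRefillMillis) / 1000.0 * tokensPerSecond;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = nowMillis;
        if (tokens >= n) {
            tokens -= n;
            return true;
        }
        return false;
    }
}
```

In the threaded variant, the refill and the consume would run in separate threads, with a mutex guarding the token count, which is where the assignment's mutexes and signals come in.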

Keywords: C, Operating Systems, Mutex, Signals, multi-threading
This tool gathers data about the performance of Yahoo! products in all locales from several sources. It analyzes the data on different performance factors such as page views, unique users, and time spent by users. It then provides a comprehensive view to the end user and also provides intelligence for making product-related business decisions.

Keywords: PHP, Yii framework, JavaScript, jQuery, YUI, cURL, HTML, Twitter Bootstrap, Pure CSS, MySQL, yapache
We developed a mashup idea where a user could upload any image to a dashboard. The image was then parsed using the IQ Engines APIs to generate relevant tags from it. These tags were then used to fetch data relevant to the image from different sources, such as Yahoo! News, Amazon item prices, and related Flickr images.
Thus, it provides centralized access to diverse data for a given image. The idea was implemented at a Yahoo! Hack Day event.

Keywords: javascript, IQ Engines API, Amazon API, Yahoo! search APIs, Flickr APIs
Project-X is a social networking platform that binds students, colleges, and industry in the context of projects. I was part of the team that designed and developed it.

Technologies: PHP Yii Framework, MySQL, Neo4j, Java, Memcache, Twitter Bootstrap, jQuery
Powabunga is an auction-based e-commerce website.
Users can sell items at their intended price and can also participate in auctions at minimum cost. It also implements a virtual currency that is used to place bids in an auction.
I was responsible for developing several of its modules and for handling the MySQL backend.

Technologies: Java, Spring MVC, JavaScript, MySQL, JSP, Apache
I presented this paper as part of the course CSCI 572: Information Retrieval and Web Search Engines. The paper focuses on algorithms for finding near-duplicate web pages using different approaches.

Finding Near Duplicate Web Pages
Gave a seminar on phishing attacks, and on ways and precautions to tackle them, during my undergraduate studies.
  • Runners-up in the Hack-a-thon mobile application development competition organized by USC ACM at USC in 2013.
  • Runners-Up at 'Concepts 2010' - National level project competition at PICT, Pune, under 'User Application' domain for Cloud Me Project.
  • Ranked 18th in the State Merit List in the SSC examination, University of Pune. (2003-04)

  • Won the best-organized team prize 'Bhagirath Karandak' in Purushottam Karandak, 2007 and 2008.