1247 W 30th St, Apt #307

Los Angeles, CA 90007

Long Chen

(310) 806-2810




Senior Software Engineer

Baidu International Product Department

July 2012 – July 2014

Baidu IME (Input Method Editor for Japanese) and SIMEJI

·   Processed terabytes of log data and web corpora on Hadoop to generate IME models; improved accuracy from 88% to 92% by combining a user-behavior model with the language model.

·   Substantially improved the conversion precision of SIMEJI on mobile platforms by implementing a personalized conversion algorithm.

·   Built a training process for IME models that reduced iteration time (from a week to a day).

Software Engineer Internship

Baidu NLP Department

July 2011 – June 2012

·   Improved both recall and precision of the CRF-based Japanese morphological analyzer (MeCab) by expanding the training corpus and adding rules.

·   Filtered pornographic words from the IME by implementing a pornographic-word detection algorithm.

Software Engineer Internship

China Telecom & Lucent Joint Venture

July 2009 – June 2010

·   Developed a web-based file-system management system.

·   Designed and implemented a data synchronization module based on Web Services protocols.


Los Angeles, CA

University of Southern California

Fall 2014 – 2016 (Expected)

·   M.S. in Computer Science, August 2014. GPA: 3.5 (First Semester)

·   Graduate Coursework: Analysis of Algorithms, Multimedia System Design, Design of User Interface Software.

Harbin, China

Harbin Institute of Technology

July 2006 – June 2012

·   M.Eng. in Software Engineering, graduated July 2014. GPA: 3.8.

·   Graduate Coursework: Natural Language Processing (A+); Information Retrieval and Practices (A); Data Warehouse and Data Mining (A+); Numerical Analysis; Advanced Database System.

·   B.S.E. in Software Engineering, graduated May 2010. GPA: 3.6.

·   Undergraduate Coursework: Operating Systems (A+); Databases; Java Language (A+); Probability Theory and Mathematical Statistics (A); Algorithms; Software Quality Assurance and Testing; Engineering.

Technical Experience


·   Created a new method to improve the clustering of IME entries based on the N-pos language model, further improving the accuracy of kana-kanji conversion.

·   The paper “Using Collocations and K-means Clustering to Improve the N-pos Model for Japanese IME (Input Method Editor)” was accepted by COLING 2012 as a long paper.


·   Personalization Model of IME (2014). Designed and built a personalized kana-kanji conversion system that substantially improved accuracy across different applications. Python, Hadoop, Data Mining.

·   News Corpus Mining (2014). Web crawler that mined 5+ major media outlets in Japan and accumulated more than 5 GB of news text, improving IME conversion precision on business phrases. Python, Spider.

·   Multi-User Drawing Tool (2012). Collaborative platform where multiple users can view and simultaneously draw on a “chalkboard”, with each person’s edits synchronized. C++, Qt.

·   FTP Search Engine (2010). Crawled and indexed FTP files (file names and content) using the Lucene library. Designed and implemented search suggestions using search logs. Java, Lucene, BFS, Ajax.


·   First-Grade Patent Prize (2014): three first-grade patents at Baidu Inc.

·   Applied for more than 20 patents in the fields of IME (input method editor) and wearable devices.

Languages and Technologies

·   Mandarin, English, Japanese.

·   Python, Hadoop, R, Linux, AWK, SAS, C++, Java, J2EE, SQL.

·   Agile methodology, Git.