Statistical Learning (Featured Bioinformatics Topic on Kaggle)
Predicting Diabetes Incidence for the Pima Indian Dataset, SFU Canada
Sept 2017 – Dec 2017
- Explored a range of statistical learning methods, including Generalized Additive Models (GAM), Gradient Boosting Machines (GBM), Support Vector Machines (SVM), Random Forests (RF) and Logistic Regression
- Used the VIM package in R to visualize missingness patterns and imputed missing values with Multivariate Imputation by Chained Equations (MICE)
- Built the best model as an ensemble of GAM, GBM and SVM, achieving 80.6% average test accuracy, comparable to state-of-the-art models
Deep Learning (Natural Language Processing)
Aspect Based Sentiment Analysis using Deep Neural Networks, SFU Canada
Jan 2017 – May 2017
- Analyzed the sentiment of a product review with respect to a given aspect of the product using a Deep Memory Network (DMN)
- Achieved higher test accuracy than the state-of-the-art neural-network-based model in 3-class sentiment classification (Positive/Negative/Neutral)
- Test accuracy for Restaurant Data (3041 training, 100 test): 84.8% vs. 77.2% (state of the art)
- Test accuracy for Laptop Data (3045 training, 100 test): 73.44% vs. 68.9% (state of the art)
Machine Learning (Computer Vision)
Fingerprint Liveness Detection using Neural Networks, SFU Canada
Sept 2016 – Dec 2016
- Developed neural network models to classify real and fake fingerprint images (2000 training images: 1000 real, 1000 fake; 2500 test images: 1000 real, 1500 fake)
- Implemented a multi-layer perceptron, a CNN and a model built on input features extracted with local image descriptors such as BSIF and WLD
- Applied PCA for dimensionality reduction, which improved test accuracy by ~9% for all models
- The best model achieved a test accuracy of 99% and an ACE score of 1.1 (ACE, average classification error, is the metric used by the LivDet competition)
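For reference, the ACE metric mentioned above averages two per-class error rates: the percentage of live fingerprints classified as fake and the percentage of fake fingerprints classified as live. The following is a minimal Python sketch of that calculation; the function name, the label encoding (1 = live, 0 = fake) and the toy labels are assumptions for illustration, not the project's code or data.

```python
def average_classification_error(y_true, y_pred):
    """Return ACE in percent. Assumed label encoding: 1 = live, 0 = fake."""
    # Split the predictions by the true class of each sample.
    live_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    fake_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not live_preds or not fake_preds:
        raise ValueError("need at least one live and one fake sample")

    # Percentage of live samples misclassified as fake.
    ferr_live = 100.0 * sum(1 for p in live_preds if p == 0) / len(live_preds)
    # Percentage of fake samples misclassified as live.
    ferr_fake = 100.0 * sum(1 for p in fake_preds if p == 1) / len(fake_preds)

    return (ferr_live + ferr_fake) / 2.0


# Toy labels (hypothetical): one of four live samples is misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
ace = average_classification_error(y_true, y_pred)
```

Because the two error rates are averaged with equal weight, ACE is not skewed by an imbalanced test set such as the 1000 real / 1500 fake split above.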
Theoretical Computer Science (Design and Analysis of Algorithms)
Online Randomized Algorithms, HKU Hong Kong and SFU Canada
Aug 2015 – Jan 2016, Sept – Dec 2016
- Studied the design of competitive online algorithms using the primal-dual approach and applied it to analyse the RANKING algorithm for the online bipartite matching problem
- The research idea developed at HKU was further investigated by Dr. Huang’s group
- Re-explored the problem as a course project at SFU and received a mark of 100% for the project
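As an illustration of the algorithm analysed above, RANKING draws a single random permutation of the offline vertices up front and then matches each arriving online vertex to its highest-ranked neighbour that is still unmatched. The sketch below is a minimal Python version; the function name, the argument layout and the toy graph are assumptions for illustration, not the project's code.

```python
import random


def ranking_matching(offline, arrivals, seed=None):
    """Run RANKING on an online bipartite matching instance.

    offline  -- iterable of offline vertices, known in advance
    arrivals -- list of (online_vertex, neighbours) pairs in arrival order
    Returns a dict mapping each matched online vertex to its offline partner.
    """
    rng = random.Random(seed)

    # Draw one uniformly random permutation of the offline side.
    order = list(offline)
    rng.shuffle(order)
    rank = {u: i for i, u in enumerate(order)}

    taken = set()   # offline vertices already matched
    matching = {}   # online vertex -> offline vertex

    for v, neighbours in arrivals:
        free = [u for u in neighbours if u in rank and u not in taken]
        if free:
            # Pick the unmatched neighbour with the smallest rank.
            u = min(free, key=rank.__getitem__)
            taken.add(u)
            matching[v] = u
    return matching


# Toy instance (hypothetical): "x" can take "a" or "b", "y" can only take "a".
arrivals = [("x", ["a", "b"]), ("y", ["a"])]
result = ranking_matching(["a", "b"], arrivals, seed=0)
```

On this toy instance the outcome depends on the permutation: if "b" is ranked first, both online vertices are matched; otherwise "x" takes "a" and "y" stays unmatched. Averaging over the random permutation is what yields the known expected competitive ratio of 1 - 1/e.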