Statistical Learning (Featured Bioinformatics Topic on Kaggle)
Predicting Diabetes Incidence for the Pima Indian Dataset, SFU Canada
Sept 2017 – Dec 2017

  • Explored several statistical learning methods, including Generalized Additive Models (GAM), Gradient Boosting Machines (GBM), Support Vector Machines (SVM), Random Forests (RF), and Logistic Regression
  • Used the VIM package in R to visualize missingness patterns and applied Multivariate Imputation by Chained Equations (MICE) to impute missing values
  • Built the best model as an ensemble of GAM, GBM, and SVM, achieving 80.6% average test accuracy, comparable to state-of-the-art models

Deep Learning (Natural Language Processing)
Aspect Based Sentiment Analysis using Deep Neural Networks, SFU Canada
Jan 2017 – May 2017

  • Analyzed the sentiment of a product review with respect to a given aspect of the product using a Deep Memory Network (DMN)
  • Achieved higher test accuracy than the state-of-the-art neural-network-based model in three-class sentiment classification (positive/negative/neutral)
  • Test accuracy on Restaurant data (3041 training, 100 test): 84.8% vs. 77.2% (state of the art)
  • Test accuracy on Laptop data (3045 training, 100 test): 73.44% vs. 68.9% (state of the art)

Machine Learning (Computer Vision)
Fingerprint Liveness Detection using Neural Networks, SFU Canada
Sept 2016 – Dec 2016

  • Developed neural network models to classify real and fake fingerprint images (2000 training images: 1000 real and 1000 fake; 2500 test images: 1000 real and 1500 fake)
  • Implemented architectures including a multi-layer perceptron, a CNN, and a model using input features extracted with local image descriptors such as BSIF and WLD
  • Applied PCA for dimensionality reduction, which improved test accuracy by ~9% for all models
  • The best model achieved a test accuracy of 99% and an ACE of 1.1 (Average Classification Error, the metric used by the LivDet competition)
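
For reference, a minimal sketch of how ACE is commonly computed for LivDet-style evaluation: the mean of the error rate on live samples and the error rate on fake samples, in percent. The function name, the label convention (1 = live, 0 = fake), and the toy labels are assumptions for illustration only, not the project's code or data.

```python
def average_classification_error(y_true, y_pred):
    """Return ACE in percent, assuming labels 1 = live and 0 = fake."""
    live_total = sum(1 for t in y_true if t == 1)
    fake_total = sum(1 for t in y_true if t == 0)
    if live_total == 0 or fake_total == 0:
        raise ValueError("need at least one live and one fake sample")

    # Live samples misclassified as fake, and fake samples misclassified as live.
    live_errors = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fake_errors = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    ferr_live = 100.0 * live_errors / live_total
    ferr_fake = 100.0 * fake_errors / fake_total
    return (ferr_live + ferr_fake) / 2.0


# Toy example (hypothetical labels): 1 of 4 live and 2 of 4 fake misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
ace = average_classification_error(y_true, y_pred)
```

Because the two error rates are averaged with equal weight, ACE is not skewed by an unbalanced test set the way plain accuracy can be.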

Theoretical Computer Science (Design and Analysis of Algorithms)
Online Randomized Algorithms, HKU Hong Kong and SFU Canada
Aug 2015 – Jan 2016, Sept – Dec 2016

  • Studied the design of competitive online algorithms using the primal-dual approach and applied it to analyze the RANKING algorithm for the online bipartite matching problem
  • The research idea developed at HKU was further investigated by Dr. Huang’s group
  • Revisited the problem as a course project at SFU and received a grade of 100%
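
For context, a minimal sketch of the RANKING algorithm itself (Karp, Vazirani, and Vazirani): offline vertices are given a uniformly random priority order up front, and each arriving online vertex is matched to its highest-priority unmatched neighbor. The function name, the input format, and the toy graph are assumptions for illustration; the primal-dual analysis is not reproduced here.

```python
import random


def ranking_matching(offline, arrivals, seed=None):
    """Run RANKING on a bipartite graph revealed online.

    offline:  iterable of offline vertex ids, known in advance
    arrivals: list of (online_vertex, neighbors) pairs in arrival order
    Returns a dict mapping each matched offline vertex to its online partner.
    """
    rng = random.Random(seed)
    order = list(offline)
    rng.shuffle(order)  # uniformly random permutation of offline vertices
    rank = {u: i for i, u in enumerate(order)}  # lower value = higher priority

    matched = {}
    for v, neighbors in arrivals:
        free = [u for u in neighbors if u in rank and u not in matched]
        if free:
            # Match to the unmatched neighbor with the best (lowest) rank.
            best = min(free, key=rank.__getitem__)
            matched[best] = v
    return matched


# Toy instance (hypothetical): "x" can take "a" or "b"; "y" can only take "a".
arrivals = [("x", ["a", "b"]), ("y", ["a"])]
result = ranking_matching(["a", "b"], arrivals, seed=0)
```

On this instance the matching has size 2 when "b" outranks "a" and size 1 otherwise. In expectation over the random order, RANKING is (1 − 1/e)-competitive.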