Looking back on my experience as a mathematics teaching assistant, tutor, and research assistant at Michigan State University (MSU), I am impressed by how widely mathematics is applied in all walks of life, especially in data science. My motivation to understand advanced data representation and analysis grew considerably after I participated in several machine learning research projects. I have therefore decided to apply for the Master of Science in Data Science at the University of Michigan, Ann Arbor, to consolidate my statistical and computational skills and lay a solid foundation for future doctoral research in this area.
As early as my sophomore year, I began working as a mathematics tutor at the Math Tutoring Center at MSU, where I answered questions in calculus, linear algebra, and abstract algebra for college students of all ages. I worked through examples to help them understand complex mathematical concepts, which sharpened my spoken English and problem-solving abilities and prepared me to join the Undergraduate Mathematics Research Center at MSU as a research assistant.
Since August 2018, I have been working with Prof. Ilya Kachkovskiy and Dr. Shiwen Zhang on research into the Hamilton Landscape for Schrödinger Operators with General Hopping Terms on a Finite Lattice, and Landscape Theory for Tight-Binding Hamiltonians. Building on the original proof of landscape theory, I used power series expansions to study whether the theory applies to other matrices. The research topics were truly difficult, and new proof ideas were often hard to find. When we ran into obstacles, I acted as a bridge between the Chinese exchange students and the American undergraduates, sharing prototype arguments so that we could judge together whether an idea was feasible and worth pursuing further.
Our results showed that the theory applies to most symmetric matrices, and that the landscape function can be used to obtain 2D and 3D spectrum graphs of these matrices. More excitingly, one of our research papers has been accepted by **** and will be delivered in *****. Through more than two years of research, I strengthened my research skills and critical thinking and, more importantly, learned how to collaborate effectively with teammates and professors on concrete projects, which will help me adapt quickly to graduate research at the University of Michigan, Ann Arbor (UM).
In my current study of mathematics, I have focused increasingly on topics related to machine learning. From September to December 2019, I worked on the project “Machine Learning with Missing Data based on TensorFlow and Scikit Learn” under the guidance of Prof. Taps Maiti. The main objective was to investigate machine learning tools and imputation methods for data that are partially missing or contaminated. To simulate missing data, I randomly deleted between 20% and 95% of the x values in the dataset, in 5% increments. I then imputed the missing values with three of the most efficient methods, namely mean imputation, k-nearest neighbors (KNN), and hot deck imputation, and trained and tested neural network and random forest models on the completed datasets to compare their accuracy.
Finally, by comparing the changes in mean squared error (MSE), I identified the best-performing imputation methods and machine learning models for numerical missing data. As machine learning remains in the spotlight and will keep improving as its applications widen, I will continue working on optimizing algorithms and dataset training methods to obtain better results.
Besides research, I also sought to develop my mathematical analysis and programming skills in professional settings and to test the accuracy of different models under real-world conditions. In the early summer of 2019, at Guotai Jun’an Securities, I took part in a project to reproduce the ideal reversal factor model using data on 3,000 A-shares, including their trading volumes, fixture numbers and checking purchase prices. I used Python to reproduce the new ‘W-cut’ algorithm proposed in the financial model, ran all the stock data through it to obtain a set of stock data with the new ideal reversal factor, and presented the resulting data as charts.
Within a month, I built a more efficient and accurate financial model in Python, which replaced the previous MATLAB implementation; both present the calculation results as diagrams. Although my upgrade received high praise from my supervisor, I still recall how powerless I felt when processing massive amounts of data. I realized that more advanced techniques would be key to improving my performance, which strengthened my desire to apply for the Master of Science in Data Science at UM.
Upon completion of my master’s degree, I hope to pursue Ph.D. research. From Professor Cobrry Dirk, I learned about UM’s strong research background in data science and its rigorous teaching style, which convinced me that applying for the Master of Science in Data Science at UM is the right choice for me at this point. Courses such as Data Structures for Scientists and Engineers and Introduction to Combinatorics would strengthen my ability to identify relevant datasets and apply suitable statistical tools to address practical problems. I am also interested in the Capstone courses, which would polish my research and paper-writing skills and benefit my future doctoral application.