Research Papers

Document Type

Conference Paper


Understanding the factors that influence the choice of a STEM major is important for developing effective strategies to increase participation in STEM fields and meet the growing demand for skilled workers. This research is based on nationally representative data from 25,206 students surveyed in the High School Longitudinal Study of 2009 (HSLS:09), which follows students from 9th grade through their postsecondary studies. First, we use machine learning to predict which students will opt for a STEM major. Then we use interpretable ML tools, such as SHAP values, to investigate the key factors that influence students' decisions to pursue a college STEM major. We identify the students who will later choose a STEM major with a relatively high degree of accuracy: our CatBoost classifier achieves an AUC score of 0.791. Moreover, by interpreting the model, we find that having a science or math identity, as well as demographic characteristics such as gender and race, plays an important role in the decision to pursue a STEM major. For example, Asian students are more likely, and female students less likely, to consider a STEM major. On the other hand, we also find that gender and race do not influence students' science or math identity.
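To make the reported AUC score easier to read, the following is a minimal sketch of how an AUC can be computed from predicted scores, using its standard interpretation as the probability that a randomly chosen positive case (here, a student who chose a STEM major) receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the paper or from HSLS:09, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from HSLS:09): 1 = chose a STEM major, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.4, 0.5, 0.2, 0.8]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a score of 0.791 means the model ranks a STEM chooser above a non-chooser in roughly 79% of such pairs. The pairwise loop is quadratic in the number of examples and is meant only for illustration.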


Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.