Use of Khan Academy and Mathematics Achievement

Updated March 18, 2025 21:52

Implementation

In the summer of 2017, the Long Beach Unified School District (LBUSD) partnered with Khan Academy to support pilot teachers in implementing Khan Academy in the classroom. A total of 89 pilot teachers volunteered to integrate Khan Academy into their lessons for at least 30 minutes per week during the 2017-18 school year. Khan Academy provided pilot teachers with face-to-face professional development to deepen their knowledge of how to implement Khan Academy in the classroom. During the professional development session, teachers were presented with several uses of Khan Academy, such as guided practice, review, and homework, and they had full autonomy to choose the model that best met their needs. Khan Academy also provided LBUSD pilot teachers with ongoing virtual support from our staff, including high-priority helpdesk support.

Research Design

In this study, we focused on middle school mathematics and explored how the use of Khan Academy in the classroom relates to student achievement on standardized assessments. Of the 89 pilot teachers, 49 were middle school mathematics teachers. These teachers taught at 20 different schools and had a total of 5,348 students in their classes. We analyzed associations between Khan Academy usage data from these pilot teachers and their students, and the students’ Smarter Balanced Assessment mathematics scores from the California Assessment of Student Performance and Progress (CAASPP).

District Profile¹

Large urban district with 72,000 students located in Southern California
65% socioeconomically disadvantaged and 15% English learners
57% Hispanic students, 12% African American students, and 12% White students

Key findings

After statistically controlling for students’ prior achievement and their demographic characteristics, using Khan Academy for more than 30 minutes a week is associated with a statistically significant difference of +22 points on the Smarter Balanced Assessment mathematics scale score, compared with students who did not use Khan Academy.

This difference equates to a positive difference of 0.20 standard deviation units, which is practically meaningful in education research.
The results also held true regardless of race/ethnicity, gender, eligibility for free/reduced lunch, and English learner status.
On average, students performed better than expected on district-established targets.

¹http://www.lbschools.net/District

Association between use of Khan Academy and 2018 Smarter Balanced Assessment mathematics score, compared to students with no use
After statistically controlling for students’ prior achievement and demographic characteristics

Results

In this pilot, we recommended that teachers implement Khan Academy in their classes for 30 minutes a week. However, teachers and their students ultimately decided how much time was spent using Khan Academy, and usage varied across the students in our sample. To understand how different levels of Khan Academy use relate to achievement, we segmented the usage data into four categories reflecting students’ average weekly use throughout the school year: no use (0 minutes/week), low use (<15 minutes/week), medium use (15-30 minutes/week), and recommended use (30+ minutes/week). Fourteen percent (n=748) of the students were in the no use category, 58% (n=3,092) were in the low use category, 17% (n=926) were in the medium use category, and 11% (n=582) were in the recommended use category.
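The segmentation described above can be sketched in a few lines of Python. This is only an illustration, not the study's actual code: the function name `usage_category` is hypothetical, and because the brief describes the top category both as "30+ minutes" and as "more than 30 minutes", the sketch assumes that exactly 30 minutes counts as recommended use. The counts are the ones reported above, and the sketch checks that they sum to the 5,348 students in the sample and round to the reported percentages.

```python
def usage_category(avg_minutes_per_week: float) -> str:
    """Map a student's average weekly minutes to one of the four usage categories.

    Assumption (not stated in the brief): exactly 30 minutes/week is treated
    as "recommended use", and exactly 15 minutes/week as "medium use".
    """
    if avg_minutes_per_week < 0:
        raise ValueError("average weekly minutes cannot be negative")
    if avg_minutes_per_week == 0:
        return "no use"
    if avg_minutes_per_week < 15:
        return "low use"
    if avg_minutes_per_week < 30:
        return "medium use"
    return "recommended use"


# Student counts per category, as reported in the brief.
counts = {
    "no use": 748,
    "low use": 3092,
    "medium use": 926,
    "recommended use": 582,
}

total = sum(counts.values())
# Share of the sample in each category, rounded to whole percentages.
shares = {name: round(100 * n / total) for name, n in counts.items()}

print(total)
print(shares)
```

Keeping the category boundaries in a single function makes it easy to rerun the analysis under the other reading of the 30-minute boundary.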

We conducted a regression analysis with 2018 Smarter Balanced Assessment mathematics scaled scores as the outcome variable and Khan Academy usage as the intervention of interest. We controlled for 2017 Smarter Balanced Assessment mathematics scaled score, mathematics course, gender, race/ethnicity, eligibility for free/reduced lunch, and English learner status². Our analyses indicated that students who used Khan Academy for more than 30 minutes a week (the recommended usage time) scored about 22 points higher³ on the mathematics portion of the 2018 Smarter Balanced Assessment than students in the no use category, who did not use Khan Academy. To put the +22-point difference in context, LBUSD established targets for the mathematics portion of the Smarter Balanced Assessment, and on average, students performed better than expected on those district targets.
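To make the structure of such a model concrete, here is a minimal sketch of a regression with indicator (dummy) variables for the usage categories, with "no use" as the reference group and the prior-year score as a control. It is not the study's model: it omits the other covariates and the teacher-level effects, the helper names (`design_row`, `solve`, `ols`) are hypothetical, and the data are synthetic and noise-free, with arbitrary planted coefficients chosen only so that the recovery can be checked exactly. None of the numbers in the code are study results.

```python
from fractions import Fraction

# Usage levels; the first one ("no use") is the reference category.
USAGE_LEVELS = ["no use", "low use", "medium use", "recommended use"]


def design_row(prior_score, usage):
    """Intercept, prior-year score, and one 0/1 indicator per non-reference level."""
    indicators = [1 if usage == level else 0 for level in USAGE_LEVELS[1:]]
    return [1, prior_score] + indicators


def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination in exact arithmetic."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def ols(rows, outcomes):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcomes)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic toy data (NOT study data): arbitrary planted effects, no noise.
PLANTED = {"no use": 0, "low use": 5, "medium use": 10, "recommended use": 20}
students = [(prior, usage) for usage in USAGE_LEVELS for prior in (2400, 2500, 2600)]
rows = [design_row(prior, usage) for prior, usage in students]
outcomes = [250 + Fraction(9, 10) * prior + PLANTED[usage] for prior, usage in students]

coefficients = ols(rows, outcomes)
names = ["intercept", "prior_score"] + USAGE_LEVELS[1:]
for name, value in zip(names, coefficients):
    print(f"{name}: {float(value)}")
```

In a model of this form, the coefficient on the "recommended use" indicator is the expected difference in the outcome between a recommended-use student and a no-use student with the same prior-year score, which is how the reported 22-point difference should be read.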

This difference is equivalent to an effect size of 0.20, which is considered substantial in education research⁴. In practical terms, this means that if the typical student who did not use Khan Academy scored at the 50th percentile, the typical student who used Khan Academy for more than 30 minutes per week would score at the 58th percentile. Our analyses also indicated that these results hold true regardless of race/ethnicity, gender, eligibility for free/reduced lunch, or English learner status.
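The percentile interpretation follows from the standard normal distribution: a student 0.20 standard deviations above the mean of a normally distributed comparison group sits at roughly the 58th percentile of that group. The sketch below reproduces this arithmetic with Python's standard library. It assumes normally distributed scores, and the standard deviation it prints is back-calculated from the rounded figures in this brief (22 points and 0.20), not a value reported by the study.

```python
from statistics import NormalDist

effect_size = 0.20      # standardized difference reported in the brief
point_difference = 22   # scale-score difference reported in the brief

# Percentile of a score 0.20 SD above the mean of a standard normal distribution.
percentile = NormalDist().cdf(effect_size) * 100

# Score standard deviation implied by the two rounded figures (an inference,
# not a reported value): effect size = point difference / standard deviation.
implied_sd = point_difference / effect_size

print(f"percentile: {percentile:.1f}")
print(f"implied SD: {implied_sd:.0f} points")
```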

These results are positive and promising. However, given the correlational research design, we cannot conclude that Khan Academy specifically causes these results. Factors that we were not able to account for in our analyses may be driving them. Future studies with a more rigorous research design may provide better insight into whether time spent on Khan Academy is causing these results. Additionally, these findings may not generalize broadly to other districts, different groups of teachers, or alternative Khan Academy implementation models, particularly since teachers volunteered to participate in the pilot.

Want to learn more?

This research brief provides high-level findings from this study. A full technical report will be published later this year with all of the details of our methodology. If you would like to be notified when the technical report is available, please contact efficacy@khanacademy.org.

²We additionally controlled for teacher variability by specifying regression models that included all student-level covariates plus either teacher fixed effects or teacher random effects. The findings regarding use of Khan Academy were consistent across the models with and without teacher-level effects. See the technical report for full modeling details.
³Statistically significant (β_Recommended_use = 21.5, t = 7.2, p < 0.001). See the technical report for full modeling details.
⁴Kraft, M.A. (2018). Interpreting Effect Sizes of Education Interventions. Brown University Working Paper.