A comparison of lessons planned by different publicly available large language models in the context of physical education: an expert survey

Meixner B, Tristram C, Schranner M, Kenner A, Serwe-Pandrick E, Sperlich B, Düking P (2026)

Publication Language: English

Publication Type: Journal article, Online publication

Publication year: 2026

Journal

Frontiers in Education (Frontiers Media S.A.)

DOI: 10.3389/feduc.2026.1765699

Abstract

Introduction:

Large Language Models (LLMs) have the potential to assist teachers, particularly in lesson planning. The quality of lessons generated by various LLMs remains unexplored.

Methods:

We investigated the quality of lesson plans generated by different LLMs, using the basketball layup as an example and surveying experts in the field. A prompt was submitted to three LLMs (GPT-4o, Claude Sonnet, and Google Gemini). Twenty-eight predefined quality criteria were used to evaluate the lesson plans. Teaching experts rated the plans on 5-point Likert scales. A Friedman test was conducted to identify differences in quality among the lesson plans.
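As a rough illustration of the kind of analysis described above, the following minimal sketch runs a Friedman test for a single criterion, with each expert treated as a block and the three lesson plans as repeated measures. It is not the authors' analysis code: the function names and the example ratings are hypothetical and invented purely for demonstration, and the p-value shortcut assumes exactly three plans (chi-square with 2 degrees of freedom).

```python
import math


def average_ranks(row):
    """Rank one rater's scores (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_test(ratings):
    """Tie-corrected Friedman test for exactly three related samples.

    ratings: list of rows, one per rater, each holding three scores
    (one per lesson plan). Returns (statistic, p_value).
    """
    n, k = len(ratings), len(ratings[0])
    if k != 3:
        raise ValueError("this sketch only handles three lesson plans")
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(r[j] for r in ranked) for j in range(k)]
    stat = 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)
    # Tie correction: sum of (t^3 - t) over groups of tied scores within each row.
    ties = sum(row.count(v) ** 3 - row.count(v) for row in ratings for v in set(row))
    correction = 1 - ties / (n * k * (k * k - 1))
    if correction == 0:
        # Every rater gave identical scores to all plans: no evidence of a difference.
        return 0.0, 1.0
    stat /= correction
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2)


if __name__ == "__main__":
    # Hypothetical 1-5 Likert ratings for ONE criterion; columns are three plans.
    example = [[3, 3, 4], [2, 3, 3], [3, 4, 4], [3, 3, 3], [4, 3, 5]]
    statistic, p_value = friedman_test(example)
    print(f"chi2 = {statistic:.3f}, p = {p_value:.3f}")
```

In a study with 28 criteria, a test like this would presumably be repeated once per criterion; whether and how any correction for multiple comparisons was applied is not stated in the abstract.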

Results:

The most frequent median rating across all lesson plans was “acceptable” (3 on a 1–5 Likert scale), accounting for 64 of the 84 median ratings (28 criteria × 3 lesson plans). For most criteria (26 out of 28), no group differences were observed between the lesson plans by Claude, Gemini, and GPT-4o.

Discussion:

LLMs are capable of generating basketball layup lessons of acceptable quality; however, these require review and refinement by experienced teachers. The LLMs investigated here showed no differences for most of the evaluated criteria. While LLMs can provide valuable starting points, teachers need to acknowledge their limitations and tailor the lessons accordingly.

Authors with CRIS profile

Benedikt Meixner, Department Sportwissenschaft und Sport (GB Lehre)

Clara Tristram, Lehrstuhl für Sportwissenschaft mit der Ausrichtung Gesundheitsförderung/Public Health/Sozialwissenschaften des Sports

Alessandra Kenner, Zentrum für Lehr-/Lernforschung, -innovation und Transfer, Abteilung FBZHL

Involved external institutions

Technische Universität Braunschweig, Germany (DE)

Julius-Maximilians-Universität Würzburg, Germany (DE)

How to cite

APA:

Meixner, B., Tristram, C., Schranner, M., Kenner, A., Serwe-Pandrick, E., Sperlich, B., & Düking, P. (2026). A comparison of lessons planned by different publicly available large language models in the context of physical education: an expert survey. Frontiers in Education. https://doi.org/10.3389/feduc.2026.1765699

MLA:

Meixner, Benedikt, et al. "A comparison of lessons planned by different publicly available large language models in the context of physical education: an expert survey." Frontiers in Education (2026).
