Evaluating Language Teacher Training Through the Kirkpatrick Model: A Comprehensive Framework for In-Service and Pre-Service Development

The Kirkpatrick Model in ELT

AI-generated picture by Prof. Jonathan Acuña Solano in September 2025

Introductory Note to the Reader

As someone who served as a teacher supervisor many years ago, I have long reflected on how teacher training can generate effects that truly endure over time. More recently, working as a coach for pre-service teachers during their practicum as part of a university course has provided me with a fresh perspective: training must not only prepare teachers for the classroom but also foster long-lasting practices that contribute to institutional growth.

The Kirkpatrick Model, though not originally designed for language education, appears adaptable to the supervision and evaluation of language teachers. However, its actual effectiveness in this field can only be confirmed when supervisors begin testing and refining its application within their specific contexts. This paper explores how the model can guide teacher training evaluation and open new avenues for accountability, growth, and meaningful institutional impact.

Abstract

This paper examines the application of the Kirkpatrick Model to evaluate language teacher training programs for both in-service and pre-service educators. The model’s four levels—reaction, learning, behavior, and results—provide a comprehensive framework for assessing teacher professional development. Drawing on studies across educational contexts, the paper argues that systematic evaluation strengthens institutional accountability, enhances pedagogical practices, and ensures that training leads to measurable improvements in student learning. The adaptability of the model makes it a valuable tool for language education, though its effectiveness must be tested by supervisors in real classroom contexts.

Keywords: Kirkpatrick Model, Teacher Training, Supervision, Language Education, Professional Development, Evaluation, Institutional Improvement

Introduction

Language teaching is a dynamic field that requires teachers to adapt continuously to evolving pedagogical standards, linguistic trends, industry requirements, and learner needs. Professional development, for both in-service and pre-service teachers, is essential to maintaining instructional quality and achieving linguistic goals. However, the effectiveness of such teacher training must be rigorously evaluated. The Kirkpatrick Model, developed by Donald Kirkpatrick in 1959, remains one of the most widely used frameworks for evaluating training programs across sectors, including education (Kirkpatrick, 1959).

James D. Kirkpatrick emphasized that “training has little value unless what is learned gets applied on the job, and the subsequent on-the-job performance contributes to key organizational outcomes” (Kirkpatrick, 2025, p. 24). This principle is particularly relevant in language education, where the ultimate goal is improved student learning and classroom effectiveness. In other words, training whose application is never measured by supervisors or mentors is unlikely to lead to teacher improvement or accountability.

The Kirkpatrick Model: Structure and Relevance

The Kirkpatrick Model evaluates training across four levels:

1. Reaction – How participants feel about the training.

2. Learning – What participants have learned.

3. Behavior – How learning is applied in practice.

4. Results – The broader impact on organizational goals.

Alsalamah and Callinan (2021) argue that the model “helps evaluators to conceptualize the assessment of learning outcomes of training programmes with metrics and instruments” (p. 2). Its adaptability and clarity make it particularly suitable for educational contexts, where teacher observation, lesson delivery, planning, and reflective teaching journals can all provide evidence of teacher professional development. Guskey (2000) further emphasized that “evaluation is not an afterthought to training, but rather is meant to be integrated into the entire learning and development process” (p. 5). As noted above, having instructors attend training sessions focused on specific areas of improvement without measuring the impact and application of that training is futile.
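The four levels and the kinds of evidence this paper associates with them can be sketched as a small data structure. The Python sketch below is illustrative only and is not part of the Kirkpatrick Model itself: the names `KirkpatrickLevel`, `INSTRUMENTS`, and `missing_levels` are hypothetical, the instrument lists simply collect examples mentioned in this paper, and the satisfaction survey listed for Level 1 is an assumption based on the description of participant satisfaction.

```python
from enum import IntEnum


class KirkpatrickLevel(IntEnum):
    """The four evaluation levels, in the order the model lists them."""
    REACTION = 1
    LEARNING = 2
    BEHAVIOR = 3
    RESULTS = 4


# Example instruments per level (illustrative, gathered from this paper).
INSTRUMENTS = {
    KirkpatrickLevel.REACTION: ["satisfaction survey"],  # assumption, not from a cited source
    KirkpatrickLevel.LEARNING: ["quiz", "simulation", "reflective journal"],
    KirkpatrickLevel.BEHAVIOR: ["mentor observation", "teaching journal", "video analysis"],
    KirkpatrickLevel.RESULTS: ["student learning gains", "principal survey"],
}


def missing_levels(collected):
    """Return the levels for which no evidence has been collected yet.

    `collected` maps a KirkpatrickLevel to a list of evidence items;
    a missing key or an empty list both count as "no evidence".
    """
    return [level for level in KirkpatrickLevel if not collected.get(level)]


if __name__ == "__main__":
    # Hypothetical evidence gathered after one training cycle.
    evidence = {
        KirkpatrickLevel.REACTION: ["post-workshop survey"],
        KirkpatrickLevel.LEARNING: ["quiz on formative rubrics"],
    }
    for level in missing_levels(evidence):
        options = ", ".join(INSTRUMENTS[level])
        print(f"Level {level.value} ({level.name.title()}): consider {options}")
```

A supervisor could use a structure like this to notice when evaluation stops at the first two levels, which is the situation the paper warns against.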

Level 1: Reaction

Level 1: Reaction assesses participants’ immediate responses to the training they have just experienced. In language teacher education, this includes satisfaction with the training content, the delivery methods used by trainers, and the relevance of the training to classroom practice, beginning with the very next class participants must teach.

Alsalamah and Callinan (2021) found that positive reactions among head teachers correlated with higher engagement and motivation to apply learned strategies. Similarly, Mat Yusoff, Abdul Rahim, and Yaacob (2016) reported that in a study of 1,200 teachers, “the assessment on reaction… was on average at a high level,” indicating strong initial acceptance of the training program (p. 23). Level 1 evaluation should confirm that instructors agree with what was addressed in the training sessions and believe that applying it thoroughly can boost student learning.

Malik and Asghar (2020), in their evaluation of early childhood education (ECE) teacher training, noted that affective and utility reactions were strong indicators of perceived training quality. James D. Kirkpatrick noted, “If formal training components are planned and executed well, resulting in strong Level 1 Reaction, it is very likely that Level 2 Learning will take place” (Kirkpatrick, 2025, p. 39). A strong Level 1 therefore prepares fertile ground for the next step.

Level 2: Learning

Level 2: Learning measures the knowledge, skills, and attitudes that trainees acquire. In language teacher training, this might involve assessments of pedagogical knowledge, linguistic proficiency, and classroom management strategies. At this level, however, teachers have not yet shown evidence that what they now know is being applied in the classroom or is producing better outcomes among learners.

Alsalamah and Callinan (2021) emphasized the use of concrete metrics such as quizzes, simulations, and reflective journals to assess learning outcomes. Mat Yusoff et al. (2016) found that reaction variables contributed significantly to learning outcomes: “21.7% to knowledge, 19.4% to skills, and 17.2% to attitudes” (p. 24). Metrics are indeed necessary for teachers to demonstrate how their attitudes and teaching practices are being reshaped and are gradually shifting toward something more student-oriented and communicative.

Guskey (2002) proposed a backward planning model that begins with desired student outcomes and works backward to design effective professional learning. He argued that “professional development must be planned with the end in mind—what students are expected to learn” (p. 46). Kirkpatrick (2025) reinforced this by stating, “Evaluation must be built into the training process from the beginning, not added as an afterthought” (p. 27). In short, training has to be planned around what teachers must demonstrate in the classroom and how to apply it correctly, so that students can better assimilate content and use it meaningfully.

Level 3: Behavior

Level 3: Behavior refers to the transfer of learning into one’s lesson planning and teaching practice. For in-service or pre-service language teachers, this includes implementing new instructional strategies to help learners assimilate new content, using assessment tools such as formative rubrics to guide student learning, and engaging with students through class activities, especially when providing feedback for improvement.

The Educator Diversity framework (2025) outlines how mentor observations, teaching journals, and video analysis can be used to assess behavioral change in pre-service teachers. Alsalamah and Callinan (2021) found that behavior change was most evident when training was supported by follow-up coaching and peer collaboration. Once again, if training does not include a follow-up component, it is bound to be fruitless for the teacher, the students, and the organization where the instructors work.

In addition, Malik and Asghar (2020) emphasized that social support and motivation to transfer training were critical factors influencing behavioral change. Kirkpatrick (2025) stated, “Organizations that reinforce the knowledge and skills learned during training with accountability and support systems can expect as much as 85 percent application on the job” (p. 45). If accountability is not part of the post-training process, the 85 percent application cited by Kirkpatrick is unlikely to materialize across all teachers, since the expected changes are not compulsory.

Level 4: Results

The final level, Results, evaluates the broader impact of training, such as student achievement, teacher retention, and institutional improvement. In this area, Alsalamah and Callinan (2021) demonstrated that training programs evaluated through all four levels yielded measurable improvements in school leadership and teacher performance. The Educator Diversity framework (2025), furthermore, recommends using student learning gains and principal surveys to assess the effectiveness of newly trained teachers. To sum up, language instructors must demonstrate what they have learned and implemented in the classroom, and supervisors must measure whether the specific changes in behavior expected in the classroom are being applied correctly.

Moreover, Guskey (2002) argued that “the ultimate goal of professional development is improved student learning, and evaluations must reflect that priority” (p. 47). Kirkpatrick (2025) concluded, “If training evaluation shows that on-the-job performance increased and results improved, then training effectiveness has occurred” (p. 44).

Implications for Language Teacher Education

Applying the Kirkpatrick Model to both in-service and pre-service language teacher training provides a comprehensive evaluation process for both types of instructors. Evaluating training in this way allows institutions to identify strengths and weaknesses in lesson planning and teaching, justify investments in specific areas of teacher development, and align training with educational goals that support students’ language learning.

Kirkpatrick (1959) emphasized that “training programs should be evaluated to determine whether or not they should be continued” (p. 25). Moreover, integrating evaluation into the design phase enhances the relevance and effectiveness of training initiatives. If the way teachers are being trained proves neither compelling nor practical, abandoning it may be the wisest decision. Guskey (2000) advocated a systemic approach: “Evaluation must be part of a continuous improvement cycle, not a one-time event” (p. 6). A cycle that does not hold language teachers accountable for correct implementation falls short of that aim.

James D. Kirkpatrick concluded regarding his model of professional development and training that “The power is in connecting the levels, not keeping them separate” (Kirkpatrick, 2025, p. 51).

Conclusion

The Kirkpatrick Model provides a valuable framework for evaluating language teacher training programs. By systematically applying its four levels, educators, supervisors, curriculum developers, and the academic staff of an institution can ensure that professional development leads to meaningful learning, behavioral change, and improved educational outcomes. The model’s adaptability and clarity make it a powerful tool for enhancing teacher education and ultimately student success.

References

Alsalamah, A., & Callinan, C. (2021). Adaptation of Kirkpatrick’s Four-Level Model of Training Criteria to Evaluate Training Programmes for Head Teachers. Education Sciences, 11(3), 116. https://doi.org/10.3390/educsci11030116

Educator Diversity. (2025). Supporting Evidence of Teacher Candidate Development: Kirkpatrick Model. https://www.educatordiversity.org/wp-content/uploads/2025/08/Supporting-Evidence-of-Teacher-Candidate-Development.pdf

Guskey, T. R. (2000). Evaluating Professional Development. Corwin Press.

Guskey, T. R. (2002). Does it make a difference? Evaluating professional development. Educational Leadership, 59(6), 45–51.

Kirkpatrick, D. L. (1959). Techniques for Evaluating Training Programs. Journal of the American Society of Training Directors, 13, 21–26.

Kirkpatrick, J. D. (2025). Kirkpatrick's Four Levels of Training Evaluation. Bookey. https://www.bookey.app/book/kirkpatrick%27s-four-levels-of-training-evaluation/quote

Malik, S., & Asghar, M. Z. (2020). In-Service Early Childhood Education Teachers’ Training Program Evaluation Through Kirkpatrick Model. Journal of Research and Reflections in Education, 14(2), 259–270. https://www.researchgate.net/publication/349179171

Mat Yusoff, M. S., Abdul Rahim, A. F., & Yaacob, M. J. (2016). The Kirkpatrick Model: A Useful Tool for Evaluating Training Outcomes in Higher Education. Education in Medicine Journal, 8(3), 19–26.

Training Proposal - Enhancing Language Teaching Competencies Using the Kirkpatrick Model by Jonathan Acuña

Evaluation Questions for Supervisors

Instructions: The following questions are designed to help supervisors assess the applicability of the Kirkpatrick Model in their institution. Use them to identify strengths, weaknesses, and areas for improvement in teacher training programs.

1. Reaction – Are teachers satisfied with the training content and delivery methods, and do they perceive it as relevant to their classroom practice?

2. Learning – What measurable evidence shows that teachers have acquired new knowledge, skills, or attitudes during training?

3. Behavior – Are teachers implementing newly learned strategies in their lesson planning and classroom teaching? Provide examples.

4. Follow-up Support – What forms of coaching, mentoring, or peer collaboration are in place to reinforce the transfer of training into practice?

5. Institutional Alignment – How well does the training align with the institution’s language learning goals and curriculum standards?

6. Results – What concrete impact does teacher training have on student learning outcomes and overall classroom performance?

7. Accountability – What mechanisms ensure that teachers consistently apply what they learned in training?

8. Sustainability – How can the institution ensure that changes brought about by training endure over time rather than fade after initial implementation?

9. Adaptability – How can the Kirkpatrick Model be modified to suit the unique needs of language teaching supervision in your institution?
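For supervisors who prefer to keep their answers in a structured form, the nine question areas above could be tracked with a simple record. The sketch below is one possible approach, not a prescribed instrument: `AREAS`, `SupervisorReview`, `record`, and `unanswered` are hypothetical names, and the sample answers are invented for illustration.

```python
from dataclasses import dataclass, field

# The nine question areas, in the order they appear in the questionnaire.
AREAS = [
    "Reaction", "Learning", "Behavior", "Follow-up Support",
    "Institutional Alignment", "Results", "Accountability",
    "Sustainability", "Adaptability",
]


@dataclass
class SupervisorReview:
    """Hypothetical record of one supervisor's answers, keyed by question area."""
    answers: dict = field(default_factory=dict)

    def record(self, area, answer):
        """Store an answer for a known area; reject areas not in the questionnaire."""
        if area not in AREAS:
            raise ValueError(f"Unknown area: {area}")
        self.answers[area] = answer.strip()

    def unanswered(self):
        """Areas with no answer or only a blank answer, in questionnaire order."""
        return [area for area in AREAS if not self.answers.get(area)]


if __name__ == "__main__":
    review = SupervisorReview()
    # Invented sample answers, for illustration only.
    review.record("Reaction", "Post-session surveys show teachers found the content relevant.")
    review.record("Behavior", "Two observed lessons used formative rubrics.")
    print("Still to address:", ", ".join(review.unanswered()))
```

Listing the unanswered areas makes it easier to see whether follow-up support, accountability, and sustainability have been considered at all, rather than only the four core levels.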

Evaluating Language Teacher Training Through the Kirkpatrick Model by Jonathan Acuña