Assessing EFL university students' writing: A study of score reliability

Authors

  • Elsa Fernanda González Universidad Autónoma de Tamaulipas, Unidad Académica Multidisciplinaria de Ciencias, Educación y Humanidades
  • Nelly Paulina Trejo Universidad Autónoma de Tamaulipas, Unidad Académica Multidisciplinaria de Ciencias, Educación y Humanidades
  • Ruth Roux Universidad Autónoma de Tamaulipas, Centro de Estudios Sociales

DOI:

https://doi.org/10.24320/redie.2017.19.2.928

Keywords:

English as a foreign language, Writing assessment, Reliability, Scoring rubric

Funding agencies:

Universidad Autónoma de Tamaulipas

Abstract

The assessment of English as a Foreign Language (EFL) writing is a complex activity subject to human judgment, which makes it difficult to achieve a fair, accurate, and reliable assessment of student writing (Pearson, 2004: 117; Hamp-Lyons, 2003). This study reports on the variability between the analytical grades that 11 Mexican EFL university teachers awarded to five writing samples, and describes the raters’ views on writing assessment and their use of analytical scoring rubrics. Data from the grades awarded to each paper and from a background questionnaire revealed great variability between grades: raters differed in their levels of leniency and severity, suggesting that having similar backgrounds and using the same rubric are not enough to ensure rater reliability. Participants’ perceptions of the use of analytical rubrics, however, were similar.
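The kind of analysis the abstract describes can be sketched in a few lines. The snippet below uses hypothetical scores (not the study’s data) to show how rater leniency/severity can be read from each rater’s mean score, and how score variability per paper signals weak inter-rater reliability:

```python
# Illustrative sketch only (hypothetical scores, not the study's data):
# quantifying rater leniency/severity and per-paper score spread from a
# score matrix in which rows are raters and columns are writing samples.
from statistics import mean, pstdev

# Hypothetical analytic-rubric scores (0-100): 4 raters x 5 papers.
scores = [
    [78, 65, 82, 70, 74],  # rater A
    [85, 72, 90, 80, 83],  # rater B (relatively lenient)
    [60, 55, 68, 58, 62],  # rater C (relatively severe)
    [75, 66, 80, 71, 73],  # rater D
]

# Leniency/severity indicator: each rater's mean across all papers.
rater_means = [mean(row) for row in scores]

# Reliability indicator: spread of the scores each paper received;
# large spreads mean raters disagree despite sharing one rubric.
paper_spreads = [pstdev(col) for col in zip(*scores)]

print(rater_means)    # per-rater mean score
print(paper_spreads)  # per-paper standard deviation across raters
```

In practice studies of this kind report formal indices (e.g. correlation-based inter-rater reliability or many-facet Rasch measurement), but the rater-mean and per-paper-spread summaries above capture the underlying idea.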


References

Attali, Y., Lewis, W. & Steier, M. (2012). Scoring with the computer: Alternative procedures for improving the reliability of holistic essay scoring. Language Testing, 30(1), 125-141.
Bacha, N. (2001). Writing evaluation: What can analytic versus holistic essay scoring tell us? System, 29(3), 371-383.
Barkaoui, K. (2007). Rating scale impact on EFL essay marking: A mixed-method study. Assessing Writing, 12(2), 86-107.
Barkaoui, K. (2010). Variability in ESL essay rating processes: The role of the rating scale and rater experience. Language Assessment Quarterly, 7(1), 54-74.
Council of Europe. (2002). Common European framework of reference for languages: Learning, teaching and assessment. Strasbourg, France: Author.
Council of Europe. (2009a). The manual for language test development and examination. Strasbourg, France: Author.
Council of Europe. (2009b). Manual for relating language examinations to the Common European Framework of Reference for Languages: Learning, teaching and assessment. Strasbourg, France: Author.
Creswell, J. W. (2013). Research design: Qualitative, quantitative, and mixed methods approaches. Thousand Oaks, CA: Sage Publications.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7, 31-51.
Eckes, T. (2008). Rater types in writing performance assessments: a classification approach to rater variability. Language Testing, 25(2), 155-185.
Esfandiari, R. & Myford, C. (2013). Severity differences among self-assessors, peer-assessors, and teacher assessors rating EFL essays. Assessing Writing, 18(2), 111-131.
Gonzalez, E.F. & Roux, R. (2013). Exploring the variability of Mexican EFL teachers’ ratings of high school students’ writing ability. Argentinian Journal of Applied Linguistics, 1(2), 61-78.
Główka, D. (2011). Mix? Yes, but how? Mixed methods research illustrated. In M. Pawlak (Ed.), Extending the boundaries of research on second language learning and teaching (pp. 289-300). Poland: Springer.
Hamp-Lyons, L. (1989). Raters respond to rhetoric in writing. In H. Dechert & C. Raupach (Eds.), Interlingual processes (pp. 229-244). Tübingen: Gunter Narr Verlag.
Hamp-Lyons, L. (1991). Assessing second language writing in academic contexts. Norwood, NJ: Ablex Publishing Corporation.
Hamp-Lyons, L. (2002). The scope of writing assessment. Assessing Writing, 8, 5-16.
Hamp-Lyons, L. (2003). Writing teachers as assessors of writing. In B. Kroll (Ed.), Exploring the dynamics of second language writing (pp. 162-189). New York: Cambridge University Press.
Jacobs, H., Zinkgraf, S., Wormuth, D., Hartfiel, V. & Hughey, J. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House.
Knoch, U. (2009). Diagnostic assessment of writing: A comparison of two rating scales. Language Testing, 26(2), 275-304.
Kroll, B. (1998). Assessing writing abilities. Annual Review of Applied Linguistics, 18, 219-240.
Lim, G. (2011). The development and maintenance of rating quality in performance writing assessment: A longitudinal study of new and experienced raters. Language Testing, 28, 543-560.
Mendelsohn, D. & Cumming, A. (1987). Professors’ ratings of language use and rhetorical organizations in ESL compositions. TESL Canada Journal, 5(1), 9-26.
Nunan, D. (1992). Research methods in language learning. New York: Cambridge University Press.
Pearson, P. C. (2004). Controversies in second language writing: Dilemmas and decisions in research and instruction. Ann Arbor, MI: The University of Michigan Press.
Saxton, E., Belanger, S. & Becker, W. (2012). The Critical Thinking Analytic Rubric (CTAR): Investigating intra-rater and inter-rater reliability of a scoring mechanism for critical thinking performance assessments. Assessing Writing, 17(4), 251-270.
Shi, L. (2001). Native and non-native speaking EFL teachers’ evaluation of Chinese students’ English writing. Language Testing, 18(3), 303-325.
Shi, L., Wan, W. & Wen, Q. (2003). Teaching experience and evaluation of second language students’ writing. The Canadian Journal of Applied Linguistics, 6, 219-236.
Vann, R., Lorenz, F. & Meyer, D. (1991). Error gravity: Faculty response to errors in the written discourse of non-native speakers of English. In L. Hamp-Lyons (Ed.), Assessing second language writing in academic contexts (pp. 181-195). Norwood, NJ: Ablex.
Weigle, S. C. (1994). Effects of training on raters of ESL compositions. Language Testing, 11, 197-223.
Weigle, S. C. (2002). Assessing writing. United Kingdom: Cambridge University Press.
Weir, C. J. (1990). Communicative language testing. NJ: Prentice Hall Regents.
White, E. M. (1990). Language and reality in writing assessment. College Composition and Communication, 41(2), 187-200.
Wiseman, C. (2012). Rater effects: Ego engagement in rater decision-making. Assessing Writing, 17, 150-173.


Published

2017-04-10