This paper reports a comparison of human and computer marking of approximately 600 essays written by 11-year-olds in the UK. Each script was scored by three human markers and by the e-rater program. Agreement between human and machine marking was good. Scripts with highly discrepant scores were flagged and assessed blind by expert markers for characteristics considered likely to produce human–machine discrepancies. As hypothesised, essays marked higher by humans exhibited more abstract qualities such as interest and relevance, while there was little, if any, difference on more mechanical factors such as paragraph demarcation.