Announcing the ICLE500 Dataset
cecl
We are delighted to announce the release of the ICLE500 dataset (Thwaites et al., 2024). This dataset contains 500 argumentative essays from the International Corpus of Learner English (ICLE; Granger et al., 2020), each annotated with its corresponding CEFR level.
As part of the CLAP project, we partnered with Charalambos (Harry) Kollias from Polytomous Limited, who oversaw an assessment task that mapped the ICLE texts to CEFR levels. The mapping relied on the CEFR's Table C4 - Written Assessment Grid (Council of Europe, 2020, pp. 187-189) and the guidelines in the manual "Relating Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment" (Council of Europe, 2009). A technical report describing the full assessment procedure accompanies the dataset (Kanistra & Kollias, 2024).
You can access the dataset at https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/RIOSSC
We hope the research community will find this dataset useful and look forward to seeing it used in future studies.