Huoji: Chinese Texts Analyser
Extension References
Datasets:
  • subtlex_(frequency): Cai and Brysbaert 2010.
  • dajun_(frequency): Da Jun 2004.
  • taiwan_(frequency): Taiwanese Ministry of Education (TME).
  • france_(levels): Association Française des Professeurs
    de Chinois (AFPC).
  • hsk_(levels): HSK 2010 list.
  • ebcl_(levels): EBCL 2012 A1 list (Allanic B.).
  • Use is limited to the first 3,000 entries, citation here. For academic use only.
Lexicon & background concepts:
  • CFL lexicon objectives: about 23,000 words (詞) are needed to read newspapers correctly (Da J. 2005:15-16). In modern texts, this corresponds to a word coverage of 95% and to more than the first 1,600 characters.
  • Lexical approach: the now-dominant view that learners should focus on unknown items of high usefulness (= high frequency) (Nation 2006). Huoji (this app) is based on this usefulness-based lexical approach.
  • Benefit-cost approach / Proximal learning: the emerging view that learners should focus on unknown items of both high usefulness (= high frequency) and high easiness (= low learning burden), since this speeds up learning (Barker 2007; Metcalfe 2005, 2006, 2010).
  • Frequency rank (character ~, word ~): an important predictor of reading comprehension (Shen H. H. 2005).
  • Characters per sentence: an important predictor of text difficulty (Da J. 2009:9;19).
  • Comprehensible input: the view that learners should focus on items (grammatical or lexical) just a bit above their current level (Krashen & Terrell 1983).
  • Coverage (lexical ~): ratio of understood items to all items, in % (Laufer 2010).
  • Learning burden / word difficulty / easiness:
  • Thresholds: dividing lines between groups of lexical items. Here, the learner's vocabulary is divided into known / partially known / unknown (Laufer 1997).
  • Opacity (Chinese characters ~): because Chinese characters are non-phonetic, they are opaque: there is no certainty about how to read an unknown character. This sound-graph disconnection slows learning.
  • Plateau (lexical ~): the finding that intermediate-level learners of Chinese are usually stuck at a lexical level from which they cannot progress, due to the opacity of Chinese (Shi & Wen 2009).
  • Vocabulary profile: the learner's current lexical knowledge. Both the number of items and the depth of knowledge are usually relevant.
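Illustration (coverage and usefulness): the definitions of lexical coverage and of the usefulness-based approach above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not Huoji's actual code: the text is assumed to be already segmented into words, the learner's known vocabulary is assumed to be a plain set, and the names lexical_coverage and most_useful_unknown are hypothetical. Usefulness is approximated here by frequency within the text itself, where a real tool would more likely rely on a frequency list such as the datasets above.

```python
from collections import Counter


def lexical_coverage(tokens, known):
    """Return the percentage of running tokens found in `known`."""
    if not tokens:
        return 0.0
    understood = sum(1 for token in tokens if token in known)
    return 100.0 * understood / len(tokens)


def most_useful_unknown(tokens, known, n=3):
    """Return the n most frequent tokens that are not in `known`."""
    counts = Counter(token for token in tokens if token not in known)
    return [item for item, _ in counts.most_common(n)]


# Hypothetical pre-segmented text and known vocabulary (illustration only).
tokens = ["我", "喜欢", "学习", "中文", "我", "学习", "报纸", "中文"]
known = {"我", "喜欢", "中文"}

print(lexical_coverage(tokens, known))        # 5 of 8 tokens are known
print(most_useful_unknown(tokens, known, 1))  # most frequent unknown word
```

Counting running tokens rather than distinct words is a choice made for this sketch, so a frequent unknown word lowers coverage more than a rare one.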
Sample results:
  • CEFR study: results of a CFL reading comprehension study aligned with the CEFR levels (Lopez 2012).