Language Evaluation in Developmental Language Disorder: CELF 4 vs CELF 5

The CELF assessment test is currently one of the most widely used tools for the differential diagnosis of Developmental Language Disorder (DLD) because it provides a language level across different areas. The recent update of the test offers new scales, especially for Spanish speakers. The objective of this study was to analyze the language level of people with DLD using the CELF 4 and CELF 5 tests in order to identify possible differences between the two. The sample consisted of 26 children and adolescents between 6 and 15 years old with a diagnosis of DLD, who were evaluated with both tests. The results indicate that, in general, scores are lower when participants are evaluated with CELF 5, with significant differences in Core Language, Receptive Language and Expressive Language. These data lead us to consider the CELF test an essential tool in the diagnosis of DLD, but also to recommend a complementary evaluation that yields a complete linguistic profile as a starting point for intervention.


Introduction
Developmental Language Disorder/Specific Language Impairment (DLD/SLI) is one of the most prevalent learning difficulties, with a prevalence of approximately 7% (Sans et al., 2012). The main characteristics that define DLD/SLI include difficulty in acquiring oral language in the absence of cognitive or sensory deficits, neurological alterations, or alterations that affect the functioning or structure of the organs involved in speech (Adams & Lloyd, 2005; Bishop & Norbury, 2002; Botting & Conti-Ramsden, 2003). It therefore seems clear that language development does not follow the usual pattern seen in typical development (TD), and that persistent difficulties appear in various language areas (Andreu et al., 2014; Muñoz-Yunta et al., 2005). Children with this disorder have problems both acquiring and developing language (Rodríguez et al., 2017). They may present deficits in structural language components, and associated pragmatic disorders may also arise as a result (Adams & Lloyd, 2005; Bishop & Norbury, 2002; Botting & Conti-Ramsden, 2003).
In the phonological area, people with DLD/SLI tend to present disordered phonology, limitations in the phonological system, limited syllable patterns and variability in incorrect forms (Coll-Florit, 2013; Kalábová, 2009).
In pragmatic terms, the scientific literature reports frequent use of gestures as a strategy to compensate for lexical difficulties, little conversational initiative, difficulties in using narrative and conversational strategies or in describing facts, and poor interaction with adults, which is limited to turn-taking in a question-answer format (Andreu et al., 2007). Difficulties also appear in reciprocal social interactions (Bishop, 1997; Leonard & Bortolini, 1998).
Therefore, it appears that people with DLD/SLI present marked heterogeneity in their linguistic profiles, because both their different language skills (phonetics, phonology, morphosyntax, semantics and pragmatics) and the language modality involved (receptive and/or expressive) are selectively compromised (Marini et al., 2008; Verhoeven et al., 2011).
Hence, it is essential to use valid protocols with a high degree of sensitivity and specificity to measure these variables (Mendoza, 2011; Lietos & Belén, 2017). According to Acosta (2012), an evaluation must consider not only how subjects process language in given contexts, but also the socio-cultural factors that guide learning and language acquisition. Along the same lines, Baker and Chenery (1999) and Carballo (2012) report that designing an efficient intervention depends to a great extent on both the quality of the evaluation and the information obtained from it (Howlin & Kendal, 1991).
Therefore, one of the first and most important aspects to bear in mind in a DLD/SLI evaluation process is early identification of the disorder, which aims to prevent the disorder altogether or to reduce its possible harmful effects (Stevenson, 2003). Given the prototypical linguistic alterations of DLD/SLI, an evaluation of receptive and expressive language with standardized tools is fundamental for diagnosis (Fleckstein et al., 2018).

CELF 4 VS. CELF 5
The most frequently used standardized tests include the Clinical Evaluation of Language Fundamentals (CELF) (Semel et al., 2003). CELF can identify subjects with DLD/SLI very accurately (Aguado, 2009). Some authors attribute this to its specificity and sensitivity, which are 89% and 100%, respectively, and to its cutoff point of -1.5 standard deviations, both of which are crucial for correctly identifying DLD/SLI (Acosta et al., 2013; Conti-Ramsden et al., 2001; Wilson et al., 1991; Wood et al., 2016).
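As a worked illustration of the figures above, the following sketch computes the score that corresponds to a cutoff of -1.5 standard deviations and derives sensitivity and specificity from classification counts. It assumes that index scores are expressed on a standard-score scale with a mean of 100 and a standard deviation of 15. The function names and the counts are hypothetical, chosen only so that the output reproduces 100% sensitivity and 89% specificity; they are not data from any of the cited studies.

```python
def cutoff_score(mean=100.0, sd=15.0, z=-1.5):
    """Score lying z standard deviations from the mean (scale is an assumption)."""
    return mean + z * sd


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: cases with the disorder correctly identified
    fn: cases with the disorder that the test missed
    tn: cases without the disorder correctly ruled out
    fp: cases without the disorder wrongly flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=20, fn=0, tn=89, fp=11)
print(cutoff_score())   # score at -1.5 SD on the assumed scale
print(sens, spec)
```

On the assumed scale, a child scoring below the value returned by `cutoff_score()` would fall under the -1.5 SD criterion.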
CELF 4 (2006) is a tool that is administered individually to subjects aged 5-21 years. It allows an individual's strengths and weaknesses to be identified and the severity of the disorder to be established, and it enables subjects with DLD/SLI to be diagnosed and followed up. It provides recommendations for efficient interventions (Carvallo et al., 2014) which, in turn, support the clinical decision-making process (Paslowki, 2005; Semel et al., 2006). Its detailed description appears in Table 1. Nonetheless, the updated and more recently published CELF (CELF 5; 2018) presents many improvements over the previous version. The following stand out: a process for evaluating student performance in class, and an observation-based rating guide that analyzes communication performance at school and at home. Whereas the test elements in the fourth edition evaluate a given language skill, the fifth allows each subtest to be used independently of the rest (Douglass et al., 2019; Hessling & Schuele, 2020).
As for modifications to the test format, linguistic concepts and directional concepts are now evaluated independently, and stimuli that evaluate the most basic semantic relations have been included. Unlike former versions, CELF 5 allows the semantic component to be evaluated from the age of 5 years, and it includes new items that extend the upper and lower limits of the comprehension tests (Oetting et al., 2019; Scheidnes & Redmond, 2019).
Regarding changes to content, some subtests have been removed because they were rarely used and because other specific protocols exist to evaluate the skills they covered, such as referential naming capacity, phonological awareness or familiar sequences. New pragmatic evaluation subtests, such as "verifying pragmatic skills", have also been included to allow social interactions to be analyzed (Forbes, 2019; Poth, 2020).
It is also worth highlighting that CELF 5 is standardized and validated for several special groups, such as DLD/SLI, learning disorders with reading-writing difficulties and Autism Spectrum Disorder (Matsuzaki et al., 2019). The CELF 5 structure is shown in Table 2. Having analyzed both tests, this study aimed to compare their language indices in order to verify whether a correlation exists between the different areas in people with DLD/SLI.

Participants
In the present research work, 26 people diagnosed with DLD/SLI (22 males, 4 females) participated. They were divided into two groups according to the age bands that the evaluation tests establish: 6-8 years (8 children) and 9-15 years (18 youths). The first group had a mean chronological age of 7.37 years (SD = 0.74), and the second group a mean of 11.83 years (SD = 2.4).

Procedure
We first contacted several DLD/SLI associations and private clinics to identify people diagnosed with DLD/SLI and to ascertain their interest in participating in this study. After the centers had accepted and confirmed the sample, an informed consent document was sent to parents to confirm their participation in the study. The informed consent document had been approved by our University's Ethics Committee. Once this document had been signed, a language evaluation was carried out over four sessions: two for CELF 4 and two for CELF 5. The order of the tests was randomized, with at least 1 month between one time point and the next. Both tests were administered in printed form.

Instrument
For the language evaluation, the complete standard version of CELF 4 (Spanish Clinical Evaluation of Language Fundamentals-4) was used (Wiig, Secord & Semel, 2003). This test is designed for people aged between 5 and 21 years and evaluates language level in several areas: core language, receptive language, expressive language, language content, language structure and working memory.
The CELF 5 test (Wiig, Secord & Semel, 2018) is intended for children and adolescents aged from 5 to 15 years. This test offers five types of composite scores for language level: the core language index (CLI), the receptive language index (RLI), the expressive language index (ELI), the language content index (LCI) and the language memory index (LMI).
Both tests distinguish two age groups: one for 5-8 year-olds and the other for those aged 9 years and over.
In the present study, we decided to use the first four areas of both tests because they are comparable across the two. This was not the case for language structure and working memory (which appear in CELF 4) or for the language memory index (CELF 5). We were unable to compare the pragmatic area because it is not included in CELF 4.

Data Analysis
Sample normality was verified with the Kolmogorov-Smirnov test, which supported the use of parametric tests. The Bonferroni post hoc correction was then applied. Next, the results were analyzed with Student's t-test for related samples, given that the same participants completed both tests. Pearson's correlation coefficient was used to analyze the correlation between the two instruments.
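The comparison and correlation steps described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study. It assumes a paired design in which each participant contributes one score per test, which is what the reported degrees of freedom (25 for 26 participants) suggest. It uses only the Python standard library, the function names are illustrative, and the scores are hypothetical. The normality check is omitted, and converting the t statistic into a p-value would additionally require the t distribution, for example from a statistics package or a printed table.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical index scores for six participants (not data from the study).
celf4 = [85, 78, 90, 72, 88, 80]
celf5 = [80, 75, 84, 70, 81, 77]

t_stat, df = paired_t(celf4, celf5)
r = pearson_r(celf4, celf5)
alpha_per_index = bonferroni_alpha(0.05, 4)  # four indices compared
print(round(t_stat, 3), df, round(r, 3), alpha_per_index)
```

A positive t here means that scores on the first list are higher on average than those on the second, while r describes how closely the two sets of scores rank the participants in the same order. The two statistics answer different questions, so a high correlation can coexist with a significant mean difference.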

Results
The results obtained in the different areas of the evaluation tests are shown in Table 3, which reports the mean scores. In both groups, participants' performance was generally lower on CELF 5 than on CELF 4. When the scores obtained with the two instruments were compared across all participants, differences were found in several areas. For the CLI, significant differences appeared between the two sets of scores (t(25) = 4.32, p < .01). Differences were also significant for the RLI (t(25) = 2.13, p < .05) and the ELI (t(25) = 4.95, p < .01). Nonetheless, no significant differences were observed for the LCI (t(25) = 0.85, p > .05).
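The significance levels reported above can be checked against tabulated critical values. The sketch below is illustrative only: it takes the four t statistics from the text and compares them with the two-tailed critical values of Student's t for 25 degrees of freedom (approximately 2.060 at the .05 level and 2.787 at the .01 level). The function and variable names are assumptions made for this example.

```python
# Two-tailed critical values of Student's t for df = 25 (standard table values).
T_CRIT_DF25 = {0.05: 2.060, 0.01: 2.787}

# t statistics as reported in the text for each index.
reported_t = {"CLI": 4.32, "RLI": 2.13, "ELI": 4.95, "LCI": 0.85}


def significance_label(t, crit=T_CRIT_DF25):
    """Classify a t statistic against the .01 and .05 two-tailed thresholds."""
    t = abs(t)
    if t > crit[0.01]:
        return "p < .01"
    if t > crit[0.05]:
        return "p < .05"
    return "n.s."


labels = {index: significance_label(t) for index, t in reported_t.items()}
for index, label in labels.items():
    print(index, reported_t[index], label)
```

Under these thresholds the CLI and ELI fall below the .01 level, the RLI falls below the .05 level only, and the LCI is not significant, which matches the pattern reported in the text.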

Discussion
Our data indicate that both CELF 4 and CELF 5 are evaluation instruments that can be used as a starting point to diagnose DLD/SLI, as shown by different research works in recent years (Peña et al., 2020; Ramírez-Santana et al., 2019; Wright et al., 2018). Nevertheless, some characteristics ought to be taken into account. Some items in the original version have been modified, which has altered the subtests that make up each index. This led to lower results being obtained with CELF 5 than with CELF 4. The scores obtained with the two versions did not correspond, except for the LCI. This is a striking finding because content subtests that were present in the previous version have been removed from the more recent one (CELF 5).
One of the reasons for this could lie in the populations used to evaluate the tests: for CELF 4, Spanish-speaking children and children from the USA were included, whereas a Spanish-speaking population was employed for the updated version. We should also bear in mind a limitation of the test: evidence of poor test stability in some cases (Coret & McCrimmon, 2015). Another explanation would be that, although CELF 4 generally has more items per subtest, the total number of subtests needed to obtain the overall indices is larger in CELF 5, which could result in the evaluated individuals becoming tired and not paying enough attention.
Nevertheless, very little research has been conducted into CELF 5 as a whole, although the first studies investigating the usefulness of specific areas have already appeared, e.g. understanding sentences (Lituma-Solis, 2019) or the pragmatic area (Reid, 2018). Different studies have also taken CELF 5 as a general indicator of language level in other types of disorders, such as Fragile X Syndrome (Hoffmann et al., 2020), Cerebellar Ataxia Syndrome (Bonne et al., 2016) or difficulties with bilingual language learning (Shenoy, 2015). Therefore, it is advisable not to rely on the CELF test alone to diagnose DLD/SLI and to supplement the evaluation with other tools in order to gain a more global view of the different language areas, particularly the lexico-semantic and morphosyntactic areas.