Development of Reading Comprehension Assessment Tool: Applying the Rasch Model

1. PhD Scholar, Department of Education, Lahore College for Women University, Lahore, Punjab, Pakistan.
2. Associate Professor, Department of Education, Lahore College for Women University, Lahore, Punjab, Pakistan.

PAPER INFO
Received: April 16, 2020
Accepted: June 15, 2020
Online: June 30, 2020

ABSTRACT
Language assessment is not a simple task. Developing an assessment requires expertise, commitment and an understanding of the psychometric aspects of assessment. Reading comprehension is an important component of language assessment tools. This article describes the development of a reading comprehension assessment for grade X. Forty items were developed in line with the national curriculum for English language. The National Assessment of Educational Progress (NAEP) framework for English language was used to construct the items. The test was validated by experts, and thirty-three items were retained after expert validation. The test was then administered to 500 students of grade X. The Rasch Model was used to assess the psychometric properties of the test. Finally, 12 items were selected for the reading comprehension assessment. The study indicates that it is feasible to develop valid and reliable reading assessment tools.


Introduction
Reading assessment helps us understand the strengths and needs of each of our students. It depends on the anticipated audience and the purpose of the assessment. In a teaching-learning situation, the anticipated audience comprises students, teachers, parents and school administration, and assessment serves a different purpose for each stakeholder. Accordingly, different assessment tools and activities are used in different situations, ranging from high-stakes assessments to school-wide paper-and-pencil reading tests. These tests are administered individually to gain an in-depth understanding of how well a student understands a particular curriculum-related text.
A historical view of the construct of reading shows that the concept lagged far behind the prescribed assessment of reading abilities. Until the 1990s, this construct had not been fully explored or placed in an assessment context in the USA.
From the 1920s to the 1960s, psychometric principles were powerful assessment tools (Grabe & Jiang, 2014). A contrasting view prevailed in the UK and Europe, where emphasis was laid on expert validity. Reading was assessed through tasks such as summarizing, paraphrasing and text interpretation. It was sometimes difficult to ensure high reliability of these assessment tasks (Weir & Milanovic, 2003).
Objective-type test items (MCQs, true/false, matching the column) became prominent during the 1960s and 1970s and led to changes in assessment practices. This type of objective testing constrained how reliably reading comprehension could be measured. By the beginning of the 1970s, it was recognized that reading comprehension could not be assessed through objective testing. Therefore, communicative competence and communicative language teaching laid emphasis on the suitability of integrative reading assessments (Grabe & Jiang, 2014). Later, during the 1980s, a large body of cognitive research on reading abilities identified several subskills of reading comprehension. According to Karakoc (2019), there are ten agreed-upon subskills common to listening and reading, while seven subskills are unique to reading. In the 1990s, the research focus shifted toward the role of subskills in reading performance. Researchers explored the relationship between reading subskills and reading for different purposes, such as reading to learn, reading for general comprehension and expeditious reading. More recently, the reading comprehension construct has been conceptualized as the driving force behind modern standardized assessment practices.
The literature reveals no consensus on the number of subskills; the multidimensionality of reading has drawn researchers to a variety of subskills. Both qualitative and quantitative studies have been conducted to identify reading subskills (Goh & Aryadoust, 2015; Kim, 2011), and a number of theories have been presented by linguistic experts to explain such skills. Another point evident in the literature is that multi-divisibility is described with reference to particular subskills or language characteristics. It can be concluded that these subskills are used to explain the construct of reading comprehension (Karakoc, 2019). Luke and Freebody (1999) proposed a model of reading as social practice. They chart four reading practices: coding, text-meaning, pragmatic and critical practices. An individual with good reading ability uses all these practices simultaneously without difficulty. Such a person is able not only to decode and comprehend the text but also to develop his/her own argument while taking a critical stance, and to broaden his/her knowledge. Consequently, all these standpoints taken together produce an equally deep and broad understanding of the concept of reading (Margaret et al., 2009).

How Students Comprehend
Reading comprehension is one of the basic skills of the English language that enables students to comprehend textual material appropriately (Ali et al., 2017). A sequence of cognitive processes and activities, based on the reader's ability, plays a vital role in connecting the meaning of multiple sentences and enables the reader to articulate the meaning of the overall text (Magliano et al., 2011); the reader's interaction with the text constructs a meaningful representation of it (Gilakjani, 2016). Comprehension of textual material requires the active participation of the reader and depends on the individual's use of cognitive strategies and cognitive awareness. The literature on reading comprehension places the concepts of cognition and metacognition at the forefront of reading comprehension (Aksana & Kisaca, 2009).
According to McNamara and Magliano (2009b), two types of information-processing activities support comprehension: bridging inferences and elaborative inferences. When a reader identifies overtly mentioned ideas in the text while creating conceptual links with previous knowledge, the process is known as bridging inference. Bridging inferences include anaphoric and pronominal inferences, along with causal-based inferences that require the application of world knowledge. The reader needs these inferences for coherence, and they are routinely generated during comprehension (Singer & Halldorson, 1996); the quantity of inferences generated can differentiate between skilled, semi-skilled and less skilled readers (Magliano & Millis, 2003).
Elaborative inferences depend on the reader's world knowledge regarding the perceptions and events presented in the text. Efficient reading requires the reader to supply information that is missing from the text or to add new information, so the reader needs to develop inference-making ability in order to comprehend the text. Inference making is a cognitive and constructive thinking process that facilitates comprehension of textual material. Vocabulary alone is not enough to understand a text; general world knowledge is also required. The information presented in a text is sometimes not adequate for comprehending its content, and this weakness can be overcome by making elaborative inferences. Readers use this strategy to understand the writer's meaning and to interpret sentences. Students in Pakistan are unable to participate actively in reading strategies because of their weak knowledge of such strategies. Hence, they fail to gain the maximum benefit from reading activities carried out in the classroom and to use them meaningfully outside the classroom setting (Haq, 2016).

Purposes of Reading Comprehension Assessment
As already mentioned, the construct of reading comprehension is complex and multidimensional; it is therefore quite challenging to assess an individual's reading comprehension (Kendeou et al.).
Standardized assessment and classroom-based assessment have been the primary focus of reading assessment in recent years and have the greatest impact on test takers (Grabe & Jiang, 2014). Carlisle and Rice (2004) stated that there are four purposes of reading comprehension assessment in school settings:

1. Assessment of student progress
2. Identification of children with reading problems
3. Provision of feedback to stakeholders
4. Segregating students at risk

Purpose of the Study
One of the main characteristics of a good test is its validity, that is, whether it measures what it is supposed to measure. The first step in addressing the validity problem may be to construct tests on the basis of sound reading comprehension theories (Hannon & Daneman, 2001). However, it is difficult to find a perfect theory-based assessment because reading theory is still developing and changing (Pearson & Hamm, 2005). A second approach is to examine validity using factor analysis for construct validity and correlations for predictive and concurrent validity (Allen & Yen, 2002; Bell & McCullum, 2008).
Classical Test Theory (CTT) has been used extensively to ensure the validity of tests. Exponents of CTT state that a person's test score is not necessarily an index of his/her ability. It is rather a combination of an error score (i.e., random error variance) and a true score. The major emphasis of CTT is to reduce the effect of measurement error and maximize the effect of the language abilities to be measured (Nodoshan, 2009).
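The decomposition described above is usually written as follows. This is the standard textbook form rather than a formula taken from the study; the symbols X (observed score), T (true score), E (error score) and the reliability coefficient are notational assumptions, and the variance identity relies on the usual CTT assumption that true and error scores are uncorrelated.

```latex
\begin{align}
X &= T + E, \\
\sigma_X^2 &= \sigma_T^2 + \sigma_E^2, \\
\rho_{XX'} &= \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}.
\end{align}
```

Read this way, reducing measurement error means shrinking the error variance relative to the observed-score variance, which moves the reliability coefficient toward 1.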
In recent years, Item Response Theory (IRT) has frequently replaced CTT because the classical theory has its own limitations. IRT is considered one of the best tools for assessing the construct of reading comprehension. The main difference between the two is that CTT emphasizes the total test score, while IRT focuses on the examinee's performance on each item. IRT statistical models can be confirmed or rejected on the basis of empirical data.
The present study is designed to develop a valid reading comprehension test based on IRT.
The probability of a positive response as a function of ability, Pk(θ), is the so-called item response function of item k. Two item response curves are shown in Figure 2. The x-axis is the latent continuum θ and the y-axis is the probability of a positive response.
Educational testing data are often analyzed using the Rasch model. In this context, the latent variable θ is called ability and the item parameters bk are called item difficulties.
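The item response function described above can be sketched in a few lines. This is a minimal illustration of the standard dichotomous Rasch form, in which the probability of a positive response is 1 / (1 + exp(-(θ - bk))); it is not the estimation routine used in the study, and the function names rasch_probability and expected_score are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a positive response under the dichotomous Rasch model.

    theta: person ability in logits (assumed name).
    b: item difficulty in logits (assumed name).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected number of correct answers for one person across a set of items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


if __name__ == "__main__":
    # Illustrative difficulties only; these are not the study's estimates.
    illustrative_difficulties = [-1.0, 0.0, 1.0]
    for theta in (-2.0, 0.0, 2.0):
        print(theta, round(expected_score(theta, illustrative_difficulties), 3))
```

When ability equals difficulty, the function returns 0.5, so an item's difficulty can be read as the ability level at which a correct answer is as likely as an incorrect one.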

Material and Methods
Unidimensionality and the shape of the item characteristic curves (ICCs) are the two basic assumptions of IRT. IRT stipulates a single latent characteristic to account for all statistical dependencies among test items as well as all differences among test takers. Easy items are shifted to the left on the scale measuring the trait, and hard items are shifted to the right end of the scale. More discriminating items have steeper slopes than less discriminating items. With appropriate model fit, the ICCs match the actual test data closely (Fotaris & Mastoras, 2014; Zanon et al., 2018).
The data set used in the present study was obtained from a testing company. This 33-item reading comprehension test was administered.
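The statements about item location and slope can be made concrete with a small sketch. It uses the two-parameter logistic (2PL) form only because that model has an explicit discrimination parameter; the Rasch model corresponds to fixing that parameter at 1 for every item. The names icc_2pl, slope_at_difficulty, a and b are assumptions introduced for illustration, not quantities reported in the study.

```python
import math


def icc_2pl(theta, a, b):
    """Item characteristic curve: P(correct) for ability theta.

    a: discrimination (slope) parameter; a = 1 gives the Rasch curve.
    b: difficulty (location) parameter in logits.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def slope_at_difficulty(a):
    """Slope of the curve at theta = b, where P = 0.5.

    The derivative of the logistic curve is a * P * (1 - P), which is a / 4 there.
    """
    return a / 4.0


if __name__ == "__main__":
    # Illustrative parameter values only.
    for label, a, b in [("easy", 1.0, -1.0), ("hard", 1.0, 1.0), ("steep", 2.0, 0.0)]:
        curve = [round(icc_2pl(t, a, b), 2) for t in (-2, -1, 0, 1, 2)]
        print(label, curve, "slope at b:", slope_at_difficulty(a))
```

An easier item (smaller b) reaches a probability of 0.5 at a lower ability, so its curve sits further left; a larger a makes the curve rise more sharply around b.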

Participants
A total of five hundred boys and girls were selected. Participants (n = 500) were adolescent grade 10 students from the local district. The numbers of male and female participants were approximately equal. Jiang, Wang and Weiss (2016) studied sample size requirements for the estimation of IRT parameters; according to their findings, a sample size of 500 is necessary to obtain accurate parameter estimates. Another study, by Sahin and Hacettepe (2017), suggests that the combination of sample size and test length is an important factor in the correct estimation of parameters, and that sample sizes of 150, 250, 500 and 750 students can be used to estimate IRT parameters.

Instrumentation
A test was constructed using the Framework of English Reading Assessment developed by the National Assessment of Educational Progress (NAEP). This framework was used keeping in view the objectives of the National Curriculum of English for 10th grade.

Framework Description
According to NAEP, reading is an active and complex cognitive process. This definition applies when constructing an assessment of reading achievement. A number of factors affect a reader's comprehension. According to NAEP, these factors include:
• Reading context (for study, for skimming, for leisure)
• Ability to recognize words
• Content of the text
• Inferring meaning
The NAEP reading framework includes two types of content: (a) literary text and (b) informational text.
These text types are treated as distinct categories for two reasons: first, the structural differences that mark the texts, and second, the purposes for which students read different types of text. The framework specifies that assessment questions for both literary and informational texts measure one of three cognitive targets.

Item Construction
Test items were aligned with the National Curriculum and the proficiency framework. Items were constructed keeping in view the objectives, standards, benchmarks and SLOs given in the National Curriculum.

Figure 1. English Reading Comprehension (item-person map; each 'X' represents 0.8 cases)
The item-person map shows that students' reading comprehension ability ranges approximately from -2 to 2 logits. On this basis, students' English reading comprehension ability is judged to be good.
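For readers unfamiliar with item-person maps, the layout can be sketched as a text grid in which persons and items share one logit scale. This is only a schematic under stated assumptions: the function name item_person_map, the bin width and the values in the example are hypothetical and are not the study's estimates or the output format of any particular software.

```python
import math


def item_person_map(abilities, difficulties, low=-3.0, high=3.0, step=1.0):
    """Return text rows, highest logit bin first.

    Each person is drawn as one 'X'; items are listed by their 1-based index.
    Values outside [low, high) are clamped into the first or last bin.
    """
    n_bins = int(round((high - low) / step))

    def bin_index(value):
        idx = int(math.floor((value - low) / step))
        return max(0, min(n_bins - 1, idx))

    person_counts = [0] * n_bins
    for theta in abilities:
        person_counts[bin_index(theta)] += 1

    item_labels = [[] for _ in range(n_bins)]
    for k, b in enumerate(difficulties, start=1):
        item_labels[bin_index(b)].append(str(k))

    rows = []
    for i in reversed(range(n_bins)):
        lower_edge = low + i * step
        persons = "X" * person_counts[i]
        items = " ".join(item_labels[i])
        rows.append(f"{lower_edge:+5.1f} | {persons:<10} | {items}".rstrip())
    return rows


if __name__ == "__main__":
    # Hypothetical abilities and difficulties, for layout only.
    for row in item_person_map([-1.5, -0.2, 0.3, 0.4, 1.7], [-1.0, 0.0, 1.2]):
        print(row)
```

Persons who appear in a higher row than an item have more than a 50% chance of answering that item correctly under the Rasch model, which is how such a map is usually read.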

Final Items
Data were analyzed using CONQUEST (IRT-based software). Twelve items were finalized, having fulfilled all IRT criteria.

Discussion
The present discussion focuses on the complex and constructive nature of reading comprehension assessment. Reading comprehension tests are used in real school settings to assess students' level of understanding of the written text presented in the test (Keenan et al., 2008). Valid reading comprehension assessment is important and can be used for high-stakes testing and classroom-based assessment, subject to the following criteria:
• Vocabulary assessment
• Assessment having certain psychometric properties
• Valid assessment measures
These criteria are quite difficult to meet with teacher-made tests. The development of new, valid tools to assess students' reading comprehension skills has become a pressing need because of the wide variety of students and the complexity of the construct. The assessment of reading comprehension faces several challenges. As already mentioned, reading comprehension is a complex and multifaceted construct comprising many skills and subskills, and the first challenge stems from this very nature: selecting appropriate subskills becomes difficult when designing an assessment. This challenge becomes more formidable because limitations of validity, reliability, time, cost and usability constrain the use of different assessment tasks. The second challenge is to establish the connection between reading during assessment and reading without assessment. When a student reads a text as part of an assessment, he/she is more cautious. This cautiousness creates stress or motivation, depending on the personal characteristics of the particular student. A final challenge is the possibility that the conception of the reading construct varies with increasing proficiency in reading.
A genuine problem in designing and conducting assessments of reading comprehension is that test scores show variation. This variation indicates a validity problem in reading comprehension assessment. One solution is to use more sophisticated tools to ensure the validity of the test, and the use of IRT instead of CTT is a step in this direction (Keenan & Meenan, 2014; Baldwin, 2007). This study applied IRT to a set of data from the reading comprehension assessment of 10th graders. Test developers' conceptualizations of the construct have consequences for how reading comprehension is measured. It is widely acknowledged that reading comprehension is a multidimensional construct rather than a well-defined, universally applicable one. According to Rehman and Maslevy (2017), reading comprehension depends on personal factors such as interest and familiarity, along with context and purpose. It encompasses an enormous array of linguistic and semantic cognitive developments (Snyder & Caccamise, 2005). The primary focus of reading comprehension assessment is creating and interpreting the meaning of what is read. Readers acquire the meaning of a text by building a coherent image of what they read (Graesser, McNamara & Louwerse, 2003). Snow (2003) presented a list of prerequisites for effective reading comprehension, such as:
• Good vocabulary
• Command of a variety of topics
• Robust social interaction
• Extensive reading
• Access to reading material
The socioeconomic status (SES) of students also affects their performance in reading comprehension. Children from low-SES backgrounds perform poorly, particularly in word reading fluency, because of inadequate knowledge of word meanings. These students cannot compete with students from higher-SES backgrounds because of their limited vocabulary.
Gilakjani (2016) found that students' reading comprehension is widely affected by different reading strategies. Students not only receive information but also infer meaning from it. Raisani and Taveeno (2017) also found that students use a number of reading strategies; however, these strategies are not used regularly or with a specific purpose in mind. Most of the strategies adopted were simple in nature, such as summarizing a text, reading aloud and translating texts into the mother tongue. Effective reading requires several skills to comprehend the text. A large responsibility lies with teachers to generate students' interest in reading. Fareed, Jawed and Awan (2018) found that the foremost challenges teachers encounter while teaching reading skills include students' lack of interest and concentration. There is therefore a need to devise assessment tools that measure students' true abilities with minimum error. Constructive feedback after the administration and scoring of these tools can enhance students' interest and motivation. Keeping in view the importance of reading comprehension in language learning, this study is an effort to develop a valid and reliable tool to assess students' comprehension using the Rasch Model.