Japanese Journal for Research on Testing Vol.5 No.1 Abstract
JART Vol.5 No.1
▶ General research
Detecting Overestimation of Discrimination Parameter Applying Mutual Information
Makoto Sano
Institute of Test Engineering and Psychometrics, Prometric Japan Co., Ltd.
Local item independence is generally a strong assumption in applying item response theory. Chen and Thissen (1997) proposed two models of local item dependence. One of the two is surface local dependence (SLD), which typically leads to overestimation of discrimination parameters. This study evaluates the performance of several local item dependence indices, focusing on the SLD condition. A simulation study was performed, and local dependence indices were computed with jIRTNew (Tsai & Hsu, 2005b) 0.35. The study suggests that local item dependence indices based on mutual information are promising for detecting SLD and overestimation of discrimination parameters.
Keywords: Item response theory, Local item dependence, Mutual information
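As a rough illustration of the quantity behind such indices, the sketch below computes the plain mutual information between the scored responses to two items. It is only a minimal example under stated assumptions: the function name `mutual_information` is hypothetical, and the abstract does not specify how the study's indices are defined (for instance, whether they condition on ability or compare observed with model-expected frequencies), so this should not be read as the author's index.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two equally long sequences of item scores.

    Hypothetical helper for illustration; it uses raw observed frequencies only.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # observed counts of each response pair
    count_x = Counter(x)        # marginal counts for the first item
    count_y = Counter(y)        # marginal counts for the second item
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with counts
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi
```

Under this definition, two items with identical response patterns yield the entropy of the pattern, and two items whose observed responses are exactly independent yield zero. A large value for an item pair would be one possible signal of dependence worth inspecting.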
▶ General research
A two-level item response model which takes the opportunity to learn into account and its application
Yasuhito Hagiwara
National Institute for Educational Policy Research
In large-scale curriculum-based testing surveys, examinees (i.e., students) are usually sampled as members of primary sampled clusters (i.e., schools or classrooms). In some cases, such a survey also asks teachers for item-specific auxiliary information related to opportunity to learn (OTL). Owing to the mixture of student-level responses and class-level OTL information, the data structure is hierarchical. In this study, a two-level item response model that takes both this hierarchy and the OTL into account was considered and applied to real data. The results suggested that, rather than OTL information explaining the variance of the trait, many items were more difficult to answer correctly for students in classrooms without OTL than for students with the same level of the latent trait in classrooms with OTL.
Keywords: multilevel analysis, opportunity to learn, instructionally sensitive items, curriculum, junior high school students
▶ General research
An analysis of test scores adjusted for the regression-to-the-mean effect based on incomplete data
Yuichi Kawata1, Manabu Iwasaki2
1Chugai Pharmaceutical Co., Ltd., 2Seikei University
In this study, we consider a situation in which students who obtained low test scores are given a remedial class and then take an after-remedial test to check the learning effects of the class. A beta-binomial distribution is assumed as the model for the test scores, namely, the number of correct answers on a test consisting of n questions. In addition, we consider the more common situation in which the after-remedial test consists of m questions. We present the expected value and variance of the after-remedial test score and of its difference from the before-remedial test score. Moreover, the incompleteness of the before-remedial data is classified into three situations: selection, censoring, and truncation. For each situation, practical procedures for estimating the beta-binomial distribution parameters with moment estimators are provided to fit the model. In every situation, we find that the statistical test adjusted for the regression-to-the-mean effect is appropriate. Thus, the results suggest that applying proper tests is important for assessing the learning effect.
Keywords: Beta-binomial distribution, Pretest-posttest design, Selection, Censoring, Truncation
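To make the modelling idea concrete, the following sketch shows the textbook method-of-moments estimators for a beta-binomial distribution fitted to complete data, together with the expected retest score when nothing changes between tests. It is a minimal sketch under stated assumptions: the function names are hypothetical, the data are assumed complete (the paper's adjustments for selection, censoring, and truncation are not reproduced), and the retest formula assumes that each student keeps the same success probability and that the remedial class has no effect.

```python
def beta_binomial_moments(scores, n):
    """Method-of-moments estimates (alpha, beta) from complete scores out of n items.

    Hypothetical helper; uses the first two raw sample moments.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    size = len(scores)
    m1 = sum(scores) / size                 # first raw moment
    m2 = sum(s * s for s in scores) / size  # second raw moment
    if m1 <= 0:
        raise ValueError("mean score must be positive")
    denom = n * (m2 / m1 - m1 - 1) + m1
    if denom <= 0:
        # Happens when the data are not overdispersed relative to a binomial.
        raise ValueError("moment estimates are undefined for these data")
    alpha = (n * m1 - m2) / denom
    beta = (n - m1) * (n - m2 / m1) / denom
    return alpha, beta


def expected_retest_score(x, n, m, alpha, beta):
    """Expected score on an m-item retest given x correct out of n on the first test.

    Assumes an unchanged success probability, i.e. no true learning effect.
    """
    return m * (alpha + x) / (alpha + beta + n)
```

For example, with alpha = beta = 1 and n = m = 2, a student who scored 0 on the first test is expected to score 0.5 on the retest even without any learning effect. This is the regression-to-the-mean effect that an adjusted statistical test has to account for.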
▶ General research
The influence of item form and testing method on item value
Yiping Zhang
Center for Research on Educational Testing
This research concerns the influence of item form and testing method on the measurement effect. A comparison of the Short-Answer form with the Multiple-Choice form found that the item difficulty parameters of the Short-Answer form were higher than those of the Multiple-Choice form, but no clear differences in the item discrimination parameters were found between the two forms. Moreover, the measuring ability of the test hardly changed when the item form was replaced. Furthermore, when suitable alternatives were provided, the Multiple-Choice form was expected to show higher discrimination power than the Short-Answer form. It was also found that testing methods with more grades of scoring increased the item information.
Keywords: Short-Answer form, Multiple-Choice form, Make-One-Alternative, Probability-Testing, Item parameter, Information
▶ General research
Applicability of neural test theory: Discussing methodological problems and applying a polytomous model
Satoshi Usami
Graduate School of Education, University of Tokyo/Japan Society for the Promotion of Science
Neural test theory (NTT) is a data analysis method that has gradually become popular in educational measurement and psychometrics. In NTT, examinees are clustered into discrete latent ranks. The author notes several technical issues for future research on NTT, such as the accuracy of estimates of latent ranks and ICRP, the consistency of latent rank estimates over item sampling, comparison among several optimizing criteria, and the construction and improvement of algorithms for NTT models. In the present research, the author performed a simulation study using polytomous NTT for ordered data to compare the consistency of latent rank estimates between NTT and another method using the total test score. Finally, a real data example with essay test data was analyzed using polytomous NTT, and the results were compared with those of methods based on item response theory and the total test score.
Keywords: neural test theory, ordered data, essay test, educational measurement
▶ General research
Psychological mechanism by which applicants form impressions of interview examinations for university admissions: a perspective on designing appropriate interviews
Dai Nishigori
Admission Center, Saga University
The present study investigated the psychological mechanism of applicants who took the interview examination conducted for university admission, from the viewpoints of the "structural factor" and "social factor" of procedural justice indicated by Nishigori (2007). The results of an analysis using SEM (Structural Equation Modeling) showed that the influence of the "social factor" on the impression of the interview composed of "accepting the rule of interview" and "sense of achievement" that applicants experienced was greater than that of the "structural factor" involving the interview procedure. In addition, the perception of "fairness or justice on interview" and "the affirmative on interview" was far higher among people with interview experience than among those without it.
Keywords: university admissions, interview, procedural justice, structural factor, social factor
▶ Case study
Using Proficiency Tests for Curricular Innovations in University English Education: Relationships between Proficiency Tests and the Ibaraki University Internal English Test
Chisato Saida1, Kunihiko Kobayashi1, and Hiroyuki Noguchi2
1Ibaraki University, 2Graduate School of Education and Developmental Sciences, Nagoya University
Proficiency tests such as TOEFL or TOEIC have been used in curricular innovations at many universities. This research attempted to categorize the functions of proficiency tests in university English education. It then focused on the practical use of an internal test in the new English curriculum at Ibaraki University. Ibaraki University has conducted the internal English test, accompanied by a textbook, since 2005. This research examined the criterion-related validity of the internal test. The correlation coefficient between the scores of the internal test and those of the National Center English Test was about .65, that with TOEIC IP was .65, and that with TOEFL ITP was .62. Score correspondence sheets were developed. As a result, the usefulness of the internal test increased.
Keywords: University English Curricular Innovations, the National Center Test, TOEIC IP, TOEFL ITP, Ibaraki University Internal Test
▶ Case study
GUI Development of IRT analysis programs for beginners: EasyEstimation series
Ryuichi Kumagai
Niigata University
In this paper, we discuss computer programs that we have developed for IRT analyses by beginners. In developing them, we attached importance to the following two points: (a) they have GUIs that are intuitive and easy to use, and (b) they are freeware that anyone can easily obtain. To verify the validity of the numerical results, we compared our programs with existing programs. The comparison showed that the numerical results of our programs were appropriate.
Keywords: Item Response Theory, Computer program, GUI
▶ Case study
What kind of ability is assessed by PBL (Problem-Based Learning: A new educational strategy in Japan). In comparison with the abilities assessed by the Common Achievement Test including CBT (Computer-Based Testing) and OSCE (Objective Structured Clinical Examination)
Manabu Miyamoto1, Yoshiaki Mori2, Takahiro Kubota1,2, Hiroshi Yoneda1,3
1Education Center, 2Physiology, 3Neuropsychiatry, Osaka Medical College
The relationship among the abilities assessed by PBL, CBT, and OSCE should be clarified in order to reform medical education. Fourth-year students were assessed by PBL in terms of attendance, performance, presentation, assignment, and a block written test. They also took the National Common Achievement Test based on CBT and OSCE. CBT provided 2 subtotal scores, and OSCE provided total competence scores and rating scale scores. Principal Component Analysis and Structural Equation Modeling showed that all the scores could be explained well by the PBL items. PBL ability, as a latent factor, correlated with OSCE ability with a coefficient of 0.39 and with CBT ability with a coefficient of 0.16. The standardized regression weights from PBL ability to each PBL item ranged from 0.45 to 0.92. Presentation was the most representative item. A group of examinees who performed poorly only on the written tests indicated that the knowledge-building process differs from the knowledge itself that is necessary for written tests. The PBL ability, based on self-learning and problem solving, has only a slight correlation with the abilities assessed by CBT and OSCE.
Keywords: Problem-based Learning, Common Achievement Test, CBT, OSCE, Principal Component Analysis
▶ Case study
On the Convergent and Discriminant Validity of the National Admission Test for Law Schools (NATLaS)
Sugisawa Taketoshi1, Uchida Teruhisa2, and Shiina Kumiko2
1Niigata University, 2The National Center for University Entrance Examinations
We investigated the relationship between scores on the National Admission Test for Law Schools (NATLaS) and scores on several tests measuring various abilities or traits in order to clarify what the NATLaS measures. Our results show that NATLaS scores correlate well with scores on a basic logical thinking test and a vocabulary test, whereas the correlations between NATLaS scores and scores on a questionnaire about attitudes toward critical thinking are low. The "Reasoning and analytical abilities" part of NATLaS correlates more strongly with the skills tested in the logical thinking test, and the "Reading comprehension and expressiveness" part correlates more strongly with the vocabulary test. These results suggest that each part of NATLaS accurately measures examinees' abilities as intended. Furthermore, a follow-up using results from another year's NATLaS shows that these results are reasonably consistent.
Keywords: National Admission Test for Law Schools (NATLaS), logical thinking, vocabulary, attitude toward critical thinking, validation study
▶ Case study
Detection of Differences in the Rating Characteristics of a Novice from Those of Experts on an Essay Test
Taichi Okumura
Graduate School of Education, Joetsu University of Education
How to train a novice to become a skilled rater of essay tests is a critical problem. In this article, we tried to detect the biases and characteristics of a novice's ratings, compared with those of experts, using hierarchical linear modeling. As a result, some characteristics of which the novice had never been aware were detected. Feeding such information back to novices can be useful for training them more effectively to become skilled raters of essay tests.
Keywords: Essay test, Rating, Experts, Novice, Training, Hierarchical linear modeling
▶ Case study
The effect of item use over a long time in computerized adaptive testing: The case of CASEC
Yasuko Nogami
The Japan Institute for Educational Measurement, Inc.
The effect of item exposure rate on changes in item properties and the quality of proficiency estimation is one of the most important concerns for practitioners of computerized adaptive testing. This article investigates the effects of high item exposure rates and repeated item presentations to the same examinee on item properties and proficiency estimation. In this study, I focused on the Computerized Assessment System for English Communication (CASEC), which is a commercially available computerized adaptive test. Simulated as well as real data were analyzed and compared in terms of the percentage of items answered correctly, item exposure frequency, and proficiency estimates of examinees. The results suggest that the effects of item exposure on item pollution and proficiency estimation were not so serious in the case of CASEC.
Keywords: computerized adaptive testing, item exposure rate, item pollution