Japanese Journal for Research on Testing Vol.8 No.1 Abstract

JART Vol.8 No.1

▶ General research  
Evaluation of Admitted Applicants' Abilities of an Entrance Examination Using Composite Scores of a Few High Scores of Several Subtests
Kenichi Kikuchi
Toho University
In Japan, university entrance examinations often use a composite score formed from a few high scores among several subtests. In this case, to evaluate the scholastic abilities of applicants or admitted applicants, a standard score called the Z-score is usually used. For example, applicants' or admitted applicants' raw scores are standardized using the mean and the standard deviation based on all applicants of the National Center Test. In this paper, we discussed the relationship between the means of all applicants' raw scores and admitted applicants' Z-scores when a composite score of a few high scores is used. As a result, we found that the mean of all applicants' raw scores affects the mean of admitted applicants' Z-scores: if the mean of raw scores on a subtest is high, the mean of admitted applicants' Z-scores on that subtest is also high. Finally, we pointed out some concerns about analyses of this selection method.
Keywords: subtest with a high score, admitted applicant, standard score, analysis of scholastic ability, entrance examination
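
The standardization and the "few high scores" composite described in this abstract can be illustrated with a minimal sketch. The function names and the sample scores are hypothetical and are not taken from the paper; the sketch only assumes that Z-scores are computed from the mean and standard deviation of all applicants.

```python
from statistics import mean, pstdev

def z_scores(raw_scores, reference_scores):
    """Standardize raw_scores with the mean and SD of reference_scores."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)  # population SD of the reference group
    return [(x - mu) / sigma for x in raw_scores]

def top_k_composite(subtest_scores, k):
    """Composite score: sum of the k highest subtest scores of one applicant."""
    return sum(sorted(subtest_scores, reverse=True)[:k])

# Hypothetical raw scores on one subtest (illustration only).
all_applicants = [40, 50, 60, 70, 80]
admitted = [70, 80]

# Admitted applicants are standardized against ALL applicants.
admitted_z = z_scores(admitted, all_applicants)

# Hypothetical applicant with three subtests; the best two are used.
composite = top_k_composite([55, 80, 62], k=2)
```

Here the admitted applicants' Z-scores depend on the mean and spread of the whole applicant group, which is the quantity the abstract relates to the admitted applicants' mean Z-score.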
▶ General research  
Estimation of Equating Coefficients Using Estimated Population Distribution in Common Examinees Designs
Ryuichi Kumagai1, Hiroyuki Noguchi2
1Tohoku University, 2Nagoya University
When the equating coefficients of two IRT-based scales are estimated by the Mean and Sigma method in common examinees designs, we sometimes obtain inappropriate estimates due to the error variances of the examinees' latent trait estimates. Noguchi and Kumagai (2011) proposed a correction method for such cases and examined its validity using a simulation study. However, we found that this method could generate inappropriate estimates in some cases. In this paper, we propose a new estimation method using estimated population distributions, which can reduce the effect of the error variance. In simulation studies, when the effect of the error variance was very large, this method estimated the coefficients more accurately than either the simple Mean and Sigma method or the Noguchi and Kumagai method. Using PISA 2006 data, we examined the effectiveness of this method and found that it was not affected by the error variances of the examinees' latent trait estimates.
Keywords: common examinees design, equating coefficients, Mean & Sigma method, distribution of population
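
For orientation, the simple Mean and Sigma method mentioned in this abstract can be sketched as follows. This is only the uncorrected baseline, not the correction of Noguchi and Kumagai (2011) or the method proposed in the paper. It assumes a linear relation theta_Y = A * theta_X + B between the two scales; the function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def mean_sigma(theta_x, theta_y):
    """Simple Mean and Sigma equating coefficients (A, B).

    theta_x, theta_y: latent trait estimates of the SAME examinees
    on scale X and scale Y (common examinees design).
    """
    a = pstdev(theta_y) / pstdev(theta_x)      # slope: ratio of SDs
    b = mean(theta_y) - a * mean(theta_x)      # intercept: matches the means
    return a, b

# Hypothetical, error-free estimates related by theta_y = 2 * theta_x + 0.5.
theta_x = [-1.0, 0.0, 1.0]
theta_y = [-1.5, 0.5, 2.5]
A, B = mean_sigma(theta_x, theta_y)
```

Because the slope is a ratio of observed standard deviations, estimation error in the latent trait estimates inflates both standard deviations, generally by different amounts, which is the source of the inappropriate estimates the abstract refers to.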
▶ General research  
Relationship between values of a test and strategies in learning for and reviewing a test: Test approach-avoidance tendency as a mediator
Masayuki Suzuki
Graduate School of Education, The University of Tokyo / Japan Society for the Promotion of Science
This study investigated the relationship between values of a test and the strategies used in learning for and reviewing a test. We employed test approach-avoidance tendency as a mediator variable for evaluating this relationship. Data were collected from 493 high school students using a self-reported questionnaire. The results showed that strategies used in learning for a test were related to those used in reviewing a test. This suggests that teaching students strategies to learn for a test might be useful in fostering effective strategies to review a test. In addition, the results indicated that the test-approach tendency promoted effective strategies, and that students who considered a test an effective way to improve their learning strategies and create a learning program had a higher test-approach tendency. Furthermore, values of a test exhibited a direct relationship to learning strategies that were controlled.
Keywords: strategies in learning for a test, strategies in reviewing a test, values of a test, test approach-avoidance tendency, learning English
▶ General research  
A comparison of equating methods based on item response theory for the purpose of constructing an item bank
Haruhiko Mitsunaga, Shin-ichi Mayekawa
Tokyo Institute of Technology
In this study, we compared the concurrent calibration method and the separate calibration method for item parameter equating in a type of test design in which both the set of anchor items and the set of newly written items are used for scoring. It is also assumed that the number of anchor items per form is smaller than the number of newly written items, and that the latter will be added to the item bank after each test administration. A simulation study showed that, when the item discrimination parameter values increase with time, the separate calibration method performed better than the concurrent method in terms of the recovery of the true parameter values.
Keywords: common-item nonequivalent groups design, item bank, concurrent calibration, separate calibration, test equating, item response theory
▶ General research  
Developing Two Equivalent Spatial Ability Tests for Myanmar Middle School Students
Nu Nu Khaing1, Tsuyoshi Yamada2, Hidetoki Ishii3
1Sagaing Institute of Education, 2Okayama University, 3Nagoya University
Interest in spatial ability has been increasing in education because it can predict educational and professional success. However, in Myanmar, there was little awareness of the importance of spatial ability, and there were no typical spatial ability tests. Therefore, in this paper, two new equivalent multiple-task spatial ability tests were developed. These tests were designed as multiple-task spatial ability tests to measure more aspects of spatial ability than single-task tests. To develop the tests, a two-parameter logistic model (2PLM) of item response theory (IRT) was utilized. Consequently, two equivalent tests composed of 40 items were developed.
Keywords: spatial ability, multiple-task test, item response theory, two parameter logistic model
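
As background for the two-parameter logistic model named in this abstract, the item response function can be sketched as below. The function name and parameter values are hypothetical. The scaling constant D is an assumption: it is set to 1.0 here, while some treatments use D = 1.7, and the abstract does not say which convention the authors adopted.

```python
import math

def p_correct_2pl(theta, a, b, D=1.0):
    """Probability of a correct response under the 2PLM.

    theta: examinee's latent trait
    a:     item discrimination
    b:     item difficulty
    D:     scaling constant (assumed 1.0 in this sketch)
    """
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# Hypothetical item with discrimination 1.2 and difficulty 0.0.
p_at_difficulty = p_correct_2pl(theta=0.0, a=1.2, b=0.0)
p_above = p_correct_2pl(theta=1.0, a=1.2, b=0.0)
```

When theta equals the item difficulty b, the probability of a correct response is one half, and it increases with theta at a rate governed by the discrimination a.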
▶ General research  
Computer Adaptive Test Based on Latent Rank Theory: A Proposition of the Algorithm and its Validation
Tetsuo Kimura1,2, Keizo Nagaoka2
1Niigata Seiryo University, 2Waseda University
The purposes of this paper are 1) to propose an algorithm for a computer adaptive test (CAT) based on latent rank theory (LRT), after reviewing the characteristics of LRT, 2) to determine the number of items required for actual CAT implementation using a simulation study, and 3) to discuss further research topics by validating the proposed algorithm after administering the actual CAT. This paper uses a simulation study to examine the consistency between the true and estimated latent ranks, the degree of convergence in rank membership profile (RMP, which describes test takers’ latent ability in LRT), and the number of items needed to satisfy CAT termination conditions. As a result, a projection is given of the number of items needed to administer CATs, and flaws in the current item bank are also revealed. In addition, the proposed algorithm was reconsidered and further topics were discussed, such as manipulation of item difficulty and evaluation of RMP convergence.
Keywords: latent rank theory, computer adaptive test, simulation
▶ General research  
A comparison of item response models for a test consisting of testlets
Naoya Todo
Graduate School of Education, The University of Tokyo
A set of items that share a common stimulus is called a testlet. Responses to the items in a testlet often violate the assumption of local independence. In this article, we compared three types of IRT models that take local dependence into account with standard IRT models in terms of the accuracy of latent trait estimation. Simulation results showed that the values of bias, root mean square error, and correlations between the latent trait and its estimates were similar for the four types of models regardless of the sample size, while the proportion of locally dependent items and the degree of local dependence affected the accuracy of latent trait estimation.
Keywords: item response theory, local independence, local dependence, latent trait
▶ Case study  
Report on an experimental English listening test utilizing converted speech sounds by changing fundamental frequency
Teruhisa Uchida, Kei Ito, Takamitsu Hashimoto, Tatsuo Otsu
Research Division, the National Center for University Entrance Examinations
In high-stakes testing, it is critically important to secure the content of the tests. Aiming to disguise the identity of the speaker in listening tests, this study applied digital voice-transformation technology to change types of voices. Experimental listening comprehension tests were constructed using converted voices. There were two experimental conditions: a low-pitch speech condition that evokes a person of large build and a high-pitch speech condition that evokes a person of small stature. The test scores in these two conditions were compared with the scores obtained with the original voices. There was no significant difference among them. The results suggest that this digital voice conversion technology can be applied for various purposes, including concealing the speakers' identities, although there is room to further improve the quality of the converted voices.
Keywords: listening comprehension test, concealment of test items, types of voices, speech signal processing
▶ Case study  
What outcomes are assessed in the elective clinical trainings for the final-year medical students? -The results of competence assessment composed of 16 items concerning with knowledge, skills and attitude-
Manabu Miyamoto, Ayako Miyazaki, Koichi Suzuki, Hiroshi Yoneda
Education Center, Faculty of Medicine, Osaka Medical College
To resolve the shortage of doctors, we need to train good doctors with excellent knowledge, skills, and attitude within a given period. Clinical training for undergraduates needs to be enriched, and it becomes important to determine what outcomes should be taught and assessed in that training. The elective clinical trainings for final-year medical students were performed in the university hospital and in external community hospitals, including clinics, in rotations of 2-4 weeks each over 12 weeks from April through July. We analyzed 376 fully completed assessment results out of 456 in 2010 and 334 assessment results out of 403 in 2011. The assessment form was composed of 16 items, including 4 items for knowledge, 7 items for skills, and 5 items for attitude, as derived by Structural Equation Modeling. The path coefficients between the three constructs and their related items were calculated. The covariance between the "Knowledge" and "Skills" constructs was 0.938. Those between "Communication and Attitude" and "Knowledge" or "Skills" were 0.749 and 0.778, respectively. Scores on each construct were estimated by Item Response Theory. The group with excellent results had high scores in all three constructs, whereas the underachievers tended to have very low scores in one of the three constructs.
Keywords: assessment, outcomes, elective clinical clerkship, knowledge, skills, communication and attitude, Structural Equation Modeling
▶ Case study  
The effect of item format in a Japanese language comprehension test on the test-taker's response: An empirical investigation using a test for junior high school students
Kazuhiro Yasunaga1,2, Makoto Saitoh1, Hidetoki Ishii1
1Graduate School of Education and Human Development, Nagoya University, 2Research Fellow of the Japan Society for the Promotion of Science
The purpose of this study was to examine whether item format on a Japanese language comprehension test in a high school entrance examination affects the proportion of correct answers and item discrimination. The test was administered to a total of 493 third-year junior high school students. Four variations of item presentation were used: 1) choosing a sentence from the text (the first 5 words or free-response), 2) reading tasks (retrieving information or developing an interpretation), 3) presentation style using blanks (the same form throughout or varied), and 4) limiting the length of the answer column. It was found that 1) does not affect the proportion of correct answers or discrimination. Further, 2) leads to a higher proportion of correct answers and higher discrimination. In addition, for 3), varying the presentation of blanks results in higher discrimination. Finally, 4) leads to the highest proportion of correct answers and discrimination.
Keywords: a Japanese language comprehension test, item format, item analysis, proportion of correct answers, discrimination