Japanese Journal for Research on Testing Vol.12 No.1 Abstract

JART Vol.12 No.1

▶ General research  
A simulation study on appropriate transformations of reliability coefficient in mixed-effects meta-analysis models
Yasuo Miyazaki (1), Taketoshi Sugisawa (2)
(1) Virginia Tech, (2) Niigata University
Reliability generalization is a meta-analytic technique used to synthesize the score reliability of an instrument across many studies. The concept is relatively new, so its methodology is not yet established; in particular, the appropriate transformation of the reliability coefficient is not well known. In this paper, a simulation study was conducted to examine which transformation of the alpha coefficient works best, generating a population of reliability coefficients within the framework of mixed-effects meta-analysis models. Six forms of transformation were compared to find a better transformation for reliability generalization. The results implied that the log and cube root transformations performed much better than the other forms. From the viewpoint of variance stability, the log transformation is recommended over the cube root transformation, since the former is a variance-stabilizing transformation while the latter is not.
Keywords: reliability coefficient, reliability generalization, meta-analysis, mixed-effects model, hierarchical
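As an illustration of the two transformations favored in the abstract above, the following is a minimal sketch rather than the authors' procedure. It assumes the log transformation takes the form ln(1 - alpha) and the cube root transformation takes the form (1 - alpha)^(1/3), which are forms commonly used in the reliability generalization literature; the abstract does not state the exact forms. The function names, the unweighted pooling, and the sample alpha values are all hypothetical.

```python
import math


def log_transform(alpha):
    # Assumed form: ln(1 - alpha); requires alpha < 1.
    return math.log(1.0 - alpha)


def log_inverse(t):
    # Back-transform from the log scale to the alpha scale.
    return 1.0 - math.exp(t)


def cube_root_transform(alpha):
    # Assumed form: (1 - alpha)^(1/3); requires alpha <= 1.
    return (1.0 - alpha) ** (1.0 / 3.0)


def cube_root_inverse(t):
    # Back-transform from the cube root scale to the alpha scale.
    return 1.0 - t ** 3


def pooled_alpha(alphas, transform, inverse):
    # Unweighted mean on the transformed scale, then back-transform.
    # A real meta-analysis would weight studies (e.g. by inverse variance)
    # and model between-study variance; this is only an illustration.
    transformed = [transform(a) for a in alphas]
    mean_t = sum(transformed) / len(transformed)
    return inverse(mean_t)


# Hypothetical alpha coefficients from three studies.
alphas = [0.70, 0.80, 0.90]

print("log-scale pooled alpha:", pooled_alpha(alphas, log_transform, log_inverse))
print("cube-root-scale pooled alpha:", pooled_alpha(alphas, cube_root_transform, cube_root_inverse))
```

The pooled value depends on the scale used for averaging, which is one reason the choice of transformation matters in this setting.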
▶ Case study  
Judgment Standards for Test Item Disclosure for Official Examinations in Japan ― Cases on The Information Disclosure System ―
Masako Wakabayashi, Kazunari Sugimitsu
Foundation for Intellectual Property
There are diverse values to consider when deciding whether test items should be disclosed. However, previous studies have not comprehensively discussed test item disclosure in light of these diverse values. The aim of this study is to obtain judgment standards for test item disclosure. We surveyed multiple cases dealing with test items under the information disclosure system of Japan and investigated the details of each case comprehensively. As a result, the following viewpoints were obtained: (1) the necessity of ensuring transparency through test item disclosure, (2) the reuse of test items in future examinations, (3) the acceptability of test preparation using former test items, (4) the acceptability of the increased burden of developing new test items, and (5) the information management of test items. This study suggests that these viewpoints are applicable as judgment standards for test item disclosure in Japanese official examinations.
Keywords: Test item, Disclosure, Standard, Official examination, Reuse
▶ Case study  
The verification of validity in TIMSS 2011 mathematics data in Japan using multidimensional item response theory
Yutaro Sakamoto
Recruit Management Solutions Co., Ltd.
While evidence-based discussion of education is said to be needed, quality assurance of tests is also important. It has been argued that the significance of measuring constructs correctly needs to be reaffirmed and that previous studies are insufficient. The present study examined the validity of the TIMSS 2011 mathematics data for Japan using multidimensional IRT. In addition, it investigated what the subscales "knowing," "reasoning," and "applying" measure, using a bifactor model and item information. The results showed 23 items on which the group factors had more impact than the general factor, revealing characteristics that unidimensional IRT cannot express. In other words, multidimensional IRT can express characteristics of the constructs that this test tries to measure, which suggests its applicability.
Keywords: construct validity, multidimensional IRT, bifactor model
▶ Review  
A Review of Item Response Models for Performance Assessment
Masaki Uto, Maomi Ueno
The University of Electro-Communications
Performance assessment has attracted much attention in various assessment fields, such as entrance examinations, employee evaluation, and educational assessment. It makes it possible to assess examinees' practical and higher-order skills, which are difficult to assess with traditional paper tests. In a typical performance assessment, examinees' performances on multiple tasks are evaluated by multiple raters. However, it has been pointed out that the reliability of such assessment depends strongly on the characteristics of the raters and tasks. To improve reliability, item response models that incorporate rater and task characteristic parameters have been proposed. Earlier studies reported that these models can improve the reliability of performance assessment because they estimate examinee ability while accounting for the characteristics of raters and tasks. When applying them to actual performance assessments, selecting the optimal model for the assessment situation is important. Therefore, this paper reviews previous item response models that incorporate rater and task characteristic parameters and explains their characteristics. Furthermore, it proposes an approach for selecting an optimal model for a given assessment situation and demonstrates the effectiveness of the models through a real data application.
Keywords: Item response theory, performance assessment, reliability, rater characteristics, multi-way data
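To make concrete what "rater and task characteristic parameters" can mean, the following is a minimal sketch of one assumed model of this kind: a many-facet Rasch-type rating model in which the log-odds of adjacent rating categories is ability minus task difficulty minus rater severity minus a step threshold. The abstract does not say which models the review covers, so this form, the function names, and all parameter values are hypothetical illustrations.

```python
import math


def category_probabilities(theta, task_difficulty, rater_severity, steps):
    # Probabilities of rating categories 0..K-1 under an assumed
    # many-facet Rasch-type model. `steps` holds the K-1 step thresholds.
    # Category k has unnormalized log-weight equal to the sum over the
    # first k steps of (theta - task_difficulty - rater_severity - step).
    logits = [0.0]
    running = 0.0
    for step in steps:
        running += theta - task_difficulty - rater_severity - step
        logits.append(running)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probs):
    # Expected rating given the category probabilities.
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical values: the same examinee and task, rated by two raters.
steps = [-1.0, 0.0, 1.0]
lenient = category_probabilities(0.5, 0.0, -0.5, steps)
severe = category_probabilities(0.5, 0.0, 1.0, steps)

print("expected score, lenient rater:", expected_score(lenient))
print("expected score, severe rater:", expected_score(severe))
```

Under this assumed model the same performance receives a lower expected rating from the more severe rater, which is the kind of rater effect such models try to separate from examinee ability.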