Japanese Journal for Research on Testing Vol.3 No.1 Abstract


JART Vol.3 No.1

▶ General research  
Effects of stems and/or option preview on item difficulty and discrimination
Kozo Yanagawa
University of Bedfordshire, Centre for Research in English Language Learning and Assessment
The purpose of this study is to examine the effects of item stem preview and/or answer option preview in multiple-choice listening comprehension tests on item difficulty and item discrimination. Three item formats of the MCQ listening comprehension test were addressed in the present study: (1) both preview (printed item stems and answer options provided prior to listening); (2) answer option preview (printed answer options provided prior to listening; printed item stems after listening); and (3) item stem preview (printed item stems provided prior to listening; printed answer options after listening). Item format is referred to as stems and/or option preview in this study.
The results of the experiments revealed that the item format of MCQ listening comprehension tests affects item difficulty. The mean score on format (2) was found to be significantly lower than that on format (1) or (3). This can primarily be attributed to nine items on format (2), which showed higher item difficulties than on the other two formats. No significant difference was found in item discrimination power across the three formats.
Keywords: listening comprehension tests, multiple-choice questions, item difficulty, item discrimination, preview of stems and/or answer options
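For readers unfamiliar with the two indices compared in the abstract above, the following is a minimal sketch of their classical test theory definitions. The abstract does not state which indices the author computed, so the choice of proportion correct for difficulty and the item-rest point-biserial correlation for discrimination is an assumption, as are the function names.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (0/1 scoring)."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Point-biserial correlation between the item score and the rest score.

    The rest score is the total score with the item itself excluded
    (an assumed, commonly used variant).
    """
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    sx, sy = pstdev(item_scores), pstdev(rest_scores)
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined without variance; report 0
    mx, my = mean(item_scores), mean(rest_scores)
    cov = mean((x - mx) * (y - my) for x, y in zip(item_scores, rest_scores))
    return cov / (sx * sy)
```

Each row of `responses` is one examinee's vector of 0/1 item scores. Comparing these two values for the same item across formats (1) to (3) corresponds to the kind of comparison the abstract describes.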
▶ General research  
An attempt for the scale selection by non-linear logit IRT models: An application for the EI scale
Ikko Kawahashi1, Hideki Toyoda1, Mayumi Sakurai2, Kenichi Sasaki2, Masato Yokoi2, Toru Watanabe2, Mieko Idaka2
1Waseda University, 2EI research Inc.
In this paper, we suggest a method that enables scale selection suited to a given purpose without preparing an item pool. We applied two non-linear logit IRT models to the EI (Emotional Intelligence) scale, which aims to measure a business manner related to EI. One model has a θ-squared term in the logit, and the other has a log(θ) term in the logit. As a result, we constructed two scales that differ in discrimination power but correlate highly with an external criterion (ordinary nominal response model). In particular, it was clear that the first model has higher discrimination power at high trait levels than the ordinary nominal response model. Therefore, for practical use of the EI scale to select persons with a high trait level, the first model is the suitable choice.
Keywords: Nominal response model, Multinomial logit model, non-linear logit, MCMC
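The two logit forms mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the parameterization, so the form below is an assumption: each category logit is a_k·θ + c_k, plus an extra coefficient b_k (a hypothetical name) multiplying either θ² or log(θ).

```python
import math


def nominal_probs(theta, a, c, b=None, term=None):
    """Category probabilities of a nominal response model.

    term=None     : ordinary logit  z_k = a_k*theta + c_k
    term="square" : adds b_k * theta**2 to each logit (assumed form)
    term="log"    : adds b_k * log(theta); requires theta > 0 (assumed form)
    """
    if term == "square":
        extra = theta ** 2
    elif term == "log":
        extra = math.log(theta)
    else:
        extra = 0.0
    if b is None:
        b = [0.0] * len(a)
    z = [ak * theta + bk * extra + ck for ak, bk, ck in zip(a, b, c)]
    m = max(z)  # subtract the maximum for numerical stability
    w = [math.exp(zk - m) for zk in z]
    total = sum(w)
    return [wk / total for wk in w]
```

With a positive `b` on the keyed category, the squared term makes the response curve steeper as θ grows, which matches the abstract's remark about higher discrimination at high trait levels.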
▶ General research  
Adaptive Test by Minimum Entropy Criterion: A problem of test information function
Yasuharu Okamoto
Japan Women's University
A case in which the information function is inadequate for adaptive testing was demonstrated by simulation. In adaptive testing, the assumption of independent and identical distributions is not satisfied, and at the beginning of the test the amount of data is not sufficient to ensure that the information function gives an asymptotic representation of the distribution of the parameter estimate. The simulation showed that these conditions can induce a contradictory discrepancy between the true distribution of the parameter estimate and the distribution represented by the information function. Adaptive testing by a minimum entropy criterion, instead of the information function, was proposed. The minimum entropy criterion was adopted for psychophysical measurement by Kontsevich and Tyler (1999), and the method was named the Ψ method. The behavior of adaptive testing by the minimum entropy criterion was checked by simulation, which demonstrated that the method with a uniform prior distribution works as well as the one based on the information function with respect to estimation of the ability parameter, although it induces substantial conservative biases in the case of a normal prior distribution. It was pointed out that, as one of the Bayesian approaches, the proposed method is worth further investigation to inspect its merits and demerits under various conditions and to exploit its multidimensional characteristics.
Keywords: adaptive test, information function, iid, entropy, posterior distribution
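The selection rule described in the abstract above can be sketched as follows: for each candidate item, compute the posterior entropy expected after observing the response, and administer the item that minimizes it. This is only an illustration under stated assumptions. The two-parameter logistic response model, the discrete ability grid, and all function names are choices made here, not details taken from the paper.

```python
import math


def p_correct(theta, a, b):
    """Two-parameter logistic item response function (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def expected_entropy(posterior, grid, item):
    """Posterior entropy expected after administering `item`."""
    a, b = item
    total = 0.0
    for correct in (True, False):
        like = [p_correct(t, a, b) if correct else 1.0 - p_correct(t, a, b)
                for t in grid]
        joint = [p * l for p, l in zip(posterior, like)]
        marginal = sum(joint)  # predictive probability of this response
        if marginal > 0:
            total += marginal * entropy([j / marginal for j in joint])
    return total


def select_item(posterior, grid, items):
    """Index of the item with the smallest expected posterior entropy."""
    return min(range(len(items)),
               key=lambda i: expected_entropy(posterior, grid, items[i]))
```

Here `posterior` is the current distribution of ability over `grid` (a uniform or normal prior before the first item), and each entry of `items` is an `(a, b)` pair of discrimination and difficulty. Because the criterion uses the whole posterior rather than a point estimate, it does not rely on the asymptotic argument behind the information function.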
▶ General research  
Automatic Skill Assessment Using Human Motion Data
Ogata Hiroyuki1, Kawai Gaku2, Yamamoto Saeko3
1Faculty of Science and Technology, Seikei University, 2Former Graduate Student, School of Engineering, Seikei University, 3Graduate School of Engineering, Seikei University
This paper discusses a method for automatically assessing the skill of an examinee through a performance test involving a continuous motion, as in craftwork or sports. The motion data are obtained using a motion capture device. Generally, examinees' motions are not identical even if they are at exactly the same skill level, because they may differ in body size and build. Motions originating from the same examinee will differ as well. Instead of comparing motion data directly, we treated this as a classification problem that estimates the skill level from the motion data. As the motion is essentially a time series, it is first replaced with some postures that can be determined uniquely, so that it can be transformed into vector data. The substitute data are then used to estimate the skill level. A putt swing was taken as an example to verify the applicability of this method. For the experiment, 556 motion data sets were obtained from 30 examinees whose skill levels were known in advance. The vector data constructed from 4 postures extracted from the motion data were suitable for estimating the skill level using the nearest neighbor method.
Keywords: performance testing, skill assessment, motion data, motion capture device, putt swing
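The final classification step described in the abstract above can be illustrated with a minimal nearest neighbor sketch. It assumes that each swing has already been reduced to a fixed-length posture vector. The Euclidean distance, the use of a single neighbor, and the skill labels are assumptions, since the abstract does not specify them.

```python
import math


def nearest_neighbor_skill(query, labeled_vectors):
    """Return the skill label of the stored vector closest to `query`.

    `labeled_vectors` is a list of (vector, label) pairs built from
    examinees whose skill level is known in advance.
    """
    best_label, best_dist = None, math.inf
    for vector, label in labeled_vectors:
        d = math.dist(query, vector)  # Euclidean distance (assumed metric)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

A new swing is then assessed by converting it to the same vector form and calling `nearest_neighbor_skill` against the stored reference data.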
▶ General research  
What Leads to Nonresponses for Open-Ended Questions? A Data Analysis of the Japanese Test for 6th Grade Students in the Gunma Prefecture Achievement Test for Middle School Students
Hidetoki Ishii
Graduate School of Education, University of Tokyo
According to the results of PISA, administered by the OECD, and other tests, Japanese students possess relatively low reading literacy and tend not to respond to many open-ended questions. This article examines what leads to nonresponses to open-ended questions. Data from the Gunma Prefecture Achievement Test for Middle School Students were analyzed. The results showed that the nonresponse rate was higher when items required students to state their own thoughts apart from the text than when items required them to write answers along the text. The nonresponse rate was also relatively low among students who could enjoy learning activities of speech reading of descriptive passages, writing, and grammar. Moreover, the nonresponse rate was relatively high among students who could not understand their classes well overall.
Keywords: open-ended question, nonresponse, achievement test, middle school students
▶ General research  
Development of Social Skills Inventory for Middle School Students (SSI-M)
Niwako Sugimura, Hidetoki Ishii, Yiping Zhang, Hiroshi Watanabe
Graduate School of Education, University of Tokyo
There are several social skills scales available in Japan. However, most of the scales for children and adolescents are intended for screening for school adjustment problems. We developed an inventory, the Social Skills Inventory for Middle School Students (SSI-M), that can assess various levels of social skills in normative populations and that is culturally and developmentally appropriate for Japanese students in grades five through nine. The inventory was administered to approximately five hundred and fifty students from a junior high school in Tokyo. Results from an analysis using the group principal axis method revealed eight components: 1) Assertion with Teachers, 2) Relationship-building, 3) Conversation, 4) Emotion Regulation, 5) Group Activity, 6) Assertion with Peers, 7) Relationship-maintenance, and 8) Basic Manner. The internal consistency, stability, and validity of the SSI-M were confirmed.
Keywords: social skills, inventory, middle school students, reliability, validity
▶ General research  
Horizontal and vertical equating of large scale English proficiency tests: Trends of English proficiencies of students preparing for university examination
Ryuichi Kumagai1, Daisuke Yamaguchi2, Mariko Kobayashi2, Masahiko Beppu2, Takafumi Wakita3, Hiroyuki Noguchi3
1Niigata University, 2Kawaijuku Educational Institution, 3Nagoya University
In this study, we conducted horizontal and vertical equating of large-scale English proficiency tests administered by Kawaijuku Educational Institution. The tests were administered in May, August, and December from 1995 to 2005 (33 test sets). A common-subject design using anchor tests was adopted. Over the seven months from May to December, English proficiency increased. Trends in proficiency were also discussed. The trend of traits was the same as in earlier research; however, the amount of change differed from the previous research.
Keywords: equating, English proficiency test, item response theory
▶ General research  
Correlations between public examinations and assessment of English classes at Niigata University
Ryuichi Kumagai, George Gotoh, Naoko Nakaume, Tadashi Shibayama
Niigata University
At Niigata University, a new curriculum of English education was introduced in the fiscal year 2005. As a result, the TOEIC IP test was given to all first-year students. In this research, the relationship between the results of the TOEIC IP test and the results of the National Center Test for University Entrance Examinations (NCUEE) and "Standard English" was examined. Results indicated that there was a .62 correlation between the total TOEIC score and marks on the NCUEE. There was also a .32 correlation between the total TOEIC score and marks in Standard English. Moreover, the correlation between Standard English and each skill (or total score) of TOEIC varied depending on the teacher and the class, which had been streamed according to students' achievement. A regression analysis in which the NCUEE score was the independent variable and the total TOEIC score was the dependent variable indicated coefficients of determination of .85 for teacher and .81 for class. It is suggested that these results can be used for selecting classes in terms of the teacher's characteristics or students' achievement.
Keywords: Standard English, TOEIC, the National Center Test for University Entrance Examinations, English education
▶ General research  
Japanese Law School Admission Test: Summary on four years of administration and future directions
Tadahiko Maeda1, Hiroyuki Noguchi2, Tadashi Shibayama3, Akira Fujimoto4, Masahiro Fujita5, and Yoshikazu Sato6
1The Institute of Statistical Mathematics, 2Nagoya University, 3Niigata University, 4Shizuoka University, 5National Graduate Institute for Policy Studies, 6Miyagi National College of Technology
The Japanese Law School Admission Test (JLSAT) has been conducted four times since 2003. The test consists of three sections of multiple-choice items, which purport to measure abilities and skills in logical reasoning (section 1), analytical reasoning (section 2), and reading comprehension (section 3), respectively, and an essay section for assessing the examinees' writing skill. In this paper, using the data obtained through the past four administrations, we examined the internal consistency of the three multiple-choice sections, and correlated the section scores with external test scores in order to evaluate the construct validity of the JLSAT. The results indicated that the psychometric properties of the three sections of the JLSAT were fairly stable during the past four years, and the coefficient α of the total score ranged from around 0.75 to 0.80. The pattern of correlations with two external tests supported the construct validity of the JLSAT. While we confirmed the stability of the psychometric properties of the JLSAT, we further discussed that the evaluation of predictive and content validity is essential to corroborate our statement about the validity of the JLSAT.
Keywords: Japanese Law School Admission Test, classical test theory, construct validity, internal consistency
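The coefficient α reported in the abstract above is the standard internal consistency index of classical test theory, α = k/(k−1) · (1 − Σ item variances / total score variance). A minimal sketch follows. Whether the authors computed it over individual items or over section scores is not stated, so treating each column as one component is an assumption.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a matrix of component scores.

    `scores` holds one row per examinee and one column per component
    (item or section). The total score variance must be non-zero.
    """
    k = len(scores[0])
    component_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(component_vars) / total_var)
```

Population variances are used for both the components and the total. Using sample variances throughout gives the same value, because the correction factor cancels in the ratio.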
▶ General research  
Design of the National Admission Test for Law Schools and Investigation of Its Stability Based on Empirical Data
Kumiko Shiina, Taketoshi Sugisawa, Katsumi Sakurai
The National Center for University Entrance Examinations
The design of the National Admission Test for Law Schools (NATLaS) and the results of an empirical study on its stability and validity are stated in this paper. The distribution of the NATLaS scores and the proportion of item difficulties are very similar between administrations. Reliability coefficients of the NATLaS are sufficiently high, and scores in the subtests show a reasonably positive correlation for all administrations as a result of the high control of the quantity of items and the variety of item types. We obtained validity evidence for the NATLaS in the form of moderate positive correlations between the NATLaS and legal subject tests in the selection data of one law school. The pass rate of the national bar examination for students who completed their law degree was estimated based on a hypothetical model using the mean NATLaS score of enrolled students at each law school. A similar trend between estimated values and actual ones was considered to be evidence for the predictive validity of the NATLaS.
Keywords: National Admission Test for Law Schools, stability, reliability, validity
▶ General research  
Application of Hierarchical Linear Models to Educational Research and Viewpoints of Utilizing the Results to Educational Policies
Yasuo Miyazaki
Virginia Polytechnic Institute and State University
Hierarchical Linear Modeling is one of the most frequently used statistical methodologies for analyzing educational and developmental data that show a nested data structure, and it has been recognized that this methodology provides useful information when applied to large-scale educational data. Unfortunately, researchers in Japan have not yet had much exposure to this methodology, and thus not many studies have been conducted using this technique. This article has two objectives. One is to introduce the methodology to researchers who wish to apply it to their own research, by providing two sets of typical analyses, an organizational study and growth modeling, using a National Education Longitudinal Study (NELS) data set. The other is to provide checkpoints for utilizing the results in educational policies from the standpoint of research design and statistical analysis.
Keywords: Hierarchical Linear Model, Nested Data Structure, Organizational Study, Growth Model, Large-scale Educational Research Data
▶ General research  
Proposing a fairness research in social psychology over university admissions in Japan: Analysis of the "Admission Office Examination"
Dai Nishigori1, Naoki T. Kuramoto2
1Graduate School of Educational Informatics, Education Division, Tohoku University, 2Center for the Advancement of Higher Education, Tohoku University
University admissions have been considered high-stakes selection processes in Japan. Consequently, fairness issues in university admissions have always been targets of social concern. The concept of fairness, however, is not simple. It should be considered and treated as a complex psychological construct. In the present article, we analyzed the subjective fairness perceived by university freshmen, who had been university candidates until quite recently, from the viewpoints of fairness theories in social psychology. As a result, we obtained some noteworthy findings from the viewpoints of allocation principles in distributive justice and rules in procedural justice. We also demonstrated the usefulness of social psychological fairness research. It is extremely important to point out that there can be no "perfectly fair system" on which everyone agrees. This fact implies that it is realistic to seek reasonably fair situations that may minimize dissatisfaction in terms of fairness among the people concerned.
Keywords: university admissions, fairness, social psychology, distributive justice, procedural justice
▶ General research  
Scholastic Achievement Structure of National Center Test 2006 by Self-Organizing Map
Kojiro Shojima
The National Center for University Entrance Examinations
The scholastic achievement structure in the data of the National Center Test 2006 was exploratorily analyzed using a self-organizing map (SOM). Our target population was a group of 389,235 able-bodied high school graduates (12th graders). First, the simple statistics of all 34 subjects were computed. We then used the full-information maximum likelihood method to identify the mean, covariance, and correlation structure of the main 19 subjects. Second, we did an exploratory analysis of the scholastic achievement structure of NCT2006 by SOM, and extracted eight clusters (layers). We also examined gender differences in scholastic achievement.
Keywords: scholastic achievement, The National Center Test, self-organizing map, full-information ML