Ideas in Testing Research Seminar Schedule, November 15, 2024

 

Coffee & Networking (9:00 — 9:50)

Welcome and Introduction (9:50 — 10:00)

AI and item generation (10:00 — 10:55)

Assessing ChatGPT's Proficiency in Generating Accurate Questions and Answers for Statistics Education — Meredith Sanders, Nancy Le, & Alison Cheng (University of Notre Dame) abstract slides

Artificial Intelligence and Testing — Kirk Becker & Paul Jones (Pearson) abstract slides

An agenda and checklist for psychometric research using generative AI — Alan Mead (Certiverse) & Chenxuan Zhou (Talent Algorithms) abstract slides

Comparison of Zero-Shot, RAG, and Agentic Methods of Generating Items Using AI — Alan Mead (Certiverse) & Chenxuan Zhou (Talent Algorithms) abstract slides

Break (10:55 — 11:05)

CAT and item formats (11:05 — 12:00)

Pros and cons of compositional forced choice measurement — Austin Thielges & Bo Zhang (University of Illinois Urbana-Champaign) abstract slides

Two-phase Content-balancing CD-CAT Online Item Calibration — Jing Huang, Yuxiao Zhang, & Hua-Hua Chang (Purdue University) abstract slides

Polytomous Item Sets in CAT: Novel Representations for Online Calibration — Zhuoran Wang & William Muntean (National Council of State Boards of Nursing) abstract slides

Lunch (12:00 — 1:00)

Group differences, discrimination, and the law (1:00 — 1:55)

Finding Words Associated with DIF: Predicting and Describing Differential Item Functioning using Large Language Models — Hotaka Maeda (Smarter Balanced) & Yikai Lu (University of Notre Dame) abstract slides

An Update on AI Discrimination Laws — Scott Morris (Illinois Institute of Technology) abstract slides

Modeling Diversity-Validity Tradeoffs: A Comparison of Pareto- and Multi-Penalty Optimization — Hudson Pfister, Amanda Neuman, Tony Lam, & Scott Morris (Illinois Institute of Technology) abstract slides

Break (1:55 — 2:00)

Test design/test constructs (2:00 — 2:55)

The Work Disability Functional Assessment Battery (WD-FAB) — Michael Bass (Northwestern University) abstract slides

Faking Detection Using Item-level Machine Learning — Chen Tang (American University), Bo Zhang (University of Illinois Urbana-Champaign), Zheting Lin (CCCC Highway Consultants Co., Ltd.), Jeromy Anglim (Deakin University), & Jian Li (Beijing Normal University) abstract

Assessing Engagement in Academic Contexts: A Multi-Method Validation — Nancy Le & Alison Cheng (University of Notre Dame) abstract

Break (2:55 — 3:00)

Working with open-ended responses (3:00 — 3:55)

Bridging Constructs Underlying Quantitative and Textual Data: A Joint Factor-Topic Model — Yuxiao Zhang, David Arthur, Yukiko Maeda, & Hua-Hua Chang (Purdue University) abstract

Head-to-Head: Comparing AI versus Human Categorizations of Open-Ended Survey Responses — Nicholas Williams, Lidia Martinez, & Tara McNaughton (American Osteopathic Association) abstract slides

Assessing Topic Recovery in Open-Ended Responses: The Effects of Sample Size, Document Length, and Similarity Threshold — Xiyu Wang, Yukiko Maeda, & Yuxiao Zhang (Purdue University) abstract slides

Closing comments (3:55)

Questions about the seminar may be directed to Alan Mead, Scott Morris, or Kirk Becker. We hope you will join us.
