Tuesday, January 30, 2024

Scoring Process for STAAR Constructed Responses

TEA Statements about Essay Scoring

In December 2023, TEA released a document describing how constructed responses are scored. We already knew that retesters would have their work scored in a hybrid manner: a machine (an automated scoring engine, or ASE) would score all of the papers, and humans would score 25 percent of them. The news caused quite a stir. TEA staff doesn't want to call it AI. Semantics?

To help sort out what we know and what we don't, I've annotated the December document, Scoring Process for STAAR Constructed Responses.

Background and Resources for Consideration

Automated Essay Scoring

Since we don't really know ANYTHING about the ASE other than what is in this document, a friend and I started looking for background. 

ETS has an automated scoring engine called e-rater. They talk about it here.

This is a literature review about automated scoring engines. It will be important to read and consider these ideas as background while we wait for guidance on what the scoring engine actually is and who developed it. And here's the article from the DOI link. The article uses the same AES terminology that TEA uses in its document.

A literature review IS a respected form of research, and this article does describe its research method, the research questions that focused the study, and how the information was searched and selected for inclusion.

The authors are: 

  • Dadi Ramesh: School of Computer Science and Artificial Intelligence, SR University, Warangal, TS, India; Research Scholar, JNTU, Hyderabad, India
  • Suresh Kumar Sanampudi: Department of Information Technology, JNTUH College of Engineering, Nachupally, Kondagattu, Jagtial, TS, India


The aims and scope of the journal are listed here. They focus on Artificial Intelligence. 

Latent Semantic Analysis

Latent Semantic Analysis is a mathematical tool used in essay scoring. This study was conducted by researchers from Parallel Consulting and the Army Research Institute for the Behavioral and Social Sciences. They wanted to see whether automated scoring would work for a test they use called the Consequences Test, which measures creativity and divergent thinking. Their research process is clearly outlined and gives us some idea of how LSA worked for the Army in scoring the constructed responses on the exam. The study was published in the journal Educational and Psychological Measurement. 
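To make the idea concrete, here is a minimal sketch of what LSA does under the hood, using a tiny made-up corpus (the documents, vocabulary, and dimension count here are all hypothetical; real scoring systems train on thousands of essays): build a term-document matrix, reduce it with a truncated singular value decomposition, and compare documents by cosine similarity in the reduced "latent semantic" space.

```python
# Minimal LSA sketch with a hypothetical toy corpus -- not any vendor's actual engine.
import numpy as np

docs = [
    "the cat sat on the mat",                 # reference text
    "a cat rested on a mat",                  # paraphrase-like response
    "stock prices rose sharply today again",  # unrelated response
]

# Term-document count matrix: rows are vocabulary words, columns are documents.
vocab = sorted({w for d in docs for w in d.split()})
A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Truncated SVD: keeping only the top k singular values projects the documents
# into a k-dimensional latent semantic space.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(S[:k]) @ Vt[:k]).T  # one row per document in latent space

def cos(a, b):
    """Cosine similarity between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_related = cos(doc_vecs[0], doc_vecs[1])
sim_unrelated = cos(doc_vecs[0], doc_vecs[2])
print(sim_related > sim_unrelated)  # True: the paraphrase pair is closer in latent space
```

In a scoring context, the same machinery is typically used to compare a student response against pre-scored reference essays, so that responses landing near high-scoring references in the latent space receive higher scores.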


Where to Begin Research: 

As we know, Pearson and Cambium run STAAR, so I used the Pearson website on Automated Scoring to start looking for the research in the previous blogs and for the CREST presentation. Similar information can be found here for Cambium. As practitioners, until we have more answers from TEA about who designed the products, these are the only places we have to start examining what we're dealing with and the research behind it. For now, we have a place to begin informing ourselves. 

PS: Here is a document about human vs. machine scoring, and a copy of the TEA Constructed Response Workbook that was shared at the assessment conference. 