Automated scoring. I want to know more about this, so I've been reading and researching what's out there for teachers and practitioners and policy makers. What kind of research are these folks using to make decisions? WHAT IS this stuff and HOW does it work?
I'm researching - won't you join me?
Here's something from ETS about the difference between Human and Machine Scoring. I used a program called Kami to annotate, highlight, and point out features and ideas we should consider. I did find guidance on developing a litmus test for use, along with some terms and ideas for further investigation. What I didn't find was much research. I found terms worth digging into and programs/companies whose effectiveness needs investigating - but since those companies are making money selling this stuff and conducting their own "research", I'm not hopeful about finding valid and reliable data.
Points of Interest in the Document Linked Above:
- Be sure to look at Table 2 on page 5.
- Consider the bullet points on page 7 as you plan your own research and as a guide for what we can say to policy makers, our board members, and staff at TEA. Are these conditions being met? How? (A quick sketch of one common way to check human-machine agreement follows below.)
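For what it's worth, one statistic that comes up again and again in this literature is quadratic weighted kappa: how closely the machine's scores match trained human raters on the same essays, with bigger disagreements penalized more heavily. Here's a rough sketch in Python of how it's typically computed - the rubric range and the scores below are made up for illustration, not taken from any real STAAR or TELPAS data.

```python
# Quadratic weighted kappa (QWK): a common statistic for how closely
# machine scores agree with human scores on the same set of essays.
# All scores below are hypothetical.

from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two raters on an ordinal score scale.
    1.0 = perfect agreement, 0.0 = no better than chance."""
    n_ratings = max_score - min_score + 1
    n = len(human)

    # Observed matrix: how often each (human score, machine score) pair occurred.
    observed = [[0.0] * n_ratings for _ in range(n_ratings)]
    for h, m in zip(human, machine):
        observed[h - min_score][m - min_score] += 1

    # Marginal histograms, used to build the chance-agreement (expected) matrix.
    hum_hist = Counter(h - min_score for h in human)
    mac_hist = Counter(m - min_score for m in machine)

    numerator = 0.0
    denominator = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            # Quadratic weight: disagreeing by 2 points costs more than by 1.
            weight = ((i - j) ** 2) / ((n_ratings - 1) ** 2)
            expected = hum_hist[i] * mac_hist[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected

    return 1.0 - numerator / denominator

# Hypothetical scores on a 1-4 rubric for ten essays.
human_scores   = [2, 3, 3, 1, 4, 2, 3, 4, 2, 1]
machine_scores = [2, 3, 2, 1, 4, 3, 3, 4, 2, 2]

print(round(quadratic_weighted_kappa(human_scores, machine_scores, 1, 4), 3))
```

The number by itself doesn't settle anything - the questions in the bullet points above (how the data was collected, who checked it, whether the conditions for use are met) still matter more than any single agreement figure a vendor reports.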
In previous posts, I analyzed a series of citations given by Pearson, trying to dig deeper into how Texas is going to use this stuff with our kids on STAAR. (Pearson develops the system used for TELPAS speaking. Cambium develops the system used for STAAR and TELPAS writing. I'll be diving into that next.)
Scoring Process for Constructed Response: This post gives background on the scoring process and STAAR.
Review of Pearson's Commentary and Citations: Primarily, I was looking for actual research - evidence supporting the development of the programs Pearson has built. I didn't find research. And some of what I did find was flat-out alarming.