Monday, October 30, 2023

Is it RESEARCH? AI Scoring: A Call For Readers and Thinkers

Think and reason with me? Pretty Please? 

CAVEAT: So, I really do think we are ok for AI scoring of the type of writing that STAAR will assess. CAVEAT: And, I think this will be better than what happened with scoring in the last administration. (I've seen pretty much identical papers get very different scores. And there's lots of confusion out there because of it about what kids need to do, even bad advice like just use fewer words.) 

But AI scoring is going to be a reality. It's already used effectively with TSI. It's already used with TELPAS. Not sure I can say effectively because I don't have any information or experience about how that's working. 

Now it's time to dig deeper. As I've been discussing things with folks I respect, these are some thoughts we should consider: 

Concerns: 

AI can be: 

  • technically flawed
  • unresponsive to originality and nuance
  • based on predictability formulas

Essential questions: 

In addition, is there consensus in the field about AI's value, validity, and reliability? Are we looking at real research? And what does the research we are given say? What other voices are out there? 

Background: 

Pearson reports that their scoring is "backed by research and [is] unparalleled in the field." At the end of the website are several articles. I'm waiting on access to some of these. In the meantime, let's take a closer look at what's offered as research.

I've started a spreadsheet here that we can use to collaborate. 

Call for research and analysis: 

As Joyce Armstrong Carroll and Abydos state: I'm [researching.] Won't you join me? 

Step One: 

Can you join me in reviewing these texts? We need an annotated copy of the articles. If you have information for the other cells, we need help in evaluating those details. What else should I consider? 

Step Two: 

What other peer-reviewed research of excellence should be considered? I've added a second tab for that analysis. 





Sunday, October 29, 2023

Hybrid/AI Scoring: Content and Implications for Instruction

AI Scoring: Content and Implications for Instruction

Something to Read for Background and Further Research:

Here's what we know about "hybrid" automatic scoring: https://www.pearsonassessments.com/large-scale-assessments/k-12-large-scale-assessments/automated-scoring.html

Be sure to look at this one too: https://www.pearsonassessments.com/large-scale-assessments/k-12-large-scale-assessments/automated-scoring/automated-scoring--5-things-to-know--and-a-history-lesson.html 

What's been going on? 

So, there's a LOT of data that has already been collected. (This has been going on in the background since at least 1994. And the IEA, the Intelligent Essay Assessor, probably used data and tech that preceded that.) Student papers. Rater scores. Rubrics. Processes. Refinements. 39 million responses were tested in 2019 by Pearson. And Pearson isn't the only one working on this stuff. 

What's happening in December 2023

The test has already been designed and field tested. 

The passages and genres and prompts have already been selected. The scoring guide has already been prepared by humans. They've decided what the possible answers are. They've decided what text evidence should match it. The humans have decided what wrong answers and evidence are probable. The humans have decided what paraphrases and synonyms are likely. 

Sample/anchor papers have been scored and uploaded into the machine/system. 

The machine/system is programmed with all of this information. The machine is programmed with the rubric.

When retesters in December submit their tests, the machine gets the papers. The machine/system already knows how sample papers scored. It scans the new submissions and compares to what it already "knows." A score is generated. 

When the system/machine is challenged, the writing gets sent to a human. Sometimes the human gets the paper and the system/machine gets the paper. The human and system/machine calibrate or recalibrate. This will happen about 25% of the time. Of papers? Of scoring attempts? Not sure. We just have the 25% figure. 

It's called hybrid because humans decide what the system/machine looks for. It's called hybrid because humans are scoring continuously between and with the machine. 
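To make that workflow a little more concrete, here's a toy sketch in Python. It is NOT Pearson's or TEA's actual system. The word-overlap similarity, the two anchor papers, the 0.5 threshold, and every function name are things I made up for illustration. It only shows the general shape described above: compare a new response to human-scored anchor papers, generate a score, and send the paper to a human when the machine isn't sure.

```python
# Toy illustration only: every name, anchor paper, and threshold here is hypothetical.

def words(text):
    """Lowercase a response and split it into a set of words."""
    return set(text.lower().split())

def similarity(a, b):
    """Word overlap between two responses: shared words / all words used."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def machine_score(response, anchors, threshold=0.5):
    """Return (score, needs_human).

    anchors: list of (anchor_text, human_score) pairs already scored by people.
    The response gets the score of its closest anchor paper. If even the
    closest anchor is a weak match, the paper is flagged for a human reader.
    """
    best_text, best_score = max(anchors, key=lambda pair: similarity(response, pair[0]))
    confidence = similarity(response, best_text)
    return best_score, confidence < threshold

# Hypothetical anchor papers with scores assigned by human raters.
anchors = [
    ("the author uses the storm to show fear", 2),
    ("i like storms", 0),
]

# A response that matches an anchor closely is scored by the machine alone.
score, needs_human = machine_score("the author uses the storm to show fear", anchors)

# A response unlike any anchor still gets a score, but is flagged for a human.
off_script = machine_score("my grandmother taught me to dance in the rain", anchors)
```

Notice that in this sketch the off-script response still gets a number, and the flag is the only thing that protects it. That is one reason the question of how often papers get routed to humans matters so much.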

Accuracy/Validity

Can a computer do this? Is it fair to kids and the ways they interpret the text? Some people will argue with me...but YES, it's fair. Much more fair than what we've had before. Why? Let's think about what's being assessed here. It's not really TEKS. And it's not really writing for ECR. 

We want to know - and it's a good thing to know - 
  • Can kids read and understand a text of any genre? 
  • Can kids follow written instructions? 
  • Can kids use data to make decisions? 
  • Do kids know how to read stuff at their grade level? 
  • Can kids make decisions about important ideas and communicate them in a variety of ways (correspondence, use information, argue, etc.)? 
When TEA released some of the information about the new test for ELAR, they gave us some really important information: they showed us a guide for raters. The guide included the text, possible answers, and possible text evidence that could support it. The guide explained what a good response would include with the rubric description. 

So...kids will read a passage (or passages.) They'll be given a prompt with a genre. The passage will have clearly supported answers and text evidence to match it. Texas teachers will have written, revised, and validated this information. The item will have been field tested and reviewed psychometrically. And all of that will be programmed into the computer. 

Here's how I think it will work: We teach kids to dig for the good stuff. 



 




Implications for Instruction: 

1. They don't want to, but kids are going to have to read the passages and the prompts in full. 
2. Deep comprehension is essential. 
3. Digital reading and digital composing skills must be transferred from physical reading and writing tasks. In other words, learners must see what this stuff looks like in practice IN the platform or some other digital platform. 
4. Readers and writers must learn beyond WHAT the tools are in the platforms and HOW they can be used to support reading and writing processes. The Cambium tutorial builds familiarity but offers NO information on how the tools aid the thinking processes. 
5. Text evidence is KING. And it must be connected to answering the question. 
6. One single acronym is NOT going to help kids answer all the ways this thing can be assessed. RACE isn't good enough. (I actually despise it because it causes bad results. More on that another time.)

Application: 

Here's what we tried with a group of kids who scored between 3 and 5 on their essays last year. It's a PowerPoint of notes and our steps that we used in working through the processes of revising their previous work. We pulled up the Cambium testing platform and used what they have available on the test. We learned a TON of stuff that kids said they did and didn't do; what they knew and didn't know. I still need to think about that for a while. But in the meantime, I hope this helps you make some instructional decisions and next steps. 



Hybrid/AI Scoring: Mechanics and Implications for Instruction

 



TEA kind of announced that AI is scoring stuff for ECR and SCR when they offered the testing administrator training this month. Ok. So we found out about it in a backhanded way. Moving on. 

My little rescue dog, Joy, is trained well. For one trick only: Be cute. Then she sits. Like Joy, the AI scoring machine will do exactly what it is trained to do, regardless of what we call it. For ECR/SCR on STAAR, the scoring will do exactly what it says it will do. 

We are going to be ok. It's time. We already have the technology to hear, transcribe, and interpret English vocabulary and syntax. Phones, text messages, emails, docs, etc. It's working pretty well. (I remember when Dragon Dictation started. Didn't work too well. But it learned and got better.) 

A friend of mine saw the video I posted on Facebook about this: 

Hi😊I just came across your video you posted after meeting with your favorite English teachers. I’m not savvy to all the things your referencing in the video, but I will say…if AI allows the teachers a way to grade the students essays, I’m all for it. I would love to have all those HOURS back with my mom, my favorite English teacher. It was just our normal at the time…but looking back, she spent so much precious personal time at home “grading papers” when I was growing up.
❤️


Wow. That kind of hurt. And, we know that marking papers is actually a waste of time. See my dissertation. And I can't tell you how many hours I wasted in carrying those bags of papers home. 

So the implications for instruction are: 

1. Make sure teachers and students have access to technology that can "score" and grade papers for mechanics. Even if it costs money. This should be provided by TEA if they are using it to score. Assessment must match instruction. AI scoring resources are now instructional materials. 

2. Stop editing (grading) papers for writers. Don't do something a computer can do. 

Now - our next treat is going to be what all of this means for scoring content. More on that next. 



Tuesday, October 24, 2023

Notes: “Pleasurable Poetry Analysis: Extracting Theme, Craft, Structure and Beauty without Tying the Poem to a Chair and Beating It…,” with Gretchen Bernabei (3-12)

 www.tinyurl.com/BeachBernabei2023

Who is someone you know with a quirky habit? Or one unusual behavior? Describe it. Write about that for 3 minutes:

My thoughts: 

Quicklist - Cousin Tommy? Joy? 

Joy thinks she is a guard dog. I don't need a guard dog. She's also a chow hound. I don't need that either. She knows only one trick: be cute. Actually, it means "sit" to her. So, when we are at the table, she "sits" thinking that she'll get a treat. When Dean is there, we don't ever really hear her because they share the plate and he offers small tidbits throughout the meal. But, when he's absent, things are very different. 

When we keep eating and talking, ignoring her, she makes a small, barely audible, "huff." As if, "excuse me, I'm being cute and no one is noticing." The huffs punctuate the dinner, coming more and more frequently. Until she feels like we can't hear her at all and her sharp exclamation of alarm pierces the air: and she is scolded for being not so cute. 

Nobody starts with a blank page. 

Read aloud from Possums

Read aloud from Possums again. Highlight 2-3 lines you really like and want to call your own. 

We shared our underlined phrases. Why did we underline it? What did the lines do to you? What were you thinking? Talk to me about that. 

(Thoughts: It always surprises me how long we talk about the poem and our reactions. It's like we do a full poetic analysis with talk before we ever do another thing. It happens like this in class, too. Just being humans and reacting to the words, our thoughts, confusions, reactions, nodding our heads and seeing more as we see what others have felt. "I don't think I would have noticed that if you hadn't said that." How does Gretchen know when to stop talking about the lines and when to move into the next steps of the lesson? Gretchen notices and names their noticing with the academic language: enjambment and poet choices: The poem is awkward representing the contrasting ideas, forcing us to stop our reading like the possum stops. We must remain motionless like the possum. The words are poking at us with a stick.)

From her book.

Showed us the chunks. Quirky thing.  We marked our copy, using different colors. 

Read the poem one more time to confirm the chunks and what they mean. We call this kernelizing. Pointed out that the text isn't the same size - that all paragraphs aren't the same size. It's just what each chunk does - tracks the movement of the mind like stepping stones (Thomas Newkirk). It's coherent! 

Can you predict what I'm going to ask you to do? Write for 10 minutes to restructure your writing or to write something new. What kind of poem will you create? 

Tricks for Gluttony

Sitting next to my dining room chair

the dog's acting cute: silent sitting, 

expecting me to deliver a reward

Because that's the only trick she knows:

"Be cute!" She sits for a piece of duck jerky.

I wonder: what do I do for treats? 

What am I waiting for people to notice? 

Until they don't and I under my breath 

Huff for a notice, a glance, a praise

Repeat the throat clearing as if unheard

Until a sharp piercing call interrupts

Irritating and Self-Focused Interference

Am I seeking what others can provide? 

A reward and noticing: superficial sustenance

when I'm already full and capable of feeding myself. 

My AHA: Wow. I didn't expect the depth that the structure caused. It shifted my prewriting to a new place! Exciting revisions. As I talked with Gretchen before the session, we talked about how we begin to see ourselves as writers and poets through this process. It made me think more about schema. Kids often approach comprehension from their own schema. Which is good for the reading until we go to the kinds of questions valued on STAAR. We don't want them to use schema of self. We want them to use schema of text. But...if they are working from a schema of authorship...if I were the poet of this piece...I'd be doing x to cause y...That's a kind of personal schema I can support in instruction about poems and author's craft. What a difference it makes when we experience ELAR as both reader and writer! 

It was hard for some in the crowd to write. She said, "Leave each other alone..." and shared what she would offer them for lessons. We laughed and talked. She reminded us of how she would monitor and grade in class. We shared with each other at our table. Then folks read aloud to the whole group. 

We covered 7 of the 8 TEKS strands (all but research.) 

Kids can write responses from their own poems and from the poems of others. The more kids experience both hats - reader/writer - the better equipped they are! 





 




Monday, October 23, 2023

Notes: Teach Rhymes with Beach: Gretchen Bernabei: What's Important (and What's Not)

 Literacy's Democratic Roots by Thomas Newkirk - drawing heavily from this because teaching is an act of patriotism. Fighting against ignorance to prepare a population that votes and perpetuates the experiment of democracy. 

Envelopes- 

tinyurl.com/BeachBernabei2023

Opening thought: from his first page, xiii: "If you enter...Cherish, hold dear." 

Think about what's important...Let's explore our own thoughts. What do we think is important in our classrooms? (Besides the humans.) 

Envelope: Two columns: Quicklist of important things: 

Life lessons you want to convey: (my thoughts)

1. Grace and acceptance

2. Doing your best is most important

3. Everything we do is preparation to do the next thing (my thoughts) 

Life skills you want them to walk away with

4. How to think and reason

5. How to consume and critique a text

6. How to write and create

Three practices you want them to learn (my thoughts) 

7. How to listen and respond

8. How to notice when something is wrong

9. How to find solutions when things aren't right

Add one more thing that's not already on the list (my thoughts) 

10. How to make friends and support others

Other Side: Not important

Things you did as a student that aren't important now (my thoughts) 

1. Diagramming sentences; outlining a chapter

2. balancing chemical equasions (can't spell math) 

3. write something and then type it

Things that you are required to do in classroom that are required by others (my thoughts)

4. write the objective on the board

5. read from a script

6. take attendance on time

Add to the list outdated things that you have to do (my thoughts) 

7. grading

8. CBA's

9. meetings

One more thing: 

10. Follow rules from assessment regime

Put lists to the side. Asked volunteers to come to the front. This is how she teaches kids to answer questions fully. (She goes through the process she explains in QA12345: Through dialogue.) 

Question; Answer; How do you know? Huh? What does that mean? How else do you know? Huh? What does that mean? So...your answer is...what? If we put the questions on mute, you would have heard a complete response. 

First person: What's your name? 

Second person: What town are we in right now? 

Third person: Is it hot outside today? 

She knows that when a student wants to do her role, they are on the way to success. She uses a lot of visual texts to begin the work. Told story of how her students used the structure with The Wizard of Oz. Used sentence stems from the website to ask the questions. These structures become internal and automatic. 

Return to your list. Choose three. (She shared the ideas she added to her envelope.) Circle three of them. Or four. :) Pick one that you wouldn't mind talking about with folks in this room. 

Open the envelope to write on the flap side. 

Kernel Essay: Topic at the top. The other kernels will go under the flap. What you write will go on the envelope outside. 

Answer/Claim/Opinion; One way I know; This means; Another way I know; This means/shows; And so, answer repeated.

CBA's should be outdated and abandoned. I know this because everyone is required to do them, but nothing much changes in learning and teaching. Why would we waste time doing something that makes no difference? Another way I know is that the scores on CBA's don't match the results we end up getting on STAAR. Kids who pass a CBA don't necessarily pass the STAAR or get those kinds of questions right on the final exam. Kids who fail a CBA don't necessarily fail the STAAR or get those kinds of questions wrong. So, CBA's should be outdated practices and abandoned as ineffective pedagogical practices. 

Shared out our work. Rhetorical triangle - someone writes something for somewhere for someone else to hear. It's not real communication until it's shared. It's an exercise for a school laboratory if only the teacher reads. 

Don't do it this way: This isn't very good. It's just off the top of my head. I didn't have time to revise.

Don't do it this way: Read from it and then ad lib to add. Just read it the way it is. 

Hear three people. 

Heard some from the crowd. 

Everything that has been written or read has structure. There are hundreds that you can use once you learn the concept. Picking what you want to use empowers students to explain what they mean. 

Other thoughts from Thomas Newkirk

See tinyurl for thoughts and resources. 






Notes: Reading Lenses-Deep Analysis with Jenny Martin at Teach Rhymes with Beach

 C.  “Reading Lenses--Deep Analysis,” with Jenny Martin (3-12) Is authentic reading engagement truly possible? Absolutely! Using the “Reading Lenses” strategy with rich texts, teachers will learn how to boost their students’ reading interactions, and immediately facilitate rigorous analysis. Students are empowered with confidence and skills that support STAAR reading and constructed responses. This session also helps teachers better understand the “thinking” within their TEKS. (3-12) 

Vocabulary: Listened to Speech on the Death of Dr. Martin Luther King Jr. by Robert F. Kennedy. As we listened, we coded the text for vocabulary: context clues (CC), Greek and Latin/Foreign (GL), and multiple meaning words (MM). Then we met in small groups to discuss and share ideas about how these words added to the message by discussing meaning. 

Author's Purpose and Craft: APC elements + Why are these choices made by the author? What's in the text that makes you believe and understand the purpose? We spent some time thinking about R. Kennedy's purpose. Discussed in table to find a common purpose. "Unification is needed." "Martin Luther is a legacy that tells us what we should focus on for unification. Honor MLK." "To inform about passing and to persuade the audience to act in a way to honor MLK's life purpose." Add what you liked to what you have already written. Then we highlighted things in the speech that supported that idea. 

Next: Explain or analyze how the use of text structure contributes to the author's purpose. We are working up to these kinds of questions. Use Gretchen Bernabei's rope and brand/kernelizing process to determine the structure. Use the text structure to explain how it's put together to deliver the message/purpose. Example Kernel: Big announcement: big loss; Who was this man? Where do we go from here? Here are our choices. Here's a poem that is a beautiful expression of the moment. Here's the plan. Nails the landing with: so we dedicate ourselves to... Name the speech structure: Honoring a Great Person. (Kids can write from this structure as well.) 

Next: How do the photograph, caption, and poem contribute to the author's purpose? Why are they there? What is the effect? How does it help you as a reader to connect to and understand the message? 

Next: Metaphor/personification: Highlight with a different color. Discuss impact and use. 

Next: Point of view - the best example comes from him having had a member of his own family assassinated.

Next: Mood and voice: word choice in opposition; repetition: difficult; How does this impact mood and voice? Discuss. Share out. 

Next: What makes it argumentative? Rhetorical devices/appeals and logical fallacies. Discussed logos, pathos, ethos. Exaggerations.  ID those. (Note, there are additional features including argumentative structures.) 

Next: Use the lens of text structure to reread - look for pitchforking. Highlight three examples of pitchforking. Discuss what they are doing rhetorically. Enter our writing and use them. (Note: wouldn't it be a good idea to use these for pieces of text evidence that supports ideas? Then we could explain each? Hmmm.) Shared out end of 9 and all of paragraphs 7 and 10. 9 uses parallelism and anaphora. 

The point is to re-enter the work repeatedly to deeply analyze. We can make sentence frames and have the kids imitate. These are all highly tested. We can also use this as a mentor text to model explicitly about the skills we are teaching. Kids are familiar with the text and we can focus on the skill over time. 

Next: Use question stems handout and respond in writing. Try two vocab stems. Try two or three author's purpose stems. 

6th Grade Question Stems

 


Notes: Teach Rhymes with Beach: Melanie Mayer's Keynote - Six Strategies for Success

1. The "Devil" Cards: Take a deck of cards. Write the kids' names on them. Ask a question. Have kids turn and talk. Then pull a card and have that person share. 

2. Charting the reading: Several ways to do this...

Take a piece of paper and fold it into four columns. 

Option One: Questions: What do you wonder? 

Words: A new/powerful word for me is ____. I think it means/adds clarity by ________ because _________. 

Summary: Who's the speaker? What's the subject? What's the situation? Why does it matter? 

Inference: The writer doesn't say directly, but I can tell from the evidence that...

(Change the chart to fit your focus, such as figurative language: simile, metaphor, personification)

Option Two: Text to self, text, world. Write a structured paragraph response. 

Option Three: Answer it, prove it (text evidence), explain it (use transitions such as "furthermore"), summarize it. 

3. Self-Assess and Peer Advice

Did you start sentences with capital letters? 

Did you write x number of sentences? 

Did you use transition words? (Or your focus for the day)

Did you capitalize I? 

Give yourself a check for each item. Revise and submit. 

Peer Advice: Two things you heard. Use word wall of writer/authorial choices to help name and give examples. Name two what if's of author choices that might improve the text. 

4. 5 Minute Grammar Lessons

Show examples from your writing. Let them notice the structure. Give them check points to evaluate. Have them write and share. Next day, use the structure to answer a question about the day's learning. Add it to a quiz.

5. Utilize Tiering

Gave the example of DOL bell ringer to find the mistakes with homonyms/multiple meaning words. Build several tasks...

Tier One: Find the mistakes. 

Tier Two: Find other homonyms and how they are used. 

Tier Three: Write your own examples or re-enter your writing. 

6. Establish a culture and climate of grace. 

GPS: You moron. You missed the exit again. I told you 4.6 miles ago to get in the right lane. What were you thinking? 

She doesn't even huff - she just says, "rerouting" or "recalculating." Our kids need this too. We do this by being accessible and approachable. We do this by being interruptible. We do this by using positive words and putting on our smiles with our outfit. 

Story - The average person lives to 76. That's a certain number of days. At 56, I've lived 20,440 of my days. I only have 7,300 left if I live to the American average. What I do matters...and I'm running out of time. I want to spend it doing the right things for those in front of me. Psalm 90:12 and Psalm 118:24. 
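The arithmetic behind that story checks out. Here's a quick sketch in Python, assuming 365-day years with no leap days, which is what the numbers appear to use. The variable names are just for illustration.

```python
DAYS_PER_YEAR = 365  # assumption: leap days are ignored

age = 56
average_lifespan = 76  # the average quoted in the keynote

days_lived = age * DAYS_PER_YEAR
days_left = (average_lifespan - age) * DAYS_PER_YEAR

print(days_lived)  # 20440
print(days_left)   # 7300
```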

Monday, October 2, 2023

Phonics IS the V in MSV

Gonna be tacky for a while. (This is actually an old issue, but continues to have serious, unintended consequences. A bit scared to post it, as people will probably attack me because they think I'm stupid and misinformed or slamming phonics. Which I'm not. Well, not stupid or ignorant about this issue.) 

 To those who are demonizing the 3 cuing systems...I wonder if they realize that phonics IS the V in MSV (Meaning, Structural, Visual). To those writing legislation to outlaw curriculum with a V in it, do they really want us to NOT look at the phonics? I can just imagine us rowing through books with blindfolds like Sandra Bullock in Bird Box. Ridiculous. 

Friends, MSV isn't the devil some claim it to be. Here's some information to consider.

Semantics: 

Visual - What are the letters? What are the sounds they make to produce the word in print? Phonics IS the visual component of language that we use to translate ink and pixels into sounds. 

"Does it look right?" isn't a suggestion to guess what the word is. It's not really even a strategy. It's a piece of the way the language functions. 

And it's cultural. 

Marie Clay was a New Zealander. They say, "Have a go" like Texans say "try it and see." "Does it look right" is a cue for kids to monitor and self-regulate. 

Let me translate this maligned question for those that don't understand the science: 

Here's what happens with the visual, phonics cue: 

  • The kid says a word after looking at print. 
  • Right or wrong, the teacher wants them to check their accuracy. 
  • Texan: "Kid, match what you heard yourself say with the visuals/letters in the print." 
  • Linguist: "Use your phonemic awareness of sounds and connect it with your understanding of phonics." 
Is a kid going to understand the Texan or the linguist? Bless your heart.

A Solution: 

Isn't it easier to ask kids if what they said looks like what's on the page instead of asking them to explain the linguistic processes and names of phoneme types? I thought we were trying to look at words to say/hear them so we can understand what the words mean. House or horse. It's a big deal. 

If a kid can tell you what a digraph is but can't tell you what their mouth does to make the word, is the linguist helping? Furthermore, neither the Texan nor the linguist will be successful in teaching reading if the kid says the word correctly and doesn't know the difference between a house and a horse. 

There's more to teaching reading than phonics. V is just one of many cues. (And for those that don't know the sciences, there's more than three.)