Monday, October 23, 2023

Notes: Teach Rhymes with Beach: Melanie Mayer's Keynote - Six Strategies for Success

1. The "Devil" Cards: Take a deck of cards. Write the kids' names on them. Ask a question. Have kids turn and talk. Then pull a card and have the person share.

2. Charting the reading: Several ways to do this...

Take a piece of paper and fold it into four columns. 

Option One: Questions: What do you wonder? 

Words: A new/powerful word for me is ____. I think it means/adds clarity by ________ because _________. 

Summary: Who's the speaker? What's the subject? What's the situation? Why does it matter? 

Inference: The writer doesn't say directly, but I can tell from the evidence that...

(Change the chart to fit your focus, such as figurative language: simile, metaphor, personification)

Option Two: Text to self, text to text, text to world. Write a structured paragraph response. 

Option Three: Answer it. Prove it (text evidence). Explain it (use transitions such as furthermore). Summarize it. 

3. Self-Assess and Peer Advice

Did you start sentences with capital letters? 

Did you write x number of sentences? 

Did you use transition words? (Or your focus for the day)

Did you capitalize I? 

Give yourself a check for each item. Revise and submit. 

Peer Advice: Share two things you heard. Use the word wall of writer/authorial choices to help name them and give examples. Offer two "what ifs": author choices that might improve the text. 

4. Five-Minute Grammar Lessons

Show examples from your writing. Let them notice the structure. Give them checkpoints to evaluate. Have them write and share. The next day, use the structure to answer a question about the day's learning. Add it to a quiz.

5. Utilize Tiering

She gave the example of a DOL bell ringer: find the mistakes with homonyms/multiple-meaning words. Build several tasks...

Tier One: Find the mistakes. 

Tier Two: Find other homonyms and how they are used. 

Tier Three: Write your own examples or re-enter your writing. 

6. Establish a culture and climate of grace. 

GPS: You moron. You missed the exit again. I told you 4.6 miles ago to get in the right lane. What were you thinking? 

She doesn't even huff - she just says, "rerouting" or "recalculating." Our kids need this too. We do this by being accessible and approachable. We do this by being interruptible. We do this by using positive words and putting on our smiles with our outfits. 

Story - The average person lives to 76. That's about 27,740 days. At 56, I've lived 20,440 of my days. I only have 7,300 left if I live to the American average. What I do matters...and I'm running out of time. I want to spend it doing the right things for those in front of me. Psalm 90:12 and Psalm 118:24. 

Monday, October 2, 2023

Phonics IS the V in MSV

Gonna be tacky for a while. (This is actually an old issue, but continues to have serious, unintended consequences. A bit scared to post it, as people will probably attack me because they think I'm stupid and misinformed or slamming phonics. Which I'm not. Well, not stupid or ignorant about this issue.) 

 To those who are demonizing the 3 cuing systems...I wonder if they realize that phonics IS the V in MSV (Meaning, Structural, Visual). To those writing legislation to outlaw curriculum with a V in it, do they really want us to NOT look at the phonics? I can just imagine us rowing through books with blindfolds like Sandra Bullock in Bird Box. Ridiculous. 

Friends, MSV isn't the devil some claim it to be. Here's some information to consider.

Semantics: 

Visual - What are the letters? What are the sounds they make to produce the word in print? Phonics IS the visual component of language that we use to translate ink and pixels into sounds. 

"Does it look right?" isn't a suggestion to guess what the word is. It's not really even a strategy. It's a piece of the way the language functions. 

And it's cultural. 

Marie Clay was a New Zealander. They say, "Have a go" like Texans say "try it and see." "Does it look right?" is a cue for kids to monitor and self-regulate. 

Let me translate this maligned question for those that don't understand the science: 

Here's what happens with the visual, phonics cue: 

  • The kid says a word after looking at print. 
  • Right or wrong, the teacher wants them to check their accuracy. 
  • Texan: "Kid, match what you heard yourself say with the visuals/letters in the print." 
  • Linguist: "Use your phonemic awareness of sounds and connect it with your understanding of phonics." 
Is a kid going to understand the Texan or the linguist? Bless your heart.

A Solution: 

Isn't it easier to ask kids if what they said looks like what's on the page instead of asking them to explain the linguistic processes and names of phoneme types? I thought we were trying to look at words to say/hear them so we can understand what the words mean. House or horse. It's a big deal. 

If a kid can tell you what a digraph is but can't tell you what their mouth does to make the word, is the linguist helping? Furthermore, neither the Texan nor the linguist will be successful in teaching reading if the kid says the word correctly and doesn't know the difference between a house and a horse. 

There's more to teaching reading than phonics. V is just one of many cues. (And for those that don't know the sciences, there are more than three.)

Wednesday, August 30, 2023

A process and resources for authentic STAAR remediation, response, instruction, and assessment: An English I Case Study and Suggestions


Premise: Before we collect more data, let's intervene on the data we have. 

Premise: Our work does not have to look like a STAAR test. 

Premise: There's a lot of thinking and work to be done BEFORE you can answer a question. Most of that isn't a TEK. 

Premise: Answering multiple choice questions is a completely different skill than any of our TEKS. Answering questions is about thinking and reasoning that you do while reading the source text and while analyzing the question and choices.

Premise: The English course is essentially the SAME course for all grades. Hate me if you want, but listening, speaking, reading, writing, viewing, and thinking are the same K-12. The only difference is text complexity. 

Premise: Thinkers must be able to examine a text and make decisions without the use of questions, both in their own writing and in that of others. 

Premise: Skills and content are best practiced and mastered when we mess with and apply those things in our own writing.

Explanation and Process: 

Prioritizing what we have for specific purposes

  • Use the 2023 released exam for the previous grade level for analysis and instructional material. 
  • Use the 2023 ECR and SCR samples. Use them for exemplars. Use them for sample texts to revisit/revise. Give them to students so they can grapple with their responses (or lack of response) and improve. 
  • Use the 2022 full length practice exam for crafting common based assessments or formative assessments. (I know, it's not psychometrically aligned, but it's what we have. You could also use and revise some of the previous STAAR exams.) 
  • Save the 2023 released exam for the current grade for the benchmark in December or January.

Resources for English I:

For Instruction:

Analysis and Tools for 8th Grade Released Exam

Powerpoint (in draft to show the process, more to come as time permits) 

For Assessment:

The Benefits of Learning to Play an Instrument: Text and Sample Prompts for ECR, SCR


Tuesday, August 15, 2023

2023 STAAR Extended Constructed Response Analysis, Interpretations, and Recommendations

The extended constructed response seems like a hinge-point because success depends on what kids DO while they are reading and responding. Success on these items can be a gateway to success on the other items. 

So - here's some things to think about as you are interpreting results, designing interventions, and planning initial instruction. 

By the way - If you didn't know, you have access to ALL the writing students have done on the exam. It's important to print those out for the kids you have. Revision is the best way to improve writing. And it's a good place to start in helping students become aware of their thinking and reasoning. Eventually, we'll have PDFs of it all; right now, we're having to clip, cut, and paste. Your testing coordinator can give you access in the system to see student results and will have the PDFs when they're available. 

Here's a hyperlinked PDF of the analysis that you can use in your PLCs. 

Saturday, August 12, 2023

So many ECR Zeros on STAAR? Why?

Check out these scorepoint graphs: 



Why so many zeros? Well...it was new. And 3rd graders are NINE. But perhaps we need to look at the obvious...or what is obviously hidden before we start diving into TEKS.





Here's an analogy - did you know that if you break open a cottonwood twig, there is a STAR of Texas hidden inside? (Project Learning Tree has a cool lesson about it here.) The star was there all along, but you didn't know it was there. 

In the TECH world, they study the "user experience." Kids all over the state told us that they didn't have an ECR on the assessment. Why did they think that? Remember - most kiddos are taking this test on a tiny Chromebook screen with no mouse. Here's what they saw: 


Semantics: 

Students are asked to write an argumentative essay. Not an ECR. Could the problem be semantics? Kids need to know they will be asked to write a composition or an essay. (And not an "S A." Before my kids saw the word essay typed out, they thought I was just saying two letters.) ECR is assessment item-type vocabulary and is not used with kids on the test. Same goes for passages - that's what we call them. That's not the academic language used on the assessment - STAAR says "selections." Our academic synonyms might be causing some of the confusion. 

User Experience: 


And...did they SEE a place to type the essay? Sure, there's a little arrow that says to scroll down. But on some computers, the scrollbar that shows there is more to see doesn't appear until you hover over the right-hand side of the screen; on others it does. 

That's a wonky user experience for something kids aren't familiar with. With which kids aren't familiar. Whatever. You get the point. 

Our solution: Kids need to be beyond familiar with the TECH elements. They need to be fluent with them. And our words need to match their experience - "You'll be asked to write an essay or a composition. You'll need to scroll down or use your arrow keys to see where you need to type." 



Font and Text Features: 

Another problem is the way the paired passages appear. 

See how these scroll on the same "page"? It looks like "Laws for Less Trash" might be a subheading for "Rewards for Recycling." 
That's a problem. Kids need to make sure they are attending to the bold material at the beginning of the passage and know that "selections" means two different passages. Another semantics issue. 
But this is also a text feature issue. I didn't see any subheadings in the third-grade test, so I had to go to the fourth-grade test. 

See how the titles of the passages are center justified and set in a larger font? Now compare to the subheading: left justified and a smaller font. These are text features that serve as cues for readers about what the author is doing as well as when a new passage/selection appears. 

It will be important to teach kids how to understand and decode the text features and navigation elements (like that little almost invisible grey arrow on the right hand side of each screen that says there's more below and you need to scroll down.) 



Using Cambium Components for Self-Regulation: 


The very top row is designed to help students understand and see how much of the test they have completed with the blue bar and the percentage complete. They know how much energy they need to use and how much time to save. But they can also see where they have and have not answered questions. The little red triangle tells kids where they have NOT answered questions. This would have been a huge cue to students that they'd missed answering the essay question. But did they know it was there? Were they fluent in using the tools to monitor their progress and to check for completion? 

Before we start digging into TEKS (and especially accountability ratings and class rosters), let's do some talking with kids about their experience and their approach when taking the exam. Our solutions can start with modeling how the platform works and using similar tools during daily classroom instruction so that students are fluent with technology experiences beyond a familiarity with a tutorial or mention of the tools. Let's make sure students understand the user experience and how to use the tools to enhance their comprehension and demonstration of grade level curriculum. Until then, they're walking in a Texas creekbed under the cottonwoods, not knowing about the hidden treasures all around them. 



Monday, May 1, 2023

Readability, TEA and Hemingway, DOK and Worms

Questions from the Field

I wanted to get your thoughts and opinion about reporting DOK and Bloom's information with items and readability formulas with passages.

DOK was not designed for how people are using it. I talked with Norman Webb himself about it. Even got a picture and an autograph. DOK was supposed to match the rigor of the curriculum to the assessment, NOT the rigor of the questions themselves. And this distinction is worth noting. Questions are not written at DOK levels; the curriculum is. DOK is supposed to measure a gap between the assessment and the standards. So...writing questions at DOK levels skips important context and grounding from the standards. Does TEA use DOK to write questions? I'd like to see their charts. 

Bloom's: This is the model I recommend for Bloom's.

Research Ideas and Next Steps: 
  1. When we pair the KS with the SE, what is the Bloom Level? (for each breakout in the assessed curriculum) 
  2. When we look at the item types, what is the connection between Bloom and DOK? (for each item type and TEK) This will have to build over time because we will only have certain item types for certain TEKS for a while. And it will vary by grade. 
  3. Does TEA's interpretation of Bloom and DOK on the assessment match the curriculum? 
  4. Once we can see the gaps/alignment, then we can make some decisions for metrics, practice, and instructional interventions. 
    1. What do these metrics tell us about student performance/growth? 
    2. How do these metrics inform Tier One Instruction? 
    3. How do these metrics help us form progressions of learning and identification of pseudoconcepts that result in refined teaching and intervention for Tier Two Instruction? 
    4. How do these metrics help us write better items and give better feedback to students, parents, and teachers? 
We are taking our cue from TEA to use F/K readability scores and the "Hemingway" app they recommend, so I feel like the info we are collecting is TEA-ish-aligned, but is it the type of readability score you think teachers will want to see or care about?

Thanks for sharing this. I didn't know they were recommending Hemingway. I had to do some research. What do you mean by F/K readability? Flesch-Kincaid? Hemingway uses an algorithm similar to F/K, but somewhat different: the Automated Readability Index. 

Commentary: I think TEA's move here is a good one; however, all readability formulas are flawed. I like that Hemingway's formula uses length of words (loosely associated with vocabulary) and length of sentences (directly associated with T-units and idea density/complexity). 
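
If you want to see how those two ingredients drive the numbers, here's a rough sketch of both formulas as usually published (Flesch-Kincaid Grade Level and the Automated Readability Index). The sample sentence and the crude vowel-group syllable counter are my own stand-ins for illustration; this is not the Hemingway app's exact implementation.

```python
import re

def words(text):
    return re.findall(r"[A-Za-z']+", text)

def sentences(text):
    # naive split on ., !, ?
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def syllables(word):
    # crude approximation: count runs of vowels
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    w, s = words(text), sentences(text)
    syl = sum(syllables(x) for x in w)
    return 0.39 * (len(w) / len(s)) + 11.8 * (syl / len(w)) - 15.59

def automated_readability_index(text):
    w, s = words(text), sentences(text)
    chars = sum(len(x) for x in w)
    return 4.71 * (chars / len(w)) + 0.5 * (len(w) / len(s)) - 21.43

sample = "The cat sat on the mat. Then the cat took a very long nap in the warm sun."
print(round(flesch_kincaid_grade(sample), 1))         # short words, short sentences: low score
print(round(automated_readability_index(sample), 1))  # ARI can even go negative for very simple text
```

Both scores climb with longer words and longer sentences; they just define "long word" differently (syllables versus characters), which is part of why the same passage can land on different numbers in different tools.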

Note that Hemingway and The Automated Readability Index are not really grade level descriptions that teachers are used to. These numbers are NOT grade level markers that we see in Guided Reading, DRA, Fountas and Pinnell, Reading A-Z, Reading Recovery, or even in Lexiles. These measures do not measure the grade level of the text, but describe the amount of education a reader would need to understand the texts. TEA is using these measures to defend the passages. Teachers use readability measures to match readers to texts they can read easily on their own, texts that are useful for instruction, and texts that will be too frustrating to read alone. It would be a mistake for teachers to use Hemingway to match readers to texts because that's not what it does. 

Hemingway is more about CRAFTING text for readers so they will be successful.  The purpose of the scale is what is important here: how do you write in a way that most people can understand your message and purpose? Writing for people with 9th or 10th grade education levels is ok, but many people aren't that proficient. The Hemingway app and measures help you simplify your writing so that it lands where people with 4th to 6th grade experiences can understand what you intend to convey. Again (as we saw with DOK), we have a disconnect between purpose and how the thing is being used. 

We cannot provide Lexile scores for a few reasons (cost of license being primary), but we can provide some more content-based and not just language-based readability formulas, such as might be seen in Fountas-Pinnell readers.

Lexiles. Eye roll. So many better measures out there. Glad they aren't useful to you.

Content-based measures. Hmmmm. That's problematic semantically. I wouldn't say that Fountas-Pinnell readers are content-based measures, as their levels are also language and text-feature based. In ELAR, there really isn't any content past early phonics and some grammar. The rest is process. I know of NO way to measure content levels. 

Do you see a need/want for that among teachers, or is a simple language-based tool like F/K enough in your opinion?

What I see here is the potential for confusion. Already we have a mismatch between TEA recommendations of using Flesch-Kincaid and an app that uses something different. In addition, the semantics and purpose seem similar but have distinctions in practice that confound their use and application with matching students with texts, measuring growth, selecting curriculum materials, writing assessments, and planning instruction for both reading and writing. What a mess! There's a military term I'd like to use here...

Here's another wrench in the works - as if we didn't have enough to worry about: When you use these formulas to measure the readability of questions as units of meaning (like passages), questions are FAR FAR FAR above grade level on any measure. Questions are dense, complex little beasts that no one is talking about at any grade level in any content area. 

Grade 3 Full Length Practice Questions 1-3 analyzed by Hemingway: 
[Screenshots: Hemingway readability results for questions 1-3]

As you can see, using TEA's own recommendation, 3rd graders would need the reading experience of a fourth or fifth grader just to answer the first three questions on the assessment. And that's after reading the passage. The more I look at this stuff, the more I believe we aren't measuring curriculum or student growth or any of the things we think we are measuring.  
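
To make that concrete, here's a small check anyone can run. The question stem below is invented for illustration (it is NOT an actual STAAR item), but it's written in typical question-stem style, and the Automated Readability Index that Hemingway is built on puts a single sentence like it far above a third-grade score.

```python
import re

# Hypothetical question stem, invented for illustration only (not an actual STAAR item).
stem = ("Which sentence from the selection best supports the idea that "
        "the author wants readers to change how they handle their trash?")

words = re.findall(r"[A-Za-z']+", stem)
chars = sum(len(w) for w in words)
num_sentences = 1  # the stem is a single sentence

# Automated Readability Index: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43
ari = 4.71 * (chars / len(words)) + 0.5 * (len(words) / num_sentences) - 21.43
print(round(ari, 1))  # one long, dense, question-style sentence lands far above a grade 3 score
```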

Initial thoughts on solutions: 1) Give a language-based, classical formula that teachers understand. 2) Give a second, grade-level "experience" measure for comprehension and schema or reader's ease. This gives us a chance to help teachers understand what TEA is doing here. (Reminds me of Author's Purpose and Craft. TEA has a specific purpose for their readability stuff. It's about making sure they can defend (to the legislature and parents) what kinds of texts they are using to assess the curriculum. Teachers have different reasons - ones that have to do with supporting a child and their growth instead of the validity of the assessment.)
 
Secondly, we have been tracking Bloom's and DOK as quickly-trained evaluators (I had a three-hour seminar years and years ago at xxx; xxx has had some haphazard PD as a middle school teacher over the years). As you no doubt know, for a STAAR assessment, we find a lot of DOK 2 / Bloom's "Analyzing" items, and so it seems like it might not be the most useful metric, but we are also not experts and might be missing some subtleties between TEKS and items that would give a more varied report. So my question is two-part. Do you agree that we are likely to see similar DOK/Bloom designations across many items, and if so (or not) is this information you think teachers will want or could use in classroom instruction or reteach? Is the information bang worth the research and editorial review bucks for DOK and Bloom? And perhaps DOK is appropriate and Bloom's not (I kind of lean this way personally)? So that's four questions, I guess. :) 

Can-o-worms. Earlier, I described problems with DOK and questions. If you are not matching the question and the curriculum to determine the DOK, then the data you get from that doesn't do what most think it would do. So...that has to be fixed first. 

Do I think we are likely to see similar DOK/Bloom designations across many items? My first response is: People tend to do what they have always done. So yes. TEA tends to do all kinds of things from one design to the next. So no. 

My second response is: How does any of that help the teacher? We see this ongoing work in training for unpacking standards. But honestly, if TEA isn't transparent about what they think is DOK or Bloom's, then we are guessing. Do we have solid instructional practices and understanding of the curriculum that LEAD to success on these descriptions? Labeling them without that seems like a waste of time to me. Teachers might want to put DOK and Bloom's as compliance measures for item analysis or in lesson plans, but honestly...what does this change for the instructional impact on students? "Oh, it looks like 75% of our kids missed DOK 2 questions." Now what? 

My third response is this: We haven't even gotten results back. Districts and people are downtrodden and devastated and confused. They are all feeling a little nutty about everything. It's too early to even know or make a good decision. I'm wondering if we are making all of this so much more confusing than it ought to be. Mom always says, "Rest in a storm." That never feels good in a storm. But what good are we going to do if we try to build a DOK nest in a tornado? 

Is this information you think teachers will want or could use in classroom instruction or reteach? 

I don't know. Do we know what kind of instruction fixes DOK problems? I'm not sure we do. Is the DOK or Bloom's level what's actually causing the problem? For lots of reasons, I don't think so. There are too many variables, and too many of them cross-pollinate and create varieties of problems that didn't exist before or for everyone. There are SO many instructional implications before we ever get to a question and its level that are not being addressed. It seems counterintuitive to fixate on DOK before we know and repair the underlying issues. 

Here's an example. A local school here decided they wanted a greenhouse and a tennis complex. Funds were acquired. The community was excited. Structures were built. Programs began. Years passed. In the nearby school building, walls and whole classrooms threatened to collapse. Know why? There was no problem with the structure and quality of the building. The contractors had done masterful construction that should have lasted a century or more. The greenhouse and tennis complex had been built on top of the carefully planned and placed drain fields that carried water away from the sodden clay of our Panhandle soil. The problem isn't the structure of the building/question/DOK. The problem is how the whole system worked together. 

Is the information bang worth the research and editorial review bucks for DOK and Bloom? And perhaps DOK is appropriate and Bloom's not (I kind of lean this way personally)? The problem is that we have to make decisions now when we don't have the land survey to tell us how things are built and how we should proceed. 

It's a crapshoot. We might spend a lot of our resources to make something that isn't useful. We might make something that looks good and attracts attention but isn't consequential for helping those we want to serve the most: our students. I'm more "meh" on Bloom's as well. I just can't bring myself to care much about either one when I consider all the things that need to be fixed before labeling a question with a level that we can't validate. I also think the question types themselves indicate their own DOK. 

Sunday, March 12, 2023

Should we take the interim? And then what? Part One

Draft

Should we take the interim? 

Yes. 

Reason One: It's a giant research study to see if it is possible to measure growth over time instead of on a one-day, high-stakes assessment. That's what the legislation originally asked TEA to study. Part of me wants to say it can be done. And that's not really how the interim is being used right now. Reminds me of Dr. Ian Malcolm in Jurassic Park, who says, "Your scientists were so preoccupied with whether they could, they didn't stop to think if they should." Right now, the interim is supposed to predict the likelihood of passing the STAAR at the end of the year. So many variables are in play socio-emotionally, culturally, academically, and within each subject domain and test design that I fear we are not measuring what we think we are anyway. It's a good idea to see if this works or not. 

Reason Two: Prove to them that teachers are the ones who know best, not an assessment. I'd really like to see the data that says it predicts what it says it does. But from what I saw in campus reports last year, the interim's ELAR projections didn't match the actual STAAR results. So...let's take the thing and then bust out our best to make sure kids are learning beyond the progressions and predictions. 

Reason Three: It gives the kids an "at bat" with the new format and item types. I'm ok with that rationale...except: Have we explicitly taught digital reading skills and transfer of knowledge and strategies for the new item types, TEKS, and skills? Have we had enough massed and distributed practice on these skills before weighing the baby again? If we used the interim as an instructional tool, maybe. We could use the interim as a guided or collaborative practice. But as another source of decision-making data? I'm not sure that accomplishes our goals; it makes kids do things alone that we already know they don't have enough experience to do well. Sounds like a good way to further convince struggling learners of how dumb they think they are. It's like signing up for fiber internet and paying for it before the lines get to your neighborhood.  

No. It's a giant waste of time for kids and teachers. 

Reason One: After examining the data, I have NO idea what I'm supposed to do in response to help the teachers or the kids. More on that later. 

Reason Two: It's demeaning and demoralizing. Do I really want to tell a kid in March, a few days before the real deal that they have abysmal chances of meeting expectations? Do I really want to tell teachers that x of their z kids aren't in the right quadrant to show growth when they have less than two weeks after spring break to do something about it? If they even believe that the kids took the exam seriously? They already know the kids didn't use their strategies and purposefully blew off three or more days of precious instructional time while taking the dang thing. 

Reason Three: Did we do something about the last data we collected on the interim? Do the kids know their results? Have they made a plan to improve? Do we have a specific plan? Have we fixed the problems that caused the first set of results? People are having data digs and meetings to tell teachers what to do and how these predictions are going to play out for accountability. We're running tutorial programs and hours to meet HB 4545 requirements. We're doing some stuff, but is it really a detailed and reasoned response to resolve the causes of the data? Have we fed the baby enough to cause weight gain before weighing it again? No. 

Reason Four: The data is correlational, not causal. The data on the interim tells us the correlations between one data collection (last year's STAAR) and the next assessment. Results are correlated to the probability of success or failure and do not pinpoint the cause of the success or failure. When working with human subjects, is it humane to use correlational data to make instructional decisions about nuanced human intricacies for individuals in such complex settings, with soul-crushing accountability attached to personal and collective judgments? 

An additional problem with the interim is that you don't have a full trend line until you have three data points. Statistically, it doesn't make sense to take last year's STAAR results (which was a different test using different standards) and pair it with a second interim. There is no trend line until the third assessment even if the assessments were measuring the same thing. 
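
For anyone who wants to see the "three data points" claim in something other than words, here's a minimal sketch. The scores are made up for illustration, and it assumes a simple least-squares line like most growth dashboards use: with two data points the fitted line passes through both exactly, so the "trend" cannot be wrong by construction; only a third point gives the line something to miss.

```python
import numpy as np

# Made-up scale scores for illustration only.
two_points   = {"x": [1, 2],    "y": [42.0, 55.0]}        # last year's STAAR, interim 1
three_points = {"x": [1, 2, 3], "y": [42.0, 55.0, 47.0]}  # ...plus a second interim

for label, data in (("2 points", two_points), ("3 points", three_points)):
    x, y = np.array(data["x"]), np.array(data["y"])
    slope, intercept = np.polyfit(x, y, 1)                 # least-squares straight line
    misfit = np.sum((y - (slope * x + intercept)) ** 2)    # how much the data disagrees with the line
    print(label, "-> total misfit:", round(float(misfit), 3))

# With 2 points the misfit is always 0.0: the "trend" fits perfectly no matter what the scores are.
# Only with 3 or more points can the data disagree with the line at all.
```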

Yet that's what teachers were asked to do: treat the data as an indictment of their instructional practices and the resulting student performance, when the data doesn't mean what they were told it meant. Furthermore, teachers are told to revisit previous CBAs and other data to determine what needs reteaching. The advice is well meaning, but in practice it is too unwieldy and flawed to do anything other than make teachers want to pull their hair out and cry out in desperation and stress. 

More on that in Part Two: We took the interim. Now what?