Monday, May 1, 2023

Readability, TEA and Hemingway, DOK and Worms

Questions from the Field

I wanted to get your thoughts and opinion about reporting DOK and Bloom's information with items and readability formulas with passages.

DOK was not designed for how people are using it. I talked with Norman Webb himself about it. Even got a picture and an autograph. DOK was supposed to match the rigor of the curriculum to the assessment, NOT the rigor of the questions themselves. And this distinction is worth noting. Questions are not written at DOK levels; the curriculum is. DOK is supposed to measure a gap between the assessment and the standards. So...writing questions at DOK levels skips important context and grounding from the standards. Does TEA use DOK to write questions? I'd like to see their charts. 

Bloom's: This is the model I recommend for Bloom's.

Research Ideas and Next Steps: 
  1. When we pair the KS with the SE, what is the Bloom Level? (for each breakout in the assessed curriculum) 
  2. When we look at the item types, what is the connection between Bloom and DOK? (for each item type and TEK) This will have to build over time because we will only have certain item types for certain TEKS for a while. And it will vary by grade. 
  3. Does TEA's interpretation of Bloom and DOK on the assessment match the curriculum? 
  4. Once we can see the gaps/alignment, then we can make some decisions for metrics, practice, and instructional interventions. 
    1. What do these metrics tell us about student performance/growth? 
    2. How do these metrics inform Tier One Instruction? 
    3. How do these metrics help us form progressions of learning and identification of pseudoconcepts that result in refined teaching and intervention for Tier Two Instruction? 
    4. How do these metrics help us write better items and give better feedback to students, parents, and teachers? 
We are taking our cue from TEA to use F/K readability scores and the "Hemingway" app they recommend, so I feel like the info we are collecting is TEA-ish-aligned, but is it the type of readability score you think teachers will want to see or care about?

Thanks for sharing this. I didn't know they were recommending Hemingway. I had to do some research. What do you mean by F/K readability? Flesch-Kincaid? Hemingway uses an algorithm similar to F/K but somewhat different: the Automated Readability Index. 

Commentary: I think TEA's move here is a good one; however, all readability formulas are flawed. I like that Hemingway's formula uses length of words (loosely associated with vocabulary) and length of sentences (directly associated with T-units and idea density/complexity). 

Note that Hemingway and the Automated Readability Index are not really the grade-level descriptions teachers are used to. These numbers are NOT the grade-level markers we see in Guided Reading, DRA, Fountas and Pinnell, Reading A-Z, Reading Recovery, or even in Lexiles. These measures do not measure the grade level of the text; they describe the amount of education a reader would need to understand the text. TEA is using these measures to defend the passages. Teachers use readability measures to match readers to texts they can read easily on their own, texts that are useful for instruction, and texts that will be too frustrating to read alone. It would be a mistake for teachers to use Hemingway to match readers to texts because that's not what it does. 

Hemingway is more about CRAFTING text for readers so they will be successful.  The purpose of the scale is what is important here: how do you write in a way that most people can understand your message and purpose? Writing for people with 9th or 10th grade education levels is ok, but many people aren't that proficient. The Hemingway app and measures help you simplify your writing so that it lands where people with 4th to 6th grade experiences can understand what you intend to convey. Again (as we saw with DOK), we have a disconnect between purpose and how the thing is being used. 

We cannot provide Lexile scores for a few reasons (cost of license being primary), but we can provide some more content-based and not just language-based readability formulas, such as might be seen in Fountas-Pinnell readers.

Lexiles. Eye roll. So many better measures out there. Glad they aren't useful to you.
Content-based measures. Hmmmm. That's problematic semantically. I wouldn't say that Fountas-Pinnell readers are content-based measures, as their levels are also based on language and text features. In ELAR, there really isn't any content past early phonics and some grammar. The rest is process. I know of NO way to measure content levels. 

Do you see a need/want for that among teachers, or is a simple language-based tool like F/K enough in your opinion?

What I see here is the potential for confusion. Already we have a mismatch between TEA's recommendation of Flesch-Kincaid and an app that uses something different. In addition, the semantics and purposes seem similar but carry distinctions in practice that confound their use in matching students with texts, measuring growth, selecting curriculum materials, writing assessments, and planning instruction for both reading and writing. What a mess! There's a military term I'd like to use here...

Here's another wrench in the works, as if we didn't have enough to worry about: when you use these formulas to measure the readability of questions as units of meaning (like passages), questions land FAR FAR FAR above grade level on any measure. Questions are dense, complex little beasts that no one is talking about at any grade level in any content area. 
Grade 3 Full Length Practice Questions 1-3 analyzed by Hemingway: 
[Three screenshots: Hemingway analyses of Practice Questions 1-3]

As you can see, using TEA's own recommendation, 3rd graders would need the experience of a fourth or fifth grader to answer the first three questions on the assessment. And that's after reading the passage. The more I look at this stuff, the more I believe we aren't measuring curriculum or student growth or any of the things we think we are measuring.  

Initial thoughts on solutions: 1) Give a language-based, classical formula that teachers understand. 2) Give a second, grade-level "experience" measure for comprehension and schema or reader's ease. This gives us a chance to help teachers understand what TEA is doing here. (Reminds me of Author's Purpose and Craft. TEA has a specific purpose for their readability stuff. It's about making sure they can defend, to the legislature and parents, what kinds of texts they are using to assess the curriculum. Teachers have different reasons - ones that have to do with supporting a child and their growth instead of the validity of the assessment.) 
 
Secondly, we have been tracking Bloom's and DOK as quickly-trained evaluators (I had a three-hour seminar years and years ago at xxx; xxx has had some haphazard PD as a middle school teacher over the years). As you no doubt know, for a STAAR assessment, we find a lot of DOK 2 / Bloom's "Analyzing" items, and so it seems like it might not be the most useful metric, but we are also not experts and might be missing some subtleties between TEKS and items that would give a more varied report. So my question is two-part. Do you agree that we are likely to see similar DOK/Bloom designations across many items, and if so (or not), is this information you think teachers will want or could use in classroom instruction or reteach? Is the information bang worth the research and editorial review bucks for DOK and Bloom? And perhaps DOK is appropriate and Bloom's not (I kind of lean this way personally)? So that's four questions, I guess. :) 

Can-o-worms. Earlier, I described problems with DOK and questions. If you are not matching the question and the curriculum to determine the DOK, then the data you get from that doesn't do what most think it would do. So...that has to be fixed first. 

Do I think we are likely to see similar DOK/Bloom designations across many items? My first response is: People tend to do what they have always done. So yes. TEA tends to do all kinds of things from one design to the next. So no. 

My second response is: How does any of that help the teacher? We see this ongoing work in training for unpacking standards. But honestly, if TEA isn't transparent about what they think is DOK or Bloom's, then we are guessing. Do we have solid instructional practices and understanding of the curriculum that LEAD to success on these descriptions? Labeling them without that seems like a waste of time to me. Teachers might want to note DOK and Bloom's as compliance measures for item analysis or in lesson plans, but honestly...what does this change for the instructional impact on students? "Oh, it looks like 75% of our kids missed DOK 2 questions." Now what? 

My third response is this: We haven't even gotten results back. Districts and people are downtrodden and devastated and confused. They are all feeling a little nutty about everything. It's too early to even know or make a good decision. I'm wondering if we are making all of this so much more confusing than it ought to be. Mom always says, "Rest in a storm." That never feels good in a storm. But what good are we going to do if we try to build a DOK nest in a tornado? 

Is this information you think teachers will want or could use in classroom instruction or reteach? 

I don't know. Do we know what kind of instruction fixes DOK problems? I'm not sure we do. Is the DOK or Bloom's level what's actually causing the problem? For lots of reasons, I don't think so. There are too many variables, and too many of them cross-pollinate and create varieties of problems that didn't exist before or for everyone. There are SO many instructional implications before we ever get to a question and its level that are not being addressed. It seems counterintuitive to fixate on DOK before we know and repair the underlying issues. 

Here's an example. A local school here decided they wanted a greenhouse and a tennis complex. Funds were acquired. The community was excited. Structures were built. Programs began. Years passed. In the nearby school building, walls and whole classrooms threatened to collapse. Know why? There was no problem with the structure and quality of the building. The contractors had done masterful construction that should have lasted a century or more. The greenhouse and tennis complex were built on top of the well-planned, well-placed drain fields meant to carry water away from the sodden clay of our Panhandle soil. The problem isn't the structure of the building/question/DOK. The problem is how the whole system worked together. 

Is the information bang worth the research and editorial review bucks for DOK and Bloom? And perhaps DOK is appropriate and Bloom's not (I kind of lean this way personally)? The problem is that we have to make decisions now when we don't have the land survey to tell us how things are built and how we should proceed. 

It's a crapshoot. We might spend a lot of our resources to make something that isn't useful. We might make something that looks good and attracts attention but isn't consequential for helping those we want to serve the most: our students. I'm more "meh" on Bloom's as well. I just can't bring myself to care much about either one when I consider all the things that need to be fixed before labeling a question with a level that we can't validate. I also think the question types themselves indicate their own DOK. 

Sunday, March 12, 2023

Should we take the interim? And then what? Part One

Draft

Should we take the interim? 

Yes. 

Reason One: It's a giant research study to see if it is possible to measure growth over time instead of on a one-day, high-stakes assessment. That's what the legislation originally asked TEA to study. Part of me wants to say it can be done. And that's not really how the interim is being used right now. Reminds me of Dr. Ian Malcolm in Jurassic Park, who says, "Your scientists were so preoccupied with whether they could, they didn't stop to think if they should." Right now, the interim is supposed to predict the likelihood of passing the STAAR at the end of the year. So many variables are in play socio-emotionally, culturally, academically, and within each subject domain and test design that I fear we are not measuring what we think we are anyway. It's a good idea to see if this works or not. 

Reason Two: Prove to them that teachers are the ones who know best, not an assessment. I'd really like to see the data that says it predicts what it says it does. But from what I've seen in reports from campuses last year, the STAAR results didn't match any of the interim's projections for ELAR. So...let's take the thing and then bust out our best to make sure kids are learning beyond the progressions and predictions. 

Reason Three: It gives the kids an "at bat" with the new format and item types. I'm ok with that rationale...except: Have we explicitly taught digital reading skills and transfer of knowledge and strategies for the new item types, TEKS, and skills? Have we had enough massed and distributed practice on these skills before weighing the baby again? If we used the interim as an instructional tool, maybe. We could use the interim as a guided or collaborative practice. But as another source of decision making data? Not sure that's accomplishing our goals to make kids do things alone that we already know they don't have enough experience to do well. Sounds like a good way to disenfranchise struggling learners with further beliefs about how dumb they are. It's like signing up for fiber internet and paying for it before the lines get to your neighborhood.  

No. It's a giant waste of time for kids and teachers. 

Reason One: After examining the data, I have NO idea what I'm supposed to do in response to help the teachers or the kids. More on that later. 

Reason Two: It's demeaning and demoralizing. Do I really want to tell a kid in March, a few days before the real deal, that they have abysmal chances of meeting expectations? Do I really want to tell teachers that x of their z kids aren't in the right quadrant to show growth when they have less than two weeks after spring break to do something about it? If they even believe that the kids took the exam seriously? They already know the kids didn't use their strategies and purposefully blew off three or more days of precious instructional time while taking the dang thing. 

Reason Three: Did we do something about the last data we collected on the interim? Do the kids know their results? Have they made a plan to improve? Do we have a specific plan? Have we fixed the problems that caused the first set of results? People are having data digs and meetings to tell teachers what to do and how these predictions are going to play out for accountability. We're running tutorial programs and hours to meet HB 4545. We're doing some stuff, but is it really a detailed and reasoned response to resolve the causes of the data? Have we fed the baby enough to cause weight gain before weighing it again? No. 

Reason Four: The data is correlational, not causal. The data on the interim tells us the correlations between one data collection (last year's STAAR) and the next assessment. Results are correlated to the probability of success or failure and do not pinpoint the cause of the success or failure. When working with human subjects, is it humane to use correlational data to make instructional decisions about nuanced human intricacies for individuals in such complex settings, under soul-crushing accountability for personal and collective judgments? 

An additional problem with the interim is that you don't have a full trend line until you have three data points. Statistically, it doesn't make sense to take last year's STAAR results (a different test built on different standards) and pair them with a second interim. There is no trend line until the third assessment, even if the assessments were measuring the same thing. 
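A toy sketch makes the point (the scores here are invented for illustration): a straight line through two points always fits them perfectly, so any "trend" it shows is pure geometry. Only a third point can disagree with the line.

    import numpy as np

    two = np.array([42.0, 55.0])              # e.g., last STAAR, Interim 1
    slope, intercept = np.polyfit([0, 1], two, 1)
    # A degree-1 fit to two points has zero error, no matter the data:
    print(np.allclose(slope * np.array([0, 1]) + intercept, two))   # True

    three = np.array([42.0, 55.0, 47.0])      # add a hypothetical Interim 2
    coef = np.polyfit([0, 1, 2], three, 1)
    resid = three - np.polyval(coef, [0, 1, 2])
    print(resid.round(2))   # nonzero residuals: now the line can be wrong

And that's under the generous assumption that all the points measure the same thing, which, as noted above, they don't.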

Yet that's what teachers were asked to do: make decisions about indictments of their instructional practices and the resulting student performance based on data that doesn't mean what they were told it meant. Furthermore, teachers are told to revisit previous CBAs and other data to determine what needs reteaching. The advice is well-meaning, but in practice it is too unwieldy and flawed to do anything other than make teachers want to pull their hair out and cry out in desperation and stress. 

More on that in Part Two: We took the interim. Now what? 

Sunday, January 15, 2023

Stations for Editing and a Few Questions

Mr. Manly Transcript and possible stations. Note that you will need to edit out one line after the firefighter is introduced. 



Extending with the ECR: With Love and Linked Lessons

Good Morning Dr. Rose,

Today, I will begin using QA12345 with my students. However, this semester I have honors students, and I was wondering what you think I can do to increase the rigor using the QA12345. What are some ways we can have them elaborate or expand their thinking and writing using the QA12345?

Any suggestions or feedback helps.

Thank you,
High School Teacher


How lovely to hear from you. Thank you for asking. I was working with a group of teachers in Small City, Texas this week who had similar questions. I see a few directions to go. 

One: First we use the QA12345 to get the basic topic sentences. Then we use strategies such as looping to help students think of the next thing to say that connects with the previous statement, deepening the elaboration into a paragraph. With my students, I always teach prove-it's next, then depth charge. At this point, students are ready for Starring or CAFE Squidd. If they are writing narratives, I present the tampering-with-time lessons from After the End. By this point, students are ready to delete stuff from their writing that is repetitive. I also introduce the dead giveaways from Gretchen's site and throwaway writing activities. My friend Cheryl gives kids this activity to think about within- and between-paragraph structures to try as well. 

Two: Sharing. Students should be sharing their writing by reading it aloud to peers and receiving feedback. Start with Pointing and then follow the first two rows of activities. 

Three: Examine the craft of other writers and how they develop their ideas. Then try out these ideas in your own writing. Share the befores and afters in small groups. Here's how we worked that out in Small City, Texas this week. Kids began with lesson one in Text Structures from the Masters.  We introduced it by saying that as they are maturing as 9th graders, we see a lot of growth in the second semester. They are becoming more of who they are going to be as adults. (We're trying to combat the immaturity we have seen after COVID.) 

Next, we had them annotate the kernels in the Hippocratic Oath. Then we went deeper to analyze how the author pitchforked:  "I swear by Apollo the physician, and Aesculapius the surgeon, likewise Hygeia and Panacea, and call all the gods and goddesses to witness, that I will observe and keep this underwritten oath, to the utmost of my power and judgment." We noticed that he was referencing authorities that influenced him. We noticed how he put these as items in a series using /likewise/ as a connector as well as the common conjunction, /and/. We noticed how the prepositional phrase at the end allowed the writer to clarify the depth of his devotion and efforts. 

Teachers then re-entered their writing to show how they could try out these techniques in their own writing. After modeling the process, teachers allowed students some time to imitate these moves in their own writing from the original writing. They color coded or annotated their moves and revisions just as we did in the mentor text and tested out the ideas with their feedback groups. 

For the next lesson, we chose Sojourner Truth's "Ain't I a Woman?" speech. We followed the same write, share, kernelize to comprehend, and then annotate-for-craft process as with the other text. We dug in deeply to name the colloquialisms and the irony in the speech. (Normally when people talk in rough mannerisms, they are considered dumb. But Truth's analysis here is astute and wise, full of rhetorical techniques.) We dug into the anaphora (repetition) in her rhetorical questions and the impact it had on us as readers and on delivering the message. We looked at the biblical allusions and how they were used as criticisms/attacks on the reasoning of those who didn't take her perspective. 

Next, teachers re-entered their writing to try out the anaphora or one of the other techniques. Since we were in a PLC, teachers were able to take different techniques and try them out in class. These became more modeling texts that they could use in class. (It is important to note: teachers may have prepared the writing beforehand, but when it came time for class, they wrote live and explained their thinking aloud.) Next, students tried the strategies in their own texts, shared them with peers, etc. 

That's a lot. Let me know if you want to talk on the phone or zoom. As teachers implement these lessons, we'll have some exemplars. 

With Love and Lessons, 
Dr. Rose

Monday, December 5, 2022

Best STAAR Resources...For Now

What STAAR resources should I buy? What online program is the best? What books should I read? Where are the best resources for Author's Purpose and Craft? For ECR and SCR? 

All. The. Time. People are asking. 

Right now? The best resources you have are on the STAAR redesign website from TEA. Look at ALL the scoring guides, even if you teach a grade not listed on that material. Look at all the released new item assessments, even for the grades you don't teach. The guides, collectively, give you the best information about how TEA is designing items for all grades and all items. 

Look carefully at how the passages and questions intersect. For example, when students are asked to combine sentences, look at the passage to see WHY they need to be combined. It's usually vague references with pronouns, repetition, or the connection between clauses with coordinating or subordinating conjunctions. 

Look at how the passages are set up with the introductory or footnote material, especially for excerpts. Look at how the multipart items are connected. Consider the deep thematic links between excerpts and sections. 

This may sound tacky, but publishers have not had adequate time or information to prepare materials that fully match what was released. The last updates were in October of 2022. And we still haven't seen the TEKS Guide for High School. If you see stuff printed before that, you run the risk of getting the publisher's interpretation instead of TEA's. 

Yes, students need online practice opportunities. TEA provides practice on the Cambium site, with TFAR, and Interim assessments. Let's start there. 

Unpopular Opinion: The time to buy materials from publishers that matches content, rigor, and question types is not now. Perhaps next year. 

Note: I have worked with content and item reviews for Sirius Education Solutions. I believe they have done a wonderful job with examining the standards, what is out there from TEA, and ways to give feedback to students in their online platform. From what I have examined in other platforms, this provides the most curated experience for students needing practice with online formats and item types. As new information is presented, the content, item types, and passages are updated and refined. 

Saturday, November 26, 2022

Graphic Organizers Aha!

I'm teaching a class on writing. One of the assignments is to help a writer. Here's a portion of the sample and its source: 



It's obvious that the writer needs paragraphs. But what feedback would help the writer most? They are past the prewriting and composing phases, so suggesting a graphic organizer at this point would be frustrating because the writer would feel like they have to start over. No, we need something that helps at the editing/revising phase. 

I examined the lesson a bit further to find that the teacher was using Empowering Writers' graphic organizer called the Opinion Pillar (see Figure 2). While I think that there are better organizational structures than the flawed five-paragraph essay formula, there's something I never noticed about graphic organizers. John used this formula. He should have had paragraphs. Why didn't he? 

Because the paragraphs are implied on the graphic organizer. 

Figure 2.

AHA! John didn't realize that the structure of the graphic organizer told him where his paragraphs needed to be. 

What if we did these things after composing the paper?

John, I want you to return to the graphic organizer that we used to plan this essay. There's something hiding. (Previously, I had used lemon juice to write some notes on the paper. I placed the paper over a light bulb to let the heat turn the lemon juice brown.) 

Next to Main Reason #1 appears the pilcrow editing symbol. For typing, the keystrokes Enter and Tab appear. For handwriting, a new line and a finger space appear. 

John, see what happens when you put Main Reasons #2 and #3 over the light bulb. Isn't that cool? Now let's go back to your writing. Take a highlighter. Where did you write your main reasons? Now that you have identified them, you can make the paragraphs visible for your reader! 

Monday, November 14, 2022

Transitions and Reading Comprehension

 Transitions are tools to connect ideas. I worry that the focus sometimes is on using them for templates or graphic organizers for writing instead of thinking about how they help the reader follow the writer's path. That means that our instruction has to help students consume texts by noticing how writers connect the ideas in paragraphs. Then we have to help students revisit their writing to see what transitional pavers they need to lay down for their readers. 

Let's take a walk through how that might look. This is the passage students read for the stand alone writing field test in 5th grade. 

First, I process the title. The topic is going to be Steam and Sail. I mark that with a small t. I predict that the article will be comparing those topics. I mark that with a little light bulb. 


Now my purpose for reading changes. I find where the introduction begins and ends. I know that it begins and ends with the first paragraph because the first heading begins after that. Now my purpose shifts to find how the writer is hooking me and what the thesis is. I'm also reading to see if this article will be comparing steam and sail. I can see that they start talking about railroads and then use a transition word - but. This signals me to know that the writer is shifting to a new topic that contrasts with this one. Then I read that steamboats and sailboats cause changes! Now I know that we aren't really comparing steam and sail. I'm reading to know how they caused change. 

Now I preview the headings: Full Steam Ahead, Tea Races and Gold Rushes, and The Next Big Things. I have an idea now about the topics for each section. 

At this point, I'm ready to read until the text starts talking about something else. The transitional phrase, "After Fulton's success" tells me that the writer is moving on to a new topic - what happened after Fulton invented the steamship. I stop to think about what this paragraph told me. Basically, it was about how fast the steamship was as an improvement to previous ways to travel. 


I use the same process to understand the next paragraph. But something interesting happens here between paragraphs 3 and 4. 


Before, we had a transitional word, "but," that transitioned between sentences and ideas. Then we had a transitional phrase, "After Fulton's success," that moved us to sequence and another topic between paragraphs. Now we have the word "rivers" at the end of paragraph 3 and the connection to specific rivers in paragraph 4. This is a topical transition between paragraphs that moves the reader from the general (rivers) to the specific (the Mississippi and the Ohio). As I stop to think about what this paragraph is doing, I realize that the writer is explaining the impact, or effect, of the steamship. When I connect that to the thesis of change, I realize the change is about where people traveled and moved. 

When I process the ideas in the next section, I realize that paragraphs 5 and 6 are offering another contrast as a whole section because of the transition word, "But." I also see a new type of transition: "Across the ocean in Great Britain." This is a conceptual transition that helps the reader know the setting has changed. 


I learn about clipper ships and the changes they brought in paragraph 5. When I see paragraphs 6 and 7, I see that they are both talking about why people were still using clipper ships even though steamships had been invented. First, second, and third are not transition words here. They are words that help the reader keep track of the number of reasons the clippers were preferred. 

When moving to paragraph 8, we see another topical transition. Notice how paragraph 7 ends with "tea races" and getting there "first"? Paragraph 8 uses the words also and speed to introduce another topic. We've moved from tea races to the Gold Rush. 


The next heading provides the conclusion with a transitional phrase that indicates sequence and the passing of time: "As time went on." The thesis is revisited in the last sentence. 

Are we teaching these types of transitions? Or are we just teaching transitions as a list of words?

  • Single words and how they function within sentences as well as paragraphs
  • Paragraph blocs and how they are structured (compare contrast)
  • Transitional phrases and how they function 
  • Topical transition concepts from the end of one paragraph to another
  • Conceptual transitions and how they function
  • Transitions and how they help us follow the writer's meaning and organizational structure
Y'all. I just think this is so important. We need to look at how real writers move between ideas in sentences, in paragraphs, in paragraph blocs, and across headings before we start offering graphic organizers that don't really match what writers need to say. You don't know what transitions you are going to need until you decide how your ideas are related. And to make those decisions as a writer, the best models we have are out there as published work - what real writers are doing - rather than things people don't actually do, like first, next, then, finally. Meaning dictates form (Vygotsky).