classroom observation case study


Lugossy, Fodor: Research Methodology manual

4 Classroom observation: implications of a case study. Evaluation and sharing the findings to discuss future options

Observation is powerful. Its significance and role have long been a focus of interest inside and outside the humanities. The youngest son in the Grimm tales leaves home, learns a trade, and wins the kingdom because he observes the world and becomes empowered by what he learns from the experience. In the 1970s social learning theory, or observational learning, in behavioral psychology gave new impetus to conceptualizing observation in education. Observation and the data it brings forth have been recontextualized since the 1980s and have become one of the key agents of awareness raising and teacher development as well. In most Hungarian schools model teaching behavior is the accepted norm and a crucial standard in mastering the trade. The significance of further professional development is denied in many instances. Self-observation may take place in order to enhance in-service teacher development as well as to create an opportunity to challenge and solve frustrating and problematic teaching situations. Bailey, Curtis and Nunan define teacher development as an ongoing process of experiential, attitudinal and intellectual growth of in-service teachers (2001). This approach emphasizes improvement and relies heavily on awareness, or the culturally defined understanding of people's beliefs and previous knowledge about teaching and learning. Gaining awareness of teaching is a prerequisite to teacher development as well as the primary aim of exploration. Learning more about someone's teaching does not imply that some ways of teaching should be preferred to others because they are better. The goal is to be open and to attempt to gain a clear view of the dynamics of teacher-student as well as student-student interactions and rapport. Throughout their careers teachers establish, maintain and modify their own assumptions about the nature of teachability and learnability; nevertheless, very often these two concepts develop asymptotically. 
In an ideal case, when teachers explore teaching, they test and reconceptualize teachability and learnability and assign a better articulated role to their learners, too. As Gebhard and Oprandy point out, "when we explore teaching, we simultaneously probe ourselves and the larger meaning of our endeavor" (1999, p. 4). Teacher and teaching become sources of data to be probed and understood in context. Five major techniques of data collection exist: 1) observation: self and others; 2) action research; 3) reflective journal writing; 4) supervision; and 5) conferences with teachers. In this part of the manual we deal with the modes and implications of diagnostic classroom observation, with special focus on the improvement opportunities provided by self-observation. We will also need your insights and creative thinking: put yourself in the position and situations of the teachers described in order to understand fully the potential of this diagnostic tool.

First in this part of the manual you will read a short summary of a case study of an elementary school English as a foreign language class. It shows how failing to take small, continuous steps to remain connected with the learners may lead to insurmountable discomfort in one's teaching practice. Of course, in most schools and situations a teacher rarely ends up as demotivated and lacking in resourcefulness as in the cited story. We will use this story to help you recognize the minor and major signs that intervention is necessary and unavoidable to maintain integrity. The remainder of the section will be devoted to a detailed analysis of a self-improvement case study based on classroom observation. Transcripts of the observed class will be provided to provoke further thinking and to demonstrate the diagnostic capacity of self-observation based action research.

Bailey et al. claim that self-awareness and self-observation are the cornerstones of all professional development (2001). Even though our usual aim in observation is to evaluate someone, it should rather be about offering a different way to look at teaching and learning. We can easily turn classroom observation to serve our need to revitalize and rethink our current practices with the overall goal of solving some long-standing problem. Gebhard and Oprandy define classroom observation as the "nonjudgmental description of classroom events that can be analyzed and given interpretation" (1999, p. 35). Guided, systematic, and focused observation helps conceptualize and deepen knowledge of teaching and learning in general. If observation is seeing teaching through the lenses of the ‘other,' self-observation is the organized and regular recording of one's own classroom behavior for the purpose of gaining a better understanding of procedures and meanings. Hearing, analyzing and interpreting the data from self-observation based action research provide the potential for reinventing ourselves in teaching. Bailey et al. hold that "self-observation implies a professional curiosity - watching, listening, and thinking without necessarily judging" (2001, p. 27). As the philosophical core of observation, Gebhard and Oprandy identify nine key assumptions (1999), of which we find the following six relevant to the purposes of the present research manual.

"Take responsibility for your own teaching" can be self-evident if you want to improve. The first step is always recognizing that you have individual responsibility in the process.
"The need for others" maxim holds that successful exploration requires perspectives other than our own. In the case of self-observation, the possible interviewing of the participating students may provide the fresh look.
"Description as opposed to prescription" indicates awareness and plays a key role in the meaningful use of the data. In self-observation projects, processing the data may also involve "self-help-explorative supervision" (Gebhard, 1990, p. 163). Accordingly, the teacher reconstructs teaching based on awareness gained from observations of teaching.
"A nonjudgmental stance" is very closely related to the previous category of favoring description over prescription. Judgments and a judgmental position can easily become an obstacle to seeing what really happens in the classroom. Data collection methods such as audio- or videorecording the lesson, transcribing the recorded data, coding the interactions, and studying the coding to recognize patterns of interaction are systematic ways of handling the data and remaining objective.
"Attention to language and behavior" means that recording the lesson and coding the data help both the observer and the observed to focus on the events and interactions. The coding system also serves as a metalanguage to talk about teaching rather than sink back to the comfortable yet highly inferential, and thus also judgmental, position of general statements, words and phrases.
"A beginner's mind" refers to exploring the classroom without preconceived ideas about what should be going on in there. The task seems enormous: how can I explore a classroom without preconceptions after having taught there for a certain amount of time?

Following the introduction about the power of observation, consider the following story in terms of possibly using classroom observation for diagnostic purposes. A class of sixth graders, who learn English as a Foreign Language four hours per week, have brought their middle-aged female teacher, Bernadette, to a stage where she considers serious sanctions to punish her students for inappropriate classroom behavior and lack of interest and motivation.

The teacher here clearly has problems motivating her students, and she thinks that students can only be task focused if they work quietly and do not question procedures, results or solutions. Her frustration will likely continue in other English lessons unless she seeks some form of intervention, perhaps an outside person who visits and observes her class and comments on what may have gone wrong. Her teaching style seems cooperative on the surface, yet she switches back to being authoritative when she thinks that she has failed to control the classroom. It is highly unlikely that one visit to her classroom would bring the issues to the surface, as she perhaps sees such a visit as a compulsory threat that she has to get through as quickly as possible instead of regarding it as an integrated part of her teaching. Convincing her of the value of observation and diagnostic intervention, if possible at all, takes time and effort. In addition to the peer feedback she would have to be encouraged to reflect on what is happening in the classroom, and on precisely what upsets her to such an extent. She may also consider discussing it with her students and taking their reflections into account in the long-term solution to the problem. She may gradually change her attitude to that of the reflective practitioner who discovers more about her own teaching by focusing on the locally based processes of teaching and learning (Schön, 1983).

In Anna's case, the problems concern rapport, discipline, and the motivation of her students. She does not approach language teaching as the management of interpersonal communication through using the content and form of the foreign language (Cogan, 1995). Furthermore, she fails to see how the introduction of genuine, meaningful and motivating contexts for using English in the classroom would lower the anxiety level of the learners and increase their readiness to accept rather than challenge the teacher. She may, however, feel anxious about being visited and observed. Two ways of conceptualizing the EFL classroom are actually in conflict here. To avoid the chasm some form of diagnostic intervention would be needed immediately, which may run into obstacles rooted in orthodox Hungarian teacher beliefs and practices.

The first question requires you to think about the classroom conflict between the learners and the teacher. As Anna has ready-made lines for scolding her students and seems very tired of uncontrolled noise, it is very likely that she does not want to consider the real age-related characteristics of her learners. The first of these is that they appreciate classroom activities with opportunities for group work, where they feel safe and more ready to share or discuss their solutions with peers rather than answer the frontal solicitation of the teacher. Anna seems to know this; however, she fails to fully implement group work with the accompanying noise and temporary chaos. Moreover, it is very unlikely that her students genuinely feel negatively about Anna. They are too restricted in most of their school activities and use every opportunity to express themselves and feel less tense, which would in fact be the ultimate goal of cooperative teaching and learning.

Gebhard (1984) claims that in a number of second language teacher education contexts educators tend to limit themselves to giving the same reasons for doing supervision and to using the same supervisory behaviors in spite of the availability of a wide choice of supervisory behaviors. Although for a while external monitoring of teachers can be a source of feedback, the ultimate goal should be an independent teacher capable of self-monitoring and improvement. Based on his long discussions with teachers and doctoral students in two teacher training programs in Hungary, Gebhard identifies a few culture-specific reasons why certain modes of teacher development are difficult to apply in Hungary (2003). The first issue is time. Gebhard's conversational partners talked about being overburdened, which made it hard to focus on improvement; their unwillingness was chalked up to the little time and even less energy left for exploring issues and for development. Gebhard labels the second issue "comfort." A number of teachers expressed anxieties related to observation, saying that they did not feel comfortable with another professional present in the classroom. Although some teachers are comfortable with observation and reflective practice, a large number of them are not. The third point concerns "teacher isolation." Most responding teachers felt that professional and collegial collaboration is almost entirely missing from Hungarian educational institutions. Accordingly, it would be difficult to ask a colleague or trusted teacher friend to visit classes and discuss diagnostic steps and opportunities. The fourth obstacle to self-development is "trust": most teachers carry unpleasant memories of being observed during their teaching practice and need more than simple encouragement to tackle their aversions in the first place. 
They also mentioned that observation often equals judgment instead of an opportunity to discuss teaching as a holistic experience. A few of Gebhard's respondents also pointed out that the still palpable Prussian heritage in Hungarian education makes it hard to switch to qualitative methods and reflexivity in evaluation. We would add one more source of discomfort with an outsider in the classroom: the anxiety teachers feel over their knowledge and use of English in the classroom. Most teachers over forty years of age have spent only a little time in English-speaking countries, let alone worked in an educational context with native speakers. In the age of the internet and extended online opportunities for development this should not be a huge problem; however, most teachers would refer to the previously mentioned constraints on further development, such as lack of time. With all such traditional misconceptions about and dislike of observation, it is hard but not impossible to change the views of Ms. Hermina, the teacher in the case study. In such an educational context self-observation is the less intrusive method, and if the teacher is willing and ready to accept outsider views, creative supervision may also bring up issues that supplement the teacher's reflections on her own context and approach.

Regarding its instrument, self-observation, however, requires a slightly different approach from traditional observations. External observers, even colleagues, would structure their description of the lesson on some formal list of categories, such as personality features of the teacher, preparation, English in the classroom, management, rapport and relationship with students, and overall impressions. These are easy to identify and follow in the lesson, and the post-lesson conference should be a genuine discussion of the observer's findings with the observed teacher. One of the challenges in the case of structured observation would be the readiness of the teacher to distinguish these insights from negative criticism and to handle them as starting points for further growth and development. In a Hungarian educational context, most practicing teachers feel that observation standards are not adaptable to the specific context and issue but are set by external supervisors. Supervision has served evaluative and assessment purposes rather than developmental or diagnostic ones. For teachers socialized into this role of observation it is hard to see the real potential of this method of action research. They would often want the observer-supervisor to give a full-length description of what they have "done wrong" instead of accepting constructive help or simply an initiative to think together about questions that concern the success of their teaching as well as individual professional satisfaction. Self-observation does not evaluate or assess teachers' aptitude for language learning; rather, it looks at how the usual practices in their English lessons bring less than the required results.

The particular class was about the Great Depression of the 1930s, and students had a discussion of the assigned chapter entitled The Thirties from the course textbook. To understand how sociocultural competence may be developed in a classroom where English is used as a foreign language to talk about cultural content, we will sum up some approaches to the concept.

Approaches to culture

Defining culture is a complex task, since both its content and nature have to be examined. In this respect culture may be referred to as the complex entirety of the customs, ideas, values, art etc. that are produced and/or shared by one particular group of people, which may change over time but only if the majority of the members accept the changes. From a sociological perspective culture refers to the social heritage of a people - the learned patterns for thinking, feeling and acting that characterize a population or society including the expression of these patterns in material things (Wardhaugh, 1994). We understand culture as essential to our humanness, and some social scientists use the term "society" interchangeably with "culture", since culture lacks a life of its own and exists only with the people who enact it. The basic assumption behind the notion of culture is that human behaviour varies from society to society over the globe, which, though it makes defining culture more difficult, offers us the freedom to challenge essentialist approaches to it.

Culture in Applied Linguistics has inspired somewhat different definitions. In his widely quoted definition, Lado (1957) describes culture generally as the ways of a people, an approach that reflects the variety and inclusive nature of both the word and the concept. Richards, Platt and Platt define culture as the total set of beliefs, attitudes, customs, behaviour, social habit, etc. of the members of a particular society (1996). Wardhaugh interprets society as "any group of people who are drawn together for a certain purpose or purposes and a language is what the members of a particular society speak" (1994:1). In this last definition we meet the concept of language in a context which suggests that language, society and culture are inter-dependent. Byram (1989) goes one step further when he attempts to approach culture as an omnibus term (also Kaplan and Manners 1972:3). Though defining culture is attempted in a number of disciplines dealing with society and culture, the task is notoriously difficult, particularly in anthropology; yet, as Byram notes, it is as good a label as any for the overall phenomenon or system of meanings within which sub-systems of social structure, technology, art and so on exist and interconnect (1989).

In their more recently published approach Adaskou, Britten and Fahsi refer to ‘four separate sorts of "culture" that language teaching may involve' (1990:3-4). The first category is an aesthetic sense of culture, including cinema, music and literature, in brief ‘culture with a capital C.' The organization and nature of family, home life, interpersonal relations, customs, institutions, work and leisure, and material conditions of a society - the sociological sense - means ‘culture with a small c'. The ‘conceptual system embodied in language, ... conditioning all our perceptions and our thought system' belongs to the semantic sense. Adaskou et al. (1990:4) classify ‘the background knowledge, social skills, and paralinguistic skills that, in addition to mastery of the language code, make possible successful communication' as the pragmatic (or sociolinguistic) sense. All these different meanings of culture are defined by and therefore centred around language. Therefore, the inter-related nature of language and culture has gained evidence from yet another angle.

In his approach to culture Sarangi finds it necessary to acknowledge, in alignment with many scholars, ‘that any definition of culture is necessarily reductionist' (Sarangi 1995, in: Holliday 1999:242). Therefore, he suggests that two paradigms of culture be distinguished instead of what has become the default concept of ‘culture' referring to ‘prescribed ethnic, national and international entities' (Holliday 1999:237-264). In this taxonomy a ‘large culture paradigm' refers to culture that is by its nature vulnerable to a culturist (prone to excessive stereotyping) reduction of foreign students, teachers and their educational contexts. A ‘small culture paradigm', on the other hand, defines culture as small social groupings or activities which display cohesive behaviour. On the basis of these two paradigms our perceptions and interpretations of culture may be characterized as static (large culture) or dynamic (small culture), claiming culture as a means of investigation rather than as an end-product.

To sum up the information these definitions provide us with concerning the nature and content of culture, it obviously displays an enormous potential to be exploited in teacher education programmes. This discipline may be labelled Applied Cultural Studies, including applications of the target language culture through content areas such as anthropology, ethnography, cultural geography, literature and sociology as well as methodological issues on how to research and interpret such cultural materials.

The relationship between culture and language

The relationship of language and culture has been perceived in rather different ways through time. Vander Zanden finds language the most important set of symbols a human being possesses, which allows him to create culture and perpetuate it from one generation to the next (1988:63). Earlier it was assumed that learning the language should always precede learning the culture. The ‘linguistic relativity hypothesis', that every language cuts the world into dissimilar pieces, thus drawing our attention to different faces of experience (Whorf 1956, in: Vander Zanden 1988), served as a breakthrough in judging the relationship between language and culture because it assumed a direct link between the two. Many scholars, however, warn of the fallacy of the theory itself (Pinker 1994:60). Doubtless, though, it influenced much of the way we tend to think about culture and language. By the 1990s the axiom that ‘culture is the context for language use' (Lessard-Clouston 1996:198) had become widely accepted and exploited. Byram notes, for example, that

language pre-eminently embodies the values and meanings of a culture, refers to cultural artefacts and signals people's cultural identity. Because of its symbolic and transparent nature language can stand alone and represent the rest of a culture's phenomena, ... [it] cannot be used without carrying meaning and referring beyond itself, even in the most sterile environment of the foreign language class. The meanings of a particular language point to the culture of a particular social grouping, and the analysis of those meanings - their comprehension by learners and other speakers - involves the analysis and comprehension of that culture. (Byram 1989:41)

Another noteworthy aspect of the relationship between culture and language is the interdependence of communicative and cultural competencies. According to the findings of Manes and Wolfson (Wolfson 1986), ‘a single speech act may vary greatly across speech communities,' that is language exists primarily beyond the classroom where communicative competence, language and culture are all equal parts of successful communication. Buttjes claims that communicative competence has to be regarded as much more ‘than a purely linguistic decoding facility. Since language and culture are so intimately interrelated in the experience of both native and foreign speaker, cultural competence must be involved at all stages of such an encounter' (Buttjes 1990:55 in: Lessard-Clouston 1996:198). This is because familiarity with the background culture is the clue to understanding linguistic behaviour, or else as Saville-Troike (1983:131-132) points out ‘the concept of communicative competence must ... be embedded in the notion of cultural competence' (In: Lessard-Clouston 1996:198). The content of cultural competence defined here by Manes and Wolfson, Buttjes and Lessard-Clouston is refined and extended in Byram's definition of sociocultural competence. Byram proposes

to define sociocultural competence in terms of a content of which learners should be "aware". Furthermore, some parts of the specified content might appear to be "universal", although in fact they tend to be centred on the developed North, and have a tendency to be ethnocentric. In so far as the common framework is European, this is to be expected, but it is doubtless desirable to establish a potential for links with other developments, for example in North America. (Byram 1997:9)

‘Attitudes and values' refer to the affective capacity to give up ‘ethnocentric attitudes towards and a cognitive ability to establish and maintain a relationship' between native and target cultures.
‘The ability to learn' is identical with an interpretative system or cultural code (Guerin, Labor, Morgan, Reesman and Willingham 1992:249-250) which helps gain insight into yet unencountered cultural meanings, phenomena, expressions.
‘Knowledge' is defined as ‘a system of cultural references which structures the implicit and explicit knowledge acquired in the course of linguistic and cultural learning', which also considers the special needs of the students when interacting with native speakers of the target language.
‘Knowing-how' tends to integrate all three capacities in ‘specific situations of bicultural contact, i.e. between the culture(s) of the learner and of the target language' (Byram 1997:14-20). Through the development of these four sub-competences, sociolinguistic ability, knowledge of culture areas and knowledge of culture analysis are emphasized.

To sum up, a learner possessing sociocultural competence will be able to interpret and bring different cultural systems into relations with one another, to interpret socially distinctive variations within a foreign cultural system, and to manage the dysfunctions and resistances peculiar to intercultural communication, which we shall henceforth refer to as "conflict" (Byram 1997:13).

It is possible that not everybody agrees to being recorded or photographed. In that case other ways of classroom observation data collection should be chosen, e.g. inviting a colleague to take notes, taking notes by ourselves during class (if students are engaged in solving tasks) or after class, based on our memories. If a few students agree, they may be asked to take notes and reflect on the lesson as well. If we cannot record the lesson it is useful to have a detailed lesson plan with enough space to take notes, jot down things that come to our mind while teaching. This is of course a plan B that a teacher always wants to have for scenarios better or worse.

Thinking of what to share with the participants about action research can help make you more conscious of the goals of the project as well as see the difficulties and the potential controversies and respond to the challenges. It not only reveals our ability and willingness to reinvent ourselves but also displays some of our vulnerabilities as teachers to our students and lets us accept our right to be wrong. It is also a further option to share the recorded material with our post-primary learners and involve them in the teacher development project. Such an experience may add depth to our relationship with the learners as they also think critically about their own participation and learning process. Furthermore, all teacher development projects should note that exploration and development cannot be done in a vacuum. We need another person's perceptual filter to see through the looking glass. This is an opportunity for us to gain deeper awareness of our teaching and empower ourselves to make our own informed decisions. There is little evidence that any one way of teaching is better than another in all settings. We need to collect and understand descriptions rather than follow prescriptions. Self-observation gets us the necessary data to engage in a meaningful dialogue with our learners, as well as to understand why we still follow or do not follow certain practices.

The following excerpts are coded as communicative moves. The coding explores the rules in our teaching, helps generate alternatives and shows the extent to which rules are broken. Furthermore, it helps raise questions about preconceived notions. Coding is the formal analysis of the material to discover themes, patterns and ways of teaching. It provides insights into the subtleties and nuanced details of classroom communication and practices. In the present study, utterances are perceived as communication acts and are evaluated according to their direction, labeled as "move," and according to their content, labeled as "message." In square brackets, we provide the abbreviation we will use in the coding table for each type of source or target.

According to Fanselow (1987), a certain act of communication can be coded as a move and as a message. Each communication has a source and a target, and both of these may be the "teacher" [t], the "student" [s] or the "other" [o]. The teacher could be any one person who assumes the role of a teacher. A teacher can be a student who presents or teaches something to the class. It is, therefore, not a strictly qualification- or age-related position. It is the person who aims to transmit some kind of information to the rest of the group for further discussion, understanding or simply learning. A student can be any one person on an equal basis with another. The other category accounts for communications from outside sources such as labels, books, video clips, cell phone calls etc. if the communication is responded to. On one level only four things are done in a classroom, and according to the purpose of the communication there can be four types of communicative move: "structuring" [str], "soliciting" [sol], "responding" [res], and "reacting" [rea]. Structuring sets the stage for subsequent action and behavior. It is often manifested in the form of announcements about what will happen in the classroom and who is responsible for the events. Soliciting refers to setting tasks by asking questions, issuing commands and making requests that require responses. When students reply to soliciting it is coded as responding, whereas reacting implies giving comments. Regardless of its length, whether one word or one hundred words, a move counts as one move. Likewise, whether the teacher reacts with a nod, a short statement of "very good" or a long explanation of the rules of the past perfect continuous tense, the communication counts as only one reacting move. The classic pattern is when the teacher solicits, students respond, and the teacher reacts to conclude the set of moves. 
Move types can be combined along all these categories, and so-called idiosyncratic moves may occur which distract students to such an extent that the power of other moves is diminished. Altogether, move type combined with information regarding source and target answers the question: What is being done?

Mediums used to communicate can be distinguished according to how they transmit the information. This way observation data and its analysis through coding allow us to answer the question: "How is it done?" The communicative moves may be coded as "linguistic" [l], "nonlinguistic" [n], "paralinguistic" [p] and "silence" [s]. The linguistic medium includes both written and spoken utterances. The nonlinguistic medium refers to noises, music, pictures. Paralinguistic medium involves gestures, tone of voice and movement. Silence is the absence of any of the so-far mentioned mediums.
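The source, target, move type and medium categories described above can be written down as a small data structure. The following is only a minimal sketch in Python, not part of Fanselow's system or of the original study: the names `CodedMove`, `count_classic_patterns` and `sample` are hypothetical, the abbreviation "o" for the "other" source is an assumption made to keep it distinct from "s" for student, and the sample moves are invented for illustration rather than taken from the transcript.

```python
from collections import Counter
from dataclasses import dataclass

# Abbreviations follow the text; "o" for "other" is an assumption.
SOURCES = {"t", "s", "o"}             # teacher, student, other
MOVES = {"str", "sol", "res", "rea"}  # structuring, soliciting, responding, reacting
MEDIUMS = {"l", "n", "p", "s"}        # linguistic, nonlinguistic, paralinguistic, silence


@dataclass(frozen=True)
class CodedMove:
    """One communicative move, however long the utterance is."""
    source: str
    target: str
    move: str
    medium: str = "l"

    def __post_init__(self):
        # Reject codes that are not part of the scheme described in the text.
        if self.source not in SOURCES or self.target not in SOURCES:
            raise ValueError("unknown source or target code")
        if self.move not in MOVES:
            raise ValueError("unknown move type")
        if self.medium not in MEDIUMS:
            raise ValueError("unknown medium")


def count_classic_patterns(moves):
    """Count teacher-solicit, student-respond, teacher-react sequences."""
    classic = (("t", "sol"), ("s", "res"), ("t", "rea"))
    count = 0
    for triple in zip(moves, moves[1:], moves[2:]):
        if tuple((m.source, m.move) for m in triple) == classic:
            count += 1
    return count


# Invented sample sequence, for illustration only.
sample = [
    CodedMove("t", "s", "str"),       # teacher announces the task
    CodedMove("t", "s", "sol"),       # teacher asks a question
    CodedMove("s", "t", "res"),       # student answers
    CodedMove("t", "s", "rea", "p"),  # teacher nods (paralinguistic)
    CodedMove("s", "s", "rea"),       # another student comments
]

move_tally = Counter(m.move for m in sample)
classic_count = count_classic_patterns(sample)
```

Tallying moves in this way gives a quick view of "what is being done" and "how it is done" in a coded excerpt, for instance how often the classic solicit-respond-react sequence occurs compared with student-to-student reactions.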

Key: T=teacher; S=student; C=class; sol=solicit; rea=react; res=respond; La=linguistic audio; pe=present, elicit (ask questions for which the answer is already known); ps=present, state (give factual information); st=extended discourse; ce=characterize evaluate; d=reproduce; so=study other areas than language; fg=life general; p=procedure XX

The transcript reveals that in this scene students talked more (16 lines) than the teacher (14 lines). After a few seconds' hesitation they came up with their accounts of the events of 1877. The content of the utterances was strictly focused: there was only one off-theme remark, about the impact of the tape recorder, which most probably aimed to alleviate the tension both students and the instructor felt about the instrument and the fact that everything was being recorded. The teacher started with a grand tour question (Rubin and Rubin, 1995) about the Great Strike of 1877 because students had not read historiography but only primary sources. Thus, the purpose of the question was to obtain a broad overview of the extent to which students had familiarized themselves with the cultural premises of the period. As a kind of probing, the teacher asked two further questions, and students were supposed to paraphrase the readings in their responses. The questions were closer to display questions than to genuine probing and were meant to be a kind of warm-up task. All interaction except one short laugh was linguistic. At one point (lines 42-50) the students carried on the discussion, reacting to what had been said without teacher interference. This event broke the otherwise frequent teacher-student-teacher interaction and provided feedback about the communicative value of the speakers' comments. Once S1 finished her comment, the teacher allowed no thinking time, and the free-flowing conversation was brought to an abrupt end by a 29-line-long teacher monologue.

A quick look at the rest of the first part of the class shows that this reaction of the teacher was roughly twice as long (29 lines) as the two students' responses (altogether 16 lines). In her talk the teacher linked the events and causes of the Great Strike of 1877 to those of the Great Depression of the 1930s. Drawing parallels between the two events could have offered opportunities for asking further questions or assigning creative, thinking gap activities, which could have connected learners' schemata or active story-based knowledge to the new information and unknown events in American history. Unfortunately, as the excerpt shows, these opportunities went unexploited; moreover, the teacher's uninterrupted flow of new information poured so much unknown data on students that it brought an abrupt halt to their willingness to venture comments and insights. This is a potential danger of teacher monologues, to which we will come back in a later section of the manual.

The transcript reveals (APPENDIX lines 141-259) that the teacher-initiate-student-respond pattern continued in this section of the session as well. The medium was linguistic, and the content included only the current field of study except for two procedural remarks. Teacher talk was not predominant: 88 lines as opposed to 27 countable lines of student talk, plus roughly the same amount of inaudible student talk on the recording. Yet the ratio is worse if we compare the teacher with individual students. There was no student-initiated interaction in this part of the session, and teacher utterances were always longer than those of individual students, except when the teacher asked only one question. Responses became more evaluative; participants experimented with inferences and analogies (line 150), too. Discussion of the question, as well as the first part of the class, was terminated by a long talk on Social Darwinism as an example of applied philosophy. Both main questions were based on primary readings, and all but one student utterance reached the length of three lines.

A short introduction to group work followed: the teacher distributed a question sheet containing four questions (APPENDIX lines 285-291) and asked students to form discussion groups. Participants could use their notes, the readings, their smartphones, tablets or any outside source that they had at hand during the ten minutes of small-group discussion. The rest of class time was devoted to conversing about the questions and related issues in a similar manner to the first part of the session. The basis of the discussion was an essay on the history of the period. Of all 69 coded communications only five did not focus on the subject, that is, the cultural history of America from the 1880s to the 1920s. The content of these five was procedural, and the number of teacher monologues increased in this section. There were no references to personal life or to everyday practices in general; the link between students' schemata and the readings was entirely missing.

Possible implications of a poor student-teacher talk ratio

Consequences of not relying on student schemata in bringing forth new factual information. History remains a cold, hardly accessible, archival inventory of facts with a huge number of dormant stories, which can only be exploited if the teacher opens up the links between the past and the present with tasks that step beyond question-answer-question. Rosenzweig and Thelen argue that students of history participate actively and use the past intimately, regardless of cultural and national boundaries, if the pasts they use can be connected to intimate or private spheres (1998). The central issue in a fundamentally historical way of learning and understanding culture is participation versus passivity: active, firsthand engagement versus mediation by others who have mysterious and distant agendas. If used in narrative embedding, artifacts and descriptions of the historic past invite learners to revisit their own experiences at other times and places, to imagine how they might have felt and acted, and to reflect on how the earlier experiences or circumstances might have changed or been changed by those who had originally participated in them (Rosenzweig and Thelen, 1998). If we consider these events in American history as a sum total of life experiences that feed introspection, they will begin to live their own lives and help students understand the importance of learning to use the past on their own terms. Engaging more actively with these texts would also bring in a wider variety of tastes, as each individual uses her own story as a basis for understanding someone else's story. The goal is to make such a distant past a common resource and a way to understand our thinking better. Emphasizing the general features of specific historical situations makes their relevance trans-historical and trans-generational and allows students to become interpreters rather than just observers.

This task is meant to engage you more deeply in the dynamics of using history as meaningful content and to help you think critically about why certain activities fail to involve learners.

According to the transcript, students started a discussion without teacher initiative (lines 292-297). The use function of the communication that triggered the student response is coded as "characterize-illustrate" and "characterize-label." Fanselow claims that responses coded as "characterize" and/or "characterize-evaluate" require mental operations that are the basis of development in most fields and in much learning (1987). He argues that although communications coded this way require the type of binary choice we make every day outside classrooms, they occur less than 5% of the time in most classes studied (Fanselow, 1987). This statistic proved true in the recorded class as well. There was only one instance (lines 292-295) when a student utterance could be coded as characterizing. After the response the teacher did not allow thinking time but assumed control and continued with a question, which reintroduced the teacher-student-teacher interaction pattern observed during the first half of the class. Eight of thirteen communications were coded as "recalling information, stating facts," two were exploratory questions and two utterances belonged to the previously described "characterize" category. One question the teacher asked (lines 332-333) was a reformulation and expansion of a student response.

The student indicated as S8 in the transcript also studies economics, and he is especially interested in topics that involve discussions of the economy. Later during the class he mentioned the name of John Maynard Keynes (line 412), and he eventually wrote his research paper on Keynes's theory about the Great Depression. At this point he joined in and helped maintain the teacher-student interaction. In line 331 he made a mistake, using the adjective "economical" instead of "economic". This misuse of "economical" to mean "economic" occurred frequently in other discussions as well as in the written assignments. The teacher decided to manage the mistake by repeating, reformulating and expanding. The example illustrates the complexity of the dilemma of whether or not to interrupt in such instances. The pattern of interaction was mainly recalling and stating facts about the American economy of the time and the role of the stock exchange as a potential cause of the economic recession. The content was subject-related in a solely linguistic medium. The teacher continued with a 60-line-long talk on what it means to buy stocks on margin. In line 358 there was the actual question about the meaning of the term, and students were allowed to think for five seconds. Compared to an earlier situation when 18 seconds of thinking time was allowed, this short pause was possibly not enough for students to come up with their ideas before the talk continued. It ended with soliciting factual information on laissez-faire capitalism, and only one student could become genuinely involved in the discourse.

The coding and the transcript reveal that in the remainder of the class (lines 413-768) the dynamics of interaction changed: teacher talk (approximately 305 lines) was about six times the length of student talk (approximately 50 lines). Long monologues followed short student responses to teacher soliciting (see transcript). Pauses became shorter, and their lengths varied between 5 and 10 seconds. Of the 57 coded communications, 25 were stating facts and recalling information, and these responses were given to 14 display questions. Only two exploratory questions occurred, and all characterize-evaluate statements came from the teacher. Six communications had procedural content; the rest dealt with the subject per se. In this part of the class, student-initiated interaction was entirely missing. The class ended with setting the reading tasks for the following week.
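The approximate line counts reported for this last part of the class can be checked with a short calculation. This is only an illustrative sketch: the function names `talk_ratio` and `teacher_share` are hypothetical, and the figures are the approximate counts given in the text rather than exact measurements.

```python
def talk_ratio(teacher_lines, student_lines):
    """Teacher talk divided by student talk, both in transcript lines."""
    if student_lines <= 0:
        raise ValueError("student_lines must be positive")
    return teacher_lines / student_lines


def teacher_share(teacher_lines, student_lines):
    """Teacher talk as a fraction of all counted talk."""
    total = teacher_lines + student_lines
    if total <= 0:
        raise ValueError("no talk counted")
    return teacher_lines / total


# Approximate line counts reported for lines 413-768 of the transcript.
TEACHER_LINES = 305
STUDENT_LINES = 50

ratio = talk_ratio(TEACHER_LINES, STUDENT_LINES)     # a little over six
share = teacher_share(TEACHER_LINES, STUDENT_LINES)  # fraction of counted talk
```

Line counts are a coarse measure, since they ignore inaudible stretches and pauses, so a ratio of this kind is best read alongside the coded moves rather than on its own.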

Interpretations

Transcription, coding, description and analysis of the data from self-observation walk us through what happened in the November 19th class of the course Introduction to Studying American Culture. One of the two goals of the course is to develop the sociocultural competence of the students. Byram, Zarate and Neuner hold that a learner possessing sociocultural competence will be able to interpret and bring different cultural systems into relation with one another, to interpret socially distinctive variations within a foreign cultural system, and to manage the dysfunctions and resistances peculiar to intercultural communication, which are referred to as conflict (1997).

Whenever the teacher included longer wait time (10-18 seconds), students became more active. Longer pauses tend to help students think, relate to the topic and decide whether they want to contribute or not. No less important is the fact that the communication setting is unnatural, as people sharing one mother tongue are forced to speak a foreign language. Coming up with the wrong idea, expressing it inaccurately, or speaking with a less native-like accent are all risk factors that are not easy to overcome. Interpreting and evaluating also require longer thinking time than recalling rote-learned facts. The teacher can best help her students tackle these challenges with longer wait times and with meaningful props that encourage students and help them come up with meaningful utterances. Consequently, controlled longer pauses after open-ended questions allow participants to tackle potential language- and content-related problems and help students switch to a foreign system of cultural references.

Mastering a foreign language means being willing to relativise one's own cultural position and set of values and beliefs, and this is possible through employing all the different aspects of language use. The transcript shows that only listening and speaking skills were extensively used during the session. During group work students could use written texts and jot down their ideas for later discussion, but that activity took only 10 minutes of the 90-minute class, and it was not entirely about reading or writing either. There were four students who did not participate in any activity except the group discussion. They may have decided to withdraw from participation because they were better at tasks that required reading and writing as opposed to listening and speaking. Developing sociocultural competence, however, means managing the dysfunctions and resistances peculiar to intercultural communication, which is not restricted to listening and speaking. Such emphasis on oral communication can be one factor that blocks the development of sociocultural competence.

Despite the fact that the majority of students felt they had enough discussion and mentioned it as a positive feature of the course in their feedback, even a quick glance at the transcript shows the imbalance of the student-teacher talk ratio. An especially striking feature is the frequently occurring teacher monologues. These are extended talks in which the teacher answers an earlier solicit fully or partially, reacts to a student response or clarifies issues that she thinks might be important. There was only one reference to teacher monologues in the feedback, when one student claimed: "I liked the most when you talked about a certain subject in class, which we couldn't read anywhere." As the student noticed, such monologues can be a source of knowledge about the target culture, which according to Byram et al. (1997) also constitutes a segment of sociocultural competence. Their presence in the class is problematic when it blocks student discussion. On the other hand, monologues could be an expression of teacher anxiety rooted in the fear of losing control over the discussion. Accordingly, the role and impact of these extended talks depend on the circumstances under which they occur.

One feature of the monologues is that they reduce the amount of time available for students to think about the task and articulate their views. Looking at the nature of student responses reveals the small number of evaluative responses, which Fanselow argues play a crucial role in development (1987). The few such communication moves occurred while students and teacher were discussing primary sources. These materials require rather than offer interpretations, which develops an ability to produce and operate an "interpretative system" in which learners gain insight into hitherto unknown cultural meanings, beliefs and practices either in a new or a familiar language and culture (Byram et al. 1997). Comments about course readings also support this interpretation. Students felt the amount of reading was too much, yet they unanimously found primary sources and video clips interesting, whereas none of them found the course text, An Introduction to American Studies, challenging.

Byram et al. also hold that sociocultural competence includes an affective capacity to relinquish ethnocentric attitudes towards and perceptions of otherness, and a cognitive ability to establish and maintain a relationship between native and foreign cultures (1997). Life-related tasks would help bridge the gap between native and target cultures and provide opportunities to master descriptive categories conducive to bringing the original and foreign cultures into relation. The lack of activities involving students' lives and life in general means that in this particular class little space was allowed for comparison and relativization by means of life-related tasks.

Read the following on the uses of case studies:

Case study:

(Gebhard, 1996, pp. 27-31)

Case study:

(Gebhard & Oprandy, 1999, pp. 211-215)

Bibliography:

Bailey, K. M., Curtis, A., & Nunan, D. (2001). Pursuing professional development. Boston: Heinle & Heinle.

Byram, M. (1989). Cultural Studies in Foreign Language Education. Clevedon: Multilingual Matters Ltd.

Byram, M., Zarate, G. and G. Neuner. (1997). Sociocultural competence in language learning and teaching. Strasbourg: Council of Europe.

Cogan, D. (1995). Using a counselling approach in teacher supervision. The Teacher Trainer, 9, 3-6.

Fanselow, J. F. (1987). Breaking rules: Generating and exploring alternatives in language teaching. New York: Longman.

Gebhard, J. G. (1984). Models of supervision: Choices. TESOL Quarterly, 18, 501-14.

Gebhard, J. G. (1996). Teaching English as a foreign or second language: A teacher self-development and methodology guide. Ann Arbor: The University of Michigan Press.

Gebhard, J. G. & Oprandy, R. (1999). Language teaching awareness: A guide to exploring beliefs and practices. New York: Cambridge University Press.

Gebhard, J. G., M. Fodor, and M. Lehmann. (2003). Teacher Development Through Exploration: Principles, Processes, and Issues in Hungary. In J. Andor, J. Horváth, and M. Nikolov (Eds.), Studies in English Theoretical and Applied Linguistics. Pécs: Lingua Franca Csoport. 250-261.

Griffee, D. T. (2012). An introduction to second language research methods: Design and data.

Holliday, A. (1999). Small Cultures. Applied Linguistics, 20, 237-264.

Lado, R. (1957). Linguistics Across Cultures: Applied Linguistics for Language Teachers. Ann Arbor: University of Michigan Press.

Lessard-Clouston, M. (1996). Chinese Teachers' Views of Culture in their EFL Learning and Teaching. Language, Culture and Curriculum, 9, 197-224.

Nikolov, M. (2011).

Pinker, S. (1994). The Language Instinct. New York: Penguin.

Rosenzweig, R. and D. Thelen. (1998). The Presence of the Past. New York: Columbia University Press.

Rubin, H. J. and I. S. Rubin. (1995). Qualitative Interviewing: The Art of Hearing Data. Thousand Oaks: Sage.

Saville-Troike, M. (1983). An anthropological linguistic perspective on uses of ethnography in bilingual language proficiency assessment. In C. Rivera (ed.), An Ethnographic/Sociolinguistic Approach to Language Proficiency Assessment. Clevedon, Avon: Multilingual Matters. 131-136.

Schön, D. A. (1983). The reflective practitioner: How professionals think in action. London: Temple Smith.

Vander Zanden, J. W. (1988). The Social Experience: An Introduction to Sociology. New York: Random House.

Wardhaugh, R. (1994). An Introduction to Sociolinguistics. Oxford: Blackwell.


THE ROLE OF CLASSROOM OBSERVATION IN TEACHING FUTURE TEACHERS: A CASE STUDY


Classroom observation systems in context: A case for the validation of observation systems

Published: 26 January 2019
Volume 31, pages 61–95 (2019)

Shuangshuang Liu (ORCID: orcid.org/0000-0002-1754-0631), Courtney A. Bell, Nathan D. Jones & Daniel F. McCaffrey


Researchers and practitioners sometimes presume that using a previously “validated” instrument will produce “valid” scores; however, contemporary views of validity suggest that there are many reasons this assumption can be faulty. In order to demonstrate just some of the problems with this view, and to support comparisons of different observation protocols across contexts, we introduce and define the conceptual tool of an observation system . We then describe psychometric evidence of a popular teacher observation instrument, Charlotte Danielson’s Framework for Teaching, in three use contexts—a lower-stakes research context, a lower-stakes practice-based context, and a higher-stakes practice-based context. Despite sharing a common instrument, we find the three observation systems and their associated use contexts combine to produce different average teacher scores, variation in score distributions, and different levels of precision in scores. However, all three systems produce higher average scores in the classroom environment domain than the instructional domain and all three sets of scores support a one-factor model, whereas the Framework posits four factors. We discuss how the dependencies between aspects of observation systems and practical constraints leave researchers with significant validation challenges and opportunities.


The different patterns of missingness in observation scores may be explained in several ways. First, there were records that originally were considered “missing data” because the records were incomplete. Specifically, the electronic rating system allowed for the planning and preparation domain to be rated before other domains, so administrators may have entered the ratings for the same lesson separately. To clean up multiple entries for the same observation, we combined scores from multiple records with the same teacher and rater ID that were entered within a week. These records also needed to be consistent with each other if there were overlapping ratings on certain components. Second, administrators may have conducted informal “walk-through” classroom visits in which they did not rate all of the components. This could lead to incomplete records in the system. To remove the records from informal walk-throughs, we dropped observations that only had scores from one domain.

We also ran t tests that account for the correlation due to multiple lessons per teacher and repeated ratings by each rater. We estimated the mean and standard error of the mean for each component and domain using a nested or crossed random effect model. Specifically, for UTQ, we used a crossed effect model with teacher, rater, and teacher by rater random effects. For LAUSD, we used a nested effect model with rater and teacher nested within rater. We found the results were similar to results from the simple t tests, except that all of the differences in mean scores became significant between the two practice-based contexts. Estimates from t tests that account for clustering are available upon request.

We also cannot calculate inter-rater reliabilities because teacher performances in the practice-based contexts were not double-scored.

Eigenvalues and scree plots for all contexts are available upon request.

FFT was developed at Educational Testing Service and was used (with some differences) as a part of the Praxis III assessment for beginning teachers in the U.S. states of Ohio and Arkansas.

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education [AERA/APA/NCME]. (2014). Standards for educational and psychological testing . Washington, D.C.: American Educational Research Association.


Archer, J., Cantrell, S., Holtzman, S. L., Joe, J. N., Tocci, C. M., & Wood, J. (2016). Better feedback for better teaching: a practical guide to improving classroom observations . New York: John Wiley & Sons.

Bell, C. A., Gitomer, D. H., McCaffrey, D. F., Hamre, B. K., Pianta, R. C., & Qi, Y. (2012). An argument approach to observation protocol validity. Educational Assessment, 17 (2–3), 62–87. https://doi.org/10.1080/10627197.2012.715014 .


Bell, C., Jones, N., Lewis, J., Qi, Y., Kirui, D., Stickler, L., & Liu, S. (2016). Understanding consequential assessment systems of teaching: Year 1 final report to Los Angeles Unified School District (Research Memorandum No. RM-16-12) . Princeton, NJ: Educational Testing Service.

Carey, M. D., Mannell, R. H., & Dunn, P. K. (2011). Does a rater’s familiarity with a candidate’s pronunciation affect the rating in oral proficiency interviews? Language Testing, 28 (2), 201–219. https://doi.org/10.1177/0265532210393704 .

Casabianca, J. M., Lockwood, J. R., & McCaffrey, D. F. (2015). Trends in classroom observation scores. Educational and Psychological Measurement, 75 (2), 311–337. https://doi.org/10.1177/0013164414539163 .

Chaplin, D., Gill, B., Thompkins, A., & Miller, H. (2014). Professional practice, student surveys, and value-added: Multiple measures of teacher effectiveness in the Pittsburgh Public Schools. REL 2014-024. Regional Educational Laboratory Mid-Atlantic.

Charalambous, C. Y., & Praetorius, A. K. (2018). Studying mathematics instruction through different lenses: setting the ground for understanding instructional quality more comprehensively. ZDM , 50 (3), 355–366.

Cohen, J., & Grossman, P. (2016). Respecting complexity in measures of teaching: keeping students and schools in focus. Teaching and Teacher Education, 55 , 308–317. https://doi.org/10.1016/j.tate.2016.01.017 .

Cohen, J., Ruzek, E., & Sandilos, L. (2018). Does teaching quality cross subjects? Exploring consistency in elementary teacher practice across subjects. AERA Open, 4 (3), 2332858418794492), 233285841879449.

Dalland, C.P., Klette, K., & Svenkerud, S. (2018). Video studies and the challenge of selecting time scales. International Journal of Research & Method in Education. Manuscript submitted for publication.

Danielson, C. (1996). Enhancing professional development: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Danielson, C. (2007). Enhancing professional practice: a framework for teaching . Alexandria, VA: Association for Supervision and Curriculum Development.

Danielson, C. (2011). Enhancing professional practice: a framework for teaching . Princeton, NJ: The Danielson Group.

Danielson, C. (2013). The Framework for Teaching evaluation instrument, 2013 Edition. Retrieved January 17, 2017 from https://www.danielsongroup.org/framework/ .

Darling-Hammond, L., & Rothman, R. (2015). Teaching in the flat world: learning from high-performing systems. Teachers College Press.

Donaldson, M. L., & Woulfin, S. (2018). From tinkering to going “rogue”: how principals use agency when enacting new teacher evaluation systems. Educational Evaluation and Policy Analysis 0162373718784205.

Engelhard, G. (1996). Evaluating rater accuracy in the performance assessments. Journal of Educational Measurement, 33 (1), 56–70.

Floman, J. L., Hagelskamp, C., Brackett, M. A., & Rivers, S. E. (2017). Emotional bias in classroom observations: within-rater positive emotion predicts favorable assessments of classroom quality. Journal of Psychoeducational Assessment, 35 (3), 291–301.

Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: a research synthesis. National Comprehensive Center for Teacher Quality . Retrieved on December 3, 2008 from: https://gtlcenter.org/sites/default/files/docs/EvaluatingTeachEffectiveness.pdf .

Hafen, C. A., Hamre, B. K., Allen, J. P., Bell, C. A., Gitomer, D. H., & Pianta, R. C. (2015). Teaching through interactions in secondary school classrooms revisiting the factor structure and practical application of the Classroom Assessment Scoring System–Secondary. The Journal of Early Adolescence, 35 (5–6), 651–680.

Harik, P., Clauser, B. E., Grabovsky, I., Nungester, R. J., Swanson, D., & Nandakumar, R. (2009). An examination of rater drift within a generalizability theory framework. Journal of Educational Measurement, 46 (1), 43–58.

Herlihy, C., Karger, E., Pollard, C., Hill, H. C., Kraft, M. A., Williams, M., & Howard, S. (2014). State and local efforts to investigate the validity and reliability of scores from teacher evaluation systems. Teachers College Record, 116 (1), 1–28.

Hess, F. M. (2015). Lofty promises but little change for America’s schools. Education Next, 15 (4), 50–56.

Hill, H. C., Charalambous, C. Y., Blazar, D., McGinn, D., Kraft, M. A., Beisiegel, M., et al. (2012a). Validating arguments for observational instruments: attending to multiple sources of variation. Educational Assessment, 17 (2–3), 88–106.

Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012b). When rater reliability is not enough: teacher observation systems and a case for the generalizability study. Educational Researcher, 41 (2), 56–64. https://doi.org/10.3102/0013189X12437203 .

Ho, A. D., & Kane, T. J. (2013). The reliability of classroom observations by school personnel. Research paper. MET Project. Bill & Melinda Gates Foundation.

Hoffman, J. V., Sailors, M., Duffy, G. R., & Beretvas, S. N. (2004). The effective elementary classroom literacy environment: examining the validity of the TEX-IN3 Observation System. Journal of Literacy Research, 36 (3), 303–334.

Joe, J. N., McClellan, C. A., & Holtzman, S. L. (2014). Scoring design decisions: reliability and the length and focus of classroom observations. In T. J. Kane, K. Kerr, & R. C. Pianta (Eds.), Designing teacher evaluation systems (pp. 415–443). New York: Jossey Bass.

Joe, J. N., Tocci, C. M., Holtzman, S. L., & Williams, J. C. (2013). Foundations of observation: considerations for developing a classroom observation system that helps districts achieve consistent and accurate scores. MET Project, Policy and Practice Brief . Retrieved on January 21, 2019 from http://k12education.gatesfoundation.org/resource/foundations-of-observations-considerations-for-developing-a-classroom-observation-system-that-helps-districts-achieve-consistent-and-accurate-scores/ .

Jølle, L. (2015). Rater strategies for reaching agreement on pupil text quality. Assessment in Education: Principles, Policy & Practice, 22 (4), 458–474.

Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (pp. 17–64). New York: Praeger.

Kane, M. T. (2013a). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50 (1), 1–73. https://doi.org/10.1111/jedm.12000 .

Kane, M. T. (2013b). Validation as a pragmatic, scientific activity. Journal of Educational Measurement, 50 (1), 115–122. https://doi.org/10.1111/jedm.12007 .

Kane, T. J., & Staiger, D. O. (2012). Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Retrieved on January 4, 2013 from http://metproject.org/downloads/MET_Gathering_Feedback_Research_Paper.pdf .

Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2010). Identifying effective classroom practices using student achievement data, (September 2010), 51. https://doi.org/10.3386/w15803 .

Kraft, M. A., & Gilmour, A. F. (2016). Can principals promote teacher development as evaluators? A case study of principals’ views and experiences. Educational Administration Quarterly, 52 (5), 711–753.

Lazarev, V., Newman, D., Sharp, A., & (ED), R. E. L. W. (2014). Properties of the multiple measures in Arizona’s teacher evaluation model. REL 2015-050. Regional Educational Laboratory West, (October). Retrieved on July 23, 2018 from https://files.eric.ed.gov/fulltext/ED548027.pdf .

Leckie, G., & Baird, J. A. (2011). Rater effects on essay scoring: a multilevel analysis of severity drift, central tendency, and rater experience. Journal of Educational Measurement, 48 (4), 399–418.

Lockwood, J. R., Savitsky, T. D., & McCaffrey, D. F. (2015). Inferring constructs of effective teaching from classroom observations: an application of Bayesian exploratory factor analysis without restrictions. Ann. Appl. Stat., 9 (3), 1484–1509.

Martin-Raugh, M., Tannenbaum, R. J., Tocci, C. M., & Reese, C. (2016). Behaviorally anchored rating scales: An application for evaluating teaching practice. Teaching and Teacher Education, 59, 414–419. https://doi.org/10.1016/j.tate.2016.07.026

Martinez, F., Taut, S., & Schaaf, K. (2016). Classroom observation for evaluating and improving teaching: an international perspective. Studies in Educational Evaluation, 49 , 15–29.

McCaffrey, D. F., Yuan, K., Savitsky, T. D., Lockwood, J. R., & Edelen, M. O. (2015). Uncovering multivariate structure in classroom observations in the presence of rater errors. Educational Measurement: issues and Practice, 34 (2), 34–46.

McClellan, C. (2013). What it looks like: master coding videos for observer training and assessment . Seattle: Bill & Melinda Gates Foundation. Retrieved on January 14, 2014 from http://k12education.gatesfoundation.org/resource/what-it-looks-like-master-coding-videos-for-observer-training-and-assessment/ .

McClellan, C., Atkinson, M., & Danielson, C. (2012). Teacher evaluator training & certification: lessons learned from the Measures of Effective Teaching project (Practitioner Series for Teacher Evaluation). San Francisco: Teachscape. Retrieved Jan 3, 2019 from https://www.issuelab.org/resource/teacher-evaluator-training-certification-lessons-learned-from-themeasures-of-effective-teaching-project.html .

Muijs, D., Kyriakides, L., van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art–teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25 (2), 231–256.

Myford, C. M., & Wolfe, E. W. (2009). Monitoring rater performance over time: a framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 46 (4), 371–389.

Netolicky, D. M. (2016). Coaching for professional growth in one Australian school: “oil in water”. International Journal of Mentoring and Coaching in Education, 5 (2), 66–86. https://doi.org/10.1108/IJMCE-09-2015-0025 .

Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom assessment scoring system (CLASS) manual, pre-K . Baltimore: Brookes.

Pons, A. (2018). What does teaching look like? A new video study [Blog post]. Retrieved from http:// oecdeducationtoday.blogspot.com/2018/01/what-does-teaching-look-like-new-video.html . Accessed 2 Dec 2018.

Praetorius, A.-K., Pauli, C., Reusser, K., Rakoczy, K., & Klieme, E. (2014). One lesson is all you need? Stability of instructional quality across lessons. Learning and Instruction, 31 , 2–12. https://doi.org/10.1016/j.learninstruc.2013.12.002 .

Praetorius, A. K., & Charalambous, C. Y. (2018). Classroom observation frameworks for studying instructional quality: looking back and looking forward. ZDM - Mathematics Education, 50 (3), 535–553. https://doi.org/10.1007/s11858-018-0946-0 .

Roegman, R., Goodwin, A. L., Reed, R., & Scott-McLaughlin, R. M. (2016). Unpacking the data: an analysis of the use of Danielson’s (2007) Framework for Professional Practice in a teaching residency program. Educational Assessment, Evaluation and Accountability, 28 (2), 111–137. https://doi.org/10.1007/s11092-015-9228-3 .

Sahlberg, P. (2011). Finnish lessons . New York: Teachers College Press.

Schoenfeld, A. H., Floden, R., El Chidiac, F., Gillingham, D., Fink, H., Hu, S., et al. (2018). On classroom observations. Journal for STEM Education Research., 1 , 34–59. https://doi.org/10.1007/s41979-018-0001-7 .

Seidel, T., Prenzel, M., & Kobarg, M. (2005). How to run a video study. Technical report of the IPN Video Study . Berlin: Waxmann

Shepard, L. A. (2016). Evaluating test validity: reprise and progress. Assessment in Education: Principles, Policy and Practice, 23 (2), 268–280. https://doi.org/10.1080/0969594X.2016.1141168 .

State of New Jersey Administrative Code, 6A:10-7.1 (2016), Subchapter 7.

Steinberg, M. P., & Garrett, R. (2016). Classroom composition and measured teacher performance: what do teacher observation scores really measure? Educational Evaluation and Policy Analysis, 38 (2), 293–317. https://doi.org/10.3102/0162373715616249 .

Stigler, J. W., Gonzales, P., Kwanaka, T., Knoll, S., & Serrano, A. (1999). The TIMSS videotape classroom study: methods and findings from an exploratory research project on eighth-grade mathematics instruction in Germany, Japan, and the United States, Washington D. C. Retrieved Oct 12, 2014 from: http://nces.ed.gov/pubs99/1999074.pdf .

Taut, S., Santelices, M. V., & Stecher, B. (2012). Validation of a national teacher assessment and improvement system. Educational Assessment, 17 (4), 163–199.

Taut, S., & Sun, Y. (2014). The development and implementation of a national, standards-based, multi-method teacher performance assessment system in Chile. Education Policy Analysis Archives, 22 (71), 1–31. https://doi.org/10.14507/epaa.v22n71.2014 .

van der Lans, R. M., van de Grift, W. J., & van Veen, K. (2017). Individual differences in teacher development: an exploration of the applicability of a stage model to assess individual teachers. Learning and Individual Differences, 58 , 46–55.

Van der Lans, R. M., van de Grift, W. J., van Veen, K., & Fokkens-Bruinsma, M. (2016). Once is not enough: establishing reliability criteria for feedback and evaluation decisions based on classroom observations. Studies in Educational Evaluation, 50 , 88–95.

White, T. (2014a). Evaluating teachers more strategically: using performance results to streamline evaluation systems . Retrieved September 6, 2018 from: https://www.carnegiefoundation.org/wp-content/uploads/2014/12/BRIEF_evaluating_teachers_strategically_Jan2014.pdf .

White, T. (2014b). Adding eyes: the rise, rewards, and risks of multi-rater teacher observation systems. Retrieved September 6, 2018 from: https://www.carnegiefoundation.org/wp-content/uploads/2014/12/BRIEF_Multi-rater_evaluation_Dec2014.pdf .

White, M. C. (2018). Rater performance standards for classroom observation instruments. Educational Researcher , 47 (8), 492–501. https://doi.org/10.3102/0013189X18785623 .

Whitehurst, G., Chingos, M., & Lindquist, K. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Providence, RI: Brown Center on Education Policy at the Brookings Institution .

Download references

This study was supported by grants from the W.T. Grant Foundation (Grant # 181068) and the Bill and Melinda Gates Foundation (Grant # OPP52048). For making the data available for this study, we thank administrators, teachers, and staff from Los Angeles Unified School District (LAUSD) and three large southern districts. The opinions expressed herein are those of the authors and not those of the funding agencies or participants.

Author information

Shuangshuang Liu and Courtney A. Bell contributed equally to this work.

Authors and Affiliations

Educational Testing Service, 660 Rosedale Rd, Princeton, NJ, 08541, USA

Shuangshuang Liu, Courtney A. Bell & Daniel F. McCaffrey

Boston University, Boston, MA, 02215, USA

Nathan D. Jones

Corresponding author

Correspondence to Courtney A. Bell.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Liu, S., Bell, C.A., Jones, N.D. et al. Classroom observation systems in context: A case for the validation of observation systems. Educ Asse Eval Acc 31, 61–95 (2019). https://doi.org/10.1007/s11092-018-09291-3

Received: 16 February 2018

Accepted: 27 December 2018

Published: 26 January 2019

Issue Date: 15 February 2019

DOI: https://doi.org/10.1007/s11092-018-09291-3

Teacher evaluation
Observation systems
Factor analyses
Teaching quality