Undergraduate science education reform has focused attention on the teaching, learning, and assessment of core concepts, such as the disciplinary core idea of evolution (e.g., NRC 2001a, b, 2012a, b; AAAS 2011; NGSS Lead States 2013; Sinatra et al. 2008). A large body of research in evolution education has resulted from these initiatives. Much of this work has been directed at student understanding of evolution and non-normative ideas about evolution, sometimes with the intention of developing pedagogies to initiate conceptual change (e.g., Bishop and Anderson 1990; Demastes et al. 1995a, b, 1996; Nehm and Schonfeld 2007; Scharmann 1994; Nehm and Reilly 2007). These studies form a substantial literature regarding the magnitudes of evolutionary knowledge, non-normative ideas, and acceptance of biology students and teachers. Yet remarkably little is known about evolutionary knowledge and reasoning in another undergraduate population taught in a very different context: biological anthropology (e.g., Cunningham and Wescott 2009). Indeed, while biological anthropology and biology share a common ‘language’ of evolution (Wilson 2005), they offer distinct experiences when learning evolutionary theory. Anthropology offers a unique learning environment focusing on a single lineage and associated case studies of evolution occurring in that lineage. Do these different educational experiences produce significant differences in knowledge, misconceptions, and reasoning patterns? The overarching goal of our work was to begin to explore evolutionary knowledge and reasoning patterns in this population and compare them to undergraduate biology students.
The courses from which our populations of students were sampled appeared to be comparable on paper. Both courses represent one of the two (biology) or three (anthropology) introductory level offerings for each program, the order of which is unimportant. Both require a laboratory component in addition to the lecture component. Despite these similarities and the fact that both anthropology courses and biology courses use evolutionary theory as their foundation, our findings show that the students who come from these backgrounds displayed demographic and knowledge differences. In fact, there were significant differences for all demographic and background variables tested. For example, the anthropology students in our sample had taken fewer evolution-related courses and had not progressed as far in their overall college coursework. Given this information, it is perhaps not surprising that the two populations displayed differences in their understanding of evolutionary concepts. Across all measures of knowledge and reasoning, anthropology students had worse scores than the biology students, despite their open-response answers being comparable in terms of verbosity (cf. Federer et al. 2015). These differences in knowledge and misconceptions were largely (i.e., ACORNS KC) or partially (i.e., CINS, ACORNS NI, ACORNS MT) explained by controlling for demographic and background variables, but significant differences, with small effect sizes, remained. Specifically, when controlling for background and knowledge variables, anthropology and biology students no longer differed in the number of accurate ideas that they used in their evolutionary explanations. Nevertheless, as compared to biology students, anthropology students displayed lower CINS scores, were more likely to bring non-normative ideas into their evolutionary explanations, and remained further from expert-like reasoning.
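The logic of "controlling for" background variables can be illustrated with a minimal sketch. The data below are synthetic and the variable names (`prior_courses`, `score`, `is_anthro`) are hypothetical; this is not the data or the exact model used in the study. It only shows how a raw group difference can shrink, without disappearing, once a background covariate enters the regression.

```python
# Illustrative only: synthetic data and hypothetical variable names.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations; an intercept is added here."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve(xtx, xty)

# Four hypothetical biology students (0) and four anthropology students (1).
is_anthro = [0, 0, 0, 0, 1, 1, 1, 1]
prior_courses = [2, 3, 4, 3, 0, 1, 1, 0]
# Scores constructed so that each prior course adds 1 point and
# anthropology membership subtracts 0.5 points.
score = [10 + c - 0.5 * g for c, g in zip(prior_courses, is_anthro)]

# Unadjusted group difference (coefficient on the group indicator alone).
raw_diff = ols([[g] for g in is_anthro], score)[1]
# Group difference after adjusting for prior coursework.
adjusted_diff = ols([[g, c] for g, c in zip(is_anthro, prior_courses)], score)[1]

print(round(raw_diff, 3), round(adjusted_diff, 3))
```

By construction, most of the raw gap in this toy example comes from the difference in prior coursework, and the adjusted coefficient recovers the smaller residual group difference.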
Many different variables can be used to place a learner along a novice-expert continuum (e.g., Beggrow and Nehm 2012). In this study, we focused on three variables: amount of knowledge, number of misconceptions, and sensitivity to surface features in evolutionary reasoning. Experts are expected to have high knowledge, few misconceptions, and low sensitivity to surface features (Nehm and Ridgway 2011). It is possible for respondents to demonstrate novice-like behavior for some of these variables and expert-like behavior for others. Both biology and anthropology students demonstrated novice-like levels of evolutionary knowledge. Specifically, both populations performed poorly on the CINS, a non-majors test of evolutionary knowledge (Anderson et al. 2002), with mean scores of 13.6 and 10.68, respectively. Furthermore, while both biology and anthropology students demonstrated few misconceptions in their explanations of evolutionary change (i.e., few NIs, 0.18 and 0.37, respectively), they also demonstrated low levels of knowledge (i.e., few KCs, 1.07 and 0.78, respectively) and inconsistent evolutionary models (i.e., low rates of purely scientific models, 61% and 38%, respectively).
Although both populations demonstrated novice-like knowledge and reasoning patterns, biology students performed significantly better for all of these variables than anthropology students. The difference was the most striking for evolutionary reasoning, where biology students had nearly twice the rate of normative evolutionary models as anthropology students. Therefore, for the purposes of this paper, we will classify biology students as novices and anthropology students as extreme novices. For anthropology students, then, doing worse on these three measures (CINS, ACORNS NI, and ACORNS MT) compared to the biology students could reflect their relatively early stage of learning about evolution. As extreme novices are learning, non-normative ideas can often persist while new and normative scientific ideas are integrated into their knowledge frameworks (e.g., Vosniadou et al. 2008; Kelemen and Rosset 2009; Nehm 2010), resulting in a synthetic model of both normative and non-normative ideas (e.g., Beggrow and Nehm 2012; Nehm and Ha 2011; Vosniadou et al. 2008). Accordingly, when a task cues that synthetic model, all the knowledge (normative and non-normative) will be elicited together. This could explain why anthropology students had KCs similar to the biology students but, because they are still in the early stages of building their evolution knowledge frameworks, their misconceptions were elicited as well, thereby resulting in a majority of explanations that exhibited non-scientific reasoning models. Similarly, on the CINS multiple-choice test, it is likely that for anthropology students, enough misconceptions are being cued that the incorrect choices (designed to highlight typical non-normative ideas; Anderson et al. 2002) appear as viable options. Meanwhile, biology students, while they performed as novices overall, did have a slight majority of explanations scored as pure scientific models.
On the novice-expert continuum, some of these explanations fit the “emerging expert” category (adaptive reasoning using key concepts only), which is not completely unexpected given prior research findings with similar populations (Beggrow and Nehm 2012; Nehm and Ha 2011; Nehm and Schonfeld 2008).
Sensitivity to item surface features can also be used to place learners along a novice-expert continuum. The fact that item surface features affect student learning and problem solving has been well-documented (e.g., Caleon and Subramaniam 2010; Chi et al. 1981; diSessa et al. 2004; Evans et al. 2010; Gentner and Toupin 1986; Nehm and Ha 2011; Sabella and Redish 2007; Sawyer and Greeno 2009; Schmiemann et al. 2017). In evolutionary biology, changing various types of item surface features (e.g., animal vs. plant taxon; loss vs. gain of trait; familiar vs. unfamiliar taxon/trait) has been found to influence reasoning patterns of novices (Federer et al. 2015; Ha et al. 2006; Nehm et al. 2012; Nehm and Ha 2011; Nehm and Reilly 2007; Nehm and Ridgway 2011; Opfer et al. 2012), yet experts tend to see beneath these surface feature effects (e.g., Chi et al. 1981; Nehm and Ridgway 2011; Opfer et al. 2012). We used two types of surface features in this study—trait familiarity and taxon—and will discuss the results for each in turn.
Surface feature 1: Trait familiarity
Our study used items in which all taxa were standardized as familiar, but the traits presented were either familiar or unfamiliar. Levels of familiarity were hypothesized a priori using Google™ PageRank (see Additional file 2: Appendix 2), but confirmed a posteriori using student familiarity ratings. To our knowledge, this is the first study to explore the effects of surface feature familiarity on evolutionary reasoning while keeping constant the familiarity of the taxon. This approach is essential to tease apart the role of familiarity with “who” evolves vs. “what” evolves. Therefore, this study is the only one we know of that allows the robust investigation of trait familiarity in evolutionary knowledge and reasoning patterns. We found that when we varied the familiarity of traits (i.e., what is evolving) in our items, but kept the taxon (i.e., who is evolving) familiar, biology and anthropology students demonstrated different reasoning patterns. Specifically, biology students’ explanations were not sensitive to trait familiarity for any knowledge or reasoning outcome variable. The anthropology students’ explanations were similarly resistant to this surface feature in terms of their misconceptions and evolutionary reasoning, but did not exhibit comparable resistance in terms of the number of KCs used. Previous research investigating the impact of the familiarity of item surface features on student evolutionary reasoning using the ACORNS instrument has shown more pronounced effects. However, these studies differ from ours in that familiarity was standardized across both the taxon (i.e., who is evolving) and the trait (i.e., what is evolving) (e.g., Nehm and Ha 2011; Opfer et al. 2012). Therefore, it is possible that the specific surface feature (e.g., trait vs. taxon) and the number of surface features (e.g., trait/taxon only vs. taxon and trait) designated as unfamiliar may impact research findings. For example, Federer et al. 
(2015) found that students used more KCs and NIs in their explanations for items of familiar taxa/familiar traits compared to items of unfamiliar taxa/unfamiliar traits. We did not find this to be the case with either biology or anthropology students; instead, we saw anthropology students using more KCs but no difference in their NIs. Another study also found that students used more KCs in their explanations for items of familiar taxa/familiar traits compared to items of unfamiliar taxa/unfamiliar traits, but found no difference for cognitive biases (e.g., teleological misconceptions; Opfer et al. 2012). These results demonstrate a similar pattern to ours, but use slightly different measures of non-normative ideas. Again, it is important to note that both of these studies differed from ours in that the authors designed their items such that both the traits and the taxa were familiar or unfamiliar. Therefore, even though we did find some effects of familiarity on student knowledge and reasoning patterns, our results did not completely align with those from previous ACORNS research. This raises the question of whether keeping one item feature familiar is sufficient to mitigate some potential effects that unfamiliarity has on student reasoning. Indeed, outside of evolution, in an investigation of familiarity effects on genetics understanding, Schmiemann et al. (2017) compared measures across items that featured familiar or unfamiliar plants and animals with familiar traits and found no effects of their surface features on students’ genetic reasoning. Similar to our study, only the familiarity of one surface feature was altered while the other remained familiar across items. However, while our study varied trait familiarity, their study varied taxon familiarity. Taking their findings into consideration with ours, the question of why it might matter who evolves, or what evolves, remains open.
Additionally, while many studies have shown that surface features generally do not affect experts’ reasoning (e.g., Chi et al. 1981; Chi 2006; Nehm and Ha 2011; Nehm and Ridgway 2011; Opfer et al. 2012), it is not known how the familiarity of surface features would affect experts. Because other surface features do not significantly impact experts, it is likely that experts would not be affected by the familiarity of the surface features we used here. Therefore, referring back to a novice-expert continuum, biology students demonstrate more expert-like reasoning (relative to anthropology students) in their low sensitivity to the varying familiarity of the item surface features used here, although studies with experts are needed to confirm this characterization.
Surface feature 2: Taxon
While research into the effects of surface feature familiarity is minimal, there is even less work regarding whether the use of humans as the focal taxon impacts students’ evolutionary reasoning patterns. Using human examples in evolution education has been suggested to help motivate interest in the topic, form a bridge to less familiar contexts (i.e., non-human), and help students overcome misconceptions (e.g., Hillis 2007; Medin and Atran 2004; Nettle 2010; Paz-y-Miño and Espinosa 2009; Pobiner et al. 2018; Seoh et al. 2016; Wilson 2005). However, anthropology students learn evolutionary theory within a single context (primate lineages) and their knowledge might be more tightly bound to this context compared to that of biology students (diverse array of taxa) (Bjork and Richardson-Klavehn 1989). Thus, any differences we would expect to see in anthropology students’ reasoning would be between human and non-human item measures; specifically, we would have expected the human context to elicit more key concepts (even if more naive ideas were also provided). Indeed, our study did find effects of taxon category on anthropology students’ knowledge measures and reasoning patterns, but not for the biology students. However, contrary to what was expected for anthropology students, non-human items had higher key concept scores and were significantly more likely to elicit a pure scientific MT, though the effect size was small. These results raise the question of why their knowledge patterns were not as predicted. The only other study, to our knowledge, that has looked at differences in evolutionary reasoning across human and non-human items did find similar results (Ha et al. 2006). Ha and colleagues used items asking about evolution in humans and non-humans to examine students’ explanations across various ages for accurate scientific ideas and misconceptions.
They found that when asked about human evolution, students were less likely to use an accurate scientific explanation of evolution by natural selection. Furthermore, both human and animal items were more likely to elicit naive ideas regarding the use/disuse of traits as well as intentionality (Ha et al. 2006). While Ha et al. looked at these patterns in elementary through high school level students (who are not learning evolutionary theory situated within a human context), the similarity of their results to ours aligns with our placement of anthropology students (who have received very little evolutionary instruction overall) on the extreme novice end of the continuum for evolutionary reasoning with regard to their sensitivity to taxon category. Our results generated little evidence in support of the claim that learning evolution within a human evolution context (i.e., primate lineage) is advantageous. Incorporating human examples may still be beneficial, but only when interspersed with examples of other taxonomic contexts. Our results raise numerous questions about what might be effective ways of integrating human examples into evolutionary instruction.
Although a number of studies suggest that the inclusion of human evolution in evolution instruction has the potential to improve learning, only two studies to our knowledge have directly investigated these effects. Evidence for positive impacts resulting from the inclusion of human evolution has been found for both human evolution instruction and human evolution assessment items (e.g., Nettle 2010; Pobiner et al. 2018). In a study with college-level psychology students, Nettle found that participants who were taught evolution in the context of humans performed better on questionnaires that invoked human evolution rather than evolution in non-human taxa, particularly regarding misunderstandings stemming from the lack of attention to intraspecific variation (other non-normative ideas also persisted). Weaknesses of Nettle’s (2010) study worth noting include a limited focus on assessing students on human vs. non-human evolution (as opposed to investigating impacts of human context on learning evolution) and a lack of validity and reliability evidence for the instrument. In contrast, Pobiner et al. (2018) developed human evolution curriculum mini-units for high school biology students and measured evolutionary knowledge both pre- and post-instruction using instruments for which validity and reliability evidence has been gathered (e.g., ACORNS). They found that students displayed a gain in knowledge measures post-instruction, though their analysis was limited to three key concepts (Pobiner et al. 2018). Even though this finding aligns with our results (anthropology students did not differ from biology students in their ACORNS key concept scores), their analyses did not include naive ideas, nor did they compare their human evolution curriculum with a non-human evolution curriculum (Pobiner et al. 2018).
Thus, their findings are limited and, beyond student interest or motivation, do not provide strong evidence for an advantage of human evolution instruction (Pobiner et al. 2018). Given the paucity of empirical research on human evolution instruction, it is entirely possible that the human context itself provides none of the advantages described above for learning and applying evolutionary concepts, and that the advantages seen result instead from increasing the diversity of contexts of evolutionary content in general.
The NRC (2001a, b) emphasizes that an integrative mental framework utilized across a range of contexts is essential for achieving competency in science. If biology students are better at applying the evolutionary ideas that they have learned across situational features (i.e., non-human and human evolutionary change), it raises the question as to what it is about biology, which anthropology lacks, that fosters this more flexible conceptual framework. Theory suggests that this lack of flexibility could be a by-product of the focused nature of the evolutionary theory that learners experience in anthropology (e.g., Jacobson and Spiro 1995; Spiro et al. 1989). By only representing evolutionary theory using a single theme (e.g., evolution in the primate lineage), the construct of evolution becomes oversimplified, the likelihood of embedded misconceptions increases, and the likelihood of achieving flexible, transferable knowledge frameworks decreases (Jacobson and Spiro 1995). Incorporating a variety of examples across a diversity of contexts has been suggested as a more optimal method for teaching (Anderson et al. 1996; Jacobson and Spiro 1995; Nehm 2018; Opfer et al. 2012; Spiro et al. 1989). Accordingly, the biology students demonstrate some ability to consistently apply their evolutionary knowledge across such a range, a skill the anthropology students do not seem to have mastered yet.
Ultimately, biology students’ explanatory frameworks appear to be relatively more developed and coherent than those of the anthropology students as they exhibit consistency in application across taxon categories and across trait familiarity (Kampourakis and Zogza 2009; Nehm 2018). Considering that experts are better at seeing beneath surface features (e.g., Chi 2006), and that transfer is a factor of representation and degree of practice (Anderson et al. 1996), it seems an advantage for learning evolutionary concepts and fostering more advanced conceptual frameworks lies in teaching a construct, like evolution, across a diversity of contexts.
While we did control for many demographic and background variables, an alternative explanation could be that some other differences in biology and anthropology students that we did not control for accounted for the sensitivity to taxon that the anthropology students displayed. Their sensitivity to the human taxon could be a result of their limited exposure to anthropology (the majority of the students’ only anthropology course was the one they were currently enrolled in). Future studies including anthropology students with more experience in terms of coursework could help resolve this issue.
Implications for instruction
The finding that naive ideas were more common in anthropology students compared to biology students (when demographic and background features were held constant) suggests that targeting naive ideas should be an instructional goal for anthropology education. Additionally, considering the positive effects associated with incorporating human examples into biology instruction found by other authors (e.g., deSilva 2004; Flammer 2006; Nettle 2010; Price 2012; Pobiner et al. 2018; Seoh et al. 2016), another potential instructional goal could be incorporating non-human comparative examples into anthropology instruction. Providing a greater diversity of contexts for anthropology students could help build greater flexibility into their conceptual frameworks and foster more expert-like reasoning. Clearly, more studies including anthropology students, instructors, and experts are called for, as they will continue to help clarify how contextual factors impact the learning of evolution.
Limitations
One major limitation is that biology and anthropology students may be different populations as evidenced by their significantly different patterns of demographic and background variables. One of the most striking differences is that the vast majority of anthropology students have taken only one anthropology class (i.e., the one they were in while completing the survey). In contrast, most biology students had already taken biology classes in addition to the one they were in during the survey. Therefore, although both populations were sampled at a similar time in their academic careers, these findings demonstrate that care must be taken to ensure that comparisons between anthropology and biology students are appropriate. However, even when controlling for the number of prior courses, significant differences between the two populations were still found using regression analyses. A potentially more appropriate method for comparing these two populations could be propensity score matching using a larger data set. Additionally, recruiting students from higher level courses could potentially help mitigate these concerns.
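To make the propensity score matching suggestion concrete, the following is a minimal sketch of the general technique, not an analysis performed in this study. It assumes a single hypothetical covariate (`prior_courses`), synthetic data, a hand-rolled logistic regression, and a caliper of 0.1 on the probability scale; a real analysis would use many covariates, a larger data set, and an established statistics package.

```python
import math

# Illustrative only: synthetic data, a single hypothetical covariate.

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by batch gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def match(scores, groups, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement."""
    treated = [i for i, g in enumerate(groups) if g == 1]
    controls = {i for i, g in enumerate(groups) if g == 0}
    pairs = []
    for t in treated:
        if not controls:
            break
        # Closest remaining control by propensity score; ties broken by index.
        c = min(controls, key=lambda j: (abs(scores[j] - scores[t]), j))
        if abs(scores[c] - scores[t]) <= caliper:
            pairs.append((t, c))
            controls.remove(c)
    return pairs

# 1 = anthropology, 0 = biology (hypothetical students).
is_anthro = [1] * 6 + [0] * 8
prior_courses = [0, 0, 1, 1, 2, 3, 1, 2, 2, 3, 4, 4, 5, 3]

# Step 1: estimate each student's probability of being in the anthropology group.
b0, b1 = fit_logistic(prior_courses, is_anthro)
scores = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in prior_courses]

# Step 2: pair each anthropology student with a similar biology student.
pairs = match(scores, is_anthro)
print(pairs)
```

Outcome comparisons would then be restricted to the matched pairs, so that the two groups resemble each other on the measured background variable. Unmeasured differences, such as motivation or interest, are not addressed by matching.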
As described above, anthropology and biology students may differ in evolutionary knowledge and reasoning patterns due to their respective training. However, it is also possible that the populations enrolling in each of these courses are different in the first place, and thus, the outcomes may not be indicative of the impact of their respective types of evolutionary training. We controlled for many of the differences among students in the analyses, but we were not able to control for every student variable. For example, it is possible that motivation and interest may differ among the biology and anthropology students in the sampled population. Specifically, the introductory biology course in which this study took place was designed for biology majors and most of the students in the class were biology majors. There are alternative introductory level biology courses at the university for non-major students. In contrast, the introductory anthropology class used in this study is taken by both majors and non-majors, and there are no other introductory course offerings for non-majors. The different introductory course structures for these two disciplines may have contributed to the discrepancy in previous coursework observed between our two populations, and may differentially impact student motivation and/or interest. In terms of the former limitation, sampling from upper level courses for comparison or, alternatively, sampling introductory anthropology along with a non-major introductory biology course could lead to more comparable populations. In addition, gathering pre-test data on the populations could also help with this limitation. In terms of the latter limitation, the interaction between context and motivation/interest was beyond the scope of this study, but raises important questions that could be addressed in future work.
Although we were able to determine that there are differences between populations of biology and anthropology students, we are unable to tease apart the program these students are situated within and the instructional variation the students are experiencing. In other words, is it the nature of the content (evolution via biology vs. evolution via anthropology) or characteristics of the instructors in these programs? Accordingly, an alternative explanation for the differences in measures of knowledge and reasoning seen between the populations is the anthropology students’ lack of familiarity with the assessment format. The biology program involved in this research is strongly rooted in biology education research, conducts its own research studies, and incorporates evidence-based teaching practices. Thus, the ACORNS item format used in this study, while novel to the anthropology students, is not novel to the biology students. While it is possible that this discrepancy in assessment format familiarity could have impacted the anthropology students’ performance (Norman et al. 1996; Opfer et al. 2012; Schmiemann et al. 2017), it seems unlikely considering there was no difference in KC measures between populations. However, the instruction itself could be impacting the results if the non-normative ideas documented in research on novices are being addressed through targeted instruction. These ambiguities could be addressed with future research including larger samples of students across programs with diverse involvement in biology education research.