Design-based learning for a sustainable future: student outcomes resulting from a biomimicry curriculum in an evolution course

Background: National institutes and education initiatives emphasize the need to prepare future biologists to apply discoveries in science towards solving problems that are both social and scientific in nature. Research from socioscientific, design-based, and problem-based learning demonstrates that contextualized, real-world tasks can improve students’ ability to apply their scientific knowledge in practical ways to navigate social problems. Biomimicry Design is an interdisciplinary field requiring biology and design skills; it informs the creation of sustainable designs through emulation of biological structures and functions that arise as a result of natural selection. Notably, engaging in biomimicry design targets an important biology and engineering learning outcome: understanding of how structure influences function. This study leveraged the practices of biomimicry along with those of design-based learning (DBL) to improve student outcomes in an evolutionary biology undergraduate course. Through DBL, the authors aimed to (1) ignite deeper understanding of how structure determines function in nature (a cross-disciplinary concept) and (2) help students to consider new ways this concept can benefit society (a science process skill). Results: We randomly assigned two sections of an upper-division evolutionary biology course to either a biomimicry DBL (DBL group) or species comparison (comparison group) curricular design. Students in the course were exposed to a 1-day lesson, then a 1-weeklong case study, and then a final project focused on either biomimicry species-to-human design comparisons (DBL condition) or species-to-species comparisons (comparison condition). To assess the targeted outcomes, we analyzed students’ responses from a pre-post assessment. Students in the biomimicry section were more likely to apply their biological structure–function knowledge to societal benefits when leaving the course.
Students in both sections showed comparable gains in structure–function understanding, but there was no change in the number of students who used misconception language in their post-course compared to pre-course


Background
The need for applied biology education
As undergraduate educators, we are called upon to prepare future scientists to apply discipline-specific knowledge to current interdisciplinary problems (AACU 2016;Brewer and Smith 2011;NSF 2017). More specifically, biology students in the 21st century must go beyond knowledge attainment and cultivate the ability to apply biological discoveries toward the world's top crises of climate regulation, food and water security, energy and natural resource use, biological diversity, and health; socio-scientific issues that require both scientific and social knowledge (AACU 2016;Brewer and Smith 2011;Eastwood et al. 2013;NRC 2009;NSF 2017). Justifications for such requests are informed by the collective understanding that environmental problems could be significantly mitigated if technological and scientific discoveries were applied toward environmental challenges (Team CW 2007;USDE 2006;WRI 2011). However, curricula that address socio-scientific problems are predominantly content-centric and often focus exclusively on what and how environmental problems occur (Aikenhead 2006;DeHaan 2005;Hofstein et al. 2015). Such approaches are devoid of student-driven solutions and decisions (Aikenhead 2006;DeHaan 2005;Hofstein et al. 2015). Content knowledge alone is not sufficient to maintain interest and motivation to learn, nor to bridge the current gap that exists between pro-environmental intentions and behaviors. Instead, acquisition of application-related knowledge and skills (knowing what mitigating actions to take and how to take them) and their link with beliefs that such actions will be effective are strong predictors of individuals' motivation to act upon, learn about and work to solve environmental issues (Frick et al. 2004;Hewitt et al. 2019;Moser and Dilling 2011;Ojala 2015;Truelove and Parks 2012).
It follows that we, as undergraduate biology educators, should strive to support not only students' knowledge of socio-scientific issues, but also their ability to address these issues through application of their scientific knowledge and skills. This has been achieved in several instances (Eastwood et al. 2013;Sadler and Dawson 2012;Udovic et al. 2002). For example, a three-year longitudinal study comparing application- and science-in-society-focused classes relative to content-focused classes found that students in the former had improved decision-making skills (e.g., they were able to apply fundamental concepts to new problems) and higher average conceptual knowledge than students in content-focused courses (Udovic et al. 2002). Similarly, in several meta-analyses of problem-based learning (PBL) in higher education, where learning is structured around complex problems situated beyond the school context, researchers found PBL to be better than traditional instruction for preparing skilled practitioners to apply concepts in a wide array of fields. This included more advanced practitioner skills in clinical, economic and public health settings (Dochy et al. 2003;Gijbels et al. 2005;Strobel and Van Barneveld 2009;Woods 1985). Advocates of PBL assert that higher education's purpose is to guide students away from knowing facts and concepts devoid of context toward developing more expert-like frameworks that help students to apply knowledge in flexible ways toward solving novel problems (Atman et al. 2007;Sadler et al. 2007).

Applied learning's influence on content knowledge
Despite the positive effects of PBL and other applied learning techniques on skill application, there is mixed evidence as to whether applied learning and incorporation of socio-scientific issues through discussion or problem solving actually improves students' content knowledge. In some instances, there is strong evidence that situating important science content within a broader socio-scientific contextual framework is what gives that content meaning and creates learning pathways (Sadler and Dawson 2012). In these cases, students achieve higher content gains than students not learning through a socio-scientific lens (e.g. Benware and Deci 1984;Hewitt et al. 2019;Venville and Dawson 2010). However, other research indicates that students in classes incorporating applied learning and/or socio-scientific issues have either equal or negative content gains relative to students who learn without incorporation of these issues (Dochy et al. 2003;Eastwood et al. 2013;Sadler and Dawson 2012). Although results are mixed, there is strong evidence from cognitive psychology and learning theories that content knowledge that is of primary importance to the discipline is strongly associated with ability to form higher order frameworks necessary for application (Bransford et al. 1989;Segers et al. 1999;Spiro 1988).
Laverty et al. 2016). This encompasses the idea that the basic units of structure, operating at the molecular level through the landscape level, dictate the function of biological molecules, cells, tissues, organisms, and ecosystems (Brewer and Smith 2011;Laverty et al. 2016). This S-F concept is particularly integral in the field of evolution since genetic and phenotypic structures are propagated in a population through time based on optimized functions with the greatest reproductive success (Brewer and Smith 2011). Beyond its importance inside the classroom, understanding the S-F concept is also pivotal for many biology careers.
For example, the S-F concept is applied to disciplines at the interface of biology and engineering, such as genomics, bioinformatics, and conservation. In addition, knowledge of biological S-F relationships has informed engineering design approaches and quantitative modeling (Brewer and Smith 2011). This can be seen in the analysis of the functional forces of the keratin hairs on a gecko's toe pad that has informed the creation of self-cleaning, re-attachable adhesive microfabrics (Geim et al. 2003). The integrated fields of life sciences and engineering are paving the way for a wide array of practical applications including innovations in medicine, health, alternative energy, and the behavioral and social sciences (Brewer and Smith 2011).

Challenges associated with learning natural selection and S-F
While the S-F concept is widely applicable for diverse careers, undergraduate and high school students typically have a hard time understanding complex evolutionary concepts that hinge on the idea that structure determines function in nature and not vice versa. Of particular difficulty is understanding that evolutionary mechanisms result in adaptive structures with specific functions and that these processes are not purposeful (Coley and Tanner 2015;Moore et al. 2002;Nehm and Reilly 2007). Moore et al. (2002) found that when students describe how structures in a population change through time, those with higher conceptual thinking described how structures in a population or species are acted upon by evolutionary processes, whereas those with lower understanding saw species as acting upon their own structures to survive or adapt. This latter thinking and language is considered teleological, where one uses purposeful thinking to explain an outcome of natural selection or a structure resulting from natural selection rather than using random and non-purposeful explanations (Coley and Tanner 2015;Keil 2006;Talanquer 2007). Teleological thinking can include any attribution of purpose to the outcome, whether or not the organism is perceived as "intending" the adaptation. Importantly, this type of teleological thinking is "design-based teleology", which implies that something exists because it was designed to fill a role, as opposed to "selection-based teleology", which states that a trait exists for a role because it is being selected to fill it (Kampourakis 2015). Design-based teleology is inconsistent with evolutionary theory whereas selection-based teleology is not. From this point forward, when we refer to teleology in this paper we are referring to design-based teleology.
Teleological statements are generally considered evidence of an evolutionary "misconception", or a view that is deemed incorrect because it does not align with canonical thinking or broadly accepted understanding of a phenomenon. The word "misconception" and its definition are somewhat problematic in this context since there are different theoretical stances used to understand student cognition. For example, using the word "misconceptions" when referencing students' representations of evolutionary phenomena implies that students hold relatively stable knowledge structures about evolution across contexts and that expressions of inaccurate or incorrect ideas reflect deeply held misunderstandings of those ideas (Gouvea and Simon 2018). However, this may not always be the case (e.g., Gouvea and Simon 2018;Ojalehto et al. 2013). In this work, we use the word "misconception" as shorthand to identify language students use that indicates a possible inaccurate or incomplete understanding of a phenomenon. We therefore distinguish this "misconception" language from more deeply held misunderstandings.
Teleological language can often persist even after a student has participated in evolution courses, whether in active or traditional classroom environments (Abraham et al. 2009;Bishop and Anderson 1990;Nehm and Reilly 2007). For instance, 70% of biology majors in an active learning environment with small group and paired discussions overall expressed from 1 to 6 different natural selection misconceptions on post-course responses to open-ended questions. This included the misconception that organismal "needs" cause evolutionary changes to take place (Nehm and Reilly 2007). In another study, teleological reasoning was found to predict undergraduate students' ability to learn natural selection above and beyond acceptance of the concept itself (Barnes et al. 2017). This body of evidence suggests that traditional or even active group discussion teaching practices and students' acceptance of natural selection are insufficient alone to change misconception language use around evolution, including teleological expressions, and new pedagogical practices should be explored.
Fried et al. Evo Edu Outreach (2020) 13:22

Design-based learning (DBL) practices
New pedagogical practices should help students develop strong S-F understanding and teach students to apply this knowledge to real-world socio-scientific issues. One potential solution is to incorporate DBL practices into undergraduate biology classes. Informed by Dym et al. (2005) and Fortus et al. (2004), DBL is a more specific type of PBL that uses a beneficial process to engage students in designing real-world solutions to achieve a client's desired functions or goals. The DBL process includes scoping (defining the purpose and constraints of the project), generating (designing initial prototypes from past information and data), and evaluating ideas (testing initial prototypes) (Dym et al. 2005;Fortus et al. 2004).
The DBL process benefits students by asking them to apply skills that are parallel to science process skills toward real-world tasks (see Fig. 1) (Fortus et al. 2004;Laverty et al. 2016). The additional focus of DBL on design allows students to develop skills in divergent and convergent thinking (aspects of creativity) and systems thinking (dynamic views of interacting parts), making them stronger candidates for applied biology careers (Dym et al. 2005;Shah et al. 2012). These ideas, while more novel in undergraduate contexts, have long been present in K-12 curricular design. In 2012, the National Research Council's Next Generation Science Standards worked to embrace designing solutions as part of a more inclusive 'Science and Engineering' set of practices (NGSS 2012). The K-12 educators and administrators who worked to draft the NGSS did this by broadening the focus of K-12 science education to include application of scientific content in conjunction with the previous, singular emphasis on deeper conceptual understanding (NGSS 2013). Segments of their common core state standards align closely with the DBL process, and have influenced more recent curriculum design and assessment frameworks at the undergraduate level, such as the 3-D Learning Assessment Protocol (LAP, Laverty et al. 2016). Despite incorporation of DBL philosophies in several suggested undergraduate learning frameworks, design skills are often not explored in traditional undergraduate biology lectures and labs. Yet, incorporating these skills may greatly benefit students, especially now when there is increased demand for STEM professions but a shrinking U.S. STEM professional population (Olson and Riordan 2012).
Fig. 1 Comparison of key elements of design (left) and scientific (right) processes (Fortus et al. 2004;Laverty et al. 2016). The design process parallels aspects of the scientific process
Biomimicry design, which follows the above-described DBL process, emphasizes human emulation of organismal structures and functions to inform sustainable design solutions (Benyus 1997). When biologists and biomimicists collaborate at the design table, engineered structures are re-imagined using biological structures as inspiration, instead of traditional design methods informed by previous human-created designs. Because biological structures are subject to natural selection, which often results in improved function while minimizing resource cost, biomimetic designs tend to be more efficient and sustainable than design alternatives. Thus, the practices of biomimicry have the power to improve the sustainability of our built environment, from the energy efficiency of wind turbines inspired by a humpback whale's fin, to carbon capture processes that create concrete material informed by coral reefs (e.g. Fish et al. 2011). Biomimicry seeks to achieve designs that undergo structural reform to not only lower our use of natural resources and mitigate pollutants and waste, but to integrate our built environment with that of natural systems. In this way, it serves our society by addressing real socio-scientific issues. Biomimicry is inherently project- and design-based and is grounded in ideas of biological structure and function (see Fig. 2). Therefore, it can be leveraged as a learning tool to support students' deeper understanding of abstract evolution concepts while allowing students to apply this knowledge toward real societal benefits. Together, these components address students' knowledge, skills, and motivation to engage. While a number of academic articles provide classroom biomimicry activities with anecdotal reports of outcomes (e.g., Gardner 2012;MacDonald 2013;Soja 2014;Schroeter 2010;Topaz 2016), we are unaware of any existing studies that systematically investigate differences in biology student outcomes utilizing a biomimicry DBL framework, as we do here.

Purpose of the study
We tested Biomimicry DBL as a potential mechanism to support both students' S-F content learning and their ability to apply biological S-F knowledge to socio-scientific issues. To assess this, we assigned one section of an evolution course to a DBL condition, in which students engaged in design-based biomimicry projects using evolved structures to inform human sustainable design solutions, and another section to a comparison condition, in which students engaged in a more traditional comparison of homologous and evolved structures of different organisms. In both sets of curricula, we emphasized learning goals that aligned with evolutionary concepts and science process skills to help students understand that structure determines function in nature. Our approach was informed by the three-dimensional learning assessment protocol (3-D LAP) (Laverty et al. 2016). 3-D LAP is a framework that assists instructors in forming alignment across learning goals, activities, and assessments along three dimensions of learning defined by the Next Generation Science Standards (NGSS; disciplinary core ideas, science process skills [synonymous with practices], and crosscutting concepts; Laverty et al. 2016;NGSS 2013). This framework is strongly informed by efforts in K-12 to understand how people learn and how they put their knowledge to use, most specifically the philosophies and practices used to construct the Next Generation Science Standards (NGSS 2013). The 3-D LAP extends the NGSS by detailing how to build robust assessments across the three dimensions of learning and applying the framework to undergraduate education. Central to our study and the curricula was the cross-cutting concept that structure determines function at all levels of life (Laverty et al. 2016). This concept was emphasized in every DBL and comparison lesson and case study we designed.
However, we emphasized applying discoveries in science to society as a central science process skill in the DBL condition to support the applied nature of students' biomimicry work; this was not emphasized in the comparison condition. While applying discoveries in science to society is not a stand-alone science process skill as defined by Laverty et al. (2016), helping students to develop the ability to ask scientific questions as they relate to real-world events is defined as a science process skill. We emphasized this specific science process skill in our DBL curricula and final project and went a step beyond posing questions by asking students to apply their conceptual understanding of natural selection and structure-function to real-world design solutions. While the current definition of science process skill does not include this application criterion, we view application as a vital part of connecting science to society. These standards also align with NGSS's science and engineering practices (i.e. designing solutions) (NGSS 2013), and Brewer and Smith's (2011) core science competencies.
Our research questions (RQs) were:
RQ1: To what extent is biomimicry DBL related to students' likelihood to apply S-F concepts to benefit society?
RQ2: To what extent are biomimicry DBL and application of S-F knowledge to benefit society related to students' S-F understanding?
RQ3: To what extent is biomimicry DBL related to a student's use of misconception language?
(Dym et al. 2005;Fortus et al. 2004) and biomimicry design (Benyus 1997)
Specifically, we hypothesized that students in the biomimicry DBL condition, having applied their understanding of natural selection and S-F toward relevant human-design challenges in class, would have a higher quality description of the relationship and evolutionary mechanisms influencing biological structures and functions. We predicted that they (H1) would be more likely to apply their S-F understanding to navigate socio-scientific issues (Science Process Skill) and (H2) would provide a more detailed understanding of the relationship and evolutionary mechanisms influencing biological structures and functions (Evolution Big Idea and Cross Cutting Concept), but potentially (H3) would have more misconceptions overall based on the integration of human and natural design-based processes that could activate teleological thinking. Ultimately, we were curious if this specific type of DBL teaching could simultaneously contextualize biology using a real-world lens for sustainable solutions while helping students develop S-F understanding.

Research framework
In this curricular study, we tested a specific type of DBL as a potential mechanism to support both students' S-F content learning and their ability to apply biological S-F knowledge to socio-scientific issues. To best meet those purposes, we randomly assigned one section of an evolution course to a DBL condition, where students engaged in a 1-day introductory biomimicry lesson, a 1-weeklong biomimicry case study later in the semester, and a design-based final project in which they used evolved structures to inform human sustainable design solutions (57 students taught MWF 2:00-3:00 pm). These curriculum components were dispersed throughout the semester (Fig. 3). In the comparison condition, students engaged in a more traditional 1-day lesson on homologous comparisons, a 1-week long evolutionary biology case study, and final project involving comparison of homologous and evolved structures of different organisms (67 students taught MWF 1:00-2:00 pm). These comparison condition components corresponded to the timing of the DBL components during the semester (Fig. 3).
In both sets of curricula and final projects, we emphasized learning goals and objectives that aligned with evolutionary concepts and science process skills to help students understand that structure determines function in nature (Table 1). While both sections were built upon similar learning goals and 3-D LAP foundations, activities and content converged and diverged in the two sections in accordance with the study interventions (Fig. 3). Lectures and homework assignments during the 1-day lessons, weeklong case studies and the final project were informed by slightly different knowledge and skill learning objectives corresponding to DBL or comparison condition foci (Table 1).

Course context
Curricular interventions were implemented in a 16-weeklong active learning Evolutionary Biology course. This lecture and lab course is offered in the fall and spring terms at a large, public, R1 research university. Students are typically in their second or third year, and have previously taken two semesters of introductory biology. This class emphasizes evolutionary concepts that explain the diversification of life on Earth, including evolution, adaptation by natural selection, speciation, and macroevolutionary patterns and processes. The two fall 2018 sections in our experimental investigation met three times a week for 50 min in the lecture portion and 3 h once per week for the lab. This course is active and highly structured (e.g., Eddy and Hogan 2014). Students engage with concepts in homework assignments to be addressed in the next class and a majority of in-class time is spent on individual student reflections, clicker questions, and/or small group work (3 or fewer people per group). Both sections of the course culminate in a final group project that is assessed for summative gains in outcomes. These active assignments support inclusive pedagogy designed to promote peer to peer teaching and strengthen students' ability to modify ideas based on feedback (Tanner 2013).

Curricula
The 1-day lessons and 1-weeklong case studies in both DBL and comparison curricula courses addressed the cross-cutting concept that structure determines function in nature (Additional file 1: Appendix A). In week four of both 2018 sections, we initiated the 1-day lessons. Both sections explored the tubercles (bumps) along a humpback whale's fins to emphasize the evolutionary idea that adaptation by natural selection accounts for the appearance of non-deterministic "design" in nature and is subject to constraints (Table 1, Fig. 3). In groups, students completed in-class worksheets to apply their learning to tasks such as discovering the properties of the humpback whale's fin and how it interacts with fluid dynamics. During this lecture, the DBL condition students also generated at least 3 human-created designs where the whale fin could be applied to improve the design and learned about a real-world design (the wind turbine) that used the whale fin for inspiration to improve efficiency. The comparison condition students instead engaged in mathematical thinking to compare and contrast the tradeoffs of three whale species with different shaped fins. In week nine of the course, we employed the weeklong case studies. Both sections revisited S-F relationships by investigating the properties of honeycombs throughout the week (Fig. 3). We emphasized the evolutionary ideas that (1) there are multiple dimensions of S-F properties contributing to fitness (e.g. space optimization, resource use), and (2) phenotypic structural variation is influenced by genes and the environment and is necessary for evolution (Table 1). Students worked individually and collaborated in groups on a variety of in-class and out-of-class activities, such as interpreting data of genetically diverse honeybee populations to make an argument about how this might influence honeybee fitness.
During this case study, the DBL condition students researched and compared current human designs with the same functions as the honeycomb, and redesigned a tiny home by researching and applying different organism S-F properties to various sustainable functions of the tiny home. The comparison condition students instead continued to discover various organism structures that have hexagonal shapes, and then compared and contrasted the variety of structures and densities of toe pads of various reptiles and arthropods. In the final two weeks of the semester, students created an open-access webpage that described how the three dimensions (3-D) of learning integrate into evolution (Laverty et al. 2016). Comparison condition students conducted an evolutionary analysis on a topic of their choosing. DBL condition students engaged in a real, local biomimicry redesign challenge (e.g., designing a walking path for water repulsion) and educated local designers on the principles of evolution. Across both conditions, we aimed to keep aspects of the course and lessons that were not directly related to the treatment as similar as possible. Therefore, during the lesson and case study, students in the two conditions would often engage in the same content (e.g., an exploration of whale fins) prior to diverging to engage in either design or comparison activities. For a more in-depth comparison of these curricular tasks beyond our DBL condition alterations, we have included a table which describes the learning outcomes and central activities for each lesson and highlights similarities and differences across conditions (Additional file 1: Appendix A).

Table 1 Knowledge and skill objectives emphasized throughout the tested curricula
Students will be able to…
Both sections
- Explain that the functions and properties of organisms are determined by their structures
- Explain that adaptation through natural selection accounts for the appearance of "design" in nature and is subject to constraints
- Describe how multiple aspects of a structure relate to organismal survival through various functions
- Explain how phenotypic variation is influenced by genes and environment and that this is necessary for evolution
- Develop multiple testable hypotheses and predictions based on evidence
- Use quantitative thinking to adopt reasonable objective criteria for choosing among rival claims
- Construct explanations and arguments based on interpretations of data
- Read and investigate the primary literature for evidence-based information
DBL section
- Compare species' S-F properties and evolutionary mechanisms that shape natural S-F relationships to human designs and the human design process
- Apply species' S-F relationships toward solving problems in human sustainable design
Comparison section
- Compare homologous S-F properties of species to inform the dependence of function on structure
- Apply comparative analysis techniques to infer functional properties of species' structures

Study participants
All students enrolled in the evolution course were invited to participate in this research. Those who chose to participate agreed to release their pre- and post-assessment responses, S-F class assignments, exams and final course grades, and academic and demographic information. Of the 124 students in both sections of the course, 107 agreed to participate (86.3% participation); four of these students were omitted because they dropped the course, had conflicts of interest, or received a failing final course grade (< 60%). This resulted in a total of 103 participants (comparison: N = 53 students; DBL: N = 50 students). In addition, only data for participants who filled out both the pre- and post-assessments (91 students) were included in pre-post comparison analyses (see Table 2).
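The participant counts above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid, not part of the study's analysis; all variable names are ours, and the share of participants with both assessments is derived here rather than reported in the text.

```python
# Counts as reported in the "Study participants" paragraph.
enrolled = 124               # students across both sections
consented = 107              # students who agreed to participate
omitted = 4                  # dropped, conflict of interest, or failing grade
n_comparison = 53            # participants in the comparison section
n_dbl = 50                   # participants in the DBL section
with_both_assessments = 91   # completed both pre- and post-assessment

# Participation rate among enrolled students (reported as 86.3%).
participation_rate = consented / enrolled

# Participants remaining after omissions (reported as 103).
analytic_sample = consented - omitted

# The two section counts should add up to the analytic sample.
assert analytic_sample == n_comparison + n_dbl

# Derived value (not reported in the text): share of participants
# whose data could enter the pre-post comparisons.
pre_post_share = with_both_assessments / analytic_sample

print(f"Participation rate: {participation_rate:.1%}")
print(f"Analytic sample: {analytic_sample}")
print(f"Share with both assessments: {pre_post_share:.1%}")
```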

Data collection
Evaluation of targeted learning outcomes from these particular lessons, case studies, and final projects was based on two open-ended questions from the pre- and post-course assessment to distinguish pre- vs. post-course and DBL vs. comparison differences (Table 3). Responses to these questions serve as the main response variables of interest in this study. These assessments were given to students in the second (pre-assessment) and last weeks (post-assessment) of the semester via Qualtrics. Students received either 1 extra-credit point (pre-assessment) or 1 participation point for thoughtful completion (post-assessment). The assessment questions measured a student's ability to describe the ways in which S-F information can benefit society (Assessment Q1, Science Process Skills), and describe the relationship between structure and function for a particular biological phenotype (Assessment Q2, Evolution Big Idea). These questions aligned with larger class learning goals and our specific RQs.
Assessment questions were intentionally open-ended and nonspecific to support the diverging ways students could choose to answer. Based on pilot testing of the questions with graduate students prior to use, we were confident that student responses would address the specific learning objectives and skills emphasized during the 1-day lessons and case studies (see Table 3). More information about pilot testing and the process of aligning our assessment questions and the specific learning objectives and skills can be found in Additional file 1: Appendix B. Assessment Q1 asked students to report ideas relating to how S-F knowledge could benefit society, with students generating ideas that were applied/behavioral, cognitive, and affective. Thus, students' responses to Assessment Q1 allowed us to measure the likelihood that a student would specifically apply S-F concepts to benefit society by describing how their knowledge could be used to address specific societal problems.
[Table fragment]
(9) 28.3% (15)
1-3 63 58% (29) 64.2% (34)
4 or more 16 24% (12) 7.5% (4)

Qualitative methods
For this study, we developed a qualitative codebook and corresponding rubrics which we used to convert students' qualitative responses to Assessment Q1 and Q2 into quantitative values useful for statistical analyses and student response scoring. The first author (coder 1-EF) and two undergraduate student coders (coders 2 and 3-AE, AT) used open coding to break down qualitative text into discrete parts (units of meaning) in order to capture discrete thoughts of a participant and assign each thought a code describing its meaning (Saldaña 2015). Two coding groups were created, both consisting of the first author and one undergraduate student coder. In these two groups, coding members read through all student responses to establish units of meaning for each response to either Assessment Q1 or Q2, and to create preliminary codebooks for the team's corresponding assessment question. Coding was then done as a whole team on a subset of the same 15 student responses to both Assessment Q1 and Q2 to further refine these preliminary codebooks. Then in the same 2 original groups, members collaborated to iteratively refine and improve team codebooks for either Assessment Q1 or Q2 by reading and coding all data for each question. Teams coalesced on the final codebooks for each question. Using the final codebooks, each coder individually coded 80-100% of student responses to their corresponding assessment question. We compared responses for agreement across coder teams and calculated Cohen's Kappa to establish our IRR score for each codebook. After the final codebooks were established, team members came to consensus on all remaining discrepancies to achieve 100% agreement for all student responses to their corresponding assessment question. For Assessment Q1, we used an inductive exploratory coding process to generate detailed codes describing student responses. 
We then aggregated the more detailed initial codes into larger, clearer main codes that addressed our hypotheses (Saldaña 2015). In this process, codes are generated and defined during data analysis based on patterns observed in the data; thus, they are a posteriori rather than a priori. For Assessment Q2, we identified codes we could organize into a hierarchical rubric to rank (i.e., grade) student responses based on the depth and detail of the evolutionary mechanisms students included (quality), whether students addressed all elements of the question (completeness), and how accurately they understood these concepts, including whether they used any misconception language (absence of misconceptions). An inductive exploratory approach was used to generate quality and completeness codes, and a priori codes were used to determine absence of misconceptions within responses based on past research on evolutionary misconceptions (Coley and Tanner 2015; Short and Hawley 2012). For instructors who may be interested, we also developed two grading rubrics based on our qualitative coding of the student data and informed by the rubric development discussed in Allen and Tanner (2006). Both our codebooks and the rubrics for both assessment questions can be found in Additional file 1: Appendices C and D. The rubrics can be used to evaluate responses to these questions for future classes.
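The agreement statistic mentioned above (Cohen's Kappa) corrects raw percent agreement for the agreement two coders would reach by chance. The following is a minimal sketch of that calculation, assuming each coder assigns exactly one code per unit of meaning; the labels and the `cohens_kappa` helper are hypothetical illustrations, not the study's data or scripts.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each labeled the same units once."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units where both coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal rate, summed over codes.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code for every unit
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six units of meaning (not taken from the study).
coder_1 = ["application", "cognitive", "application", "affective", "application", "cognitive"]
coder_2 = ["application", "cognitive", "cognitive", "affective", "application", "cognitive"]
kappa = cohens_kappa(coder_1, coder_2)
percent_agreement = sum(a == b for a, b in zip(coder_1, coder_2)) / len(coder_1)
```

In this toy example the coders agree on five of six units, yet kappa is lower than the raw agreement because some of that agreement is expected by chance. The same pattern appears in the agreement and Kappa values reported for each codebook.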

Table 3 (fragment) Assessment Q2 and the learning objectives it targets
Q2. Explain how a biologist would describe the dependence of organismal function on its structure. Use an illustrative example in your answer. (A biologist in this context is someone who seeks to understand living systems.)
Learning objectives: (1) Explain that the functions and properties of organisms are determined by their structures. (2) Explain that adaptation through natural selection accounts for the appearance of "design" in nature and is subject to constraints. (3) Describe how multiple aspects of a structure relate to organismal survival through various functions. (4) Explain how phenotypic variation is influenced by genes and environment and that this is necessary for evolution.

Quantitative methods
To test our three main RQs, we used R (R Core Team 2017) with the lme4 (Bates et al. 2015) and ordinal (Christensen 2019) packages to perform binary and ordinal logistic regressions exploring the relationship between the main predictor variables and either binary or ordered integer response variables. We used ggplot2 (Wickham 2016) and dplyr (Wickham et al. 2015) for data manipulation and plotting. Specifically, we ran a binary logistic regression to investigate whether time (pre-assessment to post-assessment) and class condition (DBL or comparison) influenced the likelihood that students would describe an application of S-F knowledge to a societal benefit in Assessment Q1 (addressing RQ1). We also used binary logistic regression to investigate whether time and class condition influenced the presence or absence of evolution misconception language in students' responses to their post-assessment questions (addressing RQ3). All binary logistic regression assumptions were verified and met by plotting each predictor variable against the log-odds of the response variable to test for linearity, and by testing for overdispersion of residuals by running chi-square tests on our residuals against an over-dispersed model (Introduction to SAS 2016). 
Translation of binary logistic regression results into the percent chance of reporting a result was accomplished using the effects package in R, as described by Theobald et al. (2019). We used an ordinal logistic regression to examine the differences in students' S-F understanding across class conditions and time (addressing RQ2). All ordinal logistic regression assumptions were verified and met by checking for multicollinearity between predictor variables and checking the proportional odds of independent variables (Brant 1990; Introduction to SAS 2016; Laerd Statistics 2020).
We also conducted multiple chi-square tests and an extension of the Fisher exact test (Fisher-Freeman-Halton test) to supplement analysis of our predictions. We used these chi-square tests to test for changes in the frequency of different misconception language used across class condition and time (addressing RQ3). To meet the assumption of all chi-square tests, we combined appropriate categories together when the expected frequencies of those categories fell below assumed values (no expected category frequency can be less than 1 and no more than 20% of expected categories can have a frequency less than 5) (Whitlock and Schluter 2014). Specific mathematical models used for analysis of each question are described in the results section.
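The expected-frequency rule quoted above (no expected count below 1, and no more than 20% of expected counts below 5) can be checked mechanically before running a chi-square test. The sketch below is one way to do so under the standard independence model; the counts and the helper names `expected_counts` and `meets_chi_square_rule` are hypothetical and are not the study's data or code.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table must contain at least one observation")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_chi_square_rule(table):
    """True if no expected count is < 1 and at most 20% of them are < 5."""
    cells = [e for row in expected_counts(table) for e in row]
    any_below_one = any(e < 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return (not any_below_one) and share_below_five <= 0.20


# Hypothetical counts: rows are section conditions, columns are
# misconception categories (none, teleological, other).
sparse = [[30, 12, 3],
          [35, 8, 1]]
# Combining the two sparse categories, as described in the text above.
merged = [[row[0], row[1] + row[2]] for row in sparse]

sparse_ok = meets_chi_square_rule(sparse)
merged_ok = meets_chi_square_rule(merged)
```

Here the sparse table fails the rule because two of its six expected counts fall below 5, while the merged table passes. This mirrors the category-combining step described above.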
Notably across all of these statistical analyses, student level data failed to meet the assumption of replicate independence since students were not randomly assigned to experimental conditions nor were experimental conditions assigned to multiple classes (i.e., the experimental design was quasi-experimental, Gribbons and Herman 1996). We took several measures to account for the failure of these assumptions. Specifically, we accounted for nonindependence of students within a section by including the students' most frequent self-selected in-class groupings (e.g., the groups they worked with to complete worksheets) as a variable of influence on student responses to our assessment questions (Hedges 2007). In all analyses, this variable consistently accounted for very little (near 0) variance in student responses. This provides some evidence that groupings of students, either within or across classes, did not strongly influence our results. These and other measures are described in our additional file under "Accounting for Failed Assumptions" (Additional file 1: Appendix E). Importantly, these measures allowed us to check our data for potential biases due to the failed assumptions. We determined that these biases were minimal and conclude that the analyses presented below are appropriate and rigorous given the stated limitations.

Student responses to assessment Q1
Our coding team identified four main codes and three sub-codes of benefits students mentioned when answering Assessment Q1 (94% agreement; Kappa Coefficient = 0.85). Of the four main codes, a small handful of students mentioned No Benefit (6.7% of all pre-post responses) through incoherent or incomplete statements or explanations of the S-F relationship but no explicit mention of a benefit gained by society. Cognitive benefits (30.6% of all pre-post responses) included statements or examples about changes to the way humans think, either about nature and natural processes or changes to how we predict future trends, problem solve, or become more metacognitive.

"[W]ithout the understanding of form and function in nature we would not be able to understand evolutionary responses that we, as well as other organisms, have to the environment. We would not be able to understand the why and how nature is the way it is today. We use form and function to understand how viruses affect our cells by latching onto the outside and inserting their DNA into our own cells (Cognitive)."
Affective benefits (7.3%) were those statements or examples that included changes to the way humans feel or gain appreciation for nature informed by a greater S-F understanding. These statements were often paired with other codes, as seen in the example below.

"Society can benefit from understanding form and function in nature because it gives us a greater understanding of how our environment works (Cognitive), and how interconnected we are to the environment…This will hopefully lead to a greater appreciation of the other species we inhabit our Earth with, and then to a mindset that is more focused on the conservation of said species (Affective)."
Lastly, application/behavioral benefits (79.8% of all pre-post responses) were statements or examples that included changes to the way humans act, create, or behave as a result of this knowledge. Statements of this nature were most frequently about positively altering the world around us and often fell into (1) altering or creating human built designs that mimic nature for more sustainable designs (biomimicry design), (2) altering or creating human built designs that use natural forms (bio-utilization) or are informed by our better understanding of S-F more broadly (non-biomimicry design), (3) altering the way humans conserve resources, protect nature, or better the human-nature relationship, and (4) altering the way we practice health and medicine.
(1) "If designers found ways to mimic for example how efficient certain slime molds and termite colonies are at resource management and efficiency, humans could greatly reduce costs and benefit from more efficient roadways and public transit systems" (Biomimicry Design).
(2) "Natural forms can be augmented to fit human purposes that require similar designs, and natural forms have already been tested to serve this function by evolutionary pressures." (Bio-utilization).
(3) "Society can benefit from understanding form and function because if, for instance, we learn more about the forms that make up an endangered species, it will give us… a better path to conservation." (Nature Conservation).
(4) "Society can benefit from understanding form and function in nature because it can lead to great medical discoveries and possible cures to diseases." (Health and Medicine).
In both classes, creation of efficient technologies (biomimicry or not), conservation/efficient use of natural resources, and mitigating humans' impact on nature were three very common sub-code responses. In comparison, health and medical benefits were the least frequently mentioned application/behavioral sub-code, used by only 5 students (4.85% of all students) in either pre-post or DBL-comparison responses.

Student responses to assessment Q2
Our coding team identified eight main codes and fourteen sub-codes used by students in responses to Assessment Q2 (88.12% agreement; Kappa Coefficient = 0.87).
The eight main codes were: 'S-F Claim', 'Factors Influencing S-F', 'Incomplete or Non-Coherence', 'Structures or Functions Influencing Something', 'Constraints and Limitations', 'How to Study S-F', 'Example', and 'Misconception Statement'. Sub-codes fell within each of these categories. Full descriptions and examples of the main and sub-codes can be found in Additional file 1: Appendix C.
These codes were used to assign scores from 0-5 to students' responses to Assessment Q2. While the codes themselves did not imply a ranking system, each code and its combination with other codes determined the quality, completeness, and absence of misconceptions that correspond to the 0-5 scores. The quality of each response concerned the depth and detail of students' responses. The completeness of a response was determined based on the inclusion of an explanation of the S-F relationship and an example. The absence of misconceptions was determined based on the presence of any misconceptions. All codes used on the assessment fell into one of these three categories, allowing the codes to inform the 0-5 scoring system. Rubric 2 (Additional file 1: Appendix D) provides direction on which codes corresponded to the categories above and how the codes were used to generate student scores. Below we provide examples of student responses for each 0-5 score.
Students' responses that received a score of 4 or 5 had complete answers with no misconceptions. Those who received a score of 4 had responses that included a correct explanation of S-F relationships and an example, but did not describe in more detail other processes that influence or are influenced by the S-F relationship. Those who received a score of 5 included a correct explanation of S-F relationships and an example along with one or more of the following codes: a factor influencing S-F relationships, constraints or limitations of structure or function, or something else influenced by the S-F relationship.

Level 5 Response: "A biologist would explain that any organism's functions are limited by the structures that it has (S-F Relationship). An example of this would be the structure of the human small intestine. The small intestine has particular structures that maximize its surface area and allow for the function of this organ: the absorption of nutrients (Example). When this structure is compromised, like in celiac disease, it is less able to perform its necessary functions (Constraints or Limitations)."
In the above example, the student describes the correct relationship between structure and function (that structure determines function), they provide an example that illustrates this, and they further elevate their answer by describing how the environment or other factors (e.g., Celiac Disease) can alter structure which then constrains function.

Level 4 Response: "A biologist would describe function as a product of structure. In other words, the structure of an organism is what influences its function (S-F Relationship). For instance, a dolphin's fins are structured in such a way that they function as paddles to propel the dolphin in water (Example)."
The example above is a correct response to the question because it includes the correct relationship between structure and function and an accurate example. However, the student does not elaborate on things that influence, constrain, or are influenced by the S-F relationship, and this is therefore not considered the highest quality level 5 response.
Those that received a score of 3 either had a partial response or had a complete response with misconceptions. If a student had a partial response it either included an explanation of the S-F relationship or an example. Those that had a complete response and a misconception used teleological, anthropocentric, or perfect design language, or used a combination of these. A description of each type of misconception language is detailed in the following section.

Level 3 Response: "A biologist may try to run experiments to figure out how changing the structure of an organism affects its function. This would then allow them to narrow down what a specific structure allows an organism to do (How to Study S-F). An example can be changing certain proteins to mess with the shape of a fly's eye. Then they could look at what differences there are in how the fly behaves and interacts using its eye (Example)."
The above response includes an adequate example, and yet, while the first part of the question loosely connects structure to function, it fails to clearly articulate the relationship between structure and function that occurs naturally in organisms (i.e., that structure determines function in organisms not vice versa). Below is an example of a level 3 response with a misconception.

Level 3 Response: "Firstly, a biologist would say that usually, form follows function. So the way that something is structured, almost always has to do with its function (Incorrect S-F Relationship). A good example of this is a starfish, its mouth is on the underside of its body, allowing for it to literally be a "bottom feeder" (Example). In other words, since starfish eat things off of the ocean floor, it would only make sense that its mouth would be on the underside of its body, or the part of its body that "lines" the floor (Misconception Language)."
Although the above response is more detailed than the first level three response presented, it was assigned a "3" because the student uses language that espouses the misconception that necessary functions drive the presence of certain structures. This is evidenced by their use of "form follows function" in the first sentence and their assertion that because the starfish eats off the floor, its mouth must be on the underside of its body.
Those that received a score of 2 had a partial response with a misconception. This meant that student responses had either an explanation of the S-F relationship or an example along with one or more misconceptions.

Level 2 Response: "there is a reason that bats have wings and we have opposable thumbs (Example). If we didn't need certain parts of our body natural selection would have gotten rid of these traits long ago (Misconception Language). Evolutionary responses to our environment shape the way we, as well as all other organisms on earth, are built."
Although this student does provide an example and loosely implies that it relates to functions by comparing two organisms (wings for flying vs. thumbs for tool use), they immediately follow it with the misconception that function (i.e., need) drives structure (i.e., influences the course of natural selection).
Those that received a score of 1 incorrectly described the S-F relationship (as function determining structure). The response may have also included misconception language.

Level 1 Response: "Function determines structure (Incorrect S-F Relationship). Organisms only have necessary structures they need in order to survive (Misconception Language)."
Those who received a score of 0 had a non-coherent response, meaning the response was not at all relevant or did not inform the coder about the student's understanding of S-F relationships in any way.

Identification of misconception language
Our coding team identified three major types of misconception language used by students in responses to both Assessment Q1 and Q2. Compared to our other codebooks, agreement for this codebook was relatively low (68.75% agreement; Kappa Coefficient = 0.46, calculated over all pre-post responses that at least one coder coded as containing a misconception). We believe that this is in part because Coder 2, whose codes were used in conjunction with Coder 1's to calculate agreement, had not yet taken an evolution course and was therefore not as familiar with the nuances in misconception statements. However, during the team meeting process to address discrepancies, complete consensus was achieved very quickly after initial discussion. These misconceptions have been described in past research on common evolutionary misconceptions used by high school and undergraduate students (Coley and Tanner 2015; Short and Hawley 2012). Teleological language (9.8% of all student pre-post responses) is causal thinking that uses intuitive understanding of the outcome of an object/event to explain the purpose or function of that object/event (Coley and Tanner 2015; Kelemen 1999). Examples of this misconception language from our data included the idea that phenotypes or structures have a purpose to allow an organism to survive, or become adapted in order to serve a specific function (1 below), or that organisms have evolved for the purpose of being fit for their environments (2). In either instance, students' language implies that there is purposefulness to evolutionary mechanisms.
(1) "A biologist would describe the dependence of organismal function on its structure by demonstrating how structure ultimately shapes its function to ensure successful functioning."
(2) "A biologist would describe the dependence of organismal function on its structure is how certain structures have modified over time to adapt to a certain function that they perform."
A second common type of misconception language used by our students was anthropocentric language (6.5% of all pre-post student responses). Taken from its larger meaning, we narrowly defined this as language that assigns human characteristics to organisms (Coley and Tanner 2015). This often included language that described organisms undertaking some behavior in order to achieve a desired outcome within their own lifetime or actions. For example, students described organisms as having needs or desires, or relying on their structures with intention.

"If an animal is slower, such as a turtle, then it needs to find other ways to defend itself since it cannot run away (fight or flight). Hence, turtles rely on their shells for protection as it is the most convenient and effective way for them to fight off predators."
In the example above, the student emphasized that the turtle itself has tasks or needs similar to humans, and they ascribe human value judgements to the turtle by stating that it "relies" on its shell because it is "convenient." This thinking is in conflict with the correct idea that the presence and form of the shell resulted in more survival and reproduction for that species of turtle and therefore was subject to natural selection. In some instances, teleological and anthropocentric language co-occurred in the same thought and both codes were assigned to a single unit of meaning.
Lastly, students used language that indicated there were evolutionary mechanisms that result in perfection (3.8% of all pre-post student responses), where species are perfectly adapted to their environments or evolution is a process that progresses toward perfection (Short and Hawley 2012). This misconception language often appeared in response to how humans can learn from naturally selected structures, as seen in the example below.
(Ex.5) "There are many structures and functions that have been around for way longer than we have and if nature has perfected it, then we should be able to learn from that. One example is how bees make their honeycomb. They found the perfect balance of strength and available space in the hexagon shape".
In this example, the student claims that structures that exist today have been perfected by nature, implying a set destination. This fails to capture the nuance that honeycomb structures, which reduce energy and resource use and increase bee fecundity, are more likely to be inherited in the next generation. It also does not acknowledge that the underlying processes that result in the honeycomb structures are random (e.g., mutation) and not deterministic.

RQ1: To what extent is biomimicry DBL related to students' likelihood to apply S-F concepts to benefit society?
For this study, we posit that students who list application/behavioral benefits on Assessment Q1 are more likely to apply their S-F knowledge to benefit society than those who do not list an application/behavioral benefit, and that the likelihood of applying S-F knowledge to benefit society will be increased for students in the DBL condition. Using this logic, we aggregated student statements into two groups: those that mentioned application(s) of their S-F knowledge to societal benefits (i.e., coded as application/behavioral on Assessment Q1 and assigned a value of 1) and those that did not mention an application (i.e., any other code used on Assessment Q1 and assigned a value of 0). Using a binary logistic regression, we investigated whether students' inclusion of an application/behavior on their post responses to Assessment Q1 (S-S post action presence) could be predicted by the presence/absence of an application/behavioral response on their pre-response to the same question (S-S pre action presence) and their participation in either the DBL or comparison condition (Section Condition) (Eq. 1). The abbreviation "S-S" stands for science-society to indicate the topic of Assessment Q1.
$$\text{S-S}_{\text{post action presence}} = \beta_0 + \beta_1\,\text{S-S}_{\text{pre action presence}} + \beta_2(\text{Section Condition}) \tag{1}$$
Based on our coding scheme and results from this first binary logistic regression (Eq. 1), we found no significant impact of students' pre-course use of application/behaviors on their post-course responses to Assessment Q1. However, there was a significant influence of course condition on post-course responses to Assessment Q1 (Table 4, Fig. 4). Students in the DBL condition had a 97.65% chance of reporting an application on their post response while students in the comparison condition had only a 75.98% chance of reporting an application on their post response (13.89:1 odds of DBL to comparison condition including an application).
Fig. 4 The percentage of students within each section condition that reported an application/behavior in their pre and post science applied to society responses (Assessment Q1)
In general, there were no significant differences in the various combinations of benefit types (e.g., Cognitive and Affect, Application/Behavioral and Cognitive, etc.) that students reported across conditions and pre-post responses when students mentioned more than one benefit to society (Pre χ²(2) = 4.992, p-value = 0.08212; Post χ²(2) = 2.5647, p-value = 0.2774).

RQ2: To what extent is biomimicry DBL and application of S-F knowledge to benefit society related to students' S-F understanding?
Using ordinal logistic regression, we examined whether students' post-course S-F understanding as measured by scoring responses to Assessment Q2 (S-F post, see rubric 2, Additional file 1: Appendix D) could be predicted by their pre-course S-F understanding (S-F pre), their participation in either the DBL or comparison condition (Section Condition), and their inclusion of an application/behavioral benefit on Assessment Q1 in the post test (S-S post action presence). We controlled for students' final course grade (Final Grade) since this varied among sections. Equation 2 expresses the model we tested.
$$\text{S-F}_{\text{post}} = \beta_0 + \beta_1\,\text{S-F}_{\text{pre}} + \beta_2(\text{Section Condition}) + \beta_3\,\text{S-S}_{\text{post action presence}} + \beta_4(\text{Final Grade}) \tag{2}$$
Analysis from this ordinal logistic regression indicated no differences between students' post-course S-F understanding for students in the comparison and DBL conditions (Table 5). However, both students' pre-course S-F understanding and whether a student included an application in their post-course Assessment Q1 response were related to students' post-course S-F understanding (p < 0.06) regardless of course condition. Transformation of the log-odds values using base e for all estimates indicates that for each one-point increase in pre-course S-F scores (e.g., moving from a 1 to a 2 on a 5-point scale), a student had a 58.23% chance of increasing their post S-F scores by one point above their pre-scores, given that all the other variables in the model are held constant (a 1.39:1 odds of improving to not improving) (see Fig. 6). Likewise, when students included an application in their post-course Assessment Q1 responses, they had a 76.12% chance of increasing their post-course S-F score by 1 point above their pre-course S-F score (Assessment Q2) (a 3.19:1 odds) (see Fig. 6). Put more directly, when students reported only cognitive, affective, or no benefits without mention of applications on their post-course Assessment Q1 response, they were also more likely to have lower post S-F scores (Assessment Q2). In comparison, when students articulated some variation of an application or behavioral benefit (i.e., changes to the way we behave, create or do things) on Assessment Q1, they were more likely to receive higher post S-F scores with full responses and accurate descriptions of evolutionary mechanisms influencing S-F relationships. 
Notably, only one student of the 22 students who received a level 5 S-F post score (the highest score achievable) reported no application or behavioral benefits on their post-assessment.
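The RQ2 percentages and odds reported above are two expressions of the same quantity: a probability p corresponds to odds of p / (1 - p), and exponentiating a log-odds estimate yields the odds. A minimal sketch of these conversions follows; the function names are illustrative assumptions, and only the two reported percentages are taken from the text.

```python
import math


def probability_to_odds(p):
    """Odds in favor of an event that has probability p."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


def odds_to_probability(odds):
    """Probability of an event given the odds in its favor."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


def log_odds_to_probability(log_odds):
    """Inverse logit: turn a log-odds (base e) estimate into a probability."""
    return 1 / (1 + math.exp(-log_odds))


# The two percentages reported for RQ2, rewritten as odds.
odds_pre_score = probability_to_odds(0.5823)
odds_application = probability_to_odds(0.7612)
```

Rounded to two decimals, these two values match the 1.39:1 and 3.19:1 odds given in the text.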

RQ3: To what extent is biomimicry DBL related to a student's use of misconception language?
From our qualitative coding scheme and two Fisher's exact tests, we discovered that the types of misconception language used on pre-course responses to either question were not different across section conditions (see Fig. 7a) (p-value = 0.5499, Fisher-Freeman-Halton test). However, there were slight differences in the types of misconception language used across section conditions for students' post-course responses (Fig. 7b) (p-value = 0.06186, Fisher-Freeman-Halton test). The comparison condition had a somewhat higher use of the 'Teleological' category on their post-course responses, whereas the DBL condition had higher use of 'No Misconception'. None of the students in the DBL condition used 'Other (non-teleological)' in their post-course responses, but 11.11% of comparison students used this category.
We also aimed to understand if students' overall use of any misconception language in their post-course responses (Post presenceMiscon) was influenced by the inclusion of misconception language on their pre-course responses (Pre presenceMiscon) and their section condition (Section Condition) (Eq. 3).
$$\text{Post}_{\text{presenceMiscon}} = \beta_0 + \beta_1\,\text{Pre}_{\text{presenceMiscon}} + \beta_2(\text{Section Condition}) \tag{3}$$
Results indicate neither the section condition nor a student's use of misconception language on their pre-course responses strongly predicted a student's use of misconception language in their post-course responses (Table 6) (Pre presenceMiscon: Z-value = 1.602, p-value > 0.05; DBL: Z-value = -1.637, p-value > 0.05). While the comparison section had a slightly higher percentage of students using misconceptions in their post-course responses compared to both their own pre-course responses and the DBL section's pre and post responses, this difference is not significant, as shown by a non-significant interaction term when we originally ran this regression across class condition and time with the interaction included. Because the interaction was not significant, we only include results from the model without the interaction below.

Discussion
Articles on the cognitive and affective influences of biomimicry education in high school and undergraduate biology classes have largely been descriptive and anecdotal. Studies that seek to understand the unique outcomes of biomimicry education, and especially how the design-based aspects of biomimicry education influence these outcomes, are absent from the literature. This makes it difficult to ascertain whether students experience deeper understanding or skill development as a result of Biomimicry DBL. What we do understand from DBL and PBL is that tasks that contextualize skills and knowledge beyond the classroom can improve students' ability to apply their content knowledge and skills in practical ways after the class is over (Dochy et al. 2003; Gijbels et al. 2005; Strobel and Van Barneveld 2009), an important goal of several national initiatives (AACU 2016; Brewer and Smith 2011; NSF 2017). However, there are mixed results as to whether such applied learning can improve, or even maintain, content knowledge learning in comparison to more traditional learning formats. Generally, slight positive or neutral gains in content knowledge are seen when PBL studies control for the moderating influence of assessment and concept alignment (Dochy et al. 2003; Gijbels et al. 2005). Extending from the results of prior work, we aimed to investigate if students learning evolution concepts within a Biomimicry DBL framework would be more likely to apply their S-F knowledge to benefit society compared to students in the comparison condition (RQ1), and if Biomimicry DBL and the ability to apply concepts to benefit society are related to increases in students' S-F understanding (RQ2). Lastly, we were curious if the Biomimicry DBL condition would decrease (or increase) students' use of misconception language on their post-assessment questions (RQ3).

Fig. 7 a Type of misconception language used on the pre-response, and b type of misconception language used on the post-response across section condition (Comparison/DBL)

DBL students were more likely to report ways in which they could directly apply their understanding of S-F relationships to benefit society (Discussion of RQ1)
Prior DBL and PBL research offers evidence that providing a socio-scientific rationale for course tasks supports gains in students' conceptual understanding and knowledge building (Hewitt et al. 2019; Venville and Dawson 2012; Udovic et al. 2002). Further, engaging in a project supports students' ability to apply their knowledge (Bransford et al. 1989; Dochy et al. 2003; Gijbels et al. 2005; Strobel and Van Barneveld 2009). In line with this research, we found that students in the DBL condition were far more likely to list ways to apply their S-F knowledge to benefit society on the post-course assessment compared to students in the comparison condition. The ability to envision and describe an application is the pivotal first step in knowledge application. Thus, we conclude that our experimental Biomimicry DBL curriculum targets an important outcome for students: their ability to connect science concepts to society and apply their knowledge to navigate societal problems (Brewer and Smith 2011). This positive outcome is likely a direct effect of learning about concrete applications of S-F knowledge in the DBL condition (Frick et al. 2004; Truelove and Parks 2012). Students' application of their knowledge to benefit society is a notable outcome since studies demonstrate that the public's content knowledge and awareness of environmental problems are only weakly linked to behavioral responses and concern (Bak 2001; Frick et al. 2004; Moser and Dilling 2011). Simply having more information or awareness does not sufficiently motivate behavioral change.
Instead, knowledge of applicable strategies, skill, self-efficacy, values and beliefs, and outcome expectations have been strongly linked to pro-environmental behaviors (de Groot and Steg 2008; Kollmus and Agyeman 2002; Moser and Dilling 2011; Truelove and Parks 2012). For example, a study of Swedish high school students found that when they discussed and worked on concrete pathways to possible futures of climate change, building knowledge, skill, and self-efficacy, students self-reported more engagement in pro-environmental behaviors in their personal lives (Ojala 2015). The curriculum in that study is similar to our own, in which students worked on concrete design challenges oriented toward sustainable solutions to societal problems. Thus, our application-oriented biomimicry curriculum may have potential not only to help students come up with ways to apply their knowledge, as seen here, but also to influence their behavior with regard to applying science knowledge for societal benefit. While we did not directly investigate behavioral change as a result of this biomimicry curriculum, this would be an interesting topic for future studies. Given the interdisciplinary nature of biomimicry practices, it might be especially interesting to investigate whether students are more likely to integrate knowledge from different fields in socio-scientific problem solving.

Students develop S-F understanding as a result of both the Biomimicry DBL curriculum and the comparison curriculum.
We originally posited that the students in the DBL condition, as compared to students in the comparison condition, would make greater gains in S-F understanding. Based on scoring of Assessment Q2 and subsequent analyses, we did not see a difference in students' S-F understanding gains between conditions. This was not completely unexpected given the ambiguous impact of action-based curriculum on concept knowledge from past research (Dochy et al. 2003; Eastwood et al. 2013; Sadler and Dawson 2012). Additionally, though the comparison section was not working on applications to socio-scientific issues, this section, and the evolution course in general, was taught in an active-learning format throughout the semester. Past research supports that active learning, above and beyond lecture-based curricula, results in content knowledge gains (Freeman et al. 2014; Prince 2004; Tessier 2007). Thus, both the comparison and DBL curricula, designed using active, 3D-LAP formats, are likely to have supported gains in students' evolutionary concept knowledge, including knowledge about how structure influences function.
We also hypothesized that students in either section who described how to apply S-F knowledge to benefit society would have higher S-F post-course scores. This hypothesis was supported by our data. When a student listed ways to apply their S-F knowledge to benefit society on their post-course assessment, they were also more likely to have higher quality and more correct S-F responses. While we cannot fully exclude the possibility that this effect was due solely to unaccounted-for differences between students reporting an application/behavior (N = 74) and those who did not (N = 14), these results are more likely to have arisen from a mechanistic relationship between concept knowledge and action application. There is strong evidence from cognitive psychology that processes like problem solving and applying knowledge to novel situations require availability of introductory content knowledge as a prerequisite (Bransford et al. 1989; Segers et al. 1999). In these instances, domain-specific knowledge, if organized in a way that assists fast access of information, can appropriately be applied to solve structured problems (Bransford et al. 1989). Thus, it could be that once the students had a firmer grasp of S-F, they had a more structured framework from which to consider and report on concrete societal benefits. In this scenario, S-F understanding may have supported students' application of this knowledge to societal problems (S-F understanding -> increased ability to apply knowledge). Alternatively, prior work has found that advanced learning of more complex ill-structured knowledge domains, such as how natural selection influences S-F relationships, requires more flexible applications of such complex content (Spiro 1988).
In this case, application of students' introductory knowledge to societal problems may have supported development of more advanced and flexible schema, supporting students' S-F understanding (increased ability to apply knowledge -> increased understanding of S-F). Though this experimental design cannot demonstrate cause or provide evidence of the direction of this relationship, this work suggests a link between S-F understanding and application of this knowledge to societal problems. More work is needed to fully understand these dynamics.

The Biomimicry DBL curricula did not result in increased use of teleological language or presentation of misconceptions.
We hypothesized that teaching evolution in a design context could inadvertently activate use of teleological language, resulting in increased use of this language in the DBL condition's post responses to both assessment questions. We reasoned that this trend might occur since our DBL curriculum compared non-purposeful evolutionary processes with purposeful human design processes, which can both result in structures with specific functions (Coley and Tanner 2015; Nehm and Reilly 2007). To mitigate these misconceptions, we intentionally discussed that evolutionary processes are not purposeful in both the DBL and comparison curricula and confronted the misconception that adaptation occurs in order to result in a specific structure or function. These mitigation efforts may have helped since, despite our concerns, the frequency of students using any misconception (teleological or otherwise) was not different between sections. This outcome is a positive one, as it indicates that use of Biomimicry DBL does not necessarily promote teleological misconceptions, so long as these misconceptions are addressed by the instructor. However, the frequency of students in both sections using misconception language did not change from pre-response to post-response, which indicates that both curricula could be improved to address students' use of misconception language. On average, ~44% of all participants used at least one commonly found misconception in their language at some point in their pre-responses or post-responses. While this is a lower frequency than what has previously been reported in undergraduate evolutionary misconception research (Coley and Tanner 2015; Nehm and Reilly 2007), it is still a large percentage of all students.
In general, such articulated misconception language is seen as a flaw within students' thinking since it provides an objective way to measure whether a student relied on intuition, instead of evidence and knowledge, to make sense of information (Coley and Tanner 2015; Gouvea and Simon 2018). However, if we consider a more flexible way to view students' thinking, then misconception language use is not necessarily a sign of a stable incorrect understanding but rather a context-dependent use of inaccurate language (Gouvea and Simon 2018). These context-dependent factors could include students' use of ambiguous terminology meant to imply 'in order to'; these placeholders can be used to describe the driving mechanism of an outcome (e.g. chairs are made so that we can sit) or can be used to emphasize the relationship, whether causal or not, between two events (e.g. species adapt so that they survive). In the latter instance, the student may be poorly articulating, or using a shortcut to explain, that adaptation is a requisite condition for survival, not that survival is the reason why species adapt or that the species themselves drive this mechanism. Gouvea and Simon (2018) tested this hypothesis and demonstrated that students in a cued condition, who saw the phrase "the need to survive causes adaptation", agreed significantly less with the misconception statement than students in the uncued condition, who saw "species adapt to the environment in order to survive". This idiosyncratic use of evolution language and the flexible way students think about these concepts make our jobs as educators difficult as we find the balance between viewing our students' thinking as flexible or constant when it comes to misconception language. Gouvea and Simon (2018) recommend that we, as instructors, provide guidance to students about the detail and type of explanations that they are expected to provide when teaching about concepts such as evolution.
This would enable students to better demonstrate their understanding of evolutionary mechanisms and allow us, as instructors, to more accurately assess and understand their thought processes based on the language they use.

Disclosures and limitations
As with any quasi-experimental study or analysis of qualitative data, there is potential for biases to arise. In qualitative research, it is common to include a positionality statement to elucidate the biases that may be present and discuss how they might affect the reported results based on best practices. As such, the lead researcher identifies as a white, middle-class American female who was born and raised in a suburban-urban environment. As is clear from the purpose of this investigation, she strongly believes that humans are degrading our environment and climate at rates faster than can currently be recovered and she values any steps, from the personal to global, that can be taken to mitigate and give back to our natural world and the services provided. This may have introduced bias into her instruction during the 1-day and 1-weeklong curricula and into the research investigation itself. In either case, this investigation should be considered with this identity in mind.
Due to the limited scope of this study, we cannot make general claims about whether these patterns are likely to appear in all evolution classes. Our results only address one course context over one semester and should be viewed through this lens. As the study context only included one section per condition, we violated the assumption of independence and can only report on this one course context. For instance, the differences we saw between the sections in students' mention of actionable benefits on their post responses could be attributed to natural differences in the two sections' populations that were not completely accounted for statistically. Because of this limitation, we made efforts to account for the non-independence within course sections (for ways we attempted to mitigate this issue, see Additional file 1: Appendix E). These additional efforts and analyses suggest that differences due to non-independence were minimal, and thus, we are more confident in the results reported above. Nonetheless, results should be viewed as a single-course case study that reports on what we did with regard to development of a novel instructional approach and implementation, why we did it, and how we measured its success with regard to the designated learning objectives and hypotheses.
Lastly, our identification of a "misconception" based primarily on a student's use of language in written responses, ambiguous or intentional, has certain limitations when evaluating students' understanding of evolutionary concepts. As indicated by Gouvea and Simon (2018) and Rector et al. (2013), there are many ways that a student's use of language disconnects from their ideas. For instance, the lexical ambiguity of common evolutionary words used in everyday language (e.g. adapt, fitness) can interact with students' backgrounds and experiences to shape the meaning they give to scientific terms (Rector et al. 2013). Additionally, students can hold multiple lexical and scientific meanings for words, which gets even more convoluted when the actual science terms themselves hold multiple meanings (e.g. purpose). These levels of complexity can make it very challenging for students to construct knowledge and to know when to appropriately use language. Therefore, our evaluation of students' use of "misconception language" in both conditions should be viewed from this perspective and be understood as a first attempt to grasp significant changes the DBL curricula had on students' thinking overall.

Conclusions
In this curricular investigation, we found a strong positive link between students learning within a DBL context and how likely they were to report ways in which S-F knowledge could be applied to benefit society. Since envisioning applications of scientific knowledge to human problems is a first step in developing a student's ability to apply knowledge and enact change, it is likely that DBL curricula, such as the one presented, could strengthen students' abilities to apply complex evolutionary concepts to navigate human problems. The use of biomimicry design challenges may also be an especially salient way to employ DBL for ecology and evolutionary biology students, given that biomimicry targets socio-scientific problems associated with sustainability.
We also found that implementing a DBL curriculum did not diminish students' conceptual learning. Students in both sections showed comparable gains in S-F understanding, and students who reported on how to apply S-F understanding to navigate societal problems had higher quality and more correct understanding of the S-F concept, regardless of course condition. These findings reduce concerns regarding DBL curricula potentially resulting in lower content knowledge gains than other approaches (Dochy et al. 2003; Eastwood et al. 2013; Sadler and Dawson 2012).
In conclusion, we present results supporting the incorporation of both DBL and Biomimicry into biology curricula. Theory and past research suggest that experiences such as the one presented could also increase students' interest in science, values of science, beliefs that they can make a difference in society, and pro-environmental behaviors (de Groot and Steg 2008; Kollmus and Agyeman 2002; Moser and Dilling 2011; Truelove and Parks 2012). We hope that this initial study and presentation of novel curricula will lead to more studies of the efficacy of both Biomimicry and DBL curricula.

Additional file 1: Supplemental tables and analyses.

Abbreviations
3-D LAP: Three dimensional learning assessment protocol; AACU: American Association of Colleges and Universities; DBL: Design-Based Learning; H1, H2, H3: Hypothesis 1, Hypothesis 2, Hypothesis 3; N: Sample Size; NGSS: Next Generation Science Standards; NSF: National Science Foundation; NRC: National Research Council; PBL: Problem-Based Learning; Post presenceMiscon: Variable indicating the presence of a misconception in students' post-course assessment responses (a misconception could be present in either assessment question); Pre presenceMiscon: Variable indicating the presence of a misconception in students' pre-course assessment responses (a misconception could be present in either assessment question); Q1, Q2, Q3: Question 1, Question 2, Question 3; RQ: Research question; S-F: Structure-Function; S-F post: Variable indicating students' post-course understanding of the S-F relationship as measured by students' post-course score on Assessment Q2; S-F pre: Variable indicating students' pre-course understanding of the S-F relationship as measured by students' pre-course score on Assessment Q2; S-S: Science-Society; S-S post action presence: Variable indicating inclusion of an application/behavior benefit on Assessment Q1 in the post-course assessment; S-S pre action presence: Variable indicating inclusion of an application/behavior benefit on Assessment Q1 in the pre-course assessment; USDE: United States Department of Energy.

Authors' contributions
The lead author, EF, led the research team in study design, data collection, and writing for this manuscript. She performed the majority of qualitative analyses assisted by a team of undergraduate researchers. She performed all quantitative statistical analyses for this paper. She also taught the Biomimicry curriculum described in the manuscript. The corresponding author, LC, assisted with study design, data collection and writing of the manuscript. She advised on qualitative and quantitative analyses. AM contributed to the study design and data collection for this paper. He advised on quantitative analyses and taught the comparison curriculum described in the manuscript. AE and AT contributed to the qualitative analysis described in the paper by helping with codebook development and open coding of student responses. All authors read and approved the final manuscript.

Funding
Funding for this project was provided by the Chancellor's Award for Excellence in STEM Education at the University of Colorado Boulder.

Availability of data and materials
The raw data for this work will not be made available due to confidentiality and privacy protocols written into our approved IRB protocol. Our data consists primarily of students' responses to open-ended questions. Occasionally, these responses present information that may be used to indirectly identify study participants. Our agreement states that a limited set of select representative quotes from these responses will be reported with attention to use of quotes that do not have potential to directly or indirectly identify study subjects. Therefore, for reasons of confidentiality, we have chosen not to make our data set public.

Ethics approval and consent to participate
This investigation received approval from the Institutional Review Board for Human Subjects at the University of Colorado, Boulder (#18-0467).

Consent for publication
Because details relating to individuals in this manuscript (i.e., quotes) are entirely unidentifiable and no details on individuals are reported within the manuscript, obtaining consent for publication does not apply to this work. The CU Boulder IRB-approved consent form for participation in the study is available upon request. This form informs study participants that reports about the data will be made but that they will not involve any identifying information.

Competing interests
Two of the authors of this manuscript were also instructors in the course that was the subject of study. These authors took precautions to reduce their biases associated with student responses and participation in the study, including taking measures to avoid coercing students to participate and refraining from viewing who was participating in the study and their responses to study-specific questions until after course grades were assigned. They strived to keep their role as instructors separate from their role as researchers when analyzing the data.