Evolutionary studies of stone tools are on the rise. Most involve Old World tools (e.g., Archer and Braun 2010; Castiñeira et al. 2007; Clarkson et al. 2006; Grosman et al. 2008; Lycett 2009; Sumner and Riddle 2008) that probably did not function as points (but see Apel and Darmark 2009). Collectively, these studies attest to the promise of lithic evolutionary analysis. Here, I confine treatment to current and new approaches in New World Pleistocene archeology.
In developing archeological theory of points, from a New World perspective, it is fitting to begin at the beginning. The Paleoindian point sequence starts with Clovis and diversifies into Goshen and Folsom on the Plains, to Gainey and Barnes/Cumberland in the Midwest, perhaps directly to Dalton in the mid-South, to Suwannee, Simpson and other unfluted lanceolates in the deep South, and to Colas de Pescado in Latin America. Sequences are well described and, as descriptive labels, types are convenient markers for complex patterns of variation. But we must attend to the properties of these sequences as streams, not ice blocks. We do not know if Clovis morphed by continuous degree into later types (phyletic gradualism), if types branched off fully formed while Clovis persisted (cladogenesis), or if later types are unrelated replacements of Clovis. In this state of ignorance, archeologists have begun to test methods to distinguish among the possibilities.
Functional and historical dimensions of variation must be considered in any point study. In the Paleoindian case, functional dimensions include not only flight characteristics, penetration, and durability but also launching device. Collins (2007:76–79) summarized data on use–wear type, placement and pattern, and equivocal direct evidence for atlatls to conclude that Clovis points probably were hand-held lances, although he did allow the possibility that they were darts launched by atlatls. Their similarity in size and form recommends Collins’s judgment to Gainey/Bull Brook points of eastern North America. (The status of Suwannees, Barnes/Cumberlands and Colas de Pescado is less certain.) In contrast, most archeologists believe that Folsom and probably Dalton points were launched by atlatls. If Collins is correct, then later fluted points like Folsoms and later types may be skeuomorphs adapted from a preexisting design to accommodate a new launch device. If so, then future studies, either cladistic as summarized below or geometric morphometric as advocated later, must take into account variation that corresponds to launching device.
O’Brien and Lyman
O’Brien and Lyman (2003) was the first major study of evolutionary paths in points, involving a large sample of early Holocene types from the American southeast. In context, the study was stimulating but inspired certain reservations. O’Brien and Lyman modeled three-dimensional points as two-dimensional objects, using eight mostly nominal and ordinal variables (2003:152). Despite indifference to “whether characters are quantitative or qualitative” (2003:144), their commitment to paradigmatic classification prompted O’Brien and Lyman to reduce naturally continuous data to interval and lower scales. Unfortunately, at least three of their variables conflate original design with the negative allometry of resharpening, i.e., code specimens for degree and pattern of resharpening experienced (Shott 2008a:148–149).
Types (“taxa”), not individual points or empirical assemblages of them, were O’Brien and Lyman’s unit of study. They were right to question (but not necessarily reject) the empirical basis of type definitions. Problems include lack of precision and consistency in description, disagreement about what constitutes essential characteristics, and change in type definition as more empirical samples are found. Surely classification can be improved. But their paradigmatic alternative (the a priori definition of classes into which empirical specimens are placed according to their sets of characteristics) itself amounts to a reproduction of empirical types. O’Brien and Lyman argue that character lists and paradigmatic classes defined from them “are not empirical…they can only be created” (2003:139). But characteristics are selected from infinite possibilities and types defined that resemble empirical specimens only after examination of empirical samples. The paradigmatic approach does not avoid the problem of empirical type definition. In their study, moreover, although variables were chosen “based on expectations as to which parts of a projectile point would change the most over time as a result of transmission” (2003:150), the variables seemed selected to describe general qualities of plan form, haft element, and fluting; the classification was not obviously informed by the detailed examination of point performance or historical criteria noted above. That is, it was extensional as much as intensional.
Whatever the virtues of paradigmatic classification, in the 621 points of O’Brien and Lyman’s study, it produced 491 types qua taxa, an average of 1.26 points per type (2003:157). Such types are nearly as unique as snowflakes. Then they confined analysis to the 83 points that formed 17 larger types. In the process, a group of more than 600 specimens was reduced to an analytical set of fewer than 100, which seems inefficient in the use of empirical data if not unrepresentative. Whatever the imperfections of traditional typology, O’Brien and Lyman’s “intensional” types freely combined form, technology, fluting, size, and degree of resharpening and freely cross-cut traditional types (2003:156). This failure to replicate “extensional” types does not automatically discredit either approach, but does suggest that we should be more careful in how types are defined. Among other things, resharpening effects can be minimized if not entirely removed by the simple expedient of confining characterization and analysis to haft elements, which are far less susceptible to resharpening than are blades and whole-object size and form.
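The proliferation of classes described above follows from how paradigmatic classification works: every unique combination of character states is its own class. A minimal sketch in Python illustrates this, assuming each specimen is coded as a tuple of states. The three-character coding and the five toy specimens are hypothetical and are not O’Brien and Lyman’s data; only the counts 621, 491, 83, and 17 come from the figures reported above.

```python
from collections import Counter


def classify(specimens):
    """Count specimens per paradigmatic class.

    A class is simply one unique tuple of character states, so the
    number of classes grows quickly as characters and states are added.
    """
    return Counter(tuple(s) for s in specimens)


# Hypothetical specimens coded on three characters (illustration only).
toy_specimens = [
    (1, 2, 1),
    (1, 2, 1),  # shares a class with the specimen above
    (1, 3, 1),
    (2, 2, 4),
    (3, 1, 2),
]
toy_classes = classify(toy_specimens)
toy_mean = len(toy_specimens) / len(toy_classes)

# Figures reported in the text (O'Brien and Lyman 2003:157).
reported_mean = 621 / 491      # mean points per class in the full sample
retained_share = 83 / 621      # share of specimens kept for analysis
retained_mean = 83 / 17        # mean points per retained class

print(f"toy: {len(toy_classes)} classes, {toy_mean:.2f} points per class")
print(f"reported: {reported_mean:.2f} points per class")
print(f"retained: {retained_share:.1%} of specimens, "
      f"{retained_mean:.1f} points per retained class")
```

Even in the toy case, most classes hold a single specimen, which is the pattern at issue in the reported figures.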
Finally, cladistics found phylogenies for the unordered characters of the reduced point-type sample. Cladistics assumes branching divergence and usually is constrained to change only one state per step in ordinal data. These assumptions may or may not be faithful to the nature of transmission and selection in points. Assuming validity of the phylogeny (2003:169), O’Brien and Lyman make it the conclusion of their analysis. As description, this is fair enough, but any historical inference reached or evolutionary trend suggested, certainly one so detailed and complex as theirs, begs for broader contextualization and explanation.
Other Cladistic Approaches
Buchanan et al. used two-dimensional images of a continental sample of fluted points. They defined landmarks and used inter-landmark distances as dimensional measures. They used two different methods to attempt to control for resharpening allometry (Buchanan and Collard 2007:372–373; Buchanan and Hamilton 2009:284), both critiqued elsewhere along with interpretations of results (Shott 2009). Assemblage was the unit of analysis, and mean “size-free” dimensions by assemblage were reduced to ordinal or interval-scale variables, treated as ordered states for cladistic analysis. The dataset included caches, and what most archeologists would call “kill” (e.g., Naco, Lehner) and “habitation” assemblages that varied considerably in size and probably accumulation span, which are important sources of assemblage variation (Shott 2010b).
Here, the most salient points are that assemblages were units of analysis and that Buchanan et al. sought historical information in point data. Assemblages can be legitimate units of analysis, but free combination of caches, kills, and habitations is questionable. Caches were deposited in virtual instants of time and might comprise products of one knapper. Kill assemblages probably also accumulated in very short intervals, even if points were contributed by several users, unless they occur at places suitable to repeated ambushes or drives. But at what archeologists typically call “habitation” sites, assemblages accumulated over much longer and more variable intervals. Combining caches with habitations freely mixes instantaneous deliberate deposits with accumulations over intervals orders of magnitude longer. Assemblages can serve as units of analysis, but not all assemblages are equal in size or accumulation span. Considering the great variation implicated, assemblages should comprise units of analysis only when at least roughly comparable in context, span, and size.
Among other limitations, cladistic practice in archeology emphasizes discrete traits to the virtual exclusion of continuous ones, which alters the character of data and might influence results. Our discrete-trait focus prevails across the range of analytical scales from characteristics of individual artifacts (e.g., Buchanan and Collard 2007; O’Brien and Lyman 2003) to entire cultures (e.g., Bettinger 2009; Chatters and Prentiss 2005). Yet some paleobiologists argue that continuous traits are perfectly amenable to cladistic treatment (e.g., MacLeod 2002). Of course we should treat as discrete those traits that legitimately are, but we should not reduce the intrinsic properties of our often continuous data merely to fit convenient analytical practices developed in other fields; instead, we should consider inherently continuous traits equally with discrete ones, and then use existing or devise new methods for their joint study.
Geometric Morphometrics
Whatever the advantages of cladistic analysis (cf. Shott 2008a; 2009), geometric morphometrics (GM) also is on the rise, mostly in resharpening studies (e.g., Shott and Trail 2010). GM is a set of methods that produce and analyze data for historical, among other, purposes. Like cladistics, GM reduces the complex whole objects that are points to smaller sets of dimensions. Whether in two- or three-dimensional models, however, GM does not reduce naturally continuous data to lower measurement scales. Unlike traditional manual measurement of orthogonal dimensions, GM data are free of geometric constraints. When manipulated in CAD or similar software, resulting models allow easy and accurate measurement of variables otherwise difficult to measure (e.g., volume, surface area, centroid size, longitudinal and transverse section area and perimeter [and variation in both at regular intervals along the axis], edge perimeter). Landmarks and other GM data can be placed to capture the functional traits that theory identifies (e.g., Hughes 1997; Wilhelmsen 2001) and the technological traits that relate particularly to fluted points (Shott and Trail 2010; Fig. 1). Unlike cladistics, GM is not an inherently historical approach that assumes mode, rate, or direction of evolutionary change.
MacLeod (2002:129–134) demonstrated the potential of GM in paleobiology. He defined what amount to modules—sets of landmark points that covary closely as quasi-discrete components (e.g., Klingenberg 2008) of larger wholes—in hypothetical phylogenies. Landmark x-y-z coordinates and inter-landmark distances are continuous values easily analyzed in GM routines like the program Morphologika (O’Higgins et al. 2009). MacLeod defined clusters of landmark configurations that both captured discrete characters in a time-ordered set of taxa and that reconstructed its evolutionary history (itself known in these hypothetical data, which thus serve as a control sample). Finally, he reduced continuous data to discrete characters and conducted conventional cladistic analysis that agreed closely with phylogenetic inferences reached in continuous data. Thus, MacLeod suggested that both continuous data and morphometric methods can reveal historical relationships in fossil lineages.
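To make concrete what it means for landmark coordinates and inter-landmark distances to be continuous values, the following minimal Python sketch computes centroid size and all pairwise inter-landmark distances from a set of x-y-z coordinates. The four landmarks and their coordinates are hypothetical, and the functions are illustrations only, not routines from Morphologika or any other GM package.

```python
import math
from itertools import combinations


def centroid(landmarks):
    """Mean position of a landmark configuration (any number of dimensions)."""
    n = len(landmarks)
    return tuple(sum(axis) / n for axis in zip(*landmarks))


def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks from their centroid."""
    c = centroid(landmarks)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in landmarks))


def interlandmark_distances(landmarks):
    """Euclidean distance for every pair of landmarks, keyed by index pair."""
    return {
        (i, j): math.dist(landmarks[i], landmarks[j])
        for i, j in combinations(range(len(landmarks)), 2)
    }


# Hypothetical x-y-z coordinates in mm: two basal corners, the deepest
# spot of the basal concavity, and the tip (illustration only).
point = [(-10.0, 0.0, 0.0), (0.0, 4.0, 2.5), (10.0, 0.0, 0.0), (0.0, 90.0, 1.0)]

print(f"centroid size: {centroid_size(point):.2f} mm")
for pair, d in interlandmark_distances(point).items():
    print(pair, f"{d:.2f} mm")
```

Every value returned is measured on a ratio scale, so nothing in the data forces reduction to ordinal or nominal character states.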
GM study of Paleoindian points might use types and assemblages differently as units of analysis. For many purposes, types should be units of analysis (Apel and Darmark 2009:16). That does not mean that types as defined subjectively or points casually assigned to types should be taken at face value. On the contrary, GM data can be used to define valid types, especially in haft modules rather than whole points to control for resharpening allometry. This can be done using standard cluster analysis, relative-warp cross plots (MacLeod 2002:131), canonical variate analysis, or other methods.
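The case for haft modules can be illustrated with a minimal sketch, assuming two-dimensional landmarks and a partial Procrustes distance (shape difference after removing location, size, and rotation). The landmark scheme, the coordinates, and the function names are hypothetical; the complex-number shortcut is a standard way to handle the two-dimensional case and is not taken from any of the studies cited here.

```python
import math


def to_shape(landmarks):
    """Center 2-D landmarks and scale them to unit centroid size."""
    pts = [complex(x, y) for x, y in landmarks]
    mean = sum(pts) / len(pts)
    centered = [p - mean for p in pts]
    size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    return [p / size for p in centered]


def procrustes_distance(a, b):
    """Partial Procrustes distance between two 2-D landmark sets (no reflection)."""
    sa, sb = to_shape(a), to_shape(b)
    # With unit-size shapes, the best rotation gives a squared distance
    # of 2 - 2*|s|, where s is the complex cross-product sum below.
    s = sum(p.conjugate() * q for p, q in zip(sa, sb))
    return math.sqrt(max(0.0, 2.0 - 2.0 * abs(s)))


# Hypothetical landmark order: 0 left basal corner, 1 basal concavity,
# 2 right basal corner, 3 right haft/blade junction, 4 tip,
# 5 left haft/blade junction. Coordinates in mm, illustration only.
HAFT = [0, 1, 2, 3, 5]
fresh = [(-10, 0), (0, 4), (10, 0), (12, 30), (0, 90), (-12, 30)]
resharpened = [(-10, 0), (0, 4), (10, 0), (12, 30), (0, 60), (-12, 30)]


def haft(landmarks):
    """Keep only the landmarks assigned to the haft module."""
    return [landmarks[i] for i in HAFT]


whole = procrustes_distance(fresh, resharpened)
haft_only = procrustes_distance(haft(fresh), haft(resharpened))
print(f"whole-point distance: {whole:.3f}")
print(f"haft-module distance: {haft_only:.3f}")
```

In this contrived pair, shortening the blade changes whole-point shape considerably while the haft module is unchanged. Distances of this kind, computed between all pairs of specimens, are one possible input to the cluster analysis mentioned above.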
O’Brien and Lyman (2003:135) argued that empirical-type definition obscured historical relationships between Clovis and Dalton types, and that intensional paradigmatic classification and cladistic analysis alone could reveal them. Yet GM characterization and analysis can reveal such relationships and is more faithful to the total form and continuous dimensions of points (i.e., does not reduce them to a small set of abstract formal properties and measures continuous variation in continuous terms). O’Brien and Lyman also argued that paradigmatic classification can define morphospace, “the multidimensional space encompassing the range of morphological variation of…taxa” (2003:140). So too can GM and landmark data; indeed, Gould (1991:420) considered cladistics and the paradigmatic classification consistent with it poorly suited to defining morphospace. Leaving aside the questions about O’Brien and Lyman’s analysis noted above, GM, besides its unique virtues, can accomplish all that their approach claims.
For some purposes, however, cache assemblages may be valid analytical units for the properties noted above, either by defining points in a cache as a type or using context to test the validity of types defined otherwise. Among their virtues is that cache specimens rarely are resharpened, so resharpening allometry is mooted and entire points, not just haft modules, can be studied. Also, specimens in a cache often are of the same material, controlling this source of variation. Although it always remains an inference, cache points also may have been made by one person at essentially one time, controlling other sources of variation. Paleoindian point caches are difficult to find and therefore rare, yet a surprising number exist (Kilby 2008). It is worth at least exploring GM analysis in cache points.
Whatever the unit of analysis, and assuming some a priori information about the chronological distribution of types, historical or evolutionary analysis can take several routes. For instance, thin-plate splines and relative-warp plots are graphic depictions of pattern and scale of difference in landmark points between putative ancestral and descendant types (e.g., Shott and Trail 2010). Granted, they are graphics, not analysis, but scaled images of the deformation required to transform type A into type B can suggest hypotheses for rate, mode, and cause of change (Fig. 2). Some pictures are worth at least a few words. Software like MorphoJ (Klingenberg 2011) and Mesquite (Maddison et al. 2009) can test different transmission modes or other evolutionary processes in GM data, although they require a priori phylogenies, and there are ways to fit GM data to phylogenies that treat shape as a multivariate character and test for historical signals in those data (Klingenberg and Gidaszewski 2010). It also should be possible to fit empirical GM trends to different models of adaptive landscapes (Bettinger 2009; Polly 2004). Moreover, GM data and perhaps inter-landmark distances allow points to be subdivided into modules or even smaller sets of data and dimensions. Resolving point types this way, we may find that evolution occurs much differently or more rapidly in some segments than others, extending to multivariate GM data the approach that Morrow and Morrow (1999) demonstrated in the transition from Clovis to Colas de Pescado.
Whether or not microscale transmission processes can be identified in often-coarsely resolved archeological data, GM point data also can be used to explore macro-scale transmission. Again, the best examples are in paleobiology. Polly (2004) used computer simulation to model the evolution of complex morphology over long time spans in morphometric data on shrew tooth crowns. To constrain the pattern and magnitude of morphological change to the biologically realistic (i.e., to prevent morphometric landmark configurations from evolving or drifting via stochastic simulation into unrealistic form [Polly 2004:3]), he calibrated covariance matrices to empirical data. (Channeling Gould, Bettinger [2009:279] called “Baupläne” similar functional constraints at presumably higher levels than points, thus acknowledging constrained variation in cultural phenomena.) Polly then simulated evolution in simple (i.e., unimodal, one peak) and complex (i.e., multimodal randomly distributed peaks that varied in slope and height) adaptive landscapes under distinct selection modes: random fluctuation, directed or sustained trend, stabilizing, and drift. Morphologies that evolved over long time spans patterned differently depending upon selection mode. Under random fluctuation, for instance, evolved morphologies were narrowly and randomly distributed near the origin (representing the starting morphology) of the plot of the first two axes of principal component space, and divergence from starting morphology rose modestly over time while the range of morphologies rose monotonically (Polly 2004:Fig. 5). In contrast, directional selection yielded tightly clustered evolved morphologies distant from the origin of principal component plots, with limited divergence from the selection trend and monotonic rise in divergence from starting morphology over time (Polly 2004:Fig. 7). Thus, Polly could distinguish the effects of different selection regimes and adaptive landscapes. He concluded (2004:22) that evolutionary trends and historical relationships between taxa can be identified in morphometric data under any mode except strong stabilizing selection.
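The logic of comparing selection modes by simulation can be conveyed in a toy sketch. It is emphatically not Polly’s model: it omits empirically calibrated covariance matrices and adaptive landscapes, tracks only two abstract shape axes (standing in for the first two principal components), and uses step size, trend strength, and pull strength that are arbitrary assumptions. All function and parameter names are hypothetical.

```python
import math
import random


def simulate(mode, steps=200, step_sd=1.0, trend=0.5, pull=0.1, rng=None):
    """Evolve one lineage from the origin and return its final divergence.

    mode: "random" (unbiased steps), "directional" (steps biased along
    axis 1), or "stabilizing" (steps pulled back toward the origin).
    """
    rng = rng or random.Random()
    x = y = 0.0
    for _ in range(steps):
        dx = rng.gauss(0.0, step_sd)
        dy = rng.gauss(0.0, step_sd)
        if mode == "directional":
            dx += trend                 # sustained trend along axis 1
        elif mode == "stabilizing":
            dx -= pull * x              # restoring pull toward the start
            dy -= pull * y
        elif mode != "random":
            raise ValueError(f"unknown mode: {mode}")
        x += dx
        y += dy
    return math.hypot(x, y)            # distance from starting morphology


def mean_divergence(mode, lineages=200, seed=1):
    """Average final divergence over many simulated lineages."""
    rng = random.Random(seed)
    return sum(simulate(mode, rng=rng) for _ in range(lineages)) / lineages


for mode in ("stabilizing", "random", "directional"):
    print(mode, round(mean_divergence(mode), 1))
```

Under these assumptions, lineages under the sustained trend should end farthest from the starting morphology and those under the restoring pull nearest to it. The sketch shows only that different modes leave different signatures in divergence, which is the property that makes modes distinguishable in principle.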
It may seem a long stride from shrew teeth to points, but both are rigid solids amenable to morphometric characterization and whose landmark configurations are constrained by function, size, allometry, and other factors. Just as shrew teeth cannot grow too large or sprout needle-sharp peaks, points and their modular components cannot grow too large or small, cannot sprout multiple points on their edges or faces, cannot grow too thick or thin, and otherwise are functionally constrained (e.g., Hughes 1997; Wilhelmsen 2001).