6. Microtiming Studies

I have experienced one of the most interesting musical revelations of my life, gradually over the last several years, in studying West African dance-drumming and in playing jazz, hip-hop and funk. The revelation was that the simplest repetitive musical patterns could be imbued with a universe of expression. I have often witnessed the Ghanaian percussionist and teacher C. K. Ladzekpo stopping the music to chide his students for playing their parts with no emotion. One might wonder how much emotion one can convey on a single drum whose pitch range, timbral range, and discrete rhythmic delineations are so narrow, when the only two elements at one's disposal are intensity and timing. Yet I have become convinced that a great deal can be conveyed with just those two elements. Some investigations into how this can happen are set forth in this chapter.

Rhythmic expression in African and African-American musics. Some of the arguments in this chapter draw upon cultural aspects of music listening. Working from the documented historical lineage between West African and African-American cultures, Wilson (1974) has identified a constellation of conceptual tendencies that exist in the musics of that vast diversity of cultures. Among the musical preferences and principles he enumerated were the following:

These and other concepts can serve as the beginnings of a pan-African musical aesthetic, since so many of these notions appear so often in so many different kinds of West African and African-American music. A great majority of this music falls in the category of groove-based music that I have mentioned, meaning that it features a steady, virtually isochronous pulse that is established collectively by an interlocking composite of rhythmic entities and is intended for or derived from dance. This somewhat inadequate description should not be viewed as a definition of the concept of groove; indeed, to some degree, that definition is what we are searching for with this work. One could say that, among other functions, groove gives rise to the perception of a human, steady pulse in a musical performance.

In groove-based music, this steady pulse is the chief structural element, and it may be articulated in a complex, indirect fashion. In groove contexts, musicians display a heightened, seemingly microscopic sensitivity to musical timing (on the order of a few milliseconds). They are able to evoke different kinds of rhythmic qualities, such as apparent accents or emotional mood, by playing notes slightly late or early relative to their theoretical metric location. While numerous studies have dissected the nuances of expressive ritardandi and other tempo-modulating rhythmic phenomena (Repp 1990, Todd 1989, Desain & Honing 1996), to our knowledge there have been few careful quantitative studies that focus on expressive timing with respect to an isochronous pulse. In groove-based contexts, even as the tempo remains constant, fine-scale rhythmic delivery becomes just as important a parameter as, say, tone, pitch, or loudness. All these musical quantities combine dynamically and holistically to form what some would call a musician's "feel." Individual players have their own feel, i.e. their own ways of relating to an isochronous pulse. Musical messages can be passed at this level. A musician can pop out of a polyphonic texture by a "deviation" from strict metricality, or a set of such deviations. As I shall attempt to demonstrate below, these kinds of performance variations create an attentional give-and-take to emphasize different moments interactively. This and similar techniques are manipulated with great skill by experienced musicians playing together, as a kind of communication at the "feel" level. We claim that this variety of expressive timing against an isochronous pulse contains important information about the inner structure of groove.

Often when the topic of musical communication is broached, one is tempted to wonder what is being said amidst all this communication. This raises the question of what actually constitutes a musical message, or, for that matter, musical meaning in general. Here, I feel, one should draw upon the processual notion of communication, as a collective activity that harmonizes individuals, rather than on the telegraphic model of communication as mere conveyance of literal, verbal meanings. For example, the musical notion of antiphony, or call and response, can function as a kind of communication, and nothing need be "said" at the literal level to make it so (although we need not rule out the possibility of musically encoded symbolic meaning). What definitely is happening is that the interactive format, process, and feeling of conversational engagement are enacted by the musicians. In a context like jazz, the presence of this kind of dialogic process can be constant throughout a performance, as sustained antiphony. I am arguing that a significant component of such a process occurs along a musical dimension that is non-notatable in Western terms -- namely, what I have been calling microtiming.

Previous microtiming studies. Minuscule timing deviations from metronomicity are frequently miscast as "discrepancies" (Keil 1995), "motor noise," or "inaccuracies" (Rasch 1988). But there has been a small thread of research dedicated to the uncovering of structure in these so-called inaccuracies. It turns out that these deviations both convey information about musical structure and provide a window onto internal cognitive representations of music. One of the most compelling examples of this direction in research is provided by Drake & Palmer (1993). They proposed three types of accent structure:


In studies of timing of numerous skilled classical pianists, they found systematic deviations from strict regularity that correlated with these accent structures. Their qualitative findings are summarized below (table adapted from Drake & Palmer 1993).

| Type of accent structure | Intensity variation | Inter-onset interval | Articulation |
|---|---|---|---|
| Rhythmic grouping accent | Last event is louder; notes crescendo throughout group | Penultimate event is elongated (i.e., last event is delayed) | Penultimate event is more staccato; articulations are proportional to note durations |
| Melodic accent | Event on a turn is louder | Event before or on a melodic turn or leap is elongated (i.e., melodically accented event is delayed) | Event before or on melodic leap is more staccato |
| Metric accent | Stronger beats are louder | Last beat of metric cycle is elongated (i.e., downbeat is delayed) | Stronger beats are less staccato |

In these results, it is clear that small performance variations in timing, intensity, and duration enhance aspects of musical structure. Drake and Palmer concluded that these performance variations facilitate listeners' segmentation of musical sequences, since the accent structures serve to break up a musical sequence into smaller, more tractable chunks. Furthermore, Drake and Palmer found that the expressive effects that stemmed from rhythmic grouping tend to dominate melodic-accent or metric-accent effects; the former tended to override the latter two when the music yielded conflicting accent interpretations. That the metric accents would tend to be overridden is unsurprising, since one would expect expressive timing to break up the regularity of repetitive metric accents. But the fact that rhythmic grouping effects dominate melodic-accent effects suggests a more general primacy of rhythm over melody, in both production and perception. Also worth noting is that these expressive variations allow graduated change -- flexible, continuous variability. A moderate amount of expression is the norm, and performances that are low or high in expressivity will stand out as extreme.

In his studies of microtiming variation in small chamber groups, Rasch (1988) conducted statistical analyses of inter-musician differences in note onsets from recorded ensemble performances. Although he averaged out all musical structure, including metric accents and tempo variation, he found that generally, in a string trio, the violin's lead voice tends to lead by 5 to 10 milliseconds, the cello tends to follow, and the viola's middle voice tends to lag by another 5 to 10 milliseconds. It is unclear how accurate these findings are, however, because the standard deviation for each of the instruments was around 35 milliseconds. But at any rate, Rasch's hypothesis that there might be systematic variation in ensemble performance is a valuable one.

The above studies focused on European classical-music performance, which would not fall into the realm of groove-based music because of its reliance on tempo variation for expressive purposes. Indeed, the above results indicate that beats are frequently lengthened or shortened by the performer. Also, as discussed in the previous chapter, the treatment of metric organization as implying a series of weak and strong beats does not apply particularly well to West African or African-American musics; in these contexts there is no such thing as a metric accent, in terms of performance variation. Although the above studies are valuable, their stylistic scope does not coincide with ours.

It would be instructive to conduct a similar microtiming analysis for a percussion ensemble, particularly in instances of groove-based music which is much less forgiving in the realm of tempo variation and rubati than a string trio might be. Bilmes (1993) conducted a timing analysis of a recorded performance of Los Muñequitos de Matanzas, an Afro-Cuban rumba group [CD-37]. In a performance averaging 110 beats per minute (such that what would be a notated sixteenth note lasts around 135 milliseconds), both the quinto and the segundo (lead and middle conga drum, respectively) tend to play about 30 milliseconds ahead, or "on top." On the other hand, the tumbao (low conga drum) had a much broader distribution, nearly as often late as early. It should be noted that here the precise moment of the beat was not determined by the norm set by these three instruments themselves, as it was in the case of the string trio. Rather, the beat was established by a reference instrument, in this case a clave or a guagua. Hence it was possible for all three instruments to be ahead of the nominal beat, which was not the case for the string trio. In Bilmes's work, the average inter-drum asynchrony was not calculated; indeed, such a measure would ignore any relationship between timing and musical structure. But a frequency analysis of the microtiming variations revealed systematic structure. For example, the repetitive segundo part displayed a strong peak corresponding to its frequency of repetition, showing that the microtiming variations were not at all random.
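The tempo arithmetic and the idea of a frequency analysis of microtiming deviations can be sketched in a few lines of Python with NumPy. This is only an illustration, not Bilmes's actual procedure: the deviation series is synthetic, and the function names, the 16-note cycle length, and the jitter level are assumptions made for the example.

```python
import numpy as np

def subdivision_ms(bpm, per_beat=4):
    """Duration in milliseconds of one subdivision (default: a sixteenth note)."""
    return 60000.0 / bpm / per_beat

def deviation_spectrum(deviations_ms):
    """Magnitude spectrum of a series of per-note timing deviations.

    Returns (frequencies in cycles per note, magnitudes). The mean is removed
    first so that a constant "ahead of the beat" offset does not dominate.
    """
    d = np.asarray(deviations_ms, dtype=float)
    d = d - d.mean()
    mags = np.abs(np.fft.rfft(d))
    freqs = np.fft.rfftfreq(len(d), d=1.0)
    return freqs, mags

# Synthetic data (hypothetical): deviations that repeat every 16 notes,
# centered 30 ms early, plus a small amount of random jitter.
rng = np.random.default_rng(0)
cycle = 16
n = np.arange(cycle * 32)
deviations = -30.0 + 8.0 * np.sin(2 * np.pi * n / cycle) + rng.normal(0.0, 2.0, n.size)

freqs, mags = deviation_spectrum(deviations)
peak_freq = freqs[np.argmax(mags)]

print(round(subdivision_ms(110), 1))  # sixteenth-note duration at 110 bpm, in ms
print(1.0 / peak_freq)                # period of the strongest component, in notes
```

If the deviations were purely random, no single component would stand out; a peak at the repetition period of the part is what marks the variation as systematic.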

Given this apparent systematicity of fine-scale rhythmic expression in groove contexts, we can take cues from the results of Drake & Palmer (1993) and Rasch (1988) discussed above, as well as from our expanded view of cognition involving the theory of embodiment, to make guesses about the function of such rhythmic expression. Thus we hypothesize that microtiming variations in groove music play any of the following roles:

I shall now address all of these possibilities via a few examples.



Asynchrony. The asynchronous unison attacks described above support a scientifically meaningful explanation. Rasch (1988) reports an earlier study of asynchrony: a 1977 experiment in which he investigated the effect of onset difference times on the perception of quasi-simultaneous tones. The threshold for perceiving the upper of two quasi-synchronous tones

So a function of these quasi-synchronous attacks, or flams, could be to aid in the perception of the timbral constituents of the unison attack.

The accompanying audio example demonstrates this tactic [CD-38, 39].


In the first version, the double stroke on the first beat is essentially synchronous; in the second version, a delay of 30 milliseconds is introduced between the two notes, resulting in a small flam. This serves to enhance the perception of the two timbral constituents.

In some musical situations in which blending is preferred, this kind of multitimbral asynchrony may be undesirable, but it is often a valued musical trait in groove-based music. On Wilson's list of African and African-American aesthetic concepts is the notion of a heterogeneous sound ideal, a tendency to value the presence of a variety of contrasting timbres. Another important cultural characteristic (not mentioned earlier) is a collectivist ideal, in which music is construed as a communal activity among groups of people. The rhythmic asynchronies described above aid in the perception of a multiplicity of timbres, as well as, in the ecological view of music perception (Gibson 1979, Shove & Repp 1995), the multiplicity of human bodies behind those timbres. That is, rhythmic asynchrony contributes both to the heterogeneous sound ideal and to the sense of collective participation. To be sure, exact synchrony is impossible with groups of people anyway; but this principle applies even when different sounds are played by one individual, as on the modern drumset. So here is an instance in which a kind of subtle rhythmic expression fulfills both a perceptual and an aesthetic function.

Streaming. It is well established that auditory stream segregation is a function of both pitch and timbre (Bregman 1990). From our work, it appears that microtiming can also contribute to streaming. This claim builds on the role of asynchrony in facilitating the perception of multiple tones. The audio examples [CD-40, 41] consist of a steady stream of triplets on tom-toms, together with a series of sparse tom-tom strokes at a lower volume. The musical material is shown below.

In the first audio example, the unison strokes are as simultaneous as is allowed by the MIDI protocol (i.e., within a couple of milliseconds of each other). In the second example, the second stream is delayed by 30 milliseconds with respect to the first, as indicated by the arrows in the figure, but kept at its same low volume. In the former case, the different timbres fuse into one stream, whereas in the latter case, the second stream is clearly audible as a separate entity. This example shows clearly how such minuscule timing variations can contribute to streaming. This technique is especially important in a context where the aesthetic tendency is to "fill up the musical space" (Wilson 1974). Timing variations can allow an instrument that is sonically buried to draw attention to itself in the auditory scene. Thus the presence of multiple instruments of similar timbres, as in a West African drum ensemble or a large jazz ensemble, need not be viewed as enforcing the subordination of individual identity. Individual musicians can improvise at this microrhythmic level to create an attentional give-and-take. This streaming effect also serves an aesthetic function, in that it enhances the perception of different rhythmic groups as separate animate entities with distinct "personalities," as Ladzekpo (1995) stresses.
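The construction of the two versions can be sketched as onset lists. The following Python fragment is a minimal illustration under stated assumptions: the tempo, the placement of the sparse strokes, and all function names are hypothetical, since the source specifies only the 30-millisecond delay.

```python
def triplet_grid(n_beats, beat_ms):
    """Onset times (ms) of a steady stream of triplets: three per beat."""
    step = beat_ms / 3.0
    return [i * step for i in range(n_beats * 3)]

def sparse_stream(grid, positions, delay_ms=0.0):
    """Pick the grid onsets at the given indices and shift them by delay_ms."""
    return [grid[i] + delay_ms for i in positions]

def asynchronies(stream_a, stream_b, window_ms=50.0):
    """Signed offset (ms) of each stream_b onset from the nearest stream_a onset."""
    out = []
    for t in stream_b:
        nearest = min(stream_a, key=lambda s: abs(s - t))
        if abs(t - nearest) <= window_ms:
            out.append(t - nearest)
    return out

BEAT_MS = 600.0            # assumed tempo (100 bpm); the source does not give one
POSITIONS = [0, 4, 8, 11]  # hypothetical placement of the sparse strokes

grid = triplet_grid(4, BEAT_MS)
fused = sparse_stream(grid, POSITIONS)           # first version: simultaneous
separate = sparse_stream(grid, POSITIONS, 30.0)  # second version: 30 ms late

print(asynchronies(grid, fused))
print(asynchronies(grid, separate))
```

The same `asynchronies` helper could in principle be applied to measured onsets from a recording, where the offsets would vary from stroke to stroke rather than sit at a fixed value.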

Spreading. It was not until the advent of automated machinery that human ears were ever treated to inhuman rhythmic precision. The fact is that the sonic traces of temporal constraints imposed by the body are often perceived as aesthetically pleasing, while inhuman rhythmic regularity often is not. These audio examples [CD-42, 43] consist of two versions of the "same" rhythm, shown below.

The first rendition is executed as close to the theoretical ideal as the computer allows -- that is, with rigid mathematical accuracy. The second features timing inflections designed to imitate an aspect of human performance. The difference is not simply the injection of random temporal slop. Rather, it involves the spreading apart of consecutive attacks played by the same hypothetical limb or digit. An individual effector such as a limb, hand, or digit has a time constant associated with its motion; the nerves and muscles have a brief

The rhythmic expression added in the above example is systematic; the first of each group of three taps is about 30 milliseconds early, and the last is about 30 milliseconds late, as indicated by the arrows in the figure. In addition to enhancing perceived separation, this example depicts the encoding of bodily movement in musical material. Nearly all listeners are familiar with the kind of motion suggested by these synthetic tapped rhythms, but that motion is strongly implied only by the second, "imperfect" version. Again, this description recalls the embodied, ecological view of musical perception, in which the listener perceives the source of the sounds, rather than the sounds themselves. In a music that embraces physical body motion (Wilson 1974) and that is contiguous with everyday experience, this sonic trace of the body is a valued aesthetic.
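The systematic inflection described here is simple enough to state as a rule. The Python sketch below applies it to a made-up rhythm; the grid spacing, the rhythm itself, and the function names are assumptions for illustration, and only the 30-millisecond shifts on the first and last tap of each group come from the text.

```python
def spread_groups(onsets_ms, group_size=3, spread_ms=30.0):
    """Shift the first tap of each group earlier and the last tap later."""
    out = []
    for i, t in enumerate(onsets_ms):
        pos = i % group_size
        if pos == 0:
            out.append(t - spread_ms)   # first tap of the group: early
        elif pos == group_size - 1:
            out.append(t + spread_ms)   # last tap of the group: late
        else:
            out.append(t)               # middle tap: unchanged
    return out

def inter_onset_intervals(onsets_ms):
    """Time between each pair of consecutive onsets."""
    return [b - a for a, b in zip(onsets_ms, onsets_ms[1:])]

STEP_MS = 150.0             # assumed grid spacing
STEPS = [0, 1, 2, 4, 5, 6]  # hypothetical rhythm: three taps, a rest, three taps

mechanical = [s * STEP_MS for s in STEPS]
humanized = spread_groups(mechanical)

print(inter_onset_intervals(mechanical))
print(inter_onset_intervals(humanized))
```

Within each group the intervals widen from 150 to 180 milliseconds, while the gap between groups narrows correspondingly, so the overall cycle length is preserved. The first humanized onset falls slightly before time zero, which simply means it anticipates the grid.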

Coding for invariance. The above three examples demonstrate the notion of invariance. At the most basic level, expressive microtiming represents a departure from regularity, so it is likely to be noticed in relief against the more regular background. Gibson (1975) claimed that our perceptual systems are attuned to variants and invariants in the environment; they code for change. As an example, consider the way that vibrato or a trill can facilitate auditory scene analysis by drawing our attention to a particular instrument in an otherwise blended orchestral texture [CD-44]. The microvariation of a single pitch is enough to make that voice pop out in the auditory scene.

We can make a similar generalization with rhythm. That which is regular, or invariant, in an isochronous-pulse context is the norm set by the regularity of pulsation, along with its salient multiples and subdivisions; that which is irregular comprises the variable rhythmic material along with its continuous expressive variation. Microrhythmic expression signals a departure from the implied norm, hence marking a particular sound or group of sounds as worthy of attention or analysis by our perceptual systems. This argument contributes to an ecological view of rhythm perception, in which we are attuned to variations in an otherwise regular environment.

Swing. A kind of rhythmic expression that seems to be indigenous to African-American culture is that found in jazz of the first half of this century. Known as swing, this kind of structure can be thought of as modified duple subdivisions of the main pulse, or as modified triplet subdivisions, or both concurrently. As duple subdivisions, they divide the interval of a pulse into two unequal portions, of which the first is slightly longer. They are occasionally rendered in triplet notation as a quarter note followed by an eighth note, but this exaggerates the typical swing ratio, which is usually in the gray area between duple and triple and is strongly tempo-dependent (typically lower for fast tempi and higher for slow ones). An individual musician has a particular range of preferred ratios and particular ways of manipulating them, which together form crucial dimensions of that individual's sound, rhythmic feel, and musical personality.

In a related experiment on rhythm, Fraisse (1982) has studied the ability of musically trained and untrained subjects to reproduce rhythmic patterns of varying degrees of complexity. "Arrhythmic" sequences with arbitrary relationships between time intervals caused the most difficulty. In more regular rhythmic cases, subjects tended to simplify the ratios between intervals, almost always settling on exactly two classes of time interval: long (400-800 milliseconds) and short (200-400 ms). People tend to understand rhythms to feature two and only two interval lengths, roughly in the ratio of 2:1. This drive towards rhythmic simplicity recalls some of the classical perceptual laws, namely the principle of economy in organization (Fraisse 1982). Usually as performed or as "preferred," the ratio is lower -- in fact closer to swing, about 1.75:1, about 57%

However, it is not apparent why the interval would be divided unequally in the first place. It would seem even simpler and more economical if there were no such difference in duration between the first and second of two consecutive swung notes. But the point is that this difference facilitates the perception of higher-level rhythmic structure. An immediate consequence of the swing feel is that it suggests the next level of hierarchical organization. In conventional terms, the swung eighth-note pairs are perceptually grouped into the larger regular interval, that is, the quarter note. If all subdivisions were performed with exactly the same duration, it would be more difficult to perceive the main beat. The lengthening of the first of two swung notes in a pair amounts to a durational accentuation of the beat. (Often in practice, the second note of the swung pair is given a slight accent in intensity, as if to compensate for its shorter duration.) Hence swing enhances the perception of the main pulse, as the examples [CD-45, 46] demonstrate:

The first version plays all eighth notes exactly equivalently and is therefore metrically indistinct, whereas the second version introduces a slight swing, which immediately marks the pulse.
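The relation between a long:short swing ratio and the position of the second eighth note within the beat can be made concrete with a short Python sketch. The function names and the 500-millisecond beat are assumptions chosen for illustration; the 57% figure is the one quoted in this chapter.

```python
def swing_fraction(long_part, short_part):
    """Fraction of the beat occupied by the first (longer) note of a swung pair."""
    return long_part / (long_part + short_part)

def swung_eighths(n_beats, beat_ms, fraction=0.5):
    """Onset times (ms) of eighth-note pairs; fraction=0.5 gives straight eighths."""
    onsets = []
    for b in range(n_beats):
        start = b * beat_ms
        onsets.append(start)                       # note on the beat
        onsets.append(start + fraction * beat_ms)  # swung offbeat note
    return onsets

BEAT_MS = 500.0  # assumed tempo (120 bpm)

print(round(swing_fraction(2, 1), 3))  # triplet notation: quarter plus eighth
print(round(swing_fraction(1, 1), 3))  # straight eighths
print([round(t, 1) for t in swung_eighths(2, BEAT_MS)])        # metrically indistinct
print([round(t, 1) for t in swung_eighths(2, BEAT_MS, 0.57)])  # slight swing
```

In the straight version every inter-onset interval is identical, so nothing in the timing distinguishes the beat from the offbeat; in the swung version the alternation of longer and shorter intervals marks where each beat begins.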

In the pocket: backbeat delay. The notion of a backbeat is indigenous to the modern drum kit, an instrument pioneered by African-Americans in this century. It consists of a strongly accented snare drum stroke or handclap on beats two and four of a four-beat metric cycle, where the beat is typically a moderate tactus rate [CD-47].

The backbeat appears to have arisen in the middle of this century, as the popular swing rhythm yielded to the even more popular, more bombastic rock and roll rhythms of artists such as Little Richard and Chuck Berry.

In his musical interpretation of Stuckey's (1987) study of the culture of enslaved Africans and its influence on modern African-American culture, Floyd (1995) discusses the important African diasporic cultural ritual known as the ring shout as a distinctive space in which, among other things, music and dance were fused. This activity "helped preserve ... what we have come to know as the characterizing and foundational elements of African-American music," including "constant repetition of rhythmic and melodic figures and phrases," "hand clapping, foot patting, and approximations thereof," and "the metronomic pulse that underlies all music." (Floyd 1995: 6) As a cultural model, the ring shout serves for Stuckey as a hermeneutical point of departure in the study of African-American art forms. It provides an alternative lens through which to view these later practices, a lens that is grounded on African, rather than European, concepts and aesthetics. (See Rosenbaum 1998 for more documentation of the ring shout.)

The backbeat that is so prevalent in postwar African-American popular music seems to reference the role of the body in the ring shout -- the bass drum (struck with a foot pedal in the modern drumset) and snare drum (struck manually with a stick) replacing the stomp and clap, respectively. In fact, a real or synthetic handclap sound is often superimposed on the backbeat's snare drum sound in popular urban dance music. The hard-edged repetitiveness of the backbeat embodies the cyclic, earthy atmosphere of the ring-shout ritual. Though sometimes dismissed as dull and monotonous, the backbeat taps into the hypnotic, functional role of repetition in such rituals, in which steady, moderate tempo, rhythmic ostinati, and physical body motion (stomping and clapping) were combined in a collective setting to create a shared multisensory experience. It seems plausible that the earliest musical activities of humankind possessed many of these qualities. The backbeat is best understood as a contemporary, popular remnant of what is probably some very ancient human musical behavior, filtered through a sophisticated, stylized African ritual and through centuries of African-American musical development.

The curious point about the backbeat in practice is that when performed, it displays a microscopic lopsidedness. If we consider the downbeat to be exactly when the bass drum is struck, then the snare drum is very often played ever so slightly later than the midpoint between two consecutive pulses [CD-48]. Often musicians are aware of this to some degree, and they have a term for it: the drummer is said to play "in the pocket." While perhaps unaware of the exact temporal details of this effect, a skilled musician or listener in this genre hears this kind of expressive microdelay as "relaxed" or "laid back" as opposed to "stiff" or "on top." This effect is much subtler than the salient rhythmic categorization of the long and short durations of swing. It is a minuscule adjustment at the level of the tactus, rather than the substantial fractional shift of rhythmic subdivisions in swing.

What function does this delay structure have? Perhaps the delay functions as a kind of accent, since it involves the postponement of an expected consequent (Meyer 1956). It seems plausible that the optimum snare-drum offset that we call the "pocket" is that precise rhythmic position that maximizes the accentual effect of a delay without upsetting the ongoing sense of pulse. This involves the balance of two opposing forces: the force of regularity that resists delay, and the backbeat accentuation that demands delay.

Note that the concept of a backbeat, and the slight delay associated with it, does not pertain if a single voice is used for both the downbeat and the backbeat. (As an example, the urban dance-music genre known as "house" features an isochronous bass drum on all four beats, with the snare-drum backbeat occasionally dropping out.) The effect seems tied to the difference between the two sounds, and perhaps also to the actual sounds themselves and the imagined bodily activity that gives rise to them. In a related study, Fraisse (1982) reports,

This delay architecture amounts to the subject's hand coming after the foot for perceived synchronization, since the anticipatory "error" is greater for the foot. This seems to predict that a regularly alternating stomp-clap pattern would contain a microscopic asymmetry similar to that found in the modern backbeat. Given that the bass drum both references and is played by the foot, and similarly the snare drum both points to and involves the hand, it is possible that this resultant delay structure was transferred to the drumset. Though these arguments are quite speculative, it is plausible that there is an important relationship between the backbeat and the body, informed by the African-American cultural model of the ring shout.



Thelonious Monk plays "I'm Confessin'." [CD-49] One of the most fascinating skills displayed by Monk and many other pianists of the genre is a high degree of independence between the two hands, to the degree that one hand can appear to perform rhythms that are ambiguously if at all related to those performed by the other. Often, as in stride piano, this takes the form of a steady pulse or repetitive bass rhythm in the left hand (the "ground"), and upper-register, rhythmically free melodies in the right hand (the "figure"). A classic example is Monk's 1963 solo recording of "I'm Confessin' (That I Love You)" (Monk 1998). In this piece, after carrying on in this expressive stride fashion for some time, the last two bars of the first chorus give rise to an improvised melodic fragment that rhythmically seems to stretch and tumble into the next bar [CD-50].

In this excerpt, the melodic structure in the right hand temporarily overrides and upsets the underlying rhythmic structure, only to be righted again. We can interpret Monk's unquestionably gripping display here as the rhythmic equivalent of a struggle, one that threatens the norm of established pulse regularity set by what has come before. It seems to offer an example of a case in which such regularity is sacrificed briefly to allow for a case of extreme rhythmic expression. But note that the sense of pulse is never lost; Monk leaves out a couple of quarter-note chords in the left hand, but otherwise provides strong and accurate pulse reinforcement in the stride style. The rhythmic underpinning of the left hand compensates for the apparent deviation from regularity.

When I played the recording of this piece for a roomful of cognitive science undergraduates, most of whom presumably had no familiarity with jazz, this excerpt elicited a burst of spontaneous laughter. Something about Monk's delivery is communicative enough to transcend what one might expect of the traditional confines of genre. Nearly upsetting the regular pulse, Monk takes a chance and chooses to follow through on a melodic idea that momentarily takes him rhythmically far afield.

The question of whether Monk "intended" to play this in exactly this way is a pejorative one, akin to reifying the role of "mistakes" in jazz (as in Walser 1995). From the perspective of an improvisor, the notion of a mistake is supplanted by the concept of displaying one's interaction with the structure suggested by the sonic environment. It is never clear what is "supposed" to happen in improvised music, so it makes no sense to talk about mistakes. This improvisation-friendly framework allows for the possibility of musical exploration and experimentation, including impromptu rhythmic variation of the sort described here, without invoking a notion of mistakes.

Ahmad Jamal plays "But Not for Me." [CD-51] A wonderfully extemporaneous, playful spirit is captured masterfully in pianist Ahmad Jamal's 1952 trio version of the standard tune "But Not for Me." In this piece, Jamal manipulates his relationship to the pulse actively and voluntarily through the skillful use of microtiming variation. Nearly every single phrase in Jamal's rendition contains some interesting microrhythmic manipulations, but here I will focus on one fragment, namely the end of the first chorus into the beginning of the second chorus. In measure 31, Jamal initiates a repeating three-beat figure in the four-beat metric context. This additive rhythmic technique is a common one in African-American music, and Jamal carries it out to a humorous extreme, letting the blues-inflected figure cycle twelve full times (nine measures). The first four measures of this passage are displayed below. I have adhered to the convention of representing swung rhythms with regular eighth notes, but it should be understood that there is much more to this passage than meets the eye. In particular, Jamal plays this figure extremely behind the beat, so much so as to enhance the humorous effect of the repeating melodic figure by casting it in starker relief against the more ordinary rhythmic background [CD-52].

In these four measures, the quarter note averages 469 milliseconds (128 beats per minute). The note events in the piano that are displayed as occurring on the beat tend to begin actually around 40% of a beat later than the drummer's rimshots, which are indicated with x's above. This places him more than a triplet behind the beat. Furthermore, Jamal's second eighth note in each swung pair tends to occur about 85% of the way through the beat. This means that the swing ratio here is effectively inverted; the first eighth note in a delayed pair lasts about 45% of a beat (less than half), and the second lasts about 55% (more than half). It would appear that the perception of swing arises due to complex variations in timing, intensity, or articulation; in this case, it is not merely a matter of achieving the "correct" microrhythmic ratio.

How does Jamal pull off this apparent rhythmic violation of an inverted swing? The answer seems to lie in his 40% phase shift relative to the beat established by the accompanying instruments. If, while maintaining this phase relationship, he were to adhere to the usual swing ratio of around 57%, then the second note in a swung pair would be close enough to the onset of the next beat (only a few percent early) that it would be heard as on-the-beat. By employing a relative anticipation of the second eighth note in each pair, Jamal avoids this problem, instead sounding squarely "between" the beats. The 40% delay also affords him enough rhythmic ambiguity so that the inverted swing does not sound jarring. Also, Jamal enhances the sense of swing by accenting the second of each pair (a common technique, as mentioned earlier). So here is a case in which one kind of rhythmic expression interacts with another; the usual long-short relationship of swing is altered in order to accommodate the "laid-back" quality of the melodic figure.
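The arithmetic behind this argument can be checked directly. The Python sketch below uses only the figures quoted in the text (a 469-millisecond quarter note, a 40% phase shift, a second eighth note at 85% of the beat, and a usual swing ratio of around 57%); the variable names are mine, and the "alternative" case is the hypothetical one considered above, not something Jamal plays.

```python
QUARTER_MS = 469.0    # average quarter-note duration in the excerpt
PHASE = 0.40          # piano "on-beat" notes land about 40% of a beat late
SECOND_EIGHTH = 0.85  # position of the second eighth note within the beat
USUAL_SWING = 0.57    # usual swing ratio cited in the text

bpm = 60000.0 / QUARTER_MS
delay_ms = PHASE * QUARTER_MS      # how late the "on-beat" piano notes are
triplet_ms = QUARTER_MS / 3.0      # one triplet subdivision, for comparison

# Durations of the two notes in a delayed pair, as fractions of a beat.
# The next "on-beat" piano note arrives at 1.0 + PHASE.
first_len = SECOND_EIGHTH - PHASE
second_len = (1.0 + PHASE) - SECOND_EIGHTH

# Hypothetical alternative: keep the phase shift but use the usual swing ratio.
alt_second = PHASE + USUAL_SWING   # position of the second note within the beat
alt_early = 1.0 - alt_second       # how far before the next beat it would fall

print(round(bpm))                                    # tempo in beats per minute
print(round(delay_ms, 1), round(triplet_ms, 1))      # delay vs. one triplet, in ms
print(round(first_len, 2), round(second_len, 2))     # inverted swing: short-long
print(round(alt_second, 2), round(alt_early * QUARTER_MS, 1))
```

Under the hypothetical alternative, the second note of each pair would fall only about 14 milliseconds before the next beat of the rhythm section, which is why it would tend to be heard as on the beat rather than between beats.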

What is accomplished by playing in this laid-back, behind-the-beat fashion? One might expect the same simple perceptual effects (such as enhancing stream segregation) if he instead played ahead of the beat, for example. Playing behind the beat is definitely a cultural aesthetic in African-American music, especially jazz. In an unpublished study, Bilmes (1996) found that a West African drummer played as often ahead of the beat as behind it, whereas one might observe casually that skilled jazz improvisors tend to play much more often behind than ahead. From the ecological point of view, playing behind the beat might be normally associated with a physical or mental state of relaxation, or might suggest a causal relationship in which the musical material is a reaction to the pulse. Such hypotheses would demand further investigation.

In this chapter I have discussed some aspects of rhythmic expression that are quite distinct from the common body of European classical musical performance techniques typically discussed. Instead of (or in addition to) expressive concepts like rubato, ritardando, and accelerando, we have seen deliberately asynchronous unisons, subtle separation of rapid consecutive notes, asymmetric subdivisions of a pulse, and microscopic delays. As further illustration, we have seen extremely deft manipulation of fine-scale rhythmic material in examples from the jazz idiom. I have chosen to focus on African and African-American musics because they often feature these concepts in isolation from the possible interference of tempo variation, and because they tend to involve percussive timbres which facilitate precise microrhythmic analysis. I have argued that African and African-American musics incorporate aesthetics that value these kinds of microrhythmic expression. However, I believe that these techniques are found to varying degrees in all music, including the European classical genres. In the next chapter, I present a representation for rhythmic structure that allows for the explicit manipulation of expressive microtiming of the variety discussed above.

