Mapping sonification for perception and action in motor skill learning

Introduction, Goal, and Scope

Real-time sonification of human movement (conversion of motion signals into sound) can be used as augmented feedback for motor skill learning. With sonification, motor skills can, in some instances, be learned more quickly and successfully (Sigrist et al., 2013a). The goal of such sonification systems is a permanent (or at least, lasting) improvement in performance at a physical task or skill, which persists in the absence of augmented feedback. Many experimental investigations of feedback, however, show that when performance is tested without feedback, a decline occurs (Park et al., 2000; Maslovat et al., 2009). This finding has become known as the “guidance effect” (Buchanan and Wang, 2012). It has been suggested that this effect is a consequence of learner overreliance on the “guidance” provided by augmented feedback, at the expense of task-intrinsic sensory feedback. For effective learning, this is clearly not desirable.

In this paper, we advocate a perception-action approach to sonification when used as feedback for skill learning, which may lead researchers and trainers to design more effective prototypes. We highlight three main issues: (1) the learner’s task should be conceived as perception-action based and sonification designed accordingly; (2) sonification should provide Ecological information for perception rather than propositional knowledge-of-performance; and (3) Ecologically meaningful sound morphologies should be harnessed effectively.


Successful coordination requires the pickup and use of event-structured information through perception and action (Gibson, 1972). Perceptual information available to a moving agent can be said to specify the state of the agent-environment system (i.e., the task); this enables the skilled agent to control movement and perceive its outcome “directly” (Warren, 2006; for a formal explanation of the relation between Ecological information and task dynamics, see Turvey et al., 1981). Through repeated interactions with a task, novices can become selectively sensitive to informational variables which best serve task goals, and become more adept at bringing these variables into use for coordination, a process described by Eleanor Gibson as “education of attention” (Gibson, 1969; Jacobs and Michaels, 2007). Motor skill learning is therefore characterized by the “tuning in” of perception and action: a progression toward the active pickup of better-specifying and more useful informational variables (Huys et al., 2009; Gray, 2010; Wilson et al., 2010a). Stoffregen and Bardy (2001) propose that action is controlled via the pickup of multisensory informational variables, which better specify the state of the perception-action system than can unimodal variables. Sonification can therefore provide higher-order information, available via interaction and specific to that interaction, which enables better perception and control of movement. With sonification, novices can practice with an enhanced, more responsive perceptual-motor workspace (defined as the emergent resources and constraints of organism and environment in the context of a task, which are perceptually available through dynamic interaction: see Newell et al., 1991), which employs sound as a helpful constraint on action.

A model of the informational variables available and useful for the learner in a task can be guided by existing literature, or refined by pilot testing. However, the most useful informational variable(s) for the perceiving learner may not necessarily correspond to the motor variable being tracked by the researcher as a measure of performance. In other words, measurement and experience are not isomorphic. As an example, consider research on sonified reaching and target tracking (Oscari et al., 2012; Schmitz and Bock, 2014; Boyer et al., 2016). The task as instantiated here is to track or reach for a target, while using whatever information is provided by the system to guide one’s effector/pointer. The variable of interest for measurement in this task (and others) is the absolute positional difference between hand/pointer position and target position (error). This variable is frequently sonified, by mapping error to a sonic variable such as pitch, amplitude or inter-aural panning, with mixed results (Konttinen et al., 2004; Rosati et al., 2012; Sigrist et al., 2013b). It is not certain that instantaneous positional error is a relevant variable for a moving individual in an everyday context. Everyday pointing, for example, is primarily a visuomotor task with a criterion for success often defined in social terms (Kennedy, 1985). It makes sense from the detached perspective of an experimenter to measure positional error as an objective performance index, but perhaps another, possibly higher-order variable might be more useful for the learner as a perceiver (Runeson, 1977). The choice of what to sonify may have additional implications for the guidance effect, as the next section will detail.
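To make the error-based mapping discussed above concrete, the following is a minimal Python sketch in which absolute positional error is mapped to the pitch of a tone. The function name, the frequency range, and the error ceiling are illustrative assumptions and are not taken from any of the cited studies.

```python
import math


def error_to_pitch(hand_xy, target_xy, base_hz=220.0, max_hz=880.0, max_error=0.5):
    """Map absolute positional error to a tone frequency in Hz.

    All default values are hypothetical: base_hz sounds at zero error,
    max_hz sounds at (or beyond) max_error, given in the same units as
    the positions.
    """
    # Euclidean distance between hand/pointer and target (the "error").
    error = math.dist(hand_xy, target_xy)
    # Clamp so that very large errors do not produce runaway pitches.
    ratio = min(error / max_error, 1.0)
    # Exponential interpolation: equal error steps give equal musical intervals.
    return base_hz * (max_hz / base_hz) ** ratio


# On target: the base pitch is heard.
print(error_to_pitch((0.0, 0.0), (0.0, 0.0)))
# Halfway to the error ceiling: one octave above the base pitch.
print(error_to_pitch((0.0, 0.0), (0.25, 0.0)))
```

A mapping of this kind sonifies the experimenter's performance measure. Whether that measure is also a variable the learner can use to control movement is the open question raised in the paragraph above.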

Learning and the Guidance Effect

An analysis of the task can enable identification of the perception-action resources used by a skilled performer in an everyday context, including important informational variables and control parameters (Wilson and Golonka, 2013; Bruineberg and Rietveld, 2014). Sonification systems can then be designed to highlight these same useful variables/parameters, rather than to create new parameters, control of which might be independent of how the task would be performed without feedback. The value of highlighting task-intrinsic information lies in the possibility of avoiding the guidance effect of augmented feedback. As an example, Ronsse et al. (2011) provided direct sonification of changes in hand-movement direction with a set of two tones. Participants were required to learn a 90° out-of-phase bimanual wrist coordination task in which a two-tone isochronous galloping rhythm was produced by perfect performance. When sonification was withdrawn, participant performance remained stable. In contrast, a second group of participants who had practiced with movement-coupled graphical feedback showed a decline in performance following withdrawal. In this task, sonification preserved the spatio-temporal structure of relevant task-intrinsic events in the perceptual-motor workspace; therefore, the information required to control movement was perceivable with or without feedback. Sonification had acted as a guide for its pickup. Conversely, graphical feedback provided information for the direct control of bimanual phase-relationship; its removal meant that coordination was no longer possible as the required information was absent (see Wilson et al., 2010b). If learning is seen as education of attention (Jacobs and Michaels, 2007), the need to sonify task-intrinsic events to avoid the guidance effect is clear (for similar findings, see Dyer et al., 2017).
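The link between a 90° phase relationship and a galloping rhythm can be illustrated with a short Python sketch. It assumes, purely for illustration, that each hand triggers one tone per movement cycle at a direction reversal; this is a simplification and not a description of the apparatus used by Ronsse et al. (2011). The function names and parameters are hypothetical.

```python
def reversal_onsets(period_s, phase_offset_deg, n_cycles):
    """Return time-sorted (onset_time, hand) tone events for two hands.

    Assumption: hand "A" triggers its tone at the start of every cycle,
    and hand "B" triggers its tone later by the given phase offset.
    """
    lag = period_s * phase_offset_deg / 360.0
    events = []
    for k in range(n_cycles):
        events.append((k * period_s, "A"))
        events.append((k * period_s + lag, "B"))
    return sorted(events)


def inter_onset_intervals(events):
    """Return the time gaps between successive tone onsets."""
    times = [t for t, _ in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


# A 1 s movement cycle with a 90 degree offset gives alternating
# short and long gaps, i.e., a galloping pattern.
events = reversal_onsets(period_s=1.0, phase_offset_deg=90.0, n_cycles=3)
print(inter_onset_intervals(events))  # [0.25, 0.75, 0.25, 0.75, 0.25]
```

Under these assumptions, any drift away from the 90° relationship changes the ratio of short to long intervals, so the rhythm carries the timing of the movement events themselves rather than a separate error score.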

Knowledge and Information

Current understanding of augmented feedback and its role in performance enhancement has its foundations in classic studies on knowledge-of-results and knowledge-of-performance (KR/KP) feedback (Adams, 1971). Today, sonification in psychology is still widely discussed using these terms (Konttinen et al., 2004; Dyer et al., 2015; Sors et al., 2015; Fujii et al., 2016). However, this continuity belies a subtle shift in what these terms have come to mean over time, as technology has improved to the point where real-time sonification as KP is possible. In the late twentieth century, both KP and KR meant explicit, propositional knowledge: typically verbal (or verbalisable) knowledge about movement outcome (KR) or performance (KP). Older reviews of KP/KR research (Adams, 1971; Salmoni et al., 1984) show that motor skill learning was explicitly conceptualized as a knowledge- and memory-based, problem-solving task, soluble by the application of explicit knowledge and rules (typically, coach-provided guidance, or scores/graphs of performance and error). The goal was to improve performance by delivering instructions, which could be applied to programming of motor output “intellectually,” i.e., independently of perception and action. Thomas and Thomas (1994) have argued that the traditional knowledge-based approach to motor skill learning underplays the role of selective sensitivity to perceptual information in skilled performance, catering mostly to the earliest “cognitive stage” of learning (Fitts and Posner, 1967).

Today, augmented feedback (including sonification) is often delivered concurrently with movement (Sigrist et al., 2013a), and could be considered something more like augmented information for online, perceptual control of action. However, the older style of thinking about feedback as explicit knowledge is evident in many modern implementations. This thinking manifests in the design of mappings intended to transmit a signal to the learner, which can be said to contain knowledge in the form of a description of current performance relative to an ideal (often a sonified error score). This signal must be parsed by the learner with the application of a remembered mapping rule, and the decoded knowledge applied to update ongoing movement. This bears directly on sonified feedback given the requirement for learner interpretation associated with such rule-based mappings. Time required to interpret the knowledge contained in an auditory error signal puts an intellectual barrier between perception and action, which may not be conducive to fluid, skilful performance. The perception-action approach contends that “knowledge” related to skill is primarily enacted via perception-action engagement with the dynamics of a task, rather than through the rote application of schemas and rules (for this argument, see Ingold, 2000, 2001; van Dijk et al., 2015). Effenberg and colleagues (Vinken et al., 2013; Effenberg et al., 2016) argue similarly that a “direct” approach to mapping in which sound quality perceptually correlates with the dynamics of ongoing movement is most appropriate for motor skill learning. Learners can learn to use sonic information to perceive movement directly, with no need for cognitive elaboration, as the control of movement is directly related to the sensory consequences of movement (for a related example, see Stienstra et al., 2011). This approach preserves the immediacy of Ecological perception, and “knowledge” emerges from two-way interaction rather than being translated from an incoming coded signal.

Ecologically-Meaningful Sound

A theme in this article has been that sonification researchers interested in motor skill learning should understand that learners are primarily perceivers, with pre-existing skills. Perception of something meaningful or action-relevant in a sonic experience can be conceptualized as an active listening skill which is related to the sociocultural context of its development (Steenson and Rodger, 2015). It follows then that listeners might already know how to listen to and interact with certain sound morphologies, in certain contexts/tasks. The use of sounds which cater to existing bodily skills, such as physical modeling of metallic scraping in a writing task (Danna et al., 2015), constrains the learner’s relation to the task and guides perception and action more effectively than might be possible with more basic pitch mapping (see also Roddy and Furlong, 2014). However, Dubus and Bresin (2013) show that pitch mapping (of a pure tone or the center frequency of filtered noise) remains a common strategy, not only in sonification generally but also in sonification of motor tasks. Most individuals have little experience using a pure tone for movement coordination; such a mapping may therefore be challenging and require extensive training before it can be used. The use of already-familiar sound morphologies (e.g., melodies, rhythms, sounds of real-life noisy interactions) may produce more “intuitive” feedback systems. What we advocate here is not a distinction which is sometimes made, between “ecological” sounds of the natural world on the one hand and “artificial,” synthetic sounds on the other. “Meaningfulness,” in an Ecological sense, is defined relative to a perceiver’s experience using the information which a sound source provides.


Conclusion

In this paper, we have argued for a perception-action approach to motor skill learning as the basis for understanding the utility of sonification as augmented feedback. If information supports performance, then sonification should highlight task-intrinsic information to counter the guidance effect. A clearer definition of what “information” is can help guide the design of sonified feedback whereby knowledge is a product of interaction rather than transmission (e.g., see Wilson and Golonka, 2013). Lastly, learners have abundant socioculturally-situated listening experience already; it is therefore undershooting the potential of sonification as feedback to rely only on unfamiliar or esoteric sound morphologies like pure tones. To speculate, the common root of these three issues for design may be related to the existence of different frames of reference for the experimenter/designer and the learner. It is more helpful to see the learner as a situated agent with a repertoire of existing perception-action skills than as an engine that must apply propositional knowledge to enact a desired change in state.

Author Contributions

JD conceived the topic in discussion with PS and MR and also drafted the manuscript. All three authors were involved in redrafting and editing of the manuscript.


Funding

This work was funded in part by a grant from the Northern Ireland Department of Employment and Learning awarded to the lead author to undertake this research as part of his Ph.D.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


References

Adams, J. A. (1971). A closed-loop theory of motor learning. J. Mot. Behav. 3, 111–150. doi: 10.1080/00222895.1971.10734898

Boyer, É. O., Bevilacqua, F., Susini, P., and Hanneton, S. (2016). Investigating three types of continuous auditory feedback in visuo-manual tracking. Exp. Brain Res. 235, 691–701. doi: 10.1007/s00221-016-4827-x

Bruineberg, J., and Rietveld, E. (2014). Self-organization, free energy minimization, and optimal grip on a field of affordances. Front. Hum. Neurosci. 8:599. doi: 10.3389/fnhum.2014.00599

Buchanan, J. J., and Wang, C. (2012). Overcoming the guidance effect in motor skill learning: feedback all the time can be beneficial. Exp. Brain Res. 219, 305–320. doi: 10.1007/s00221-012-3092-x

Danna, J., Fontaine, M., Paz-Villagrán, V., Gondre, C., Thoret, E., Aramaki, M., et al. (2015). The effect of real-time auditory feedback on learning new characters. Hum. Mov. Sci. 43, 216–228. doi: 10.1016/j.humov.2014.12.002

Dubus, G., and Bresin, R. (2013). A systematic review of mapping strategies for the sonification of physical quantities. PLoS ONE 8:e82491. doi: 10.1371/journal.pone.0082491

Dyer, J., Stapleton, P., and Rodger, M. W. M. (2015). Sonification as concurrent augmented feedback for motor skill learning and the importance of mapping design. Open Psychol. J. 8, 1–11. doi: 10.2174/1874350101508010192

Dyer, J., Stapleton, P., and Rodger, M. W. M. (2017). Transposing musical skill: sonification of movement as concurrent augmented feedback enhances learning in a bimanual task. Psychol. Res. 81, 850–862. doi: 10.1007/s00426-016-0775-0

Effenberg, A. O., Fehse, U., Schmitz, G., Krueger, B., and Mechling, H. (2016). Movement sonification: effects on motor learning beyond rhythmic adjustments. Front. Neurosci. 10:219. doi: 10.3389/fnins.2016.00219

Fitts, P. M., and Posner, M. I. (1967). Human Performance. Belmont, CA: Brooks/Cole Publishing Company.

Fujii, S., Lulic, T., and Chen, J. L. (2016). More feedback is better than less: learning a novel upper limb joint coordination pattern with augmented auditory feedback. Front. Neurosci. 10:251. doi: 10.3389/fnins.2016.00251

Gibson, E. J. (1969). Principles of Perceptual Learning and Development. New York, NY: Appleton-Century-Crofts.

Gibson, J. J. (1972). “A theory of direct visual perception,” in The Psychology of Knowing, eds J. Royce and W. Rozenboom (New York, NY: Gordon & Breach), 76–89.

Gray, R. (2010). Expert baseball batters have greater sensitivity in making swing decisions. Res. Q. Exerc. Sport 81, 373–378. doi: 10.5641/027013610X13088600028897

Huys, R., Cañal-Bruland, R., Hagemann, N., Beek, P. J., Smeeton, N. J., and Williams, A. M. (2009). Global information pickup underpins anticipation of tennis shot direction. J. Mot. Behav. 41, 158–171. doi: 10.3200/JMBR.41.2.158-171

Ingold, T. (2000). The Perception of the Environment. Abingdon: Taylor & Francis.

Ingold, T. (2001). “Beyond art and technology: the Anthropology of Skill,” in Anthropological Perspectives on Technology, ed H. Schiffer (Albuquerque, NM: University of New Mexico Press), 17–33.

Jacobs, D. M., and Michaels, C. F. (2007). Direct learning. Ecol. Psychol. 19, 321–349. doi: 10.1080/10407410701432337

Kennedy, J. M. (1985). Convergence principle in blind people’s pointing. Int. J. Rehabil. Res. 8, 207–210.

Konttinen, N., Mononen, K., Viitasalo, J. T., and Mets, T. (2004). The effects of augmented auditory feedback on psychomotor skill learning in precision shooting. J. Sport Exerc. Psychol. 26, 306–316. doi: 10.1123/jsep.26.2.306

Maslovat, D., Brunke, K. M., Chua, R., and Franks, I. M. (2009). Feedback effects on learning a novel bimanual coordination pattern: support for the guidance hypothesis. J. Mot. Behav. 41, 45–54. doi: 10.1080/00222895.2009.10125923

Newell, K. M., McDonald, P. V., and Kugler, P. N. (1991). “The perceptual-motor workspace and the acquisition of skill,” in Tutorials in Motor Neuroscience, eds J. Requin and G. E. Stelmach (Dordrecht: Springer), 95–108.

Oscari, F., Secoli, R., Avanzini, F., Rosati, G., and Reinkensmeyer, D. J. (2012). Substituting auditory for visual feedback to adapt to altered dynamic and kinematic environments during reaching. Exp. Brain Res. 221, 33–41. doi: 10.1007/s00221-012-3144-2

Park, J. H., Shea, C. H., and Wright, D. L. (2000). Reduced-frequency concurrent and terminal feedback: a test of the guidance hypothesis. J. Mot. Behav. 32, 287–296. doi: 10.1080/00222890009601379

Roddy, S., and Furlong, D. (2014). Embodied aesthetics in auditory display. Organ. Sound 19, 70–77. doi: 10.1017/S1355771813000423

Ronsse, R., Puttemans, V., Coxon, J. P., Goble, D. J., Wagemans, J., Wenderoth, N., et al. (2011). Motor learning with augmented feedback: modality-dependent behavioral and neural consequences. Cereb. Cortex 21, 1283–1294. doi: 10.1093/cercor/bhq209

Rosati, G., Oscari, F., Spagnol, S., Avanzini, F., and Masiero, S. (2012). Effect of task-related continuous auditory feedback during learning of tracking motion exercises. J. Neuroeng. Rehabil. 9, 1–13. doi: 10.1186/1743-0003-9-79

Runeson, S. (1977). On the possibility of “smart” perceptual mechanisms. Scand. J. Psychol. 18, 172–179. doi: 10.1111/j.1467-9450.1977.tb00274.x

Salmoni, A. W., Schmidt, R. A., and Walter, C. B. (1984). Knowledge of results and motor learning: a review and critical reappraisal. Psychol. Bull. 95, 355–386.

Schmitz, G., and Bock, O. (2014). A comparison of sensorimotor adaptation in the visual and in the auditory modality. PLoS ONE 9:e107834. doi: 10.1371/journal.pone.0107834

Sigrist, R., Rauter, G., Riener, R., and Wolf, P. (2013a). Augmented visual, auditory, haptic, and multimodal feedback in motor learning: a review. Psychon. Bull. Rev. 20, 21–53. doi: 10.3758/s13423-012-0333-8

Sigrist, R., Rauter, G., Riener, R., and Wolf, P. (2013b). Terminal feedback outperforms concurrent visual, auditory, and haptic feedback in learning a complex rowing-type task. J. Mot. Behav. 45, 455–472. doi: 10.1080/00222895.2013.826169

Sors, F., Murgia, M., Santoro, I., and Agostini, T. (2015). Audio-Based Interventions in Sport. Open Psychol. J. 8, 212–219. doi: 10.2174/1874350101508010212

Steenson, C., and Rodger, M. W. M. (2015). Bringing sounds into use: thinking of sounds as materials and a sketch of auditory affordances. Open Psychol. J. 8(Suppl 3), 174–182. doi: 10.2174/1874350101508010174

Stienstra, J., Overbeeke, K., and Wensveen, S. (2011). “Embodying complexity through movement sonification,” in Proceedings of the 9th ACM SIGCHI Italian Chapter International Conference on Computer-Human Interaction: Facing Complexity (New York, NY: ACM Press), 39–44.

Stoffregen, T. A., and Bardy, B. G. (2001). On specification and the senses. Behav. Brain Sci. 24, 195–213. doi: 10.1017/S0140525X01003946

Thomas, T., and Thomas, J. R. (1994). Developing expertise in sport: the relation of knowledge and performance. Int. J. Sport Psychol. 25, 295–312.

Turvey, M. T., Shaw, R. E., Reed, E. S., and Mace, W. M. (1981). Ecological laws of perceiving and acting: in reply to Fodor and Pylyshyn. Cognition 9, 237–304. doi: 10.1016/0010-0277(81)90002-0

van Dijk, L., Withagen, R., and Bongers, R. M. (2015). Information without content: a Gibsonian reply to enactivists’ worries. Cognition 134, 210–214. doi: 10.1016/j.cognition.2014.10.012

Vinken, P. M., Kröger, D., Fehse, U., Schmitz, G., Brock, H., and Effenberg, A. O. (2013). Auditory coding of human movement kinematics. Multisens. Res. 26, 533–552. doi: 10.1163/22134808-00002435

Warren, W. H. (2006). The dynamics of perception and action. Psychol. Rev. 113, 358–389. doi: 10.1037/0033-295X.113.2.358

Wilson, A. D., and Golonka, S. (2013). Embodied cognition is not what you think it is. Front. Psychol. 4:58. doi: 10.3389/fpsyg.2013.00058

Wilson, A. D., Snapp-Childs, W., and Bingham, G. P. (2010a). Perceptual learning immediately yields new stable motor coordination. J. Exp. Psychol. Hum. Percept. Perform. 36, 1508–1514. doi: 10.1037/a0020412

Wilson, A. D., Snapp-Childs, W., Coats, R., and Bingham, G. P. (2010b). Learning a coordinated rhythmic movement with task-appropriate coordination feedback. Exp. Brain Res. 205, 513–520. doi: 10.1007/s00221-010-2388-y