A query-based framework to handle data issues in concatenative sound synthesis

Bibliographic Details
Main Authors: Mohd. Norowi, Noris, Miranda, Eduardo Reck
Format: Conference or Workshop Item
Language: English
Published: 2014
Online Access: http://psasir.upm.edu.my/id/eprint/38837/1/38837.pdf
Description
Summary: Concatenative Sound Synthesis (CSS) is a data-driven method to synthesise new sounds. It involves taking in a sound, decomposing it into smaller sound segments, analyzing their spectral and other auditory content, and then searching a database of other sound segments for a match to each. The selected segments are then concatenated in sequence and resynthesised to produce new sounds that are based on the original. However, with the increase in processing power, hard disk capacity and network bandwidth, the amount of audio information that can be extracted from each sound segment can become excessive, rendering it useless in aiding the matching process. This study looks at the current approaches adopted in matching sound segments in CSS and discusses the challenges that arise from them. These include the tradeoffs of extracting huge, multi-dimensional audio features and the need to understand human sound perception in order to minimize the synthesis of mismatching segments. To improve similarity results, a query-based CSS framework is proposed. A proof-of-concept, ConQuer, was also developed based on this framework; it offers users parametric control so that they can communicate their intended creations to the system for synthesis.
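
The general CSS pipeline described in the summary (segment, analyse, match against a database, concatenate) can be illustrated with a minimal sketch. This is not the ConQuer implementation and is not taken from the paper: the function names, the two toy features (RMS energy and zero-crossing rate), the fixed segment size, and the `weights` parameter are all assumptions made for illustration. The weights are only a loose stand-in for the idea of letting a user steer which features matter during matching.

```python
import math

def segment(signal, size):
    """Split a list of samples into consecutive fixed-size segments.
    Any incomplete tail is dropped. Assumes size >= 2."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def features(seg):
    """Toy descriptors (an assumption, not the paper's feature set):
    RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in seg) / len(seg))
    crossings = sum(1 for a, b in zip(seg, seg[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(seg) - 1)
    return (rms, zcr)

def closest(target_feat, corpus_feats, weights):
    """Return the index of the corpus segment with the smallest
    weighted Euclidean distance to the target features."""
    def dist(feat):
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(weights, target_feat, feat)))
    return min(range(len(corpus_feats)), key=lambda i: dist(corpus_feats[i]))

def concatenative_synthesis(target, corpus, size, weights=(1.0, 1.0)):
    """Replace each target segment with its nearest corpus segment
    and concatenate the selections in order."""
    corpus_segs = segment(corpus, size)
    corpus_feats = [features(s) for s in corpus_segs]
    output = []
    for seg in segment(target, size):
        best = closest(features(seg), corpus_feats, weights)
        output.extend(corpus_segs[best])
    return output

# Hypothetical data: a quiet oscillating segment and a loud flat segment.
corpus = [0.1, -0.1, 0.1, -0.1, 0.9, 0.9, 0.9, 0.9]
target = [1.0, 1.0, 1.0, 1.0, 0.2, -0.2, 0.2, -0.2]
result = concatenative_synthesis(target, corpus, size=4)
```

With only two features the nearest match is easy to interpret. The summary's concern is that as the number of extracted features grows, a plain distance over all of them becomes less informative, which is one motivation for giving the user control over the query.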