Multi-window Comparison of SIR Performance in Extraction of Mono-aural Vocal and Non-vocal Components in REPET

Kher, Vansha; Lamba, T.S. [Guided by]

Please use this identifier to cite or link to this item: http://www.ir.juit.ac.in:8080/jspui/jspui/handle/123456789/5312

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kher, Vansha	-
dc.contributor.author	Lamba, T.S. [Guided by]	-
dc.date.accessioned	2022-07-28T16:21:29Z	-
dc.date.available	2022-07-28T16:21:29Z	-
dc.date.issued	2015	-
dc.identifier.uri	http://ir.juit.ac.in:8080/jspui//xmlui/handle/123456789/5312	-
dc.description.abstract	Speech is the vocalized form of human communication. In linguistics (articulatory phonetics), normal human speech is said to be produced by pulmonary pressure from the lungs, which creates phonation in the glottis and larynx; the vocal tract then modifies this sound to generate different vowels and consonants. Speech is composed of three parts: articulation, voice, and fluency. The message or information communicated through speech is intrinsically discrete; i.e., it can be represented by a concatenation of elements from a finite set of symbols. Phonemes are the set of symbols into which every speech sound can be classified. Thus, phonemes are the basic units of a language's phonology, and they combine with other phonemes to form meaningful units called morphemes. Audio signals are sounds whose frequencies lie within the human auditory range. The separation of vocals and music has become an important problem in automatic karaoke, vocalist identification, and audio pre-processing. Distinguishing the varying lead vocals from the background music in an audio recording is a demanding task. Speech-separation research usually employs the time-frequency (TF) masking technique, which is ultimately relevant to hearing-aid design. The core principle of music that is exploited to discriminate the underlying non-vocals from the vocals (speech) is repetition, a fundamental principle in the field of Music Information Retrieval (MIR) and a premise of music as an art. Repetition is especially characteristic of pop songs, where the singer often overlays frequently changing vocals on a periodically repeating background in a mixture.
The basic approach of this dissertation is to identify the periodically repeating segments in audio excerpts, compare them with a repeating model, and finally separate the repeating musical patterns via TF masking. A TF mask is built on the TF representation of a signal. In this project, the quality of the foreground vocals and the accompanying background is analyzed in terms of the Signal-to-Interference Ratio (SIR), using the Analysis of Variance (ANOVA) method on musical audio of different genres. SIR values obtained with Hamming, Hanning, and Blackman windows are compared using MATLAB. The comparison concludes that separating mono-aural vocal and non-vocal components with the Blackman window yields better SIR values and separation quality than with the Hanning and Hamming windows.	en_US
dc.language.iso	en	en_US
dc.publisher	Jaypee University of Information Technology, Solan, H.P.	en_US
dc.subject	MATLAB	en_US
dc.subject	Beat spectrum	en_US
dc.subject	Music separation	en_US
dc.subject	Music information retrieval	en_US
dc.title	Multi-window Comparison of SIR Performance in Extraction of Mono-aural Vocal and Non-vocal Components in REPET	en_US
dc.type	Project Report	en_US
Appears in Collections:	Dissertations (M.Tech.)

Files in This Item:

File	Description	Size	Format
Multi-window Comparison of Sir Performance in Extraction of Mono-aural Vocal and Non-vocal Components in Repet.pdf		8.74 MB	Adobe PDF
