Music content analysis on audio quality and its application to music retrieval

MUSIC CONTENT ANALYSIS ON AUDIO QUALITY AND ITS APPLICATION TO MUSIC RETRIEVAL

CAI JINGLI (A0095623B)
(B.Sc., East China Normal University)

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
2015

Declaration

I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously.

CAI JINGLI (A0095623B)
Jan 2015

Acknowledgments

During my stay in the Sound and Music Computing (SMC) group, I had the fortune to experience an atmosphere of motivation, support, and encouragement that was crucial for progress in my research activities as well as my personal growth.

First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Wang Ye, who has supported and guided me through my two years of study and research. He was always there to help me and to offer suggestions and guidance on my work. I am deeply inspired by his passion and his diligence.

I would also like to thank all who were directly or indirectly involved in my research projects. I thank Zhonghua Li, Ju-Chiang Wang, Zhiyan Duan, Shenggao Zhu and Sam Fang for their collaboration and help. I also wish to thank my other friends in the SMC lab and in daily life, who supported and helped me in various aspects.

I also want to thank the School of Computing for giving me the opportunity to study here and for providing me with financial support.

Finally, I would like to express my deepest appreciation to my parents, who have always supported and encouraged me in my study and life.

Contents

List of Figures
List of Tables
1 Introduction
  1.1 Background and Motivation
  1.2 Contribution
  1.3 Chapter Plan
2 Literature Survey
  2.1 Audio Quality Assessment
    2.1.1 Audio Quality Standardization
    2.1.2 Research on Audio Quality of Multimedia Signals
    2.1.3 Research on Audio Quality of Music
  2.2 Music Search Engine
    2.2.1 Research on Multidimensional Music Search Engine
    2.2.2 Research on Personalized Music Search Engine
3 The Approach for Music Quality Assessment
  3.1 Framework
    3.1.1 Data Collection
    3.1.2 Audio Feature Sets
    3.1.3 Machine Learning for Ranking
    3.1.4 Baseline
    3.1.5 Segmentation
    3.1.6 System Fusion
  3.2 Segmentation and Segment Coupling
    3.2.1 Equalization-based Scheme
    3.2.2 Structure-based Scheme
  3.3 Fusion Strategy
    3.3.1 Early Fusion
    3.3.2 Late Fusion
4 Experiment and Result
  4.1 Objective Evaluation
    4.1.1 Performance Metric
    4.1.2 Baseline
    4.1.3 Effect of K for Equalization-based Scheme (ES)
    4.1.4 Early Fusion Study for Equalization-based Scheme (ES)
    4.1.5 Performance Study for Each Individual Segment with ES
    4.1.6 Early Fusion Study for the Confidence-aware (CA) Method
    4.1.7 Early Fusion Study for the Label-aware (LA) Method
    4.1.8 Late Fusion Study
    4.1.9 Efficiency Analysis
  4.2 Subjective Evaluation
    4.2.1 Methodology and Performance Metric
    4.2.2 Result and Discussion
5 Application to Music Retrieval: i2 MUSE
  5.1 System Description
    5.1.1 Interface
    5.1.2 Framework
  5.2 System Construction
    5.2.1 Music Dimensions and Data Collection
    5.2.2 Content Analysis and Indexing
    5.2.3 Dimensions Correlation Analysis
  5.3 Music Search
    5.3.1 Interactive Query Input
    5.3.2 Query Match and Ranking
  5.4 Experiment and Result
    5.4.1 System Evaluation
    5.4.2 Effectiveness Study
    5.4.3 Usability Study
  5.5 Personalized Music Search with Recommendation
6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
References

Summary

Nowadays, more and more users are uploading their recordings of live music concerts to video-sharing websites such as YouTube. The audio quality of these uploads, however, varies widely due to their recording conditions,
and most existing video search engines do not take audio quality into consideration when ranking their search results. Given that most users prefer live music videos with better audio quality, we propose the first automatic, non-reference audio quality assessment framework for online live music video search. We first construct two annotated datasets of live music recordings. The first dataset contains 500 human-annotated pieces, and the second contains 2,400 synthetic pieces systematically generated by adding noise effects to clean recordings. We then formulate the assessment task as a ranking problem and solve it using a learning-based scheme. Initially, we employ a “song-level” feature representation and a single learning-to-rank algorithm to predict the quality of the recordings. To improve the performance, we then explore various segmentation methods and “segment-level” feature representations to better account for the temporal characteristics of live music. Moreover, we develop a number of integrated learning methods to enhance the capability of learning-to-rank. To validate the effectiveness of our framework, we perform both objective and subjective evaluations. Results show that our framework significantly improves the ranking performance of live music recording retrieval and can prove useful for various real-world music applications.

In the end, we apply the work to our Intelligent & Interactive Multidimensional mUsic Search Engine (i2 MUSE), a novel content-based music search engine that enables users to input music queries with multiple dimensions efficiently. i2 MUSE provides seven musical dimensions (tempo, beat strength, genre, mood, instrument, vocal, and audio quality) for setting queries and retrieving music. We have conducted a pilot user study with 30 subjects and validated the effectiveness and usability of the system. The system has since been strengthened into a more functional domain-specific search engine, integrating music
retrieval and recommendation techniques for music therapy.

List of Figures

3.1 Framework
4.1 Performance based on overall quality using the binary and ranking labels of ADB-H
4.2 Performance based on overall quality using the binary and ranking labels of ADB-S
4.3 Performance of SVM-Rank on ADB-H using different audio feature sets
4.4 Performance of SVM-Rank on ADB-S using different audio feature sets
4.5 Performance study on ES using SVM-Rank with different numbers of segments
4.6 The performance of ES on each individual segment. Sub-figures (a), (b), and (c) show the results of K = 5, and sub-figures (d), (e), and (f) show the result of K = with the three LTR algorithms
5.1 The query interface of i2 MUSE
5.2 The framework of i2 MUSE
5.3 Mean Reciprocal Ranks of 10 example songs in the search-by-example mode
5.4 i2 MUSE suggestion function adoption rates (search-by-scenario mode)
5.5 i2 MUSE suggestion function adoption rates (search-by-example mode)

List of Tables

2.1 A five-point grading scale for subjective sound quality test
4.1 Summary table for all the experiment settings in the evaluation
4.2 Performance comparison among ES, Baseline and Random
4.3 Performance of CA (K = 5) on the most confident segments. ‘Seg idx’ stands for the segment index without order
4.4 Performance of LA (K = 4) on segments with different labels
4.5 Performance for segment-wise fusion (SWF) versus the optimal non-SWF case (NSW) on ES and CA. NDCG scores marked by and correspond to early fusion and individual segment, respectively
4.6 Performance study for model-wise fusion. NDCG scores marked by † and ‡ are derived using SVM-Rank and MART, respectively
4.7 Efficiency improvement over the Baseline (SVM-Rank)
4.8 The MRR performance on NDB with respect to ranking the best-quality (Best) and worst-quality (Worst) versions
5.1 Six music dimensions for data collection in i2 MUSE
5.2 Ten real-life scenarios
5.3 Usability ratings on i2 MUSE feedback functions. Scale: (very dissatisfied) – (very satisfied)

Chapter 5 Application to Music Retrieval: i2 MUSE

... the gait training. We use a similar multidimensional search engine for the users on a larger dataset (Million Song Dataset) [BMEWL11] with user rating information. The prototype contains three main components:
• Patient information collection. The user needs to provide some basic background information, such as age, language, disease, and music interests.
• Music filtering. With this information, we design a filter to narrow down the dataset and select random songs for the user as a first trial. Users can choose the appropriate music and add it to the playlist.
• Music recommendation. We employ a simple algorithm, collaborative filtering, to predict potential songs for the users once we have some initial information from the filtering feedback.

Our system is an ongoing project that aims to change the current process of music therapy and to benefit both therapists and patients, especially those in developing countries lacking medical resources.

Chapter 6 Conclusion and Future Work

6.1 Conclusion

In this thesis, we first proposed a novel framework to assess the audio quality for online live music search. Two unique live music datasets, ADB-H (500 human-annotated recordings) and ADB-S (2,400 synthetic recordings), were established for this study. They can also serve as additional benchmark datasets for developing learning-to-rank algorithms. To solve the audio quality problem, we applied signal processing and machine learning techniques and achieved high performance on quality ranking. Specifically, we explored the effect of different audio feature sets, different learning-to-rank algorithms, different segmentation and coupling methods, and different fusion strategies. We built the baseline with a song-level feature set and a single algorithm, and then improved it more effectively and efficiently by using
musical segmentation and fusion strategies. In the objective and subjective evaluations, we employed NDCG and MRR as the performance metrics, respectively. We have confirmed that our approach can solve the problem and is suitable for practical use.

Furthermore, we have implemented an application (i2 MUSE), which integrates the new dimension (audio quality) and aims to bridge the user intention gap. The search engine provides multi-dimension input, correlated dimension suggestion, and retrieval from the music database. A pilot user study has been conducted and has validated the effectiveness and usability of i2 MUSE. We have also incorporated a recommendation algorithm into the system to help users find more suitable music for their scenario, especially for gait training.

6.2 Future Work

As mobile devices and Internet access become ubiquitous, more and more users can upload and share their recordings. The increasing number of live versions of music is making audio quality a long-term problem. Our future work can concentrate on the following:
• Currently our system is relatively limited in track scale. We could expand the database with human annotations or utilize advanced techniques such as transfer learning [PY10] to synergize the human-annotated and synthetic datasets.
• We hope to integrate the audio quality aspect into existing textual music retrieval and recommendation systems. This way, we will be able to examine the effect of audio quality-based re-ranking on users' overall listening experience [SüH13].
• Our i2 MUSE also uses a limited dataset, and we will replace it with a larger one (the Million Song Dataset). Furthermore, we have integrated the recommendation component into the system, and we need a comprehensive user study to validate its performance.
• For a specific application of our system, we are planning to build an automatic online application to help patients and doctors to obtain suitable
training tracks for exercise and rehabilitation.

References

[AHD+13] Jakob Abeßer, Johannes Hasselhorn, Christian Dittmar, Andreas Lehmann, and Sascha Grollmisch. Automatic quality assessment of vocal and instrumental performances of ninth-grade and tenth-grade pupils. In International Symposium on Computer Music Multidisciplinary Research, 2013.
[AHF13] Nicolas Auguin, Shilei Huang, and Pascale Fung. Identification of live or studio versions of a song via supervised learning. In Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific, pages 1–4. IEEE, 2013.
[BHC+98] Chumki Basu, Haym Hirsh, William Cohen, et al. Recommendation as classification: Using social and content-based information in recommendation. In AAAI/IAAI, pages 714–720, 1998.
[BKL+11] Linas Baltrunas, Marius Kaminskas, Bernd Ludwig, Omar Moling, Francesco Ricci, Aykan Aydin, Karl-Heinz Lüke, and Roland Schwaiger. InCarMusic: Context-aware music recommendations in a car. In E-Commerce and Web Technologies, pages 89–100. Springer, 2011.
[BL05] Jayme Garcia Arnal Barbedo and Amauri Lopes. A new cognitive model for objective assessment of audio quality. Journal of the Audio Engineering Society, 53(1/2):22–31, 2005.
[BMEWL11] Thierry Bertin-Mahieux, Daniel PW Ellis, Brian Whitman, and Paul Lamere. The million song dataset. In ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24-28, 2011, Miami, Florida, pages 591–596. University of Miami, 2011.
[BPBE05] Marianna Boso, Pierluigi Politi, Francesco Barale, and E Enzo. Neurophysiology and neurobiology of the musical experience. Functional neurology, 21(4):187–191, 2005.
[Bra87] Karlheinz Brandenburg. OCF – A new coding algorithm for high quality sound signals. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP’87., volume 12, pages 141–144. IEEE, 1987.
[Bre96] Leo Breiman. Bagging predictors. Machine learning, 24(2):123–140, 1996.
[BS92] Karlheinz Brandenburg and Thomas Sporer. "NMR" and "Masking Flag": Evaluation of quality using perceptual criteria. In Audio Engineering Society Conference: 11th International Conference: Test & Measurement. Audio Engineering Society, 1992.
[CBWW10] Chih-Yi Chiu, Dimitrios Bountouridis, Ju-Chiang Wang, and Hsin-Min Wang. Background music identification through content filtering and min-hash matching. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pages 2414–2417. IEEE, 2010.
[CJG09] Dermot Campbell, Edward Jones, and Martin Glavin. Audio quality assessment techniques – A review, and recent developments. Signal Processing, 89(8):1489–1500, 2009.
[CL11] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
[CMD+13] Chris Cannam, Matthias Mauch, Matthew EP Davies, Simon Dixon, Christian Landone, Katy Noland, Mark Levy, Massimiliano Zanoni, Dan Stowell, and Luís A Figueira. MIREX 2013 entry: Vamp plugins from the Centre for Digital Music, 2013.
[CSS+07] Arshia Cont, Diemo Schwarz, Norbert Schnell, Christopher Raphael, et al. Evaluation of real-time audio-to-score alignment. In International Symposium on Music Information Retrieval (ISMIR), 2007.
[Dan] Van Dang. Ranklib – A library of learning to rank algorithms. [Online] http://www.cs.umass.edu/~vdang/ranklib.html
[Die03] Kristin Diehl. Personalization and decision support tools: Effects on search and consumer decision making. Advances in Consumer Research, 30:166–167, 2003.
[dLFdJ+08] Amaro A de Lima, Fabio P Freeland, Rafael A de Jesus, Bruno C Bispo, Luiz WP Biscainho, Sergio L Netto, Amir Said, A Kalker, R Schafer, Bowon Lee, et al. On the quality assessment of sound signals. In Circuits and Systems, 2008. ISCAS 2008. IEEE International Symposium on, pages 416–419. IEEE, 2008.
[Dow04] J Stephen Downie. The scientific evaluation of music information retrieval systems: Foundations and future. Computer Music Journal, 28(2):12–23, 2004.
[Ell] Dan Ellis. Dynamic time warp (DTW) in Matlab. Web resource, available: [Online] http://www.ee.columbia.edu/~dpwe/resources/matlab/dtw/
[Fri01] Jerome H Friedman. Greedy function approximation: A gradient boosting machine, 2001.
[Fuj99] Takuya Fujishima. Realtime chord recognition of musical sound: A system using Common Lisp Music. In Proc. ICMC, 1999, pages 464–467, 1999.
[FZ01] Hugo Fastl and Eberhard Zwicker. Psychoacoustics: facts and models. Springer, 2001.
[GNOT92] David Goldberg, David Nichols, Brian M Oki, and Douglas Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61–70, 1992.
[Hey94] Clinton Heylin. The great white wonders: A history of rock bootlegs. Viking Pr, 1994.
[HK06] Rainer Huber and Birger Kollmeier. PEMO-Q – A new method for objective audio quality assessment using a model of auditory perception. Audio, Speech, and Language Processing, IEEE Transactions on, 14(6):1902–1911, 2006.
[HKL12] Alan Hanjalic, Christoph Kofler, and Martha Larson. Intent and its discontents: the user at the wheel of the online video search engine. In Proceedings of the 20th ACM international conference on Multimedia, pages 1239–1248. ACM, 2012.
[HR10] Sheila S Hemami and Amy R Reibman. No-reference image and video quality estimation: Applications and human-motivated design. Signal processing: Image communication, 25(7):469–481, 2010.
[Int97] International Telecommunications Union Recommendation (ITU-R) BS.1116-1. Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems, 1997.
[Int98] International Telecommunications Union Recommendation (ITU-R) BS.1387. Method for objective measurements of perceived audio quality, 1998.
[Int01] International Telecommunications Union Recommendation (ITU-T) P.862. Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs, 2001.
[Int03a] International Telecommunications Union Recommendation (ITU-R) BS.1284-1. General methods for the subjective assessment of sound quality, 1997–2003.
[Int03b] International Telecommunications Union Recommendation (ITU-R) BS.1534-1. Methods for the subjective assessment of intermediate quality level of coding systems, 2003.
[JK02] Kalervo Järvelin and Jaana Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS), 20(4):422–446, 2002.
[Joa02] Thorsten Joachims. Optimizing search engines using clickthrough data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 133–142. ACM, 2002.
[Joa06] Thorsten Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 217–226. ACM, 2006.
[Kar85] Matti Karjalainen. A new auditory model for the evaluation of sound quality of audio systems. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP’85., volume 10, pages 608–611. IEEE, 1985.
[KCSL05] Fang-Fei Kuo, Meng-Fen Chiang, Man-Kwan Shan, and Suh-Yin Lee. Emotion-based music recommendation by association discovery from film music. In Proceedings of the 13th annual ACM international conference on Multimedia, pages 507–510. ACM, 2005.
[KEA06] Anssi P Klapuri, Antti J Eronen, and Jaakko T Astola. Analysis of the meter of acoustic musical signals. Audio, Speech, and Language Processing, IEEE Transactions on, 14(1):342–355, 2006.
[KN09] Lyndon Kennedy and Mor Naaman. Less talk, more rock: Automated organization of community-contributed collections of concert videos. In Proceedings of the 18th International Conference on World Wide Web, pages 311–320. ACM, 2009.
[LC08] Kyogu Lee and Markus Cremer. Segmentation-based lyrics-audio alignment using dynamic programming. In ISMIR, pages 395–400, 2008.
[LETF08] Olivier Lartillot, Tuomas Eerola, Petri Toiviainen, and Jose Fornari. Multi-feature modeling of pulse clarity: Design, validation and optimization. In ISMIR, pages 521–526. Citeseer, 2008.
[LGH08] Cyril Laurier, Jens Grivolla, and Perfecto Herrera. Multimodal music mood classification using audio and lyrics. In Machine Learning and Applications, 2008. ICMLA’08. Seventh International Conference on, pages 688–693. IEEE, 2008.
[LJK11] Weisi Lin and C-C Jay Kuo. Perceptual visual quality metrics: A survey. Journal of Visual Communication and Image Representation, 22(4):297–312, 2011.
[LL07] Jae Sik Lee and Jin Chun Lee. Context awareness by case-based reasoning in a music recommendation system. In Ubiquitous Computing Systems, pages 45–58. Springer, 2007.
[LL11] Arthur Lenoir and Rémi Landais. MuMa: A scalable music search engine based on content analysis. In Proceedings of the 19th ACM international conference on Multimedia, pages 753–754. ACM, 2011.
[LLK13] Tsung-Jung Liu, Weisi Lin, and C.-C. Jay Kuo. Image quality assessment using multi-method fusion. IEEE Transactions on Image Processing, 22(5):1793–1807, 2013.
[LLZ06] Lie Lu, Dan Liu, and Hong-Jiang Zhang. Automatic mood detection and tracking of music audio signals. IEEE Transactions on audio, speech, and language processing, 14(1):5–18, 2006.
[LOL03] Tao Li, Mitsunori Ogihara, and Qi Li. A comparative study on content-based music genre classification. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 282–289. ACM, 2003.
[LT07] Olivier Lartillot and Petri Toiviainen. A Matlab toolbox for musical feature extraction from audio. In International Conference on Digital Audio Effects, pages 237–244, 2007.
[LWC+13] Zhonghua Li, Ju-Chiang Wang, Jingli Cai, Zhiyan Duan, Hsin-Min Wang, and Ye Wang. Non-reference audio quality assessment for online live music recordings. In Proceedings of the 21st ACM international conference on Multimedia, pages 63–72. ACM, 2013.
[LWW10] Hung-Yi Lo, Ju-Chiang Wang, and Hsin-Min Wang. Homogeneous segmentation and classifier ensemble for audio tag annotation and retrieval. In Multimedia and Expo (ICME), 2010 IEEE International Conference on, pages 304–309. IEEE, 2010.
[LWWL11] Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang, and Shou-De Lin. Cost-sensitive multi-label learning for audio tag annotation and retrieval. IEEE Trans. Multimedia, 13(3):518–529, 2011.
[LXH+10] Zhonghua Li, Qiaoliang Xiang, Jason Hockman, Jianqing Yang, Yu Yi, Ichiro Fujinaga, and Ye Wang. A music search engine for therapeutic gait training. In Proceedings of the international conference on Multimedia, pages 627–630. ACM, 2010.
[LZY+14] Zhonghua Li, Bingjun Zhang, Yi Yu, Jialie Shen, and Ye Wang. Query-Document-Dependent Fusion: A case study of multimodal music retrieval. IEEE Transactions on Multimedia, 15(8):1830–1842, 2014.
[MBG+13] Emilio Molina, Isabel Barbancho, Emilia Gómez, Ana Maria Barbancho, and Lorenzo J Tardón. Fundamental frequency alignment vs note-based melodic similarity for singing voice assessment. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pages 744–748, 2013.
[MBK06] Ludovic Malfait, Jens Berger, and Martin Kastner. P.563: The ITU-T Standard for Single-Ended Speech Quality Assessment. Audio, Speech, and Language Processing, IEEE Transactions on, 14(6):1924–1934, 2006.
[ME] Meinard Müller and Sebastian Ewert. Chroma Toolbox: MATLAB implementations for extracting variants of chroma-based audio features. In Proc. International Society for Music Information Retrieval Conference.
[ME10] Meinard Müller and Sebastian Ewert. Towards timbre-invariant audio features for harmony-based music. IEEE Transactions on Audio, Speech, and Language Processing, 18(3):649–662, 2010.
[MKC05] Meinard Müller, Frank Kurth, and Michael Clausen. Audio matching via chroma-based statistical features. In ISMIR, volume 2005, page 6th, 2005.
[MO99] Richard Maclin and David Opitz. Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11:169–198, 1999.
[MvNL10] Bart Moens, Leon van Noorden, and Marc Leman. D-jogger: Syncing music with walking. In 7th Sound and Music Computing Conference, pages 451–456. Universidad Pompeu Fabra, 2010.
[NDL+12] Shahriar Nirjon, Robert F Dickerson, Qiang Li, Philip Asare, John A Stankovic, Dezhi Hong, Ben Zhang, Xiaofan Jiang, Guobin Shen, and Feng Zhao. Musicalheart: A hearty way of listening to music. In Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems, pages 43–56. ACM, 2012.
[PCC+12] Geoffroy Peeters, Frédéric Cornu, Christophe Charbuillet, Damien Tardieu, Juan José Burred, Marie Vian, Valérie Botherel, Jean-Bernard Rault, and Jean-Philippe Cabanal. A multimedia search and navigation prototype, including music and video-clips. In ISMIR, pages 439–444, 2012.
[PMA+00] Claudio Pacchetti, Francesca Mancini, Roberto Aglieri, Cira Fundarò, Emilia Martignoni, and Giuseppe Nappi. Active music therapy in Parkinson’s disease: an integrative method for motor and emotional rehabilitation. Psychosomatic medicine, 62(3):386–393, 2000.
[PMK10] Jouni Paulus, Meinard Müller, and Anssi Klapuri. State of the art report: Audio-based music structure analysis. In ISMIR, pages 625–636, 2010.
[Pol06] Robi Polikar. Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6(3):21–45, 2006.
[PSC+02] James Pitkow, Hinrich Schütze, Todd Cass, Rob Cooley, Don Turnbull, Andy Edmonds, Eytan Adar, and Thomas Breuel. Personalized search. Commun. ACM, 45(9):50–55, September 2002.
[PY10] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. Knowledge and Data Engineering, IEEE Transactions on, 22(10):1345–1359, 2010.
[RBK+06] AW Rix, JG Beerends, Doh-Suk Kim, P Kroon, and O Ghitza. Objective Assessment of Speech and Audio Quality – Technology and Applications. IEEE Transactions on Audio, Speech, and Language Processing, 14(6):1890–1901, 2006.
[Rok10] Lior Rokach. Ensemble-based classifiers. Artificial Intelligence Review, 33(12):1–39, 2010.
[SEH13] Michael Schoeffler, Bernd Edler, and Jürgen Herre. How much does audio quality influence ratings of overall listening experience. In Proc. of the 10th International Symposium on Computer Music Multidisciplinary Research (CMMR), pages 678–693, 2013.
[SFS06] Ruud Stegers, Peter Fekkes, and Heiner Stuckenschmidt. MusiDB: A personalized search engine for music. Web Semantics: Science, Services and Agents on the World Wide Web, 4(4):267–275, 2006.
[SG05] Mirco Speretta and Susan Gauch. Personalized search based on user search histories. In Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on, pages 622–628. IEEE, 2005.
[SGHK95] Thomas Sporer, Uwe Gbur, Jürgen Herre, and Rolf Kapust. Evaluating a measurement system. Volume 43, pages 353–363. Audio Engineering Society, 1995.
[SGYO12] Mukesh Kumar Saini, Raghudeep Gadde, Shuicheng Yan, and Wei Tsang Ooi. MoViMash: Online mobile video mashup. In Proceedings of the 20th ACM International Conference on Multimedia, pages 139–148. ACM, 2012.
[SM95] Upendra Shardanand and Pattie Maes. Social information filtering: algorithms for automating “word of mouth”. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 210–217. ACM Press/Addison-Wesley Publishing Co., 1995.
[SMB07] Ahu Sieg, Bamshad Mobasher, and Robin D Burke. Learning ontology-based user profiles: A semantic approach to personalized web search. IEEE Intelligent Informatics Bulletin, 8(1):7–18, 2007.
[Spo97] Thomas Sporer. Objective audio signal evaluation-applied psychoacoustics for modeling the perceived quality of digital audio. In Audio Engineering Society Convention 103. Audio Engineering Society, 1997.
[SüH13] Michael Schoeffler and Jürgen Herre. About the impact of audio quality on overall listening experience. Proceedings of the Sound and Music Computing Conference, 2013.
[SWS05] Cees GM Snoek, Marcel Worring, and Arnold WM Smeulders. Early versus late fusion in semantic video analysis. In Proc. ACM International Conference on Multimedia, pages 399–402, 2005.
[SYYT10] Ja-Hwung Su, Hsin-Ho Yeh, Philip S Yu, and Vincent S Tseng. Music recommendation using content and context information mining. Intelligent Systems, IEEE, 25(1):16–26, 2010.
[TC02] George Tzanetakis and Perry Cook. Musical genre classification of audio signals. Speech and Audio Processing, IEEE transactions on, 10(5):293–302, 2002.
[Thi99] Thilo Thiede. Perceptual audio quality assessment using a non-linear filter bank. 1999.
[Tho02] Dave Thompson. A music lover’s guide to record collecting. Hal Leonard Corporation, 2002.
[TS00] William C Treurniet and Gilbert A Soulodre. Evaluation of the ITU-R objective audio quality measurement method. Journal of the Audio Engineering Society, 48(3):164–173, 2000.
[TTB+00] Thilo Thiede, William C Treurniet, Roland Bitto, Christian Schmidmer, Thomas Sporer, John G Beerends, and Catherine Colomes. PEAQ - the ITU standard for objective measurement of perceived audio quality. Journal of Audio Engineering Society, 48(1/2):3–29, 2000.
[TWV05] Rainer Typke, Frans Wiering, and Remco C Veltkamp. A survey of music information retrieval systems. In ISMIR, pages 153–160, 2005.
[V+99] Ellen M Voorhees et al. The TREC-8 question answering track report. In TREC, volume 99, pages 77–82, 1999.
[WBSG10] Qiang Wu, Christopher JC Burges, Krysta M Svore, and Jianfeng Gao. Adapting boosting for information retrieval measures. Information Retrieval, 13(3):254–270, 2010.
[WF13] Alex Wilson and BM Fazenda. Perception & evaluation of audio quality in music production. In Proc. of the 16th Int. Conference on Digital Audio Effects (DAFx-13), 2013.
[WRW12] Xinxi Wang, David Rosenblum, and Ye Wang. Context-aware mobile music recommendation for daily activities. In Proceedings of the 20th ACM international conference on Multimedia, pages 99–108. ACM, 2012.
[WXC+05] Jinjun Wang, Changsheng Xu, Engsiong Chng, Lingyu Duan, Kongwah Wan, and Qi Tian. Automatic generation of personalized music sports video. In Proceedings of the 13th annual ACM international conference on Multimedia, pages 735–744. ACM, 2005.
[YGK+06] Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, and Hiroshi G Okuno. Hybrid collaborative and content-based music recommendation using probabilistic model with latent user preferences. In ISMIR, volume 6, page 7th, 2006.
[ZCZ+14] Shenggao Zhu, Jingli Cai, Jiangang Zhang, Zhonghua Li, Ju-Chiang Wang, and Ye Wang. Bridging the user intention gap: an intelligent and interactive multidimensional music search engine. In Proceedings of the First International Workshop on Internet-Scale Multimedia Management, pages 59–64. ACM, 2014.
[ZES+14] Shenggao Zhu, Robert J Ellis, Gottfried Schlaug, Yee Sien Ng, and Ye Wang. Validating an iOS-based rhythmic auditory cueing evaluation (iRACE) for Parkinson’s disease. In Proceedings of the ACM International Conference on Multimedia, pages 487–496. ACM, 2014.
[ZSXW09] Bingjun Zhang, Jialie Shen, Qiaoliang Xiang, and Ye Wang. CompositeMap: a novel framework for music similarity measure. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 403–410. ACM, 2009.
[ZYM+10] Zheng-Jun Zha, Linjun Yang, Tao Mei, Meng Wang, Zengfu Wang, Tat-Seng Chua, and Xian-Sheng Hua. Visual query suggestion: Towards capturing user intent in internet image search. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 6(3):13, 2010.

... works on audio quality assessment, music structure analysis and segmentation, machine learning for ranking, and multi-dimensional music search engines.
• Chapter 3 shows our solution for audio quality ...


