Jiazhuo Jiang

  • BSc (University of Victoria, 2023)

Notice of the Final Oral Examination for the Degree of Master of Science

Topic

CNN-Based Models for Pitch Estimation, Modification, and Auto-Tuning

Department of Computer Science

Date & location

  • Thursday, December 19, 2024

  • 10:30 A.M.

  • Engineering Computer Science Building

  • Room 467

Reviewers

Supervisory Committee

  • Dr. George Tzanetakis, Department of Computer Science, University of Victoria (Supervisor)

  • Dr. Alex Thomo, Department of Computer Science, UVic (Member) 

External Examiner

  • Dr. Peter Driessen, Department of Electrical and Computer Engineering, University of Victoria 

Chair of Oral Examination

  • Dr. Rob Hancock, Department of Anthropology, UVic

     

Abstract

Pitch estimation and pitch modification are fundamental audio processing tasks used in a variety of applications. An important example is the auto-tuning of vocals, in which pitch estimation is applied, deviations from a desired target pitch are calculated, and the pitch of the input vocal signal is modified to match the target pitch. Most existing approaches to auto-tuning rely on traditional digital signal processing (DSP) techniques for both pitch detection and pitch modification. This thesis explores Convolutional Neural Networks (CNNs) as a possible replacement for traditional DSP methods in pitch estimation, pitch modification, and end-to-end auto-tuning. CNNs can model complex input-output relationships and are more efficient than deep learning methods that take time/sequence information into account, such as Recurrent Neural Networks (RNNs), including Long Short-Term Memory (LSTM) networks. The results show the potential of this approach as well as some of the challenges that remain to be overcome. The experimental results indicate that larger datasets can improve accuracy, but they also tend to introduce more noise.
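
As a rough illustration of the deviation step in the auto-tuning pipeline described in the abstract, the sketch below computes how far an estimated pitch lies from a target pitch and the frequency ratio a pitch shifter would need to apply. It is not taken from the thesis. It assumes the target is the nearest note of the twelve-tone equal-tempered scale with A4 = 440 Hz, and the function names are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; not specified in the abstract


def nearest_note_hz(f0_hz: float) -> float:
    """Return the nearest equal-tempered note frequency (assumed target pitch)."""
    # Distance from A4 in semitones, rounded to the closest scale step.
    semitones = 12.0 * math.log2(f0_hz / A4_HZ)
    return A4_HZ * 2.0 ** (round(semitones) / 12.0)


def deviation_cents(f0_hz: float, target_hz: float) -> float:
    """Signed deviation of the estimated pitch from the target, in cents."""
    # 100 cents = 1 semitone; positive means the input is sharp.
    return 1200.0 * math.log2(f0_hz / target_hz)


def correction_ratio(f0_hz: float, target_hz: float) -> float:
    """Frequency ratio a pitch modifier would apply to reach the target."""
    return target_hz / f0_hz


if __name__ == "__main__":
    estimated = 450.0  # hypothetical output of a pitch estimator, in Hz
    target = nearest_note_hz(estimated)
    print(f"target: {target:.2f} Hz")
    print(f"deviation: {deviation_cents(estimated, target):.1f} cents")
    print(f"correction ratio: {correction_ratio(estimated, target):.4f}")
```

In a full system, the estimated pitch would come from a pitch estimator (DSP-based or CNN-based) applied frame by frame, and the correction ratio would drive the pitch modification stage.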