Chord Recognition in Music Using a Robust Pitch Class Profile (PCP) Feature and Support Vector Machines (SVM)

Music is a powerful and immediate means of expressing emotion, and a nuanced understanding of musical compositions is crucial for interpreting and appreciating them accurately. This research examines robust Pitch Class Profile (PCP) features and the Support Vector Machine (SVM) in music analysis. The study first reviews the relevant concepts and several robust Constant-Q Transform (CQT) methods used to describe chord spectra in audio analysis. The paper then explains the relationship between SVM and speech tonality and outlines the design of a complete system for music chord recognition. The system's performance is then tested, with particular emphasis on the recognition rate. The results show that the system significantly improves music chord recognition and that robust feature optimization and SVM-based pattern classification play a key role in its effectiveness. This research contributes to the theoretical understanding of music analysis and offers practical insight into improving the accuracy of music chord recognition systems through feature selection and machine learning techniques.


Introduction
In many music information retrieval applications, it is necessary to analyze the harmonic structure of music. In Western music, harmonic structure is usually described by information about chord structure and chord progression [1][2]. The human brain can not only recognize objective things, but also process and understand more complex, subjective things such as music. Even so, extracting important information from music remains an active research field in computing. The development of music is inseparable from chords. In everyday life, people usually think of music simply as "sound". However, with the continuous improvement of social and economic conditions and the gradual growth of spiritual and cultural demand, a large number of modern musical works have been produced [3][4].
Music chord recognition based on robust PCP features and SVM has been explored by many scholars. In the middle of the 20th century, research on retrieving vocal music content focused mainly on basic theoretical schemes for improving the signal, and note recognition was the focus of research at that time. Some scholars used chroma features and hidden Markov models (HMMs) trained with the expectation-maximization (EM) algorithm to build a chord recognition system. The innovation of this method is that it incorporates musical knowledge into the model by defining a state transition matrix based on pitch distance in the circle of fifths [5][6]. It also avoids random initialization of the mean vector and covariance matrix of the observation distribution. In addition, when training the model parameters, the authors assume that the chord distribution is independent of the type of music, that is, they do not build a separate model for each type of music, and they update the parameters selectively.

Introduction to Musical Terminology
Pitch: some musical sounds are high and some are low; this property is called pitch. Pitch is expressed as the number of vibrations per second (frequency). The higher the frequency, the higher the tone; the lower the frequency, the lower the tone [15][16]. The sound that vibrates 440 times per second is called "A", which is the current international standard pitch.
Interval: the distance between two notes. The unit used to measure an interval is the "degree", and the number of scale steps spanned by the two notes gives the size of the interval in degrees.
Octave: if the frequencies of a group of sounds are arranged strictly as ×1, ×2, ×4, ..., that is, according to the rule 2^n, they sound like the same pitch repeated in sequence. Because the human ear responds to the exponent of the frequency, the relationship "×2 represents an equal distance" is the most basic relationship in music. In music, a factor of 2 in frequency is an octave.
Chord: a combination of three or more notes arranged in thirds or in intervals other than thirds.
Vol. 7, No. 1, January 2024, pp. 01-07, ISSN 2579-7069
Harmony: a chord progression in which each chord is connected to the next according to certain rules. Harmony is built on the mode and allows the melody to express the musical content more richly.
Semitone and whole tone: an octave is divided into twelve equal parts, each of which is a semitone; two semitones make a whole tone. A semitone corresponds to a minor second and a whole tone to a major second.
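The octave and semitone definitions above can be made concrete with a short calculation. The following is a minimal sketch, assuming twelve-tone equal temperament and the 440 Hz reference "A" mentioned earlier; the names `semitone_frequency` and `pitch_class` are illustrative and do not come from the original work.

```python
import math

A4_HZ = 440.0  # the standard pitch "A" cited in the text

# Pitch-class names with C at index 0 (an assumed, conventional ordering).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_frequency(semitones_from_a4: int) -> float:
    """Frequency of the note a given number of semitones above (or below) A4.

    Each semitone multiplies the frequency by 2**(1/12), so twelve
    semitones (one octave) multiply it by exactly 2.
    """
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0)


def pitch_class(frequency_hz: float) -> str:
    """Name of the nearest equal-tempered pitch class, ignoring the octave."""
    semitones = round(12.0 * math.log2(frequency_hz / A4_HZ))
    return NOTE_NAMES[(semitones + 9) % 12]  # A sits at index 9 when C is 0
```

Folding every octave onto the same twelve names, as `pitch_class` does, is the same idea that a pitch class profile relies on when it accumulates spectral energy per pitch class.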

Principle of Music Recognition
In modern society, with the continuous development of science, technology, and the economy, people's material demands keep rising. As a result, people also expect more from musical art and pay closer attention to elements such as rhythm and beat, so that music can meet the needs of both popularization and personalization. Music feature detection analyzes an audio file and identifies the many characteristic elements that describe music: pitch, duration, intensity, beat, timbre, melody, rhythm, stylistic characteristics, and so on. Music is one of the most important parts of the human spiritual world. An audio (waveform) file is a direct recording of the data obtained by digitally sampling the waveform of a real analog sound, that is, a direct reflection of the real sound. Sound files that store sound information in this way therefore require relatively large storage space [17][18]. Module files (MOD, S3M, XM, MTM, FAR, KAR, IT, etc.) share attributes of both MIDI files and sound files: they contain not only instructions on how to play the instruments but also the sound sample data. The difference is that module files come in many formats, depending on how they are encoded. To extract the characteristic parameters of a melody, each encoding format must be processed accordingly.

Role of Identifying Chord Features
Because of the growing demand for content-based search using mid-level features, search engines based on text keywords can no longer meet users' needs. If computers can identify and transcribe the chords in music more accurately, the resulting chord labels can be used for fuzzy matching against chord progressions when searching for, downloading, and playing songs.
Understanding chords makes the analysis of musical structure more meaningful and appropriate. Chords usually last for a certain length of time, so music can be segmented and spliced according to the results of chord recognition [19][20].
On the one hand, chord recognition can deepen the understanding and analysis of complex musical structure; on the other hand, it can support more complex forms of emotional expression and enrich musical expression.

SVM Multi-Classification Algorithm
SVM is a binary classification model: a linear classifier defined in feature space with the largest margin. Its goal is to find the maximum-margin separating hyperplane. The kernel method is the main source of the SVM's advantage. Through the kernel trick, it maps linearly inseparable data into a high-dimensional feature space so that the data can be classified there. The following is a brief introduction to its principle [21].
For a binary classification problem, the training data set on the feature space is written in the form

$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\},$$

where $x_i \in \mathcal{X} = \mathbb{R}^n$, $y_i \in \mathcal{Y} = \{-1, +1\}$, $i = 1, 2, \ldots, N$, that is, the total number of samples is $N$. The feature parameters of each sample form an $n$-dimensional column vector. The goal of learning is to find a separating hyperplane in the feature space that places instances of different classes on opposite sides. In its standard form, with normal vector $w$ and bias $b$, the hyperplane is given by formula (1):

$$w \cdot x + b = 0 \qquad (1)$$

In general, if the data set is linearly separable, there are infinitely many such hyperplanes. The purpose of the support vector machine is to find the hyperplane with the largest margin to the nearest instances, a requirement that can be expressed through a distance function. Formula (2) represents the distance between the $i$-th sample and the separating hyperplane:

$$\gamma_i = y_i \, \frac{w \cdot x_i + b}{\lVert w \rVert} \qquad (2)$$

Identifying the support vectors is central to the SVM. A sample $x_i$ that satisfies the margin condition with equality, that is, a sample that lies closest to the separating hyperplane, is a support vector. The support vectors determine the decision boundary that optimally separates the classes in the data.
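The distance in formula (2) and the notion of a support vector can be illustrated numerically. The sketch below assumes the standard geometric margin, y * (w . x + b) / ||w||, for a hyperplane with normal vector `w` and bias `b`; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def geometric_margin(w, b, x, y):
    """Signed distance y * (w . x + b) / ||w|| of sample x with label y in {-1, +1}."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (dot + b) / norm


def nearest_samples(w, b, samples, labels, tol=1e-9):
    """Indices of the samples closest to the hyperplane.

    For a maximum-margin hyperplane these are the support vectors.
    """
    margins = [geometric_margin(w, b, x, y) for x, y in zip(samples, labels)]
    smallest = min(margins)
    return [i for i, m in enumerate(margins) if m - smallest <= tol]


# Toy 2-D data (illustrative only) and the hyperplane x1 + x2 - 3 = 0.
samples = [(1.0, 1.0), (0.0, 0.0), (2.0, 2.0), (3.0, 3.0)]
labels = [-1, -1, +1, +1]
w, b = (1.0, 1.0), -3.0

support = nearest_samples(w, b, samples, labels)
```

A positive margin for every sample means that the hyperplane classifies all of them correctly; the samples returned by `nearest_samples` are the ones that would fix the position of the boundary.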
Real-world data sets, however, are often not strictly linearly separable. In practice, data points may not fall cleanly on either side of a linear boundary, and the preceding formulation then has no feasible solution. This departure from strict linear separability calls for extensions of the SVM, such as kernel methods, which allow the algorithm to handle more complex patterns and improve classification performance when the data are not linearly separable. The adaptability of SVMs to diverse data sets therefore remains a focus of research, with the aim of making the algorithm more robust and more applicable to real-world problems.
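The idea that a mapping into a higher-dimensional space can restore separability can be seen in a very small example. The sketch below is an illustration under stated assumptions: the one-dimensional toy data, the explicit map phi(x) = (x, x^2), and all function names are hypothetical and are not part of the system described in this paper.

```python
def separable_by_threshold(xs, labels):
    """Exact test for 1-D linear separability (distinct points assumed).

    Sorted by position, the labels may switch at most once.
    """
    ordered = [y for _, y in sorted(zip(xs, labels))]
    switches = sum(1 for a, c in zip(ordered, ordered[1:]) if a != c)
    return switches <= 1


def feature_map(x):
    """Illustrative explicit map phi(x) = (x, x^2) from 1-D to 2-D."""
    return (x, x * x)


def separates(w, b, points, labels):
    """True if the hyperplane w . p + b = 0 puts every point on its label's side."""
    return all(
        y * (sum(wi * pi for wi, pi in zip(w, p)) + b) > 0
        for p, y in zip(points, labels)
    )


# Toy data: the outer points are +1 and the inner points are -1.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [+1, -1, -1, +1]

mapped = [feature_map(x) for x in xs]
# In the mapped space the horizontal line x^2 = 2.5 separates the classes.
w, b = (0.0, 1.0), -2.5
```

No single threshold separates the original points, yet a straight line does so after the mapping. A kernel function achieves the same effect while computing only inner products, without constructing the mapped vectors explicitly.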

Figure 1. Flow of the music chord recognition system
Figure 1 shows the flow chart of the music chord recognition system designed around the robust PCP feature and SVM. As the figure shows, when music is played, the system records the audio, extracts the music format information, and then identifies the music features. At the same time, the system compares the extracted information against the system's music database. If the match succeeds, the result is displayed directly and the operation ends. If automatic matching fails, matching can be performed manually, and the matching step is then repeated until it succeeds and the operation ends.
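The matching loop in this flow can be sketched as follows. This is only an outline under assumptions that the paper does not state: features are plain numeric tuples, automatic matching is a nearest-neighbour lookup with a distance threshold, and manual matching is a callback that supplies a label. The names `nearest_entry`, `recognize`, and `ask_user` are hypothetical.

```python
def nearest_entry(feature, database, threshold):
    """Label of the closest database entry, or None if it is farther than threshold."""
    best_label, best_dist = None, float("inf")
    for label, reference in database.items():
        dist = sum((a - b) ** 2 for a, b in zip(feature, reference)) ** 0.5
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


def recognize(feature, database, threshold, ask_user):
    """Match automatically; on failure, fall back to manual matching and retry."""
    label = nearest_entry(feature, database, threshold)
    while label is None:
        # Automatic matching failed: store the manually supplied label,
        # then run the matching step again.
        database[ask_user(feature)] = feature
        label = nearest_entry(feature, database, threshold)
    return label


# Toy database of two-dimensional features (illustrative values only).
database = {"C": (1.0, 0.0), "G": (0.0, 1.0)}
```

With a non-negative threshold the loop ends after one manual entry, because the stored feature then matches itself at distance zero.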

Music Chord Recognition Requirements
Chords can be divided into a sound level and a silent level. The sound level is classified according to pitch. The recognition of music features can therefore also be divided into the following levels: first, extracting the basic features; second, analyzing the complex features on that basis; and finally, analyzing the overall features of the music, such as style and emotional connotation, on the basis of the basic and complex features. The note is the smallest basic unit of music. Each piece of music is a sequence of many different notes on a time axis. The basic properties of music are related to the information carried by the notes, which can be obtained directly from the music.
With the development of music, the sequence of notes plays an increasingly important role in people's lives. Beat and stress are the most active, intuitive, and representative elements of rhythm and melody. Rhythm and melody represent how the order of tones changes along the time axis, and they make the description of music clearer. The structure, subdivision, and emotional connotation of music are its overall features; they give the most complete description of the music.
Among all recognition categories, basic feature recognition is the simplest, because basic features can be extracted directly from music format files. Complex features must be derived from the recognized basic features by analyzing the order of the notes; they are recognizable but relatively complex. In this paper, the basic features, complex features, and overall structural features of music are preliminarily identified by combining the robust features with the SVM model. This paper directly extracts the basic attributes of music from the music format, then analyzes the extracted note sequence to obtain the attributes of melody and rhythm, and then compares and analyzes the music structure according to the attributes of beat and phrase.

System Recognition Rate Analysis
Table 1 lists the music genres used in the test together with their chord recognition results. Each entry in the table is a separate point of reference for one of the genres, so the table supports a systematic comparison of patterns and trends in chord recognition across the different types of music and helps to identify the factors behind the varying recognition results. Taken together, Table 1 and Figure 2 show that the recognition accuracy of every song exceeds 91% after system recognition. This result indicates that the recognition system identifies and classifies diverse musical compositions effectively and performs robustly across the songs tested.
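The text does not state how the recognition rate is computed. A common choice, assumed here, is the fraction of analysis frames whose predicted chord label equals the reference label. The function name and the labels in the usage note are hypothetical and are not results from this study.

```python
def recognition_rate(predicted, reference):
    """Fraction of frames whose predicted chord label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one frame is required")
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)
```

For example, `recognition_rate(["C", "G", "Am", "F"], ["C", "G", "Am", "Em"])` counts three matching frames out of four.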
The conventional SVM kernel function, which serves as a distance measure, typically relies on the Euclidean distance. Although this approach is simple and easy to implement, it may be less effective on certain data sets. This observation points to the need for further investigation of alternative methods or enhancements to the existing system, so that its performance and adaptability extend to more diverse and complex data sets.
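The dependence of a kernel on Euclidean distance can be made explicit with the Gaussian (RBF) kernel, K(x, z) = exp(-gamma * ||x - z||^2). The paper does not name the kernel it uses, so treating it as an RBF kernel is an assumption; the function names and the default `gamma` are illustrative.

```python
import math


def squared_euclidean(x, z):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: similarity decays with squared Euclidean distance."""
    return math.exp(-gamma * squared_euclidean(x, z))
```

Because the kernel value depends only on the Euclidean distance, two feature vectors with the same shape but different overall scale, such as `(1.0, 0.0, 1.0)` and `(2.0, 0.0, 2.0)`, are treated as dissimilar. This is one way in which a plain Euclidean measure can be a poor fit for a particular data set.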

Figure 2. Comparison of music chord recognition rates

Table 1. Identification data