Line-level alignment of text and musical audio
Text is crucial for duration estimation
Rhythm detection can inform downstream components
Accuracy of chorus detection is vital
Vocal detection model uses a training-based approach
For real-time performance: need to explore alternative vocal detection models