References: [링크], [링크2]

Abstract

We propose LPCNet, a WaveRNN variant that combines linear prediction with recurrent neural networks to significantly improve the efficiency of speech synthesis.

LPCNet can achieve significantly higher quality than WaveRNN for the same network size and that high quality LPCNet speech synthesis is achievable with a complexity under 3 GFLOPS

Concept

Prior Knowledge

PCM - 펄스 부호 변조(Pulse-code modulation, 줄여서 PCM)는 아날로그 신호의 디지털 표현으로, 신호 등급을 균일한 주기로 표본화한 다음 디지털 (이진) 코드로 양자화 처리한다.

mu-law Companding algorithms reduce the dynamic range of an audio signal. In analog systems, this can increase the signal-to-noise ratio (SNR) achieved during transmission; in the digital domain, it can reduce the quantization error (hence increasing signal to quantization noise ratio). These SNR increases can be traded instead for reduced bandwidth for equivalent SNR.

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/48241b0c-4763-444d-99e4-58da9005941f/Untitled.png

Excitation Signal

https://www.quora.com/What-is-excitation-of-audio-signal

Features