Practical Adversarial Attacks Against Black Box Speech Recognition Systems and Devices
With the advance of speech recognition technologies, intelligent voice control devices such as Amazon Echo have become increasingly popular in daily life. Most state-of-the-art speech recognition systems now rely on neural networks to improve their accuracy and efficiency. Unfortunately, neural networks are vulnerable to adversarial examples: inputs specifically crafted by an adversary to cause a neural network to misclassify them. It is therefore imperative to understand the security implications of speech recognition systems in the presence of such attacks. In this dissertation, we first introduce an effective audio adversarial attack against a white-box speech recognition system. Building on this result, we then demonstrate a practical adversarial attack against commercial black-box speech recognition systems and even devices such as Google Home and Amazon Echo. We further discuss several methods for spreading our adversarial samples via TV and radio broadcasts. Finally, we turn to defenses against our attack and present possible mechanisms to mitigate audio adversarial attacks. In conclusion, this thesis shows that modern speech recognition systems and devices can be compromised by physical audio adversarial attacks, and provides preliminary results for further research on designing robust speech recognition systems that can defend against such attacks.