1 min readJul 6, 2019
This is a great application! I think it would require deployment of two models:
- Audio-to-Text (Mozilla’s DeepSpeech would be my starting point) https://github.com/mozilla/DeepSpeech
- Neural Machine Translation (NMT) model, which is specialized to translate to/from a pair of languages. (https://www.tensorflow.org/beta/tutorials/text/nmt_with_attention)
After getting 1 and 2 working, you could improve further by making the NMT layer pluggable to support many language pairs.
Happy hacking!