Accurate speech-to-text API for audio and video files