Audio-Visual Instance Discrimination

CS753 (Automatic Speech Recognition), IIT Bombay

Posted on May 1, 2023

Description

We evaluated the AVID-CMA model on the HMDB-51 dataset and tuned the training parameters, which improved accuracy slightly. We also implemented a loss that combines cross-entropy loss and MSE loss, in order to balance the loss function for this particular dataset.
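
One way such a combined loss could look is sketched below in plain Python, for a single example. This is an illustration under stated assumptions, not the project's actual code: the mixing weight `alpha`, the choice to compute the MSE term between softmax probabilities and a one-hot target, and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the correct class."""
    return -math.log(softmax(logits)[target])


def mse_to_one_hot(logits, target):
    """Mean squared error between softmax probabilities and a one-hot target.

    Assumption: the source does not say what the MSE term is computed over.
    """
    probs = softmax(logits)
    squared_errors = [
        (p - (1.0 if i == target else 0.0)) ** 2 for i, p in enumerate(probs)
    ]
    return sum(squared_errors) / len(probs)


def combined_loss(logits, target, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical mixing weight."""
    ce = cross_entropy(logits, target)
    mse = mse_to_one_hot(logits, target)
    return alpha * ce + (1.0 - alpha) * mse


# Usage: one example with 4 classes whose correct class is index 2.
loss = combined_loss([0.1, -0.3, 1.2, 0.0], target=2, alpha=0.5)
```

Setting `alpha=1.0` in this sketch recovers plain cross-entropy and `alpha=0.0` recovers the MSE term alone, so the weight can be treated as one more training parameter to tune.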

Contributors

Lakshya Gupta
Parshant Arora

Image credits

Imgur and BeFunky
https://arxiv.org/pdf/2004.12943.pdf