Machine Learning Tips for Audio, Image, and Video Analysis

Original post:

http://www.data-mania.com/blog/machine-learning-tips-for-image-video-and-audio/

By Lillian Pierson

The post is as follows:

Neural networks work well for image, video, and audio machine learning problems. For example, if you have an image classification task, you can use a convolutional neural network. First, normalize your image, and then downsample it to a smaller size. Usually 16 to 64 pixels per dimension is enough.
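The preprocessing step can be sketched roughly as follows. This is a minimal sketch, not the author's code: it assumes NumPy, a 2-D grayscale image whose sides are multiples of the target size, zero-mean/unit-variance normalization, and block averaging as the downsampling method. The names normalize and downsample are made up for illustration, and the random array stands in for a real image.

```python
import numpy as np

def normalize(image):
    """Scale pixel values to zero mean and unit variance (one common choice)."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + 1e-8)

def downsample(image, size):
    """Shrink a 2-D image to size x size by averaging equal blocks.

    Assumption: height and width are multiples of `size`.
    """
    h, w = image.shape
    if h % size or w % size:
        raise ValueError("image dimensions must be multiples of size")
    return image.reshape(size, h // size, size, w // size).mean(axis=(1, 3))

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(128, 128))  # stand-in for a real grayscale image
small = downsample(normalize(raw), 32)       # normalize first, then downsample
print(small.shape)  # (32, 32)
```

For images whose sides are not a multiple of the target size, an image library's resize function would be the more practical choice.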

After that, you can build a simple convolutional net to learn from these downsampled images. The most important hyperparameter is the learning rate, so tune it first. Then experiment with layer sizes, convolutional kernel sizes, and pooling sizes. Try adding more layers and different activation functions. Definitely try dropout.
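To make the knobs mentioned above concrete, here is a rough NumPy sketch of the forward pass through one convolution, ReLU, max-pooling, and dropout stage. It is only an illustration under assumptions: the 3x3 kernel, 2x2 pooling, and 0.5 dropout rate are arbitrary example values, the helper names are made up, and there is no training loop, so the learning rate does not appear. In practice you would build and train the net with a deep learning framework.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of one image with one kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def max_pool(x, size):
    """Non-overlapping max pooling; trims edges that do not fill a window."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def dropout(x, rate, rng):
    """Inverted dropout: zero roughly a fraction `rate` of units, rescale the rest."""
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))        # stand-in for a normalized 32x32 image
kernel = rng.standard_normal((3, 3)) * 0.1   # kernel size: 3x3 (example value)

activated = np.maximum(conv2d(image, kernel), 0.0)  # ReLU activation
pooled = max_pool(activated, size=2)                # pooling size: 2x2 (example value)
features = dropout(pooled, rate=0.5, rng=rng)       # applied only during training
print(features.shape)  # (15, 15)
```

Changing the kernel shape, the pooling size, or the dropout rate here corresponds to the hyperparameters the paragraph suggests experimenting with.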

If your dataset is not very large, use data augmentation. Rotating an image or shifting it a few pixels horizontally or vertically usually doesn’t change its class, and sometimes you can even use a mirror image. Data augmentation helps reduce overfitting, which makes it possible to try an even bigger net. Finally, if you need a little more accuracy, build several models with similar hyperparameters and then put a voting classifier on top of them.
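Both ideas, augmentation and voting, can be sketched in a few lines of NumPy. This is an illustrative sketch under assumptions: the function names are invented, shifts wrap around the image edges (np.roll) instead of padding, rotation is left out because arbitrary angles need an image library, and the votes array is made-up example data standing in for the predictions of three trained models.

```python
import numpy as np

def augment(image, rng, max_shift=2):
    """Return a randomly shifted, possibly mirrored copy of a 2-D image."""
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    shifted = np.roll(image, (dy, dx), axis=(0, 1))  # pixels wrap around the edges
    if rng.random() < 0.5:
        shifted = np.fliplr(shifted)                 # horizontal mirror image
    return shifted

def majority_vote(predictions):
    """Combine class predictions of shape (n_models, n_samples) by majority vote.

    Ties go to the lowest class label. Labels must be non-negative integers.
    """
    predictions = np.asarray(predictions)
    n_classes = int(predictions.max()) + 1
    counts = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return counts.argmax(axis=0)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
augmented = augment(image, rng)   # same shape and label as the original

# Made-up predictions from three models for four samples.
votes = [[0, 1, 2, 1],
         [0, 1, 1, 1],
         [1, 1, 2, 0]]
print(majority_vote(votes))  # [0 1 2 1]
```

Whether mirroring is safe depends on the task: it preserves the class for most natural objects but not, for example, for handwritten digits or text.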
