Practical Machine Learning

Dr. Suyong Eum / Dr. Hua Yang

Python

Python Installation

Python 2.x is legacy, and Python 3.x is the present and future of the language. See: Python 2.x or 3.x

Anaconda virtual environment: tutorial

Python Packages and Libraries

NumPy

NumPy is the fundamental package for scientific computing in Python. It is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
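
As a rough illustration of the array support described above, the sketch below builds a small matrix and applies a few vectorized operations. The values and variable names are made up for this example.

```python
import numpy as np

# A 2 x 3 matrix stored as a multi-dimensional array (illustrative values).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(a.shape)                 # dimensions of the array

gram = np.dot(a.T, a)          # matrix product, result is 3 x 3
col_means = a.mean(axis=0)     # mean of each column
scaled = np.sqrt(a) * 2.0      # element-wise math without explicit loops

print(col_means)
```

np.dot is used instead of the @ operator so that the snippet also runs under Python 2.7.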

SciPy

The SciPy library is one of the core packages that make up the SciPy stack. It provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization.
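
A minimal sketch of the two kinds of routines mentioned above, numerical integration and optimization. The integrand and the objective function are arbitrary choices for illustration.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: integral of sin(x) over [0, pi]; the exact value is 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Optimization: minimize (x - 3)^2 + 1, whose minimum value 1 is reached at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2 + 1.0)

print("integral: {:.6f}".format(area))
print("minimum at x = {:.4f}, value {:.4f}".format(result.x, result.fun))
```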

pandas

pandas is a software library for data manipulation and analysis in Python. It offers data structures and operations for manipulating numerical tables and time series.
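
The sketch below shows one way a small date-indexed table could be built and queried. The column names and readings are invented for this example.

```python
import pandas as pd

# A small numerical table indexed by date (illustrative values only).
dates = pd.date_range("2017-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {"temperature": [20.0, 21.5, 19.0, 22.0, 23.5, 21.0],
     "humidity": [30, 35, 40, 38, 33, 31]},
    index=dates,
)

warm = df[df["temperature"] > 21.0]                    # select rows by condition
rolling = df["temperature"].rolling(window=3).mean()   # 3-day moving average

print(df.describe())   # summary statistics per column
```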

Matplotlib

Matplotlib is a Python 2D/3D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.
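
A minimal plotting sketch, assuming a non-interactive session: it draws two curves and writes a PNG to the system temporary directory. The output file name is a placeholder chosen for this example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.legend()

# Hypothetical output location, used only for this example.
out_path = os.path.join(tempfile.gettempdir(), "sine_cosine.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
```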

scikit-learn

scikit-learn provides higher-level, statistical algorithms for machine learning. If you already know the rules for handling your data, a lower-level tool is the better fit. If you want the computer to learn the rules for you and return probabilistic answers, this library is useful. Getting reliable results requires studying the hyperparameters so that you can judge whether an answer is more likely right than wrong. Don't reinvent the wheel!
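
As one possible illustration of letting the library learn the rules and of studying a tuning parameter, the sketch below fits a k-nearest-neighbours classifier on the bundled iris dataset and tries a few values of n_neighbors with cross-validation. The choice of classifier, dataset, and candidate values is an assumption made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Try several values of n_neighbors with 5-fold cross-validation
# and keep the best-scoring one.
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [1, 3, 5, 7]}, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["n_neighbors"]
test_accuracy = search.score(X_test, y_test)   # accuracy on held-out data
proba = search.predict_proba(X_test[:1])       # class probabilities for one sample

print("best n_neighbors: {}".format(best_k))
print("held-out accuracy: {:.3f}".format(test_accuracy))
```

Scoring on data that was held out from the fit is what tells you whether the chosen setting generalizes, rather than merely fitting the training set.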

Audio manipulation tools and libraries

pydub

A high-level API for manipulating audio files.

ffmpeg

FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. For example, it is used to encode and decode mp3 files.

Jupyter Notebook

Virtualenv

source activate speech (Activating the virtual environment: speech)
pip install ipykernel
python -m ipykernel install --user --name=speech
jupyter notebook (you will see your virtual environment [speech] in the New drop-down menu)
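
Once a notebook is open, a quick way to confirm that it is running on the intended kernel is to print the interpreter path from a cell. This is a small sketch using only the standard library; the expectation that the path ends in envs/speech assumes the default conda layout.

```python
import os
import sys

# Path of the Python interpreter that runs this kernel.
print(sys.executable)

# Root of the active environment. For a conda environment named "speech"
# this normally ends in ".../envs/speech" (default conda layout assumed).
print(sys.prefix)

env_name = os.path.basename(sys.prefix)
print("environment directory name: {}".format(env_name))
```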

OpenAI GYM

Installation

conda create --name RL python=3.6
source activate RL
apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig
git clone https://github.com/openai/gym.git
cd gym
pip install -e . (minimal installation)

Examples

NumPy and Matplotlib

conda create --name speech python=2.7 (Creating a virtual environment called speech with python (2.7))
source activate speech (Activating the virtual environment: speech)
conda install numpy (Installing numpy package)
conda install matplotlib (Installing matplotlib package)
conda install pydub=0.9.0 (Installing pydub)
sudo apt-get install ffmpeg (Installing ffmpeg)
python mp_plot.py (Running an example - wav_plot.py)
source deactivate (Deactivating the virtual environment: speech)
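
The course script is not reproduced here. As a rough stand-in, the sketch below synthesizes a one-second tone, writes it to a WAV file with the standard-library wave module, reads it back with NumPy, and plots the first samples with Matplotlib. The file names, sample rate, and tone frequency are all assumptions made for this example.

```python
import os
import tempfile
import wave

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

rate = 8000                               # samples per second (assumed)
t = np.arange(rate) / float(rate)         # one second of time stamps
# 440 Hz sine at half of full scale, as 16-bit little-endian integers.
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440.0 * t)).astype("<i2")

# Hypothetical file locations, used only for this example.
wav_path = os.path.join(tempfile.gettempdir(), "tone_440hz.wav")
png_path = os.path.join(tempfile.gettempdir(), "tone_440hz.png")

# Write a mono, 16-bit WAV file.
w = wave.open(wav_path, "wb")
w.setnchannels(1)
w.setsampwidth(2)
w.setframerate(rate)
w.writeframes(tone.tobytes())
w.close()

# Read the samples back into a NumPy array.
r = wave.open(wav_path, "rb")
samples = np.frombuffer(r.readframes(r.getnframes()), dtype="<i2")
r.close()

# Plot the first 200 samples of the waveform.
fig, ax = plt.subplots()
ax.plot(t[:200], samples[:200])
ax.set_xlabel("time (s)")
ax.set_ylabel("amplitude")
fig.savefig(png_path)
plt.close(fig)
```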

Yaafe: audio feature extraction

conda install --channel https://conda.anaconda.org/Yaafe yaafe (Yaafe installation)

librosa

conda install -c conda-forge librosa=0.5.1

TensorFlow-GPU

Anaconda

CUDA needs to be installed
conda create --name speech python=2.7 (Creating a virtual environment called speech with python (2.7))
source activate speech (Activating the virtual environment: speech)
conda clean --all
conda install -c anaconda tensorflow-gpu=1.1.0 (installs cudatoolkit, cudnn, numpy, ..., and of course tensorflow-gpu)

Tips

CUDA info: /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery
CUDNN version: /usr/local/cuda/targets/x86_64-linux/include/cudnn.h
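
The cuDNN header records its version in the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros. The sketch below shows one way to pull them out with a regular expression. The helper name parse_cudnn_version and the sample header text are made up for this example; on a real machine you would pass in the contents of the cudnn.h file listed above.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) found in cudnn.h text, or None if missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative header fragment; the version numbers are sample values.
sample = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 10
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

print(parse_cudnn_version(sample))
```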