Here's the direct and practical answer:
The math behind the Perceptron and Multi-layer Networks.