What Level of Maths Do You Need?
The main question when trying to understand an interdisciplinary field such as Machine Learning is how much maths is necessary, and at what level, to understand these techniques. The answer to this question is multidimensional and depends on the level and interest of the individual. Research into the mathematical formulation and theoretical advancement of Machine Learning is ongoing, and some researchers are working on more advanced techniques. I’ll state what I believe to be the minimum level of mathematics needed to be a Machine Learning Scientist/Engineer and the importance of each mathematical concept.
1. Linear Algebra:
A colleague, Skyler Speakman, recently said that “Linear Algebra is the mathematics of the 21st century” and I totally agree with the statement. In ML, Linear Algebra comes up everywhere. Topics such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Eigendecomposition of a matrix, LU Decomposition, QR Decomposition/Factorization, Symmetric Matrices, Orthogonalization & Orthonormalization, Matrix Operations, Projections, Eigenvalues & Eigenvectors, Vector Spaces and Norms are needed for understanding the optimization methods used for machine learning. The amazing thing about Linear Algebra is that there are so many online resources. I have always said that the traditional classroom is dying because of the vast amount of resources available on the internet. My favorite Linear Algebra course is the one offered by MIT OpenCourseWare (Prof. Gilbert Strang).
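To make the link between eigenvectors and PCA concrete, here is a minimal sketch in plain Python. It assumes a tiny made-up 2D dataset and uses power iteration (a simple method chosen for illustration, not necessarily what a production library does) to find the dominant eigenvector of the sample covariance matrix, which is the first principal component. The function names and the data are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length data points."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def first_principal_component(rows, iterations=200):
    """Power iteration: repeatedly apply the covariance matrix and renormalize."""
    cov = covariance_matrix(rows)
    v = normalize([1.0] * len(cov))
    for _ in range(iterations):
        v = normalize(mat_vec(cov, v))
    # Rayleigh quotient: the eigenvalue is the variance along direction v.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(cov, v)))
    return eigenvalue, v


# Made-up points lying close to the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
eigenvalue, direction = first_principal_component(points)
print(eigenvalue, direction)
```

Because the points lie near the line y = 2x, the recovered direction should have a slope close to 2, and the eigenvalue is the variance of the data along that direction.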
2. Probability Theory and Statistics:
Machine Learning and Statistics aren’t very different fields. Actually, someone recently defined Machine Learning as ‘doing statistics on a Mac’. Some of the fundamental Statistics and Probability Theory topics needed for ML are Combinatorics, Probability Rules & Axioms, Bayes’ Theorem, Random Variables, Variance and Expectation, Conditional and Joint Distributions, Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian), Moment Generating Functions, Maximum Likelihood Estimation (MLE), Prior and Posterior, Maximum a Posteriori Estimation (MAP) and Sampling Methods.
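As a small illustration of how a few of these ideas fit together, the sketch below compares the MLE and MAP estimates of a coin's bias and applies Bayes' Theorem to a binary test. The Beta(2, 2) prior, the function names and all the numbers are assumptions chosen for the example.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of a Bernoulli parameter."""
    return successes / trials


def bernoulli_map(successes, trials, alpha=2.0, beta=2.0):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + successes, beta + failures); this returns
    its mode, which is valid when both posterior parameters exceed 1.
    """
    return (successes + alpha - 1.0) / (trials + alpha + beta - 2.0)


def bayes_posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(hypothesis | positive result) via Bayes' Theorem."""
    evidence = p_pos_given_true * prior + p_pos_given_false * (1.0 - prior)
    return p_pos_given_true * prior / evidence


# 7 heads in 10 tosses: the prior pulls the MAP estimate towards 0.5.
mle = bernoulli_mle(7, 10)
map_estimate = bernoulli_map(7, 10)

# A condition with 1% prevalence, a 95% true-positive rate
# and a 5% false-positive rate.
posterior = bayes_posterior(0.01, 0.95, 0.05)
print(mle, map_estimate, posterior)
```

With little data the prior matters: the MAP estimate sits between the MLE and the prior's mode of 0.5. The second example shows why a positive result for a rare condition can still leave the posterior probability fairly low.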
3. Multivariate Calculus:
Some of the necessary topics include Differential and Integral Calculus, Partial Derivatives, Vector-Valued Functions, Directional Gradient, Hessian, Jacobian, Laplacian and the Lagrangian.
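One way to build intuition for partial derivatives, the gradient and the Hessian is to approximate them numerically. The following is a rough sketch using central finite differences. The test function f and the step sizes are assumptions for illustration, and in practice automatic differentiation is usually preferred.

```python
def gradient(f, point, h=1e-5):
    """Approximate the gradient of f at point with central differences."""
    grad = []
    for i in range(len(point)):
        forward, backward = list(point), list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def hessian(f, point, h=1e-4):
    """Approximate the matrix of second partial derivatives of f at point."""
    n = len(point)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di
                p[j] += dj
                return f(p)
            result[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4.0 * h * h)
    return result


def f(p):
    """Hypothetical test function: f(x, y) = x^2 + 3xy + 2y^2."""
    x, y = p
    return x * x + 3.0 * x * y + 2.0 * y * y


grad = gradient(f, [1.0, 2.0])   # analytically (2x + 3y, 3x + 4y) = (8, 11)
hess = hessian(f, [1.0, 2.0])    # analytically [[2, 3], [3, 4]]
print(grad, hess)
```

For this quadratic the analytic answers are easy to derive by hand, which makes it a convenient check on the numerical approximations.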
4. Algorithms and Complex Optimizations:
This is important for understanding the computational efficiency and scalability of our Machine Learning Algorithms and for exploiting sparsity in our datasets. Knowledge of data structures (Binary Trees, Hashing, Heap, Stack etc), Dynamic Programming, Randomized & Sublinear Algorithms, Graphs, Gradient/Stochastic Gradient Descent and Primal-Dual methods is needed.
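To illustrate the difference between gradient descent and its stochastic variant, here is a minimal sketch that fits a line y = w*x + b by minimising the mean squared error. The data, learning rates, step counts and function names are all assumptions picked so that this toy example converges. They are not general recommendations.

```python
import random


def batch_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Each step uses the gradient of the mean squared error over all points."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def stochastic_gradient_descent(xs, ys, learning_rate=0.02, steps=5000, seed=0):
    """Each step uses the gradient at one randomly chosen point."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))
        error = w * xs[i] + b - ys[i]
        w -= learning_rate * 2.0 * error * xs[i]
        b -= learning_rate * 2.0 * error
    return w, b


# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

print(batch_gradient_descent(xs, ys))
print(stochastic_gradient_descent(xs, ys))
```

The batch version touches every data point on every step, while the stochastic version touches only one, which is why the latter scales better to large datasets at the cost of noisier updates.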
5. Others:
This comprises other Math topics not covered in the four major areas described above. They include Real and Complex Analysis (Sets and Sequences, Topology, Metric Spaces, Single-Valued and Continuous Functions, Limits), Information Theory (Entropy, Information Gain), Function Spaces and Manifolds.
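Entropy and Information Gain are perhaps the easiest of these to try out directly, since they underpin decision-tree splitting. The sketch below computes Shannon entropy in bits and the gain from a hypothetical split. The labels and the split are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the groups."""
    total = len(labels)
    weighted = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - weighted


# A balanced parent node and a hypothetical split into two mostly pure groups.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

print(entropy(parent))
print(information_gain(parent, [left, right]))
```

A balanced two-class node has the maximum entropy of 1 bit, and a split that produces purer groups yields a positive information gain.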
Some online MOOCs and materials for studying some of the Mathematics topics needed for Machine Learning are:
Khan Academy’s Linear Algebra, Probability & Statistics, Multivariable Calculus and Optimization.
Coding the Matrix: Linear Algebra through Computer Science Applications by Philip Klein, Brown University.
Linear Algebra – Foundations to Frontiers by Robert van de Geijn, University of Texas.
Applications of Linear Algebra, Part 1 and Part 2. A newer course by Tim Chartier, Davidson College.
Joseph Blitzstein – Harvard Stat 110 lectures.
Larry Wasserman’s book – All of Statistics: A Concise Course in Statistical Inference.
Boyd and Vandenberghe’s course on Convex optimisation from Stanford.
Udacity’s Introduction to Statistics.
All answers are good, but I'd like to point out a closely related question: "How much maths should you be willing to learn when getting into DS and ML?" Because you can learn it, but there's a lot, and even if languages and tools later do the maths for you, you still need to understand - especially in statistics - the underlying assumptions to know if you can use a given formula/method, and to understand the outcomes.
So you can get started without a huge maths background, but you should be prepared to learn a lot of maths during this journey.
School math is enough to start. For example, the most popular Coursera ML course, by Andrew Ng, doesn't require any prior experience from the student. Andrew slowly explains and reviews everything you need to know. If you don't know matrices, for example, you will be pointed to additional videos.
What do you mean by "getting into data science and machine learning"?
If you mean starting to (self-) study those things, then your regular school maths should be enough. If you ever need more than that, it will most probably be explained by your tutorial site or book. If it's not introduced, then you should be able to easily look it up. The internet is vast these days :)
See, when I started my technical engineering studies, I came from a regular school with no extra maths. I was taught Higher Mathematics and everything else I needed then and there. At the moment I am studying Applied IT Security and again, I did not learn any extra math beforehand, but everything I need is introduced or I can just look up stuff I forgot over the years.
So, it's really no big deal. Just start learning :)
All of it.
Joking aside, I think the maths is only necessary if you want to have an easier time training, testing, and debugging, but you could probably get by and still learn a lot without an intense maths background. At the very least you'll learn the concepts and might pick up some math knowledge along the way.
There's a neat YouTube channel I recommend checking out called "Siraj Raval" (the creator's name, I assume). He does a lot of quick machine learning examples at a fast pace, so there's not much fluff to get bored with.
Nathaniel Ng
Not a lot, if it's just to get started. I started off with the Johns Hopkins Data Science Specialisation on Coursera, which requires very little math. The course website says: "We also suggest a working knowledge of mathematics up to algebra (neither calculus or linear algebra are required)." The next one, Machine Learning by Andrew Ng / Stanford, required some basics of linear algebra, but provided an optional module for those who did not have such a background. This covered matrices and vectors, matrix multiplication and inverses.
So the math isn't really difficult if you just want to cover the basics. I should point out one caveat, however. The Coursera Machine Learning course is really a watered-down version of the original CS229 from Stanford, where the math is considerably more involved (just take a look at the review notes on linear algebra, probability, convex optimization, etc. to see what they go into). I took a somewhat similar course, Learning from Data by Caltech/edX (see the math from the slides / notes here), and the math got pretty tough, especially in the later part of the course.
Of course, it's possible (and likely) that many parts of data science / machine learning do not require such advanced techniques. In many cases, it may be possible to get away with a "black box" approach - i.e. treat the tool as a black box, and not worry too much about what goes on inside the box. However, out of personal interest, it may be worthwhile to understand a bit more about how such algorithms work. Furthermore, who knows whether at some point in the future you may run into issues with debugging algorithms that require some deeper fundamental knowledge?