Python is a great language for machine learning for several reasons. First, Python has clear syntax. Second, it makes text manipulation extremely easy. Third, a large number of people and organizations use Python, so there is ample development and documentation.
The clear syntax of Python has earned it the name executable pseudo-code. The default install of Python already includes high-level data types such as lists, tuples, dictionaries, sets, and queues, which you don’t have to program yourself. These high-level data types make abstract concepts easy to implement. With Python, you can program in any style you’re familiar with: object-oriented, procedural, functional, and so on.
Python also makes it easy to process and manipulate text, which makes it ideal for processing non-numeric data. You can get by in Python with little to no regular expression usage. There are many libraries for using Python to access web pages, and the intuitive text manipulation makes it easy to extract data from HTML.
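As a rough illustration of both points, the sketch below counts words in a sentence using only built-in string methods, a dictionary, and a set, with no regular expressions. The function name word_counts and the sample sentence are made up for this example.

```python
import string

def word_counts(text):
    """Count words using plain string methods (no regular expressions)."""
    counts = {}
    for token in text.lower().split():
        # Drop punctuation stuck to the start or end of the token.
        word = token.strip(string.punctuation)
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts

# Hypothetical sample text, chosen only for illustration.
sample = "Python is clear. Python is concise, and Python is popular!"
counts = word_counts(sample)

# A set built from the dictionary keys gives the unique words.
vocabulary = set(counts)
```

The dictionary maps each word to its frequency, and the set holds the vocabulary; neither structure had to be written by hand.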
Python is popular
Python is popular, so lots of examples are available, which makes learning it fast. Popularity also means that there are lots of modules available for many applications.
Python is popular in the scientific and financial communities as well. Several scientific libraries such as SciPy and NumPy allow you to do vector and matrix operations. This makes the code even more readable and allows you to write code that looks like linear algebra. Also, the core routines of SciPy and NumPy are written in lower-level compiled languages (C and Fortran); this makes doing computations with these tools much faster. The scientific tools in Python work well with a plotting tool called Matplotlib. Matplotlib can plot in 2D and 3D and can handle most types of plots commonly used in the scientific world. Python also has an interactive shell, which allows you to view and inspect elements of the program as you’re developing it. A new module for Python, called Pylab, seeks to combine NumPy, SciPy, and Matplotlib into one environment and installation. At the time of writing, this isn’t yet done but shows great promise for the future.
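To give a feel for code that "looks like linear algebra," here is a minimal sketch assuming NumPy is installed and a Python version that supports the @ matrix-multiplication operator. The matrix A and vector b are arbitrary values chosen for illustration.

```python
import numpy as np

# Arbitrary example data: a 2x2 matrix and a length-2 vector.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Matrix-vector product, written much as it would be on paper.
y = A @ b

# Solve the linear system A x = b for x.
x = np.linalg.solve(A, b)

# The residual should be (numerically) zero if x solves the system.
residual = A @ x - b
```

No explicit loops are needed; the arithmetic over the array elements runs inside NumPy's compiled routines.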
What Python has that other languages don’t have
High-level languages such as MATLAB and Mathematica also allow you to do matrix math. MATLAB has several built-in features that make machine learning easier. MATLAB is also very fast. The problem with MATLAB is that using it legally will cost you a few thousand dollars. There are third-party add-ons to MATLAB but nothing on the scale of an open-source project.
There are matrix math libraries for lower-level languages such as Java and C. The problem with these languages is that it takes a lot of code to get simple things done. First, you have to typecast variables, and then with Java, it seems that you have to write setters and getters every time you sneeze. Don’t forget subclassing: when you subclass, you often have to implement methods even if you aren’t going to use them. At the end of the day, you have written a lot of code, sometimes tedious code, to do simple things. This isn’t the case with Python. Python is clean, concise, and easy to read. Python is easy for non-programmers to pick up. Java and C aren’t so easy to pick up and are much less concise than Python.
The only real drawback of Python is that it’s not as fast as Java or C. You can, however, call C-compiled programs from Python. This gives you the best of both worlds and allows you to incrementally develop a program. If you experiment with an idea in Python and decide it’s something you want to pursue in a production system, it will be easy to make that transition. If the program is built in a modular fashion, you could first get it up and running in Python and then, to improve speed, start rewriting portions of the code in C. Boost.Python, part of the Boost C++ libraries, makes this easy to do. Other tools also offer performance gains over regular Python: Cython lets you write typed versions of Python that compile to C, and PyPy runs ordinary Python code with a just-in-time compiler.
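One common way to structure this incremental path is sketched below: the program tries to import a compiled version of a hot function and falls back to a pure-Python version if none exists. The module name _fastdist is hypothetical and stands in for an extension you might later build with C, Cython, or Boost.Python; the functions euclidean and nearest are likewise invented for illustration.

```python
import math

try:
    # Hypothetical compiled extension; the name _fastdist is an assumption
    # made for this sketch and does not refer to a real package.
    from _fastdist import euclidean
except ImportError:
    def euclidean(a, b):
        """Pure-Python reference implementation of Euclidean distance."""
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, candidates):
    """Return the candidate closest to point, using whichever euclidean is loaded."""
    return min(candidates, key=lambda c: euclidean(point, c))

closest = nearest((0, 0), [(3, 4), (1, 1), (5, 5)])
```

Because the rest of the program only calls euclidean, the slow part can be replaced later without touching the calling code.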
If an idea for a program or application is flawed, then it will be flawed at low speed as well as high speed. If an idea is a bad idea, writing code to make it fast or scale to a large number of users doesn’t change anything. This is what makes Python so beautiful: you can quickly see an idea in action and then optimize it if needed.
Overall, Python is a higher-level language; this allows you to spend more time making sense of data and less time concerned with how a machine approximates the data. Python allows you to express yourself with little effort.