Main developments and technology trends in data science, machine learning and artificial intelligence


The following information speaks for itself


https://estadosia.files.wordpress.com/2020/02/image-7.png
The Top Programming Languages, Source: spectrum.ieee.org/static/interactive-the-top-programming-languages-2019


Author: Juan Antonio Lloret Egea | Member of the European Alliance for AI | https://orcid.org/0000-0002-6634-3351 | © 2020. License of use and distribution: Creative Commons CC BY 4.0. | Written: 29/02/2020. Updated: 29/02/2020.


To reinforce this idea, people with responsibilities in this area, such as Jean Francois Puget (AI/ML IBM Dept.) in his article “The Most Popular Language For Machine Learning Is…”, point to Python as the most popular language for ML. (2)

Additionally, the trend search results on Indeed.com show the same picture with regard to jobs and salaries.

[Reference: indeed.com/q-Python-jobs.html].


Python’s most notable points are:

  • A great library ecosystem (Scikit-learn, Pandas, Matplotlib, NLTK, Scikit-image, PyBrain, Caffe, StatsModels, TensorFlow, Keras, etc.).
  • A low entry barrier, flexibility, platform independence, readability, good visualization options, good community support and growing popularity. (1)

We now introduce the paper “Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence”, by Sebastian Raschka (University of Wisconsin-Madison) et al. (3)


Abstract

« Smarter applications are making better use of the insights gleaned from data, having an impact on every industry and research discipline. At the core of this revolution lies the tools and the methods that are driving it, from processing the massive piles of data generated each day to learning from and taking useful action. Deep neural networks, along with advancements in classical ML and scalable general-purpose GPU computing, have become critical components of artificial intelligence, enabling many of these astounding breakthroughs and lowering the barrier to adoption. Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs. This survey offers insight into the field of machine learning with Python, taking a tour through important topics to identify some of the core hardware and software paradigms that have enabled it. We cover widely-used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward».


Conclusions

«This article reviewed some of the most notable advances in machine learning, data science, and scientific computing. It provided a brief background into major topics, while investigating the various challenges and current state of solutions for each. There are several more specialized application and research areas that are outside the scope of this article. For example, attention-based Transformer architectures, along with specialized tools, have recently begun to dominate the natural language processing subfield of deep learning [vaswani2017attention; radford2019language]».

» Deep learning for graphical data has become a growing area of interest, with graph convolutional neural networks now being actively applied in computational biology for modeling molecular structures [raschka2020machine]. Popular libraries in this area include the TensorFlow-based Graph Nets [battaglia2018relational] library and PyTorch Geometric [fey2019fast]. Time series analysis, which was notoriously neglected in Python, has seen renewed interest in the form of the scalable StumPy library [law2019stumpy]. Another neglected area, frequent pattern mining, received some attention with Pandas-compatible Python implementations in MLxtend [raschka2018mlxtend]. UMAP [McInnes2018], a new Scikit-learn-compatible feature extraction library has been widely adopted for visualizing high-dimensional datasets on two-dimensional manifolds. To improve the computational efficiency on large datasets, a GPU-based version of UMAP is included in RAPIDS».
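As an aside to the quoted text, the idea behind frequent pattern mining can be sketched in a few lines of plain Python. This is only an illustrative brute-force support count, not the Apriori-style algorithms that libraries such as MLxtend provide; the function name `frequent_itemsets`, its parameters and the toy shopping baskets are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force sketch: every combination of up to `max_size` items in each
    transaction is counted, which only scales to tiny datasets.
    """
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates inside one basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: five shopping baskets.
transactions = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "bread"],
    ["eggs"],
]

result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Here the support of an itemset is simply the fraction of transactions that contain it; dedicated libraries prune the search space instead of enumerating every combination.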

» Recent years have also seen an increased interest in probabilistic programming, Bayesian inference and statistical modeling in Python. Notable software in this area includes the PyStan wrapper of STAN [Carpenter2016], the Theano-based PyMC3 [Salvatier2016] library, the TensorFlow-based Edward [tran2016edward] library, and Pomegranate [schreiber2017pomegranate], which features a user-friendly Scikit-learn-like API. As a lower-level library for implementing probabilistic models in deep learning and AI research, Pyro [Bingham2019] provides a probabilistic programming API that is built on top of PyTorch. NumPyro [phan2019composable] provides a NumPy backend for Pyro, using JAX to JIT-compile and optimize execution of NumPy operations on both CPUs and GPUs».
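To give a feel for what Bayesian inference means in practice, here is a minimal sketch of the simplest textbook case: updating a Beta prior on a success probability after observing binomial data. It uses only plain Python and does not reproduce the API of any of the libraries named above; the function names and the observed counts are hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior -> Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1) on the unknown success probability.
prior = (1.0, 1.0)

# Hypothetical observations: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)
print("posterior mean:", beta_mean(*posterior))
```

This case has a closed-form answer. Probabilistic programming libraries are mainly useful for models where no such shortcut exists and the posterior has to be approximated, for example by sampling.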

» Reinforcement learning (RL) is a research area that trains agents to solve complex and challenging problems. Since RL algorithms are based on a trial-and-error approach for maximizing long-term rewards, RL is a particularly resource-demanding area of machine learning. Furthermore, since the tasks RL aims to solve are particularly challenging, RL is difficult to scale – learning a series of steps to play board or video games, or training a robot to navigate through a complex environment, is an inherently more complex task than recognizing an object in an image. Deep Q-networks, which are a combination of the Q-learning algorithm and deep learning, have been at the forefront of recent advances in RL, which includes beating the world champion playing the board game Go [silver2017mastering] and competing with top-ranked StarCraft II players [vinyals2019grandmaster]. Since modern RL is largely deep learning-based, most implementations utilize one of the popular deep learning libraries discussed in Section 5, such as PyTorch or TensorFlow. We expect to see more astonishing breakthroughs enabled by RL in the upcoming years. Also, we hope that algorithms used for training agents to play board or video games can be used in important research areas like protein folding, which is a possibility currently explored by DeepMind [quach2018deepmind]. Being a language that is easy to learn and use, Python has evolved into a lingua franca in many research and application areas that we highlighted in this review. Enabled by advancements in CPU and GPU computing, as well as ever-growing user communities and ecosystems of libraries, we expect Python to stay the dominant language for scientific computing for many years to come».
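The trial-and-error idea behind Q-learning, mentioned in the quoted conclusions, can be illustrated with a minimal tabular sketch in plain Python. This is not a deep Q-network and is not taken from the paper: the five-state corridor environment, the hyperparameters and all names are assumptions chosen for illustration. The agent explores purely at random, which is acceptable here because Q-learning is an off-policy method.

```python
import random

N_STATES = 5       # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus the discounted value
            # of the best action in the next state (zero if terminal).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action per non-terminal state, read off the learned table.
greedy_policy = [
    max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)
]
print(greedy_policy)
```

A deep Q-network replaces the table `q` with a neural network that estimates the same values, which is what makes the approach usable when the number of states is too large to enumerate.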


*Free training books on Python can be found in our ‘eOpenAccess’ library.


Bibliography

  • (1).- 8 Reasons Why Python is Good for Artificial Intelligence and Machine Learning. Luashchuk, A. Django Stars. Retrieved 29-02-2020 from djangostars.com/blog/why-python-is-good-for-artificial-intelligence-and-machine-learning/.
  • (2).- The Most Popular Language For Machine Learning Is…. (Dec, 2016). Puget, J. F. Retrieved 29-02-2020 from ibm.com/developerworks/community/blogs/jfp/entry/What_Language_Is_Best_For_Machine_Learning_And_Data_Science?lang=en.
  • (3).- Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence. (2020). Raschka, S., Patterson, J., Nolet, C. arXiv:2002.04803 [cs.LG].