Data Mining (Article)

https://editorialia.com/wp-content/uploads/2020/06/data-science-at-the-command-line.jpg

Data Science at the Command Line

Today, data scientists can choose from an overwhelming collection of exciting technologies and programming languages. Python, R, Hadoop, Julia, Pig, Hive, and Spark are but a few examples. You may already have experience in one or more of these. If so, then why should you still care about the command line for doing data science? What does the command line have to offer that these other technologies and programming languages do not?

https://editorialia.com/wp-content/uploads/2020/06/r-packages.jpg

R Packages

Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this book you’ll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first, so start with the basics and improve your package over time. It doesn’t matter if your first version isn’t perfect as long as the next version is better. This is where we are developing the 2nd edition of this book.

https://editorialia.com/wp-content/uploads/2020/06/r-programming-succinctly.jpg

R Programming Succinctly

The R programming language on its own is a powerful tool that can perform thousands of statistical tasks, but by writing programs in R, you gain tremendous power and flexibility to extend its base functionality. Senior Succinctly series author and editor James McCaffrey shows you how in R Programming Succinctly.

https://editorialia.com/wp-content/uploads/2020/06/efficient-r-programming.jpg

Efficient R Programming

There are many excellent R resources for visualization, data science, and package development. Hundreds of scattered vignettes, web pages, and forums explain how to use R in particular domains. But little has been written on how to simply make R work effectively, until now.

https://editorialia.com/wp-content/uploads/2020/06/sql-notes-for-professionals-book.jpg

SQL Notes for Professionals book

This SQL Notes for Professionals book is compiled from Stack Overflow Documentation. (166 pages, published in May 2018)

https://editorialia.com/wp-content/uploads/2020/02/logo-tpot.jpg

TPOT is a Python Automated Machine Learning tool

Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning (AutoML) tool that optimizes machine learning pipelines using genetic programming.

https://editorialia.com/wp-content/uploads/2020/06/python-data-science-handbook-essential-tools-for-working-with-data.jpg

Python Data Science Handbook (Essential Tools for Working with Data)

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.

https://editorialia.com/wp-content/uploads/2020/06/r-for-data-science.jpg

R for Data Science

This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science.

https://editorialia.com/wp-content/uploads/2020/06/r-notes-for-professionals-book.jpg

R Notes for Professionals book

This R Notes for Professionals book is compiled from Stack Overflow Documentation. (475 pages, published in May 2018)

https://editorialia.com/wp-content/uploads/2020/06/select-star-sql-2.jpg

Select Star SQL

This is an interactive book which aims to be the best place on the internet for learning SQL. It is free of charge, free of ads and doesn’t require registration or downloads. It helps you learn by running queries against a real-world dataset to complete projects of consequence. It is not a mere reference page — it conveys a mental model for writing SQL.