Publications by profiles

Data Science Profile

Introduction to Datascience: Learn Julia Programming, Math & Datascience from Scratch

I was emboldened to write this book after my video series, Data Science With Julia, got some traction, and all the more so after a tweet about decision trees was liked by Julia Language itself. So I thought, why not give it more?

30 January 2022 / 8 January 2022

Data as the main focus of “State of the art of data science in Spanish language and its application in the field of Artificial Intelligence”

According to the results, there is evidence of a cultural bias in data science with respect to the Spanish language. The query, carried out on 12 April 2021, confirms that only 10 out of 23,771 datasets “speak” Spanish.

15 August 2021 / 2 August 2021

Los datos como eje principal en el “Estado del arte de la ciencia de datos en el idioma español y su aplicación en el campo de la Inteligencia Artificial”

The results of this study are evidence of the cultural bias that exists between English and Spanish in data science. Of the 23,771 datasets found as of the query date, 12 April 2021, only 10 were in Spanish.

13 June 2021 / 8 August 2021

State of the art of data science in Spanish language and its application in the field of AI

The study yields results that indicate a lack of involvement of the Spanish language in AI and all of its subareas, which adversely affects the education of future professionals.

2 May 2021 / 8 August 2021

From Zero to Research Scientist full resources guide

This guide is designed for anybody with basic programming knowledge or a computer science background who is interested in becoming a Research Scientist with a focus on Deep Learning and NLP.

18 April 2021 / 1 April 2021

El estado del arte de la ciencia de datos en el idioma español y su aplicación en el campo de la Inteligencia Artificial

The study yields results that indicate a lack of involvement of Spanish in AI and all of its subareas, negatively affecting the training of future professionals.

30 March 2021 / 31 March 2021

Classification based on Topological Data Analysis

Topological Data Analysis (TDA) is an emergent field that aims to discover topological information hidden in a dataset. TDA tools have been commonly used to create filters and topological descriptors to improve Machine Learning (ML) methods. This paper proposes an algorithm that applies TDA directly to multi-class classification problems, even imbalanced datasets, without any further ML stage

28 February 2021 / 1 April 2021

Hugging Face datasets

One-line dataloaders for many public datasets & Efficient data pre-processing

19 February 2021 / 19 February 2021

Data Science: A First Introduction

The book is structured so that learners spend the first four chapters learning how to use the R programming language and Jupyter notebooks to load, wrangle/clean, and visualize data, while answering descriptive and exploratory data analysis questions. The remaining chapters illustrate how to solve four common problems in data science, which are useful for answering predictive and inferential data analysis questions[…]

26 December 2020 / 26 December 2020

Bayesian Data Analysis: book & course

This book is intended to have three roles and to serve three associated audiences: an introductory text on Bayesian inference starting from first principles, a graduate text on effective current approaches to Bayesian modeling and computation in statistics and related fields, and a handbook of Bayesian methods in applied statistics for general users of and researchers in applied statistics. Although introductory in its early sections, the book is definitely not elementary in the sense of a first text in statistics

13 November 2020 / 13 November 2020

Tidy Modeling with R

This book provides an introduction to how to use our software to create models. We focus on a dialect of R called the tidyverse that is designed to be a better interface for common tasks using R. If you’ve never heard of or used the tidyverse, Chapter 2 provides an introduction. In this book, we demonstrate how the tidyverse can be used to produce high quality models. The tools used to do this are referred to as the tidymodels packages

24 September 2020

Machine Learning from scratch (by Danny Friedman)

This book covers the building blocks of the most common methods in machine learning. This set of methods is like a toolbox for machine learning engineers. Those entering the field of machine learning should feel comfortable with this toolbox so they have the right tool for a variety of tasks.

12 September 2020

IEEE Use Case–Criteria for Addressing Ethical Challenges in Transparency, Accountability, and Privacy of CTA/CTT

There are substantial public health benefits gained through successfully alerting individuals and relevant public health institutions of a person’s exposure to a communicable disease. Contact tracing techniques have been applied to epidemiology for centuries, traditionally involving a manual process of interview and follow-up. This is time-consuming, difficult, and dangerous work. Manual processes are also open to incomplete information because they rely on individuals being willing and able to remember and report all contact possibilities.

24 August 2020

Mastering Shiny

This book complements Shiny’s online documentation and is intended to help app authors develop a deeper understanding of Shiny. After reading this book, you’ll be able to write apps that have more customized UI, more maintainable code, and better performance and scalability.

16 July 2020

The Art of Machine Learning (Algorithms + Data + R)

I wrote this book because:

• ML is not a recipe. It is not a matter of knowing the syntax and mechanics of various software packages.
• ML is an art, not a science. (Hence the title of this book.)
• One does not have to be a math whiz or know advanced math in order to use ML effectively, but one does need to understand the concepts well: the Why? and How? of ML methods.

15 June 2020

Best Practices in Dataviz: An R Perspective

By the end of this you will have had a whirlwind tour of the very tip of the data visualization best-practices iceberg. We will go over a broad range of topics generally applicable to data science use cases but not dive too deep into any single one. One thing to keep in mind throughout is that none of this is set in stone; in the real world you often have to bend or break some of these rules to do what you want.

10 June 2020

Data Structures Succinctly Part 2

Data Structures Succinctly Part 2 is your concise guide to skip lists, hash tables, heaps, priority queues, AVL trees, and B-trees. As with the first book, you’ll learn how the structures behave, how to interact with them, and their performance limitations. Starting with skip lists and hash tables, and then moving to complex AVL trees and B-trees, author Robert Horvick explains what each structure’s methods and classes are, the algorithms behind them, and what is necessary to keep them valid.

22 April 2020

Data Structures Succinctly Part 1

Data Structures Succinctly Part 1 is your first step to a better understanding of the different types of data structures, how they behave, and how to interact with them. Starting with simple linked lists and arrays, and then moving to more complex structures like binary search trees and sets, author Robert Horvick explains what each structure’s methods and classes are and the algorithms behind them. Horvick goes a step further to detail their operational and resource complexity, ensuring that you have a clear understanding of what using a specific data structure entails.

19 April 2020

MySQL® Notes for Professionals

The MySQL® Notes for Professionals book is compiled from Stack Overflow Documentation (187 pages, published in May 2018).

16 April 2020

Fundamentals of Data Visualization

Guide to making visualizations that accurately reflect the data, tell a story, and look professional. It has grown out of my experience of working with students and postdocs in my laboratory on thousands of data visualizations.

6 April 2020

Advanced R

Advanced R helps you understand how R works at a fundamental level. It is designed for R programmers who want to deepen their understanding of the language, and programmers experienced in other languages who want to understand what makes R different and special. This book will teach you the foundations of R; three fundamental programming paradigms (functional, object-oriented, and metaprogramming); and powerful techniques for debugging and optimising your code.

3 April 2020

IPython Interactive Computing and Visualization Cookbook

IPython Interactive Computing and Visualization Cookbook, Second Edition contains many ready-to-use, focused recipes for high-performance scientific computing and data analysis, from the latest IPython/Jupyter features to the most advanced tricks, to help you write better and faster code.

2 April 2020

Text Mining with R (A Tidy Approach)

If you work in analytics or data science, like we do, you are familiar with the fact that data is being generated all the time at ever faster rates. (You may even be a little weary of people pontificating about this fact.) Analysts are often trained to handle tabular or rectangular data that is mostly numeric, but much of the data proliferating today is unstructured and text-heavy. Many of us who work in analytical fields are not trained in even simple interpretation of natural language.

We developed the tidytext (Silge and Robinson 2016) R package because we were familiar with many methods for data wrangling and visualization, but couldn’t easily apply these same methods to text.

30 March 2020

Data Science at the Command Line

Today, data scientists can choose from an overwhelming collection of exciting technologies and programming languages. Python, R, Hadoop, Julia, Pig, Hive, and Spark are but a few examples. You may already have experience in one or more of these. If so, then why should you still care about the command line for doing data science? What does the command line have to offer that these other technologies and programming languages do not?

27 March 2020

R Packages

Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this book you’ll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first. So start with the basics and improve it over time. It doesn’t matter if your first version isn’t perfect as long as the next version is better. This is where we are developing the 2nd edition of this book.

25 March 2020
