Data Science

CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms

CARLA (Counterfactual And Recourse LibrAry) is a Python library for benchmarking counterfactual explanation methods across different data sets and different machine learning models. In summary, our work provides the following contributions: (i) an extensive benchmark of 11 popular counterfactual explanation methods, (ii) a benchmarking framework for research on future counterfactual explanation methods, and (iii) a standardized set of integrated evaluation measures and data sets for transparent and extensive comparisons of these methods. We have open-sourced CARLA and our experimental results on GitHub, making them available as competitive baselines. We welcome contributions from other research groups and practitioners.

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

This paper provides a succinct overview of the emerging theory of overparameterized ML (henceforth abbreviated as TOPML), which explains recent findings, such as the good empirical generalization of overparameterized models, from a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.

Classification based on Topological Data Analysis

Topological Data Analysis (TDA) is an emergent field that aims to discover topological information hidden in a dataset. TDA tools have commonly been used to create filters and topological descriptors that improve Machine Learning (ML) methods. This paper proposes an algorithm that applies TDA directly to multi-class classification problems, including those with imbalanced datasets, without any further ML stage.

Hugging Face datasets

One-line dataloaders for many public datasets, plus efficient data pre-processing.
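As a rough illustration of what "one-line dataloaders" and batched pre-processing can look like, here is a minimal sketch. It assumes the `datasets` package is installed and that network access is available on first use. The dataset name `imdb`, its `text` column, and the helper names `add_length` and `load_and_preprocess` are illustrative choices, not something this listing specifies.

```python
def add_length(batch):
    """Batched map function: receives a dict mapping column name -> list of values
    and returns the new column(s) in the same dict-of-lists shape."""
    return {"n_words": [len(text.split()) for text in batch["text"]]}


def load_and_preprocess(name="imdb", split="train"):
    """Download a public dataset and add a word-count column.

    Assumptions: `pip install datasets`, network access on first call,
    and a dataset that has a "text" column (true for "imdb").
    """
    # Imported lazily so that add_length can be tried without the library.
    from datasets import load_dataset

    dataset = load_dataset(name, split=split)      # the one-line dataloader
    return dataset.map(add_length, batched=True)   # batched pre-processing
```

Calling `load_and_preprocess()` should return the training split with an extra `n_words` column; `add_length` can also be tried on a plain dict such as `{"text": ["Hello World"]}` without downloading anything.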

Data Science: A First Introduction

The book is structured so that learners spend the first four chapters learning how to use the R programming language and Jupyter notebooks to load, wrangle/clean, and visualize data, while answering descriptive and exploratory data analysis questions. The remaining chapters illustrate how to solve four common problems in data science, which are useful for answering predictive and inferential data analysis questions[…]

Bayesian Data Analysis: book & course

This book is intended to have three roles and to serve three associated audiences: an introductory text on Bayesian inference starting from first principles, a graduate text on effective current approaches to Bayesian modeling and computation in statistics and related fields, and a handbook of Bayesian methods in applied statistics for general users of and researchers in applied statistics. Although introductory in its early sections, the book is definitely not elementary in the sense of a first text in statistics.

Tidy Modeling with R

This book provides an introduction to using our software to create models. We focus on a dialect of R called the tidyverse that is designed to be a better interface for common tasks using R. If you’ve never heard of or used the tidyverse, Chapter 2 provides an introduction. In this book, we demonstrate how the tidyverse can be used to produce high-quality models. The tools used to do this are referred to as the tidymodels packages.