Los resultados de este estudio son una evidencia del sesgo cultural que existe entre la lengua inglesa y la española en la ciencia de datos. De los 23.771 conjuntos de datos que se encontraron con fecha de consulta 12/04/2021, tan solo 10 se encontraban en castellano
Topological Data Analysis (TDA) is an emergent field that aims to discover topological information hidden in a dataset. TDA tools have been commonly used to create filters and topological descriptors to improve Machine Learning (ML) methods. This paper proposes an algorithm that applies TDA directly to multi-class classification problems, even imbalanced datasets, without any further ML stage
“In this book, we will cover the most common types of ML, but from a probabilistic perspective. Roughly speaking, this means that we treat all unknown quantities (e.g., predictions about the future value of some quantity of interest, such as tomorrow’s temperature, or the parameters of some model) as random variables, that are endowed with probability distributions which describe a weighted set of possible values the variable may have.[…].”.
Documentation is key – design decisions in AI development must be documented in detail, potentially taking inspiration from the field of risk management. There is a need to develop a framework for large-scale testing of AI effects, beginning with public tests of AI systems, and moving towards real-time validation and monitoring. Governance frameworks for decisions in AI development need to be clarified, including the questions of post-market surveillance of product or system performance. Certification of AI ethics expertise would be helpful to support professionalism in AI development teams. Distributed responsibility should be a goal, resulting in a clear definition of roles and responsibilities as well as clear incentive structures for taking in to account broader ethical concerns in the development of AI systems. Spaces for discussion of ethics are lacking and very necessary both internally in companies and externally, provided by independent organisations. Looking to policy ensuring whistleblower protection and ombudsman position within companies, as well as participation from professional organisations. One solution is to look to the existing EU RRI framework and to ensure multidisciplinarity in AI system development team composition. The RRI framework can provide systematic processes for engagement with stakeholders and ensuring that problems are better defined. The challenges of AI systems point to a general lack in engineering education. We need to ensure that technical disciplines are empowered to identify ethical problems, which requires broadening technical education programs to include societal concerns. Engineers advocate for public transparency of adherence to standards and ethical principles for AI-driven products and services to enable learning from each other’s mistakes and to foster a no-blame culture.
The book is structured so that learners spend the first four chapters learning how to use the R programming language and Jupyter notebooks to load, wrangle/clean, and visualize data, while answering descriptive and exploratory data analysis questions. The remaining chapters illustrate how to solve four common problems in data science, which are useful for answering predictive and inferential data analysis questions[…]