In this survey, we connect several lines of work from the pre-neural and neural era, by showing how hybrid approaches of words and characters as well as subword-based approaches based on learned segmentation have been proposed and evaluated. We conclude that there is and likely will never be a silver bullet singular solution for all applications and that thinking seriously about tokenization remains important for many applications
Natural Language Processing (Article)
This paper introduces TextAttack, a Python framework for adversarial attacks, data augmentation, and adversarial training in NLP. TextAttack builds attacks from four components: a goal function, a set of constraints, a transformation, and a search method. TextAttack’s modular design enables researchers to easily construct attacks from combinations of novel and existing components. TextAttack provides implementations of 16 adversarial attacks from the literature and supports a variety of models and datasets, including BERT and other transformers, and all GLUE tasks.
This repository contains examples and best practices for building NLP systems, provided as Jupyter notebooks and utility functions. The focus of the repository is on state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on problems involving text and language
El diseño del kit de herramientas permite trabajar en paralelo entre más de 70 idiomas, utilizando el formalismo de Dependencias Universales. Stanza está construido con componentes de red neuronal de alta precisión, que también permiten una capacitación y evaluación eficientes con sus propios datos anotados.
AI assistants represent a significant frontier for development. But the complexities of such systems pose a significant barrier for developers. In Natural Language Processing Succinctly, author Joseph Booth will guide readers through designing a simple system that can interpret and provide reasonable responses to written English text. With this foundation, readers will be prepared to tackle the greater challenges of natural language development. (Syncfusion).