Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP

Recommendation number: 27032022p1

Please thank the authors

Thank you very much for this work to @sjmielke, @zaidalyafeai, @esalesk et al., via @States_AI_IA #nlp #tokenization #modeling #machinelearning #models #ai #artificialintelligence #thebibleai #openscience #openaccess #thanks

Paper sheet
Internal ID: 27032022p1
Author/s: Sabrina J. Mielke, Zaid Alyafeai, Elizabeth Salesky, Colin Raffel, Manan Dey, Matthias Gallé, Arun Raja, Chenglei Si, Wilson Y. Lee, Benoît Sagot, Samson Tan
Title: Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Publication date: 20 Dec 2021
Access to the paper page is provided by the Publisher (or the author). Keep in mind that the policy of the Publisher or the author may change. You must observe and comply with the terms of use set by the publication.
Warning 1
The images used are limited to those needed to identify the publication (paper) and its authors unambiguously, to avoid mistakes in the educational review and recommendation.
Other images can be used only if their owners have given permission, or if the type of publication (its license) allows it.
Warning 2
We do not (in general) provide hypertext links to any website. We do this to avoid suggesting that we have any commercial relationship with, or interest in, the recommended work. There are no conflicts of interest between the recommender and the recommended. We also do this to comply with Intellectual Property (IP) and copyright law, to make clear to the user that all decisions regarding the use of the material are always personal, and to ensure private use without commercial purposes.
Accordingly, we bear no responsibility for the use that a specific user may make of the licensed material.
Warning 3
A particular policy
If an author or a Publisher considers that we must rectify, delete or modify elements in this recommendation, please let us know through the contact form. Our express interest is always to protect the author and the Publisher from piracy, which we oppose entirely.
Warning 4
For the author and the publisher
If you notice any incident (or related third-party infringement) concerning the terms of use of this paper review, please let us know through the contact form. (If one occurs, we will suspend the review of this paper as a precaution.)
Warning 5
Precautionary suspension