Weekly Research Newsletter - KIIT - Issue #17

By Priyansi, Junaid Rahim and Biswaroop Bhattacharjee
2nd June 2021
We are excited to share this week’s picks for the research newsletter. We hope you’ll enjoy reading them over the weekend.

Reward is enough
By David Silver, Satinder Singh, Doina Precup, Richard S. Sutton
In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.
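If you want a concrete feel for the trial-and-error reward maximisation the authors have in mind, the sketch below runs tabular Q-learning on a toy chain environment. The environment, hyperparameters and variable names are our own illustrative assumptions and are not taken from the paper.

```python
# Illustrative only: a minimal trial-and-error reward-maximising agent
# (tabular Q-learning) on a toy 5-state chain. Reward is given only at the goal.
import random

N_STATES, GOAL = 5, 4                        # states 0..4, goal at the right end
q = [[0.0, 0.0] for _ in range(N_STATES)]    # Q[s][a], actions: 0 = left, 1 = right
alpha, gamma, eps = 0.1, 0.9, 0.1            # learning rate, discount, exploration

for _ in range(500):                         # episodes of pure trial and error
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: q[s][x])
        s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: move Q[s][a] toward reward + discounted best future value
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next

print([round(max(v), 2) for v in q])         # learned values increase toward the goal
```

Nothing here is specific to the paper; it simply shows the kind of reward-driven learning loop the hypothesis is about, where behaviour emerges solely from maximising a scalar reward.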
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
By Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler
State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. To this end, we introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters in a data-driven fashion. Concretely, GBST enumerates candidate subword blocks and learns to score them in a position-wise fashion using a block scoring network. We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level. Via extensive experiments on English GLUE, multilingual, and noisy text datasets, we show that Charformer outperforms a series of competitive byte-level baselines while generally performing on par and sometimes outperforming subword-based models. Additionally, Charformer is fast, improving the speed of both vanilla byte-level and subword-level Transformers by 28%-100% while maintaining competitive quality. We believe this work paves the way for highly performant token-free models that are trained completely end-to-end.
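For a rough intuition of what a soft, gradient-based subword tokenizer might look like, here is a minimal NumPy sketch of the enumerate-score-and-mix idea the abstract describes: candidate blocks are mean-pooled from character embeddings, scored position-wise, and combined with a softmax. The pooling scheme, the stand-in scoring vector and all names are our own simplifying assumptions, not the authors' implementation.

```python
# A rough sketch of the GBST idea from the Charformer abstract (not the authors' code).
import numpy as np

def gbst_sketch(char_embs: np.ndarray, block_sizes=(1, 2, 3, 4), rng=None) -> np.ndarray:
    """Soft subword tokenization, roughly:
    1. form candidate subword blocks by mean-pooling character embeddings,
    2. score each candidate per position with a (here random) scoring vector,
    3. softmax over block sizes and mix the candidates position-wise.
    char_embs: (seq_len, dim) character/byte embeddings -> (seq_len, dim) latent subwords.
    """
    rng = rng or np.random.default_rng(0)
    seq_len, dim = char_embs.shape
    w_score = rng.normal(scale=dim ** -0.5, size=(dim,))     # stand-in block scoring network

    candidates, scores = [], []
    for b in block_sizes:
        # mean-pool non-overlapping blocks of size b, then upsample back to seq_len
        pad = (-seq_len) % b
        x = np.pad(char_embs, ((0, pad), (0, 0)))
        pooled = x.reshape(-1, b, dim).mean(axis=1)           # (ceil(seq_len/b), dim)
        upsampled = np.repeat(pooled, b, axis=0)[:seq_len]    # (seq_len, dim)
        candidates.append(upsampled)
        scores.append(upsampled @ w_score)                    # position-wise block score

    scores = np.stack(scores, axis=-1)                        # (seq_len, n_block_sizes)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                 # softmax over block sizes

    return sum(w[:, None] * c for w, c in zip(weights.T, candidates))

# Example: 16 byte embeddings of dimension 8
out = gbst_sketch(np.random.default_rng(1).normal(size=(16, 8)))
print(out.shape)  # (16, 8)
```

In Charformer itself the scoring network is learned end-to-end with the rest of the byte-level Transformer; the random vector above only stands in for that learned component.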

An opt-in weekly newsletter for undergraduate research enthusiasts at KIIT. We intend to share interesting research articles and start conversations about the latest ideas in artificial intelligence, computer science and mathematics.

Every Friday, all subscribers receive a few research articles straight to their inbox. The papers are usually a mix of that week's popular research articles, review articles and seminal papers in the fields mentioned above.
