Your Daily AI Research tl;dr — 2022-07-06 🧠
A Racist AI, a 256GB Open-Source Legal Dataset, and a 200-Language Translation Model.
Welcome to your official daily AI research tl;dr (often with code and news) for AI enthusiasts. Each day I share the most exciting papers I find, along with a one-line summary to help you quickly decide whether the paper (and its code) is worth investigating. I will also use this space to share exciting news from the field.
Let’s get started with this iteration!
1️⃣ No Language Left Behind: Scaling Human-Centered Machine Translation
Translating 200 languages with a Transformer-based model.
Link to the paper: https://research.facebook.com/publications/no-language-left-behind/
My video on the model: https://youtu.be/2G4NeG17Eis
2️⃣ Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
Pile of Law is a ∼256GB (and growing) dataset of open-source English-language legal and administrative data, covering court opinions, contracts, administrative rules, and legislative records.