Your Daily AI Research tl;dr: 2022-10-13 🧠

The largest public dataset of real-world 3D spaces with dense semantic annotations, a benchmark for multimodal models, and more!

2 min read · Oct 13, 2022

Welcome to your official daily AI research tl;dr (often with code and news) for AI professionals. I share the most exciting papers I find each day, along with a one-liner summary to help you quickly decide whether the paper (and its code) is worth investigating.

1️⃣ f-DM: A Multi-Stage Diffusion Model via Progressive Signal Transformation

The authors extend diffusion models to incorporate a set of (hand-designed or learned) transformations, where the transformed input is the mean of each diffusion step, allowing the model to learn more abstract representations.

Link to the paper: https://arxiv.org/pdf/2210.04955.pdf
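
To make the one-liner above concrete, here is a minimal sketch, not the authors' code. It assumes the noisy sample at a given step is drawn as z = alpha * f(x) + sigma * noise, so that the transformed input f(x) sets the mean of the step. The function names, the average-pooling transformation, and the alpha and sigma values are all illustrative assumptions, not taken from the paper.

```python
import random


def downsample(signal, factor):
    """Hypothetical hand-designed transformation: average-pool a 1D signal."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]


def noisy_sample(signal, alpha, sigma, transform, rng):
    """Draw z ~ N(alpha * transform(signal), sigma^2 I).

    Assumption: the transformed input, scaled by alpha, is the mean of the step.
    """
    mean = [alpha * value for value in transform(signal)]
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


rng = random.Random(0)
x = [1.0, 3.0, 5.0, 7.0]

# A standard diffusion step is the special case where the transformation
# is the identity: the mean is just the scaled input.
z_standard = noisy_sample(x, alpha=0.8, sigma=0.6,
                          transform=lambda s: s, rng=rng)

# With a downsampling transformation, the step is centered on a coarser,
# more abstract version of the input (here half the length).
z_coarse = noisy_sample(x, alpha=0.8, sigma=0.6,
                        transform=lambda s: downsample(s, 2), rng=rng)

print(len(z_standard), len(z_coarse))
```

In this sketch, swapping the `transform` argument is the only change between an ordinary noising step and a transformed one, which is roughly the degree of freedom the summary describes.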

2️⃣ [deepmind] Perception Test: A Diagnostic Benchmark for Multimodal Models

The Perception Test is a novel multimodal benchmark that aims to extensively evaluate the perception and reasoning skills of multimodal models. It uses real-world videos designed to show perceptually interesting situations and defines multiple tasks that…

Written by Louis-François Bouchard
