
I feel behind on what's going on with deepfakes and generative art, so it was interesting to see a quick presentation looking back at the progress since the first GANs. It was presented by Shafik Quoraishee from the NYT.

❧

It all started in 2014 with Goodfellow's GAN paper.

Two NNs play a game: one learns to generate an image, and the other judges whether it's any good. The first is the generator, the second is the discriminator.
The training objective amounts to reducing the Jensen-Shannon divergence (JSD) between the distribution of generated images and the distribution of real ones.
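
To get a feel for what JSD measures, here is a minimal sketch for two discrete distributions. The two toy histograms are made up for illustration, and a real GAN never computes this number directly: it only falls out of the discriminator's loss.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two equal-length histograms.
    Assumes q[i] > 0 wherever p[i] > 0 (true for the midpoint used below)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

real = [0.5, 0.5, 0.0, 0.0]  # toy "real data" histogram (made up)
fake = [0.0, 0.0, 0.5, 0.5]  # toy "generator" histogram (made up)

print(js_divergence(real, real))  # identical distributions give zero
print(js_divergence(real, fake))  # disjoint distributions give the maximum, ln 2
```

JSD is symmetric and bounded by ln 2, so once the two distributions stop overlapping it cannot tell "slightly off" from "completely off".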

Improvement 1: 2017, Wasserstein GAN - improved GAN for face generation (paper)

replaces JSD with the Wasserstein (earth mover's) distance, which is continuous and differentiable almost everywhere, making it easier to optimize
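
A small sketch of why that matters, assuming toy 1-D histograms rather than images (the `point_mass` helper and bin count are hypothetical, just for illustration). When the two distributions do not overlap, JSD stays stuck at its maximum, while the Wasserstein distance keeps growing with how far apart they are, so it still tells the generator which way to move.

```python
import math
from itertools import accumulate

def wasserstein_1d(p, q):
    """Earth mover's distance between two histograms on the same unit-spaced bins.
    In 1-D it equals the summed absolute difference of the two CDFs."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence in nats, for comparison."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def point_mass(position, bins=8):
    """Hypothetical helper: all probability sits in a single bin."""
    return [1.0 if i == position else 0.0 for i in range(bins)]

real = point_mass(0)
for shift in (1, 3, 6):
    fake = point_mass(shift)
    # JSD is the same for every shift; Wasserstein equals the shift itself.
    print(shift, round(js_divergence(real, fake), 4), wasserstein_1d(real, fake))
```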

Improvement 2: 2018 ProGAN by NVIDIA (paper)

Generation is staged by resolution
First, a GAN is trained at low resolution, then its result is fed to a higher-resolution stage, and so on up to the full image size
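
A rough sketch of the hand-off between resolutions, using nested lists as tiny grayscale images. The blending step follows the "fade-in" idea from the ProGAN paper, where the new higher-resolution output is mixed in gradually; the function names and the `alpha` parameter are my own labels, not the paper's code.

```python
def upsample_2x(image):
    """Nearest-neighbour upsampling: every pixel becomes a 2x2 block."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fade_in(low_res, high_res, alpha):
    """Blend the upsampled low-res image with the new high-res image.
    alpha=0 keeps only the old stage, alpha=1 keeps only the new one."""
    upsampled = upsample_2x(low_res)
    return [
        [(1 - alpha) * u + alpha * h for u, h in zip(u_row, h_row)]
        for u_row, h_row in zip(upsampled, high_res)
    ]

low = [[1, 2],
       [3, 4]]                                   # made-up 2x2 output of the low-res stage
high = [[0.0] * 4 for _ in range(4)]             # made-up 4x4 output of the new stage

print(upsample_2x(low))
print(fade_in(low, high, 0.5))                   # halfway through the transition
```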

Improvement 3: 2018 StyleGAN (paper)

learns a representation of the picture as a latent vector, in which some dimensions correspond to style
e.g. an old-to-young dimension, summer/winter, porous/glossy skin, etc.
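
The editing idea can be sketched as simple vector arithmetic: take a latent vector and push it along a direction that encodes one attribute. Everything here is hypothetical (a 4-number latent and an "age" direction aligned with one coordinate); real StyleGAN latents have hundreds of dimensions, and useful directions are usually not aligned with a single coordinate.

```python
def move_along(latent, direction, strength):
    """Shift a latent vector along an attribute direction by the given strength."""
    return [z + strength * d for z, d in zip(latent, direction)]

latent = [0.2, -1.0, 0.5, 0.0]         # hypothetical latent code of one face
age_direction = [0.0, 0.0, 1.0, 0.0]   # hypothetical "young -> old" direction

older = move_along(latent, age_direction, 2.0)
younger = move_along(latent, age_direction, -2.0)

# Only the coordinate the direction touches changes; the rest of the face stays put.
print(older)
print(younger)
```

The edited vector would then be fed back through the generator to render the changed image.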

Somehow, we didn't get to Stable Diffusion. The presenter mentioned only that it works iteratively, adjusting the pixels step by step in a way that resembles how diffusion spreads.

APPLICATIONS

Inpainting - instead of generating the whole image, select a region to replace by generation.
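
The final compositing step of inpainting can be sketched with a binary mask: keep the original pixels outside the selected region and take the generated pixels inside it. This is only an illustration with made-up 3x3 "images"; actual inpainting models also condition the generation on the surrounding pixels, which is not shown here.

```python
def composite(original, generated, mask):
    """Take generated pixels where mask is 1, original pixels where mask is 0."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]

original = [[10, 10, 10],
            [10, 10, 10],
            [10, 10, 10]]   # made-up source image
generated = [[99, 99, 99],
             [99, 99, 99],
             [99, 99, 99]]  # made-up generator output
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]          # region selected for replacement

result = composite(original, generated, mask)
for row in result:
    print(row)
```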

Fast Neural Style Transfer - a recent advance that makes styling instantaneous (summer to winter, etc).

Emotion Model - transfer emotions. Face swap works the same way.

Wav2Lip - given an audio file and a photo of a person, makes a video with the person's lips moving in sync with the speech.

NVC - Neural Voice Cloning. Learns the shape of voice.

The presenter gathered 6 GB of Trump speeches and generated a pretty good fake voice. The hard part was separating the target's voice from other people's voices in the recordings.

FOMM - first order motion model (link)

Brings photos of people to life by animating them
Preset motions

Thin plate spline motion model 2023 (link)

Like FOMM, but takes a reference video for imitating the motion
Demo: Musk becomes Robert Downey Jr.

DeepFaceLab - complete tool for deep fakes (github)

Full-blown fakes creation studio that includes everything needed: images, voice, motion generation, etc.

Aitana López - AI model/influencer, the person does not exist in real life (IG)

Channel 1 news - AI generated and personalized TV news channel (website - aired yet?). Not about fake news, but about using AI generators to present personalized news broadcasts.

Other: