It all started in 2014 with Goodfellow's GAN paper.
- Two neural networks play a game: a generator learns to produce images, and a discriminator judges whether each one looks real or generated.
- The training objective amounts to minimizing the Jensen-Shannon Divergence (JSD) between the distribution of generated images and the distribution of real ones.
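For reference, the game is usually written as the minimax objective below. The symbols follow the common notation of the original paper and are not from the talk: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the real-image distribution, $p_z$ the noise prior, and $p_g$ the distribution of generated images.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

When the discriminator is optimal for a fixed generator, this value reduces to

$$
V(D^*, G) = 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_g\big) - \log 4,
$$

which is where the JSD enters: improving the generator against a perfect discriminator means shrinking the JSD between real and generated images.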
Improvement 1: 2017, Wasserstein GAN - improved GAN for face generation (paper)
- replaces JSD with the Wasserstein metric, which is continuous and differentiable almost everywhere, making the objective easier to optimize
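A toy illustration of why the Wasserstein metric is easier to optimize, assuming two one-dimensional distributions that do not overlap (a "real" point mass at position 0 and a "fake" one shifted away from it). The helper names are made up for this sketch. JSD stays stuck at log 2 however far apart the two are, so it gives no hint about which direction is better, while the Wasserstein distance grows with the shift.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)

def wasserstein_1d(p, q):
    """Wasserstein-1 distance on a unit-spaced 1-D grid: sum of |CDF differences|."""
    total, cdf_p, cdf_q = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        total += abs(cdf_p - cdf_q)
    return total

def point_mass(position, size=10):
    """All probability on a single grid cell (hypothetical toy distribution)."""
    return [1.0 if i == position else 0.0 for i in range(size)]

real = point_mass(0)
for shift in (1, 3, 6):
    fake = point_mass(shift)
    # JSD is the same for every shift; Wasserstein equals the shift.
    print(shift, round(js_divergence(real, fake), 4), wasserstein_1d(real, fake))
```

Real WGAN training estimates the distance with a critic network rather than computing it exactly as here.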
Improvement 2: 2018 ProGAN by NVIDIA (paper)
- Trains at progressively higher resolutions
- Starts with a GAN at low resolution, then adds layers that take its output up to a higher resolution
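A minimal sketch of the low-to-high resolution idea, using plain nested lists as images. The function names and the 4-to-1024 schedule defaults are assumptions for illustration. The `fade_in` blend mirrors the smooth transition described in the ProGAN paper, where the upsampled low-resolution output is mixed with the new higher-resolution output while `alpha` ramps from 0 to 1.

```python
def resolution_schedule(start=4, final=1024):
    """Resolutions visited during training, doubling each stage."""
    schedule = []
    res = start
    while res <= final:
        schedule.append(res)
        res *= 2
    return schedule

def upsample_nearest(image):
    """Double a 2-D grid in both directions by repeating each pixel 2x2."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fade_in(low_res_image, new_layer_image, alpha):
    """Blend the upsampled low-res image with the new high-res output."""
    upsampled = upsample_nearest(low_res_image)
    return [
        [(1 - alpha) * up + alpha * new for up, new in zip(up_row, new_row)]
        for up_row, new_row in zip(upsampled, new_layer_image)
    ]

print(resolution_schedule())
# A 1x1 "image" blended halfway with a 2x2 output from a hypothetical new layer.
print(fade_in([[1.0]], [[3.0, 3.0], [3.0, 3.0]], alpha=0.5))
```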
Improvement 3: 2018 StyleGAN (paper)
- learns a picture representation as a latent vector, in which some dimensions represent the style
- e.g. an old-to-young dimension, summer/winter, porous/glossy skin, etc.
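A minimal sketch of what a style dimension means in practice: shifting the vector along one direction changes one attribute and leaves the rest alone. The 4-dimensional vector and `age_direction` are made-up illustrations, not values from StyleGAN. A real model would decode the edited vector with its generator to get the new image.

```python
def edit_latent(latent, direction, strength):
    """Shift a latent vector along a style direction by the given strength."""
    return [value + strength * step for value, step in zip(latent, direction)]

# Hypothetical 4-dimensional code; real StyleGAN latents have 512 dimensions.
latent = [0.5, -1.0, 0.25, 2.0]
# Hypothetical direction: assume dimension 1 encodes "old to young".
age_direction = [0.0, 1.0, 0.0, 0.0]

younger = edit_latent(latent, age_direction, strength=2.0)
older = edit_latent(latent, age_direction, strength=-2.0)
print(younger)
print(older)
```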
Somehow, we didn't get to Stable Diffusion; the presenter mentioned only that it works iteratively, adjusting the pixels step by step in the way diffusion spreads.
APPLICATIONS
Inpainting - instead of generating the whole image, select a region to replace by generation.
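A minimal sketch of the compositing step behind inpainting, assuming images are nested lists of pixel values and the selected region is a 0/1 mask. The `generated` grid is a stand-in for a generator's output, and the function name is made up. Real inpainting models also condition the generation on the surrounding pixels so that the patch blends in.

```python
def inpaint(original, generated, mask):
    """Keep original pixels where mask is 0; take generated pixels where mask is 1."""
    return [
        [gen if m else orig for orig, gen, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]

original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 98, 97], [96, 95, 94]]  # stand-in for generator output
mask = [[0, 1, 1], [0, 0, 1]]             # 1 marks the region to replace

result = inpaint(original, generated, mask)
print(result)
```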
Fast Neural Style Transfer - a recent advance that makes styling instantaneous (summer to winter, etc).
Emotion Model - transfer emotions. Face swap works the same way.
Wav2Lip - given an audio file and photo of a person, make a video with the person's lips moving according to the speech.
NVC - Neural Voice Cloning. Learns the shape of a voice.
- The presenter gathered 6 GB of Trump speeches and generated a pretty good fake voice. The hard part was separating the target's voice from other people's voices in the recordings.
FOMM - first order motion model (link)
- Brings still photos of people to life
- Preset motions
Thin plate spline motion model 2023 (link)
- Like FOMM, but takes a reference video for imitating the motion
- Demo: Musk becomes Robert Downey Jr.
DeepFaceLab - complete tool for deep fakes (github)
- Full-blown fakes creation studio that includes everything needed: images, voice, motion generation, etc.
Aitana López - AI model/influencer, the person does not exist in real life (IG)
Channel 1 news - AI generated and personalized TV news channel (website - aired yet?). Not about fake news, but about using AI generators to present personalized news broadcasts.
Other:
- Fencing problem - generated hands come out with too many fingers, lined up like a fence. Hard to solve, but it seems to have been solved recently.
Question from the audience:
- About a third of the world is heading into elections. What will all of this do to our democracies?