Information Blog: vision

How do you go about teaching a computer what is attractive and what is not?

This is a very difficult question I have been thinking about recently.

Do you create a duck face detector and subtract points? Is it a series of features we are looking for (specific hair color, eye color, skin smoothness symmetry).

Would this data come out in some sort of statistical analysis?

I decided to research this further using Eigen Faces and SVMs.

For those of you that dont know about Eigen Faces[1], they are a decomposition of a set of images into eigenvalues (weights) and eigenvectors (eigenfaces). The amazing thing about these is that given a large enough training set, any image of a face can be reconstructed by multiplying a set of weights (eigenvalues) with the eigenvectors (eigenfaces).

Facial recognition simply extracts the eigenvalues from an image and finds the L2 Norm (a distance metric) between it and the weights of the training data set. The closest distance (below a certain threshold) indicates which face it is.

Top 20 Eigen Faces from our data set

Average image from our data set

The attractiveness of the average of all images is greater than the average of the attractiveness of all images.

We are currenlty using a similar model for detecting attractiveness. Currently we are accurate ~64% of the time; however, that is just using the L2 Norm. The hope is that using an SVM classifier and the weights for all the faces, we will be able to determine the important eigenfaces and their weights for attractiveness and then create a classification system.

If you review the heat-mapped eigenfaces in the first figure, you will notice specific expressions and features that are highlighted. We have 2027 of these eigenfaces and each face can be seen as a weighted combination of these. Our hope is to find the most attractive and unattractive features in the eigenfaces.

I hypothesize that somewhere in their is an eigenface that corresponds to duckface. Think of the utilization of such a thing. Anytime someone uploads a duckface on facebook, it could warn them that they should stop doing that.

Below are our current statistics of the mean and standard deviation of the weight vectors for each level of attractiveness (1 - 5 with 5 being the most attractive). We have also included a close view of the first 50 weights and a histogram of the weights.

Histogram

It seems there are some very interesting differences in the mean, standard deviation, and histogram based on the level of attractiveness. However, this may be because we have a different amount of training examples for each level of attractiveness (attractiveness is semi-gaussian).

It will be interesting what more tinkering will yield. We can only hope to find the elusive eigenface corresponding to duck face.

References:
[1] Turk, Matthew A and Pentland, Alex P. Face recognition using eigenfaces.Computer Vision and Pattern Recognition, 1991. Proceedings {CVPR91.}, {IEEE} Computer Society Conference on 1991

Consider donating to further my tinkering.

Posted by Vincent Vanhoucke, Google Research Scientist

Much of the worlds data is in the form of visual media. In order to utilize meaningful information from multimedia and deliver innovative products, such as Google Photos, Google builds machine-learning systems that are designed to enable computer perception of visual input, in addition to pursuing image and video analysis techniques focused on image/scene reconstruction and understanding.

This week, Boston hosts the 2015 Conference on Computer Vision and Pattern Recognition (CVPR 2015), the premier annual computer vision event comprising the main CVPR conference and several co-located workshops and short courses. As a leader in computer vision research, Google will have a strong presence at CVPR 2015, with many Googlers presenting publications in addition to hosting workshops and tutorials on topics covering image/video annotation and enhancement, 3D analysis and processing, development of semantic similarity measures for visual objects, synthesis of meaningful composites for visualization/browsing of large image/video collections and more.

Learn more about some of our research in the list below (Googlers highlighted in blue). If you are attending CVPR this year, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. Members of the Jump team will also have a prototype of the camera on display and will be showing videos produced using the Jump system on Google Cardboard.

Tutorials:
Applied Deep Learning for Computer Vision with Torch
Koray Kavukcuoglu, Ronan Collobert, Soumith Chintala

DIY Deep Learning: a Hands-On Tutorial with Caffe
Evan Shelhamer, Jeff Donahue, Yangqing Jia, Jonathan Long, Ross Girshick

ImageNet Large Scale Visual Recognition Challenge Tutorial
Olga Russakovsky, Jonathan Krause, Karen Simonyan, Yangqing Jia, Jia Deng, Alex Berg, Fei-Fei Li

Fast Image Processing With Halide
Jonathan Ragan-Kelley, Andrew Adams, Fredo Durand

Open Source Structure-from-Motion
Matt Leotta, Sameer Agarwal, Frank Dellaert, Pierre Moulon, Vincent Rabaud

Oral Sessions:
Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection
George Papandreou, Iasonas Kokkinos, Pierre-André Savalle

Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time
Richard A. Newcombe, Dieter Fox, Steven M. Seitz

Show and Tell: A Neural Image Caption Generator
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell

Visual Vibrometry: Estimating Material Properties from Small Motion in Video
Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Frédo Durand, William T. Freeman

Fast Bilateral-Space Stereo for Synthetic Defocus
Jonathan T. Barron, Andrew Adams, YiChang Shih, Carlos Hernández

Poster Sessions:
Learning Semantic Relationships for Better Action Retrieval in Images
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei

FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff, Dmitry Kalenichenko, James Philbin

A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions
Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher

Best-Buddies Similarity for Robust Template Matching
Tali Dekel, Shaul Oron, Michael Rubinstein, Shai Avidan, William T. Freeman

Articulated Motion Discovery Using Pairs of Trajectories
Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

Reflection Removal Using Ghosting Cues
YiChang Shih, Dilip Krishnan, Frédo Durand, William T. Freeman

P3.5P: Pose Estimation with Unknown Focal Length
Changchang Wu

MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

Inferring 3D Layout of Building Facades from a Single Image
Jiyan Pan, Martial Hebert, Takeo Kanade

The Aperture Problem for Refractive Motion
Tianfan Xue, Hossein Mobahei, Frédo Durand, William T. Freeman

Video Magnification in Presence of Large Motions
Mohamed Elgharib, Mohamed Hefeeda, Frédo Durand, William T. Freeman

Robust Video Segment Proposals with Painless Occlusion Handling
Zhengyang Wu, Fuxin Li, Rahul Sukthankar, James M. Rehg

Ontological Supervision for Fine Grained Classification of Street View Storefronts
Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv

VIP: Finding Important People in Images
Clint Solomon Mathialagan, Andrew C. Gallagher, Dhruv Batra

Fusing Subcategory Probabilities for Texture Classification
Yang Song, Weidong Cai, Qing Li, Fan Zhang

Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

Workshops:
THUMOS Challenge 2015
Program organizers include: Alexander Gorban, Rahul Sukthankar

DeepVision: Deep Learning in Computer Vision 2015
Invited Speaker: Rahul Sukthankar

Large Scale Visual Commerce (LSVisCom)
Panelist: Luc Vincent

Large-Scale Video Search and Mining (LSVSM)
Invited Speaker and Panelist: Rahul Sukthankar
Program Committee includes: Apostol Natsev

Vision meets Cognition: Functionality, Physics, Intentionality and Causality
Program Organizers include: Peter Battaglia

Big Data Meets Computer Vision: 3rd International Workshop on Large Scale Visual Recognition and Retrieval (BigVision 2015)
Program Organizers include: Samy Bengio
Includes speaker Christian Szegedy - “Scalable approaches for large scale vision”

Observing and Understanding Hands in Action (Hands 2015)
Program Committee includes: Murphy Stein

Fine-Grained Visual Categorization (FGVC3)
Program Organizers include: Anelia Angelova

Large-scale Scene Understanding Challenge (LSUN)
Winners of the Scene Classification Challenge: Julian Ibarz, Christian Szegedy and Vincent Vanhoucke
Winners of the Caption Generation Challenge: Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan

Looking from above: when Earth observation meets vision (EARTHVISION)
Technical Committee includes: Andreas Wendel

Computer Vision in Vehicle Technology: Assisted Driving, Exploration Rovers, Aerial and Underwater Vehicles
Invited Speaker: Andreas Wendel
Program Committee includes: Andreas Wendel

Women in Computer Vision (WiCV)
Invited Speaker: Mei Han

ChaLearn Looking at People (sponsor)

Fine-Grained Visual Categorization (FGVC3) (sponsor)

Information Blog

Autonomously Estimating Attractiveness using Computer Vision

How do you go about teaching a computer what is attractive and what is not?

Top 20 Eigen Faces from our data set

Average image from our data set

Histogram

Consider donating to further my tinkering.

Google Computer Vision research at CVPR 2015

Search

Archive