So Far...
In Part 1 I discussed the needs and problems involved with NPC portraits. Archmage Rises needs at least 1,000 NPC portraits, if not more. And I can't afford that.
When I left off, I had made a quick & dirty prototype (I call it Version 1) that gave me hope to keep pursuing this:
Insights into the Human Face
1. The Visual Brain
Ever recognize a person but can't remember their name? Yep, me too. Apparently this is because the spatial recognition part of our brain runs like a super fast GPU, while the data recovery co-processor struggles to keep up.
I mention this for two reasons. First, it explains the exact problem in my game where a picture is worth a thousand names. Simply having a different name fails to convince the player it is someone different. Furthermore, we've all met two people named Michael who looked completely different. With the exception of twins, we've never met two people who look exactly the same. The player's tolerance for duplicate names is high; for duplicate faces it is not.
Second, our ability to recognize a face is the most finely honed part of our brain. My endeavor here to trick it is probably a fool's errand. Ah well, let's keep going. :-)
2. Silhouettes and Babies
Like most new dads, I felt horribly unprepared for the arrival of my first baby. So I read a baby development book explaining what happens week by week after birth.
A baby's eyesight is extremely short range and mostly light and darkness. Which makes sense, for if all you care about in the world is a nipple 2 inches from your face, why bother about the rest of the stuff?
Babies see more by silhouette than by feature. So the baby book warned new moms not to get a drastic haircut (going from long to short hair), otherwise the baby won't recognize them! Wow!
It's long been known in character design (and logo design) that it has to be recognizable by silhouette first. Like these two examples:
Whether it is a blur of advertising that goes by on the train, or the blur of a character model whizzing by in an FPS, we recognize first in silhouette. Probably because it is the first tool we had in our developing brain tool belt. We are fearfully and wonderfully made!
3. Associative Memory (Context)
Ever been in the same restaurant and recognized a waiter? Now, would you have recognized him in the mall, or on a bus? Probably not. Why is that?
Because my memory of that waiter is tied (associated) to the context of the restaurant. When I'm in the restaurant, my brain loads everything I've ever seen there into a local cache, so I recognize people and things. Here is a scholarly article on this that I mostly understand. :-)
4. Recognition Sorts From Largest to Smallest
Our brains crunch massive amounts of data per second. Doing the same in software is a tremendous problem for computer vision and hearing.
As any good coder knows, you can quickly get through a ton of data by sorting it on the first pass, then focusing in on what you need on additional passes. We sort (recognize) from large to small as a first pass. We know the difference between a car, a building, and a dog first by size, then by other factors.
This is why at first glance these pictures look identical:
When viewing the above pic, we first process the big things: the silhouette, the large blue sky, large blue sea, big black sail, etc. The game becomes fun because our brain's first 50ms pass says they are the same, but we know they are not, so now the hunt begins. And the reason it is fun is because they are only different in the small. If there were large differences, even babies would find the game boring.
5. Effects of Age
The human face never stops growing. Our ears and nose continue to grow across our lives, so you can easily tell the age of someone (or if they had a nose job) by the size of their nose. Some other facts:
- The cheeks sag inferiorly, resulting in the appearance of jowls
- The corners of the mouth move inferiorly resulting in a slight frown look
- The tissue around the eyes sags inferiorly
- The eyelids, upper and lower, themselves sag inferiorly
- The tissue of the forehead drifts inferiorly, creating wrinkles and dropping the eyebrows downward and giving them a flatter appearance
- The nose may elongate and move the tip inferiorly
- The nose may develop a small to pronounced dorsal hump
- The tip of the nose may enlarge and become bulbous
- Generalized wrinkling of the face may occur
See the full Face Variations by Age article here.
So What's this Have to do with Portraits?!?
I've now given you the approach for tricking the brain into seeing different people when in fact you are using a limited data set of features.
Using the above theory, I made myself a test subject and looked at thousands of portraits: RPG-ish hand-painted ones, straight headshots, and collages from stock photos. I found the stock photo ones the most helpful in coming up with rules for how to generate a face.
These people are all dressed similarly, have a similar pose, and show the same expression. Yet they all look distinct. If I am going to create a convincing portrait generator this gives me hope.
Summary of Generating Rules
When generating portraits I have only one determinant for success: the number of perceived twins. I'll call that the Twin Factor. If the Twin Factor is low, then the generator is convincing and producing a good result.
Using the above theories, here are the most important things to vary:
- Background -- context, associative memory
- Hair style -- most significant factor of silhouette
- Hair color -- depending on style, largest color area
- Skin color -- depending on hair style, first or second largest color area
- Eyes -- after the above, this is where our vision immediately focuses
So mathematically having the same face, hair, skin color, but 10 different noses and 10 different mouths should result in 100 different looking people... right?!
It doesn't. :-(
At first glance these two guys look the same even though they have different eyes, nose, and mouth. Maybe you don't think they are twins, but you might think they are related. Or the same guy just more angry in one case.
So what is tipping the scales? It's the hair style & color.
This is some valuable validated learning: put our energies into the bigger things and don't sweat the smaller stuff. Said differently, having 20 different backgrounds and 20 different hairstyles is time better spent than 20 different eyes and 20 different mouths.
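To make that trade-off concrete, here is a minimal Python sketch. It is not the generator from the game: the feature names, the `make` helper, and the idea that a first glance only registers the "large" features are all assumptions for illustration. It counts how many portraits share their coarse look with at least one other portrait, as a rough stand-in for the Twin Factor.

```python
from collections import Counter
from itertools import product

# Hypothetical feature names; the coarse set follows the list of
# "most important things to vary" (eyes left out for simplicity).
COARSE = ("background", "hair_style", "hair_color", "skin_color")


def twin_factor(portraits):
    """Count portraits whose coarse look is shared with at least one other.

    Rough proxy only: it assumes a first glance registers nothing but COARSE.
    """
    looks = Counter(tuple(p[key] for key in COARSE) for p in portraits)
    return sum(count for count in looks.values() if count > 1)


def make(background, hair_style, nose, mouth):
    """Build one portrait description (hair and skin color held constant)."""
    return {
        "background": background,
        "hair_style": hair_style,
        "hair_color": "brown",
        "skin_color": "tan",
        "nose": nose,
        "mouth": mouth,
    }


# 10 noses x 10 mouths, everything large kept identical.
small_only = [make("tavern", "short", n, m)
              for n, m in product(range(10), range(10))]

# 10 backgrounds x 10 hairstyles, same nose and mouth on everyone.
large_only = [make(b, h, 0, 0)
              for b, h in product(range(10), range(10))]

print(len(small_only), twin_factor(small_only))  # 100 100
print(len(large_only), twin_factor(large_only))  # 100 0
```

Both sets contain 100 mathematically distinct portraits, but under this (deliberately crude) model only the second set contains 100 people who look distinct at a glance.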
Things to Watch Out For
Don't have anything extreme. The blander the better.
The above "twin" demonstrates an important principle of extremes. The hair style is so distinctive you can easily pick out when we've used it twice. Even if it was a different color or shade. So if we are going to be successful in this we need lots of boring haircuts.
This is also true about things like scars, facial hair, hats, and jewelry. If we create a really distinctive scar running down a person's whole face, well, the first time you see it, it is cool, but the second time it totally looks like a copy. So we get far more mileage out of small, minor scarring than big slashes.
And isn't this true of life? We know plenty of people with boring brown hair with a boring ordinary haircut (OK, I just described myself). But you probably only know one (or a few) people with wild crazy hair in a wild crazy color. We are predisposed to picking out the extremes, but have a hard time telling if three boring haircuts are the same or not.
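One way to act on this in a generator is to cap how often a part may be reused based on how striking it is. The sketch below is only an illustration: the hairstyle names, the hand-assigned distinctiveness scores, and the `max_uses` formula are all made up, not taken from the actual game.

```python
import random
from collections import Counter

# Hypothetical hairstyles with a hand-assigned distinctiveness score in [0, 1].
HAIR = {"short_plain": 0.1, "side_part": 0.2, "ponytail": 0.3, "mohawk": 0.9}


def max_uses(distinctiveness, population):
    """Allow bland parts often and striking parts rarely (but at least once)."""
    return max(1, int(population * (1.0 - distinctiveness) ** 2))


def pick_hair(used, population, rng):
    """Pick a random style that has not yet hit its reuse cap."""
    allowed = [style for style, score in HAIR.items()
               if used[style] < max_uses(score, population)]
    choice = rng.choice(allowed)
    used[choice] += 1
    return choice


rng = random.Random(42)  # seeded so the run is repeatable
used = Counter()
town = [pick_hair(used, population=20, rng=rng) for _ in range(20)]
print(used)
```

With a population of 20, the mohawk can appear at most once, while the plain short cut is allowed up to 16 times, so the boring haircuts carry most of the crowd.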
Well, I was going to get into what we actually did for Version 2, but I think this is a good stopping point. I've learned more about the human face in the last 2 weeks than I ever thought I would. I hope you've learned something as well!
To be continued...