Oculus Rift hack transfers your facial expressions onto your virtual avatar

Status
Not open for further replies.

araemo

Ars Scholae Palatinae
1,212
Subscriptor
So, I recently read Snow Crash for the first time.. and one of the things stressed as part of the success of their persistent virtual avatar-inhabited world was good code for interpreting facial expressions and transmitting them to your avatar. Basically, allowing almost proper non-verbal communication online. We can technically do that over skype/videoconferencing... if the connection speed is good. But perhaps a compressed 'facial expression description' would take less bandwidth?
 
Upvote
11 (12 / -1)

TiHKAL

Seniorius Lurkius
30
araemo said:
> So, I recently read Snow Crash for the first time.. and one of the things stressed as part of the success of their persistent virtual avatar-inhabited world was good code for interpreting facial expressions and transmitting them to your avatar. Basically, allowing almost proper non-verbal communication online. We can technically do that over skype/videoconferencing... if the connection speed is good. But perhaps a compressed 'facial expression description' would take less bandwidth?


Up-voted for Snow Crash! Check out Headcrash by Bruce Bethke if you can find it.
 
Upvote
1 (2 / -1)
Pretty cool. I could see this playing into future Pixar-style movie production someday, perhaps.

I think it may also be necessary to integrate hardware in the goggles to track the user's eye movements; that's a big piece of the puzzle that I'm not sure this setup can handle yet, so I hope they have that in their sights soon as well.
 
Upvote
2 (2 / 0)

tecnico.hitos

Wise, Aged Ars Veteran
176
araemo said:
> So, I recently read Snow Crash for the first time.. and one of the things stressed as part of the success of their persistent virtual avatar-inhabited world was good code for interpreting facial expressions and transmitting them to your avatar. Basically, allowing almost proper non-verbal communication online. We can technically do that over skype/videoconferencing... if the connection speed is good. But perhaps a compressed 'facial expression description' would take less bandwidth?

This and other technologies like smartphones, Google Glass, smartwatches and so forth make me think of gargoyles instead. They may cause people to focus more on their virtual connections than on real life.

That said, I'd totally be a gargoyle, because most of real life is spent on duties and empty courtesy rather than on meaningful relationships with people, so as long as people keep it under control there is little loss. It would also be great for internet friendships and long-distance relationships.
 
Upvote
2 (2 / 0)
Unless you have some ridiculous cartoony avatar, I can see this leading to the uncanny valley real quick.
[image: hurrrr-gñe.jpg]
 
Upvote
18 (18 / 0)

pavon

Ars Tribunus Militum
2,320
Subscriptor
> Even better, latency was generally low, with the researchers measuring 3ms for facial feature detection, 5ms for blend shape optimisation, and 3ms for the mapping in software.
>
> Unsurprisingly, it currently needs a rather powerful rig to run well: powered by a Core i7-4820K, 32GB of RAM, and a GTX 980, the system renders at a steady 30 FPS.
I wonder what the discrepancy is that adds the other 22ms, since 30 FPS is a ~33ms frame budget and the measured stages only sum to 11ms. Are the first numbers averages that can vary a lot, or do they not include the actual render time, with rendering done in series with the facial detection rather than in a pipelined fashion?
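For what it's worth, the 22ms gap falls straight out of the article's numbers; a quick sketch of the arithmetic (the interpretation that the remainder is render time is my assumption):

```python
# The three measured stages sum to 11 ms, but a steady 30 FPS
# implies a ~33 ms frame time, leaving ~22 ms unaccounted for.
detection_ms = 3       # facial feature detection (from the article)
blendshape_ms = 5      # blend shape optimisation
mapping_ms = 3         # mapping in software

pipeline_ms = detection_ms + blendshape_ms + mapping_ms
frame_budget_ms = 1000 / 30            # frame time at 30 FPS

unaccounted_ms = frame_budget_ms - pipeline_ms
print(f"measured stages: {pipeline_ms} ms")          # 11 ms
print(f"frame budget:    {frame_budget_ms:.1f} ms")  # 33.3 ms
print(f"unaccounted:     {unaccounted_ms:.1f} ms")   # 22.3 ms
```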
 
Upvote
1 (1 / 0)
nehinks said:
> The examples seem to be hitting the uncanny valley pretty hard, IMHO. Is that just me?
>
> Still, it is a good advancement - can only get better from here.

Oh yeah. The model doesn't carry the expression at all compared to what my brain says the expression should be. I say that knowing I'm only seeing half the face, so my brain is making a lot of assumptions based on a lifetime of calibration; I'll assume I'm right, but I'm open to being wrong as well. Unless it's dynamically deforming the model from the current measurements rather than blending predefined movements, it's never going to match exactly, but some of those results are creepily unnatural facial expressions unless you're Ace Ventura or in the throes of death. Maybe they need to sprinkle glitter on people's faces for motion-capture purposes...

Either way I simply won't be impressed until it can catch my 'are you serious' eyebrow lift.
 
Upvote
-1 (1 / -2)

Arsigi

Seniorius Lurkius
40
Hmm... interesting to see this pop up pretty much 24 hours after the Fove Kickstarter campaign launched. Sure, eye-tracking is their main focus, but they have expression-recognition in the works, too. The more the merrier, though. I wonder if they will eventually work out 'standards' for such things, so that people with different hardware can interact.
 
Upvote
1 (1 / 0)

lewax00

Ars Legatus Legionis
17,402
araemo said:
> But perhaps a compressed 'facial expression description' would take less bandwidth?
In theory, yes, it should take less bandwidth: an image contains all of that information plus extra information, like color, so a perfect, minimal representation of just the expression is less data. In reality... that remains to be seen, but it's plausible, depending on how many points they're tracking.
 
Upvote
2 (2 / 0)
lewax00 said:
> araemo said:
> > But perhaps a compressed 'facial expression description' would take less bandwidth?
> In theory, yes, it should take less bandwidth, because an image contains all that information, plus additional information on things like colors. So less information is less data in a perfect and minimal representation. In reality...that remains to be seen, but it's plausible, depending on how many points they're tracking.

They could implement this a few ways: for very little bandwidth, the expression could be computed client-side, and the avatar (which could already have been sent) could have a number of predefined expressions, so the expression selection rather than the point data could be sent.

Alternatively, the point data could presumably be compressed like streaming video, assuming expressions don't change rapidly and frequently. Although that could lead to an indecisiveness-based DDoS.
 
Upvote
1 (1 / 0)

araemo

Ars Scholae Palatinae
1,212
Subscriptor
lewax00 said:
> araemo said:
> > But perhaps a compressed 'facial expression description' would take less bandwidth?
> In theory, yes, it should take less bandwidth, because an image contains all that information, plus additional information on things like colors. So less information is less data in a perfect and minimal representation. In reality...that remains to be seen, but it's plausible, depending on how many points they're tracking.
Yeah.. basically, faces only move in so many ways. If the software can identify how your face is moving (i.e., where the muscles pull from and what direction they pull in), that could be transmitted to the remote display, which can run a 3D 'rig' of a face with muscles in those specific locations pulling in those specific directions.

As for the 'gargoyles' - I think it's more likely we'll have something more akin to Ghost in the Shell - everyone who wants to be a 'gargoyle' could be, without noticeable external signs. You'll never know who is using OLED-implanted contacts to get an information overlay and who isn't.
 
Upvote
1 (1 / 0)

Falos

Ars Tribunus Militum
1,599
ElectricBlue said:
> Unless you have some ridiculous cartoony avatar I can see this leading to uncanny valley real quick
>
> http://www.gamesajare.com/2.0/wp-conten ... 324495.jpg

This. The tech is good and all, and it should have other applications, but humans are extremely fine-tuned to facial expressions; the tiniest change in angle or flex matters. I'm no artist, but they'll tell you.
 
Upvote
0 (0 / 0)
I got to try a pre-release of Dead Secret.

http://robotinvader.com/deadsecret/

In the very first scene of the game you walk up to a mirror, the point being to show you the character you're playing, since you're not playing yourself.

In a traditional game it would have been "whatever", but in first-person VR it really needed this!

Cool game.
 
Upvote
0 (0 / 0)

lewax00

Ars Legatus Legionis
17,402
NateH said:
> This has real potential to be very helpful for teaching those with Autism spectrum disorders to interact with people more normally. Learning to read and respond facially can be very difficult for some of them.
I'm not sure how this would help; all it does is take those same expressions and put them on a digital face. That seems like saying that putting words on a screen instead of paper should help people with dyslexia.
 
Upvote
1 (1 / 0)

araemo

Ars Scholae Palatinae
1,212
Subscriptor
lewax00 said:
> NateH said:
> > This has real potential to be very helpful for teaching those with Autism spectrum disorders to interact with people more normally. Learning to read and respond facially can be very difficult for some of them.
> I'm not sure how this would help, all it does is take those same expressions and put them on a digital face. That seems like saying putting words on a screen instead of paper should help people with dyslexia.
More like.. putting words on a screen where you can dynamically change their representation as the learner gains skill could help people with <limitation x>.

So, you COULD do that with paper, but you would need lots of different versions of the same books/etc with different types of enhancements for dyslexia.. or you could just let the display adjust as the student learns.

Same thing with facial expressions - you could program the 'game' to overexaggerate them just enough, and then slowly reduce the exaggeration as the student learns to identify them correctly in real-time. So, yes, I think it has potential. Potential does not equal success, but it might be worth the attempt, in case it is successful.
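That adaptive exaggeration could be as simple as scaling the tracked blend shape weights by a factor that eases off as the learner's recognition accuracy improves. A sketch of the idea (all names and numbers here are hypothetical, not from the research):

```python
# Illustrative sketch: exaggerate expression weights for a beginner,
# then ease the exaggeration off as recognition accuracy improves.

def exaggeration_factor(accuracy, max_boost=2.0):
    """Full boost at 0% recognition accuracy, none at 100%."""
    accuracy = min(1.0, max(0.0, accuracy))
    return 1.0 + (max_boost - 1.0) * (1.0 - accuracy)

def exaggerate(weights, accuracy):
    """Scale blend shape weights away from neutral (0), capped at 1."""
    k = exaggeration_factor(accuracy)
    return [min(1.0, w * k) for w in weights]

subtle_smile = [0.2, 0.1, 0.0]        # small weights on a few shapes
print(exaggerate(subtle_smile, 0.0))  # beginner: weights doubled
print(exaggerate(subtle_smile, 1.0))  # expert: unchanged
```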
 
Upvote
2 (2 / 0)

lewax00

Ars Legatus Legionis
17,402
araemo said:
> lewax00 said:
> > NateH said:
> > > This has real potential to be very helpful for teaching those with Autism spectrum disorders to interact with people more normally. Learning to read and respond facially can be very difficult for some of them.
> > I'm not sure how this would help, all it does is take those same expressions and put them on a digital face. That seems like saying putting words on a screen instead of paper should help people with dyslexia.
> More like.. putting words on a screen where you can dynamically change their representation as the learner gains skill could help people with <limitation x>.
>
> So, you COULD do that with paper, but you would need lots of different versions of the same books/etc with different types of enhancements for dyslexia.. or you could just let the display adjust as the student learns.
>
> Same thing with facial expressions - you could program the 'game' to overexaggerate them just enough, and then slowly reduce the exaggeration as the student learns to identify them correctly in real-time. So, yes, I think it has potential. Potential does not equal success, but it might be worth the attempt, in case it is successful.
I'm not sure I see where the real-time matching to an actual human is necessary in that scenario. It seems like pre-recorded expressions could be used just as well, and we could already do that before this research.

EDIT: Also, if you need different exaggerations for different expressions, you also need expression matching by the computer, which this does not solve.
 
Upvote
2 (2 / 0)

Pillage

Wise, Aged Ars Veteran
153
araemo said:
> So, I recently read Snow Crash for the first time.. and one of the things stressed as part of the success of their persistent virtual avatar-inhabited world was good code for interpreting facial expressions and transmitting them to your avatar. Basically, allowing almost proper non-verbal communication online. We can technically do that over skype/videoconferencing... if the connection speed is good. But perhaps a compressed 'facial expression description' would take less bandwidth?

I can do that with 16-bits. :)
 
Upvote
0 (0 / 0)

edzieba

Ars Scholae Palatinae
1,431
It's interesting that they used load cells to measure foam compression rather than something less sensitive to HMD movement (like EMG measurement of the facial muscles). For fairly static use, the foam compression will correspond to pressure from the facial muscles, but with more significant head movement (you can whip your head around at a pretty good speed), an insufficiently tight HMD, or an unevenly tightened HMD, that signal would be drowned out.
 
Upvote
0 (0 / 0)
TL;DR: Cyberdating = creepy; discussing a spreadsheet/code review = not bad

This is really cool. Beyond gaming, I'd like to see it catch on as a viable candidate for remote office work and collaboration.

Best-case in the next few years: showing your model/presentation/numbers in a VR space with colleagues who visit your session and have a conversation with facial feedback about your work. As long as the primary focus isn't your colleague but rather the work itself, I think you wouldn't run afoul of the uncanny valley. For heavy-duty negotiations or dating, we're probably a long way off. But it'd be a hell of a step up from videoconferencing, skyping, text chatting, and phone calls.

An additional advantage over videoconferencing is bandwidth: instead of sending voice and video (large and continuous), you're sending an initial avatar (large and one-time), voice, and facial coordinate updates (continuous and much smaller than video).
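Rough numbers for that comparison, using figures I've assumed purely for illustration (a modest webcam stream, a narrowband voice codec, and ~50 one-byte expression weights at 30 Hz):

```python
# Back-of-the-envelope totals for a 10-minute call; all rates are
# assumptions for illustration, not measurements.
video_kbps = 500        # modest webcam video stream
voice_kbps = 24         # narrowband speech codec
coords_kbps = 12        # 50 one-byte expression weights at 30 Hz
avatar_kb_once = 2000   # one-time avatar model transfer

call_minutes = 10
seconds = call_minutes * 60

video_total_kb = (video_kbps + voice_kbps) * seconds / 8
vr_total_kb = avatar_kb_once + (voice_kbps + coords_kbps) * seconds / 8

print(f"videoconference: {video_total_kb:.0f} kB")  # 39300 kB
print(f"avatar + coords: {vr_total_kb:.0f} kB")     # 4700 kB
```

Even with the one-time avatar download, the expression-coordinate approach comes out nearly an order of magnitude lighter at these assumed rates.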
 
Upvote
0 (0 / 0)