Combination of facial movements on a 3D talking head. Proceedings Computer Graphics International.

Facial movements play an important role in interpreting spoken conversations and emotions. There are several types of movements, such as conversational signals, emotion displays, etc. We call these channels of facial movement. Realistic animation of these movements would improve the realism and liveliness of the interaction between humans and computers using embodied conversational agents. To date, no appropriate methods have been proposed for integrating all facial movements. We propose in this paper a scheme of combining facial movements on a 3D talking head. First, we concatenate the movements in the same channel to generate smooth transitions between adjacent movements. This combination only applies to individual muscles. The movements [...]

1 Introduction

Facial movements occur continuously during social interactions and conversations. They include lip movements when talking, conversational signals, emotion displays and manipulators to satisfy biological needs. Unfortunately, when and how a movement appears and disappears, and how co-occurrent movements are [...]. In addition, the problem of overlaying and blending facial movements in time, and the way felt emotions are expressed in facial activity during speech, has not received much attention [20].

In the field of embodied agents, facial animation has received quite a lot of attention. Realistic animation of faces would improve the realism and liveliness of the interaction between human and machine. To create realistic facial animation, many 3D face models have been proposed (see [22] for a summary). Many talking faces have been developed. Examples include [1], [6], [19] and [23]. Specifically, significant attention has been paid to visual speech [5, 19]. Some systems are also able to generate facial expressions as conversational signals during speech [2, 24]. However, no appropriate methods have been proposed for integrating all these facial movements. These systems combine facial movements by just adding them together without taking into account the resolution of conflicts.

The activity of human facial muscles is far from simply additive. A typical example would be smiling while speaking. The Zygomatic Major and Minor muscles contract to pull the corner of the lip outward, resulting in a smile. However, the activation of the Zygomatic Major and Minor muscles together with the lip funneler Orbicularis Oris would create an unnatural movement. The activation of a muscle may require the deactivation of other muscles [10]. Depending on the priority of the tasks to be performed on the face, appropriate muscles are selected to activate. In most cases, the visual speech has higher priority than the smile. The smile may also have higher priority than the visual speech when the subject is too happy to utter the speech naturally.

This paper addresses the combination of facial movements on a 3D talking head. There are several types of movements, such as conversational signals, emotion display, etc. We call these channels of facial movement. We concentrate on the dynamic aspects of facial movements and the combination of facial expressions in different [...]. First, we concatenate the movements in the same channel to generate smooth transitions between adjacent movements. This combination only applies to individual muscles.
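The priority-based selection of muscles described in the introduction (visual speech usually overriding a smile, and occasionally the reverse) can be illustrated with a minimal sketch. The conflict table, the channel names, the numeric priorities and the hard "drop the lower-priority muscle" rule are all assumptions made for illustration; the text does not specify how the actual system resolves a conflict.

```python
# Hypothetical conflict table: pairs of muscles that should not be active together.
# Only the smile vs. lip-funneler conflict is taken from the text.
CONFLICTS = {
    frozenset({"zygomatic_major", "orbicularis_oris"}),
    frozenset({"zygomatic_minor", "orbicularis_oris"}),
}


def resolve_conflicts(requests, priority):
    """Select muscles to activate, favouring higher-priority channels.

    requests: {channel: {muscle: contraction level in [0, 1]}}
    priority: {channel: int}, larger value wins (assumed convention).
    Returns {muscle: contraction level} with conflicting lower-priority
    muscles left inactive.
    """
    accepted = {}
    # Visit channels from highest to lowest priority.
    for channel in sorted(requests, key=lambda c: priority[c], reverse=True):
        for muscle, level in requests[channel].items():
            if level <= 0.0:
                continue
            # A muscle is blocked if it conflicts with one already accepted.
            blocked = any(
                frozenset({muscle, other}) in CONFLICTS for other in accepted
            )
            if blocked:
                continue
            accepted[muscle] = max(level, accepted.get(muscle, 0.0))
    return accepted


requests = {
    "speech": {"orbicularis_oris": 0.7},
    "emotion": {"zygomatic_major": 0.6, "zygomatic_minor": 0.4},
}
# Usual case: visual speech outranks the smile.
speech_first = resolve_conflicts(requests, {"speech": 2, "emotion": 1})
# Subject "too happy to utter the speech naturally": the smile outranks speech.
smile_first = resolve_conflicts(requests, {"speech": 1, "emotion": 2})
```

With speech ranked first only the lip funneler stays active; with the smile ranked first the two Zygomatic muscles stay active and the funneler is suppressed. A real system might attenuate rather than fully deactivate the losing muscle.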

The 3D face model of the talking head is discussed in Section 3. This section also presents a summary of conflicting muscles on the face. The [...] movements are described in Section 4. Section 5 explains how facial movements inside a channel are combined, while [...].

[...] punctuation marks such as a comma or an exclamation mark. They are used to help the interaction between the speaker and the listener or to provide feedback. The generation of conversational signals can be done by analyzing the text [24] or speech [2].

Our talking face takes as input the text to be pronounced [...]. A simple example of marked text looks like this:

I like it very much.

From text input, the text to phoneme module [30] generates phoneme sequences, which are used to synthesize speech [8]. They are also used to generate lip movements when talking.

We have proposed a fuzzy rule based system to generate emotion displays from emotions [3]. Gaze and head movements function, e.g., [...] something during conversation. Head movements are also used to replace verbal content, e.g. [...].

Atomic movements within a channel occur sequentially, although they may overlap each other at their beginning and ending.
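To make the idea of sequential atomic movements that overlap at their beginning and ending concrete, here is a minimal sketch for a single muscle within one channel. The trapezoidal envelope, the `ramp` parameter and the rule of taking the maximum in the overlap region are assumptions chosen for simplicity; they stand in for whatever concatenation function the system actually uses, which this section does not give.

```python
from dataclasses import dataclass


@dataclass
class AtomicMovement:
    """One atomic movement of a single muscle (hypothetical representation)."""
    start: float  # onset time in seconds
    end: float    # offset time in seconds
    peak: float   # peak contraction level in [0, 1]
    ramp: float   # onset and offset duration in seconds, must be > 0

    def level(self, t: float) -> float:
        """Trapezoidal envelope: linear rise, plateau at peak, linear fall."""
        if t <= self.start or t >= self.end:
            return 0.0
        rise = (t - self.start) / self.ramp
        fall = (self.end - t) / self.ramp
        return self.peak * min(1.0, rise, fall)


def channel_level(movements, t):
    """Contraction of the muscle at time t for one channel.

    Where two adjacent movements overlap, the larger envelope is used,
    which keeps the resulting curve continuous.
    """
    return max((m.level(t) for m in movements), default=0.0)


# Two adjacent movements of the same muscle overlapping between 0.9 s and 1.0 s.
channel = [
    AtomicMovement(start=0.0, end=1.0, peak=0.8, ramp=0.2),
    AtomicMovement(start=0.9, end=2.0, peak=0.5, ramp=0.2),
]
samples = [round(channel_level(channel, t), 3) for t in (0.1, 0.5, 0.95, 1.5)]
```

Sampling inside the overlap (t = 0.95) yields a value between the two plateaus, so the muscle moves from the first movement to the second without dropping to zero.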

This classification is also based on the function [...]. It is similar to Pelachaud et al. Movements from different channels can happen [...]. In our system, we distinguish six channels: [...]

These are movements to accentuate or emphasize speech.

[...] movements to satisfy biological requirements of the face. In our system, we consider eye blinking to wet the eyes as manipulators. These movements are random rather than repeated with a fixed rate as in [24]. The random [...]

Facial movements are then combined in two stages: The [...]. The latter combines the movements from all channels, taking into account the resolution of muscle conflicts. The result is displayed on a face model to create the final animation in synchronization with the synthesized speech.

It is a simple muscle-based 3D face model that can realize both of the following objectives: [...] expressions and real-time animation on a regular personal computer. The face model, which is also not too complicated so as to keep the animation real-time, allows high quality and realistic facial expressions. The face is equipped with a muscle system that produces realistic deformation of the facial surface, handles multiple muscle interaction correctly and produces bulges and wrinkles in real time. [...] responsible for visual speech lip movements and facial expressions [...].

Lip movements are generated from the text that is going to be spoken by the [...]. The text is converted to phoneme segments (phonemes with temporal information: starting and ending time) [30]. The phonemes are converted to corresponding visemes. Each viseme is equipped with a set of dominance functions of parameters participating in the articulation of the speech segment. We use the dominance functions from [6] for each viseme segment [...].
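As an illustration of how dominance functions attached to viseme segments can produce a smooth lip-parameter trajectory, the sketch below uses a negative-exponential dominance centered on each segment and a dominance-weighted average of the viseme targets. This is a common form for co-articulation models, but whether [6] uses exactly this form, and the values of `alpha` and `theta`, are assumptions; the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class VisemeSegment:
    """A viseme with timing and a target for one lip parameter (hypothetical)."""
    start: float         # segment start time in seconds
    end: float           # segment end time in seconds
    target: float        # target value of the lip parameter
    alpha: float = 1.0   # dominance magnitude (assumed)
    theta: float = 10.0  # dominance decay rate in 1/seconds (assumed)

    def dominance(self, t: float) -> float:
        """Dominance peaks at the segment center and decays exponentially."""
        center = 0.5 * (self.start + self.end)
        return self.alpha * math.exp(-self.theta * abs(t - center))


def lip_parameter(segments, t):
    """Dominance-weighted average of viseme targets at time t.

    Assumes a non-empty list of segments.
    """
    weights = [s.dominance(t) for s in segments]
    total = sum(weights)
    return sum(w * s.target for w, s in zip(weights, segments)) / total


# Two consecutive visemes: a nearly closed lip shape followed by an open one.
segments = [
    VisemeSegment(start=0.0, end=0.2, target=0.1),
    VisemeSegment(start=0.2, end=0.4, target=0.9),
]
at_boundary = lip_parameter(segments, 0.2)
```

At the boundary between the two segments both visemes have equal dominance, so the parameter sits midway between the targets; nearer a segment center the parameter is pulled toward that viseme's target, though the neighbouring viseme still contributes.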