François Mairesse

Member of the Dialogue Systems Group of the Machine Intelligence Laboratory
Department of Engineering
University of Cambridge

Trumpington Street, Cambridge CB2 1PZ, United Kingdom
Phone: +44 1223 7657758
Email:

[ Young Researchers' Roundtable on Spoken Dialogue Systems, September 13-14th 2009 ]




I am a research associate at the Machine Intelligence Lab at the University of Cambridge. I am currently working on the EU FP7 CLASSiC project (Computational Learning in Adaptive Systems for Spoken Conversation), which focuses on statistical methods for data-driven semantic parsing, dialogue management and natural language generation. I completed my Ph.D. thesis in 2008 under the supervision of Prof. Marilyn Walker, at the Computer Science Department of the University of Sheffield, United Kingdom. I obtained a Master of Engineering and Computer Science in 2004 from the Université Catholique de Louvain in Belgium.

I am working on statistical methods for semantic decoding (natural language understanding) and natural language generation, with an emphasis on individual variation in linguistic style. This involves analysing how different utterances can be produced to communicate the same information. My Ph.D. thesis focuses on the effect of individual differences on human language production and perception, in order to automatically generate language varying along the main dimensions of personality. I'm also interested in linguistic adaptation to the user in dialogue systems, which requires identifying the user's individual preferences as well as producing utterances that match those linguistic characteristics.

Research Interests:

  • Linguistic variation in natural language generation
  • Expressive text-to-speech synthesis
  • Semantic parsing/decoding
  • Models of the influence of personality on language
  • Paraphrase acquisition from corpora
  • Individual adaptation in dialogue systems

Journal articles:


Peer-reviewed publications at international conferences:

Theses:
Online demos:
  • CamInfo: The Cambridge Tourist Information Dialogue System (requires a microphone)
    This Java applet is an interface to our group's live dialogue system, which provides information about most places in Cambridge, including pubs, restaurants, colleges and museums. The system can also be reached by phone on +44 1223 852 453. The system implements the Hidden Information State (HIS) framework, i.e. it relies on partially observable Markov decision processes (POMDPs) to reason over multiple hypotheses about the user input, which are provided by the ATK speech recogniser. Some PERSONAGE functionalities are used for language generation (e.g., syntactic aggregation, WordNet synonym selection). The speech synthesiser is an HTS voice trained on emphasis-dependent context features using the two-pass context clustering method.

  • PERSONAGE: Language Generation with Personality
    The PERSONAGE generator can produce personality-rich utterances for presenting information in the restaurant domain. You can use the interactive Java interface to observe how each utterance varies along the extraversion dimension. PERSONAGE is based on models of the generation parameters computed from human personality ratings, detailed in this paper.

  • Automatic personality recognition
    What does your language reveal about you? The personality recognition models can estimate your scores along the 5 main personality dimensions based on your input text. Models are detailed in this paper.

  • Big Five Inventory
    Find out your personality scores by filling in this short questionnaire.
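
As a rough illustration of what "reasoning over multiple hypotheses about the user input" means for the CamInfo system above, here is a minimal Python sketch of a single Bayesian belief update over user goals. It is a toy example, not the HIS algorithm itself: the function name `update_belief`, the goal labels, the confidence scores and the `floor` parameter are all assumptions made for illustration.

```python
def update_belief(prior, nbest, floor=1e-3):
    """Return the posterior belief over user goals after one user turn.

    prior: dict mapping each goal to its current probability (sums to 1).
    nbest: list of (goal, confidence) pairs, standing in for the
           recogniser's N-best hypotheses (hypothetical format).
    floor: small likelihood for goals missing from the N-best list, so
           that no goal is ruled out by one turn (an assumption).
    """
    # Sum the confidence mass that the N-best list assigns to each goal.
    likelihood = {}
    for goal, confidence in nbest:
        likelihood[goal] = likelihood.get(goal, 0.0) + confidence

    # Bayes rule: posterior is proportional to prior times likelihood.
    unnormalised = {
        goal: p * max(likelihood.get(goal, 0.0), floor)
        for goal, p in prior.items()
    }
    total = sum(unnormalised.values())
    return {goal: value / total for goal, value in unnormalised.items()}


# Illustrative values only: three goals with a uniform prior.
prior = {"pub": 1 / 3, "restaurant": 1 / 3, "museum": 1 / 3}
belief = update_belief(prior, [("pub", 0.6), ("restaurant", 0.3)])

# A second, noisier turn ranks "restaurant" first, but the accumulated
# belief still favours "pub" instead of committing to the top hypothesis.
belief = update_belief(belief, [("restaurant", 0.5), ("pub", 0.4)])
```

The point of keeping a full distribution is that a single misrecognised turn does not overwrite earlier evidence, which is the general motivation for POMDP-based dialogue management.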

Data and software:

Here are various human-annotated datasets and freely available software. Feel free to use and modify them for non-commercial purposes.

  • PERSONAGE dataset: a personality-annotated corpus
    This dataset contains 580 utterances annotated with personality/stylistic ratings from human judges, for each Big Five trait. The data also includes the generation decisions made for each utterance, as well as the intermediate content plan tree, sentence plan tree and syntactic structures. Naturalness ratings are also included. This data was used for evaluating the PERSONAGE generator, as well as for training parameter estimation models (Mairesse & Walker, 2007, 2008). More details are available in the PERSONAGE dataset readme file.

  • Personality Recognizer v1.02 (new version 06/06/2007)
    This Java command-line application extracts psycholinguistic features from multiple text files and runs the included Weka models to compute personality scores for all Big Five traits. An online demo is also available.

  • jMRC - MRC Psycholinguistic Database Java Interface v0.9
    This Java interface allows you to query the MRC Psycholinguistic Database from your Java programs, providing psycholinguistic features for over 150,000 words.
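
To give a feel for the kind of text features that tools like the ones above work with, here is a minimal Python sketch of a feature extraction step. It is not the Personality Recognizer's implementation and does not use the MRC database: the function `extract_features`, the `TOY_CATEGORIES` lexicon and the feature names are hypothetical, and the scoring step (the Weka models) is left out entirely.

```python
import re

# Hypothetical mini-lexicon for illustration only; real tools rely on
# much larger psycholinguistic resources.
TOY_CATEGORIES = {
    "positive_emotion": {"happy", "great", "love"},
    "first_person": {"i", "me", "my"},
}


def extract_features(text):
    """Compute a few simple length and word-category features for a text."""
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = len(words)

    features = {
        "word_count": n,
        "words_per_sentence": n / len(sentences) if sentences else 0.0,
    }
    # Proportion of tokens that fall into each lexicon category.
    for name, vocabulary in TOY_CATEGORIES.items():
        hits = sum(1 for w in words if w in vocabulary)
        features[name] = hits / n if n else 0.0
    return features


features = extract_features("I love my job. It is great!")
```

A feature dictionary of this shape could then be passed to any trained regression or classification model to obtain per-trait scores.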

Talks:
  • Trainable Generation of Personality through Data-driven Parameter Estimation
    • NLIP Seminar at the Computer Laboratory, Cambridge University, 21/11/2008.
    • 46th Annual Meeting of the Association for Computational Linguistics (ACL), Columbus, 16/06/2008.
  • Generating Language with Personality
    • SRI International's Artificial Intelligence Center, Menlo Park, 03/04/2008.
    • AAAI Spring Symposium on Emotion, Personality and Social Behavior, Stanford University, 26/03/2008.
    • Psychology Department, University of Texas at Austin, 14/11/2007.
    • Machine Intelligence Lab Seminar, Department of Engineering, Cambridge University, 22/10/2007.
    • Computing Department Seminar at the Open University, Milton Keynes, 20/09/2007.
    • 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, 26/06/2007.
    • NLP Group Talk, Sheffield, 20/03/2007.

  • Computational Models of Personality Recognition through Language
    • 28th Annual Conference of the Cognitive Science Society (CogSci 2006), Vancouver, 29/07/2006.
    • HLT-NAACL 2006 Conference, New York City, 05/06/2006.
    • NLP Group Talk, Sheffield, 16/05/2006.

  • Learning Individual Adaptation in Dialogue Systems
    • Symposium on Dialogue Modelling and Generation, Amsterdam, 07/07/2005.
    • NLP Group Talk, Sheffield, 10/05/2005.

Teaching:



François Mairesse, 2007 -