Kamrul Hasan

PhD Candidate @ University of Rochester

I am a PhD candidate at the University of Rochester, USA. I received my MS degree in Computer Science from the University of Rochester in 2018 and my B.Sc. in Computer Science and Engineering from Bangladesh University of Engineering and Technology in 2014.

My research focuses on theoretical and empirical aspects of multimodal machine learning. It is concerned with jointly modeling multiple modalities (language, acoustic, and visual) to understand human language as it happens in face-to-face communication. My work spans the following areas: multimodal representation learning, and humor, sentiment, argument, and credibility understanding in video utterances. My supervisor is Ehsan Hoque.

Resume



News
  • [2021]   EMNLP: one paper got accepted - multimodal argument analysis.
  • [2021]   One paper got accepted in IEEE Transactions on Affective Computing.
  • [2020]   AAAI: one paper got accepted - multimodal humor understanding.
  • [2020]   ACL: one paper got accepted - Multimodal Finetuning of Advanced Transformer Models.
  • [2020]   Passed thesis proposal "Multimodal Representation Learning for Human Behavior Understanding".
  • [2019]   EMNLP: one paper got accepted - UR-FUNNY dataset.

    Experience

    Research Intern
    Comcast AI, Washington DC
    Jul - Sep 2020. Advisor: Mahmudul Hasan.
    Project: unsupervised text segmentation.

    Applied Scientist Intern
    Amazon, Sunnyvale, CA
    May - Aug 2019. Advisor: Yelin Kim.
    Project: multimodal emotion recognition from noisy data.

    Software Engineer
    Therap (BD) Ltd, Bangladesh
    Aug 2014 - May 2015.
    Project: Worked with Java and J2EE technologies. Designed web solutions based on the Spring framework and the Hibernate ORM tool.


    Publications

    Highlighted Papers

    Hitting your MARQ: Multimodal ARgument Quality Assessment in Long Debate Video
    Md Kamrul Hasan, James Spann, Masum Hasan, Md Saiful Islam, Kurtis Haut, Rada Mihalcea, and Ehsan Hoque
    EMNLP 2021
    [PDF]
    Presents the first comprehensive study on multimodal argument quality assessment.

    Humor Knowledge Enriched Transformer for Understanding Multimodal Humor
    Md Kamrul Hasan, Sangwu Lee, Wasifur Rahman, Amir Zadeh, Rada Mihalcea, Louis-Philippe Morency, and Ehsan Hoque
    AAAI 2021
    [PDF] [Code]
    Can machines recognize a humorous punchline given the context story and all three modalities?

    Integrating Multimodal Information in Large Pretrained Transformers
    Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee, Amir Zadeh, Chengfeng Mao, Louis-Philippe Morency, and Ehsan Hoque.
    ACL 2020
    [PDF] [Code]
    Can large pre-trained language models integrate non-verbal features?

    UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
    Md Kamrul Hasan, Wasifur Rahman, Amir Zadeh, Jianyuan Zhong, Md Iftekhar Tanveer, and Louis-Philippe Morency
    EMNLP 2019
    [PDF] [Code] [Data]
    Proposed the first multimodal dataset and a baseline model (C-MFN) for humor detection.

    Facial Expression Based Imagination Index and a Transfer Learning Approach to Detect Deception
    Md Kamrul Hasan, Wasifur Rahman, Luke Gerstner, Taylan Sen, Sangwu Lee, Kurtis Haut, and Ehsan Hoque
    ACII 2019
    [PDF] [Project]
    Can baseline facial expressions help detect deception during interview questions?

    Unsupervised Text Segmentation using Coherence aware BERT
    Md Kamrul Hasan, Md Mahmudul Hasan, and Faisal Ishtiaq
    Preprint
    Designed an unsupervised text segmentation algorithm that achieved state-of-the-art performance on multiple datasets.

    Other Papers
    1. Md Kamrul Hasan, Taylan Sen, Yiming Yang, Raiyan Abdul Baten, Kurtis Glenn Haut, Mohammed Ehsan Hoque. "LIWC into the Eyes: Using Facial Features to Contextualize Linguistic Analysis in Multimodal Communication". In 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1-7. IEEE, 2019. [PDF] [Project]
    2. Taylan Sen, Md Kamrul Hasan, Minh Tran, Matt Levin, Yiming Yang, M. Ehsan Hoque. "Say CHEESE: the Common Habitual Expression Encoder for Smile Examination and its Application to Analyze Deceptive Communication". Automatic Face and Gesture Recognition Conference (FG), 2018. [PDF] [Project]
    3. Taylan Sen, Md Kamrul Hasan, Zach Teicher, Mohammed Ehsan Hoque. "Automated dyadic data recorder (ADDR) framework and analysis of facial cues in deceptive communication". Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, no. 4 (2018): 1-22. [PDF]
    4. Rasoul Shafipour, Raiyan Abdul Baten, Md Kamrul Hasan, Gourab Ghoshal, Gonzalo Mateos, Mohammed Ehsan Hoque. "Buildup of speaking skills in an online learning community: a network-analytic exploration". Palgrave Communications, 4(1), 1-10. [PDF]

    Research Projects

    Multimodal Representation Learning

    Collecting large-scale multimodal behavioral video datasets (e.g., for humor or sentiment prediction) is extremely expensive in both time and money, and there are no pre-trained models for the visual or acoustic modalities that are suitable for studying non-verbal cues of human behavior. This project aims to learn multimodal representations from millions of video utterances in a self-supervised manner. The pre-trained model will be helpful for downstream multimodal behavioral tasks with small datasets.
    Publications: Thesis (work in progress)
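
    As a concrete illustration, below is a toy sketch of one common self-supervised objective (InfoNCE-style contrastive alignment between two modalities of the same utterance). This is an assumed example for illustration only, not the method developed in the thesis.

    # Toy sketch: contrastive alignment between two modalities of the same
    # utterance (assumed illustration, not the thesis method).
    import torch
    import torch.nn.functional as F

    def contrastive_alignment_loss(text_emb, audio_emb, temperature=0.07):
        """text_emb, audio_emb: (batch, dim) embeddings of the same utterances."""
        text_emb = F.normalize(text_emb, dim=-1)
        audio_emb = F.normalize(audio_emb, dim=-1)
        logits = text_emb @ audio_emb.t() / temperature   # pairwise similarities
        targets = torch.arange(text_emb.size(0))          # matching pairs lie on the diagonal
        # symmetric cross-entropy: text-to-audio and audio-to-text
        return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

    # usage with random embeddings
    loss = contrastive_alignment_loss(torch.randn(16, 128), torch.randn(16, 128))
    print(loss.item())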

    Multimodal Humor Understanding

    Humor is produced in a multimodal manner, through the use of words (text), gestures (visual), and prosodic cues (acoustic). We introduced UR-FUNNY, the first video dataset for the humor detection task. It contains 8,257 humorous punchlines, each presented along with the prior sentences that build up its context, for a total duration of 90 hours. UR-FUNNY opens the door for the research community to study the multimodal cues involved in expressing humor.
    Publications: AAAI 2021, EMNLP 2019
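
    For intuition, here is a hedged sketch of how a single UR-FUNNY-style sample could be organized; the field names and feature dimensions are hypothetical placeholders, not the released data format.

    # Hypothetical layout of one humor-detection sample: context sentences plus a
    # punchline, each carrying aligned text, acoustic, and visual features.
    from dataclasses import dataclass
    from typing import List
    import numpy as np

    @dataclass
    class UtteranceFeatures:
        words: List[str]                 # transcript tokens
        acoustic: np.ndarray             # (num_frames, acoustic_dim) prosodic features
        visual: np.ndarray               # (num_frames, visual_dim) gesture/facial features

    @dataclass
    class HumorSample:
        context: List[UtteranceFeatures] # sentences building up to the punchline
        punchline: UtteranceFeatures
        is_humorous: bool                # binary humor label

    sample = HumorSample(
        context=[UtteranceFeatures(["so", "I", "walked", "in"], np.zeros((40, 81)), np.zeros((40, 75)))],
        punchline=UtteranceFeatures(["and", "it", "was", "my", "boss"], np.zeros((35, 81)), np.zeros((35, 75))),
        is_humorous=True,
    )
    print(len(sample.context), sample.is_humorous)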

    Multimodal Sentiment Analysis

    Predicting sentiment in video utterances using multimodal signals from the language, visual, and acoustic modalities. We design deep learning algorithms that capture both intra-modality and inter-modality interactions among these signals, and evaluate the models on two popular multimodal sentiment analysis datasets, CMU-MOSI and CMU-MOSEI.
    Publications: ACL 2020, AAAI 21 (under review)
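
    The sketch below illustrates the general idea under simple assumptions (it is not the model from the papers above): per-modality GRUs capture intra-modality dynamics, and a small feed-forward layer over the concatenated summaries models inter-modality interactions before regressing a sentiment score. The feature dimensions are hypothetical placeholders.

    # Minimal intra-/inter-modality fusion sketch (illustrative only).
    import torch
    import torch.nn as nn

    class SimpleMultimodalFusion(nn.Module):
        def __init__(self, text_dim=300, acoustic_dim=74, visual_dim=47, hidden=64):
            super().__init__()
            # intra-modality encoders
            self.text_rnn = nn.GRU(text_dim, hidden, batch_first=True)
            self.acoustic_rnn = nn.GRU(acoustic_dim, hidden, batch_first=True)
            self.visual_rnn = nn.GRU(visual_dim, hidden, batch_first=True)
            # inter-modality fusion and regression head (sentiment score)
            self.fusion = nn.Sequential(
                nn.Linear(3 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
            )

        def forward(self, text, acoustic, visual):
            # each input: (batch, seq_len, feature_dim); keep the final hidden state
            _, t = self.text_rnn(text)
            _, a = self.acoustic_rnn(acoustic)
            _, v = self.visual_rnn(visual)
            fused = torch.cat([t[-1], a[-1], v[-1]], dim=-1)
            return self.fusion(fused)

    # usage on random tensors shaped like a small utterance batch
    model = SimpleMultimodalFusion()
    score = model(torch.randn(8, 20, 300), torch.randn(8, 20, 74), torch.randn(8, 20, 47))
    print(score.shape)  # torch.Size([8, 1])

    The ACL 2020 paper instead integrates the non-verbal features directly into a large pre-trained transformer; this toy model only illustrates the intra-/inter-modality decomposition.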

    Multimodal Argument Analysis

    The current literature mostly considers textual content while assessing the quality of an argument, and is limited to datasets containing short sequences (18-48 words). In this project, we study argument quality assessment in a multimodal context, and experiment on DBATES, a publicly available dataset of long debate videos.
    Publications: EMNLP 2021, IEEE Transactions on Affective Computing

    Credibility Understanding

    Is there a chance that a computer can aid a human in detecting deceptive behavior? Are there other facial features that could indicate deception? Could linguistic characteristics be combined to aid in lie detection? How is all this data collected and analyzed in the first place? The following papers address these types of questions and raise even more interesting ones.
    Publications: ACII 2019.a, ACII 2019.b, FG 2018, UBICOMP 2018



    Contact

    Department of Computer Science
    2513 Wegmans Hall
    Box 270226
    University of Rochester
    Rochester, NY 14627
    Email: mhasan8@cs.rochester.edu