Seminar Papers

[Article] A semi-supervised autoencoder for autism disease diagnosis.

Summary: In this paper, they try to diagnose ASD using a semi-supervised autoencoder (SSAE). The SSAE learns to reconstruct the features that are specifically important for the downstream task (ASD vs. HC), which is conceptually different from a conventional AE. With sparsity constraints, the model surpasses SOTA performance on ABIDE classification. With a bit of fine-tuning (e.g., sparsity control), this method could be applied to our research using CHA data or to future p-factor-related research.
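
As a concrete illustration, here is a minimal sketch of the joint objective: a single-hidden-layer AE whose latent code feeds both a decoder and a classifier, with an L1 penalty as the sparsity constraint. The layer sizes and loss weights below are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class SSAE(nn.Module):
    """Minimal semi-supervised autoencoder: one latent code feeds both
    a decoder (reconstruction) and a classifier (ASD vs. HC)."""
    def __init__(self, n_features, n_hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_features)
        self.classifier = nn.Linear(n_hidden, 2)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z), z

def ssae_loss(x, x_hat, logits, y, z, alpha=1.0, beta=1e-3):
    # The supervised term steers reconstruction toward task-relevant
    # features; beta * mean(|z|) is the sparsity constraint.
    # (alpha and beta are illustrative, not the paper's values.)
    recon = nn.functional.mse_loss(x_hat, x)
    clf = nn.functional.cross_entropy(logits, y)
    return recon + alpha * clf + beta * z.abs().mean()

x = torch.randn(8, 1000)            # toy connectivity-style features
y = torch.randint(0, 2, (8,))       # ASD vs. HC labels
model = SSAE(n_features=1000)
x_hat, logits, z = model(x)
ssae_loss(x, x_hat, logits, y, z).backward()
```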

Yin, W., Li, L., & Wu, F. X. (2022). A semi-supervised autoencoder for autism disease diagnosis. Neurocomputing, 483, 140-147.

[Article] Transformer-based multimodal information fusion for facial expression analysis.

Summary: In this work, they utilize multimodal features of spoken words, speech prosody, and facial expression from the Aff-Wild2 dataset. They combine these features with a transformer-based fusion module, which produces output embedding features from the sequences of images, audio, and text. The integrated output feature is then processed by an MLP layer for Action Unit (AU) detection as well as facial expression recognition.
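
A minimal sketch of this kind of fusion, assuming each modality arrives as a token sequence and that concatenation plus a shared transformer encoder stands in for the paper's fusion module; all dimensions, the mean pooling, and the head sizes are placeholder assumptions.

```python
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    """Sketch of transformer-based fusion: per-modality sequences are
    projected to a shared width, concatenated along the token axis,
    and mixed by a transformer encoder; the pooled output feeds two
    MLP heads (AU detection and expression recognition)."""
    def __init__(self, d_img=512, d_aud=128, d_txt=300, d=256, n_au=12, n_expr=8):
        super().__init__()
        self.proj = nn.ModuleDict({
            "img": nn.Linear(d_img, d),
            "aud": nn.Linear(d_aud, d),
            "txt": nn.Linear(d_txt, d),
        })
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.au_head = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, n_au))
        self.expr_head = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, n_expr))

    def forward(self, img, aud, txt):
        tokens = torch.cat([self.proj["img"](img),
                            self.proj["aud"](aud),
                            self.proj["txt"](txt)], dim=1)
        fused = self.encoder(tokens).mean(dim=1)   # simple mean pooling
        return self.au_head(fused), self.expr_head(fused)

model = FusionTransformer()
au_logits, expr_logits = model(torch.randn(2, 16, 512),
                               torch.randn(2, 16, 128),
                               torch.randn(2, 16, 300))
```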

Zhang, Wei, et al. “Transformer-based multimodal information fusion for facial expression analysis.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[Article] Video-based multimodal spontaneous emotion recognition using facial expressions and physiological signals.

Summary: In this work, they propose the first video-based multimodal spontaneous emotion recognition approach that combines facial expressions with physiological data to exploit the advantages of each modality. The facial-expression feature vector is fused with physiological signals, including the iPPG signal and HRV. The feature-level-fused input is then processed by a 3D Xception-based DNN model.
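
The feature-level fusion step amounts to concatenating the two feature vectors before classification. In the sketch below, placeholder encoders stand in for the 3D Xception-style video backbone and the physiological feature extractor; all dimensions are made up for illustration.

```python
import torch
import torch.nn as nn

# Minimal sketch of feature-level fusion (not the paper's architecture):
# a placeholder video encoder stands in for the 3D Xception backbone.
face_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 64 * 64, 256), nn.ReLU())
phys_encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # iPPG + HRV features
classifier = nn.Sequential(nn.Linear(256 + 64, 128), nn.ReLU(), nn.Linear(128, 7))

video = torch.randn(2, 3, 16, 64, 64)   # (batch, channels, frames, H, W)
phys = torch.randn(2, 32)               # toy iPPG/HRV feature vector
fused = torch.cat([face_encoder(video), phys_encoder(phys)], dim=1)  # feature-level fusion
emotion_logits = classifier(fused)      # (batch, n_emotions)
```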

Ouzar, Yassine, et al. “Video-based multimodal spontaneous emotion recognition using facial expressions and physiological signals.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[Article] OFC represents shifting latent states in learning.

Summary: This paper investigated how neural representations for learning change during rapid behavioral changes. Various cortical regions contribute to these representational changes, notably DLPFC and ACC representing uncertainty, and OFC representing (rapidly) shifting contexts.

Nassar, M. R., McGuire, J. T., Ritz, H., & Kable, J. W. (2019). Dissociable forms of uncertainty-driven representational change across the human brain. Journal of Neuroscience, 39(9), 1688-1698.

[Article] Multimodal network dynamics underpinning working memory.

Summary: First, they found that WM performance is related to FPN-DMN coupling. Furthermore, they identified two sub-networks within the FPN and showed how their activity, functional connectivity (FC), and structural connectivity (SC) relate to the integrative processing of complex cognition, using the HCP 2-back task.
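
For reference, between-network coupling of this kind is commonly computed as the mean Pearson correlation between the time series of regions in the two networks. A toy NumPy version is below; the region indices and data are made up and this is not the paper's pipeline.

```python
import numpy as np

# Illustrative FPN-DMN coupling from parcellated BOLD time series.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 100))            # (regions, timepoints)
fpn_idx = np.arange(0, 25)                      # hypothetical FPN regions
dmn_idx = np.arange(25, 50)                     # hypothetical DMN regions

fc = np.corrcoef(ts)                            # static FC: Pearson correlations
coupling = fc[np.ix_(fpn_idx, dmn_idx)].mean()  # mean between-network FC
print(f"FPN-DMN coupling: {coupling:.3f}")
```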

Murphy, A. C., Bertolero, M. A., Papadopoulos, L., Lydon-Staley, D. M., & Bassett, D. S. (2020). Multimodal network dynamics underpinning working memory. Nature communications, 11(1), 3035.

[Article] scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses.

Summary: This paper introduced the single-cell graph neural network (scGNN) to provide a hypothesis-free deep learning framework for scRNA-Seq analysis. It formulates and aggregates cell-cell relationships with a GNN and models heterogeneous gene expression patterns using a left-truncated mixture Gaussian model. It integrates three iterative multi-modal autoencoders and outperforms existing tools for gene imputation and cell clustering.
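
The cell-cell graph idea can be sketched as a KNN graph over cells feeding a small graph autoencoder, as below. The iterative multi-modal autoencoders and the left-truncated mixture Gaussian model are omitted, and all sizes are toy assumptions.

```python
import torch
import torch.nn as nn

def knn_adjacency(x, k=5):
    """Symmetrically normalized KNN cell-cell graph (toy construction)."""
    d = torch.cdist(x, x)                               # pairwise cell distances
    idx = d.topk(k + 1, largest=False).indices[:, 1:]   # k nearest, skipping self
    a = torch.zeros(x.size(0), x.size(0))
    a.scatter_(1, idx, 1.0)
    a = ((a + a.t()) > 0).float() + torch.eye(x.size(0))  # symmetrize, add self-loops
    deg = a.sum(1)
    return a / deg.sqrt().outer(deg.sqrt())             # D^-1/2 A D^-1/2

class GraphAE(nn.Module):
    def __init__(self, n_genes, d=64):
        super().__init__()
        self.enc = nn.Linear(n_genes, d)
        self.dec = nn.Linear(d, n_genes)

    def forward(self, x, a_norm):
        z = torch.relu(a_norm @ self.enc(x))   # one round of neighbor aggregation
        return self.dec(z), z                  # imputed expression, cell embedding

x = torch.randn(100, 500).abs()               # toy cells x genes matrix
a = knn_adjacency(x)
model = GraphAE(n_genes=500)
x_hat, z = model(x, a)
loss = nn.functional.mse_loss(x_hat, x)       # reconstruction drives imputation
```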

Wang, Juexin, et al. “scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses.” Nature communications 12.1 (2021): 1882.

[Article] Short and long range relation based spatio-temporal transformer for micro-expression recognition.

Summary: In this paper, they investigate facial micro-expressions, which have recently been attracting much attention. They propose a novel spatio-temporal transformer architecture, the first purely transformer-based approach for micro-expression recognition. It captures both local and global spatio-temporal patterns of video in an end-to-end way. This model is currently SOTA on the MER (micro-expression recognition) task.
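
One common way to realize such a model is factorized spatio-temporal attention: a spatial encoder mixes patch tokens within each frame, then a temporal encoder mixes frame embeddings across time. This factorization is an assumption for illustration and may not match the paper's exact design; all sizes are toy values.

```python
import torch
import torch.nn as nn

class SpatioTemporalTransformer(nn.Module):
    """Sketch of factorized spatio-temporal attention: spatial layers
    capture within-frame (short-range) relations, temporal layers
    capture across-frame (long-range) relations."""
    def __init__(self, d=128, n_classes=3):
        super().__init__()
        spatial = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        temporal = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.spatial = nn.TransformerEncoder(spatial, num_layers=2)
        self.temporal = nn.TransformerEncoder(temporal, num_layers=2)
        self.head = nn.Linear(d, n_classes)

    def forward(self, x):                      # x: (batch, frames, patches, d)
        b, t, p, d = x.shape
        x = self.spatial(x.reshape(b * t, p, d)).mean(dim=1)  # per-frame embedding
        x = self.temporal(x.reshape(b, t, d)).mean(dim=1)     # across-frame mixing
        return self.head(x)

model = SpatioTemporalTransformer()
logits = model(torch.randn(2, 8, 49, 128))    # e.g. 8 frames of 7x7 patch tokens
```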

Zhang, Liangfei, et al. “Short and long range relation based spatio-temporal transformer for micro-expression recognition.” IEEE Transactions on Affective Computing 13.4 (2022).

[Article] Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types.

Summary: Current deep learning methods often focus on modeling genome sequences of a fixed set of cell types and do not account for the interaction between multiple regulatory elements. They propose a simple yet effective approach for pre-training genome data in a multi-modal and self-supervised manner, which they call GeneBERT. They pre-train and evaluate the GeneBERT model on regulatory downstream tasks across different cell types, including promoter classification, transcription factor binding site prediction, disease risk estimation, and splice site prediction.
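
The self-supervised part follows the usual BERT-style masked-prediction recipe; a toy version over DNA tokens is sketched below. The multi-modal component that additionally encodes regulatory-region structure is omitted, and the vocabulary and sizes are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Toy masked pre-training over genome tokens (not GeneBERT itself).
vocab, d, mask_id = 6, 64, 5                  # e.g. A, C, G, T, PAD, [MASK]
embed = nn.Embedding(vocab, d)
layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
lm_head = nn.Linear(d, vocab)

seq = torch.randint(0, 4, (8, 128))           # batch of DNA token sequences
mask = torch.rand(seq.shape) < 0.15           # mask 15% of positions
inp = seq.masked_fill(mask, mask_id)
logits = lm_head(encoder(embed(inp)))
loss = nn.functional.cross_entropy(logits[mask], seq[mask])  # predict masked bases
loss.backward()
```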

Mo, Shentong, et al. “Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types.” arXiv preprint arXiv:2110.05231 (2021).

[Article] Through the looking glass: deep interpretable dynamic directed connectivity in resting fMRI

Summary: The static functional connectivity matrix is usually calculated using simple Pearson correlation coefficients. This is simple but cannot represent the dynamic relations of the brain. Here, they applied self-attention to calculate attention scores between embedded regions, and temporal attention to compute a weighted sum of these dynamic functional connectivities. Using this architecture, called DICE, they were able to classify mental disorders and genders and to predict age on several large datasets.
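
The core idea can be sketched as follows: the self-attention score matrix between region embeddings serves as a directed connectivity matrix per time window, and temporal attention weights combine the windows into one summary matrix. The projections and sizes below are toy stand-ins, not the DICE implementation.

```python
import torch
import torch.nn as nn

d, regions, windows = 32, 50, 20
wq, wk = nn.Linear(d, d), nn.Linear(d, d)     # query/key projections
temporal_score = nn.Linear(d, 1)              # scores each time window

h = torch.randn(windows, regions, d)          # embedded regions per window
attn = torch.softmax(wq(h) @ wk(h).transpose(1, 2) / d ** 0.5, dim=-1)
# attn[t] is an asymmetric (hence directed) regions x regions matrix.

w = torch.softmax(temporal_score(h.mean(dim=1)), dim=0)   # (windows, 1)
summary_fc = (w.unsqueeze(-1) * attn).sum(dim=0)          # weighted sum over windows
print(summary_fc.shape)                       # torch.Size([50, 50])
```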

Mahmood, U., Fu, Z., Ghosh, S., Calhoun, V., & Plis, S. (2022). Through the looking glass: deep interpretable dynamic directed connectivity in resting fMRI. NeuroImage, 119737.

[Article] A dynamic graph convolutional neural network framework reveals new insights into connectome dysfunctions in ADHD.

Summary: They proposed a novel approach that incorporates dynamic graph computation and 2-hop neighbor node feature aggregation into graph convolution for brain network modeling. They used a convolutional pooling strategy to read out the graph, which jointly integrates the graph convolution and readout functions. They could visualize the model weights, which showed interpretable connectomic patterns that facilitate the understanding of brain functional abnormalities.
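
A minimal sketch of 2-hop aggregation with a convolutional readout: the normalized adjacency A and its square A @ A carry the 1-hop and 2-hop neighborhoods. The dynamic graph computation is omitted and all details below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TwoHopGCN(nn.Module):
    """Sketch: aggregate features from 1-hop (A) and 2-hop (A @ A)
    neighbors, then read out the graph with a convolution over the
    node dimension as a simple stand-in for convolutional pooling."""
    def __init__(self, d_in, d_hid=32, n_classes=2):
        super().__init__()
        self.w1 = nn.Linear(d_in, d_hid)
        self.w2 = nn.Linear(d_in, d_hid)
        self.readout = nn.Conv1d(2 * d_hid, d_hid, kernel_size=1)
        self.fc = nn.Linear(d_hid, n_classes)

    def forward(self, x, a):                   # x: (nodes, d_in), a: normalized adjacency
        h = torch.cat([torch.relu(a @ self.w1(x)),
                       torch.relu((a @ a) @ self.w2(x))], dim=-1)
        h = self.readout(h.t().unsqueeze(0))   # conv across nodes: (1, d_hid, nodes)
        return self.fc(h.mean(dim=-1))         # graph-level logits: (1, n_classes)

n = 90                                         # e.g. an AAL-sized brain graph
a = torch.rand(n, n); a = (a + a.t()) / 2      # toy symmetric connectivity
a = a / a.sum(1, keepdim=True)                 # row-normalize
model = TwoHopGCN(d_in=n)
logits = model(torch.randn(n, n), a)
```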

Zhao, Kanhao, et al. “A dynamic graph convolutional neural network framework reveals new insights into connectome dysfunctions in ADHD.” Neuroimage 246 (2022): 118774.