Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

Knowledge-aware Assessment of Severity of Suicide Risk for Early Intervention

Published in The Web Conference (previously, WWW), 2019

Mental health illness such as depression is a significant risk factor for suicide ideation, behaviors, and attempts. A report by the Substance Abuse and Mental Health Services Administration (SAMHSA) shows that 80% of the patients suffering from Borderline Personality Disorder (BPD) have suicidal behavior, 5-10% of whom commit suicide. While multiple initiatives have been developed and implemented for suicide prevention, a key challenge has been the social stigma associated with mental disorders, which deters patients from seeking help or sharing their experiences directly with others including clinicians. This is particularly true for teenagers and younger adults where suicide is the second highest cause of death in the US. Prior research involving surveys and questionnaires (e.g., PHQ-9) for suicide risk prediction failed to provide a quantitative assessment of risk that informed timely clinical decision-making for intervention. Our interdisciplinary study concerns the use of Reddit as an unobtrusive data source for gleaning information about suicidal tendencies and other related mental health conditions afflicting depressed users. We provide details of our learning framework that incorporates domain-specific knowledge to predict the severity of suicide risk for an individual. Our approach involves developing a suicide risk severity lexicon using medical knowledge bases and suicide ontology to detect cues relevant to suicidal thoughts and actions. We also use language modeling, medical entity recognition, and normalization and negation detection to create a dataset of 2181 redditors that have discussed or implied suicidal ideation, behavior, or attempt. Given the importance of clinical knowledge, our gold standard dataset of 500 redditors (out of 2181) was developed by four practicing psychiatrists following the guidelines outlined in the Columbia Suicide Severity Rating Scale (C-SSRS), with the pairwise annotator agreement of 0.79 and group-wise agreement of 0.73.
Compared to the existing four-label classification scheme (no risk, low risk, moderate risk, and high risk), our proposed C-SSRS-based 5-label classification scheme distinguishes people who are supportive from those who show different severity of suicidal tendency. Our 5-label classification scheme outperforms the state-of-the-art schemes by improving the graded recall by 4.2% and reducing the perceived risk measure by 12.5%. Convolutional neural network (CNN) provided the best performance in our scheme due to the discriminative features and use of domain-specific knowledge resources, in comparison to SVM-L that has been used in the state-of-the-art tools over a similar dataset.

Recommended citation: Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. Knowledge-aware assessment of severity of suicide risk for early intervention. In The World Wide Web Conference, pp. 514-525. ACM, 2019. https://dl.acm.org/citation.cfm?id=3313698

Towards Geocoding Spatial Expressions (Vision Paper)

Published in SIGSPATIAL, 2019

Imprecise composite location references formed using ad hoc spatial expressions in English text make the geocoding task challenging for both inference and evaluation. Typically such spatial expressions fill in unestablished areas with new toponyms for finer spatial referents. For example, the spatial extent of the ad hoc spatial expression “north of” or “50 minutes away from” in relation to the toponym “Dayton, OH” refers to an ambiguous, imprecise area, requiring translation from this qualitative representation to a quantitative one with precise semantics using systems such as WGS84. Here we highlight the challenges of geocoding such referents and propose a formal representation that employs background knowledge, semantic approximations and rules, and fuzzy linguistic variables. We also discuss an appropriate evaluation technique for the task that is based on human contextualized and subjective judgment.

Recommended citation: Al-Olimat, Hussein S., Valerie L. Shalin, Krishnaprasad Thirunarayan, and Joy Prakash Sain. "Towards geocoding spatial expressions (vision paper)." In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 75-78. 2019. https://dl.acm.org/doi/abs/10.1145/3347146.3359356

Assessing the Severity of Health States based on Social Media Posts

Published in International Conference on Pattern Recognition (ICPR), 2020

The unprecedented growth of Internet users has resulted in an abundance of unstructured information on social media including health forums, where patients request health-related information or opinions from other users. Previous studies show that online peer support has limited potency without expert intervention. Therefore, a system capable of assessing the severity of patients from their social media posts can help health professionals (HP) in making a timely intervention. In this study, we inspect the efficacy of different aspects of Natural Language Understanding (NLU) to identify the severity of users’ health state from two perspectives: (a) Medical Condition (i.e., Recover, Exist, Deteriorate, Other) and (b) Medication (i.e., Effective, Ineffective, Serious Adverse Effect, Other) in online health communities. We propose a deep learning framework that models both the textual content and the contextual information to assess a user’s health state. Specifically, our model utilizes NLU features such as sentiment, emotions, personality, and use of figurative language to extract the contextual information. These multifaceted NLU features help in understanding an individual’s feelings, mental state, and behavior and thereby assist the model in capturing the health states more accurately along with the content feature extracted from social media medical blog-posts. We compare the performance of our framework on a publicly available dataset against the state-of-the-art baselines which are based on deep learning (CNN, LSTM, and Adversarial Learning) algorithms. The experimental results show that our proposed model significantly outperforms the baseline methods.

Recommended citation: Yadav, Shweta, Joy Prakash Sain, Amit Sheth, Asif Ekbal, Sriparna Saha, and Pushpak Bhattacharyya. "Assessing the severity of health states based on social media posts." In 2020 25th International Conference on Pattern Recognition (ICPR), pp. 5728-5735. IEEE, 2021. https://ieeexplore.ieee.org/abstract/document/9411980/

Identifying Depressive Symptoms from Tweets: Figurative Language Enabled Multitask Learning Framework

Published in International Conference on Computational Linguistics (COLING), 2020

Existing studies on using social media for deriving the mental health status of users focus on the depression detection task. However, for case management and referral to psychiatrists, health-care workers require a practical and scalable depressive disorder screening and triage system. This study aims to design and evaluate a decision support system (DSS) to reliably determine the depressive triage level by capturing fine-grained depressive symptoms expressed in user tweets through the emulation of the Patient Health Questionnaire-9 (PHQ-9) that is routinely used in clinical practice. As the 280-character limit on tweets incentivizes the use of creative artifacts in the utterances and figurative language forms a general fabric of communication for effective expression, the reliable detection of depressive symptoms from tweets is challenging. We propose a novel BERT-based robust multi-task learning framework to accurately identify the depressive symptoms using the auxiliary task of figurative language detection. Specifically, our proposed novel task-sharing mechanism, co-task aware attention, enables automatic selection of optimal information across the BERT layers by soft-sharing of parameters. Our results show that modeling figurative language can demonstrably improve the model’s robustness and reliability for distinguishing the depression symptoms.

Recommended citation: Yadav, Shweta, Jainish Chauhan, Joy Prakash Sain, Krishnaprasad Thirunarayan, Amit Sheth, and Jeremiah Schumm. "Identifying Depressive Symptoms from Tweets: Figurative Language Enabled Multitask Learning Framework." In Proceedings of the 28th International Conference on Computational Linguistics, pp. 696-709. 2020. https://aclanthology.org/2020.coling-main.61.pdf

METHOD AND SYSTEM FOR UNDERSTANDING FINANCIAL DOCUMENTS

Published in US Patent, 2022

Recommended citation: Kaur, S., Smiley, C., Sain, J. P., Gupta, A., Siddagangappa, S., and Shah, S. “METHOD AND SYSTEM FOR UNDERSTANDING FINANCIAL DOCUMENTS”. U.S. Patent Application No. 17/647,356, filed January 07, 2022.

REFinD: Relation Extraction Financial Dataset

Published in ACM Special Interest Group on Information Retrieval (SIGIR), 2023

A number of datasets for Relation Extraction (RE) have been created to aid downstream tasks such as information retrieval, semantic search, question answering and textual entailment. However, these datasets fail to capture financial-domain-specific challenges since most of these datasets are compiled using general knowledge sources, hindering real-life progress and adoption within the financial world. To address this limitation, we propose REFinD, the first large-scale annotated dataset of relations, with ∼29K instances and 22 relations amongst 8 types of entity pairs, generated entirely over financial documents. We also provide an empirical evaluation with various state-of-the-art models as benchmarks for the RE task and highlight the challenges posed by our dataset. We observed that various state-of-the-art deep learning models struggle with numeric inference, relational and directional ambiguity.

Recommended citation: Kaur, S., Smiley, C., Gupta, A., Sain, J. P., Wang, D., Siddagangappa, S., Aguda, T.D., Shah, S. "REFinD: Relation Extraction Financial Dataset." In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023.

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.