Research
I have conducted research in machine learning optimization, feature selection, and image segmentation, resulting in 9 peer-reviewed publications across IEEE, Springer, and Elsevier journals, among others. My current work at Columbia University sits at the intersection of causal inference and privacy law. Its aim is to ensure that LLMs adhere to legal frameworks, to support sound data management, and to build user trust in these technologies.
Current Research
I am currently a Research Assistant at the ARiSE (Advanced Research in Software Engineering) Lab at Columbia University, working on an NSF-funded project on compliance auditing of LLMs using Causal Inference and Explainable AI techniques. The primary goal is to check whether current and upcoming LLMs adhere to data privacy laws from around the world, such as the GDPR, the EU AI Act, and the Colorado AI Act. These laws build upon the FIPPs (Fair Information Practice Principles) and the OECD's purpose limitation and data minimization guidelines. The current task is to build a benchmark that audits today's open-source LLMs against both real-world and synthetically generated datasets in the Finance, Employment, and Healthcare domains, then to provide counterfactuals and eventually evaluate the models. Besides Columbia University, the project has academic collaborators from Wesleyan University and the University of South Carolina, while industry participation is led by Google and IBM, whose focus is to provide their own LLMs and datasets.
- Collaborating with 2 other students and 3 faculty members from Columbia, Wesleyan, and the University of South Carolina.
- Real-time data from industry will be provided by researchers from Google and IBM.
- Currently developing a benchmark of both manually written and LLM-assisted scenarios to assess LLM compliance in the finance, healthcare, and employment domains.
- The next phase of work is to develop counterfactuals and a minimal cause explainer that establishes the set of causal factors behind an LLM's decision.
- The final software will automatically detect compliance and identify the set of parameters that need to be changed or tuned in a model. It is intended for companies planning to launch new LLM versions.
Tech: Python, MCP, Generative AI, LM Studio, Ollama
Previous Research
I previously worked as a Research Assistant at the Centre for Microprocessor Applications Training and Research (CMATER) lab at Jadavpur University, India, during my undergraduate studies. To stay engaged with ML optimization problems, I now review research manuscripts at the request of various journals, helping to advance work in this domain. Highlights from my time at CMATER are as follows:
- Collaborated with 5+ academic researchers while guiding undergraduate and graduate students at the CMATER Lab.
- Published 9 peer-reviewed research papers with 400+ citations in areas including cancer detection, stock market prediction, signature verification, advertising optimization on social networks, and image segmentation of brain MRI scans.
- Presented research findings at multiple international conferences organized by IEEE and Springer.
- Actively help current university students working in the feature selection domain.
Tech: Python, NumPy, SciPy, scikit-learn, PyPI, MATLAB, Statistics