Posts by Collection

phd

portfolio

projects

Hybrid Precoding for mmWave Massive MIMO OFDM

Hybrid Precoding, Partially Connected Structure, 2018

Hybrid precoding, a combination of digital and analog precoding, is an alternative to traditional precoding methods in massive MIMO systems with a large number of antenna elements and has shown promising results recently. In this paper, we implement a parallel framework to make hybrid precoding competitive in fast-fading environments. A low-complexity algorithm that exploits the block-diagonal, phase-only nature of the analog precoder in a partially connected structure is proposed to arrive at a hybrid precoding solution for a multi-carrier single-user system using orthogonal frequency division multiplexing (OFDM). The original problem is broken down into two subproblems, finding the magnitude components and finding the phase components, which are solved independently. A per-RF-chain power constraint, which is much more practical in real systems, is introduced instead of the sum power constraint over all antennas. An alternating version of the same algorithm is proposed for increased spectral-efficiency gains. Complexity and run-time analyses demonstrate the advantage of the proposed algorithm over existing hybrid precoding schemes for the partially connected structure in an OFDM setting. The simulation results reveal insights about the partially connected structure and the tradeoffs required to make it workable in a real wideband system.
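
To make the structural constraint concrete, here is a minimal sketch (not the algorithm from the paper) of what a block-diagonal, phase-only analog precoder looks like in a partially connected structure: each RF chain drives its own disjoint group of antennas, so every antenna row has exactly one nonzero entry, and that entry has unit modulus. The function name, the array sizes, and the example phases are illustrative assumptions.

```python
import cmath
import math


def partially_connected_analog_precoder(phases, num_rf_chains):
    """Build an N_t x N_rf block-diagonal, phase-only analog precoder.

    phases: list of N_t phase angles in radians, one per antenna.
    Antenna i is wired only to RF chain i // (N_t // N_rf), so each
    column is nonzero only inside its own block of rows.
    """
    num_antennas = len(phases)
    if num_antennas % num_rf_chains != 0:
        raise ValueError("antennas must divide evenly among RF chains")
    per_chain = num_antennas // num_rf_chains

    matrix = [[0j] * num_rf_chains for _ in range(num_antennas)]
    for antenna, phase in enumerate(phases):
        chain = antenna // per_chain
        # Phase shifters change phase only, so the magnitude stays at 1.
        matrix[antenna][chain] = cmath.exp(1j * phase)
    return matrix


# Hypothetical example: 8 antennas, 2 RF chains, arbitrary phases.
example_phases = [k * math.pi / 4 for k in range(8)]
F_rf = partially_connected_analog_precoder(example_phases, num_rf_chains=2)
```

Because the magnitudes are fixed by the hardware, only the phases remain free in the analog stage, which is what lets the magnitude and phase parts of the problem be treated separately.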

Character Based Language Models Through Variational Sentence and Word Embeddings

NLP, Language Model, 2018

Language models have come of age recently with the introduction of long short-term memory (LSTM) based encoders and decoders and the advent of the attention mechanism. These models, however, work by generating one word at a time and cannot account for character-level similarities and differences. In this project, we propose a novel character-based hierarchical variational autoencoder framework that can learn the word and sentence embeddings at the same time. We couple this with an attention mechanism over the latent word embeddings to realize the end-to-end autoencoder framework.

Representation Learning Strategies for the Epigenome and Chromatin Structure using Recurrent Neural Models

Thesis, Thesis, 2023

In this Ph.D. thesis, we propose frameworks for designing informative position-specific representations from epigenomic and structural genomic signals. We use recurrent priors in our analysis because signals at nearby genomic positions are heavily correlated, and we implement them using recurrent neural models. We demonstrate that the representations we learn are helpful for various tasks, including locating known genomic elements, identifying conserved sites, correlating with established genomic measures, enabling accurate decoding, finding elements that drive 3D conformation, attributing relative positional importance, and performing in-silico modifications. In the process of designing these representations, we study two classes of strategies that differ in their underlying philosophy, namely autoencoding and categorical encoding. We show that the usefulness of these representations depends on the strategy used to design them.

publications

A Downscaled Faster-RCNN Framework for Signal Detection and Time-Frequency Localization in Wideband RF Systems

Published in IEEE Transactions on Wireless Communications, 2020

We propose a wideband spectrum sensing technique to detect and localize wireless radio frequency (RF) signals of interest in time and frequency when uninteresting signals cause RF interference (RFI). Specifically, we adopt and downscale the existing Faster-RCNN (FRCNN) framework to achieve better signal detection and localization than the state-of-the-art. For experimental evaluation, we present a data generation framework with Wi-Fi as the signals of interest and Bluetooth and microwave oven signals as the RFI. Experiments reveal that (i) the downscaled FRCNN model can achieve a mean average precision (mAP) of up to 0.8, significantly outperforming the state-of-the-art, (ii) feature extraction with the VGG-13 architecture gives the best mAP when it is initialized with pretrained weights and configured as trainable, (iii) for signal detection in real RF traces, when compared to training purely with synthetic RF data, a better mAP can be achieved by training with a mixture of synthetic and real RF traces or by fine-tuning the synthetically trained weights with an additional round of training on a small amount of real RF traces, and (iv) the mAP performance decreases as the signal-to-noise ratio (SNR) is lowered.

Recommended citation: Prasad, K. S. V., D’souza, K. B., & Bhargava, V. K. (2020). A downscaled faster-RCNN framework for signal detection and time-frequency localization in wideband RF systems. IEEE Transactions on Wireless Communications, 19(7), 4847-4862. Full Document
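
For readers unfamiliar with how detections are scored, the mAP figures in the abstract above rest on matching predicted boxes to ground-truth boxes by intersection over union (IoU). Below is a minimal sketch of IoU for time-frequency boxes; the `TFBox` type, its field names, and the example values are hypothetical, and the IoU threshold and full evaluation protocol used in the paper are not stated here.

```python
from typing import NamedTuple


class TFBox(NamedTuple):
    """A rectangular region on a spectrogram (hypothetical helper type)."""
    t_start: float
    t_end: float
    f_low: float
    f_high: float


def iou(a: TFBox, b: TFBox) -> float:
    """Intersection over union of two time-frequency boxes."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    overlap_t = max(0.0, min(a.t_end, b.t_end) - max(a.t_start, b.t_start))
    overlap_f = max(0.0, min(a.f_high, b.f_high) - max(a.f_low, b.f_low))
    intersection = overlap_t * overlap_f

    area_a = (a.t_end - a.t_start) * (a.f_high - a.f_low)
    area_b = (b.t_end - b.t_start) * (b.f_high - b.f_low)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative boxes in arbitrary time and frequency units.
truth = TFBox(t_start=0.0, t_end=2.0, f_low=0.0, f_high=2.0)
prediction = TFBox(t_start=1.0, t_end=3.0, f_low=1.0, f_high=3.0)
score = iou(truth, prediction)  # intersection 1, union 7
```

A detection typically counts as correct only when its IoU with a ground-truth box exceeds a chosen threshold, so a detector must localize a signal well in both time and frequency to score a high mAP.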

Latent representation of the human pan-celltype epigenome through a deep recurrent neural network

Published in IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2021

The availability of thousands of assays of epigenetic activity necessitates compressed representations of these data sets that summarize the epigenetic landscape of the genome. Until recently, most such representations were cell type-specific, applying to a single tissue or cell state. Recently, neural networks have made it possible to summarize data across tissues to produce a pan-cell type representation. In this work, we propose Epi-LSTM, a deep long short-term memory (LSTM) recurrent neural network autoencoder to capture the long-term dependencies in the epigenomic data. The latent representations from Epi-LSTM capture a variety of genomic phenomena, including gene expression, promoter-enhancer interactions, replication timing, frequently interacting regions, and evolutionary conservation. These representations outperform existing methods in a majority of cell types, while yielding smoother representations along the genomic axis due to their sequential nature.

Recommended citation: Dsouza, K. B., Li, A. Y., Bhargava, V., & Libbrecht, M. W. (2021). Latent representation of the human pan-celltype epigenome through a deep recurrent neural network. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Full Document

Wireless threat detection device, system, and methods to detect signals in wideband RF systems and localize related time and frequency information based on deep learning

Published in US Patent, 2021

The present invention comprises a novel system and method to detect and estimate the time-frequency span of wireless signals present in a wideband RF spectrum. In preferred embodiments, the Faster RCNN deep learning architecture is used to detect the presence of wireless transmitters from the spectrogram images plotted by searching for rectangular shapes of any size, then localize the time and frequency information from the output of the FRCNN deep learning architecture.

Recommended citation: Koppisetti, N. R. S. V. P., Dsouza, K. B., Boostanimehr, H., & Mallick, S. (2022). U.S. Patent Application No. 17/825,304. Full Document

Learning representations of chromatin contacts using a recurrent neural network identifies genomic drivers of conformation

Published in Nature Communications, 2021

Despite the availability of chromatin conformation capture experiments, discerning the relationship between the 1D genome and 3D conformation remains a challenge, which limits our understanding of their effect on gene expression and disease. We propose Hi-C-LSTM, a method that produces low-dimensional latent representations that summarize intra-chromosomal Hi-C contacts via a recurrent long short-term memory neural network model. We find that these representations contain all the information needed to recreate the observed Hi-C matrix with high accuracy, outperforming existing methods. These representations enable the identification of a variety of conformation-defining genomic elements, including nuclear compartments and conformation-related transcription factors. They furthermore enable in-silico perturbation experiments that measure the influence of cis-regulatory elements on conformation.

Recommended citation: Dsouza, K. B., Maslova, A., Al-Jibury, E., Merkenschlager, M., Bhargava, V. K., & Libbrecht, M. W. (2022). Learning representations of chromatin contacts using a recurrent neural network identifies genomic drivers of conformation. Nature Communications, 13(1), 1-19. Full Document

Assessing the climate benefits of afforestation in the Canadian Northern Boreal and Southern Arctic

Published in Nature Communications, 2025

Afforestation greatly influences several earth system processes, making it essential to understand these effects to accurately assess its potential for climate change mitigation. Although our understanding of forest-climate system interactions has improved, significant knowledge gaps remain, preventing definitive assessments of afforestation’s net climate benefits. In this review, focusing on the Canadian northern boreal and southern arctic, we identify these gaps and synthesize existing knowledge. The review highlights regional realities, Earth’s climatic history, uncertainties in biogeochemical (BGC) and biogeophysical (BGP) changes following afforestation, and limitations in current assessment methodologies, emphasizing the need to reconcile these uncertainties before drawing firm conclusions about the climate benefits of afforestation. Finally, we propose an assessment framework which considers multiple forcing components, temporal analysis, future climatic contexts, and implementation details. We hope that the research gaps and assessment framework discussed in this review inform afforestation policy in Canada and other circumpolar nations.

Recommended citation: Dsouza, K. B., Ofosu, E., Salkeld, J., Boudreault, R., Moreno-Cruz, J., & Leonenko, Y. (2025). Assessing the climate benefits of afforestation in the Canadian Northern Boreal and Southern Arctic. Nature Communications, 16(1), 1964. Full Document

Bridging Farm Economics and Landscape Ecology for Global Sustainability through Hierarchical and Bayesian Optimization

Submitted, 2025

Agricultural landscapes face the dual challenge of sustaining food production while reversing biodiversity loss. Agri-environmental policies often fall short of delivering ecological functions such as landscape connectivity, in part due to a persistent disconnect between farm-level economic decisions and landscape-scale spatial planning. We introduce a novel hierarchical optimization framework that bridges this gap. First, an Ecological Intensification (EI) model determines the economically optimal allocation of land to margin and habitat interventions at the individual farm level. These farm-specific intervention levels are then passed to an Ecological Connectivity (EC) model, which spatially arranges them across the landscape to maximize connectivity while preserving farm-level profitability. Finally, we introduce a Bayesian Optimization (BO) approach that translates these spatial outcomes into simple, cost-effective, and scalable policy instruments, such as subsidies and eco-premiums, using non-spatial, farm-level policy parameters. Applying the framework to a Canadian agricultural landscape, we demonstrate how it enhances connectivity under real-world economic constraints. Our approach provides a globally relevant tool for aligning farm incentives with biodiversity goals, advancing the development of agri-environmental policies that are economically viable and ecologically effective.

Recommended citation: Dsouza, K. B., Watt, G. A., Leonenko, Y., & Moreno-Cruz, J. (2025). Bridging Farm Economics and Landscape Ecology for Global Sustainability through Hierarchical and Bayesian Optimization. arXiv preprint arXiv:2508.06386. Full Document

BoreaRL: A Multi-Objective Reinforcement Learning Environment for Climate-Adaptive Boreal Forest Management

Submitted, 2025

Boreal forests store 30-40% of terrestrial carbon, much in climate-vulnerable permafrost soils, making their management critical for climate mitigation. However, optimizing forest management for both carbon sequestration and permafrost preservation presents complex trade-offs that current tools cannot adequately address. We introduce BoreaRL, the first multi-objective reinforcement learning environment for climate-adaptive boreal forest management, featuring a physically-grounded simulator of coupled energy, carbon, and water fluxes. BoreaRL supports two training paradigms: site-specific mode for controlled studies and generalist mode for learning robust policies under environmental stochasticity. Through evaluation of multi-objective RL algorithms, we reveal a fundamental asymmetry in learning difficulty: carbon objectives are significantly easier to optimize than thaw (permafrost preservation) objectives, with thaw-focused policies showing minimal learning progress across both paradigms. In generalist settings, standard preference-conditioned approaches fail entirely, while a naive curriculum learning approach achieves superior performance by strategically selecting training episodes. Analysis of learned strategies reveals distinct management philosophies, where carbon-focused policies favor aggressive high-density coniferous stands, while effective multi-objective policies balance species composition and density to protect permafrost while maintaining carbon gains. Our results demonstrate that robust climate-adaptive forest management remains challenging for current MORL methods, establishing BoreaRL as a valuable benchmark for developing more effective approaches. We open-source BoreaRL to accelerate research in multi-objective RL for climate applications.

Recommended citation: Dsouza, K. B., Ofosu, E., Amaogu, D. C., Pigeon, J., Boudreault, R., Maghoul, P., ... & Leonenko, Y. (2025). BoreaRL: A Multi-Objective Reinforcement Learning Environment for Climate-Adaptive Boreal Forest Management. arXiv preprint arXiv:2509.19846. Full Document
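
As background for the preference-conditioned approaches mentioned in the abstract above, the sketch below shows generic linear scalarization, in which a preference vector weights a vector-valued reward into a single scalar. This is an illustration of the general idea only: the abstract does not say which scalarization the evaluated algorithms use, and the function name and example reward values are assumptions.

```python
def scalarize(reward_vector, preference):
    """Collapse a multi-objective reward into a scalar by weighted sum.

    reward_vector: per-objective rewards, e.g. (carbon, thaw).
    preference: non-negative weights that sum to 1, one per objective.
    """
    if len(reward_vector) != len(preference):
        raise ValueError("one weight is required per objective")
    if any(w < 0 for w in preference) or abs(sum(preference) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * r for w, r in zip(preference, reward_vector))


# Hypothetical step reward: a carbon gain paired with a thaw penalty.
step_reward = (2.0, -1.0)
balanced = scalarize(step_reward, (0.5, 0.5))
carbon_only = scalarize(step_reward, (1.0, 0.0))
```

Sweeping the preference vector from carbon-only to thaw-only traces out different trade-offs between the two objectives, which is where the asymmetry in learning difficulty described above becomes visible.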

Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics

Submitted, 2025

Collective action problems, which require aligning individual incentives with collective goals, are classic examples of Ill-Structured Problems (ISPs). For an individual agent, the causal links between local actions and global outcomes are unclear, stakeholder objectives often conflict, and no single, clear algorithm can bridge micro-level choices with macro-level welfare. We present ECHO-MIMIC, a computational framework that converts this global complexity into a tractable, Well-Structured Problem (WSP) for each agent by discovering compact, executable heuristics and persuasive rationales. The framework operates in two stages: ECHO (Evolutionary Crafting of Heuristics from Outcomes) evolves snippets of Python code that encode candidate behavioral policies, while MIMIC (Mechanism Inference & Messaging for Individual-to-Collective Alignment) evolves companion natural language messages that motivate agents to adopt those policies. Both phases employ a large-language-model-driven evolutionary search: the LLM proposes diverse and context-aware code or text variants, while population-level selection retains those that maximize collective performance in a simulated environment. We demonstrate this framework on a canonical ISP in agricultural landscape management, where local farming decisions impact global ecological connectivity. Results show that ECHO-MIMIC discovers high-performing heuristics compared to baselines and crafts tailored messages that successfully align simulated farmer behavior with landscape-level ecological goals. By coupling algorithmic rule discovery with tailored communication, ECHO-MIMIC transforms the cognitive burden of collective action into a simple set of agent-level instructions, making previously ill-structured problems solvable in practice and opening a new path toward scalable, adaptive policy design.

Recommended citation: Dsouza, K. B., Watt, G. A., Leonenko, Y., & Moreno-Cruz, J. (2025). Structuring Collective Action with LLM-Guided Evolution: From Ill-Structured Problems to Executable Heuristics. arXiv preprint arXiv:2509.20412. Full Document
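
The propose-then-select loop described in the abstract above can be sketched in a few lines. This is a toy illustration under loud assumptions, not the ECHO-MIMIC implementation: the LLM proposal step is replaced by a random numeric perturbation, a candidate is a single number rather than a Python snippet or message, and the fitness function is a made-up stand-in for collective performance in a simulated environment.

```python
import random


def evolve(initial_population, propose_variant, fitness, generations, rng):
    """Generic evolutionary search with population-level selection."""
    population = list(initial_population)
    size = len(population)
    for _ in range(generations):
        # Proposal step: one variant per current candidate.
        offspring = [propose_variant(parent, rng) for parent in population]
        # Selection step: parents and offspring compete, the best survive.
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        population = ranked[:size]
    return population


def toy_propose(candidate, rng):
    """Stand-in for the LLM: nudge the candidate by a random amount."""
    return candidate + rng.uniform(-1.0, 1.0)


def toy_fitness(candidate):
    """Stand-in for simulated collective performance; peaks at 3.0."""
    return -(candidate - 3.0) ** 2


rng = random.Random(0)  # seeded so the run is repeatable
final_population = evolve(
    [0.0, 10.0, -5.0, 6.0], toy_propose, toy_fitness, generations=50, rng=rng
)
best = final_population[0]
```

Because parents compete alongside their offspring, the best fitness in the population never decreases from one generation to the next; swapping `toy_propose` for a model-backed proposer would leave the selection logic unchanged.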

talks

teaching

CPEN 491: ECE final year undergraduate Capstone design project

Projects, The University of British Columbia, Department of ECE, 2019

This course involved mentoring final-year undergraduate capstone students. I provided design input at various stages and helped the students drive their projects to completion. For data- and ML-related projects, I provided substantial guidance each week (2019:1,2,3; 2020:1,2; 2021:1,2,3). All coding was done by the students.