Independent component analysis (ICA) is a signal processing algorithm that has been shown to outperform other methods at extracting biologically meaningful regulatory modules from gene expression data (Sastry et al., 2019). When applied to large, high-quality RNA-seq datasets such as this one, ICA extracts independently modulated groups of genes known as iModulons. To learn more about the experimental conditions associated with each sample in this dataset, click the Projects tab in the navigation bar.

The iModulons extracted from this dataset are listed in the table below. The name, regulator, function, category, number of genes (N), explained variance, precision, and recall of each iModulon are shown. Click a row to open the page dedicated to that iModulon, which lists the genes it contains.

Explained Variance of the iModulon Decomposition

The explained variance of the iModulon decomposition is summarized in the treemap below. The area of each iModulon's tile is proportional to its explained variance (from the table above). iModulons are grouped by functional category. Mouse over an iModulon to see its explained variance, function, and number of genes. For a description of how explained variance is calculated, please see Sastry et al. (2019) or Rychel et al. (2021).
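
As a rough illustration of what a per-iModulon explained variance can mean, the sketch below uses one common definition: the fraction of the total sum of squares in a centered expression matrix that is captured by a single component's rank-one reconstruction. This definition, the function name `explained_variance`, and the toy numbers are assumptions made for illustration; the cited papers remain the authoritative description of the calculation used for this dataset.

```python
def explained_variance(X, gene_weights, activities):
    """Fraction of the variance in X captured by one component.

    X            -- genes x samples nested list, assumed already centered
    gene_weights -- one weight per gene for the component (hypothetical input)
    activities   -- one activity per sample for the component (hypothetical input)
    """
    total = sum(value * value for row in X for value in row)
    if total == 0:
        raise ValueError("X has no variance to explain")
    # Residual left after subtracting the rank-one reconstruction weight * activity.
    residual = sum(
        (X[i][j] - gene_weights[i] * activities[j]) ** 2
        for i in range(len(X))
        for j in range(len(X[0]))
    )
    return 1 - residual / total


# Toy data: two genes, three samples, each row centered on zero.
X = [[2.0, -2.0, 0.0],
     [0.0, 1.0, -1.0]]
ev_first = explained_variance(X, [1.0, 0.0], [2.0, -2.0, 0.0])
ev_second = explained_variance(X, [0.0, 1.0], [0.0, 1.0, -1.0])
print(f"{ev_first:.2f} {ev_second:.2f}")
```

In this toy case the two components do not overlap, so their explained variances add up to the total. With real data that need not hold exactly.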

Total Explained Variance:

Do iModulons recapture known regulons?

Regulons are sets of genes that are found to be co-regulated by experimental approaches (such as ChIP-seq). Regulon discovery is a bottom-up, biomolecular approach based on the location of binding sites of known regulatory proteins (e.g. transcription factors). In contrast, iModulons are sets of co-expressed genes identified by independent component analysis (ICA), a top-down, data-driven decomposition of high-quality transcriptomic datasets across many conditions. To assess the overlap between iModulon and regulon genes, we define two quantities. The iModulon Recall is the number of genes shared between an iModulon and its linked regulon divided by the number of genes in the iModulon. The Regulon Recall is the same gene overlap divided by the number of genes in the regulon. Please see Lim et al. (2022) for an example in P. putida.

  • x → Regulon Recall = Gene Overlap / Genes in Regulon
  • y → iModulon Recall = Gene Overlap / Genes in iModulon

The Regulon and iModulon Recall for all iModulons with linked regulons in this dataset are shown below. The graph is split into four quadrants, each defined by the type of overlap depicted in the Venn diagrams. The iModulons are colored by their functional category (from above) and sized by the number of genes they contain. Mouse over a datapoint to reveal the recall values and size of that iModulon. Click a datapoint to go to the corresponding iModulon page.
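
The two recall definitions above can be sketched with plain set arithmetic. The function name `recall_scores` and the gene names are hypothetical and used only for illustration; this is not the site's own code.

```python
def recall_scores(imodulon_genes, regulon_genes):
    """Return (regulon_recall, imodulon_recall) for two gene collections."""
    imodulon = set(imodulon_genes)
    regulon = set(regulon_genes)
    if not imodulon or not regulon:
        raise ValueError("both gene sets must be non-empty")
    overlap = len(imodulon & regulon)          # genes shared by both sets
    regulon_recall = overlap / len(regulon)    # x-axis of the scatter plot
    imodulon_recall = overlap / len(imodulon)  # y-axis of the scatter plot
    return regulon_recall, imodulon_recall


# Hypothetical gene names, for illustration only.
imodulon = {"geneA", "geneB", "geneC", "geneD"}
regulon = {"geneB", "geneC", "geneD", "geneE", "geneF"}
x, y = recall_scores(imodulon, regulon)
print(f"Regulon Recall = {x:.2f}, iModulon Recall = {y:.2f}")
# prints: Regulon Recall = 0.60, iModulon Recall = 0.75
```

Here three genes are shared, so the iModulon sits mostly inside the regulon (3 of 4 iModulon genes) while covering just over half of it (3 of 5 regulon genes).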

Are the activities of two iModulons correlated?

The activities of two or more iModulons can be correlated across an RNA-seq dataset, suggesting that the genes within those iModulons may work together as the cell responds to changes in its environment. The iModulon phase plane, a plot comparing the activities of two iModulons, can reveal how iModulons cooperate under different conditions.

To generate an iModulon phase plane, select two iModulons from the table above and click the analyze button below. On each graph, hover over a datapoint to reveal its metadata.
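
For readers who want to reproduce the comparison offline, the relationship shown in a phase plane can be quantified with a correlation coefficient between the two activity profiles. The sketch below assumes a Pearson correlation, which may differ from what the Analysis page computes. The function name `pearson_r` and the activity values are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long activity profiles."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical activities of two iModulons across the same four samples.
activity_1 = [0.0, 1.0, 2.0, 3.0]
activity_2 = [1.0, 3.0, 5.0, 7.0]
r = pearson_r(activity_1, activity_2)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```

Both profiles must list the samples in the same order, since each point in the phase plane pairs the two activities measured in one sample.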

Analyze the entire {datasetName} dataset

iModulonDB can identify all pairs of iModulons in the {datasetName} dataset whose activities are highly correlated. On the Analysis page, select an R² threshold to list every pair of iModulons whose activities are correlated at or above that level across all experimental conditions.
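
A dataset-wide scan of this kind can be sketched as a loop over all pairs of iModulons that keeps the pairs meeting the threshold. The example assumes R² is the squared Pearson correlation of the two activity profiles. The function names, the `activities` mapping, and the iModulon names are hypothetical, and this is not the site's implementation.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared Pearson correlation of two equally long activity profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # treat a constant profile as uncorrelated
    return cov * cov / (var_a * var_b)


def correlated_pairs(activities, threshold):
    """List (name_1, name_2, R^2) for every pair at or above the threshold."""
    hits = []
    for (name_1, a_1), (name_2, a_2) in combinations(sorted(activities.items()), 2):
        score = r_squared(a_1, a_2)
        if score >= threshold:
            hits.append((name_1, name_2, score))
    # Strongest correlations first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical activity profiles, one value per sample, same sample order.
activities = {
    "iM-1": [0.0, 1.0, 2.0, 3.0],
    "iM-2": [3.0, 2.0, 1.0, 0.0],
    "iM-3": [1.0, -1.0, 1.0, -1.0],
}
for name_1, name_2, score in correlated_pairs(activities, threshold=0.9):
    print(name_1, name_2, f"{score:.2f}")
```

Because R² ignores the sign of the correlation, the anti-correlated pair iM-1 and iM-2 passes the threshold just as a positively correlated pair would.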