What's New in iModulonDB?

Version 2.0.2: Search-based analysis and the gene expression correlation matrix
July 16th, 2024
New Datasets
This update allows users to conduct analyses through the 'Search' page. For any gene-gene, gene-iM, or iM-iM pair, users can analyze the associated correlations in expression and/or activity. To better understand the differences between iModulon (top-down, data-driven) and Regulon (bottom-up, bio-molecular) approaches to identify co-expressed genes, the iModulon page now features a correlation matrix for the expression of genes in the iModulon and its annotated regulon. This data is provided as a clustered heatmap and associated dendrogram.
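As a rough illustration of the correlation matrix described above, the sketch below computes pairwise Pearson correlations for every gene in an iModulon or its regulon and labels each gene by membership. It is a minimal sketch under stated assumptions, not the site's implementation: the gene names, expression values, and the function names `pearson`, `membership`, and `correlation_matrix` are all hypothetical, and the hierarchical clustering step is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def membership(gene, imodulon, regulon):
    """Label a gene the way the heatmap boxes are described: iModulon only, regulon only, or both."""
    if gene in imodulon and gene in regulon:
        return "both"
    return "iModulon only" if gene in imodulon else "regulon only"


def correlation_matrix(expression, imodulon, regulon):
    """Return (genes, labels, matrix) for the union of iModulon and regulon genes."""
    genes = sorted(imodulon | regulon)
    labels = {g: membership(g, imodulon, regulon) for g in genes}
    matrix = [[pearson(expression[a], expression[b]) for b in genes] for a in genes]
    return genes, labels, matrix


# Hypothetical toy data: expression of three genes across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
imodulon = {"geneA", "geneB"}
regulon = {"geneB", "geneC"}

genes, labels, matrix = correlation_matrix(expression, imodulon, regulon)
for gene, row in zip(genes, matrix):
    print(gene, labels[gene], [round(value, 2) for value in row])
```

In this toy example geneA and geneB rise together while geneC falls, so a regulon-only gene can be anti-correlated with the iModulon genes. This is the kind of difference between the two gene sets that the heatmap is meant to expose.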
  1. Search
    Now, you can analyze any pair of iModulons, any pair of genes, or any iModulon-gene pair directly from the 'Search' page. Simply add genes and/or iModulons from the search results to the analysis box and click the analyze button.
  2. iModulon
    For each iModulon where a regulon has been identified, the gene expression correlation matrix (heatmap) and the associated dendrogram are displayed for genes in either the iModulon or its associated regulon. Mouse over the heatmap to reveal the correlation between any two genes in the iModulon and/or regulon. Boxes are drawn to highlight genes in the iModulon only (yellow), in the regulon only (black), or in both the iModulon and regulon (green). The hierarchical clustering of the heatmap is displayed as a dendrogram. The size of each node reflects the number of genes it contains. Each node is displayed as a pie chart, colored by the fraction of genes in the iModulon only (yellow), in the regulon only (black), and in both the iModulon and regulon (green). Mouse over the dendrogram to reveal the genes and gene products in each node.
  3. S. elongatus
    A new dataset has been published for S. elongatus. This dataset includes 161 unique conditions across 300 samples. The manuscript associated with this dataset is in preparation.
  4. S. albidoflavus
    A new dataset has been published for S. albidoflavus. This dataset includes 88 unique conditions across 218 samples. The manuscript associated with this dataset is in review.
    Previously, the STRING-PPI networks were displayed by sending UniProt or NCBI protein IDs to the STRING API. However, some organisms are less thoroughly annotated, and these direct ID queries fail for them. We now use BLAST to identify the correct STRING IDs for the API queries (sequence identity > 80%, e-value ≤ 0.01). Organisms whose STRING IDs match the NCBI Protein ID or UniProt ID continue to use the text-based query method.
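The BLAST thresholds above can be sketched as a filter over BLAST tabular output. This is only an illustration under stated assumptions: it assumes the default 12-column tabular format (`-outfmt 6`), and it keeps the highest-bitscore hit per query, which the notes above do not specify. The function name `best_string_hits` and all protein and STRING IDs are hypothetical.

```python
def best_string_hits(blast_lines, min_identity=80.0, max_evalue=0.01):
    """Map each query protein to its best STRING hit that passes both thresholds.

    Assumes default BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        # Thresholds from the text: sequence identity > 80%, e-value <= 0.01.
        if identity <= min_identity or evalue > max_evalue:
            continue
        # Assumption: ties between passing hits are broken by bitscore.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical BLAST hits (tab-separated); the IDs are placeholders, not real accessions.
hits = [
    "protA\t0000.STRING_1\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-100\t380",
    "protA\t0000.STRING_2\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-60\t300",
    "protB\t0000.STRING_3\t70.0\t150\t45\t0\t1\t150\t1\t150\t1e-20\t120",
    "protC\t0000.STRING_4\t90.0\t20\t2\t0\t1\t20\t1\t20\t0.5\t25",
]
mapping = best_string_hits(hits)
print(mapping)
```

Here protB is dropped for low identity and protC for a weak e-value, so only protA receives a STRING ID.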
Version 2.0.1: TreeMaps, recall graphs, and protein-protein interactions
July 10th, 2024
This update adds new features to the Dataset, iModulon, and Gene pages. TreeMaps and recall graphs have been added to the Dataset page. Condition-specific correlations are plotted on the Gene and iModulon pages. Protein-protein interactions are displayed on the iModulon and Gene pages, as well as for pairwise iModulon analyses. Loading speeds of the 'Dataset' and 'iModulon' pages have been improved so that they function on mobile devices. A new 'Project' page has been added to display the publication and the control/experimental variables associated with each project.
  1. Project
    A new Project page has been created to display the publication associated with each project. This information has been moved from the Dataset page. Two boxes have been created to separate the control and experimental variables across the samples from each project. The publication associated with the entire dataset is highlighted on loading.
  2. Dataset
    The explained variance of the iModulon decomposition is displayed as a TreeMap. The iModulon boxes are colored according to their functional category. Each box redirects to the corresponding iModulon page when clicked. When regulons are annotated to an iModulon, a Recall Plot displays the iModulon recall and regulon recall as described by Lim et al. (2022). Each data point redirects to the corresponding iModulon page when clicked. In the pairwise analysis feature, protein-protein interaction maps from the STRING database are displayed for the genes in each iModulon, in both iModulons, and in either iModulon.
  3. iModulon
    The protein-protein interaction network from the STRING database is displayed in the header for each iModulon. Each iModulon header contains links to the operon diagram(s) in BioCyc. Graphs have been added to display condition-specific correlations between iModulon activity and the expression of its constituent genes (and/or its regulator).
  4. Gene
    The protein-protein interaction network from the STRING database is displayed in the header for each gene. Graphs have been added to display condition-specific correlations between the gene's expression and its iModulon's activity.
  5. Home
    The total database size (organisms, datasets, iModulons, and samples) is dynamically calculated.
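The iModulon recall and regulon recall shown in the Recall Plot on the Dataset page can be illustrated with a small sketch. It assumes the common definitions (shared genes divided by the number of iModulon genes, and shared genes divided by the number of regulon genes). Consult Lim et al. (2022) for the authoritative definitions. The function name `recalls` and the gene sets are hypothetical.

```python
def recalls(imodulon_genes, regulon_genes):
    """Return (iModulon recall, regulon recall) for two non-empty gene sets.

    Assumed definitions:
      iModulon recall = |iModulon ∩ regulon| / |iModulon|
      regulon recall  = |iModulon ∩ regulon| / |regulon|
    """
    shared = imodulon_genes & regulon_genes
    imodulon_recall = len(shared) / len(imodulon_genes)
    regulon_recall = len(shared) / len(regulon_genes)
    return imodulon_recall, regulon_recall


# Hypothetical gene sets: two of four iModulon genes belong to a three-gene regulon.
imodulon_genes = {"geneA", "geneB", "geneC", "geneD"}
regulon_genes = {"geneC", "geneD", "geneE"}

imodulon_recall, regulon_recall = recalls(imodulon_genes, regulon_genes)
print(f"iModulon recall: {imodulon_recall:.2f}, regulon recall: {regulon_recall:.2f}")
```

Under these assumptions, each point on the plot corresponds to one such pair of values for an iModulon and its annotated regulon.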
Version 2.0: Additional graphs and analysis tools facilitate knowledge-mining on the UI
June 30th, 2024
The goal of this update is to facilitate knowledge-mining of the datasets published on iModulonDB. For this reason, all datasets have been updated to reflect genetic perturbations in each sample. Additionally, we identified more than 240 DOIs associated with data published on iModulonDB and display the publication information (title, abstract, etc.) directly on the corresponding 'Dataset' page. Each page now offers interactive graphs of the relevant data (see the 'iModulon' and 'Gene' pages). On the 'Dataset' page, users can select pairs of iModulons to analyze. On the 'Analysis' page, users can perform dataset-wide queries to identify highly correlated iModulons (by activity across all samples) and iModulon-regulator pairs (activity vs. gene expression). Furthermore, the 'Analysis' page allows users to identify correlations between known regulators and iModulons with no annotated regulators. To explain these changes, the 'About' > 'Using This Site' page has been modified.
  1. Dynamic Text Descriptions: Each page populates certain details in the graph legends and descriptions dynamically. For example, the iModulon name, gene name, dataset name, regulator name, number of genes, number of regulators, number of iModulons, etc. are dynamically generated where appropriate.
  2. Data Curating: A list of >200 DOIs has been identified across all datasets / organisms. Publication information is displayed on the 'Dataset' page of each dataset. The genetic perturbations (gene KOs, mutations, or expression on plasmids) of all samples across all datasets / organisms have been added to the metadata.
  3. Data Filtering: For all new analyses, plots of gene expression or iModulon activity undergo QC/QA in the backend of the website. Genetic perturbations are identified and plotted, but are not used for calculating correlations.
  4. Gene
    Dynamic text descriptions have been added. External links have been added to direct users to the appropriate protein-protein interaction networks (via StringDB) and operon diagrams (BioCyc). These external resources can offer information to aid users in understanding the genes in an iModulon. Interactive graphs have been added to show the gene's expression level and the activities of the gene's corresponding iModulons. To help users understand whether iModulon activity and gene expression are condition-dependent, additional graphs are displayed for samples that are grouped first by their associated study.
  5. iModulon
    Dynamic text descriptions have been added. Graphs are color-coded by study. Data filtering is conducted prior to calculating correlations to the iModulon regulator. Condition-specific correlation values are displayed on a new graph. iModulon activity is now mapped against the expression of all genes, not just its regulators.
  6. Dataset
    Dynamic text descriptions have been added. All publications associated with the dataset are easily accessible on the page (title, authors, abstract, DOI). All samples in each study are displayed in a table. Control and experimental variables are identified and displayed separately to facilitate understanding of the experiment. Users can select two iModulons from the dataset and conduct a pairwise analysis. The page calculates the correlation between the iModulon activities, the condition-specific correlation of iModulon activities, the genes shared by the iModulons, and the gene weights of all genes for each iModulon.
  7. Analysis
    This is a new page that facilitates multiple dataset-wide analyses. Users are prompted to choose an R² threshold. The page offers three analyses based on this threshold:
    • Users can identify all iModulon pairs whose activities are correlated above the threshold. This pairwise analysis is also offered on the 'Dataset Page' for user-selected iModulons (up to two).
    • Users can identify all iModulons whose activity is correlated to the gene expression of their annotated regulator.
    • Users can identify all known regulators whose gene expression is correlated to the activity of an iModulon with no annotated regulator.
  8. About
    Changes have been made to reflect updates to the 'Gene', 'iModulon', 'Dataset', and 'Analysis' pages ('About' > 'Using This Site').
  9. Update
    This page has been created to log updates to iModulonDB.
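The first dataset-wide analysis on the 'Analysis' page, combined with the data filtering rule described for this version, can be sketched roughly as follows. This is a minimal sketch under stated assumptions rather than the site's implementation: R² is taken to be the squared Pearson correlation, pairs at or above the threshold are reported, and samples flagged as genetic perturbations are excluded before the correlation is calculated. All names (`pearson`, `correlated_pairs`, the iModulon and sample labels) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(activities, samples, threshold, perturbed=()):
    """List (iModulon, iModulon, R²) for pairs whose R² meets the threshold.

    activities: dict mapping iModulon name to a list of activities aligned with samples.
    perturbed:  sample names carrying genetic perturbations; these are left out of
                the calculation (assumed to mirror the data filtering rule above).
    """
    keep = [i for i, sample in enumerate(samples) if sample not in perturbed]
    pairs = []
    for first, second in combinations(sorted(activities), 2):
        x = [activities[first][i] for i in keep]
        y = [activities[second][i] for i in keep]
        r_squared = pearson(x, y) ** 2
        if r_squared >= threshold:
            pairs.append((first, second, r_squared))
    return pairs


# Hypothetical toy data: four wild-type samples and one knockout sample.
samples = ["s1", "s2", "s3", "s4", "ko1"]
activities = {
    "iM_1": [1.0, 2.0, 3.0, 4.0, 50.0],
    "iM_2": [2.0, 4.0, 6.0, 8.0, -30.0],
    "iM_3": [1.0, -1.0, 1.0, -1.0, 0.0],
}

pairs = correlated_pairs(activities, samples, threshold=0.8, perturbed={"ko1"})
print(pairs)
```

With the knockout sample excluded, only iM_1 and iM_2 pass the 0.8 threshold. The same loop could compare iModulon activity against regulator gene expression for the other two analyses.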
Version 1.0.1: Addition of datasets for multiple organisms
October 12th, 2020 - Present
New Datasets
Version 1.0: Initial Publication
October 12th, 2020
The initial publication of iModulonDB is described by Rychel et al. (2020). Three datasets accompany the initial publication: E. coli (precise-278), B. subtilis (microarray), and S. aureus (precise-108).

Contact Us

For questions, comments, feedback, or to collaborate with us, please send an email to Edward Catoiu (imodulondb@ucsd.edu).

For more information on the Systems Biology Research Group (SBRG) at the University of California, San Diego, please see our website here.