We currently have two high-powered workstations for our computational processes. They have the following specs:
- Dual 24-core CPUs (48 cores total, up to 96 threads)
- Triple quad graphics cards
- 74TB storage
- 1TB memory
- Running Linux (Ubuntu 20.04)
In addition, we have high-performance personal laptops with 1TB storage and 32GB memory, running Windows 10 Pro.
Storage and Backup
The core utilizes the high-performance workstations for storage and will store all your data and results for one month.
We currently have one Amazon AWS instance for web-based needs, and are in the process of expanding our cloud computing resources.
- An easy-to-use web tool for gene and signature enrichment and pathway analysis called the PCTA (thepcta.org).
- A cancer transcriptome compendium covering six major cancer types: prostate, bladder, lung, colon, breast and liver.
- A compendium of epigenetic data resources: the Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Mapping Consortium (REMC). The ENCODE consortium has published extensive, high-quality datasets covering TF ChIP-seq, RNA expression, DNA methylation and histone modifications from 109+ commonly used cell lines and normal primary cells. REMC provides a high-resolution map of the human epigenome from 127 human tissues.
- Molecular interactomes. We have compiled TF-target, protein-protein and miRNA-target interaction data from 17 different databases (GTRD, CellPhoneDB, ReMAP, ChEA, hmChIP, EEDB, TRED, TRRUST, Amadeus, MSigDB, Mentha, STRING, HuRI, Microcosm, TarBase, TargetScan, miRecords) into one molecular interactome resource. This resource includes 799,657 TF-target interactions for 1,391 TFs, 427,605 miRNA-target interactions for 513 miRNAs, and 139,338 protein-protein interactions among 13,743 proteins. We have also compiled pathway and protein complex data from the KEGG, Biocarta, REACTOME, CORUM and Epifactors databases.
- Various NGS data analysis pipelines.
- Proteomics data analysis pipelines.
- In-house software tools for data visualization and statistical analysis based on Python, R and MATLAB.
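As a rough illustration of how interaction records from several databases can be combined into one resource like the molecular interactome listed above, here is a minimal Python sketch. It is not the core's actual pipeline: the function name merge_interactions, the record layout, the source names and the identifiers are all hypothetical placeholders, and real inputs would need database-specific parsing and identifier mapping first.

```python
from collections import defaultdict


def merge_interactions(sources):
    """Merge per-database edge lists into one deduplicated interactome.

    `sources` maps a database name to an iterable of
    (interaction_type, partner_a, partner_b) tuples (hypothetical layout).
    Returns a dict mapping each unique edge to the sorted list of
    databases that report it.
    """
    merged = defaultdict(set)
    for db_name, edges in sources.items():
        for kind, a, b in edges:
            # Assumption: identifiers only differ by whitespace and case.
            a, b = a.strip().upper(), b.strip().upper()
            if kind == "ppi":
                # Protein-protein edges are treated as undirected,
                # so store the two partners in a canonical order.
                a, b = sorted((a, b))
            merged[(kind, a, b)].add(db_name)
    return {edge: sorted(dbs) for edge, dbs in merged.items()}


# Placeholder records for illustration only (not real data).
toy_sources = {
    "db_a": [("tf_target", "TF1", "GENE1"), ("ppi", "PROT1", "PROT2")],
    "db_b": [("tf_target", "tf1", "GENE1"), ("ppi", "PROT2", "PROT1")],
}
interactome = merge_interactions(toy_sources)
```

Keeping the set of supporting databases for each edge makes it possible to filter later for interactions reported by more than one source.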
Code and Data Share
Our code is version-controlled on GitHub. All repositories are kept private until publication or other requirements call for release, but they can be shared with collaborators. Results (reports and figures) are sent via email by default, but we are happy to work with whatever file exchange method you prefer. For larger file transfers we use Box, but again, we can work with whatever is most convenient for you.
We currently employ CrowdStrike to secure our portable devices, while our high-performance workstations are secured behind the Cedars-Sinai network. User accounts can be created only by members of the core, and only users who have been granted access can log in to the machines.
Have Questions or Need Help?
Contact us if you have questions or would like to learn more about the Cancer Bioinformatics Shared Resource at Cedars-Sinai.
8687 Melrose Ave., Suite G-566
Los Angeles, CA 90069