About the Gadgets & Data
Gadgets
Name |
Entity Linker |
Description |
The Entity Linker gadget performs the natural language processing task of entity linking on bibliographic data. Entity linking connects keywords in a text with the corresponding entries stored in a knowledge base. When English text (e.g. the abstract of an academic paper) is given to the Entity Linker, it outputs the keywords found in the text in the form in which they are registered in the knowledge base. |
Reference |
http://prm-ezcatdb.cbrc.jp/entity_linking/ |
Institute |
National Institute of Advanced Industrial Science and Technology (AIST) |
Contributors |
Masami Ikeda & Hiroya Takamura |
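The entity linking step described for the Entity Linker can be illustrated with a minimal dictionary-based sketch. This is not the gadget's actual method or API: the tiny knowledge base, its entries, and the function names `build_index` and `link_entities` are all hypothetical, and a real linker would also handle multi-word terms and ambiguity.

```python
import re

# Hypothetical miniature knowledge base: canonical entry -> known surface forms.
KNOWLEDGE_BASE = {
    "Neoplasms": ["cancer", "tumor", "tumour", "neoplasm"],
    "Acetaminophen": ["acetaminophen", "paracetamol"],
}


def build_index(kb):
    """Map each lowercased surface form to its canonical knowledge base entry."""
    return {form.lower(): entry for entry, forms in kb.items() for form in forms}


def link_entities(text, kb=KNOWLEDGE_BASE):
    """Return (surface form, canonical entry) pairs in order of appearance."""
    index = build_index(kb)
    links = []
    # Single-word matching only; this is a simplification of the sketch.
    for token in re.findall(r"[A-Za-z]+", text):
        entry = index.get(token.lower())
        if entry is not None:
            links.append((token, entry))
    return links


print(link_entities("Paracetamol levels in tumour tissue"))
# [('Paracetamol', 'Acetaminophen'), ('tumour', 'Neoplasms')]
```

The point of the sketch is the output shape: each keyword is reported together with the form under which it is registered in the knowledge base.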
Name |
NamedEntityRecognizer |
Description |
Named Entity Recognizer is a gadget that performs the natural language processing task of Named Entity Recognition (NER) on literature data. NER extracts and classifies keywords such as disease names, cell names, and pharmacological substances found in texts. When English text is input, the gadget finds keywords in the text that match one or more of 37 pre-defined categories (including names of diseases, cells, pharmacological substances, and other proper nouns relevant to the field of drug discovery). |
Reference |
http://prm-ezcatdb.cbrc.jp/named_entity_recognition/ |
Institute |
National Institute of Advanced Industrial Science and Technology (AIST) |
Contributors |
Masami Ikeda & Hiroya Takamura |
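As a rough illustration of what NER output looks like, the sketch below tags terms from a small lexicon and reports their character spans and categories. The lexicon, the three category labels, and the function name `recognize_entities` are assumptions made for this example; the gadget itself uses 37 categories and is not a simple lexicon lookup as far as the description indicates.

```python
import re

# Hypothetical lexicon: term -> category label.
LEXICON = {
    "lung adenocarcinoma": "Disease",
    "t cell": "Cell",
    "acetaminophen": "PharmacologicalSubstance",
}


def recognize_entities(text, lexicon=LEXICON):
    """Return (start, end, matched text, category) tuples sorted by position."""
    hits = []
    for term, category in lexicon.items():
        # Whole-word, case-insensitive match of each lexicon term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), category))
    return sorted(hits)


sentence = "Acetaminophen was given to lung adenocarcinoma patients."
for start, end, surface, category in recognize_entities(sentence):
    print(start, end, surface, category)
```

Reporting spans as well as labels lets a downstream tool highlight the matched keywords in the original text.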
Name |
JaMIE |
Description |
Relation extraction identifies semantic relations between keywords in a text. When a Japanese medical text (e.g. the findings of a CT image reading) is input into this gadget, it outputs the relations between the keywords in the text and their associated keywords in the knowledge base. |
Reference |
https://github.com/racerandom/JaMIE |
Institute |
Kyoto University |
Contributors |
Fei Cheng & Sadao Kurohashi |
Name |
Semantic Search |
Description |
This is a document retrieval system that presents similar documents when given medical documents such as electronic medical records and radiological findings. The search target is a group of medical documents annotated by the PRISM project. An example application of this gadget would be searching for existing cases in a hospital. |
Reference |
https://aoi.naist.jp/prism-search/ |
Institute |
Nara Institute of Science and Technology (NAIST) |
Contributors |
|
Name |
HeaRT |
Description |
When medical documents such as electronic medical records and findings are input, a Gantt-like chart is created that illustrates the information in chronological order. This can be used to facilitate information sharing among health professionals. |
Reference |
https://aoi.naist.jp/prism-heart/ |
Institute |
Nara Institute of Science and Technology (NAIST) |
Contributors |
|
Name |
SFM, bST |
Description |
Space-efficient feature maps for string alignment kernels (SFMEDM) take a set of input strings and output a set of feature vectors. Using the features produced by SFMEDM, a support vector machine (SVM) can perform predictive tasks such as string classification and regression. One example application is a prediction task that uses amino acid sequences as training data. Because strings are mapped into a nonlinear space, prediction with SFMEDM is highly accurate and memory efficient. |
Reference |
https://github.com/tb-yasu/SFMEDM https://github.com/kampersanda/integer_sketch_search |
Institute |
RIKEN |
Contributors |
Yasuo Tabei |
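The idea of turning strings into fixed-length feature vectors can be sketched with a k-mer count feature map. This is a deliberately simpler stand-in, not the SFMEDM algorithm: the function name `kmer_feature_map`, the choice of k, and the amino acid alphabet are assumptions for illustration only.

```python
from itertools import product

# The 20 standard amino acids, used as the default alphabet in this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_feature_map(sequences, k=2, alphabet=AMINO_ACIDS):
    """Map each string to a vector of k-mer counts.

    Returns the list of k-mers (one per vector dimension) and one count
    vector per input sequence. K-mers containing characters outside the
    alphabet are ignored.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vectors = []
    for seq in sequences:
        vec = [0] * len(kmers)
        for i in range(len(seq) - k + 1):
            j = index.get(seq[i:i + k])
            if j is not None:
                vec[j] += 1
        vectors.append(vec)
    return kmers, vectors


kmers, vectors = kmer_feature_map(["AAC", "CCA"], k=2, alphabet="AC")
print(kmers)    # ['AA', 'AC', 'CA', 'CC']
print(vectors)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Vectors of this kind can be passed to any vector-based learner, such as an SVM, which is the role the feature vectors play in the description above.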
Name |
kGCN Network Prediction |
Description |
Graph convolutional neural networks (GCNs) take the structural information of small-molecule compounds as graph input and have been reported to perform well on many types of prediction tasks. kGCN is an open-source, GCN-based gadget that provides the preprocessing needed to build prediction models, Bayesian optimization for model tuning, and visualization of the atoms that contribute significantly to a prediction, which helps in interpreting results. Given a dataset, nodes, and a trained model as input, this gadget predicts and outputs new links that may exist between nodes. |
Reference |
https://github.com/clinfo/kGCN |
Institute |
Kyoto University |
Contributors |
Ryosuke Kojima & Yasushi Okuno |
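To make the "molecule as a graph" input concrete, here is a minimal sketch of a single graph convolution step in plain Python. It does not use or reproduce the kGCN API. The function name `gcn_layer`, the toy three-atom graph, and the use of mean aggregation with self-loops (rather than the symmetric normalization found in many GCN formulations) are all assumptions of this example.

```python
def gcn_layer(adjacency, features, weights):
    """One propagation step: ReLU(D^-1 (A + I) H W), i.e. mean aggregation."""
    n = len(adjacency)
    n_in = len(features[0])
    n_out = len(weights[0])
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    output = []
    for i in range(n):
        degree = sum(a_hat[i])
        # Average the feature vectors of the node and its neighbours.
        aggregated = [
            sum(a_hat[i][j] * features[j][f] for j in range(n)) / degree
            for f in range(n_in)
        ]
        # Linear transform followed by ReLU.
        output.append([
            max(0.0, sum(aggregated[f] * weights[f][o] for f in range(n_in)))
            for o in range(n_out)
        ])
    return output


# Toy heavy-atom graph C-C-O with one-hot [carbon, oxygen] node features.
toy_adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
toy_features = [[1, 0], [1, 0], [0, 1]]
identity_weights = [[1, 0], [0, 1]]

for row in gcn_layer(toy_adjacency, toy_features, identity_weights):
    print(row)
```

After one step each atom's vector mixes in information from its bonded neighbours; stacking such layers and training the weights is what lets a GCN learn from molecular structure.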
Name |
Molenc |
Description |
One approach to using information about a compound in machine learning is to generate fingerprints, which are vectors that indicate how many times specific substructures occur in the compound. There are various ways to generate fingerprints; this gadget generates Signature Molecular Descriptors (SMDs), which were originally published by J.L. Faulon et al. in 2003. When a list of SMILES strings describing the structures of the compounds of interest is input into the gadget, a correspondence table of features (substructures) and the SMD fingerprints are generated and output. A preexisting correspondence table can be uploaded by ticking the "encoding.dix" checkbox and pressing Run. If the user does not have a correspondence table, the user should ensure that this box is not checked before pressing Run. |
Reference |
https://github.com/UnixJunkie/molenc |
Institute |
Kyushu Institute of Technology |
Contributors |
Francois Berenger & Yoshihiro Yamanishi |
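The relationship between a correspondence table and count fingerprints can be sketched as follows. The sketch assumes the substructure labels have already been extracted from each compound (it does not parse SMILES or compute real signature descriptors), and the function names, the label strings, and the choice to skip features missing from an existing table are all hypothetical.

```python
from collections import Counter


def build_dictionary(molecules):
    """Assign an integer index to each distinct feature, in first-seen order."""
    dictionary = {}
    for features in molecules:
        for feature in features:
            dictionary.setdefault(feature, len(dictionary))
    return dictionary


def encode(features, dictionary):
    """Sparse count fingerprint {feature index: count}.

    Features absent from the dictionary are skipped; this is an
    assumption of the sketch, not documented gadget behaviour.
    """
    counts = Counter(features)
    return {dictionary[f]: c for f, c in counts.items() if f in dictionary}


# Hypothetical pre-extracted substructure labels for two compounds.
molecules = [["C-C", "C-O", "C-C"], ["C-O", "C=O"]]
table = build_dictionary(molecules)
print(table)                           # {'C-C': 0, 'C-O': 1, 'C=O': 2}
print(encode(molecules[0], table))     # {0: 2, 1: 1}
print(encode(["C-N", "C=O"], table))   # {2: 1}
```

Reusing an existing table, as in the last call, keeps feature indices consistent between the compounds a model was trained on and new compounds encoded later.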
Name |
Vanishing Ranking Kernels |
Description |
Ligand-based virtual screening is performed by training a classification model of activity strength on a set of compounds and then predicting activity for a set of compounds of unknown activity, based on vanishing kernels and intermolecular Tanimoto coefficients. The resulting model defines an applicability domain (AD) for the activity, which is used to improve screening efficiency. The input file contains features (descriptors) rather than the structures of the compounds. Please refer to https://github.com/UnixJunkie/rankers for details. |
Reference |
https://github.com/UnixJunkie/rankers |
Institute |
Kyushu Institute of Technology |
Contributors |
Francois Berenger & Yoshihiro Yamanishi |
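The Tanimoto coefficient mentioned in the description can be computed from two fingerprints as in the sketch below. It uses the min/max generalization for count fingerprints, which reduces to |A ∩ B| / |A ∪ B| for binary fingerprints. Which variant the gadget uses is not stated in the description, and the function name and the convention of returning 0.0 for two empty fingerprints are assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two sparse count fingerprints.

    Each fingerprint is a dict {feature index: count}. The score is
    sum(min counts) / sum(max counts) over all features in either one.
    """
    keys = set(fp_a) | set(fp_b)
    numerator = sum(min(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    denominator = sum(max(fp_a.get(k, 0), fp_b.get(k, 0)) for k in keys)
    # Convention chosen for this sketch: two empty fingerprints score 0.0.
    return numerator / denominator if denominator else 0.0


print(tanimoto({0: 2, 1: 1}, {1: 1, 2: 1}))  # 0.25
print(tanimoto({0: 1, 3: 1}, {0: 1, 3: 1}))  # 1.0
```

The score ranges from 0 (no shared features) to 1 (identical fingerprints), which is what makes it usable as a similarity measure between molecules.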
Name |
Modified Diet Networks |
Description |
With ultra-high dimensional (n<<p) data, such as genomic data, it is difficult to avoid overfitting even with regularization and other methods. Diet Networks is a deep learning method designed to train on high dimensional data, and Modified Diet Networks is an improved version of Diet Networks that provides stable and accurate predictions. This gadget is equipped with a pre-trained model that uses Modified Diet Networks to classify lung cancer patients into lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) based on their somatic mutation profiles. When patient information is provided to the model as a vector of the number of somatic mutations present in each gene for that patient (multiple patients can be entered as a matrix), the model outputs a prediction of whether the patient has LUAD or LUSC. |
Reference |
https://www.mdpi.com/2218-273X/10/9/1249 |
Institute |
National Cancer Center |
Contributors |
Ken Asada & Ryuji Hamamoto |
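The input described for Modified Diet Networks, a vector of per-gene somatic mutation counts for each patient, can be assembled from a list of mutation records as sketched below. The record layout, the gene list, the patient identifiers, and the choice of patients as rows are assumptions for illustration; the gadget's actual file format is not specified in the description.

```python
from collections import Counter


def mutation_count_matrix(records, genes):
    """Build a patient-by-gene matrix of somatic mutation counts.

    records: list of (patient_id, gene) pairs, one pair per mutation.
    genes:   ordered list of genes defining the matrix columns.
    Returns the sorted patient IDs and one row of counts per patient.
    """
    counts = Counter(records)
    patients = sorted({patient for patient, _ in records})
    matrix = [[counts[(patient, gene)] for gene in genes] for patient in patients]
    return patients, matrix


# Hypothetical mutation calls for two patients.
records = [("P1", "TP53"), ("P1", "TP53"), ("P1", "KRAS"), ("P2", "EGFR")]
genes = ["EGFR", "KRAS", "TP53"]

patients, matrix = mutation_count_matrix(records, genes)
print(patients)  # ['P1', 'P2']
print(matrix)    # [[0, 1, 2], [1, 0, 0]]
```

Each row is the count vector for one patient, and stacking rows gives the matrix form mentioned for multiple patients.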
Name |
Multiomics Analyzer |
Description |
In recent years, multi-omics data analysis has attracted attention for various applications, but its methodology has not yet been fully established. One of the biggest challenges in omics data analysis is how to handle high-dimensional data. The Multi-omics Analyzer is equipped with models created by applying unsupervised deep learning methods to miRNA and mRNA data acquired from lung cancer patients in The Cancer Genome Atlas (TCGA). When miRNA/mRNA data is input, it outputs feature vectors with reduced dimensionality. The feature vectors obtained from the Multi-omics Analyzer can be used for prediction tasks such as classification and regression. |
Reference |
https://www.mdpi.com/2218-273X/10/4/524/htm |
Institute |
National Cancer Center |
Contributors |
Kazuma Kobayashi & Ryuji Hamamoto |
Name |
Subset Binder |
Description |
This gadget uses an algorithm called subset binding to find groups of items that are linked across two datasets. For example, when medical information and omics data are input, the gadget outputs a group of molecules that fluctuate together, paired with a corresponding group of medical information items that change in conjunction with them. For checking operation, Miné includes two sample datasets, hepatotoxicity phenotype data and gene expression profiles, which reflect the hepatotoxicity observed when high concentrations of acetaminophen are administered to rats. Subset binding is based on association rule mining, so its parameters are generally the same as those used in association rule mining. |
Reference |
https://www.researchsquare.com/article/rs-405195/latest.pdf |
Institute |
National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) & RIKEN |
Contributors |
Yayoi Natsume-Kitatani & Naonori Ueda |
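Because subset binding is described as building on association rule mining, the two standard parameters of that technique, support and confidence, are sketched below. This shows generic association rule measures only, not the subset binding algorithm itself, and the sample items such as "GeneA_up" and "ALT_high" are invented for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    support_antecedent = support(transactions, antecedent)
    if support_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support_antecedent


# Hypothetical per-sample observations combining omics and phenotype items.
samples = [
    {"GeneA_up", "GeneB_up", "ALT_high"},
    {"GeneA_up", "GeneB_up", "ALT_high"},
    {"GeneA_up"},
    {"GeneB_up"},
]

print(support(samples, {"GeneA_up", "GeneB_up"}))                   # 0.5
print(confidence(samples, {"GeneA_up", "GeneB_up"}, {"ALT_high"}))  # 1.0
```

Minimum support and minimum confidence thresholds are the usual knobs in association rule mining, so they are the kind of parameter a user of the gadget would expect to set.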
Name |
RPPA |
Description |
This gadget is a prognosis prediction system for lung squamous cell carcinoma and lung adenocarcinoma that uses a deep autoencoder. It can predict prognosis from reverse phase protein array (RPPA) data alone, as well as from six types of omics data (RNA sequencing data, miRNA sequencing data, DNA methylation data, copy number variation data, somatic mutation (DNA sequencing) data, and RPPA data). |
Reference |
https://www.mdpi.com/2218-273X/10/10/1460 |
Institute |
National Cancer Center |
Contributors |
Ken Asada & Ryuji Hamamoto |
Name |
PathoGN |
Description |
Upon input of mutation information and gene relationship networks, this gadget presents information such as the pathogenicity of the mutation and the predicted relevant genes present in the biomolecular network. This is a novel method implemented as an extension of kGCN. |
Reference |
|
Institute |
Kyoto University |
Contributors |
Ryosuke Kojima & Yasushi Okuno |
Name |
INGOR |
Description |
INGOR is an implementation of a Bayesian network estimation algorithm. When provided with measured biomolecular profiles, it creates a causal network between biomolecules such as genes and proteins. This gadget can be used to elucidate intermolecular regulatory causal mechanisms and to search for new drug target candidates. |
Reference |
https://ytlab.jp/clinfo/ingor/tutorialja.html |
Institute |
Kyoto University |
Contributors |
Yoshinori Tamada & Yasushi Okuno |
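For readers unfamiliar with Bayesian networks, the sketch below shows how a network over biomolecules encodes a joint distribution as a product of per-node conditional probabilities. It is a toy with two binary variables and made-up probabilities. It does not reproduce INGOR's estimation algorithm or its data model, and all names are hypothetical.

```python
def joint_probability(assignment, parents, cpts):
    """Probability of a full assignment under a binary Bayesian network.

    assignment: {variable: bool}
    parents:    {variable: tuple of parent variables}
    cpts:       {variable: {tuple of parent values: P(variable is True)}}
    """
    probability = 1.0
    for variable, value in assignment.items():
        parent_values = tuple(assignment[p] for p in parents[variable])
        p_true = cpts[variable][parent_values]
        probability *= p_true if value else 1.0 - p_true
    return probability


# Hypothetical network GeneA -> GeneB ("True" meaning highly expressed).
parents = {"GeneA": (), "GeneB": ("GeneA",)}
cpts = {
    "GeneA": {(): 0.5},
    "GeneB": {(True,): 0.75, (False,): 0.25},
}

print(joint_probability({"GeneA": True, "GeneB": True}, parents, cpts))  # 0.375
```

Structure estimation, which is what INGOR is described as doing, amounts to searching for the parent sets that best explain the measured profiles; the sketch only shows how a given structure is evaluated for one assignment.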
Name |
INGOR ECv |
Description |
Given a measured biomolecular profile and a Bayesian network estimated with INGOR, INGOR ECv outputs the edge contribution value (ECv) for each edge in the network and the subnetwork extracted using this value. This tool is for academic use only. |
Reference |
https://doi.org/10.1038/s41598-021-02394-w |
Institute |
Kyoto University |
Contributors |
Nakazawa, M.A., Tamada, Y., Tanaka, Y., Ikeguchi, M., Higashihara, K., Okuno, Y. |
Name |
INGOR RC |
Description |
When the measured biomolecular profiles and the Bayesian network estimated by INGOR are input to INGOR RC, it outputs the relative contribution value (RC) for each edge, which is needed to visualize the network for each sample. This tool is for academic use only. |
Reference |
https://doi.org/10.1038/s41598-021-90556-1 |
Institute |
Kyoto University |
Contributors |
Tanaka, Y., Higashihara, K., Nakazawa, M.A., Yamashita, F., Tamada, Y., Okuno, Y. |
Name |
INGOR Network |
Description |
Based on the measured biomolecular profiles, this Bayesian network estimation algorithm provides results of stratification and grouping of samples based on causal networks, ECv, and networks among biomolecules such as genes and proteins. It can be used for elucidating intermolecular regulatory causal mechanisms, searching for novel drug target candidates, and patient stratification. This tool is for academic use only. |
Reference |
|
Institute |
Kyoto University |
Contributors |
Yoshinori Tamada & Yasushi Okuno |
Name |
TargetMine |
Description |
TargetMine is a data warehouse that integrates more than 30 internationally used public data sources to support early-stage drug discovery research, especially target discovery, and to enable efficient knowledge discovery. TargetMine covers a wide range of data, from genes, proteins, and pathways to 3D structures and interactions with compounds. At present, the data incorporated in TargetMine focuses primarily on the most studied organisms in the field of drug discovery: humans, rats, and mice. |
Reference |
https://doi.org/10.1093/bioinformatics/btac507 |
Institute |
National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
Contributors |
Yi-An Chen & Kenji Mizuguchi |
Data
There is no data available for release.