DMER: Data mining epidemiological relationships

Latest posts

  • Pilot analysis on BioRxiv and MedRxiv full text data to facilitate comprehensive data mining on biomedical literature

    We have curated the full text data archives of BioRxiv and MedRxiv preprints and conducted exploratory analyses to support our next-stage research projects on automated text mining of biomedical literature.

    Posted by Yi Liu on Aug 21, 2023

    seedcorn funding text mining NLP

  • Proteome-wide Mendelian randomization in global biobank to identify multi-ancestry drug targets

    PhD student Huiling Zhao and co-supervisor Dr Jie (Chris) Zheng published an interesting cross-ancestry MR analysis of potential drug targets in collaboration with the Global Biobank Meta-analysis Initiative.

    Posted by Tom Gaunt on Nov 1, 2022

    drug-targets mr colocalization

  • New funding: NIHR Bristol Biomedical Research Centre

    The National Institute for Health and Care Research Bristol Biomedical Research Centre (NIHR Bristol BRC) has been awarded nearly £12 million of new funding for the next five years. The DMER programme is linked to the Translational Data Science theme of the BRC, providing a mechanism for translation of our methodological research, software tools and data resources.

    Posted by Tom Gaunt on Oct 14, 2022

    drug-targets mr funding

  • Systematic comparison of Mendelian randomization studies and randomized controlled trials using electronic databases

    Triangulating results between Mendelian randomization studies and randomized controlled trials has the potential to strengthen evidence for an intervention target. In this work, led by Maria Sobczyk, we mined the ClinicalTrials.gov, PubMed and EpiGraphDB databases and carried out a series of 26 manual literature comparisons among 54 MR and 77 RCT publications to explore the potential for systematic triangulation.

    Posted by Tom Gaunt on Apr 16, 2022

    database epigraphdb MR NLP

  • Triangulating evidence in health sciences with Annotated Semantic Queries

    Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. In this work, led by Yi Liu, we present ASQ (Annotated Semantic Queries), a natural language query interface to EpiGraphDB, which enables users to annotate “claims” from a piece of unstructured text with evidence relevant to the claim.

    Posted by Tom Gaunt on Apr 16, 2022

    database EpiGraphDB MR NLP software

  • Evaluating the potential benefits and pitfalls of combining protein and expression quantitative trait loci in evidencing drug targets

    Molecular quantitative trait loci (molQTL), which can provide functional evidence on the mechanisms underlying phenotype-genotype associations, are increasingly used in drug target validation and safety assessment. In this work, led by Jamie Robinson, we evaluate the differences between expression and protein QTL and explore the possible reasons for apparently contradictory effects of genetic variants.

    Posted by Tom Gaunt on Mar 17, 2022

    database epigraphdb MR NLP

  • Senior Research Associate / Research Fellow in Health Data Science

    We are seeking a talented postdoctoral scientist with expertise in biomedical data integration and analysis, data mining and causal inference.

    Posted by Tom Gaunt on Jan 12, 2022

    jobs health data science work with us

  • Trans-ethnic Mendelian-randomization study reveals causal relationships between cardiometabolic factors and chronic kidney disease

    This paper, led by Jie Zheng, systematically analysed previously reported risk factors for chronic kidney disease in European and East Asian populations using Mendelian randomization. The analysis showed evidence of both cross-population and population-specific risk factors.

    Posted by Tom Gaunt on Oct 20, 2021

    MR CKD

  • EpiGraphDB platform version 1.0

    EpiGraphDB v1.0 and summary of features and changes.

    Posted by Yi Liu on Mar 22, 2021

    EpiGraphDB software database

  • MendelVar: gene prioritization at GWAS loci using phenotypic enrichment of Mendelian disease genes

    This paper, led by Maria Sobczyk, presented MendelVar, a tool which integrates knowledge from four databases on Mendelian disease genes with enrichment testing for a range of functional annotations to support the prioritization of genes at GWAS loci.

    Posted by Tom Gaunt on Jan 1, 2021

    database GWAS software

  • Neo4J data integration pipeline

    We make extensive use of Neo4J for graph databases (including EpiGraphDB). One of the key challenges in constructing a heterogeneous graph database is the data integration from different sources. Ben Elsworth describes the pipeline he has developed to automate this process.

    Posted by Ben Elsworth on Nov 17, 2020

    database Neo4J data integration software

  • Reducing drug development costs

    Explaining our work in a way that is accessible to a wide audience is often challenging. Here we summarise some of our approaches to drug target prioritization in a short animation.

    Posted by Tom Gaunt on Nov 8, 2020

    drug targets video MR colocalization

  • Visualising Brexit’s Impact on Food Safety in Britain

    PhD students Marina Vabistsevits and Ollie Lloyd entered the Jean Golding Institute data visualization competition on food hazards from around the world. Here they present their visualizations and interpretation, which won them a runner-up prize.

    Posted by Marina Vabistsevits and Ollie Lloyd on Oct 6, 2020

    data visualization data science

  • Drug target prioritization using protein QTL

    A lot of our research recently has focused on drug target prioritization using Mendelian randomization and genetic colocalization. Here we introduce Jie (Chris) Zheng’s Nature Genetics paper which describes our systematic analysis of the plasma proteome, part of an ongoing collaboration with pharma partners.

    Posted by Tom Gaunt on Sep 7, 2020

    drug targets papers MR colocalization

  • Exploring Elasticsearch architectures with Oracle Cloud

    The IEU OpenGWAS database contains well over 100 billion rows of data on genetic associations. Ben Elsworth describes his work on implementing a cloud-based Elasticsearch database on the Oracle Cloud Infrastructure that can handle millions of queries per week.

    Posted by Ben Elsworth on Apr 16, 2019

    database Elasticsearch OpenGWAS cloud software

  • Indexing 200 billion records in 2 days

    A few years ago we started collecting genome-wide association study datasets and making them available to the research community. As the data grew from tens of millions to tens of billions of rows we found a MySQL database no longer sufficient. Ben Elsworth describes how he implemented an ElasticSearch solution to the challenge of querying a really large dataset.

    Posted by Ben Elsworth on Jan 24, 2019

    database Elasticsearch OpenGWAS software
