MRC IEU: Data Mining Epidemiological Relationships

Neo4j data integration pipeline

Background: We’ve been using Neo4j for around five years in a variety of projects, sometimes as the main database (MELODI) and sometimes as part of a larger platform (OpenGWAS). We find writing queries in Cypher intuitive and query performance good. However, integrating data into a graph is still a challenge, especially when the data come from many different sources. Our latest project, EpiGraphDB, uses data from over 20 independent sources, most of which require cleaning and QC before they can be incorporated.

Reducing drug development costs

A new animation

Overview: This short animation explains how we use Mendelian randomization and colocalization to help prioritise drug targets. One of our aims in both programme 4 of the MRC IEU and the Integrative Cancer Epidemiology Programme is to integrate such prioritisations with other data to help inform drug development. About the animation: The animation is based on recent work by Dr Jie (Chris) Zheng, Vice-Chancellor’s Fellow based in programme 4 of the MRC IEU, who recently published an innovative Mendelian randomization and colocalization study of plasma protein levels in Nature Genetics. The study demonstrated how genetic data can be used to support drug target prioritisation by identifying the causal effects of proteins on diseases.

Visualising Brexit’s Impact on Food Safety in Britain

Written by Marina Vabistsevits and Oliver Lloyd, researchers on PhD studentships linked to the “Data Mining Epidemiological Relationships” programme at the MRC IEU. Follow us on Twitter: @marina_vab, @PlotThiggins. Leaving the EU presents many unique challenges to Britain, among which is the crucial task of maintaining our high levels of food safety. As a submission to the Jean Golding Institute’s data visualisation competition, we briefly investigated the impacts that Brexit may have on British food supplies.

Drug target prioritization using protein QTL

Overview: An innovative genetic study of blood protein levels, led by researchers in the DMER programme at the MRC Integrative Epidemiology Unit (MRC IEU) at the University of Bristol, has demonstrated how genetic data can be used to support drug target prioritisation by identifying the causal effects of proteins on diseases. Working in collaboration with pharmaceutical companies, Bristol researchers have developed a comprehensive analysis pipeline using genetic prediction of protein levels to prioritise drug targets, and have quantified the potential of this approach for reducing the failure rate of drug development.

Exploring Elasticsearch architectures with Oracle Cloud

The IEU GWAS Database: The MRC Integrative Epidemiology Unit (MRC IEU) at the University of Bristol hosts the IEU GWAS Database, one of the world’s largest open collections of genome-wide association study (GWAS) data. As of April 2019, the database contains over 250 billion genetic association records from more than 20,000 analyses of human traits. The IEU GWAS Database underpins the IEU’s flagship MR-Base analytical platform, which is used by people all over the world to carry out analyses that identify causal relationships between risk factors and diseases and prioritize potential drug targets.