The 1000 Genomes Project today presents a map of normal human genetic variation – everything from tiny changes in the genetic code to major alterations in our chromosomes. In a DNA version of ‘spot-the-difference’, EMBL scientists and their colleagues studied the genomes of 1,092 healthy people from Europe, the Americas, Africa and East Asia, systematically tracking what makes us different from each other. Their results, published in Nature, open up new approaches to research on the genetic causes of disease.
“The 1000 Genomes Project has achieved something truly exceptional in providing this powerful baseline of human variation,” said Paul Flicek of EMBL-EBI, who co-chairs the project’s Data Coordination Centre (DCC). As well as providing that baseline – a clearer picture of which DNA sequences are common and which are rare in people from different areas or ethnic backgrounds – the results could help the ongoing search for genetic links to diseases.
Jan Korbel from EMBL Heidelberg, who co-leads the project’s study of variation in large sections of chromosomes, pointed out the advantages of combining information on such large-scale variations with data on changes at a smaller scale. “This integrated view of genome variation will be extremely useful for understanding cause and consequence, and hence provide an invaluable context for future medical studies,” Korbel said. “When people find a SNP, a single letter change, that’s associated with a disease, they can now see if there’s a change in a larger chunk of the genome that’s always inherited alongside that SNP, and could cause the disease.”
The results also open up new avenues for researchers interested in how different genetic sequences have spread across human populations – taken by European settlers to the Americas, for instance. Ensuring that the project’s results are useful to researchers working in a wide range of fields is the mission of Flicek’s data coordination team. “Like ENCODE and other massive datasets, it is crucial that people working in all areas of human health and biomedical research can make the most of it. Our role has been to make these data not just freely available but truly accessible,” Flicek said.
To that end, the scientists have already made the current results available to the scientific community. “The results of this first phase are in the 1000 Genomes browser, which has a whole suite of Ensembl-based tools that help you make practical use of the data,” said Laura Clarke of EMBL-EBI, Technical Lead for the DCC. “For example it lets you look at shared patterns of variance, which can be a good indicator of whether a particular genetic factor is related to disease. Another very practical tool lets you take just a slice of the data, so you don’t have to download the whole massive dataset.”
With the help of such tools, and the continuation of the 1000 Genomes Project, scientists are set to keep learning about, and from, the differences between us.
The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature, 1 November 2012.