Introducing EUCANCan’s Genomic Benchmarking Tool

One goal of EUCANCan is to enable the sharing of genomic data within and across institutions in Europe and Canada. In order to effectively and accurately compare data on mutations of the genome, different institutions must harmonize their variant calling strategies and how they structure the results of their variant calling pipelines. Aiding this process is the rationale of the Variant Call Format (VCF) benchmarking tool.

The VCF benchmarking tool is being developed as part of EUCANCan’s Work Package 2 (Genome analysis pipelines to support the therapeutic decision), led by Dr Philippe Hupé of Institut Curie in Paris, France. The tool is the result of a collaboration between Institut Curie, Hartwig Medical Foundation in Amsterdam, the Netherlands, the German Cancer Research Center (DKFZ) in Heidelberg, Germany, and EUCANCan’s coordinating institute, Barcelona Supercomputing Center, Spain.

The EUCANCan communications team recently met with Tom Gutman of Institut Curie, a member of the VCF benchmarking tool working group, to learn more about the tool.

“This is a powerful tool that can help EUCANCan’s partners to improve their variant calling performance. Other genomic benchmarking tools exist out there, but most are limited to specific types of variants or variant calling tools. Our tool enables the analysis of results of a wide range of variant calling methods and variant types,” explains Tom Gutman.

The tool has been specifically designed for EUCANCan. The team behind the tool has been holding weekly meetings since early 2020, spending Thursday mornings together (virtually) to develop, test, and refine the tool.

Harmonize the variant calling strategies of EUCANCan’s partners

Harmonizing the variant calling strategies takes two steps. The first step is to streamline the way EUCANCan centers use bioinformatics tools with geniac. The second step is to use the VCF benchmarking tool to homogenize the output of different tools, such as lists of mutations in a specific sample set.

In developing the tool, the working group asked several of EUCANCan’s partners to analyze the same private data sets, using their current tools. Their results were later analyzed and compared. The initial results of this work show only small discrepancies in the partners’ ability to correctly detect small-sized mutations, but their capacity to discover and classify medium- and large-sized mutations differs. Beyond visualizing the differences in the partners’ results, the VCF tool will also provide recommendations to each user for improving their results. This latter part is still under development.
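To give a feel for what comparing call sets can involve, here is a minimal Python sketch. It is not the EUCANCan tool and does not reflect how the working group implemented it. It assumes that a reference ("truth") set of variants is available, that variants can be matched exactly on chromosome, position, reference allele and alternate allele, and that precision, recall and F1 are the metrics of interest. All function names and sample records are hypothetical. Exact matching is a simplification that is likely too strict for medium- and large-sized mutations, where reported positions may legitimately differ between callers.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from tab-separated VCF records, skipping header lines."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or VCF header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # one key per alternate allele
            variants.add((chrom, int(pos), ref, alt))
    return variants


def benchmark(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a call set with a truth set using exact matching (an assumption)."""
    tp = len(calls & truth)  # called and present in the truth set
    fp = len(calls - truth)  # called but absent from the truth set
    fn = len(truth - calls)  # in the truth set but missed by the caller
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up records for illustration only.
truth_lines = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t250\t.\tC\tT",
    "chr2\t500\t.\tG\tA",
]
call_lines = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr2\t500\t.\tG\tA",
    "chr3\t42\t.\tT\tC",
]

if __name__ == "__main__":
    print(benchmark(parse_vcf_lines(call_lines), parse_vcf_lines(truth_lines)))
```

Running the same comparison for each partner's call set against one shared reference would yield numbers that can be placed side by side, which is the kind of comparison described above.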

At the time of writing, the VCF benchmarking tool is only used by the four institutions developing it. The tool will be made available to all EUCANCan partners later this year and the working group is also looking into the possibility of sharing the tool outside the consortium towards the end of the project. Around the same time, the team will share their lessons learned from developing the VCF benchmarking tool in a publication.