Long read: Open science – Part 1

Welcome to our new "long read" series. We invite readers to take an in-depth look at some of the topics that are important to the world of fusion research. We open the series with the first part on Open Science. A shorter version of this piece is available in the Spring 2019 issue of Fusion In Europe.

Article authored by Karl Tischler

The Devil is in the Details

Science is better shared. Scientists have long understood this and travelled the “known world” to exchange ideas between different fields of knowledge. They wrote papers to make sense of their observations – the data they gathered – and have conducted peer reviews of their findings through academic journals since the 17th century. Through standards and transparency, they strengthened credibility and improved knowledge sharing.

Three and a half centuries later, the Open Science movement seeks to take this goal even further by sharing scientific research more broadly with society at large and more deeply by supplementing scientific papers with supporting data, methods and even software. All this to make research more innovative, accountable and collaborative, and to help it advance faster.

It sounds simple. But the devil is in the details.

New framework

Increasing the returns on publicly funded research is in some cases a sure win. With proper infrastructure and resources in place, findings are relatively easy to share and combine with research and data from other fields of study, leading to new insights and discoveries. This scientific cross-pollination creates a smarter society.

Sometimes research is easy to share but very complex or specialized. Misinterpretation is a big concern because it could cause confusion instead of furthering understanding. The research’s complexity may limit cross-pollination. Opening this research up to the general public therefore seems to be of little benefit.

Finally, there are cases of highly complex, niche research that is difficult both to share and to use. A good example is fusion energy research. While the first photograph of a black hole, recently released by the European Southern Observatory (ESO), captured headlines and the public’s imagination, a EUROfusion study into deuterium-tritium plasma instabilities caused by infinitesimally small changes in magnetic field conditions is largely ignored outside the fusion community, for rather obvious reasons. Sharing this research outside the fusion community makes little sense – especially because it is so difficult and costly to share!

How to get published

Publishing papers in science journals is surprisingly complex, restrictive and sometimes costly. In 2018, EUROfusion Consortium researchers submitted 1,100 papers for review. These papers first undergo an internal scientific peer review.

Researchers want their paper published in the most prestigious journal possible because of the distinction and exposure it brings. Journals are selective about which papers they publish and send them to experts for peer review before publishing. “Some journals’ acceptance rate is only 30%,” says Kinga Gal, Scientific Secretary in the EUROfusion Programme Management Unit. “Thanks to our own internal review process to ensure the quality of each paper, around 70% of our submitted papers get published.”

Embargo

In return for their efforts, journals collect a fee from the researcher(s) to publish each paper. The more prestigious the journal, the higher the price. The fee charged by fusion journals averages €2,000 per paper, compared with €5,000 in other scientific fields.

Journals also charge for access to published papers. During the “embargo period”, which lasts from six months to two years, the paper is only accessible via paid subscriptions or a one-time purchase. Subscriptions are more or less affordable. However, one-time purchases can be costly. For example, a recent scientific conference paid €130,000 in licensing fees for its 150 attendees – after a 50% discount!
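To put the conference example in per-person terms, here is a rough back-of-the-envelope sketch. It uses only the three figures quoted above; the assumption that the 50% discount applied to the whole fee, and all variable names, are ours for illustration.

```python
# Figures quoted in the article.
fee_paid_eur = 130_000   # licensing fee actually paid, after the discount
attendees = 150
discount = 0.50          # 50% discount

# Assumption: the discount applied uniformly to the entire fee.
list_price_eur = fee_paid_eur / (1 - discount)

per_attendee_paid = fee_paid_eur / attendees
per_attendee_list = list_price_eur / attendees

print(f"Paid per attendee: about €{per_attendee_paid:,.0f}")
print(f"Undiscounted price per attendee: about €{per_attendee_list:,.0f}")
```

Under that assumption, the conference paid roughly €867 per attendee, against an undiscounted price of roughly €1,733.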

After the embargo period, papers can be shared following Green Open Access practices. However, only the submitted, non-typeset version of the paper can be freely shared, and only via an institutional or personal repository, each of which has its own additional rules. The published version remains the exclusive property of the journal, which continues to charge a nominal fee for access.

There is also a Gold Open Access option. In exchange for a higher page rate plus a substantial fee, a journal gives free access to the paper via its website. The researcher incurs all the costs, leaving less money for research.

As you can see, the existing publishing process is complex and expensive. Changing this traditional process is being actively discussed. We’ll have to see what develops.

More than data

The complexity of sharing is compounded when the related supporting data, software and methods are added. The data can be terabytes in size. It might only be understandable by experienced professionals or even require supercomputer processing capabilities to use. To remain usable, the data’s format must be kept current and sometimes software has to be provided, which creates software licensing issues.

“The methods used to create the data and the research notes and codes which convey data in real physical terms are essential,” elaborates Tony Donné, EUROfusion Programme Manager.

(Really) Big Data

Finally, there is the sheer amount of data to be shared. At the Joint European Torus (JET) fusion research facility, 10 to 100 gigabytes of data are captured with each experiment, commonly referred to as a “shot”. To date they have made nearly 100 thousand shots. At the ASDEX Upgrade fusion device they have made around 37 thousand shots. That’s a lot of data!

Modelling is an even bigger source of data. A big plasma turbulence code easily generates terabytes of data. And then there will be ITER, the international fusion megaproject being built in southern France, which may capture as much as one petabyte of data per shot.

Storing, managing and maintaining these amounts of data is costly. So is gathering and sharing it.
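For a sense of scale, the figures quoted in this section can be multiplied out. This is only a rough sketch: it assumes that every one of JET's shots fell within the 10 to 100 gigabyte range, which may well overstate the total if earlier experiments recorded less data, and it uses decimal units (1 petabyte = 1,000,000 gigabytes). The variable names are ours.

```python
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 1,000,000 GB

# Figures quoted in the article.
jet_shots = 100_000                          # "nearly 100 thousand shots"
gb_per_shot_low, gb_per_shot_high = 10, 100  # range captured per shot
iter_pb_per_shot = 1                         # "as much as one petabyte" per shot

# Assumption: every JET shot falls within the quoted range.
jet_low_pb = jet_shots * gb_per_shot_low / GB_PER_PB
jet_high_pb = jet_shots * gb_per_shot_high / GB_PER_PB

# How many maximum-size ITER shots would equal JET's upper estimate?
iter_shots_to_match = jet_high_pb / iter_pb_per_shot

print(f"JET estimate: {jet_low_pb:g} to {jet_high_pb:g} PB")
print(f"ITER shots to match the upper estimate: {iter_shots_to_match:g}")
```

On these assumptions, JET's experiments add up to somewhere between 1 and 10 petabytes, a volume that ITER could match in about ten shots at its projected maximum.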

Versions of a Scientific Paper

“Draft” peer-reviewed internally by EUROfusion

“Submitted” to journal

“Accepted” journal peer-reviewed

“Embargo” journal formatted & published

“Preprint” submitted version made public

“Post-print” published version publicly available after embargo period

 

Open Access (OA)

“Gold OA” researcher pays a hefty sum for immediate open access of the published version via the journal’s website

“Green OA” after the embargo period the preprint version can be shared via the institute’s online repository

“Without proper context, data by itself has no meaning.”

David Coster, Group Leader in Edge Physics, Tokamak Theory, at the Max Planck Institute for Plasma Physics in Garching, Germany

Shot

At JET, 10 to 100 gigabytes of data are captured with each experiment.

I scratch your back…

Once data is openly shared, it is available to everyone in the world. But is it right for expensive, publicly funded research to benefit citizens and countries that do not reciprocate?

“We keep our data to ourselves for a certain period, in the order of a year,” explains Tony Donné. “This gives the EUROfusion community enough time to use it first. For multiple experiments, or when we plan to use data in other ways, we’ll hold onto it even longer.” This ensures that the research first benefits the countries that funded it.

Security & intellectual property

Security must also be considered. Because fusion is a nuclear science, some research could be misused. The new Open Science methodology must identify such information and prevent it from being released publicly.

There is also the question of intellectual property (IP). It is in the European public’s interest to protect and benefit from the IP they fund. But how best to do this, especially when the rules and laws are so different even between EU member states? It will be no small undertaking to develop the necessary systems for the protection, enforcement and licensing of IP.

So many considerations

What seemed like an easy task now involves financial, commercial, intellectual property rights (IPR), maintenance and accessibility considerations. The people working towards the realisation of Open Science have to create a system flexible enough to work across and within different areas of scientific research. It must also be sufficiently adaptable to remain affordable and usable in an ever-changing world.

In this way, it will be possible to work towards, attain and maintain the delicate balance required for Open Science to work – both as an accelerator of innovation and a magnifier of returns on public investment in scientific research.

In an upcoming issue of Fusion in Europe we will look at the people working behind the scenes towards the realisation of Open Science in Europe and the benefits these efforts have already created. Stay tuned!