Panton Principles

Contributors
========

* Peter Murray-Rust
* Rufus Pollock
* Bryan Bishop
* Mike Chelen
* Cameron Neylon
* Daniel Mietchen
* Andy Powell

Basics
======

## Q1: What are the Panton Principles?

The Panton Principles are a set of recommendations that address how best to make published data from scientific studies available for re-use. In this context, “published” means “made public” and is not restricted to formal publication in the scholarly literature.

## Q2: Why the Panton Principles?

There is a need to state clearly what openness means in relation to public science. The Panton Principles do this by drawing on the Open Knowledge Foundation’s [Open Definition](http://www.opendefinition.org/) and Science Commons’ Protocol for Implementing Open Access Data.

## Q3: Why are the Panton Principles important?

Scientific research is stored in structured data files. Uncertainty about our ability or rights to use data from these studies limits access to them and their reuse. If the Panton Principles can increase the sharing of data, they will speed the advancement of the fields concerned.

## Q4: Who has adopted the Panton Principles?

Over a hundred people from around the world have already signed up to the Panton Principles; see https://pantonprinciples.org/endorse/.

The Principles are also endorsed by the Open Knowledge Foundation. Science Commons have not “formally” endorsed the principles but they are compatible with the Science Commons Protocol for Implementing Open Access Data, and the head of Science Commons was a drafter of the principles.

## Q5: Who are the Open Knowledge Foundation and Science Commons?

The Open Knowledge Foundation is a not-for-profit organization founded in 2004 and dedicated to promoting open knowledge in all its forms. It is a leader in this field nationally and internationally.

The Foundation’s activities are organized around individual working groups and projects, each focused on a different aspect of open knowledge, but united by a common set of concerns, and a common set of traditions in both etiquette and process.

Science Commons is a project of Creative Commons, a 501(c)(3) non-profit organization based in San Francisco that is best known for its development of the Creative Commons licences for creative works. The Science Commons project has focussed on examining how best to make the outputs of research available and has led to the development of a range of tools and policy statements in this space.

## Q6: Are the Panton Principles sponsored, supported, or considered in any way by the Open Source Initiative, the Free Software Foundation, the Electronic Frontier Foundation or other organizations? (i.e., what are their contributed opinions?)

These principles have not been explicitly sponsored or supported by the Open Source Initiative, the Free Software Foundation or the Electronic Frontier Foundation as these organizations either focus on (software) “code” or deal with more general digital rights matters. However, we believe the Panton Principles to be entirely consistent with the position taken by the Open Source Initiative and the Free Software Foundation in relation to code and the Electronic Frontier Foundation’s general position in relation to the rights in data.

## Q7: How should I cite the Panton Principles?

The Panton Principles may be cited using the following details in the relevant citation style for your publication:

Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John (19 Feb 2010). Retrieved [insert date] from https://pantonprinciples.org/

## Q8: Is “Open Data” a precise term?

The Open Knowledge Foundation defines Open Data very precisely, see: http://www.opendefinition.org/okd/. While there is some disagreement about the details (specifically with respect to share-alike provisions) there is broad agreement that for data to be open it must be freely re-usable by anyone for any purpose.

## Q9: Is Open Data the same as Open Source or Open Access?

No. Open Data differs in specific ways from Open Source software, which relies on software licences, and from Open Access, which is provided by licences such as CC-BY. The primary and critical difference is that in most jurisdictions the rights in datasets are different from those in “content”. This means that licences designed for content, such as software licences and Creative Commons licences, are not applicable to data.

Therefore, while the principles of wishing to enable access and re-use of data are similar to the motivations behind Open Source and Open Access, the mechanisms need to be different.

## Q10: Is Open Data the same as CC0 or PDDL?

Not exactly. All CC0 and PDDL data is Open Data, as it has been explicitly placed in the public domain. However, some data qualifies as open under the Open Knowledge Definition without being made available under these waivers, such as the data from the OpenStreetMap project.

## Q11: What are community norms and why are they important?

Community norms are a community’s shared ways of working: the activities, processes and working practices for which a consensus exists on how they should be carried out. For example, in the scholarly research community, citation is a commonly held norm when reusing another community member’s work.

Community norms can be a much more effective way of encouraging positive behaviour, such as citation, than applying licences. A well-functioning community supports its members in their application of norms, whereas licences can only be enforced through court action and thus invite people to ignore them when they are confident that this is unlikely.

The Panton Principles encapsulate an emerging community norm for the best way to share scientific research data.

## Q12: To which kind of data do the Panton Principles apply?

The Panton Principles apply to scientific research data – especially where that research has been publicly funded.

The principles, in their current form, do not include data from the social sciences and humanities. However, interest from those communities in expanding the principles to those areas, or in developing similar principles specifically for those disciplines, would be very welcome.

## Q13: Who decides whether to make certain data Open?

The owner(s) of the rights in that data. This is both a simple statement and potentially very complex as the rights owners may be many and varied. In most cases, the people who make a decision to publish, and were intimately involved in the generation of the data, should be making this decision. The organisations that fund the gathering and integration of data may choose to require that data they have paid for be made open.

## Q14: Where can I see examples of Open data?

See the [Data Hub](http://thedatahub.org/), especially those packages which are tagged as downloadable and openly licensed.

For a specific example see [Crystal Eye](http://thedatahub.org/dataset/crystal-eye), where all data are explicitly stated to be under the PDDL. There are over 1 million pages, each with an OpenData button.

## Q15: What are some of the options for hosting and serving Open Data?

Almost all universities and publicly funded research organizations have an “Institutional Repository”. Many repository managers are keen to manage data as well as publications and other artefacts. If they have not yet implemented an Open Data policy, suggest that they do.

The Talis Connected Commons initiative offers free hosting of data that are licensed CC0 or PDDL. See http://blogs.talis.com/n2/cc .

## Q16: What sort of material is data? Can graphs, tables, etc. be marked as Open Data?

There is a grey area between creative text and factual data. A graph or table can be regarded as the natural expression of data, and the Panton Principles encourage people to take this view. A problem may arise where a Closed Access publisher regards nothing behind its firewall as Open Data, while authors and readers assume that such graphs and tables are data.

## Q17: I have used third-party data in my research – can I combine it with my data and stamp it as Open Data?

This will depend on the particular circumstances. If there are explicit restrictions on the re-use of the third-party data, these may forbid integration (though you may still be able to link to it effectively). In other cases you may need to judge whether the data are “facts”, how much of the third-party resource you are using, and how it is organized (that is, whether you are likely to infringe database “sui generis” rights). You may also consider the community norms: in bioscience it is common and encouraged to create derivative works of data even though there are not always explicit permissions. Do you think the data provider will welcome their data being given greater visibility and citations?

You may wish to take the “ask for forgiveness, not permission” approach.

For Data Generators
=============

## Q1: Why should I want my data to be marked as Open Data?

So that people can find out quickly what they can and cannot do with it.

## Q2: Do I have to do anything to make my data Open Data?

Yes, you have to explicitly apply an appropriate license. For more information see this guide:

## Q3: Is there a way to automate the process?

Yes, a license file or metadata can be attached programmatically. TODO: details of how.
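As one possible approach, the following minimal Python sketch writes a license notice and a machine-readable metadata entry into a dataset directory. The function name `mark_as_open`, the file names `LICENSE.txt` and `metadata.json`, and the `license` metadata key are assumptions made for illustration, not part of the Panton Principles; the default waiver shown is the PDDL.

```python
import json
from pathlib import Path

# Illustrative defaults: the PDDL identifier and its public URL.
LICENSE_ID = "PDDL-1.0"
LICENSE_URL = "https://opendatacommons.org/licenses/pddl/1-0/"


def mark_as_open(dataset_dir, license_id=LICENSE_ID, license_url=LICENSE_URL):
    """Write a license notice and a metadata entry into dataset_dir.

    File names and the metadata layout are hypothetical conventions.
    """
    root = Path(dataset_dir)
    root.mkdir(parents=True, exist_ok=True)

    # Human-readable notice for people browsing the dataset.
    notice = f"This dataset is made available under {license_id}: {license_url}\n"
    (root / "LICENSE.txt").write_text(notice, encoding="utf-8")

    # Machine-readable metadata; keep any fields that already exist.
    meta_path = root / "metadata.json"
    if meta_path.exists():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    else:
        meta = {}
    meta["license"] = {"id": license_id, "url": license_url}
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return meta
```

Calling `mark_as_open("my_dataset")` on each dataset directory, for example from a release script, would label every release consistently. Repositories with their own metadata schema would need the equivalent field in that schema instead.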

## Q4: How far back can I go in labeling data as Open Data?

As far back as you like: old data can be labeled as Open Data as long as the rights owners are still known.

## Q5: When should I label data as Open Data?

As soon as you release the data to the public in a form that makes them genuinely reusable.

For Data Users
===========

## Q1: How do I know that data is Open Data?

The data must specify a license that conforms to the Open Knowledge Definition: http://www.opendefinition.org/okd/.

## Q2: Are there any restrictions on what I can do with Open Data?

Possibly, depending on the license. Some licenses, such as CC0, have no restrictions. Others may require attribution. Also note that those who generated the data may have specific wishes about how the data is used, described or cited. You should always aim to follow any reasonable requests made by the data owners/publishers. These may be explicit or may be implicitly understood by the community. You should make an effort to understand any relevant “community norms” for the data you are using.

## Q3: I’m a robot – can I tell what data is Open Data?

Yes, if the files include license information.
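A minimal Python sketch of such a check is shown below, assuming a hypothetical convention in which each dataset directory carries a `metadata.json` file with a `license` entry containing an `id`. The function name `is_open_data` and the short allow-list are illustrative only; a real robot would consult the full list of licenses that conform to the Open Knowledge Definition.

```python
import json
from pathlib import Path

# Illustrative allow-list of identifiers for public domain waivers.
OPEN_LICENSE_IDS = {"CC0-1.0", "PDDL-1.0"}


def is_open_data(dataset_dir):
    """Return True only if metadata.json declares a license on the allow-list."""
    meta_path = Path(dataset_dir) / "metadata.json"
    if not meta_path.exists():
        return False  # no license information: do not assume openness

    try:
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False  # unreadable metadata is treated as "not marked"

    license_info = meta.get("license") if isinstance(meta, dict) else None
    if not isinstance(license_info, dict):
        return False

    license_id = license_info.get("id")
    return isinstance(license_id, str) and license_id in OPEN_LICENSE_IDS
```

The check deliberately returns `False` whenever license information is missing or unreadable, since unmarked data cannot be assumed to be open.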

For Publishers
===========

## Q1: Why should publishers care about Open Data?

It has long been accepted that access to the data behind research is essential for transparency and reproducibility. Simply making data available, however, is not enough unless the data is also openly licensed.

Moreover, making the data open will increase the visibility of the original research, promote reuse, and raise the likelihood that other authors build upon (and cite) that original research, all of which are good for both the author and the journal.

## Q2: I publish Open Access journals – does that automatically mean the data are Open Data?

No. The journal must in addition apply a license to the data. Moreover, in general, content licenses (e.g. standard Creative Commons licenses) are *not* appropriate for data (see Panton Principle no. 2). We recommend specific licenses to use in principle no. 4.

## Q3: My Editor and Board are insisting that authors deposit their data – should it be Open Data?

Yes.

## Q4: How do we mark our data as Open Data?

Include a license file in the dataset, or explicitly state the license in any available metadata. In addition, you can apply an open data button; see the sidebar on the Panton Principles site.

For Repositories
================

## Q1: I want to deposit my data in my Institutional Repository – will it be Open Data?

Repository managers are keen that their contents are widely accessed, and Open Data will help to highlight the output of the institution. Repository managers (like publishers) should become aware of the value and practice of Open Data and seek to promote it. Repositories hold a wide range of digital artefacts, some of which may be covered by other rights and practices, so it is unlikely that there will be a universal policy.

## Q2: Are the main discipline repositories Open Data?

In subjects such as astronomy, bioscience, and many others there are implicit and explicit community norms which ensure considerable freedom of re-use but do not amount to Open Data. See, for example, http://www.galaxyzoo.org/copyright, which explains why the data have to be licensed in a non-Open Data manner. We believe that as Open Data becomes more widely known, many providers will be keen to change their policies to comply with the Panton Principles.
