This lesson is still being designed and assembled (Pre-Alpha version)

Intellectual Property, Licensing, and Openness

Overview

Teaching: 8 min
Exercises: 2 min
Questions
  • What is intellectual property?

  • Why should I consider IP in Open Science?

Objectives
  • Understand why the timeline of disclosure matters for legal protection

  • Understand what can and cannot be patented

  • Understand what licenses to use for re-use of data and software

Open Science and Intellectual Property

Intellectual property (IP) is something that you create using your mind - for example, a story, an invention, an artistic work or a symbol.

The timeline of “opening” matters when one seeks legal protection for their IP.

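For example, patents are granted only for inventions that are new and were not known to the public in any form. Publishing information about the invention in a journal, or presenting it at a conference, before a patent application is filed can prevent the inventor from getting a patent! (Some jurisdictions, such as the United States, allow a limited grace period after the inventor's own disclosure.)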

Openness has benefits of its own: it can lead to new collaborations, industrial partnerships, and consultations, which may be worth more than patent-related royalties.

(Optional) Intellectual property protection

You can use a patent to protect a non-obvious technical invention, that is, one that provides a “technical contribution” or solves a “technical problem”. A patent gives you the right to take legal action against anyone who makes, uses, sells or imports the invention without your permission.

In principle, software can be patented, but whether a particular piece of software qualifies is usually settled by the courts case by case.

Software code is protected by copyright. Unless you give permission, copyright prevents other people from:

  • copying your code
  • distributing copies of it, whether free of charge or for sale.

Data cannot be patented, and in principle, it cannot be copyrighted. It is not possible to copyright facts!

Facts are not patentable, and since machine learning algorithms such as neural networks are essentially mathematical methods, they are excluded from patent protection as such. However, when applied to a specific problem, an algorithm may become part of a patent. If it is framed in the right way, patenting an algorithm is possible. For example, a deep learning algorithm that generates a certain kind of audio may be eligible. But such a patent would not prevent the network from being applied to any other problem.

However, the way data are collated and presented (especially in a database) can have a layer of copyright protection. Deciding which data to include in a database, how to organize the data, and how to relate different data elements are all creative decisions that may receive copyright protection. Again, this is often decided case by case and may come down to who has better lawyers.

After:

https://www.uspto.gov/patents/basics

Exercise 3: Checking common licenses

  1. Open the CC BY license summary: https://creativecommons.org/licenses/by/4.0/ Is it clear how you can use data under this license and why it is popular in academia?

  2. Check the MIT license wording: https://opensource.org/licenses/MIT Is it clear what you can do with software code under this license?

  3. Compare the MIT license with the full wording of CC BY: https://creativecommons.org/licenses/by/4.0/legalcode Can you guess why the MIT license is currently the most popular for open source code?

Solution

  1. The CC BY license states that material can be reproduced and shared, in whole or in part, except where exceptions and limitations are stated. Attribution must be given to the licensor.
  2. The MIT license states that the software can be used without restriction (to copy, modify, publish, distribute, etc.), provided that the copyright and permission notice is included in all copies.
  3. The MIT license is short and to the point, and it is optimised for software developers as it offers flexibility.

Attribution

Content of this episode was adapted from:

  • https://carpentries-incubator.github.io/fair-bio-practice/

Key Points

  • A license is a promise not to sue - therefore attach license files

  • For data, use the Creative Commons Attribution (CC BY) license

  • For code, use an open source license such as the MIT, BSD, or Apache license
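
The key points above can be turned into a small check you run on a project folder. The following is only a minimal sketch in Python: the function names (`recommended_licenses`, `find_license_file`) and the list of file names are assumptions made for illustration, not part of any standard tool. The license names are written as SPDX short identifiers (`CC-BY-4.0`, `MIT`, `BSD-3-Clause`, `Apache-2.0`).

```python
from pathlib import Path
from typing import List, Optional

# Assumption: these defaults simply mirror the key points of this episode.
# The strings are SPDX short identifiers for the licenses named there.
RECOMMENDED_LICENSES = {
    "data": ["CC-BY-4.0"],
    "code": ["MIT", "BSD-3-Clause", "Apache-2.0"],
}

# Assumption: commonly used license file names; a project may use others.
LICENSE_FILE_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING")


def recommended_licenses(kind: str) -> List[str]:
    """Return the suggested SPDX identifiers for 'data' or 'code'."""
    if kind not in RECOMMENDED_LICENSES:
        raise ValueError(f"unknown kind {kind!r}; expected 'data' or 'code'")
    # Return a copy so callers cannot change the defaults by accident.
    return list(RECOMMENDED_LICENSES[kind])


def find_license_file(project_dir: str) -> Optional[Path]:
    """Return the first license file found at the top level of project_dir."""
    root = Path(project_dir)
    for name in LICENSE_FILE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print("Suggested for data:", recommended_licenses("data"))
    print("Suggested for code:", recommended_licenses("code"))
    found = find_license_file(".")
    if found is None:
        print("No license file found: consider attaching one.")
    else:
        print("License file found:", found)
```

The check only looks for the presence of a license file; it does not read the file or verify which license it contains, so choosing the license remains a decision for you and your institution.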