Frequently Asked Questions
What is the status of text and data mining (TDM) activities under the current Canadian copyright regime?
The Canadian Copyright Act does not address text and data mining. The federal government has signalled that it intends to consider changes to the Act for this type of research, but there is no clear regulation at this point.
For more information see ‘A Consultation on a Modern Copyright Framework for Artificial Intelligence and the Internet of Things’: https://www.ic.gc.ca/eic/site/693.nsf/eng/00316.html
What are the applications and limits of the fair dealing doctrine?
Canadian courts have not provided sufficient direction for researchers to rely confidently on the fair dealing exception for broad copying of entire protected works for TDM. This contrasts with the U.S. fair use doctrine, under which copying for TDM has been successfully used as a defence by researchers in that jurisdiction. Other jurisdictions, such as Japan and the EU, have introduced exceptions specific to TDM. Canadians should expect direction from the government on this issue but, until then, should seek the permission of the copyright owners for any mass copying for TDM research.
Can I harvest publicly available data (Twitter feeds, Facebook comments, news sites, etc.) for analysis?
There is an important distinction between facts and data on the one hand and copyright-protected works on the other. Copyright rules do not apply to raw facts and data, but they do apply to the original expression of the data, for example in the form of written discussion, charts, or graphs. However, publicly available data and works are generally governed by a license or ‘terms of use’ that stipulates how information on a website can be used. Unless the license contains language that permits the type of copying necessary to harvest the information, permission from the owner is required.
How do I obtain permissions to harvest a corpus of texts, and are there different licensing and access models?
Looking into the rights or permissions needed to harvest a corpus of texts, or to use data, should be the first consideration in any research project that depends on them. You are taking an avoidable risk if you do not request rights and permissions until after the research is complete, and you may be disappointed if permissions are denied. Before making a request, gather as much information as possible on what material is necessary, how it will be used, whether it will be used in partnered research, whether you will transform or build upon it, where it will be stored, etc. Permission can be as simple as an email or as complex as a licensing agreement. If ‘terms of use’ or a license, such as a Creative Commons license or the Community Data License Agreement, are specified, you will need to make sure your intended use aligns with what they allow. If your intended use is not allowed under the specified license or ‘terms of use’, you will need to ask the owner’s permission. If no license is specified, you cannot use, share, distribute or change the material without obtaining permission or a license from the owner.
Can anyone at UNB Libraries help me navigate the legal landscape on this and reach out to implicated publishers?
For more information, contact Josh Dickison, Copyright Officer and Manager of Digital Delivery at UNB Libraries: copyright@unb.ca
UNB Libraries may refer you to the Office of Research Services at ors@unb.ca, or ask you to contact that office directly, if a research license agreement is needed or if you need assistance with different licenses and their allowed uses.