Tutorial - (based on the tutorial of the Global Open Data Index 2015)

Introduction

Thank you for contributing to the Finland Local Open Data Census, a joint community effort undertaken by contributors around Finland.

The Census helps quantify and compare the openness of government data around Finland. It is both an advocacy tool to promote open data and a research tool to evaluate it. It is important, therefore, that the data submitted to the Census is as accurate as possible.

The following tutorial will help you make your first contributions to the Census.

How does the Census work?

The Census is built on two units: places, which represent Finnish municipalities, and datasets. There are 20 datasets representing different governmental themes. Each dataset has a description followed by a list of characteristics. These characteristics describe the dataset and are meant to help you find the right dataset to evaluate. For more information about each dataset, please refer to our methodology page.

Each of these datasets is first evaluated by the contributor using nine different questions. After this process, the data is sent for review. The reviewers check the submissions made under a given dataset (e.g. spending).

After all of the reviews are completed and the datasets have been evaluated by the reviewers, the final score will be shown publicly on the website.

Contributing a new submission to the Census

In this Census you can contribute in 3 different ways:

  1. Enter a new submission: you’ll see an ‘Add’ button under the dataset.

  2. Update a past submission with new information: you’ll see a colored block that reflects the last year for which a score was recorded for the place’s dataset. Click the block and you’ll see options to view or update the data; choose ‘Update’.

  3. Leave a comment: you’ll see a ‘Review’ button under the dataset.

Add a new submission when there is no data on a specific dataset in a municipality. If a dataset already has a submission but you have more up-to-date information, you can propose a revision (see further down in this tutorial).

The survey website is the Finland Local Open Data Census.

  1. Select a place and a dataset from the homepage.

  2. Log in with either your Google account or your Facebook account. (You don’t need to log in to leave a comment.)

  3. Start answering the questions about the dataset.

You’ll be asked a series of questions about the dataset you have located to help determine how it is scored. There is a help text prompt next to each question in the “information” column.

Below, you will find further tips for each specific question.

Does the data exist?

If you choose “yes”, you will be asked to enter the data publisher, a title and a brief description, and then you will be guided through the remainder of the questions.

If you choose “no”, your submission will be recorded as such. It is very important to know when the data does not exist, so if you have made an honest effort to look for the dataset but cannot locate it, please indicate so rather than leaving the entry blank.

How do I know if the data exists?

Most governmental data can be found on the national government data portal (such as data.gov.uk). Try to search these portals first. If a country doesn’t have a data portal or the data is not on their portal, look for the data on specific government departmental websites. For example, look for budget data on your national treasury website. If you can’t find the data on a website, email the relevant department and ask them about the dataset and whether it’s online. If email isn’t an option, call the relevant department; sometimes you can get a better response on the phone than on email.
Still can’t find the data or you didn’t get an answer from the government? Try one last time by using your favourite search engine.

Is the data in digital form?

Choose “yes” if the data exists in any digital format, even if it can’t be accessed on the Internet. Data can be digital, but not accessible online.

If you choose “no”, you’ll notice that a couple of questions are greyed out: if the data isn’t digital, then by default it is not online, not machine readable, and not available in bulk. Move on to the next question.

How do I know if the data is in digital format?

If the data exists, but only on paper, it’s not digital! If you found the data on the Internet, it’s definitely digital, even if it’s just scanned versions of paper documents. Some data might be in digital format on a private government network, but not available publicly on the Internet. If you are aware that the data is digital somewhere (for instance, if a government official tells you so), then mark this one “yes” and add a note about how you acquired that information and any relevant contact details or links.

Is the data publicly available?

Choose “yes” if the data is made available to the public in any format without restrictions.

If you choose “no”, you’ll notice that a couple of questions will ‘disappear’, because that answer implies that the data is not available for free, is not online, is not openly licensed, and is not available in bulk.

How do I know if the data is publicly available?

If you need a password or some other form of permission to access the data, it’s not publicly available. If the data is only available in paper form, without any restrictions on the number of copies you can make, it’s publicly available. If there are limits on photocopying, it’s not considered publicly available. If you need to make a freedom of information (FOIA) request to access the data, it is not publicly available. If the data is only available to government officials and not citizens, it is not publicly available.

Is the data available for free?

Choose “yes” if the data is available without any cost.

Choose “no” if there is any cost involved in accessing the data. You’ll notice that the openly licensed question is greyed out because being available at no cost is a key provision of the Open Definition and open licensing.
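The dependencies between questions described so far can be sketched in a few lines of Python. This is only an illustration of the rules stated in this tutorial, not the Census website's actual code; the question keys (`digital`, `public`, `free`, and so on) are hypothetical labels chosen for this example.

```python
# Which later questions are implied "no" when an earlier question is answered "no".
# The keys are hypothetical labels, not identifiers used by the Census itself.
IMPLIED_NO = {
    "digital": {"online", "machine_readable", "bulk"},
    "public": {"free", "online", "openly_licensed", "bulk"},
    "free": {"openly_licensed"},
}


def greyed_out(answers: dict[str, bool]) -> set[str]:
    """Return the questions that no longer need answering for these answers."""
    skipped: set[str] = set()
    for question, implied in IMPLIED_NO.items():
        # Only an explicit "no" (False) triggers the rule; unanswered is ignored.
        if answers.get(question) is False:
            skipped |= implied
    return skipped


print(sorted(greyed_out({"digital": True, "public": True, "free": False})))
```

For instance, a dataset that is digital and public but costs money leaves only the open-licensing question greyed out.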

Is the data available online?

Choose “yes” if the data is available on the Internet; you will then be prompted to add the URL that links to it.

Choose “no” if the data is not available anywhere on the Internet.

How do I know if the data is available online?

If the data is publicly available (see above) and can be freely accessed on the Internet, it is available online. If the data is available in digital format, but not available on a public website, it is not considered to be available online.

Is the data machine readable?

Choose “yes” if the data is in a format that can be easily processed by a computer.

Choose “no” if the file format cannot be easily processed by a computer.

How do I know if the data is in a machine readable format?

The easiest way to answer this question is to look at the dataset’s file type.

As a rule of thumb, the following file types are machine readable: .XLS, .CSV, .JSON, .XML

The following formats are NOT machine readable: .HTML, .PDF, .DOC, .GIF, .JPEG, .PPT

If your dataset is a different file type and you don’t know if it’s machine readable or not, ask in the Open Data Census forum thread.
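If you are checking many files, the rule of thumb above can be applied mechanically. The following is a minimal Python sketch, assuming you only have the file name to go on; the function name `machine_readable_answer` is made up for this example, and any extension not listed in the tutorial is reported as "unknown" rather than guessed.

```python
from pathlib import Path

# Extensions taken from the rule of thumb in this tutorial.
MACHINE_READABLE = {".xls", ".csv", ".json", ".xml"}
NOT_MACHINE_READABLE = {".html", ".pdf", ".doc", ".gif", ".jpeg", ".ppt"}


def machine_readable_answer(filename: str) -> str:
    """Return "yes", "no", or "unknown" based on the file extension only."""
    suffix = Path(filename).suffix.lower()
    if suffix in MACHINE_READABLE:
        return "yes"
    if suffix in NOT_MACHINE_READABLE:
        return "no"
    # Not covered by the rule of thumb: ask in the forum thread.
    return "unknown"


print(machine_readable_answer("budget_2015.CSV"))
```

The extension is only a hint about the format, so open the file and check that it really contains structured data before answering.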

Is the data available in bulk?

Choose “yes” if the entire dataset can be downloaded at once.

Choose “no” if you cannot access the dataset in its entirety.

How do I know if the data is available in bulk?

If you aren’t able to download a single file that contains the entirety of the dataset you are looking for, it is not available in bulk. Governments often provide access to their data through an online interface. If access is restricted to querying a web form and retrieving only a subset of results at a time from a very large database, the data is not considered to be available in bulk.

Is the data openly licensed?

Choose “yes” if the data is licensed in a way that conforms to the Open Definition.

Choose “no” if the data is protected under a license that does not conform to the Open Definition.

How can I find the licensing information?

Usually, a license or Terms & Conditions can be found at the bottom of the website (in the footer) or under the site’s “About” section. If the site has a search function or a sitemap, those are good places to look as well. If there is no visible license and there are no terms and conditions or any other licensing information on the site, the data is not open and you should answer “no”.

How do I know if the data is openly licensed?

In order for data to be openly licensed, it needs to be free to use, reuse, and redistribute. The Open Definition website lists the licenses that are certified open. If Creative Commons (CC) licenses are used, then the data is generally openly licensed. If the non-commercial or no-derivatives CC licenses (the ones with “NC” or “ND” in their names) are used, then the data is NOT considered openly licensed. These are partially open, but not fully open according to the Open Definition. Sometimes publishers do not make use of Creative Commons licensing, but the terms and conditions do allow use, reuse and redistribution. In that case, we suggest you write to the Open Data Census forum thread and get feedback from the community about how to answer this question.
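The “NC”/“ND” rule of thumb for Creative Commons licenses can be written down as a small check. This Python sketch assumes the license is given as a short code such as "CC-BY-4.0" or "CC BY-NC-SA 4.0"; the function name is hypothetical, and the check covers only Creative Commons codes. Any other license still needs to be compared against the Open Definition by hand.

```python
def cc_license_is_open(license_code: str) -> bool:
    """Apply the NC/ND rule of thumb to a Creative Commons license code.

    Raises ValueError for codes that do not look like a CC license,
    because this rule of thumb says nothing about them.
    """
    # Normalise "CC-BY-NC-4.0" and "cc by-nc 4.0" to the same list of parts.
    parts = license_code.upper().replace("-", " ").split()
    if "CC" not in parts and "CC0" not in parts:
        raise ValueError(f"not a Creative Commons license code: {license_code!r}")
    # Non-commercial (NC) and no-derivatives (ND) variants are not open.
    return "NC" not in parts and "ND" not in parts


print(cc_license_is_open("CC-BY-4.0"))
print(cc_license_is_open("CC BY-NC-SA 4.0"))
```

Treat the result as a first pass only, and include a link to the license in your comments so the reviewer can confirm it.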

Is the data provided on a timely and up to date basis?

Choose “yes” if the data is relevant and complete for the year or time period that it claims to represent.

Choose “no” if the data is outdated or otherwise not representative of the stated or a reasonable time period.

How do I know if the data is timely and up to date?

Check the date stamp on the data (see below if that’s not obvious). If the data doesn’t seem relevant for the current year, mark “no”. It’s important to remember that not all datasets need to be updated with the same frequency. Transportation data can be updated on a daily basis, while postal codes might not change for many years. Do your best to determine what is reasonable for a given dataset. Does the data align with how your government works in a particular area? If budgets are determined yearly, there should be yearly data; if they’re determined every two years, then a two-year period for the data would be considered timely and up to date.
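One way to make this judgement consistent is to compare the age of the data with the update interval you consider reasonable for that dataset. The Python sketch below assumes a simple rule that the tutorial itself does not prescribe: data counts as timely if its latest update is no older than one expected interval. The function name and the interval values are illustrative only.

```python
from datetime import date, timedelta


def is_timely(last_updated: date, expected_interval_days: int, today: date) -> bool:
    """True if the latest update is no older than one expected update interval.

    The "one interval" threshold is an assumption for this sketch; use your
    own judgement about what is reasonable for the dataset.
    """
    return today - last_updated <= timedelta(days=expected_interval_days)


assessment_day = date(2015, 11, 1)

# A yearly budget published in March is still within its yearly cycle.
print(is_timely(date(2015, 3, 1), 365, assessment_day))

# Transport data expected daily, but last updated a month ago.
print(is_timely(date(2015, 10, 1), 1, assessment_day))
```

Whatever interval you choose, state it in your comments so the reviewer can see how you reached your answer.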

How do I find out when the data was published?

If the dataset was found on a government portal, there will most often be a timestamp attached to it. If the data was found on a government site, sometimes the date will be written next to the dataset link, or a release date might be listed in some related content (like a press release or news clip). Sometimes the date stamp is within the dataset itself, for example a tab in a spreadsheet that is named for the date it represents. You can also download the data and find the creation date of the file. This might not be the right date, but it can give you some useful clues. Use your best judgement and leave thorough comments about your assessment. Lastly, sometimes there are no timestamps at all. In that case, it might be fairest to mark the data as not timely or up to date.

Please join the conversation in the Finnish Open Data Ecosystem if you have any questions or need some help.

Commenting on a submission

The Census allows only one submission per dataset. However, you can still help by commenting on a current submission and proposing changes. Leaving detailed notes in the comment field goes a long way in supporting work on the Census during the review process.

Here are some tips for leaving good comments:

If you’ve determined that a particular dataset does not exist, let us know where you looked and what factors led you to believe that it doesn’t.

If the dataset is available in digital form but not available online, let us know what format the data is available in and provide contact information or further instructions on how one could get hold of the data.

If the dataset is not publicly available, let us know how one could get hold of the data, if at all. If the data is only available through a freedom of information (FOIA) request, give us an indication of what is involved in making that request.

If you are unsure whether a file type is machine readable, mark “no” as an answer and explain your rationale in the comment section.

If the dataset is not available in bulk, describe what is available and how to access it, including links.

If you are unsure whether a license is open, answer “no” and indicate why in the comments field. Include information and/or links to licenses or terms of use pages so the reviewer can quickly make a second assessment.

Let us know why you think the data is or is not made available in a timely manner. Different places have different legislative and governmental spending cycles, so it’s important for us to understand the local context as much as possible before making a judgement.