Entry for National Laws / Australia

tobybellwood · March 6, 2017, 8:47pm

Hi @JXhaf, in your review for National Laws you marked the entry down because the data is not in HTML format. But HTML is the primary mechanism for their web delivery, so that seems a little harsh?

e.g. https://www.legislation.gov.au/Latest/C2016C00828 is the dedicated HTML page for the Bank Integration Act 1991 (C2016C00828), and will always be the latest version.

In https://www.legislation.gov.au/Content/Linking they even acknowledge the use of crawlers!

JXhaf · March 8, 2017, 5:36pm

Thank you @tobybellwood for your feedback. After also consulting with Alyssa Beaton of Open North, I will not be making any changes to the review, for the following reasons:

In the link that you provided it clearly states that: "Most documents on this website can be downloaded and printed using the “Download” tab at the document level. Documents are usually available as a PDF file, a formatted text (.doc, .docx. or .rtf) file, and a zip file." As you can see, there is no mention of the contents being available in HTML. My review was based on the fact that the text of the laws is not actually available in HTML, even though I found the document on a webpage. The only information available in HTML is the title of the laws, which is not sufficient data for the purposes of the survey.

As for your comment on the use of crawlers, you are right to point out that they acknowledge the use of crawlers, but this is insufficient. While the information can be crawled, it cannot be scraped, and scraping is what matters more for the purposes of the survey. Information on the difference between scraping and crawling can be found here: https://www.quora.com/What-are-the-biggest-differences-between-web-crawling-and-web-scraping

tobybellwood · March 11, 2017, 2:14am

Thanks @JXhaf for the explanation. I would certainly benefit from a documented clarification on what constitutes an open format. If the HTML is parseable at a defined endpoint (however tough that parsing may be), surely it’s open, given its plain-text nature. In this case it is relatively simple to parse the (relatively) well-structured HTML from the webpage using core Python libraries.
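To illustrate the kind of parsing meant here, this is a minimal sketch using only the standard library's `html.parser`. The `TextExtractor` class, the `extract_text` helper and the `SAMPLE_HTML` string are all made up for illustration; the sample markup is not the actual structure of the legislation.gov.au page, which would need its own handling.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of non-empty text chunks found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Invented sample markup (an assumption, not the real page source).
SAMPLE_HTML = """
<html><head><title>Sample Act</title>
<style>p { margin: 0 }</style></head>
<body><h1>Sample Act</h1>
<p>1  Short title</p>
<p>Placeholder text for section 1.</p>
<script>var x = 1;</script>
</body></html>
"""

if __name__ == "__main__":
    for chunk in extract_text(SAMPLE_HTML):
        print(chunk)
```

In practice the page would be fetched first (for example with `urllib.request`) and the extractor would likely need to be narrowed to the element that holds the body of the Act.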

It’s worth adding that AustLII (the other main free access resource for Australian Legislation) provides a version of the laws in a much cleaner HTML format (BANK INTEGRATION ACT 1991) as well as RTF/TXT download (Download Menu) which may fit the open definition better in this case.

JXhaf · March 15, 2017, 3:08pm

@tobybellwood, after your useful feedback I have updated the review to include a comment that the data is available in HTML. However, as you may have noticed, there has been no change to the survey itself, because HTML is no longer one of the options for machine-readable formats (the content can be difficult to scrape).

As for the other source you indicated, it does not appear to be a government source, so it is not relevant for the purposes of the index.

tobybellwood · March 15, 2017, 8:34pm

Thanks @JXhaf for the update, and all your hard work! Much appreciated.
