Core Data Curators - Introductions


Hi All,

I am Andreas and I work in search engine optimization. I work with very large datasets every day, mostly using bash tools to process the data, Excel to make it pretty (kind of), and custom tools for my job, like Splunk for log analysis and large-scale SEO tools.

This weekend will be my time to take a close look at how things are done. Looking forward to this!


Welcome @andreas - great to have your participation. Let us know how you get on with the Getting Started Guide.



I’m Peter and I live in Seattle. My background is in large scale IT infrastructure management and I’ve recently set out on my own doing web development.

From my own experience building data visualizations and maps, as well as from organizing a couple of civic hackathons, I’ve felt the time suck of corralling messy datasets. I want to help make it easier for others to use open data while becoming more skilled myself at processing unstructured data.

And I will say that messy, unstructured datasets are still better than no data at all, so I am keenly interested in upping my involvement with open data advocacy. I look forward to strategizing with like-minded spirits!


Hi everyone, my name is Riyadh, and I am a PhD student at SOAS researching copyright and human rights. I’m interested in copyright, open data, and the commons. My background is legal, but I can manage all the basic things in Excel and on GitHub.


@peter @bluechi welcome and great to have your contribution. Please head over to the Registry:

Have a browse and see if there is a dataset you would like to take on - either researching or packaging (you can also suggest new datasets!)



I’m Santiago, a recent graduate in Library and Information Science with no previous experience on data curation.

However, I know how to make Google do a barrel roll (and luckily some more useful things too), how to properly query databases, and I have some idea of how to assess the quality of information sources. I also know some easy Bash tricks and some HTML, and I am learning Python. I’m also a native Spanish and Catalan speaker, in case that could be of any help.

My participation is driven by curiosity: I want to learn how to prepare neat data packages, I want to see what people can do with them and, in the meantime, I want to meet brilliant people and improve my IT skills. I’m really glad to collaborate with you on this project.


@stmartin great to hear from you! Please dive in and get started - the guide is here:

And the queue of things to contribute to is here:


Hi all! Sorry for the delay in getting started. I work as a researcher in the fields of law and anthropology (international law, public health, indigenous peoples’ rights, medical anthropology, …). For the last several years I have been specializing in data science and visualization applied to the social sciences. I am a big believer in the importance of open scientific cooperation and the unrestricted, efficient sharing of information and knowledge between people. The implementation of open data is an obvious prerequisite to enable and maximize the impact of such a scenario. So here I am, looking forward to contributing to the initiative by creating some data packages!!

EDIT: Forgot to mention, I am proficient in Spanish and Catalan (my mother tongues) and English. I also speak a little bit of Simplified Chinese (my wife is from Shanghai, and I used to be an expat over there). R and Python are the languages I am most comfortable with, along with sed, grep, regular expressions and the like. I could code some HTML / CSS if necessary (no JS though).


Hola Santiago. Nice to meet a fellow Catalan over here! I am originally from Barcelona, although I have spent the last eight years abroad. Where are you based?


@EnricGTorrents welcome! Please dive in - the guide to getting started is here:

And the queue of things to contribute to is here:


Welcome, Enric, and nice to meet you too. I’m currently based in London. Are you living nearby?


Thanks, Santiago! No, I am based in Almería. I just got back to Spain after spending eight years living abroad (the first four in the UK, one of them in London, and the rest in China).


Hi Everyone, I’m Robb. My background is in software engineering and law. My current work is in getting the world’s restaurant health inspections online, and publishing law in an accessible way.

My skills include the US legal system and intellectual property law, git and GitHub, software architecture, and languages like Python, Ruby, Perl, and JavaScript.

I’m interested in contributing because I’m already working to make more data open, and I can see that I have the skills to contribute here.

Good to meet everyone!


Welcome Robb - great to have your participation!


Hello everyone, sorry for joining late.
I am Mahroof, a civil engineer and urban planner from India, currently working for, an action research project with grants from the Gates Foundation. I am deeply interested in GIS and spatial analysis, and data wrangling and analysis is a key skill of mine. I am part of a growing open data community in India and am the organizer of the local chapter in our city, Ahmedabad, trying to coalesce an open data community here. I hope to learn from you all, and to contribute to OKFN.


Hello everybody,

I am Yann-Aël, and I work as a postdoc researcher in machine learning/big data/bioinformatics at the University of Brussels. For programming, I mostly use R, and occasionally Python.

I am interested in open data initiatives and would like to contribute here to make more public data easily available.


Hi, I tried to start an OKLab in Fürth/Nürnberg, but have not found any potential collaborators yet. I want to throw in some volunteer work now to get more involved with the OK community.

I’m working as a freelance developer … have a long background as a data analyst … especially experienced in data wrangling with Excel (focused on VBA) … data management with SQL Server … working in C#, T-SQL, JavaScript, some Python (for scraping), some PHP (re WordPress) … academic background is political science and communication science … and I’m enthusiastic about the Open Data Movement.

When fiddling with open datasets, I saw that many of the CSV files use rather unusual, non-standard formats, with leading lines before the actual data table starts and with multi-line headers. This is bad. It makes the data impossible to consume automatically, because the files need some manual cleaning before they can be read in. So data curation is a major issue.
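As a rough illustration of the cleanup described above, here is a minimal Python sketch. It assumes you already know how many leading lines and how many header rows a file has; the function name `read_messy_csv`, its parameters, and the sample data are all hypothetical and not taken from any real dataset.

```python
import csv
import io


def read_messy_csv(text, preamble_lines, header_rows):
    """Skip leading lines, merge a multi-row header, return (header, rows).

    Both counts are assumptions the caller must supply after inspecting
    the file; this sketch does not try to detect them automatically.
    """
    all_rows = list(csv.reader(io.StringIO(text)))
    header_block = all_rows[preamble_lines:preamble_lines + header_rows]
    data = all_rows[preamble_lines + header_rows:]
    # Merge the header cells column by column, ignoring empty cells.
    header = [
        " ".join(cell.strip() for cell in column if cell.strip())
        for column in zip(*header_block)
    ]
    return header, data


# Hypothetical sample: two leading lines, then a two-row header.
SAMPLE = (
    "Statistics office export\n"
    "Generated for illustration only\n"
    "Region,Population,Population\n"
    ",2010,2011\n"
    "North,100,110\n"
    "South,200,210\n"
)

header, rows = read_messy_csv(SAMPLE, preamble_lines=2, header_rows=2)
print(header)
print(rows)
```

In practice the two counts differ from file to file, which is exactly why this kind of data needs curation before it can be consumed automatically.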


Hello everybody,

My name is Alex and I run an economic data website. I am interested in creating new ways to organize economic data. I have experience using Excel and Python, as well as finding economic data on the internet. I want to contribute to Core Datasets because economic data needs to be more accessible to the public.


@alexpeek1 welcome Alex, and thanks for introducing yourself! If any general questions arise around the Core Datasets effort, please just open a thread here in the forum.


Hi, all! I’m Hoony Jang, an organizer of Code for Seoul in South Korea.

I’ve uploaded some spending data and, last year, developed a project using Korean local governments’ debt data. (I participated in OK Festival 2014!) I mainly do web programming, but I love data, so I am now studying data science hard.

I heard about this project from @jgkim. There is a lot of open data in Korea, but I think not much of it relates to transparency. So I’d like to contribute to the Core Data project to improve this environment, which currently makes civic hacking hard.