For our Open Trials project, we are aiming to index and link different data sources on clinical trials, drugs, and health conditions. Toward this end, we’re looking to incorporate structured data from ClinicalTrials.gov. We know lots of work has been done on scraping ClinicalTrials.gov in the past (including by Open Knowledge). We’ve come up with the following list of past work. Does anyone have experience here? Any pitfalls to avoid?
There is a project called LinkedCT, which crawls ClinicalTrials.gov and turns its data into linked data, with links to other datasets including DrugBank, DailyMed, PubMed, Wikipedia, etc. However, I suspect the data on LinkedCT is not up to date.