I’m one of the developers working on the EveryPolitician/Democratic Commons projects, and @georgieburr asked me to explain a bit more of what we’re doing.
Getting information into Wikidata happens in one of two ways - entered manually, or entered and maintained in bulk. We agree that bulk entry is often the best path to take, but the main barrier to overcome is that of licensing - Wikidata requires that information entered be free of licensing constraints (or more specifically that it may be licensed as CC0), but in most of the world the fact that something exists on the web doesn’t guarantee this. In addition, writing scrapers or transforming data can be technically challenging, and by offering a lower barrier to entry we can support and encourage a wider range of users.
To help overcome this we’re encouraging people who have an interest in the data being correct to help maintain it in Wikidata, using a few tools we’ve built to smooth the process. The “prompts” that Georgie refers to are basically comparisons between any CSV source and the data in Wikidata - one example compares current members of the South African National Assembly. Behind the scenes this compares a Wikidata query for all current members with the output of a scraper looking at the official website. Highlighting discrepancies makes it easier to spot missing or incorrect information and then correct it, which in turn makes it easier for groups to use Wikidata as their primary data store.
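To give a feel for what such a comparison involves, here is a minimal sketch in Python. It is not the code behind the prompts themselves: it assumes both sides have already been exported as CSV with a shared `name` column, and it matches on the name alone, which a real comparison would need to be much more careful about. The function names and column name are made up for illustration.

```python
import csv
import io


def load_names(csv_text, column="name"):
    """Return the set of normalised values found in one column of a CSV document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip whitespace and ignore case so trivial differences are not reported.
    return {row[column].strip().casefold() for row in reader if row.get(column)}


def find_discrepancies(source_csv, wikidata_csv, column="name"):
    """Compare a scraped CSV with a CSV of Wikidata query results.

    Assumption: matching on a single name column is good enough for the
    illustration; it is not how a production comparison would match people.
    """
    source = load_names(source_csv, column)
    wikidata = load_names(wikidata_csv, column)
    return {
        "missing_from_wikidata": sorted(source - wikidata),
        "not_in_source": sorted(wikidata - source),
    }
```

Given a scraper output and a query result, `find_discrepancies(scraped, from_wikidata)` returns the two lists a person would then work through: entries that need adding to Wikidata, and entries in Wikidata that the official source no longer lists.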
In many cases we find that local groups have also done excellent work in creating or maintaining pages on Wikidata. Since information in Wikidata is suitably licensed, we can use these pages and the fact that Wikipedia and Wikidata are aware of each other to mass-import information. To aid this we’ve built a tool known as Verification Pages. It takes a list of things which are presumed to be true (in CSV format, so this could be the output of a scraper or an existing source), presents people with statements (such as “Joe Bloggs is a councillor for Foobury”) alongside links to something which could be a source to back each statement (such as the Foobury council website), and then asks if the two match. Where they do, the tool automatically adds the necessary data to Wikidata, meaning large volumes can quickly be checked for veracity and then entered.
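The flow described above can be sketched roughly as follows. This is only an illustration of the idea, not how Verification Pages is actually implemented: the CSV column names (`person`, `role`, `body`, `source_url`), the `Claim` class and the way answers are recorded are all assumptions made for the example, and the step that writes to Wikidata is left out entirely.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Claim:
    """One presumed-true row from the input CSV (hypothetical column names)."""
    person: str
    role: str
    body: str
    source_url: str

    def statement(self):
        # The sentence shown to the person doing the checking.
        return f"{self.person} is a {self.role} for {self.body}"


def load_claims(csv_text):
    """Turn CSV text into a list of claims to be verified."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Claim(row["person"], row["role"], row["body"], row["source_url"])
        for row in reader
    ]


def confirmed_claims(claims, answers):
    """Keep only the claims a reviewer explicitly marked as matching the source.

    `answers` maps a statement to True/False; unanswered claims are skipped.
    """
    return [claim for claim in claims if answers.get(claim.statement()) is True]
```

Only the claims returned by `confirmed_claims` would go forward to be entered, which is what lets a large list be checked quickly without anything unverified slipping through.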
Finally, where people already have found or created data which can be licensed as CC0 we’re encouraging them to load it into Wikidata, and then both maintain it and rely on Wikidata as a primary source of data. This neatly brings us on to your second question!
Getting data out of Wikidata is possible in a number of ways. As you mentioned, you can use the Wikidata API to easily access information about a specific object (such as a single politician). You can also use the Wikidata Query Service to make SPARQL queries against the data, which can return results in a number of formats (e.g. current members of the South African National Assembly in JSON).
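As a rough sketch of the Query Service route, the Python below builds a SPARQL query for people holding a given position, requests JSON from the public endpoint, and pulls out each person's ID and label. A few assumptions to flag: the position's item ID is passed in by the caller (I haven't hard-coded one here), "current" is approximated as "position held (P39) with no end time (P582)", and the function names and User-Agent string are placeholders. It isn't the exact query EveryPolitician runs.

```python
import json
import re
import urllib.parse
import urllib.request
from string import Template

ENDPOINT = "https://query.wikidata.org/sparql"

# P39 is "position held"; P582 is the "end time" qualifier.
QUERY = Template("""
SELECT ?person ?personLabel WHERE {
  ?person p:P39 ?statement .
  ?statement ps:P39 wd:$position .
  FILTER NOT EXISTS { ?statement pq:P582 ?end }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
""")


def build_url(position_qid):
    """Return the Query Service URL for holders of the given position item."""
    if not re.fullmatch(r"Q[1-9]\d*", position_qid):
        raise ValueError(f"not a Wikidata item ID: {position_qid!r}")
    query = QUERY.substitute(position=position_qid)
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})


def parse_members(response):
    """Extract (item ID, label) pairs from a SPARQL JSON results document."""
    members = []
    for binding in response["results"]["bindings"]:
        qid = binding["person"]["value"].rsplit("/", 1)[-1]
        label = binding.get("personLabel", {}).get("value", qid)
        members.append((qid, label))
    return members


def fetch_members(position_qid):
    """Run the query over the network and return the parsed members."""
    request = urllib.request.Request(
        build_url(position_qid),
        headers={"User-Agent": "example-members-script/0.1 (placeholder)"},
    )
    with urllib.request.urlopen(request) as reply:
        return parse_members(json.load(reply))
```

Calling `fetch_members` with the item ID of the relevant "member of ..." position would return a list you could write straight out to CSV or compare against another source.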
As far as EveryPolitician goes, we’ve already switched some countries to use Wikidata as their primary source of data. For example, South Africa relies exclusively on Wikidata to decide who to include as a politician. The plan is to move countries to use Wikidata as this type of source as and when the data for those countries is accurate enough - EveryPolitician will then continue to make the data easily available. Even where we don’t use Wikidata as a primary source of information, in many countries EveryPolitician also includes biographical data (such as date of birth, or gender) which is sourced from Wikidata.
Ultimately the intent is that by encouraging people to all use the same source of information, the amount of effort needed to maintain it will be shared amongst all users and similarly all will benefit. We plan to continue generating datasets from the information in Wikidata on a regular basis and storing them in GitHub for all to use with the minimum of effort.
Hopefully this answers your questions - if you want me to dive into a bit more detail on any of it, just let me know!