How would you build a responsible climate chatbot?

AI Learning Labs is partnering with the educators, students and scientists from Climate Academy and #semanticClimate to build a prototype for a chatbot that can answer questions about climate change reliably.

If you ask a mainstream chatbot a question about climate change, you can’t know for sure what it bases its answers on or whether it’s giving you misinformation. Climate Academy wishes they had something better to offer students for learning about the climate crisis.

We’re asking: How would you build an alternative? One that also considers the climate impact of AI? There’s more than one way to do this, and I bet people in this forum have suggestions. Are there projects and methods you recommend? Join us in considering options.

We’re also hosting an online roundtable on May 18 for everyone who wants to hear (or speak) more.


More context about this project for folks interested:

In terms of sourcing high-quality open data, Climate TRACE offers very detailed open data about emissions from millions of asset locations across the world. See the Data Downloads page on the Climate TRACE site to find out about the data downloads and API.

Global Energy Monitor also offers fantastic open data on the energy system and energy companies - including the Global Energy Ownership Tracker.

Global Energy Monitor are a Climate TRACE partner so this ownership data is reused on the Climate TRACE platform. Here is a recent blog post explaining how to make use of this ownership data for emitting assets.

Global Energy Monitor are a member of the Global Open Data Integration Network, which I help to convene. The network calls on organisations to use the Legal Entity Identifier (LEI) in more global datasets, so that companies are uniquely identified, interoperability improves and datasets can be linked more easily.
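For anyone curious what linking on the LEI looks like in practice, here is a minimal sketch. The LEI format itself (20 characters, the last two being ISO 7064 MOD 97-10 check digits) comes from ISO 17442. Everything else is an assumption for illustration: the field names `owner_lei`, `asset` and `owner_name`, the records, and the example identifier are all made up, and they are not the actual Climate TRACE or Global Energy Monitor schemas.

```python
import re


def _to_digits(text):
    # ISO 7064 MOD 97-10 conversion: digits stay, A=10 ... Z=35.
    return "".join(str(int(ch, 36)) for ch in text)


def lei_check_digits(prefix18):
    # Helper used only to build a syntactically valid *example* LEI.
    remainder = int(_to_digits(prefix18 + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(lei):
    # Valid when the format matches and the converted number is 1 mod 97.
    lei = lei.strip().upper()
    if not re.fullmatch(r"[A-Z0-9]{18}[0-9]{2}", lei):
        return False
    return int(_to_digits(lei)) % 97 == 1


def link_by_lei(assets, owners):
    # Join asset records to owner names on a validated LEI key.
    linked = []
    for asset in assets:
        lei = asset.get("owner_lei", "").strip().upper()
        if is_valid_lei(lei) and lei in owners:
            linked.append({**asset, "owner_name": owners[lei]})
    return linked


# Hypothetical data: an invented identifier, not a registered LEI.
EXAMPLE_PREFIX = "EXAMPLE00000000001"
EXAMPLE_LEI = EXAMPLE_PREFIX + lei_check_digits(EXAMPLE_PREFIX)

assets = [
    {"asset": "Hypothetical Power Plant A", "owner_lei": EXAMPLE_LEI},
    {"asset": "Hypothetical Power Plant B", "owner_lei": "not-an-lei"},
]
owners = {EXAMPLE_LEI: "Hypothetical Energy Co"}

linked = link_by_lei(assets, owners)
```

The point of the shared identifier is that the join needs no fuzzy matching on company names: only the first record is linked, and the malformed key is dropped rather than guessed at.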


Thanks for sharing these resources. Climate TRACE is incredible; I hadn’t seen it before. It’s visually beautiful too.

Recently I had a chance to learn more about energy topics in connection with a publication I worked on with the Green Web Foundation called State of the Fossil-Free Internet. I wish there had been more open data available from tech companies about their data centre and cloud service emissions!


We are very fortunate to have Matthew Pye and Jules Pye providing the key learning infrastructure for climate. “Climate” is so huge that no one can understand all of it, so there’s no simple answer. Matthew and his collaborators have created a system for late high school, where learners are likely to be reasonably fluent in their first language, familiar with English as a second language, and to understand the basics of numeracy.

Matthew’s “Climate Academy” combines key Climate {facts+science} with {philosophy, politics, history}. This is unusual so it’s a great opportunity to develop a simple chatbot.

My main fear about chatbots is that the general models on the web are untrustworthy. Most of the time they’ll be reasonably OK, but occasionally they’ll hallucinate. I’ve found this especially with numbers, dates, names, references, etc. We can’t afford mistakes, so we are providing guardrails using RAG.

From Wikipedia: “Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information from external data sources.[1] With RAG, LLMs first refer to a specified set of documents, then respond to user queries. These documents supplement information from the LLM’s pre-existing training data.[2] This allows LLMs to use domain-specific and/or updated information that is not available in the training data.[2] For example, this helps LLM-based chatbots access internal company data or generate responses based on authoritative sources.”

We have started by using Matthew’s book (350pp) as the RAG guardrails. The first version of the chatbot explicitly shows every paragraph in the book which contributes to the answer. In this way readers can decide how good or useful the answers are. (We expect to add feedback tools soon.)
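To make the retrieval step concrete, here is a minimal sketch of the idea, not the prototype’s actual code. It assumes the book has been split into paragraphs with stable ids, and it scores paragraphs by simple word overlap (cosine similarity over word counts), where a real system would more likely use embeddings. The function names, the paragraph ids and the paragraph texts are all placeholders I made up, not quotes from the book.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; deliberately simple.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, paragraphs, k=3):
    # Return up to k (score, paragraph_id, text) tuples, best first.
    q = Counter(tokenize(question))
    scored = [
        (cosine(q, Counter(tokenize(text))), pid, text)
        for pid, text in paragraphs.items()
    ]
    scored = [hit for hit in scored if hit[0] > 0]
    scored.sort(key=lambda hit: (-hit[0], hit[1]))
    return scored[:k]


def build_prompt(question, hits):
    # Every source paragraph carries its id, so the answer stays traceable.
    sources = "\n".join(f"[{pid}] {text}" for _, pid, text in hits)
    return (
        "Answer using ONLY the numbered sources below and cite their ids.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


# Placeholder paragraphs for illustration only.
PARAGRAPHS = {
    "ch1-p1": "Carbon dioxide traps heat in the atmosphere.",
    "ch1-p2": "Sea level rises as ice sheets melt.",
    "ch2-p1": "Climate policy is shaped by history and politics.",
}

hits = retrieve("Why does carbon dioxide trap heat?", PARAGRAPHS)
prompt = build_prompt("Why does carbon dioxide trap heat?", hits)
```

Because the paragraph ids travel with the retrieved text all the way into the prompt, the same ids can be shown next to the answer, which is what lets a reader check it against the source.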

In implementing “Responsible” I look for trust and traceability. Trust requires the original “ground truth” text. It may not be truthful in an absolute sense, but it’s persistent and accessible. Traceability means we can follow how the machine arrives at its final answer.

In the current prototype (led by Aleena Harold Peter from semanticClimate) the reply from the chatbot is supplemented by “chips” with links to the original text. There can be up to six or seven of these, so the user has a chance to see how well the chatbot is able to synthesize the info. Ultimately we hope to have readers give their own feedback.
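As a rough illustration of the chips idea (an assumption-laden sketch, not the prototype’s implementation), each retrieved paragraph can be turned into a short label plus a link back to the source text, capped at seven. The `Chip` class, the `make_chips` function, the `#paragraph-id` URL scheme and the example.org address are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    label: str  # short preview of the source paragraph
    url: str    # link back to the original text


def make_chips(sources, base_url, max_chips=7, label_len=40):
    # sources: (paragraph_id, text) pairs, already ordered by relevance.
    chips = []
    seen = set()
    for pid, text in sources:
        if len(chips) >= max_chips:
            break
        if pid in seen:
            continue  # show each source paragraph only once
        seen.add(pid)
        if len(text) > label_len:
            label = text[:label_len].rstrip() + "..."
        else:
            label = text
        chips.append(Chip(label=label, url=f"{base_url}#{pid}"))
    return chips


# Hypothetical sources and base URL, for illustration only.
sources = [(f"p{i}", "x" * 50) for i in range(10)]
chips = make_chips(sources, "https://example.org/book")
```

Capping the number of chips keeps the reply readable while still letting the reader open each source and judge how well the answer reflects it.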