- A project with Lea Berrang-Ford, Priestley Chair in Climate and Health at the Sustainability Research Institute (SRI), and the Department for International Development (DfID), looking at the mapping between the topics and locations of research papers.
- Duration: 2 weeks
- CEMAC Output – To create an interactive web interface to visualise existing natural language processing work.
- Project URL: https://cemac.github.io/DIFID/ui/
As part of the DfID project, papers are fed into a Natural Language Processing (NLP) algorithm that generates a set of weighted topics. Plotted as a graph, these topics tend to group into a scale-free structure.
The live site is here: https://apsis.mcc-berlin.net/climate-health/
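One way to picture the topic graph described above is the minimal sketch below. It assumes each paper comes with a dictionary of topic weights and links two topics whenever both are reasonably strong in the same paper. The sample data, the `min_weight` threshold and the function names are illustrative assumptions, not the project's actual pipeline.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: each paper maps topic name -> weight from the NLP step.
papers = {
    "paper_a": {"malaria": 0.7, "rainfall": 0.2, "migration": 0.1},
    "paper_b": {"malaria": 0.5, "temperature": 0.4, "rainfall": 0.1},
    "paper_c": {"heat stress": 0.6, "temperature": 0.4},
}


def build_topic_graph(papers, min_weight=0.15):
    """Link two topics when both reach min_weight in the same paper.

    The edge weight accumulates the product of the two topic weights,
    so topics that are repeatedly strong together get heavier edges.
    """
    edges = defaultdict(float)
    for weights in papers.values():
        strong = sorted(t for t, w in weights.items() if w >= min_weight)
        for a, b in combinations(strong, 2):
            edges[(a, b)] += weights[a] * weights[b]
    return dict(edges)


def degrees(edges):
    """Count how many distinct neighbours each topic has."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


graph = build_topic_graph(papers)
topic_degrees = degrees(graph)
```

On a real corpus, the degree counts returned by `degrees` could be inspected to see whether a few hub topics dominate, which is what a scale-free structure would suggest.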
A dimensionality reduction algorithm (t-SNE) achieves much the same result, placing similar papers close together. This is shown as part of the DfID app at the link above.
Within the app, items can also be displayed geographically.
The CEMAC role was to provide guidance and produce an interactive app with the following criteria:
- Visualise the global positions of each paper and study location
- Visualise the dimensionality-reduced grouping of items
- Interactive zoom, item identification and filtering
- Selecting individual continents
- Filtering using a hierarchical topic tree
- Slider to isolate items with only strong relationships to a topic
- Linking data points to a download link for each paper
- Fuzzy matching for relevant papers
- Intuitive, non-obfuscated region identification (using the t-SNE dataset)
- Using data within the format provided (no pre-processing)
- Potential scalability – the visualisation must remain responsive with millions of data points.
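
Several of the criteria above (continent selection, the topic-strength slider and fuzzy matching) come down to simple filters over the paper records. The sketch below is illustrative only: the `Paper` fields, the sample records and the function names are assumptions rather than the app's real data format, and the fuzzy matching uses Python's standard `difflib` as a stand-in for whatever the app actually uses.

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass
class Paper:
    # Hypothetical record layout; the real data format is not shown here.
    title: str
    continent: str
    topic_weights: dict  # topic name -> weight in [0, 1]


PAPERS = [
    Paper("Malaria incidence and rainfall variability", "Africa",
          {"malaria": 0.8, "rainfall": 0.2}),
    Paper("Heat stress among outdoor workers", "Asia",
          {"heat stress": 0.9, "malaria": 0.05}),
    Paper("Flooding and cholera outbreaks", "Africa",
          {"cholera": 0.6, "malaria": 0.3}),
]


def filter_papers(papers, topic, min_weight=0.0, continent=None):
    """Keep papers whose weight for `topic` reaches the slider value,
    optionally restricted to a single continent."""
    return [
        p for p in papers
        if p.topic_weights.get(topic, 0.0) >= min_weight
        and (continent is None or p.continent == continent)
    ]


def fuzzy_titles(papers, query, cutoff=0.6):
    """Return titles that approximately match `query`, best match first."""
    by_lower = {p.title.lower(): p.title for p in papers}
    hits = get_close_matches(query.lower(), list(by_lower), n=5, cutoff=cutoff)
    return [by_lower[h] for h in hits]


strong_malaria = filter_papers(PAPERS, "malaria", min_weight=0.5, continent="Africa")
matches = fuzzy_titles(PAPERS, "malaria incidense and rainfal variability")
```

A linear scan like this is fine for small collections. For millions of points, the same filters would likely need to run against pre-built indexes or in a worker, so that the interface stays responsive.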