class: top, left, inverse, title-slide

.title[
# Text Clustering using the Human Rights Index (UNHR)
]
.author[
### Lampros Sp. Mouselimis
]
.institute[
### Monopteryx
]
.date[
### 2023-08-05
Declaration of Human Rights, January 2021
]

---
class:hide_logo
# Human Rights

<br>

Human rights are moral principles or norms for certain standards of human behavior and are regularly protected in municipal and international law. They are commonly understood as inalienable, fundamental rights "to which a person is inherently entitled simply because she or he is a human being" and which are "inherent in all human beings", regardless of their age, ethnic origin, location, language, religion, ethnicity, or any other status ([Wikipedia](https://en.wikipedia.org/wiki/Human_rights)).

Human rights include the right:

* to life, liberty, and security of person
* to freedom from torture and cruel, inhuman, or degrading treatment or punishment
* to freedom of thought, conscience, and religion
* to freedom of opinion and expression
* to work and favorable conditions of work
* to education
* to an adequate standard of living

Human rights are important because they protect the dignity and worth of all human beings. They ensure that everyone has the opportunity to live a life free from fear and discrimination. They also help to create a more just and peaceful world.

---
class:hide_logo

# Text Analysis of the Human Rights Index

* One of the many data sources available on the web related to human rights is the "Universal Human Rights Index" (UHRI).
As the [official webpage](https://uhri.ohchr.org/en/) mentions, *"The users can produce overviews of recommendations by region, country, human rights themes, concerned groups and by Sustainable Development Goals (SDGs) and targets, as well as to perform text searches and advanced searches by using filters."*

<div class="figure" style="text-align: center">
<img src="images/human_rights_index_img.png" alt="Source: https://uhri.ohchr.org/en/" width="50%" />
<p class="caption">Source: https://uhri.ohchr.org/en/</p>
</div>

* The website also allows users to access the latest data from the [Human Rights Index](https://uhri.ohchr.org/en/our-data-api) website using
  * the REST API
  * a bulk download in the form of a .json or .xlsx file

To perform text processing and analysis, I downloaded the .json file dated "2023-08-03".

---

# Workflow

The following diagram shows the workflow that I followed:

<img src="diagram/diagram.png" width="100%" style="display: block; margin: auto;" />

---
class:hide_logo

# Text & Word frequencies (Bar Plot)

* From the downloaded data I used the "Text" attribute for text processing and analysis and the "Themes" attribute as an indication of the number of clusters.
* Once the word embeddings had been computed and the clustering performed, I visualized the results.
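As a rough illustration of the counting step behind such visualizations, per-cluster word frequencies could be computed along the following lines. This is a minimal Python sketch, not the code used for these slides: it assumes cluster labels have already been assigned, and the sample texts, the labels, the tiny stop-word list, and the function name `word_frequencies_per_cluster` are all made up for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real analysis would use a fuller one.
STOP_WORDS = {"the", "of", "and", "to", "in", "a", "for", "on", "that", "its"}


def word_frequencies_per_cluster(texts, labels, top_n=3):
    """Return the top_n most frequent words for each cluster label."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        # Lowercase and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts[label].update(t for t in tokens if t not in STOP_WORDS)
    return {label: counter.most_common(top_n) for label, counter in counts.items()}


# Made-up texts and cluster labels, for illustration only.
texts = [
    "Strengthen efforts to protect the rights of women",
    "Continue efforts to promote human rights education",
    "Adopt measures to combat violence against women",
    "Take measures to prevent violence and trafficking",
]
labels = [1, 1, 2, 2]

for cluster, top_words in sorted(word_frequencies_per_cluster(texts, labels).items()):
    print(cluster, top_words)
```

Each cluster's list of `(word, count)` pairs is the kind of input a bar plot or word cloud needs.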
The bar plot shows the *word frequencies* for 8 out of the total 100 clusters (or "Themes").

<img src="images/bar_plot.png" width="105%" style="display: block; margin: auto;" />

---
class:hide_logo

# Text & Word frequencies (Word Clouds)

For each of the 8 bar plots (and clusters) we see that there are *dominant* words such as:

* *rights*, *violence*, *education*, *committee*, *state*, *trafficking*, *detention*

In a similar way, we can illustrate the *word frequencies* in the form of *word clouds*:

<img src="images/wordcloud_multiplot.png" width="105%" style="display: block; margin: auto;" />

---
class:hide_logo

# Graphs of Word frequencies (Static 1)

An alternative way to explain the results and show potential connections between the *high-frequency words* is based on *Network Graph Visualizations*:

<img src="images/igraph_clusters.png" width="60%" style="display: block; margin: auto;" />

---
class:hide_logo

# Graphs of Word frequencies (Static 2)

We observe that the same high-frequency words (based on the "Text" attribute) appear in different clusters. For instance, in the following image:

* the words "efforts" and "women" are present in both "Cluster-1" and "Cluster-2". However, "Cluster-2" also includes the high-frequency words "measures" and "violence", whereas "Cluster-1" includes "rights" and "human", indicating that these two clusters represent different topics.
<img src="images/cluster_closeness.png" width="60%" style="display: block; margin: auto;" />

---
class:hide_logo

# Graphs of Word frequencies (Interactive)

Interactive network graph visualizations (similar to the next image), with additional functionality compared to the static ones, are also feasible and can be viewed in the corresponding [YouTube video](https://youtu.be/MpcqHeRzbsQ).

<img src="images/word_freq_interactive_viz.png" width="60%" style="display: block; margin: auto;" />

---
class:hide_logo

# Graph of Clusters (Interactive 1)

Besides the word frequencies (of the "Text" attribute), we can also extract information about the "Themes" attribute using the output clusters. The "Themes" attribute includes 100 unique levels, which can give an indication of the groups in the text data. To simplify the visualization we keep only high-frequency groups of Themes. The interactive visualization works in the same way as the previous one for the word frequencies.

<img src="images/themes_interactive.png" width="90%" style="display: block; margin: auto;" />

---
class:hide_logo

# Graph of Clusters (Interactive 2)

In the case of grouped themes, we have 3 themes which appear in more than 2 clusters with a different weight for each topic.
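One way to find themes that are spread over several clusters is to cross-tabulate themes against cluster labels. The Python sketch below is illustrative only and rests on several assumptions: each record carries a single theme, the "weight" of a theme in a cluster is taken to be the share of that cluster's records tagged with the theme (the slides do not define it), and the sample data and function names are hypothetical.

```python
from collections import Counter, defaultdict


def theme_weights_by_cluster(themes, labels):
    """Map each theme to {cluster: share of that cluster's records with the theme}.

    The share-based "weight" is an assumption made for this sketch.
    """
    cluster_sizes = Counter(labels)
    pair_counts = Counter(zip(themes, labels))
    weights = defaultdict(dict)
    for (theme, cluster), n in pair_counts.items():
        weights[theme][cluster] = n / cluster_sizes[cluster]
    return dict(weights)


def themes_in_many_clusters(weights, more_than=2):
    """Keep only the themes that occur in more than `more_than` clusters."""
    return {t: w for t, w in weights.items() if len(w) > more_than}


# Made-up records (one theme and one cluster label each), for illustration only.
themes = [
    "Violence against women",
    "Violence against women",
    "Violence against women",
    "Right to education",
    "Right to education",
    "Violence against women",
]
labels = [1, 2, 3, 1, 1, 3]

weights = theme_weights_by_cluster(themes, labels)
for theme, per_cluster in themes_in_many_clusters(weights).items():
    print(theme, per_cluster)
```

With real data, records that list several themes would first need to be split into one row per theme.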
More details for the grouped "Themes" are available in the [YouTube video](https://youtu.be/EhvAwdNqKYw):

.panelset[
.panel[.panel-name[Equality non discrimination]

<img src="images/equality_non_discrimination.png" width="65%" height="100%" />

]

.panel[.panel-name[Violence against women]

<img src="images/violence_against_women.png" width="85%" height="100%" />

]

.panel[.panel-name[Arbitrary arrest detention]

<img src="images/arbitrary_arest_detention.png" width="65%" height="100%" />

]
]

---

# Professional Services

<br>

If you are looking for professional help in text processing & analysis using R and Python programming, don't hesitate to contact me: [https://monopteryx.netlify.app/contact/](https://monopteryx.netlify.app/contact/)

<br><br>

### References:

* [Wikipedia human rights definition](https://en.wikipedia.org/wiki/Human_rights)
* [Universal Human Rights Index (UHRI)](https://uhri.ohchr.org/en/)
* [Human Rights Index](https://uhri.ohchr.org/en/our-data-api)
* [Python programming](https://www.python.org/)
* [R programming](https://www.r-project.org/)