Can the development of machine learning algorithms improve semantic search of human rights documents in digital rights databases?
The experiment will test the potential for human rights organisations to interact with machine-generated intelligence in their work. The grantee will investigate whether machine learning can increase the effectiveness and efficiency of human rights defenders in curating large collections of human rights documents, and enable them to make better use of data generated by collective intelligence. HURIDOCS will test algorithms to improve semantic search of human rights documents in a digital rights database covering 20 countries in the Arab League. Semantic search aims to improve search accuracy by understanding the searcher's intent and the contextual meaning of terms, so that it generates more relevant results. This means that search results will include not only documents containing the exact search term, but also documents from related fields that use different terminology to discuss the same subject.
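To make the difference between exact-term matching and semantic search concrete, here is a minimal Python sketch. It is a toy illustration only, not the method HURIDOCS will test: the `RELATED_TERMS` groups, the function names and the three sample documents are all invented for this example, and a real system would more likely learn term relationships from data than rely on a hand-written list.

```python
import re

# Hypothetical groups of related terms, written by hand for illustration.
# A machine learning system would learn such relationships from data.
RELATED_TERMS = [
    {"surveillance", "interception", "monitoring", "wiretapping"},
    {"censorship", "blocking", "filtering"},
]

# Invented sample documents, not taken from any real database.
DOCUMENTS = {
    "law_a": "Provisions on surveillance of electronic communications.",
    "law_b": "Rules governing interception of telephone and internet traffic.",
    "law_c": "Regulations on press licensing.",
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def expand_query(query):
    """Add every term from any related-term group that the query touches."""
    terms = tokenize(query)
    expanded = set(terms)
    for group in RELATED_TERMS:
        if terms & group:
            expanded |= group
    return expanded


def keyword_search(query, documents):
    """Return ids of documents that contain at least one exact query term."""
    terms = tokenize(query)
    return [doc_id for doc_id, text in documents.items() if terms & tokenize(text)]


def expanded_search(query, documents):
    """Return ids of documents that contain a query term or a related term."""
    terms = expand_query(query)
    return [doc_id for doc_id, text in documents.items() if terms & tokenize(text)]


if __name__ == "__main__":
    print(keyword_search("surveillance", DOCUMENTS))
    print(expanded_search("surveillance", DOCUMENTS))
```

In this sketch, an exact-term search for "surveillance" finds only the first document, while the expanded search also finds the second, which discusses the same subject under the word "interception".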
As people increasingly conduct their lives online, digital rights, including the right to privacy and freedom of expression, are becoming more important. But a growing number of governments are blocking access to the internet, censoring political websites and arresting digital activists. This experiment will test how well machine intelligence can help human rights practitioners compare the digital rights legislation of different countries with international law, and spot patterns and trends across the region. If the experiment is successful, it will also reduce the workload of human rights defenders and activists, freeing up more time for analysis and advocacy. This matters because the information needed to hold perpetrators of human rights violations to account is often buried in vast amounts of unstructured data, and curating it manually is time-consuming, inefficient and error-prone.
The experiment will provide new insights into how to use inputs generated through collective intelligence more effectively. These insights are relevant for anyone working with large databases that contain documents of different types or from different sources. The findings will also advance our understanding of how best to combine human expert knowledge with the capabilities of machines.