Hugging Face, the go-to platform for AI developers and researchers, has long played a pivotal role in starting and sustaining a dialogue about ethics and responsibility in AI. The open-source community relies on the platform not only for access to resources, but also for the open and inclusive space it provides for discussing how to build AI ethically, whether the models in question are textual or visual.
Key contributors to these efforts, who have been involved in various projects aimed at promoting ethical AI, include Alexandra Sasha Luccioni, Margaret Mitchell, and Yacine Jernite from Hugging Face, and Christopher Akiki from ScaDS.AI, Leipzig University.
Here are six tools hosted on Hugging Face that help researchers build AI models with ethical considerations in mind.
Diffusion Cluster Explorer
This tool was designed to investigate societal-level biases in generated image data. The demo leverages gender and ethnicity representation clusters to analyze social trends in machine-generated visual representations of professions.
The ‘Professions Overview’ tab lets users compare the distribution over identity clusters across professions for the Stable Diffusion and DALL·E 2 systems. The ‘Professions Focus’ tab provides more detail on each individual profession, including direct system comparisons and example profession images for each cluster.
In short, users can compare how identity clusters are distributed across different professions and drill into detailed information about each one. This work is part of the Stable Bias Project.
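To make the comparison concrete, here is a rough sketch of the kind of per-profession cluster breakdown the tool visualizes. The records, column names, and values below are purely hypothetical stand-ins, not the tool's actual data format.

```python
import pandas as pd

# Hypothetical records: one row per generated image, with the profession used
# in the prompt, the TTI system that produced it, and the identity cluster
# assigned to it during the clustering step.
records = pd.DataFrame({
    "profession": ["CEO", "CEO", "nurse", "nurse", "CEO", "nurse"],
    "system":     ["Stable Diffusion v2", "DALL-E 2", "Stable Diffusion v2",
                   "DALL-E 2", "DALL-E 2", "Stable Diffusion v2"],
    "cluster":    [3, 3, 7, 12, 5, 7],
})

# Distribution over identity clusters for each (system, profession) pair,
# normalized so each row sums to 1 -- analogous to the comparison the
# Professions Overview tab lets users make visually.
distribution = pd.crosstab(
    [records["system"], records["profession"]],
    records["cluster"],
    normalize="index",
)
print(distribution.round(2))
```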
Identity Representation Demo
This demo showcases patterns in images generated by the Stable Diffusion and DALL·E 2 systems. Images produced from prompts spanning various gender- and ethnicity-related terms are clustered to show how those terms shape visual representations. ‘System’ corresponds to the number of images in a cluster that come from each of the text-to-image (TTI) systems being compared: DALL·E 2, Stable Diffusion v1.4, and Stable Diffusion v2.
‘Gender term’ shows the number of images generated from prompts that used the terms man, woman, non-binary person, and person to describe the figure’s gender. Meanwhile, ‘Ethnicity label’ corresponds to the number of images from each of the 18 ethnicity descriptions used in the prompts; a blank value denotes unmarked ethnicity.
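For illustration, the sketch below mimics that setup at toy scale: prompts are built from a grid of gender terms and ethnicity labels, the resulting images (represented here by random stand-in embeddings) are clustered, and each cluster's composition is tallied. The prompt template, embedding size, and cluster count are assumptions made for the example, not the demo's actual pipeline.

```python
from itertools import product

import numpy as np
from sklearn.cluster import KMeans

# Prompt grid combining a gender term with an ethnicity label
# (a blank label leaves ethnicity unmarked); only an illustrative
# subset of the 18 ethnicity descriptions is shown here.
gender_terms = ["man", "woman", "non-binary person", "person"]
ethnicity_labels = ["", "East Asian", "Hispanic", "Black"]
prompts = [f"photo portrait of a {e + ' ' if e else ''}{g}"
           for g, e in product(gender_terms, ethnicity_labels)]

# Stand-in for embeddings of the generated images (in practice these would
# come from an image encoder); random vectors keep the sketch self-contained.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(prompts), 512))

# Cluster the images, then report which prompts land in each cluster --
# the same kind of per-cluster breakdown the demo exposes.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embeddings)
for cluster_id in range(4):
    members = [prompts[i] for i in np.where(labels == cluster_id)[0]]
    print(cluster_id, len(members), members[:2])
```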
BoVW Nearest Neighbors Explorer
This tool uses a TF-IDF index of the identity dataset images generated by the three models, built on a visual vocabulary of 10,752 visual words. Users can select a generated identity image and retrieve its nearest neighbors under this bag-of-visual-words representation.
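A minimal sketch of that retrieval scheme might look as follows; it uses random stand-in descriptors and a much smaller vocabulary than the tool's 10,752 visual words, and the libraries and parameters are illustrative choices rather than the tool's implementation.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.neighbors import NearestNeighbors

# Toy stand-ins: in a real pipeline each image yields local visual
# descriptors (e.g. SIFT/ORB patches); random vectors keep this self-contained.
rng = np.random.default_rng(0)
n_images, descriptors_per_image, dim = 200, 50, 64
descriptors = rng.normal(size=(n_images, descriptors_per_image, dim))

# 1. Learn a visual vocabulary by clustering all local descriptors.
vocab_size = 256  # the tool uses 10,752 visual words; 256 keeps this fast
codebook = MiniBatchKMeans(n_clusters=vocab_size, n_init=3, random_state=0)
codebook.fit(descriptors.reshape(-1, dim))

# 2. Represent each image as a histogram of visual-word counts
#    (the "bag of visual words").
histograms = np.zeros((n_images, vocab_size))
for i, desc in enumerate(descriptors):
    words = codebook.predict(desc)
    np.add.at(histograms[i], words, 1)

# 3. Re-weight the histograms with TF-IDF and index them for cosine search.
tfidf = TfidfTransformer().fit_transform(histograms)
index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(tfidf)

# 4. Query: find the nearest neighbors of image 0.
distances, neighbors = index.kneighbors(tfidf[0])
print(neighbors[0], distances[0].round(3))
```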
Language models
Plug-and-Play Bias Detection
As language models make their way into everyday technology, it has become imperative to adopt approaches that limit the biases these systems can exhibit. To address the issue, researchers have developed metrics such as BOLD, HONEST, and WinoBias, which quantify the proclivity of language models to generate text that may be perceived as “unfair” across a spectrum of diverse prompts.
Within the framework provided, users can select a model of their choice along with a relevant metric to run their own assessments. Generating these evaluation scores, however, solves only part of the problem. To put the resulting numbers to use, AVID’s data model comes in handy, simplifying the process of collating findings into structured reports.
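As a small example of the scoring step, the snippet below evaluates a handful of hand-written completions with the HONEST measurement from the `evaluate` library. In practice the completions would come from the model under test, and the exact argument names may differ between library versions.

```python
import evaluate

# Completions would normally be generated by the language model being audited,
# e.g. continuations of "The man worked as ..." / "The woman worked as ..." prompts.
completions = [
    ["engineer", "doctor", "manager"],   # continuations for male-subject prompts
    ["nurse", "secretary", "maid"],      # continuations for female-subject prompts
]
groups = ["male", "female"]

# HONEST measures how often completions contain hurtful terms, per group.
honest = evaluate.load("honest", "en")
result = honest.compute(predictions=completions, groups=groups)
print(result)
```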
Data Measurements Tool
This tool is still under development, and its demo showcases the dataset measurements as they are built out. Right now it comes with a few preloaded datasets for which users can (a rough sketch of similar measurements follows the list):
- view general statistics about the text vocabulary, lengths, and labels
- explore distributional statistics to assess the properties of the language
- view comparison statistics and an overview of the text distribution
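For a rough idea of what such measurements involve, the snippet below computes a few comparable statistics (text lengths, vocabulary size, label counts) on a public dataset. The `imdb` dataset is used here only as a convenient example and is not necessarily one of the tool's preloaded datasets.

```python
from collections import Counter

from datasets import load_dataset

# Load a small slice of a public text-classification dataset.
ds = load_dataset("imdb", split="train[:2000]")

# General statistics: lengths, vocabulary, and label distribution.
lengths = [len(text.split()) for text in ds["text"]]
vocab = Counter(word for text in ds["text"] for word in text.lower().split())
labels = Counter(ds["label"])

print("examples:", len(ds))
print("mean length (words):", round(sum(lengths) / len(lengths), 1))
print("vocabulary size:", len(vocab))
print("label distribution:", dict(labels))
print("top words:", vocab.most_common(5))
```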
Fair Diffusion Explorer
Here the researchers introduce a novel strategy for mitigating biases in generative text-to-image models after deployment. The approach, as demonstrated, involves deliberately adjusting biases based on human instructions. Empirical evaluations underscore that this capability allows generative image models to be instructed on the principles of fairness, without the need for data filtering or additional training.
For full details, see the paper by Friedrich et al.
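A rough sketch of this kind of instruction-based debiasing, assuming the `diffusers` library's SemanticStableDiffusionPipeline, is shown below; the checkpoint and editing parameters are illustrative, and argument names may vary across library versions.

```python
import torch
from diffusers import SemanticStableDiffusionPipeline

# Load a Stable Diffusion checkpoint behind the semantic-guidance pipeline.
pipe = SemanticStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")  # assumes a CUDA GPU is available

# Generate a firefighter portrait while steering the output away from the
# "male person" concept and towards "female person" -- no data filtering
# or additional training involved.
out = pipe(
    prompt="a photo of the face of a firefighter",
    guidance_scale=7,
    editing_prompt=["male person", "female person"],
    reverse_editing_direction=[True, False],  # push away from the first concept
    edit_warmup_steps=[10, 10],
    edit_guidance_scale=[4, 4],
    edit_threshold=[0.95, 0.95],
    edit_momentum_scale=0.3,
    edit_mom_beta=0.6,
)
out.images[0].save("fair_firefighter.png")
```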