YouTube needed first-principles guidance for how inclusive AI image generation should behave by default, not as a post-hoc filter on model output. The studio ran a cross-cultural depth-interview study: six ninety-minute in-depth interviews (IDIs), anchored by cultural immersions in Lagos, Jakarta, and Mumbai, covering six markets (USA, Brazil, Japan, Nigeria, Indonesia, and India). The work produced a five-force framework for the evolving representation landscape and principle-level guidance across four representation dimensions. Homogeneous, white-centric defaults were rejected in every market studied. The findings now inform responsible AI image-generation direction.
Scoped the six markets, recruited US-based participants with representative backgrounds, and defined a cross-cultural discussion guide grounded in representation rather than raw accuracy.
Across all six markets, participants rejected AI imagery that defaulted to a single majority look. The reaction held from the USA to Lagos, Jakarta, and Mumbai, and across race, gender, and body dimensions.
Participants expected AI imagery to reflect a balanced picture of the local population rather than over-indexing on a global majority. The baseline is locality-aware, not universally uniform.
Race and ethnicity were the most visible axis, but attire, age, body, and ability carried nearly equal weight in whether an image read as genuinely representative.
The five-force framework (Compounding Complexity, Shifting Demographics, Ideals vs Reality, Visual Representation, AI and Inclusion) described representation as an active field rather than a static specification.
Participants understood that AI inherits biases from its training data. They expected product teams to treat generation defaults as a design decision carrying responsibility downstream, not as a technical output.
A cross-cultural framework for inclusive AI image generation that held across six markets.
The work gave responsible AI teams a research-backed framework for reasoning about representation defaults. The five-force landscape and the four-dimension principle set reframed inclusive generation as an active design responsibility rather than a post-hoc filter on model output.