Beijing, 1976. A girl is born into a family that is educated, aspirational, and living through the end of the Cultural Revolution, a period in which those two qualities, education and aspiration, had been actively dangerous. The country is beginning, cautiously, to reopen. The possibilities that had been foreclosed for a generation are beginning to edge back into view.
The girl’s name is Fei-Fei Li. Her parents are both educated — her mother a teacher, her father an engineer — and they have specific hopes for their daughter, hopes that the Cultural Revolution had forced them to defer for years. Education. Achievement. A life that would not be confined by the accidents of history.
Three decades later, Fei-Fei Li has completed a PhD at Caltech, where she studied the mathematics of how biological visual systems process information, and she has begun to think about a project that everyone she describes it to tells her is impossible.
She is going to build the largest labelled image dataset in the history of artificial intelligence. She is going to call it ImageNet. And she is going to change the world.
Beijing to New Jersey: The Immigration Story
Fei-Fei Li’s story begins with migration — the specific, difficult migration of a Chinese family to the United States in the early 1990s, with limited English, limited money, and unlimited ambition for their daughter.
She was sixteen when her family moved to Parsippany, New Jersey, in 1992. Her father, a former engineer who had been unable to work in his profession during the Cultural Revolution, arrived in the United States essentially starting over. Her mother found work wherever she could. Fei-Fei, younger than most of her classmates and working with English that was improving but not yet fluent, navigated an American high school while also working in a Chinese restaurant and later helping to run a laundromat to help support the family.
The immigration experience was both a burden and a gift. It was a burden in the obvious sense — the language difficulties, the financial strain, the social displacement of being new and foreign in an environment that had its own established social hierarchies. It was a gift in the sense that it made visible, with unusual clarity, the contingency of the circumstances that people typically take for granted. Someone who has seen how dramatically circumstances can change — whose parents lived through a revolution that reshaped an entire society — has a different relationship to the assumption that things will continue as they are.
Li absorbed from this experience a specific relationship to ambition and to possibility. She did not take for granted the advantages that American education offered. She did not assume that intelligence and hard work would automatically produce the outcomes they were supposed to produce. She knew that circumstances mattered, that access mattered, that the path from potential to achievement was not smooth or guaranteed. This knowledge would shape how she thought about AI — specifically about who benefited from AI and who did not, about whose data was used to train AI systems and whose perspectives were embedded in them.
She studied physics at Princeton, graduating in 1999. The choice of physics reflected a specific orientation — toward fundamental questions, toward the mathematical description of how things worked at a basic level, toward the kind of rigorous, quantitative understanding that required real intellectual commitment. She was not primarily interested in AI as engineering — in building things that worked. She was interested in AI as science — in understanding how visual intelligence worked, in the mathematical description of perception.
Caltech and the Neuroscience of Vision
Her PhD at Caltech, completed in 2005, was awarded in electrical engineering, but the research behind it sat in computational neuroscience: the application of mathematical models to understanding how biological visual systems processed information. The questions she worked on were at the intersection of neuroscience and computer vision. How did the visual cortex represent objects? What were the computational principles that allowed the visual system to recognise objects despite enormous variation in appearance?
This background in computational neuroscience shaped her approach to machine vision in specific and important ways. She approached the problem not as an engineer trying to build a system that worked but as a scientist trying to understand how visual recognition was possible — what the underlying principles were, why biological visual systems were so much better than machine vision systems, and what those principles implied for how machine vision systems should be designed.
The neuroscientific perspective led her to think about what biological visual systems had that machine vision systems lacked. The answer, as she developed it through her doctoral research and into her early faculty career, was not primarily architectural — it was experiential. Human visual systems were extraordinarily capable not because of the specific structure of the visual cortex but because they had been exposed, from birth, to an enormous quantity and diversity of visual experience. The representations that supported human visual recognition had been shaped by millions of hours of looking at the world.
Machine vision systems in the early 2000s had been trained on thousands of images — orders of magnitude less than human visual experience. Of course they were brittle. Of course they failed on images that differed from their training data. They had seen nothing. The solution was not primarily to develop better algorithms. The solution was to provide machine vision systems with something more like the visual experience that biological visual systems had.
This was the insight that would eventually become ImageNet. But getting from insight to project required navigating a research culture that was not convinced the insight was important.
The Resistance: Why Everyone Said No
When Li began describing her ImageNet idea to colleagues and potential collaborators in 2006 and 2007, the response was consistently discouraging.
The objections were numerous and, on the surface, reasonable.
The scale objection. Building a dataset of millions of labelled images was unprecedented in computer vision research. The widely used benchmark datasets of the time, such as PASCAL VOC and Caltech-101, contained thousands to tens of thousands of images. Building a dataset two to three orders of magnitude larger seemed not just ambitious but implausible. Where would the images come from? How would the labelling be done? Who would pay for it?
The approach objection. The prevailing view in computer vision was that data was less important than algorithms — that the way to advance the field was to develop better feature representations, better classifiers, better learning algorithms, not to collect more data. The argument that the field was data-limited rather than algorithm-limited was not the consensus view. Collecting more data seemed like a brute-force approach that missed the subtlety of the underlying problem.
The evaluation objection. Academic research valued carefully controlled experiments in which the specific variables of interest were isolated and systematically varied. A dataset built from the internet — messy, varied, inconsistently photographed — seemed hard to use for such experiments. The messiness that Li saw as a feature of the dataset was, for more traditional researchers, a methodological problem.
The career objection. Building a large dataset was not the kind of contribution that advanced academic careers in the way that publishing papers in top venues did. The review process for research funding and academic promotion rewarded algorithmic innovations — novel theoretical insights, new learning methods, improved benchmark results. The years of work required to build ImageNet were years not spent publishing the papers that counted for academic advancement.
Li was a junior faculty member at Princeton when she began the ImageNet project, and she carried it with her to Stanford when she moved there in 2009. The career risk of devoting significant effort to a project that the field was not convinced was important was real. Senior colleagues who advised her about her career trajectory were not encouraging.
She proceeded anyway.
The Project: Three Years of Painstaking Work
The initial version of ImageNet took approximately three years to build, from the beginning of serious work in 2007, through the dataset’s first presentation at CVPR in 2009, to the first ILSVRC competition in 2010.
The work was, in its details, unglamorous. It involved developing automated systems for downloading images from the internet, cleaning and filtering the downloads, designing the labelling tasks, managing the Amazon Mechanical Turk labelling process, quality-controlling the resulting labels, and dealing with the many specific problems — inconsistent image quality, noisy labels, copyright issues, category ambiguities — that arose in the course of building a dataset at unprecedented scale.
The Mechanical Turk methodology that Li and her collaborators developed — paying crowdworkers small amounts per label, using multiple workers per image for redundancy, building quality control systems to identify unreliable workers — was itself an innovation. No one had previously used crowdsourcing at this scale for scientific data collection. The methodology required careful design to ensure that the resulting labels were accurate enough to be useful for training machine learning systems.
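The redundancy idea described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline the ImageNet team actually ran: the function name `aggregate_labels`, the `min_agreement` threshold of 0.6, and the sample votes are all hypothetical. The sketch takes a majority vote per image, discards images on which workers disagree too much, and then scores each worker by how often they matched the accepted label.

```python
from collections import Counter, defaultdict


def aggregate_labels(votes, min_agreement=0.6):
    """Majority-vote aggregation over redundant crowd labels.

    votes: iterable of (image_id, worker_id, label) tuples.
    min_agreement: hypothetical threshold; an image is accepted only if
        at least this fraction of its workers chose the winning label.
    Returns (consensus, worker_agreement).
    """
    by_image = defaultdict(list)
    for image_id, worker_id, label in votes:
        by_image[image_id].append((worker_id, label))

    # Step 1: accept a label for an image only when enough workers agree.
    consensus = {}
    for image_id, answers in by_image.items():
        counts = Counter(label for _, label in answers)
        label, n = counts.most_common(1)[0]
        if n / len(answers) >= min_agreement:
            consensus[image_id] = label

    # Step 2: score each worker against the accepted labels only.
    agree = defaultdict(int)
    total = defaultdict(int)
    for image_id, answers in by_image.items():
        if image_id not in consensus:
            continue
        for worker_id, label in answers:
            total[worker_id] += 1
            agree[worker_id] += int(label == consensus[image_id])
    worker_agreement = {w: agree[w] / total[w] for w in total}
    return consensus, worker_agreement


# Made-up votes: three workers label three images.
votes = [
    ("img1", "a", "cat"), ("img1", "b", "cat"), ("img1", "c", "dog"),
    ("img2", "a", "dog"), ("img2", "b", "dog"), ("img2", "c", "dog"),
    ("img3", "a", "cat"), ("img3", "b", "dog"), ("img3", "c", "bird"),
]
consensus, worker_agreement = aggregate_labels(votes)
print(consensus)         # img3 is dropped: no label reaches the threshold
print(worker_agreement)  # worker "c" agrees with consensus less often
```

A worker whose agreement score stays low across many images is a candidate for review or exclusion. A production system would need more than this, for example known-answer control images and different amounts of redundancy for easy and hard categories.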
Li later described the experience of building ImageNet in a memoir and in interviews as one of the most intellectually demanding and emotionally challenging periods of her career. The scale of the project, the resource constraints, the absence of institutional support for what she was doing, and the persistent scepticism of colleagues created an environment in which continuing to believe the project was worth doing required specific acts of will.
She was sustained by the conviction that she was right: that data mattered as much as algorithms, that the field was being held back by the poverty of its training datasets, that building ImageNet would eventually transform what machine vision could do. This conviction was not shared by most of her colleagues, but it rested on the scientific argument she had developed from her neuroscience background.
The project was also sustained by a specific research culture that Li cultivated among her students and collaborators. The people who worked with her on ImageNet — Jia Deng, Alexander Berg, Li-Jia Li, and others — shared the commitment to the project and the willingness to do the unglamorous work that building the dataset required.
The First ILSVRC and the Gradual Vindication
The first ImageNet Large Scale Visual Recognition Challenge, held in 2010, attracted twelve teams. The results were modest by the standards of what would follow — the best systems achieved top-5 error rates of approximately 28% — but the competition served its intended purpose: it created a common benchmark that focused the computer vision community’s attention on the large-scale recognition problem.
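For readers unfamiliar with the metric, top-5 error counts a prediction as correct if the true label appears anywhere among a system’s five highest-ranked guesses. The short Python sketch below shows the calculation on made-up data. The function name `top_k_error` and the sample labels are illustrative assumptions, not the official ILSVRC evaluation code.

```python
def top_k_error(ranked_predictions, true_labels, k=5):
    """Fraction of examples whose true label is absent from the top-k guesses.

    ranked_predictions: one list per example, best guess first.
    true_labels: the correct label for each example, in the same order.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked prediction list per true label")
    misses = sum(
        truth not in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labels)
    )
    return misses / len(true_labels)


# Made-up predictions for four images, each with five ranked guesses.
predictions = [
    ["tabby", "tiger cat", "lynx", "fox", "dog"],
    ["beagle", "basset", "foxhound", "terrier", "pug"],
    ["teapot", "cup", "vase", "pitcher", "bowl"],
    ["canoe", "kayak", "paddle", "raft", "dock"],
]
truths = ["lynx", "whippet", "teapot", "kayak"]

top5 = top_k_error(predictions, truths, k=5)  # only "whippet" is missed
top1 = top_k_error(predictions, truths, k=1)  # only "teapot" is ranked first
print(top5, top1)
```

The more forgiving top-5 measure suits a dataset with many fine-grained and overlapping categories, where an image can plausibly carry more than one label.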
The 2011 ILSVRC attracted more teams and showed improvements over 2010. The progress was real but incremental. Li’s team was preparing for the 2012 competition when they learned about a team from the University of Toronto that was planning to submit a deep convolutional neural network.
The AlexNet result, a top-5 error of 15.3% against a best result of approximately 26% the previous year, was the vindication of ImageNet as a benchmark and of the data-driven approach that Li had been arguing for. It showed that at ImageNet’s scale, deep learning approaches could reveal capabilities that had not been visible on smaller datasets. The dataset Li had spent three years building was the enabling condition for the breakthrough that transformed the field.
The vindication was not without complexity. Li was the architect of ImageNet and of ILSVRC, but AlexNet — the specific system that demonstrated ImageNet’s enabling power — was the work of Hinton’s group at Toronto. The relationship between dataset creator and the researchers who used the dataset to achieve the breakthrough is not straightforward in terms of credit. Li created the conditions for the breakthrough; others achieved the breakthrough given those conditions. Both contributions were essential.
Li’s own account of the AlexNet moment is generous and accurate: she saw it as a vindication of the ImageNet project and of the data-driven philosophy, not as a personal achievement competing with Hinton’s group’s achievement. The breakthrough was a product of the intersection between the dataset she had built and the approach that Hinton’s group had developed.
Stanford AI Lab and the Research Programme
After the AlexNet breakthrough transformed computer vision, Li’s research group at Stanford became one of the most productive and most influential in the field. The combination of her neuroscientific perspective, her data-driven approach, and the credibility that the ImageNet project had given her attracted talented students and significant research funding.
The research directions she pursued in the years following AlexNet reflected a coherent intellectual agenda: using deep learning as a tool for understanding both machine vision and biological vision, exploring the connections between the two, and pushing machine vision toward capabilities that were closer to what human visual understanding achieved.
Visual question answering. One of the most interesting problems Li’s group worked on was visual question answering — the task of answering natural language questions about images. (“What colour is the umbrella?” “How many people are in the room?” “Is the dog sitting or standing?”) This task required the integration of visual understanding and language understanding in ways that pushed beyond pure classification or detection.
Image captioning. Generating natural language descriptions of images — translating visual content into text — was another direction Li’s group pursued. The task required a deep connection between the visual content of an image and the vocabulary and grammar of natural language description.
Visual Genome. Li’s group built the Visual Genome dataset, a large collection of images annotated with detailed descriptions, question-answer pairs, and region-level annotations. Visual Genome was designed to provide the kind of rich, structured visual knowledge that would support more sophisticated visual understanding than classification or detection alone could provide.
These research directions reflected a consistent theme: pushing machine vision toward genuine understanding rather than just benchmark performance, toward the kind of visual comprehension that humans had rather than the narrow accuracy on specific tasks that classification metrics measured.
The Google Chapter: Chief Scientist, Real Impact
In 2017, Li accepted a position as Chief Scientist of AI/ML at Google Cloud — a two-year leave from Stanford to work on how AI could be deployed at scale to create genuine value.
The Google role was different from anything Li had done before. It required engaging with AI not as a research problem but as a deployment problem — understanding the specific challenges of making AI capabilities accessible and useful to the enormous range of customers and use cases that Google Cloud served.
The experience was formative in specific ways. She saw, at close range, the gap between AI research performance on benchmarks and AI performance in real-world deployment. She saw the ways that AI systems trained on specific data distributions failed when deployed in contexts with different distributions. She saw the difficulty of explaining AI system behaviour to non-expert users who needed to understand and trust the systems they were using.
She also saw the potential — the genuine, large-scale value that AI could provide when deployed thoughtfully on real problems. Healthcare applications where AI could help address the shortage of specialists in under-resourced settings. Educational applications where AI could provide personalised learning at scale. Environmental applications where AI could support monitoring and analysis of ecosystems.
The Google chapter reinforced her conviction that the central challenge in AI was not technical — it was the challenge of connecting technical capability to genuine human benefit, of ensuring that AI development was oriented toward the people it would affect rather than toward the interests of those who controlled the technology.
She returned to Stanford in late 2018, bringing from the Google experience a more concrete understanding of the deployment challenges and a more acute sense of the governance questions that the rapid advance of AI capability was raising.
Human-Centered AI: The Institute and the Idea
Li’s most important institutional contribution after ImageNet was the co-founding of the Stanford Institute for Human-Centered Artificial Intelligence (HAI) in 2019. HAI was designed as a response to what Li saw as a fundamental gap in how AI was being developed and deployed.
The gap was this: AI development was being driven primarily by technical researchers focused on capability — on making AI systems better at specific tasks — without adequate attention to the social, ethical, and governance questions that increasingly capable AI systems raised. The field lacked the interdisciplinary institutions and the research culture that would be required to ensure that AI development served broad human interests rather than narrow ones.
HAI was designed to fill this gap. It brought together researchers from computer science, social science, medicine, law, education, and other disciplines to work on the intersection of AI and human values. It funded research on the social and economic impacts of AI, on governance and regulation, on the specific applications where AI could create the most benefit for the most people.
The HAI model was explicitly interdisciplinary — Li believed that the questions raised by AI were too important and too complex to be answered by technical researchers alone, and that genuine progress on these questions required the integration of social science, ethics, law, and domain expertise alongside technical AI research.
HAI also took seriously the question of who participated in AI development and who was represented in it. The field had a well-documented diversity problem: computer science broadly and AI specifically were dominated by men, and by researchers from a small number of countries and institutional backgrounds. The perspectives embedded in AI systems (the assumptions, biases, and value judgments built into the data, the architectures, and the objectives) reflected the perspectives of the people who built them. If those people were not diverse, the perspectives built into the systems would not be diverse either.
This connection between diversity and technical quality — the argument that who builds AI matters for what AI does — was a specific and important contribution of Li’s approach to the human-centered AI agenda. It was not primarily an ethical argument, though it had ethical dimensions. It was a technical argument: AI systems trained on data generated by non-diverse populations, built by non-diverse teams, optimised for objectives defined by non-diverse groups, would be less capable and less reliable for the full diversity of users they were deployed to serve.
The Research on AI and Society
Through HAI and through her own research group, Li engaged systematically with the empirical questions about AI’s social impact that the field was generating.
Algorithmic bias. The discovery that AI systems trained on historical data could perpetuate and amplify historical biases — in facial recognition, in loan approval, in medical diagnosis — became one of the most actively discussed issues in AI ethics. Li engaged with this issue both as a research problem (how do you detect and mitigate bias in AI systems?) and as a governance problem (what standards and regulations are needed to ensure that AI systems do not discriminate?).
Healthcare AI. One of Li’s most sustained research interests was the application of AI to healthcare — both the specific technical problem of building capable medical AI systems and the broader question of how to deploy such systems equitably. She was acutely aware of the irony that AI could simultaneously improve healthcare access for some populations while making it worse for others, if the systems were trained on data that underrepresented certain groups or if the deployment infrastructure was inaccessible to under-resourced healthcare settings.
AI and the labour market. The question of how AI would affect employment — which jobs would be automated, what new jobs would be created, who would benefit and who would suffer from the transition — was one of the most socially significant questions raised by AI’s rapid advance. Li engaged with this question empirically, through HAI research on the labour market impacts of AI, and politically, through advocacy for education and training programmes that would help workers navigate the transition.
AI governance. The governance question — how should AI development be regulated, what standards should be required, what international frameworks were needed — was one that Li engaged with consistently through her public work. She contributed to government reports, testified before legislative bodies, and participated in international forums on AI governance. Her combination of technical credibility and policy engagement made her one of the most effective technical experts in the governance conversation.
The Book: The Worlds She Has Seen
In 2023, Li published a memoir titled “The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI.” The book was unusual for a technical researcher of her standing — memoirs from AI scientists are rare, and the genre typically attracts either popularisers or people at the end of their careers reflecting on completed work.
Li’s memoir was neither. It was a mid-career reflection by a person who was still actively shaping the field, written from a perspective that was personal and intellectual simultaneously. It traced the arc from Beijing to New Jersey to Princeton to Caltech to the ImageNet project to Google to HAI, and it used that arc to develop a specific argument about what AI was for and what it required.
The argument was implicit in the structure as much as explicit in the text: that the technical work of AI research was inseparable from the human context in which it happened, that the questions AI could answer were inseparable from the questions the people who built it were asking, and that building AI responsibly required the kind of broad perspective — across cultures, across disciplines, across the gap between the privileged institutions of elite AI research and the communities that AI would most affect — that no single researcher had but that the field needed to cultivate collectively.
The memoir was received warmly, both by the general audience it was addressed to and by the technical community that recognised in it an honest account of what AI research actually felt like from the inside — the frustrations, the persistence, the uncertainty, the moments of genuine discovery.
The Privacy Problem Li Helped Create
An honest account of Li’s career must address the privacy problems associated with ImageNet — the discovery that the dataset contained images of people who had not consented to be included, and that some categories in the dataset reflected harmful biases and offensive labels.
The privacy concerns were fundamental to how ImageNet was built. The images in ImageNet were downloaded from the internet without the explicit consent of the people who appeared in them. The images were publicly posted — on social media, on personal websites, on image hosting services — but public posting does not constitute consent for inclusion in an AI training dataset that would be used to train commercial AI systems deployed to billions of users.
The category structure of ImageNet, derived from WordNet’s lexical taxonomy, included some categories with offensive labels — labels that reflected cultural biases and historical attitudes that were not appropriate for inclusion in a scientific dataset. The crowdsourced labelling process, conducted at scale with workers who were not trained in sensitivity to such issues, amplified rather than corrected these problems.
When researchers documented these problems in 2019, Li and the ImageNet team responded by removing the most problematic categories and images. The response was substantive and relatively quick by the standards of how institutions typically respond to such discoveries. But the response also raised broader questions: if these problems existed in ImageNet, a dataset built by a careful, thoughtful researcher at a prestigious institution, what did they imply for the much larger datasets being assembled by commercial AI companies with less careful governance?
Li has engaged honestly with these questions in her subsequent work. The privacy and bias problems in ImageNet are part of her account of what she built and what it implied — not hidden or minimised but acknowledged as genuine failures that the field needs to take seriously.
The honest account is: Li built something enormously important and enormously impactful. She built it with the best methodological practices available at the time, with genuine attention to quality and accuracy. The dataset contained problems that she did not anticipate and that reflected limitations of the methodology and the field’s understanding at the time. When the problems were identified, she addressed them. This is the record of a researcher who is genuinely trying to do good work in good faith, not a record of negligence or indifference.
The Advocacy: AI as a Public Question
One of the most distinctive features of Li’s career is the sustained engagement with AI as a public question — as a matter of concern for society broadly, not just for the technical researchers who build AI systems.
This engagement took many forms. She wrote op-eds in major newspapers, not just technical papers in academic journals. She testified before Congress about AI governance, bringing technical expertise to a policy conversation that was sometimes dominated by people with limited technical understanding. She participated in international forums — at the World Economic Forum, at the United Nations — where AI was being discussed as a geopolitical and governance issue. She gave TED talks that reached audiences far beyond the academic research community.
The advocacy was grounded in a specific conviction: that the decisions being made about how AI was developed and deployed were too important to be left entirely to the technical researchers and technology companies that were closest to the technology. These decisions — about who AI was built for, whose data was used to train it, how its benefits were distributed, how its harms were prevented — were fundamentally political and social decisions that required democratic engagement.
This conviction put Li in a specific position in the AI governance conversation. She was more technically credible than most policy advocates, because she had built ImageNet and had spent her career on the frontier of AI research. She was more policy-engaged than most AI researchers, because she had invested years in understanding the governance questions alongside the technical ones. The combination gave her unusual influence in conversations that typically failed to bridge the technical and policy dimensions.
The Scientific Legacy: What ImageNet Built
The scientific legacy of Li’s ImageNet project is clear, extensive, and still growing.
The deep learning revolution. The AlexNet result that transformed computer vision, and through computer vision the entire AI research agenda, was enabled by ImageNet. Without ImageNet, the deep learning revolution would probably still have happened, but later and differently. The pace at which computer vision was transformed, from well below human performance to better than human performance on the ILSVRC classification benchmark in approximately four years, was a function of the scale and quality of the ImageNet dataset.
Transfer learning as a paradigm. The discovery that features learned on ImageNet transferred effectively to other visual tasks established transfer learning as the dominant approach to building computer vision systems. This discovery changed the economics of AI research by making capable AI systems achievable with much smaller task-specific datasets than would otherwise be required. The pre-training paradigm that dominates large language model research is a direct descendant of the ImageNet transfer learning paradigm.
The competition as research infrastructure. The ILSVRC competition model — common dataset, clear evaluation metric, public leaderboard — became the template for competitions in computer vision, NLP, and other AI domains. The competition infrastructure accelerated progress in each domain by focusing research attention, enabling comparison, and creating the competitive dynamics that drove rapid innovation.
Medical AI. The applications of deep learning to medical imaging — the systems for detecting diabetic retinopathy, for classifying chest X-rays, for reading mammograms — were enabled by ImageNet pre-training. The medical AI industry that has emerged from these applications represents one of the most significant real-world impacts of the deep learning revolution.
The science of AI. Beyond the specific applications, ImageNet contributed to the scientific understanding of what made AI systems work — the relationship between training data scale and performance, the transferability of learned features, the importance of data diversity for generalisation. These scientific contributions have shaped AI research methodology in ways that extend far beyond computer vision.
The Human Legacy: The Students and the Culture
Beyond the scientific contributions, Li’s most important legacy may be the students she trained and the culture she cultivated.
The doctoral students and postdoctoral researchers who worked with Li at Stanford have gone on to influential positions throughout the AI field — at major technology companies, at universities, in AI policy organisations. The specific combination of technical rigour, data-driven empiricism, and engagement with social and ethical questions that Li modelled in her own research has influenced how a generation of AI researchers approaches their work.
The culture at Li’s lab was notable for several features that were not universal in AI research. It was explicitly welcoming to women and to underrepresented groups in ways that many AI labs were not. It connected technical work to social impact in ways that many technically-focused labs did not. It engaged with the ethics of AI research — the privacy implications of large datasets, the bias problems in training data — in ways that many labs avoided.
These cultural features were not separate from the scientific work. Li’s conviction that diversity mattered — that the perspectives embedded in AI systems reflected who built them — was connected to her commitment to building a lab that was itself diverse. And the lab’s engagement with ethics was connected to her conviction that technical researchers had a responsibility to think about the implications of what they were building.
The Complexity of Impact: Intended and Unintended
Any honest account of Fei-Fei Li’s career must grapple with the complexity of impact — the reality that the most important interventions in a fast-moving field often have consequences that were not intended and could not be fully anticipated.
ImageNet enabled the deep learning revolution. The deep learning revolution has produced enormous benefits — in medical imaging, in translation, in scientific discovery, in the accessibility of information and services that AI makes possible. It has also produced significant costs and risks — the concentration of AI power in a small number of companies, the displacement of workers whose jobs have been automated, the use of AI for surveillance and manipulation, the algorithmic biases that have caused specific harms to specific communities.
Li did not intend these costs and risks when she built ImageNet. She intended — and largely achieved — the advancement of visual AI research. The broader consequences of that advancement were not in her control and could not have been fully anticipated.
This is a general feature of transformative scientific contributions: the people who make them are responsible for what they contribute, but they cannot be held responsible for every use and misuse of it. Newton cannot be blamed for nuclear weapons simply because his gravitational theory contributed to the physics that eventually produced them. Likewise, Li cannot be blamed for every application of the deep learning systems that ImageNet enabled.
What Li can be — and should be — held to account for is her response to the consequences she did not intend. And her response has been to take those consequences seriously — to engage with the privacy problems in her own dataset, to advocate for AI governance that could manage the broader risks, to build HAI as an institution specifically designed to connect technical capability to human values.
The response is not perfect. No response to consequences of this scale could be perfect. But it is genuine, sustained, and substantive — the response of a person who is taking seriously the responsibility that comes from having contributed to something enormously important.
Fei-Fei Li and the Future
Fei-Fei Li is still in the middle of her career. The work she has done — ImageNet, HAI, the advocacy for human-centered AI — has been important, but she has described it herself as the beginning of a longer project, not its completion.
The project, as she has articulated it, is to ensure that the development of AI is oriented toward the full diversity of human needs and human values, not just toward the interests of the organisations and individuals who currently control AI development. This requires technical work — building AI systems that are more equitable, more interpretable, more aligned with diverse user needs. It requires institutional work — building the organisations and the governance frameworks that can ensure AI is developed responsibly. And it requires cultural work — changing the culture of AI research so that questions of ethics, equity, and impact are integral to the scientific work, not external additions.
This is ambitious and possibly not fully achievable. The forces that concentrate AI development in powerful organisations, that orient AI systems toward commercial objectives, that limit participation in AI research to those with access to elite institutions — these forces are strong and have structural causes that no single person or institution can overcome.
But the project is worth pursuing. And Li’s career demonstrates that one person, with the right insight and the right persistence, can make a difference that shapes the trajectory of an entire field.
She assembled fourteen million labelled images when everyone told her it was impossible. She built the benchmark that made the deep learning revolution possible. She went to Google and saw how AI was deployed at scale. She came back to build an institution designed to connect AI capability to human values. She wrote a memoir that connected her personal story to the story of AI’s development.
All of this from a woman who arrived in New Jersey at sixteen, speaking limited English, working in a restaurant to help her parents pay the bills.
The worlds she has seen — Beijing in the Cultural Revolution’s aftermath, Parsippany in the 1990s, Princeton and Caltech, Stanford and Silicon Valley, the corridors of the Capitol and the halls of Davos — have all shaped what she builds and how she builds it. The perspective they provide is not incidental to her scientific work. It is essential to it.
Further Reading
- “The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI” by Fei-Fei Li (2023) — Her memoir. Accessible, personal, and essential for understanding the human dimensions of the ImageNet story.
- “ImageNet: A Large-Scale Hierarchical Image Database” by Deng, Dong, Socher, Li, Li, and Li (2009) — The original ImageNet paper. The technical foundation.
- “ImageNet Large Scale Visual Recognition Challenge” by Russakovsky et al. (2015) — The comprehensive ILSVRC account, documenting the competition’s progression and impact.
- “Human-Centered AI” (the HAI website at hai.stanford.edu) — Li’s ongoing institutional project, describing the research programme and the vision that animates it.
- “Large-Scale Video Classification with Convolutional Neural Networks” by Karpathy, Toderici, Shetty, Leung, Sukthankar, and Li (2014) — An example of Li’s research group’s extension of the ImageNet approach to video, showing the breadth of the research programme she led.
Minds & Machines: The Story of AI is published weekly.