california forever or, the aesthetics of ai images
25 February 2024 by kazys
(note: due to the limitations of Substack, this post is truncated, but you can read the original in full at https://varnelis.net/california-forever-or-the-aesthetics-of-ai-images/)
This past August (2023), a new urban project called “California Forever” was announced, promising a walkable city for 400,000 in Solano County, not far from the Bay Area. Critics soon pointed out several flaws in the renderings the company distributed. First, even though the venture was backed by billionaire Silicon Valley investors such as Laurene Powell Jobs (Steve Jobs’s widow), Reid Hoffman (LinkedIn co-founder), Michael Moritz (former partner at Sequoia Capital), and Marc Andreessen (co-author of Mosaic and Netscape co-founder), the project looked profoundly retardataire. Instead of a high-tech city next to the world’s tech capital, the renderings depict a new urbanist fantasy with American flags and children on old-fashioned bicycles. Where has our imagination gone? How is it that Archigram’s fifty-five-year-old Instant City still looks fresher than this recycled Americana? Neom and the Line are terrible, but at least they show an interest in doing something new.
The founders are your typical older tech investors: their imaginative days are long behind them and, having been glued to computer screens their entire lives, it’s hard to imagine they have many original thoughts left. A drive around Silicon Valley is enough to show the banality of the tech industry’s vision. Some of them may have read Christopher Alexander’s A Pattern Language, which has had a large influence on software development, and thus become interested in the New Urbanist movement his writing spawned. There are no architects listed among the team, although it does include a planner who worked on Culdesac Tempe, a moderately interesting, if boring, car-free development. The rendering indicates a “contextual” approach derivative of San Francisco, with a variety of windows and townhouse shapes to break up the massing since somebody told them to do that. The architecture is barely there, its utter banality indicating how little it matters. The end result will likely be even more disappointing. But I am more interested in the problems with the renderings that other critics, such as the San Francisco Chronicle’s Chase DiFeliciantonio, observed: “A girl pedaling a bicycle with a missing foot. An asymmetrical airplane. An impossible ladder.” (link). The renderings, as the California Forever team eventually admitted, were made with an Artificial Intelligence image generator, apparently Midjourney.
More than one friend asked me to weigh in as I have been working with Midjourney and other AI image generators for some time now, exploring a critical approach to AI image generation, investigating the properties and problematics of the medium itself. If California Forever is so backwards-looking, why are images created by image generators also so banal? Hot women (lots and lots of hot women), fan service art, gaudy hyperrealistic landscapes, cringe anime, and bad cartoons are the order of the day (for examples, check out the feed for the Midjourney gallery). Why this junkscape of imagery? Why is AI imagery not more worthy of our future? Why is it that so much of what is commonly called AI “art” is kitsch?
In part this is because users of AI image generators fancy themselves artists even though few of them have any art training. The same is common in photography. Wealthy individuals purchase camera gear based on reviews claiming that some camera or lens has greater technical abilities to reproduce reality faithfully and then apply complicated methods to ensure that their photographs demonstrate technical proficiency. High Dynamic Range (HDR) photography is the leading example of this. Popular with amateurs with no aesthetic training, HDR is an attempt to capture a scene in which the range of luminance exceeds the dynamic range of the camera sensor, and often even the human eye itself. The results typically have too much detail in the shadows, dark skies, and unnatural colors: the hyperrealistic effect of an acid trip.
These sorts of photographers, along with individuals who produce digital illustrations for consumption on platforms like Artstation and DeviantArt, 3D printing enthusiasts, makers, indie musicians working with samplers and synthesizers, vloggers creating content for YouTube, gamers streaming on Twitch and YouTube, and fashion enthusiasts showcasing their work on social media are “prosumers,” a term coined by futurist Alvin Toffler in his 1980 book The Third Wave. Toffler’s “prosumer” merges the roles of producer and consumer, suggesting a shift in the economy and society. In this model, individuals are not only consumers of products and services but also take on an active role in their production. This concept was revolutionary at the time, predicting the rise of customization, personalization, and participatory culture facilitated by technological advancements, particularly in digital technology and the Internet.
At the same time, prosumers largely create kitsch, characterized by an appeal to popular tastes and a frequently derivative nature. Kitsch thrives in environments where production is geared towards mass appeal and immediate consumption rather than nuanced artistic merit or innovation. For traditional modernist critics, such as Clement Greenberg, kitsch represented the antithesis of genuine culture and the avant-garde. Kitsch, Greenberg explained in his seminal 1939 essay “Avant-Garde and Kitsch,” is produced by industrialization, designed to satisfy the tastes of the least discerning audience without intellectual or emotional challenges. Greenberg associated kitsch with the replication of traditional art forms and aesthetics, but emptied of genuine meaning or complexity, offering immediate gratification rather than enduring value or depth. Greenberg:
The peasants who settled in the cities as proletariat and petty bourgeois learned to read and write for the sake of efficiency, but they did not win the leisure and comfort necessary for the enjoyment of the city’s traditional culture. Losing, nevertheless, their taste for the folk culture whose background was the countryside, and discovering a new capacity for boredom at the same time, the new urban masses set up a pressure on society to provide them with a kind of culture fit for their own consumption. To fill the demand of the new market, a new commodity was devised: ersatz culture, kitsch, destined for those who, insensible to the values of genuine culture, are hungry nevertheless for the diversion that only culture of some sort can provide.
Kitsch, using for raw material the debased and academicized simulacra of genuine culture, welcomes and cultivates this insensibility. It is the source of its profits. Kitsch is mechanical and operates by formulas. Kitsch is vicarious experience and faked sensations. Kitsch changes according to style, but remains always the same. Kitsch is the epitome of all that is spurious in the life of our times. Kitsch pretends to demand nothing of its customers except their money-not even their time.
Clement Greenberg, “Avant-Garde and Kitsch,” 1939
With the rise of postmodernism, however, both artists and critics revalued the role of mass culture. Initially, this was done with the knowing wink that reinterpreted kitsch as camp. Andy Warhol, Roy Lichtenstein, Philip Johnson, Stanley Tigerman, Robert Venturi and Denise Scott Brown, and the Hairy Who, followed by John Waters and David Lynch, Jeff Koons, and Pierre et Gilles, were among the many artists who, by bracketing the degraded, ironically reframed kitsch as art. In her 1964 essay “Notes on ‘Camp’,” later published in the book Against Interpretation and Other Essays, Susan Sontag flipped the valence on kitsch, valorizing camp as an aesthetic sensibility that found beauty in artifice, exaggeration, and theatricality. Camp, for Sontag, is the love of the unnatural: of artifice and exaggeration. It is a mode of enjoyment, of appreciation, not judgment. Camp is the good taste of bad taste, a celebration of the extravagant and the absurd, but with a nuanced affection that discerns quality within the ostensibly tasteless. Sontag nevertheless contrasted camp with kitsch, which she viewed less favorably. Kitsch, for Sontag, is associated with mass-produced art or objects that lack sophistication and are designed to appeal to popular or uncritical taste. The critical difference, as Sontag and others have implied, lies in the intentionality and reception: camp involves a conscious, nuanced embrace of excess and irony, whereas kitsch is earnest, unironic, and often pandering to sentimental or lowbrow tastes.
In 1983, theorist Fredric Jameson concluded that the thorough permeation of culture by capital (and vice versa, once the techniques of the avant-garde were embraced by commercial art) meant the end of a distinction between mass culture and art, thus producing postmodernism. Indeed, by the 1980s, the distinction between camp and kitsch had been thoroughly blurred. If John Waters was camp, were the B-52s? If Adam Ant and Boy George were camp, were Van Halen and Bon Jovi? If the cover of Sgt. Pepper’s Lonely Hearts Club Band was a masterpiece of camp, what about the cloying song “Wonderful Christmastime” by ex-Beatle Paul McCartney? Perhaps the ultimate end of any distinction between camp and kitsch came in John Chase’s brilliant 1982 Exterior Decoration: Hollywood’s Inside-Out Houses, in which Chase explored the unique architectural vernacular of West Hollywood’s do-it-yourself remodels, transformations that turned ordinary stucco bungalows into distinctive visual statements, often applying historicizing elements traditionally found indoors to the exteriors. Adding to this is the rise of the art museum store, which in the 1980s transformed from a bookstore selling scholarly books as well as the odd postcard and reproduction to include a wider range of items, such as jewelry, toys, and even furniture inspired by the museum’s collection and exhibits by commercially popular (and generally kitsch) artists like Yayoi Kusama, KAWS, Banksy, Damien Hirst, Jeff Koons, Keith Haring, and Shepard Fairey. Seeing the museum store as a crucial source of revenue, museums now regularly think about the tie-ins between exhibitions and “merch.”
In a recent (paywalled so don’t bother to look for it unless you want to pay $30) essay, “Digital Kitsch: Art and Kitsch in the Informational Milieu,” Domenico Quaranta discusses the emergence of “digital kitsch,” which he calls “the default mode for all creative endeavors with digital media.” This is a provocative position, but he leaves it undertheorized. There is little question that the vast majority of cultural production today is kitsch, just as it was in the nineteenth or twentieth century, but that does not mean that it is the default mode or that somehow digital tools produce nothing but kitsch. Now the artists of the Net.Art movement, as promoted on Rhizome.org and various mailing lists since the 1990s, not only embraced kitsch, they saw its manipulation as their primary concern. But this is a typical case of mistaking what is being heavily promoted by the art market for what is worth looking at. There are few writers, photographers, or musicians who do not employ digital media in some way today, but that does not automatically make them kitsch. I don’t see William Basinski, Katie Paterson, Paul Prudence, or Guy Dickinson, to name only a few artists whose work I admire, as kitsch, even though they work with digital media or have web sites (Paterson does, on occasion, purposefully engage with kitsch, but certainly not in most works). Moreover, to somehow suggest this is a digital trend is reductive: painting or classical music are more likely than not to be kitsch today, as those art forms have largely exhausted themselves, subject to endless, academicized retreads.
One can certainly still produce works of sophistication today, but it does require effort. If one abandons the Hegelian exploration of art’s proper object, embraces politics as the sole cause of art, or turns to the academician’s fatal poison, the knowing disdain of snark, it can be virtually impossible. Blindly searching for the new is a long-dead end as well. Architect Eric Moss endlessly repeated Ezra Pound’s dictum “make it new” (none of us think he knew who Pound was, let alone that this was his phrase), but that did not elevate his work above kitsch. Instead, as I detail in my essay “On Art and the Universal,”
[A Greenbergian] revival, however, should begin with a call for art to investigate itself again, not merely play to political activism for the sake of theater. The task at hand is to discern the proper object of knowledge for art, a fulcrum upon which we can rest our research. Or, if not the proper object, a proper object that would be suitable for investigation and productive of knowledge.
In that essay, I suggest that a serious proper object for AI art would be to explore the intertextuality of all artwork, using it to access the collective cultural subconscious. But this is not what AI image generators are designed for. On the contrary, the engineers programming AI image generators know that, generally speaking, they do not need to engage with art history, but rather with the imagery commonly found on the Internet, imagery that is “scraped” to create training data for AI image generators.
Writer Andy Baio investigated (see here) the training data for AI image generator Stable Diffusion, data composed of sets of English-captioned images from the nonprofit Large-scale Artificial Intelligence Open Network (LAION), particularly a set of images called LAION-Aesthetics. These are subsets of the massive LAION datasets, selected by what LAION calls “lightweight [AI] models” that “predict the rating people gave when they were asked ‘How much do you like this image on a scale from 1 to 10?’” (see here). These subsets were then used for fine-tuning of AI image generators. Academics have droned on, as they will, about AI image generators’ biases toward producing stereotypically beautiful young white or Asian women. Of course such biases exist, just as Internet searches are biased toward the United States. We live in a global monoculture. There is nothing good about it and I don’t endorse such biases, but there is also no revelation here: this is a lazy analysis pandering to political positions held by individuals of simple minds, an observation about as instructive as suggesting that poor people are disadvantaged in society. Training data reflects society and all its flaws. Just this past week, we saw what an utter catastrophe training AI image generators to artificially incorporate diversity in their results can be in what Zvi Mowshowitz calls the “Gemini Incident,” with black Nazis, female NFL quarterbacks, and Asian Viking warriors (this is not really that new: ChatGPT’s DALL-E 3 does the same sort of tuning, albeit slightly less egregiously, as this dump of the initial prompt, which I have independently verified, shows). What is deeply weird, however, is that AIs are being trained to produce images based on a selection of images chosen not by humans but by AI judges that predict which images humans will judge as aesthetically superior. It’s the return of Komar and Melamid, as robots.
A large number of the illustrations in these image generators seem to be digital in origin, betraying a clear preference for work produced for consumption on the Net. Baio analyzed some 12 million images in the LAION-Aesthetics v2 6+ dataset. His conclusion is worth quoting at length instead of paraphrasing or summarizing:
Nearly half of the images, about 47%, were sourced from only 100 domains, with the largest number of images coming from Pinterest. Over a million images, or 8.5% of the total dataset, are scraped from Pinterest’s pinimg.com CDN.
User-generated content platforms were a huge source for the image data. WordPress-hosted blogs on wp.com and wordpress.com represented 819k images together, or 6.8% of all images. Other photo, art, and blogging sites included 232k images from Smugmug, 146k from Blogspot, 121k images were from Flickr, 67k images from DeviantArt, 74k from Wikimedia, 48k from 500px, and 28k from Tumblr.
Shopping sites were well-represented. The second-biggest domain was Fine Art America [editor’s note: nothing on that site qualifies as fine art], which sells art prints and posters, with 698k images (5.8%) in the dataset. 244k images came from Shopify, 189k each from Wix and Squarespace, 90k from Redbubble, and just over 47k from Etsy.
Unsurprisingly, a large number came from stock image sites [editor’s note: virtually nothing on these sites qualifies as fine art]. 123RF was the biggest with 497k, 171k images came from Adobe Stock’s CDN at ftcdn.net, 117k from PhotoShelter, 35k images from Dreamstime, 23k from iStockPhoto, 22k from Depositphotos, 22k from Unsplash, 15k from Getty Images, 10k from VectorStock, and 10k from Shutterstock, among many others.
It’s worth noting, however, that domains alone may not represent the actual sources of these images. For instance, there are only 6,292 images sourced from Artstation.com’s domain, but another 2,740 images with “artstation” in the caption text hosted by sites like Pinterest.
Andy Baio, “Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator,” https://waxy.org/2022/08/exploring-12-million-of-the-images-used-to-train-stable-diffusions-image-generator/
Subject matter aside, certain aesthetic qualities emerge from these sources, qualities that both the robots choosing the training sets and the engineers tuning them seem to share. First, there is hyperrealism. To succeed, engineers creating image generators need to engage the prosumer market, constantly announcing better resolution, faster processing times, and greater “realism.” But realism, as we have learned from Roland Barthes, is always coded. In the case of AI image generation, realism is coded by existing visual regimes, but these are less art historical, more technical and related to the mass imagery found on the Internet. A certain aspect of this recalls the photorealistic rendered “graphics demo” images from the 1960s to the 1990s as well as graphically sophisticated first-person video games from the 2000s and 2010s. At the time, these were evaluated by their technical proficiency with complicated graphical techniques, such as rendering reflections on curved surfaces or complicated, multi-source lighting effects, and success with these criteria still codes as realistic. Second, there is the legacy of hyperrealistic “photorealism” as interpreted by the HDR photographers described above. Being popular, HDR is judged as high quality by the models, so it is promoted in data sets. Finally, there is a clear bias toward prosumer art, in particular the fantasy “concept art” found on the net, anime, and the fandom graphics found on sites such as DeviantArt.
But there are also other, formal qualities that initially may be harder to pin down, most notably a certain distinct use of luminosity. Thus, a prompt for “Emma Watson” (a commonly used test of how realistic an image generator was in 2022, used as such because of some clear preference for Emma Watson in either the data set or the fine-tuning of the AIs) does not present the actress in a photograph, but rather creates an illustration of the sort that a skilled digital artist would produce with a program such as Procreate.

With the spread of AI image generators, it also became common to add certain modifiers to the end of prompts to create “better” AI images. Individuals claiming to be successful prompt engineers would write articles like “The Ultimate Midjourney Cheat Sheet,” promising “to provide you with a comprehensive guide on leveraging Midjourney prompts to create stunning visuals effortlessly.” Such guides reported that modifiers such as “32-bit,” “HDR,” and “8K” produced excellent results, or rather, visual cocaine: oversharpened, highly saturated images, much like the demo or “vivid” settings on HDR televisions that are intended to seduce consumers in electronics stores, not to deliver accurate images. Other modifiers such as “cinematic,” “stunning,” “shot on medium format,” and “masterpiece” were intended to somehow coax AIs into producing better quality. Famously, “style of Greg Rutkowski” seemed to be appended to nearly every image prompt in mid-2022. Exactly what it did was unclear, but somehow suggesting that the output should be like that of a commercial fantasy artist was seen as a good thing.
But the over-use of luminosity is the most curious quality. Why is Emma Watson facing the sunrise or sunset? The only commonly used modifier I can think of in AI production that might account for it is “golden hour,” referring to the warm light found right after sunrise and right before sunset, which articles tell amateur photographers is when the best images can be taken. So where might the sense of luminosity come from? Baio’s article confirmed an intuition I had had earlier: the number one artist in the sample of LAION-Aesthetics that he examined is Thomas Kinkade, the “Painter of Light.” Kinkade is certainly among the most well-known artists in the country, producing kitsch, expressly commercial art made for a mass market.
A Northern California native who grew up in Placerville, some 180 miles from Silicon Valley, Kinkade studied art at the University of California, Berkeley and the Art Center College of Design in Pasadena. After a brief time working in the film industry, he became a born-again Christian and set off to paint landscapes consisting of backward-looking subject matter intended to be evocative of a peaceful life, a traditional cottage or house in an idealized American scene often featuring bucolic gardens, streams, stone cottages, lighthouses, or the main street in a small town. Strangely, people are either absent from Kinkade’s paintings or, on the occasion when they are present, are isolated passersby, seemingly disconnected from each other, more fitting of Edward Hopper than, say, Gustave Caillebotte. It’s as if his scenes happen in another reality, perhaps the afterlife.
… This is the limit of what can be sent in Substack. You will need to go to https://varnelis.net/california-forever-or-the-aesthetics-of-ai-images/ to read the conclusion.
Kazys- I appreciate this article’s ability to reach from New Urbanism to AI image generator—which even today, I’m still not sure about. I might be the only dinosaur here, but I don’t really believe that we’re quite there yet in terms of real art from image generators. Thanks for sharing.