Hassan Ragab shares his recent exploration of the intersection between art and architecture through the use of Midjourney and Stable Diffusion. As we approach a year that is expected to showcase the implications of advanced artificial intelligence, Hassan reflects on the past few months, which have seen numerous concepts, images, practices, and apps emerge, potentially shaping a new creative era in art, architecture, and design. While OpenAI may have been the first to introduce a coherent AI text-to-image generator (DALL-E 2), Midjourney and Stable Diffusion made significant strides towards the end of 2022, with several updates that have not only transformed how these tools function but also influenced what their users create. The impact of these developments on both our digital and physical future could be tremendous. It’s worth noting, however, that these generators were not created specifically for architecture and design.
Midjourney (MJ) and Stable Diffusion (SD) have become incredibly popular among architects, drawing in thousands of users daily. However, architects make up only a small fraction of the overall user base for these AI models. This may be because both MJ and SD focus purely on artistic aesthetics, having been trained on artistic datasets. Nevertheless, many architects and designers use these tools extensively to create new concepts and visualizations that suit their preferences and backgrounds. This usage has its limitations, as the tools do not let architects visualize their ideas through their preferred medium (3D models) or in their own architectural vocabulary. Despite these challenges, working with Midjourney and Stable Diffusion offers architects an exciting opportunity to combine their ideas with an ocean of concepts that hardly existed in contemporary architectural practice. Working in historical contexts, however, poses challenges of its own.
Artificial intelligence models possess biases that cannot be ignored, but Hassan Ragab is adapting to this reality. While experimenting with Islamic and Pharaonic architecture using Midjourney, DALL-E 2, and Stable Diffusion in July 2022, he noticed that the resulting images lacked photorealism but were still beautiful due to their mystic ambiguity. Working with historical contexts, however, presents many challenges, especially with Midjourney V4. Although this model recognizes some historical places better than older models do, it struggles to integrate other concepts with historical architecture, and the results are often blurry images with fuzzy details. With more contemporary Western architecture, the results are significantly better: the outputs follow the prompt more closely, with fewer surprises and less variation.
Midjourney’s developers have been working towards giving users greater control over their results, focusing on creating the most photorealistic outputs possible while staying faithful to user prompts. This effort has increased usage of MJ, thanks to its ease of use and impressive results. However, it appears that photorealism may come at the cost of creativity and ambiguity. The outputs of the latest version, V4, differ only minimally from one another and lack the interesting variety found in older models. While there are tools intended to counter this, such as the chaos parameter and the Remix tool, they are not enough to make up for the older models’ unique variations. Overall, Midjourney strives to offer a stable and democratic medium for users to produce high-quality outputs.
Hassan Ragab notes that while Midjourney is a popular text-to-image generator, many enthusiasts have also incorporated Stable Diffusion’s open-source model into various tools. One advantage of SD is that it is free to use on your own machine. Although SD may lack the mobility, speed, user-friendliness, and aesthetics of Midjourney, it offers more control over outputs and includes additional features that are not available in MJ, such as inpainting/outpainting, animations, and design software plugins.
As we continue to explore the potential of AI in architecture and design, it is becoming increasingly clear that these tools will have a major impact on our field in the near future. Early experiments with text-to-3D, text-to-BIM, and video conversion are already underway, and some argue that these tools are driving designers towards new and unprecedented aesthetics. However, new tools are being developed at such an alarming rate that it is difficult to keep up with them, let alone analyze them. People from various backgrounds have demonstrated great adaptability in using these tools to enhance their professional practices. It is vital to maintain an open mind and a balanced approach to new technologies in our work, recognizing that these tools are too powerful to be used without scrutiny and realistic expectations. We should embrace these unstoppable technologies in our workflow while staying aware of the risks and consequences of either ignoring them or relying on them blindly. Ultimately, the communities that adopt and embrace these emerging technologies will determine the core visual principles of our future digital and physical realities.