Or – AI for construction and the impact of Generative Pre-trained Transformer 3
Someone told me that the way to write a successful LinkedIn article was to think ‘clickbait’ and come up with a crazy, off-the-wall headline to draw readers in.
Hey-presto! It has worked – because you are here. Thank you for starting to read this article.
Don’t go! What follows is about cool machine learning stuff and the architecture, engineering and construction (AEC) sector. And all of the words in the headline feature below.
The trouble with technology is that we struggle with trends. We think about how good a system is now, not its potential for the future. Take Thomas Watson, the President of IBM back in 1943, who said he thought that there was a world market “for maybe five computers”.
So it is with machine learning. It is not such a big deal for the AEC sector at the moment. But that will change. It really, really will!
McKinsey’s 2018 report [1] – Artificial Intelligence: Construction technology’s next frontier – pointed to project schedule optimisers, image recognition systems and enhanced analytics platforms as examples of AI use cases in construction, adding that “take up is quite low…particularly compared with other industries”. (Quelle surprise! My bookshelf overflows with McKinsey reports saying that construction is towards the bottom of various league tables for productivity, digitisation…this, that and the other. They probably have construction near the bottom of the Isthmian Premier League as well.)
But as they say in the small print at the bottom of investment ads, the past is not a reliable guide to the future. Something happened last summer with machine learning that should make the industry – all industries, including AEC – sit up and take notice. That something was the launch of GPT-3.
Generative Pre-trained Transformer 3 is a new natural language processing (NLP) model developed by OpenAI, the AI research laboratory that was co-founded by Elon Musk in 2015. It has ten times more trainable parameters (175 billion) than its nearest rival, Microsoft’s Turing NLG system. This is a huge step forward in terms of power.
An NLP model is trained on an enormous body of text, so that it begins to learn probabilistic connections between words. To put it another way, it ‘learns’ to read and write. GPT-3 appears to read and write very well, be it computer code, poetry or prose. Having had the opportunity to play with the OpenAI interface last July, New York Times commentator Farhad Manjoo [2] described what GPT-3 produced as not just “amazing”, “spooky” and “humbling”, but also “more than a little terrifying”.
To give a sense of what it can do, Gwern Branwen has created a showcase for GPT-3 writing [3]. He gives the system a well-known poem and asks it to continue and write another verse in the same style and with meaning. Here is what GPT-3 came up with for an (additional) fourth verse to Edward Lear’s ‘The Owl and the Pussycat’:
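To make ‘probabilistic connections between words’ a little more concrete, here is a toy sketch in Python. It is emphatically not how GPT-3 works internally (GPT-3 is a very large neural network); it simply counts which word follows which in a tiny, made-up training text and turns those counts into probabilities. The function names and the sample sentence are my own illustrative choices.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follow-on counts for one word into probabilities."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {w: n / total for w, n in followers.items()}


# Tiny illustrative 'training set' (an assumption, not real training data).
corpus = "the owl and the pussycat went to sea in a beautiful pea green boat"
model = train_bigrams(corpus)
probs = next_word_probabilities(model, "the")
print(probs)
```

In this toy text, ‘the’ is followed once by ‘owl’ and once by ‘pussycat’, so each gets an equal share of the probability. A real model does something far richer over billions of words, but the principle of learning what is likely to come next is the same.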
I think that this is scary at a number of levels!
Poetry-writing like this is an example of a ‘one-shot’ task. That is, in addition to the task description, the model sees a single example of the task – “write a final verse to this poem in a similar style to these other verses”. This is in contrast to a ‘zero-shot’ task, where the model is just asked to do something – “write a poem in the style of Edward Lear”. Or a ‘few-shot’ task, where the model would get several examples of the task in addition to the description.
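The difference between the three comes down to how many worked examples sit in the prompt alongside the task description. The sketch below shows that in Python. The prompt layout (‘Input:’ / ‘Output:’), the function names and the building-element examples are all assumptions of mine for illustration; they are not an official OpenAI format.

```python
def build_prompt(task_description, examples=(), query=""):
    """Assemble a plain-text prompt: description, worked examples, then the new input."""
    parts = [task_description.strip()]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The final block is left open for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


def shot_type(examples):
    """Name the prompting style by the number of worked examples supplied."""
    n = len(examples)
    if n == 0:
        return "zero-shot"
    return "one-shot" if n == 1 else "few-shot"


# Hypothetical task and examples, purely for illustration.
task = "Classify each building element as structural or non-structural."
examples = [
    ("load-bearing wall", "structural"),
    ("suspended ceiling tile", "non-structural"),
]

zero_shot = build_prompt(task, [], "steel column")
one_shot = build_prompt(task, examples[:1], "steel column")
few_shot = build_prompt(task, examples, "steel column")
print(one_shot)
```

The task description and the final question are identical in all three prompts; only the number of examples in between changes.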
Accuracy vs number of model parameters for various Zero / One / Few shot tasks [4]
What the chart shows is that, with its much larger number of trainable parameters, GPT-3 has made a huge step forward in the accuracy of ‘one-shot’ tasks. The power of one-shot learning is enormous. It is like showing someone how to swing a tennis racquet only once, and they can go on to compete at Wimbledon [5].
This should make everyone in the AEC world sit up. Particularly as it is now not just text output that GPT-3 is producing. In January 2021, the DALL-E system, which uses the GPT-3 model, showed that it was able to generate a variety of drawings and pictures based on simple text prompts. The system was asked something like – “draw an armchair in the shape of an avocado; an armchair imitating an avocado”. [6] The result was the images in the header to this article.
Now think of designing in Revit. If you ‘show’ a GPT-3-type system an object, it should gain an ‘understanding’ of the object almost straightaway. This is a great example of a ‘one-shot’ task. The understanding may come from ‘reading’ the underlying script or code. Or, in time, as GPT-3-type systems are adapted to the spatial language of design, they will be able to ‘read’ geometry, object-type and scheme directly from the model.
At the moment, if you want to find mirrored doors in your model, you need to write a 20-line Python script. A GPT-3-type system would allow you to ‘show’ it a mirrored door object, and it would then go and find the others in the model. (“Siri, find me the mirrored doors in my model” sort of thing.)
And DALL-E – or something similar – would be able to create all sorts of objects and start to optimise them. (“Siri, design me a façade system that looks like a bird’s nest.”)
Last summer, Tessa Lau, Founder and CEO at Dusty Robotics, asked the question – who was using GPT-3 to generate Revit models? The answer (judging by the lack of answers) was – not many. But that does not mean that integration between Revit models and NLP-type systems such as GPT-3 is not coming.
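For a sense of what that script boils down to today, here is a minimal sketch in Python. To keep it runnable outside Revit, it uses a hypothetical stand-in `Door` class and a made-up list of doors rather than the real Revit API. In Revit itself, my understanding is that you would collect the door instances with a `FilteredElementCollector` and test each family instance’s `Mirrored` property, but treat that mapping as an assumption to check against the API documentation.

```python
from dataclasses import dataclass


@dataclass
class Door:
    """Hypothetical stand-in for a door instance in a building model."""
    element_id: int
    name: str
    mirrored: bool  # assumed to play the role of a 'mirrored' flag on the instance


def find_mirrored(doors):
    """Return only the doors whose mirrored flag is set."""
    return [door for door in doors if door.mirrored]


# Made-up sample data, purely for illustration.
doors = [
    Door(101, "Single-Flush 900", False),
    Door(102, "Single-Flush 900", True),
    Door(103, "Double-Glass 1800", False),
    Door(104, "Single-Flush 800", True),
]

mirrored_ids = [door.element_id for door in find_mirrored(doors)]
print(mirrored_ids)
```

The filtering itself is trivial. Most of a real script is the scaffolding around it: connecting to the model, collecting the right category of elements and reporting the results. That scaffolding is exactly what a ‘show me one and find the rest’ interface would hide.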
But there are some people doing some very interesting things with machine learning in AEC, right now.
nCircle Tech [7] have developed a ‘Scan-to-BIM’ system that uses machine learning to ‘understand’ and enrich the point cloud model with type information and turn it into BIM objects that can be used in Revit. In the future, using a DALL-E-type approach, such a system may be able to learn the shape of things in the world and design new elements and objects for us.
Jonathan Ingram, the ‘father’ [8] of 3D parametric design, who created Sonata and Reflex in the 1980s and 1990s, systems which led to Revit, is also working with machine learning and on new ways for designers to interact with their models.
“Currently we are using AI in a number of ways in the design, optimisation and layout of retail stores and products. We are currently investigating DALL-E/GPT-3 as it brings a new range of possibilities to the interface to the design process, in particular. In terms of building design, I look forward to being able to say to my designer avatar – ‘Create a building on this site with this footprint in the style of Norman Foster.’ We cannot do that yet, but we soon will. To make this possible, we need to develop a sorely needed new interface to design and for BIM.”
Jonathan Ingram (the ‘father’ of BIM)
It would be kind of cool to end this article by saying that it was written by a GPT-3 model. Not true. But the next one from me might be. O’Fuffle! O’Fuffle! Fuffle-ee!
References:
[1] ‘Artificial Intelligence – Construction technology’s next frontier’ – www.mckinsey.com/business-functions/operations/our-insights/artificial-intelligence-construction-technologys-next-frontier
[2] ‘How Do You Know a Human Wrote this?’ – www.nytimes.com/2020/07/29/opinion/gpt-3-ai-automation.html
[3] Gwern Branwen website – www.gwern.net/GPT-3
[4] ‘OpenAI Unveils 175 Billion Parameter AI Model’ – www.medium.com/@Synced/openai-unveils-175-billion-parameter-gpt-3-language-model-3d3f453124cd
[5] ‘Impact of GPT-3 on AEC Industry’ – https://www.techrfi.com/2020/08/23/impact-of-gpt3-on-aec-industry/
[6] ‘This Avocado Armchair could be the Future of AI’ – www.technologyreview.com/2021/01/05/1015754/avocado-armchair-future-ai-openai-deep-learning-nlp-gpt3-computer-vision-common-sense/
[7] nCircle Tech website – www.ncircletech.com
[8] Jonathan Ingram Wikipedia page – https://en.wikipedia.org/wiki/Jonathan_Ingram#:~:text=Sometimes%20called%20the%20’Father%20of,building%20in%20a%20single%20file.