Large Language Model | Exponential Industry

🖨️ AI and 3D printing: Ai Build’s Daghan Cam and Luke Rogers on simplifying large-format 3D printing with AI

📅 Date: August 2, 2023

🔖 Topics: Additive Manufacturing, Large Language Model

🏢 Organizations: AI Build, KUKA, Meltio, Evo 3D, Massive Dimension, Boeing, Weir Group

Ai Build has already partnered with a number of leading 3D printer hardware manufacturers, including Hans Weber Maschinenfabrik, Meltio, KUKA, Evo 3D, CEAD, and Massive Dimension. Through these partnerships, the company incorporates a wide range of large-format 3D printers into its Ai Lab workshop. Here, the hardware is used to test, develop, verify, and integrate Ai Build’s software for a growing range of applications. Whilst Cam could not disclose too many names, global engineering solutions firm Weir Group and aerospace manufacturer Boeing were pinpointed as key customers employing AiSync software.

Ai Build’s key product is its AiSync software, an AI-driven toolpath optimization and quality control platform. Regarding toolpath optimization, it was announced earlier this year that Ai Build had developed a process which allows users to create advanced 3D printing toolpaths using natural language prompts. This feature, called Talk to AiSync, allows users to input simple text, such as “slice the part with 2mm layer height.” This text is then translated into machine instructions to produce the desired 3D printed part.

Large language models are key to this feature. AiSync uses OpenAI on the back end, with GPT-4 running the software’s natural language processing. “With the addition of large language models, we are able to translate simple English words, plain sentences, into a stack of workflow that we create on our software,” explained Cam. “The goal is to make it super accessible to inexperienced users by making the user experience really smooth.”
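
To make the idea of turning a plain-text prompt into a structured instruction concrete, here is a minimal Python sketch. It is not AiSync's implementation or API: a regular expression stands in for the language model, and the `SliceCommand` type and `parse_prompt` function are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class SliceCommand:
    """Hypothetical structured instruction; not an AiSync data type."""
    operation: str
    layer_height_mm: float


# Matches phrases such as "2mm layer height" or "0.8 mm layer height".
_LAYER_HEIGHT = re.compile(r"(\d+(?:\.\d+)?)\s*mm\s+layer\s+height", re.IGNORECASE)


def parse_prompt(prompt: str) -> SliceCommand:
    """Rule-based stand-in for the language-model step (an assumption)."""
    match = _LAYER_HEIGHT.search(prompt)
    if "slice" not in prompt.lower() or match is None:
        raise ValueError(f"unsupported prompt: {prompt!r}")
    return SliceCommand(operation="slice", layer_height_mm=float(match.group(1)))


command = parse_prompt("slice the part with 2mm layer height")
print(command)
```

A real system would presumably let the model map far more varied phrasing onto a larger set of operations; the point here is only that the output of the language step is a machine-readable instruction that downstream toolpath code can act on.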

Retentive Network: A Successor to Transformer for Large Language Models

📅 Date: July 17, 2023

✍️ Authors: Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

🔖 Topics: Retentive Network, Transformer, Large Language Model, Generative AI

In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection between recurrence and attention. Then we propose the retention mechanism for sequence modeling, which supports three computation paradigms, i.e., parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism. The recurrent representation enables low-cost O(1) inference, which improves decoding throughput, latency, and GPU memory without sacrificing performance. The chunkwise recurrent representation facilitates efficient long-sequence modeling with linear complexity, where each chunk is encoded parallelly while recurrently summarizing the chunks. Experimental results on language modeling show that RetNet achieves favorable scaling results, parallel training, low-cost deployment, and efficient inference. The intriguing properties make RetNet a strong successor to Transformer for large language models.
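
The link between the parallel and recurrent forms can be illustrated with a short pure-Python sketch. It is a simplified reading of retention, assuming a single head and one scalar decay `gamma`, and it leaves out the position rotation, normalization, and gating used in the paper. The function names are illustrative, not taken from the authors' code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def retention_parallel(q, k, v, gamma):
    """Parallel form: (Q K^T masked by causal decay) V, computed for all positions at once."""
    n = len(q)
    scores = matmul(q, transpose(k))
    for i in range(n):
        for j in range(n):
            # Decay gamma^(i-j) for past positions, zero for future ones.
            decay = gamma ** (i - j) if i >= j else 0.0
            scores[i][j] *= decay
    return matmul(scores, v)


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: S_n = gamma * S_(n-1) + K_n^T V_n, then o_n = Q_n S_n."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # fixed size, independent of sequence length
    outputs = []
    for q_n, k_n, v_n in zip(q, k, v):
        state = [[gamma * state[a][b] + k_n[a] * v_n[b] for b in range(d_v)]
                 for a in range(d_k)]
        outputs.append([sum(q_n[a] * state[a][b] for a in range(d_k))
                        for b in range(d_v)])
    return outputs


q = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
k = [[0.5, 1.0], [1.0, 0.0], [0.5, 0.5]]
v = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
gamma = 0.9

parallel_out = retention_parallel(q, k, v, gamma)
recurrent_out = retention_recurrent(q, k, v, gamma)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(parallel_out, recurrent_out)
               for a, b in zip(row_a, row_b))
print("largest difference between the two forms:", max_diff)
```

Both functions compute the same outputs. The recurrent version only carries a `d_k` by `d_v` state from step to step, which is what the O(1) per-token inference cost refers to, while the parallel version processes every position at once during training.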

LongNet: Scaling Transformers to 1,000,000,000 Tokens

📅 Date: July 5, 2023

✍️ Authors: Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei

🔖 Topics: Transformer, Large Language Model, Generative AI

Scaling sequence length has become a critical demand in the era of large language models. However, existing methods struggle with either computational complexity or model expressivity, rendering the maximum sequence length restricted. To address this issue, we introduce LongNet, a Transformer variant that can scale sequence length to more than 1 billion tokens, without sacrificing the performance on shorter sequences. Specifically, we propose dilated attention, which expands the attentive field exponentially as the distance grows. LongNet has significant advantages: 1) it has a linear computation complexity and a logarithmic dependency between any two tokens in a sequence; 2) it can serve as a distributed trainer for extremely long sequences; 3) its dilated attention is a drop-in replacement for standard attention, which can be seamlessly integrated with the existing Transformer-based optimization. Experimental results demonstrate that LongNet yields strong performance on both long-sequence modeling and general language tasks. Our work opens up new possibilities for modeling very long sequences, e.g., treating a whole corpus or even the entire Internet as a sequence.
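
A rough Python sketch of the sparsification behind dilated attention follows. It assumes a single segment length and dilation rate, whereas the paper mixes several such pairs, and it ignores causal masking and multiple heads. The helper names are illustrative only.

```python
def dilated_segments(seq_len, segment_len, dilation):
    """Split positions into segments and keep every `dilation`-th position in each."""
    segments = []
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        segments.append(list(range(start, end, dilation)))
    return segments


def attended_pairs(seq_len, segment_len, dilation):
    """Count query-key pairs when attention is computed only inside each sparse segment."""
    return sum(len(segment) ** 2
               for segment in dilated_segments(seq_len, segment_len, dilation))


print(dilated_segments(16, 8, 2))  # [[0, 2, 4, 6], [8, 10, 12, 14]]
for n in (16, 32, 64):
    print(n, "tokens:", attended_pairs(n, 8, 2), "pairs vs", n * n, "for full attention")
```

With the segment length and dilation held fixed, doubling the sequence length doubles the number of attended pairs, while full attention quadruples it. This is the intuition behind the linear computation complexity claimed above.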

Training ChatGPT on Omniverse Visual Scripting Using Prompt Engineering

Palantir AIP | Defense and Military

What does it take to talk to your Industrial Data in the same way we talk to ChatGPT?

📅 Date: April 18, 2023

✍️ Author: Jason Schern

🔖 Topics: Generative AI, Large Language Model

🏢 Organizations: Cognite

The vast data set used to train LLMs is curated in various ways to provide clean, contextualized data. Contextualized data includes explicit semantic relationships within the data that can greatly affect the quality of the model’s output. Contextualizing the data we provide as input to an LLM ensures that the text consumed is relevant to the task at hand. For example, when prompting an LLM to provide information about operating industrial assets, the data provided to the LLM should include not only the data and documents related to those assets but also the explicit and implicit semantic relationships across different data types and sources.

An LLM is trained by parceling text data into smaller collections, or chunks, that can be converted into embeddings. An embedding is simply a sophisticated numerical representation of the ‘chunk’ of text that takes into consideration the context of surrounding or related information. This makes it possible to perform mathematical calculations to compare similarities, differences, and patterns between different ‘chunks’ to infer relationships and meaning. These mechanisms enable an LLM to learn a language and understand new data that it has not seen previously.
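
The chunk-and-compare idea can be sketched in a few lines of Python. The embedding here is a deliberately crude word-count vector, used only as a stand-in: real embeddings are learned and capture context, which this toy version does not. The function names and sample text are made up for illustration.

```python
import math
from collections import Counter


def chunk_text(text, size):
    """Split text into chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embedding(chunk, vocabulary):
    """Word-count vector over a fixed vocabulary (stand-in for a learned embedding)."""
    counts = Counter(chunk.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


text = ("pump vibration exceeds limit pump bearing needs inspection "
        "invoice payment due friday")
chunks = chunk_text(text, 4)
vocabulary = sorted(set(text.lower().split()))
vectors = [toy_embedding(chunk, vocabulary) for chunk in chunks]

sim_related = cosine_similarity(vectors[0], vectors[1])    # both chunks mention the pump
sim_unrelated = cosine_similarity(vectors[0], vectors[2])  # no words in common
print(chunks)
print("related:", sim_related, "unrelated:", sim_unrelated)
```

The two pump-related chunks score higher than the pump and invoice pair, which is the kind of numerical comparison the paragraph above describes.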

How Large-Language Models Can Revolutionize Military Planning

📅 Date: April 12, 2023

✍️ Authors: Benjamin Jensen, Dan Tadross

🔖 Topics: Large Language Model

🏭 Vertical: Defense

🏢 Organizations: Scale AI

What happens when you give military planners access to large-language models and other artificial intelligence and machine-learning applications? Will the planner embrace the ability to rapidly synthesize diffuse data streams or ignore the tools in favor of romanticized views of military judgment as a coup d’œil? Can a profession still grappling to escape its industrial-age iron cage and bureaucratic processes integrate emerging technologies and habits of mind that are more inductive than deductive?

As a team that includes a professor from Marine Corps University and a portfolio manager from Scale AI, we share our efforts to bridge new forms of data synthesis with foundational models of military decision-making. Based on this pilot effort, we see clear and tangible ways to integrate large-language models into the planning process. This effort will require more than just buying software. It will require revisiting how we approach epistemology in the military profession. The results suggest a need to expand the use of large-language models alongside new methods of instruction that help military professionals understand how to ask questions and interrogate the results. Skepticism is a virtue in the 21st century.

Will Generative AI finally turn data swamps into contextualized operations insight machines?

📅 Date: April 5, 2023

🔖 Topics: Large Language Model, Generative AI

🏢 Organizations: Cognite

Generative AI, such as ChatGPT/GPT-4, has the potential to put industrial digital transformation into hyperdrive. Whereas a process engineer might spend several hours performing “human contextualization” (at an hourly rate of $140 or more) manually – again and again – contextualized industrial knowledge graphs provide the trusted data relationships that enable Generative AI to accurately navigate and interpret data for Operators without requiring data engineering or coding competencies.

Can Large Language Models Enhance Efficiency In Industrial Robotics?

📅 Date: March 28, 2023

✍️ Author: Dmitry Golitsyn

🔖 Topics: AI, Large Language Model, Industrial Robot

🏢 Organizations: ABAGY

One of the factors that slows the penetration of industrial robots into manufacturing is the complexity of human-to-machine interfaces. This is where large language models, such as ChatGPT developed by OpenAI, come in. Large language models are a cutting-edge artificial intelligence technology that can understand and respond to human language in ways that are at times almost indistinguishable from human conversation. Their versatility has been proven in applications ranging from chatbots to language translation and even creative writing.

It turns out that large language models are quite effective at generating teach pendant programs for a variety of industrial robots, such as KUKA, FANUC, Yaskawa, ABB and others.

ChatGPT for Robotics: Design Principles and Model Abilities

📅 Date: February 20, 2023

✍️ Authors: Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor

🔖 Topics: ChatGPT, Industrial Robot, Large Language Model

🏢 Organizations: Microsoft

ChatGPT unlocks a new robotics paradigm, and allows a (potentially non-technical) user to sit on the loop, providing high-level feedback to the large language model (LLM) while monitoring the robot’s performance. By following our set of design principles, ChatGPT can generate code for robotics scenarios. Without any fine-tuning, we leverage the LLM’s knowledge to control different robot form factors for a variety of tasks. In our work we show multiple examples of ChatGPT solving robotics puzzles, along with complex robot deployments in the manipulation, aerial, and navigation domains.

Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance

📅 Date: April 4, 2022

🔖 Topics: Large Language Model, Transformer

🏢 Organizations: Google

Last year Google Research announced our vision for Pathways, a single model that could generalize across domains and tasks while being highly efficient. An important milestone toward realizing this vision was to develop the new Pathways system to orchestrate distributed computation for accelerators. In “PaLM: Scaling Language Modeling with Pathways”, we introduce the Pathways Language Model (PaLM), a 540-billion parameter, dense decoder-only Transformer model trained with the Pathways system, which enabled us to efficiently train a single model across multiple TPU v4 Pods. We evaluated PaLM on hundreds of language understanding and generation tasks, and found that it achieves state-of-the-art few-shot performance across most tasks, by significant margins in many cases.
