How Data Sharing Advances Zero-Defect Semiconductor Manufacturing
With the proliferation of digital technologies, and exacerbated by the chip shortage, the semiconductor industry is under immense pressure to manufacture products with minimal defects and bring new innovations to market rapidly. The vast amounts of data produced today create opportunities for the entire industry to maximize production, accelerate innovation and reduce costs. Materials-related industry participants recognize the need for data collaboration to improve total output, speed up production and research, and raise quality, but several challenges must be addressed.
First, individual companies are reluctant to build an isolated ecosystem of their own given the upfront cost and time required. Second, companies usually have disparate data systems, long learning cycles and lengthy processes for building new capabilities. Finally, no suitable solution has existed that solves these issues while preserving intellectual property (IP) and ensuring that companies retain full control over their data.
The need for a collaborative data ecosystem
Parts-per-billion levels of impurities in a raw material can impact a customer's process performance. An important tool in the industry's arsenal is machine learning, used to understand deviations and build predictive models for early detection of yield-related problems. However, the effectiveness of these algorithms depends on flawless, consistent alignment of data spanning the entire manufacturing journey, from suppliers to final products. Aggregating that data is time-intensive, typically taking several months to a year before relevant data can be acted upon.
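To make this concrete, the sketch below shows what such a predictive model might look like once the data has been aligned. It is a minimal, hypothetical example: the file name, column names and excursion label are assumptions rather than actual platform artifacts, and scikit-learn is only one of many ways to build such a model.

```python
# Minimal sketch: flag lots at risk of a yield excursion from already-aligned
# supplier and fab data. File and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("aligned_lot_history.csv")  # one row per lot, spanning supplier and fab data

features = ["impurity_ppb", "supplier_batch_age_days", "deposition_temp_c", "etch_time_s"]
target = "yield_excursion"  # 1 = excursion, 0 = normal

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.2, random_state=0, stratify=df[target])

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Report how well the model detects excursions on held-out lots.
print(classification_report(y_test, model.predict(X_test)))
```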
Having a collaborative data ecosystem helps identify data relationships that can be automated, as raw material, finished-good material and device process data can all be combined, as sketched below. Equally important is optimizing equipment performance and maintenance based on specific parameters and on-site performance, directing the right resources to maximize uptime and minimize costs.
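As a rough illustration, the hypothetical snippet below stitches the three data sources together on shared lot identifiers; the table schemas and column names are assumed purely for the sake of the example.

```python
# Hypothetical schemas: device process runs reference a finished-good lot,
# which in turn references the raw-material lot it was made from.
import pandas as pd

raw_material = pd.read_csv("raw_material_lots.csv")      # supplier quality data, keyed by raw_material_lot_id
finished_goods = pd.read_csv("finished_good_lots.csv")   # finished-good data, keyed by finished_good_lot_id
device_process = pd.read_csv("device_process_runs.csv")  # fab run data, keyed by run_id

combined = (
    device_process
    .merge(finished_goods, on="finished_good_lot_id", how="left")
    .merge(raw_material, on="raw_material_lot_id", how="left")
)

# With the lineage stitched together, a yield deviation in a fab run can be
# traced back to the raw-material lot and its measured impurity levels.
print(combined[["run_id", "yield_pct", "impurity_ppb"]].head())
```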
An open data analytics platform, already adopted by companies such as TEL and Micron, allows participants across the semiconductor industry to collaborate on relevant information. The benefit for the semiconductor ecosystem is that all players can now start to unlock new insights about material and equipment quality and how they relate to manufacturing processes.
Additionally, they can benefit from supply chain transparency and sustainability insights. Traditionally, each company optimizes its own process in isolation; bringing all parties across the supply chain together lets them see interactions between their datasets that impact performance and would otherwise have remained hidden. Assuming a comprehensive dataset is available, the analysis can be performed within days, offering significant time and resource advantages.
How does each player in the ecosystem benefit?
Material suppliers improve quality and fab engagement through collaborative analytics, manage sub-suppliers more effectively and unlock efficiency gains by integrating production and quality data to achieve unprecedented transparency, which can be made available and digestible through a consolidated dashboard.
Device makers can reduce downtime and the impact of quality excursions by proactively identifying root causes and enabling tighter quality control. They can engage with suppliers and their data via a single, secure platform and deliver continuous feedback. The integration of data allows for predictive analytics, fewer quality deviations and faster time to market.
Equipment suppliers improve visibility into equipment operation and increase effectiveness by leveraging data analytics for predictive maintenance. They help device makers ramp faster through better tool matching, co-optimizing equipment and process parameters, and optimizing maintenance schedules to reduce downtime.
Protecting confidential information
Protecting confidential information and IP in the semiconductor industry is paramount, but an overabundance of caution in sharing even non-sensitive data impairs critical learning. The capability to provide multiple levels of control, security and traceability to ensure that confidential information stays confidential is the foundation of a collaborative ecosystem.
Data access is of significant concern, particularly when data is brought into a shared ecosystem. If parties are not willing to share raw process data, they have the option to obfuscate and normalize it before it is shared. In addition, individual organizations retain full ownership and control of the data they make available to others in the ecosystem.
For example, a participant from an organization working on a specific process module can access only that set of resources, or only the subset of the material supplier's data needed to proceed with the analysis. Monitoring users and data access by project therefore requires systematic tracking functionality tied to a specific operational, business or analytics goal. This provides a high degree of auditing capability natively embedded in the platform, building on the other levels of security present within the ecosystem. The platform ensures sensitive data is processed in accordance with applicable data privacy regulations and standards, such as GDPR. The software provides first-class security, access controls, partitioning, auditing and accountability functions to support responsible data use.
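A highly simplified sketch of what project-scoped access combined with built-in auditing could look like is shown below. The users, projects, datasets and policy structure are illustrative assumptions, not the platform's actual mechanism, which enforces these controls natively.

```python
# Illustrative project-scoped access check with an audit trail; a real platform
# enforces this natively, server-side. All names and the policy shape are assumed.
from datetime import datetime, timezone

POLICY = {
    "etch_module_project": {
        "analyst@devicemaker.example": {"fab_etch_runs", "supplier_slurry_subset"},
        "engineer@supplier.example": {"supplier_slurry_subset"},
    },
}

AUDIT_LOG = []  # every access attempt is recorded, granted or not

def can_access(user: str, project: str, dataset: str) -> bool:
    """Grant access only to datasets the user holds within that project."""
    granted = dataset in POLICY.get(project, {}).get(user, set())
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "project": project,
        "dataset": dataset,
        "granted": granted,
    })
    return granted

print(can_access("engineer@supplier.example", "etch_module_project", "fab_etch_runs"))    # False
print(can_access("analyst@devicemaker.example", "etch_module_project", "fab_etch_runs"))  # True
```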
Data normalization and obfuscation pipelines
Normalization is an integral part of any data analytics workflow, and it is even more important in a collaborative setting. In process manufacturing, datasets and variables may differ significantly in magnitude and units of measurement. To find meaningful relationships and develop stable models that are not distorted by scale, it is necessary to linearize and scale data to comparable ranges. Discrepancies in the character of datasets are even more acute when looking across semiconductor manufacturing and the different operations up the value chain.
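As a simple illustration, the hypothetical snippet below brings variables with very different magnitudes and units onto comparable scales; the column names and values are invented for the example.

```python
# Invented example data: three variables with very different magnitudes and units.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "impurity_ppb": [2.1, 3.4, 150.0, 2.8],            # parts per billion
    "flow_rate_slm": [45.0, 47.5, 44.8, 46.1],         # standard litres per minute
    "chamber_pressure_mtorr": [1800, 1795, 1810, 1790],
})

# Log-transform the heavily skewed impurity reading to linearize its behavior.
df["impurity_log"] = np.log10(df["impurity_ppb"])

# Z-score standardization: every column ends up with zero mean and unit variance.
zscored = (df - df.mean()) / df.std()

# Min-max scaling: every column is mapped onto the range [0, 1].
minmax = (df - df.min()) / (df.max() - df.min())

print(zscored.round(2))
```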
A simplified example of how the data platform automates the appropriate transformation and scaling of data between a supplier and an integrated device maker is illustrated in Figure 1. It shows the data lineage flow from the respective parties as the data goes through a set of transformations to yield a dataset version ready to be shared. In this example, a specific normalization scheme is defined by a configuration file (config) and applied, after which column names are obfuscated using an encoder function, producing a dataset ready to be shared and merged on a join key with the other party's data.
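The hypothetical sketch below mimics the kind of flow Figure 1 describes: a config selects the normalization per column, an encoder function obfuscates the column names, and the result is ready to merge on a join key. The function names, config format and file names are assumptions for illustration, not the platform's actual API.

```python
# Config-driven normalization followed by column-name obfuscation, leaving the
# join key in the clear so the partner can merge on it.
import hashlib
import pandas as pd

CONFIG = {  # normalization scheme per column
    "impurity_ppb": "zscore",
    "moisture_pct": "minmax",
}

def normalize(df: pd.DataFrame, config: dict) -> pd.DataFrame:
    out = df.copy()
    for col, method in config.items():
        if method == "zscore":
            out[col] = (out[col] - out[col].mean()) / out[col].std()
        elif method == "minmax":
            out[col] = (out[col] - out[col].min()) / (out[col].max() - out[col].min())
    return out

def obfuscate_columns(df: pd.DataFrame, join_key: str) -> pd.DataFrame:
    # Hash column names so the shared dataset does not reveal parameter names.
    encode = lambda name: "c_" + hashlib.sha256(name.encode()).hexdigest()[:8]
    return df.rename(columns={c: encode(c) for c in df.columns if c != join_key})

supplier = pd.read_csv("supplier_lot_data.csv")  # hypothetical supplier extract
shared = obfuscate_columns(normalize(supplier, CONFIG), join_key="lot_id")
# The device maker can now merge `shared` with its own data on "lot_id".
```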
There are several data normalization techniques to choose from, including quantization, scaling, variable transformation, feature creation and ranking. Figure 2 shows the pros and cons of each.
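For readers who want to see these techniques in code, the short, hypothetical snippet below applies quantization, feature creation and ranking to a single invented measurement column (scaling and variable transformation were illustrated above).

```python
# One invented measurement column, transformed with the remaining techniques.
import pandas as pd

s = pd.Series([2.1, 3.4, 150.0, 2.8, 9.6], name="impurity_ppb")

quantized = pd.qcut(s, q=3, labels=["low", "mid", "high"])  # quantization into bins
feature = s.rolling(window=2).mean()                        # feature creation (rolling mean)
ranked = s.rank()                                           # ranking preserves order, hides magnitude

print(pd.DataFrame({"raw": s, "quantized": quantized, "rolling_mean": feature, "rank": ranked}))
```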
Collaborative insights for better decisions
Bringing critical data into one central hub with automated data ingestion and machine learning allows continuous improvement as more information is added, enhancing quality and reducing the excursions that can create supply chain shortages. Ecosystem participants gain the insights needed to drive better business decisions by combining data analytics, AI and machine-learning tools with the process and chemical knowledge each company possesses. It is imperative that this unified system encompass typical data science workflows in a secure environment, which can then be standardized across the supply base and device manufacturers.
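As a closing illustration, the hypothetical sketch below shows one way such continuous improvement could work, with a model that is incrementally updated as each newly ingested batch of data arrives; the file names, columns and choice of incremental learner are all assumptions.

```python
# Illustrative incremental update: each newly ingested batch refines the model
# without retraining from scratch.
import pandas as pd
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

features = ["impurity_ppb", "deposition_temp_c", "etch_time_s"]
model = SGDClassifier(random_state=0)
scaler = StandardScaler()

for batch_file in ["ingest_week_01.csv", "ingest_week_02.csv"]:  # newly ingested data
    batch = pd.read_csv(batch_file)
    X = scaler.partial_fit(batch[features]).transform(batch[features])
    model.partial_fit(X, batch["yield_excursion"], classes=[0, 1])
```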