MineCor Project

A. Casali
Laboratoire d’Informatique Fondamentale de Marseille (LIF),
CNRS UMR 6166,
Aix Marseille Université,  IUT d’Aix en Provence
13625 Aix en Provence, France.
E-mail: alain.casali@lif.univ-mrs.fr
C. Ernst
Ecole des Mines de St Etienne,
CMP - Site Georges Charpak, 13541 Gardanne France.
E-mail: ernst@emse.fr

Road Map

Project Presentation
Related Publications
Download Area

Project Presentation

Data Mining, Manufacturing Sciences and Semiconductor Production

In semiconductor manufacturing facilities (fabs), an important issue is to identify hidden patterns in the parameters that control manufacturing processes, and/or to determine and improve product quality. However, the volume and complexity of the collected data are generally much larger than in other manufacturing contexts due to the very nature of the domain: semiconductor fabrication processes include several hundred steps, depending on the “chip” being produced. Each of these steps uses various chemico-physical recipes, which are themselves divided into several phases.

Two techniques can be applied here to improve yield (here, the ratio of "correct" chips to produced chips): real-time and post hoc. The real-time approach monitors on-line measurements of specific process steps (generally, defect monitoring) and undertakes corrective actions to ensure that the parameter being measured remains within the desired limits. The post hoc approach compares the end result of the whole manufacturing process with the desired specifications, and analyzes the root causes of low yield in order to adjust the process parameters and ensure future quality. Advanced Process Control (APC), a relatively recent domain which extends SPC (Statistical Process Control), considers both aspects: APC aims to highlight correlations between production parameters in order to rectify possible drifts of the associated process(es). This can be done for specific process steps in real time: R2R (Run To Run) regulation loops and FDC (Fault Detection and Classification) tools are the most representative APC techniques here. On the other hand, correlations can also be discovered post hoc, i.e. after the whole fabrication process has completed. Both approaches first try to identify which parameters are the root causes of a particular yield "excursion". By automatically deriving correlations between variability in multiple process parameters and process yield, model-based analysis can, in a second step, dramatically reduce the time required to determine the root causes of yield losses. Let us emphasize that this second, non-trivial problem is outside the scope of the current project and of this presentation.

Our Work

In order to highlight correlations between production parameters, we propose a full Knowledge Discovery in Databases (KDD) model.
A first study, completed in 2006, focused on the following problem: given an input file of measurements associated with product lots, which contains a large number of attributes (the kinds of measurement) compared to the number of lines of data (a ratio of about 1800 to 300 on average), and given one target parameter (the yield Y), which other parameters have the most "impact" on the value of Y? The proposed approach is derived from the search for association rules between values in a transaction database. It uses a variation of levelwise algorithms and integrates the Chi-Squared statistical measure, after the preprocessing and discretization stages. The implementation of this work, called MineCor, was proposed in [CAS 07].
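To make the role of the Chi-Squared measure concrete, here is a minimal sketch that computes the Pearson Chi-Squared statistic between one discretized parameter and the discretized yield. It is only an illustration and not MineCor's implementation: the function name chi_squared, the labels and the toy data are assumptions introduced for this example.

```python
from collections import Counter


def chi_squared(xs, ys):
    """Pearson Chi-Squared statistic for two equal-length sequences of labels.

    Illustrative helper (hypothetical name), not part of MineCor.
    """
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed count of each (x, y) cell
    row = Counter(xs)             # marginal counts of the parameter values
    col = Counter(ys)             # marginal counts of the yield classes
    stat = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n  # count expected under independence
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical toy data: one discretized parameter and the yield class per lot.
param = ["low", "low", "high", "high", "low", "high", "low", "high"]
yield_class = ["bad", "bad", "good", "good", "bad", "good", "good", "bad"]
score = chi_squared(param, yield_class)
print(score)
```

The larger the statistic, the further the observed contingency table is from what independence between the parameter and Y would predict; a value of zero means the two look independent on this sample.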

In a second step, we built decision correlation rules on the twofold basis of the Chi-Squared measure and of the support of the extracted values. Due to the very nature of the problem, levelwise algorithms can only extract results at the cost of long execution times and heavy memory usage. To offset these two problems, a further study was carried out between 2007 and 2009. It consisted in:
In [CAS 09a], we defined and built the LHS-CHI2 algorithm from the first two points. In [CAS 09b], we included the proposed algorithm in MineCor and compared our new results with the ones previously obtained. The current version of the mining core of the model uses this new method.
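The notion of support used for decision correlation rules can be sketched as follows. This is a generic illustration under stated assumptions, not the LHS-CHI2 algorithm: the function name support, the "attribute=value" item encoding and the sample lots are all hypothetical.

```python
def support(transactions, pattern):
    """Fraction of transactions that contain every item of the pattern.

    Illustrative helper (hypothetical name), not part of MineCor.
    """
    if not transactions:
        return 0.0
    pattern = frozenset(pattern)
    hits = sum(1 for t in transactions if pattern <= t)  # subset test
    return hits / len(transactions)


# Hypothetical lots, each encoded as a set of discretized "attribute=value" items.
lots = [
    frozenset({"P1=high", "P2=low", "Y=good"}),
    frozenset({"P1=high", "P2=high", "Y=good"}),
    frozenset({"P1=low", "P2=low", "Y=bad"}),
    frozenset({"P1=high", "P2=low", "Y=bad"}),
]
s = support(lots, {"P1=high", "Y=good"})
print(s)
```

A pattern would then be retained only if both its support and its Chi-Squared value reach user-chosen thresholds; the thresholds themselves are left out of this sketch.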

More recently, we have added to MineCor a second discretization method (Jenks' method) alongside the generic method (intervals of equal width), which makes MineCor a complete KDD model. It was presented from an industrial point of view in [CAS 10a].
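As an illustration of the generic method, the following minimal sketch discretizes a list of measurements into intervals of equal width. The function name equal_width_bins, the choice of clamping the maximum into the last interval and the sample values are assumptions made for this example; MineCor's actual implementation may differ, and Jenks' method is not shown.

```python
def equal_width_bins(values, k):
    """Map each value to an interval index in 0..k-1, using k equal-width intervals.

    Illustrative helper (hypothetical name), not part of MineCor.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: everything falls into the first interval.
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum would land in interval k, so clamp it into the last one.
    return [min(int((v - lo) / width), k - 1) for v in values]


# Hypothetical measurements for one attribute.
measurements = [0.0, 1.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(measurements, 4)
print(bins)
```

Equal-width intervals are simple but sensitive to outliers, since a single extreme value stretches every interval; this is one reason a data-driven method such as Jenks' can be a useful alternative.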


Related Publications

Technical Report

Publications in International Conferences

Publications in National Conferences


Download Area

Each archive is provided with:
  1. a binary file, or associated source and environment files;
  2. a README.txt file which explains how to
  3. sample files and associated examples of configuration files.

Binary Files

Debian - Ubuntu*

Source Files

Win32 (Visual Studio 2005)
Ubuntu (GNU C)

* the binary file works with Linux kernel version 2.6.31-20-generic



This work was initially supported by the research project “Rousset 2003-2008”, financed by the Communauté du Pays d'Aix, the Conseil Général des Bouches du Rhône and the Conseil Régional Provence Alpes Côte d'Azur.

Last Web Page Update: 2011/07/08
Last Downloadable Files Update: 2011/07/08