ATDP 2013 : First International Workshop on Advanced Techniques for Data Preprocessing
Link: http://di.unito.it/ATDP2013

Call For Papers
The performance of Machine Learning algorithms can vary greatly depending on the amount of “pre-processing” performed on the raw data. Selecting or extracting the right features, normalizing ranges, or simply rotating the data with techniques as simple as PCA can be critical. If we think of a real-world learning system as a pipeline of preprocessing and learning activities, there is a continuum of possibilities for distributing the complexity of the task, ranging from no preprocessing at all, followed by a complex classifier, through to very advanced preprocessing, followed by a simple classifier.
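As a purely illustrative sketch of this continuum (not part of the call; it assumes scikit-learn and a synthetic data set), the following compares the two ends of the spectrum: raw features fed to a complex classifier versus scaling and a PCA rotation followed by a simple linear classifier.

    # Minimal sketch: two points on the preprocessing/learning continuum.
    # Assumes scikit-learn; the data set is synthetic and purely illustrative.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                               random_state=0)

    # One end: no preprocessing, complex classifier.
    raw_plus_complex = make_pipeline(RandomForestClassifier(random_state=0))

    # Other end: scaling + PCA rotation, then a simple linear classifier.
    prep_plus_simple = make_pipeline(StandardScaler(), PCA(n_components=10),
                                     LogisticRegression(max_iter=1000))

    for name, model in [("raw + complex", raw_plus_complex),
                        ("preprocessed + simple", prep_plus_simple)]:
        print(name, cross_val_score(model, X, y, cv=5).mean())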
This first edition of the workshop on advanced techniques for data preprocessing will be mainly concerned with data transformations aimed at dimensionality reduction, which can be roughly divided into two categories: feature extraction and feature selection (a contrast illustrated by the sketch at the end of this call). Dimensionality reduction draws on many techniques, ranging from application-specific ones to generic methods. To give a flavor of techniques that have proven useful for feature extraction, we point out principal / independent component analysis, singular value decomposition, linear discriminant analysis, Mahalanobis transformations, partial least squares dimension reduction, and canonical correlation analysis. For feature selection, suggested techniques include consistency-driven filters, discretization, the generalized correlation ratio, minimum redundancy-maximum relevance, variable importance scores, entropy / mutual information measures, minimum description length, and kernel-based selection.

Dimensionality reduction can be studied from a theoretical or from a practical perspective. In the former case, the typical aim is to discover useful underlying properties and/or constraints that allow us to better understand a technique. The latter case may concentrate on understanding why some techniques are prevalent in a given application domain (e.g., wavelet transforms for image processing, position-specific scoring matrices for bioinformatics, and TF-IDF for information retrieval), or simply on comparative assessments.

Potential topics relevant for the workshop include, but are not limited to:
- definition of novel techniques or algorithms for dimensionality reduction, including selection/extraction in both supervised and unsupervised scenarios;
- large empirical studies of existing techniques;
- theoretical / experimental analysis of the ability of a technique i) to preserve or highlight discriminant information, ii) to remove the noise inherent in the data acquisition phase, iii) to separate characteristic from discriminant information, and iv) to reduce the impact of the so-called curse of dimensionality;
- definition of software architectures / algorithms for specific application domains or problems, provided that they embed / implement relevant dimensionality reduction techniques able to facilitate the corresponding classification, regression, or prediction task.

Selected papers will be invited for submission to a special issue of a high-quality international journal. These manuscripts will undergo a further round of revision at that time to produce higher-quality papers for the journal.
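As referenced above, here is a second purely illustrative sketch (again assuming scikit-learn and synthetic data, not part of the call) contrasting the two categories of dimensionality reduction: feature extraction, which builds new features as combinations of the originals (PCA), and feature selection, which keeps a scored subset of the original features (a mutual-information filter).

    # Minimal sketch: feature extraction (PCA) vs. feature selection
    # (mutual-information filter), both reducing 50 features to 10.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                               random_state=0)

    # Extraction: the 10 new features are linear combinations of all 50.
    X_extracted = PCA(n_components=10).fit_transform(X)

    # Selection: 10 of the original features are kept, scored by their
    # mutual information with the class label.
    selector = SelectKBest(score_func=mutual_info_classif, k=10).fit(X, y)
    X_selected = selector.transform(X)

    print(X_extracted.shape, X_selected.shape)   # (500, 10) (500, 10)
    print("kept feature indices:", selector.get_support(indices=True))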