Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices

Incomplete data matrices occur repeatedly in large databases, posing a significant obstacle to the effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight, to predict physicoche...

Bibliographic details
Main authors:	Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge
Format:	Full text
Language:	English
Published:	Elsevier Science Bv 2015
Subjects:	CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CHEMICAL SCIENCES EXACT AND NATURAL SCIENCES
Online access:	https://ri.unsam.edu.ar/handle/123456789/1009

id	ds-123456789-1009
record_format	dspace
institution	Repositorio Institucional
collection	RI
description	Incomplete data matrices occur repeatedly in large databases, posing a significant obstacle to the effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight, to predict physicochemical parameters of water samples in a data set where concentrations of different analytes were missing. The method was evaluated under two schemes: (a) including vectors of samples with and without missing data in the training data set, and (b) pre-training a SOM on a data set with no missing values and then making imputations for a second data set (prediction set) of samples with missing values. Evaluations were made using a surface water data set of 270 samples from the Reconquista River, in Buenos Aires Province, Argentina, by artificially setting a range of 17% to 39% of the data to missing. Results were compared to imputations made through professional criteria. SOMs gave reasonable estimates, with no statistically significant differences from estimates made through professional criteria, thus proving to be a suitable time-saving imputation method.
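The two evaluation schemes in the description, training on samples with and without missing values, or pre-training on complete samples and then imputing a separate prediction set, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`train_som`, `best_matching_unit`, `impute`), the grid size, the learning schedule, and the use of a Euclidean distance restricted to the observed variables are all assumptions made for illustration, and the paper's "distance object per one weight" concept is not reproduced here. The sketch also assumes the variables are on comparable scales (for example, standardized beforehand).

```python
import numpy as np

def best_matching_unit(W, x, observed):
    """Index of the unit closest to x, using only the observed variables."""
    diff = W[:, observed] - x[observed]
    return int(np.argmin(np.sum(diff ** 2, axis=1)))

def train_som(X, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on X (n_samples, n_features); NaN marks a missing entry.

    Samples with missing entries update only the weights of their observed
    variables (an assumed way of realizing scheme (a)).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mask = ~np.isnan(X)
    # Start all units near the column means computed from observed entries.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0) + 1e-9
    W = mean + 0.1 * std * rng.standard_normal((rows * cols, d))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            step += 1
            m = mask[i]
            if not m.any():
                continue  # nothing observed, so no unit can be matched
            lr = lr0 * (1.0 - frac)               # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighborhood
            bmu = best_matching_unit(W, X[i], m)
            g = np.exp(-np.sum((grid - grid[bmu]) ** 2, axis=1) / (2 * sigma ** 2))
            W[:, m] += lr * g[:, None] * (X[i, m] - W[:, m])
    return W

def impute(W, X):
    """Fill each missing entry with the matching weight of the sample's BMU."""
    X_filled = X.copy()
    for i in range(X.shape[0]):
        m = ~np.isnan(X[i])
        if m.all() or not m.any():
            continue  # complete rows need no fill; empty rows stay NaN
        bmu = best_matching_unit(W, X[i], m)
        X_filled[i, ~m] = W[bmu, ~m]
    return X_filled

# Synthetic stand-in data (not the Reconquista River data set).
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, size=60)
X = np.column_stack([t, 2 * t + 0.05 * rng.standard_normal(60), 1 - t])
missing_rows = rng.choice(60, size=15, replace=False)
X[missing_rows, rng.integers(0, 3, size=15)] = np.nan  # one gap per chosen row

W = train_som(X)           # scheme (a): train on all samples
X_filled = impute(W, X)
```

Under the same assumptions, scheme (b) would call `train_som` on the complete samples only and then `impute` on the separate prediction set.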
