Scale recoding in sociological research: a new validation methodology. An application to a political survey
DOI:
https://doi.org/10.3989/ris.2019.77.2.17.088
Keywords:
Simple and Multiple Factor Analysis, Social Survey Measurement Scales
Abstract
The recoding of scale variables is a common step in the analysis of survey data. It is not free of pitfalls, however, such as the introduction of bias or the distortion of the data. This paper presents a methodological proposal for validating any recoding process, whether it involves metric- or categorical-scale variables. The aim of the proposed methodology is to verify the adequacy of the recoding by indicating how close in structure the recoded data are to the original data. The methodology is based on a factorial analysis technique, Multiple Factor Analysis (MFA), which is performed on a global data table juxtaposing the original-scale and recoded-scale data. The procedure is tested on real-world data drawn from a public opinion poll on perceptions of leading politicians in the Spanish Parliament.
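The idea of running MFA on a table that juxtaposes the original and recoded variables can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are simulated, the recoding cut points are arbitrary, the helper names (`standardize`, `mfa_weight`, `rv_coefficient`) are hypothetical, and the recoded variables are treated as metric scores. A categorical recoding would need indicator coding with correspondence-analysis weights, which is not shown here. The RV coefficient and the correlation between partial scores are used as illustrative closeness measures and are not necessarily the criteria used in the paper.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def mfa_weight(Z):
    """Divide a group of variables by its first singular value (MFA weighting),
    so that no group dominates the first axis of the global analysis."""
    first_sv = np.linalg.svd(Z, compute_uv=False)[0]
    return Z / first_sv

def rv_coefficient(A, B):
    """RV coefficient between two tables observed on the same individuals."""
    SA, SB = A @ A.T, B @ B.T
    return np.trace(SA @ SB) / np.sqrt(np.trace(SA @ SA) * np.trace(SB @ SB))

# Simulated 0-10 ratings of four items by 200 respondents (assumption).
rng = np.random.default_rng(0)
n, p = 200, 4
latent = rng.normal(size=(n, 1))
original = np.clip(np.rint(5 + 2 * latent + rng.normal(size=(n, p))), 0, 10)

# Hypothetical recoding into five ordered classes: 0-2, 3-4, 5-6, 7-8, 9-10.
recoded = np.digitize(original, bins=[3, 5, 7, 9])

# Weight each group, juxtapose them, and run the global analysis via SVD.
Z_orig = mfa_weight(standardize(original))
Z_rec = mfa_weight(standardize(recoded))
G = np.hstack([Z_orig, Z_rec])
U, s, Vt = np.linalg.svd(G, full_matrices=False)
global_scores = U * s

# Partial scores: each group's view of the individuals on the global axes.
V = Vt.T
partial_orig = 2 * Z_orig @ V[:p, :]
partial_rec = 2 * Z_rec @ V[p:, :]

rv = rv_coefficient(Z_orig, Z_rec)
axis1_corr = np.corrcoef(partial_orig[:, 0], partial_rec[:, 0])[0, 1]
print(f"RV coefficient between original and recoded tables: {rv:.3f}")
print(f"Correlation of partial scores on the first global axis: {axis1_corr:.3f}")
```

Values of either measure close to 1 would suggest that the recoded table preserves the structure of the original one. With two groups, the first squared singular value of the global table lies between 1 and 2, and values near 2 indicate that both groups share their main direction of variation.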
License
Copyright (c) 2019 Consejo Superior de Investigaciones Científicas (CSIC)

This work is licensed under a Creative Commons Attribution 4.0 International License.
© CSIC. Manuscripts published in both the print and online versions of this journal are the property of the Consejo Superior de Investigaciones Científicas, and quoting this source is a requirement for any partial or full reproduction.
All contents of this electronic edition, except where otherwise noted, are distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. You may read the basic information and the legal text of the licence. The indication of the CC BY 4.0 licence must be expressly stated in this way when necessary.
Self-archiving in repositories, personal webpages or similar, of any version other than the final version of the work produced by the publisher, is not allowed.