Statistical data depths
Statistical data depths have proved successful in the analysis of multivariate data sets, e.g., for deriving an overall center and assigning ranks to the observed units. Two key features are the direction of the ordering, from the center towards the outside, and the recognition of a unique center of the distribution. More recently, depths have been used in the analysis of high-dimensional data and of functional data, in classification and clustering settings. This session is devoted to the latest advances in this research field.
Organizer: Claudio Agostinelli, Università di Trento, ITALY
Alicia Nieto Reyes
Universidad de Cantabria
SPAIN
Stanislav Nagy
Charles University in Prague
CZECH REPUBLIC
Germain Van Bever
Université Libre de Bruxelles
BELGIUM
Latent variable models for complex networks - I
Network analysis is an area of study which is attracting increasing interest from scientists in many distinct fields. A recent focus of this broad research area is the study of how subjects interact or are connected to each other. The statistical analysis of network data is a difficult context to deal with, as the relations to be studied are usually complex, and sophisticated models are needed to obtain reliable conclusions, which require particular inferential tools due to the level of data complexity. Relations may refer to economic exchanges, as in input-output tables, to social proximity, as in human group interactions, to scientific literature, as in research collaborations, or to disease and mortality spread, as in the analysis of the (co-)occurrence of multiple causes of death. The proposed sessions focus on latent variable models for the analysis of multidimensional network data, either in the case of the same relation recorded over the same actors at different times, or with several relations recorded on the same set of units at a given time occasion. Latent variables may be used to cluster units, describe their time dynamics, and represent them in a reduced-dimensional space.
Organizers: Marco Alfò, Università di Roma “La Sapienza”, ITALY – Francesco Bartolucci, Università di Perugia, ITALY
Silvia D’Angelo
University College Dublin
IRELAND
Saverio Ranciati
Università di Bologna
ITALY
Daniel K. Sewell
The Iowa State University
US
Latent variable models for complex networks - II
Network analysis is an area of study which is attracting increasing interest from scientists in many distinct fields. A recent focus of this broad research area is the study of how subjects interact or are connected to each other. The statistical analysis of network data is a difficult context to deal with, as the relations to be studied are usually complex, and sophisticated models are needed to obtain reliable conclusions, which require particular inferential tools due to the level of data complexity. Relations may refer to economic exchanges, as in input-output tables, to social proximity, as in human group interactions, to scientific literature, as in research collaborations, or to disease and mortality spread, as in the analysis of the (co-)occurrence of multiple causes of death. The proposed sessions focus on latent variable models for the analysis of multidimensional network data, either in the case of the same relation recorded over the same actors at different times, or with several relations recorded on the same set of units at a given time occasion. Latent variables may be used to cluster units, describe their time dynamics, and represent them in a reduced-dimensional space.
Chair: Daniel K. Sewell, The Iowa State University, US
Daniela D'Ambrosio
ITALY
MARCO SERINO
Invalsi
ITALY
GIANCARLO RAGOZINI
Università di Napoli Federico II
ITALY
Maria Francesca Marino
Università di Firenze
ITALY
SILVIA PANDOLFI
Università di Perugia
ITALY
Catherine Matias
French National Centre for Scientific Research
FRANCE
Statistical analysis of Data streams
This session aims to bring together research works on the statistical analysis of data streams. Many of today’s real-world applications generate huge amounts of real-time data which do not follow static and predictable data distributions but change over time with no specific patterns. Usually, due to the dimensionality of the data and the fast rate of recording, traditional data analysis methods cannot be applied to data streams, and specific approaches are needed. This session includes contributions in the field of data stream summarization and analysis through fast algorithms which adapt the knowledge as the data flow.
Organizer: Antonio Balzanella, Università della Campania Luigi Vanvitelli, ITALY
Guénaël Cabanes
Université Paris 13 Nord
FRANCE
Fabrizio Maturo
Università della Campania Luigi Vanvitelli
ITALY
Rosanna Verde
Università della Campania Luigi Vanvitelli
ITALY
Generalized Linear, Latent, and Mixed Models
Generalized linear models and their extensions are now a common tool in many fields of application. However, issues remain about their use, such as finding simple ways to interpret effects when link functions are nonlinear, choosing appropriate models with subjective (e.g., ordinal) response variables, adjusting for poor behavior of some standard inference methods (e.g., Wald methods) when effects are strong, and choosing appropriate residuals. In addition, often many variables under investigation are not directly observable, and latent variable models can represent phenomena such as “true” variables measured with error, hypothetical constructs, unobserved heterogeneity, missing data, etc. A generalized latent variable modeling framework can integrate many methodologies in a global context.
Organizer: Matilde Bini, Università Europea di Roma, ITALY
Alan Agresti
University of Florida
US
Leonardo Grilli
Università di Firenze
ITALY
CARLA RAMPICHINI
Università di Firenze
ITALY
Lu Yan
London School of Economics
UK
Challenges in modern robust data analysis
Robust statistical methods have been much studied but perhaps little applied in the past. The field of “Robust statistics” has long been well recognized by the scientific community, although the related domain of “Robust data analysis” has grown up only recently. The session will address some of the issues that robust data analysis must face in order to be attractive for scientists working in different substantive domains. These issues include high dimensionality of data, the need for scalable and efficient state-of-the-art algorithms, and the development of methods that can unveil the features of each unit in complex situations.
Organizer: Andrea Cerioli, Università di Parma, ITALY
Alessio Farcomeni
Università di Roma “La Sapienza”
ITALY
Gianluca Morelli
Università di Parma
ITALY
FRANCESCA TORTI
Joint Research Centre of the European Commission
ITALY
Anne Ruiz-Gazen
Toulouse School of Economics
FRANCE
Advances in recursive partitioning (with applications)
TBA
Organizer and Discussant: Claudio Conversano, Università di Cagliari, ITALY
Chair: Antonio D’Ambrosio, Università di Napoli Federico II, ITALY
Federico Banchelli
Università di Bologna
ITALY
Moritz Berger
University of Bonn
GERMANY
Carmela Cappelli
ITALY
Cluster validation and the selection of a cluster configuration
In cluster analysis one typically obtains several groupings of the units for the same data set, and then a final clustering solution is selected. These solutions are often obtained by running different algorithms, or the same algorithm with different input parameters including the number of clusters, feature weights, constraints on population parameters, etc. The session aims at exploring validation and estimation methods useful for formulating a final decision about the desired clustering solution. This session will focus on methodological contributions, applications, and software implementations.
Organizer: Pietro Coretto, Università di Salerno, ITALY
Luca Coraggio
ITALY
Christian Hennig
ITALY
Volodymyr Melnykov
University of Alabama
US
Preference rankings
Typically, rank data consist of a set of individuals, or judges, who have ordered a set of items, or objects, according to their overall preference or some pre-specified criterion. When each judge has expressed his or her preferences according to his or her own best judgment, such data are often characterized by systematic individual differences.
The first speaker, Ekhine Irurozki, will talk about the well-known Mallows Models and the Generalized Mallows Models under the Cayley distance. These kinds of models are popular in analyzing rank data, but the metric used is most often either the Kendall distance or the Spearman distance. One reason is that these metrics are consistent with the geometrical space of preference rankings, which is the permutation polytope. The presenter shows an application in the field of biology, motivating the interest of this model with this particular distance.
The second speaker, Mariangela Sciandra, will talk about a way to weight the items in determining a distance between two judges. The presenter defines the item-weighted Kemeny distance, showing its properties and tackling the problem of aggregating preferences when different weights are assigned to different items.
One of the main problems in analyzing rank data is to determine the so-called consensus ranking, which is known to be an NP-hard problem. When the number of items is large, estimating the consensus ranking is somewhat unrealistic and not computationally feasible. The third speaker, Valeria Vitelli, will present an algorithmic approach, grounded on the Bayes Mallows model, to perform item selection when the number of items is very large. Applications to genomic data are presented.
Organizer and Discussant: Antonio D’Ambrosio, Università di Napoli Federico II, ITALY
Chair: Claudio Conversano, Università di Cagliari, ITALY
Ekhine Irurozki
SPAIN
Mariangela Sciandra
ITALY
Valeria Vitelli
University of Oslo
NORWAY
Statistical challenges in Functional Data Analysis
This session includes contributions related to the analysis of functional data showing complex characteristics. The presented papers comprise both new methodological proposals and a wide range of applications. In particular, clustering of spatially dependent functional data and prediction models for functional data with complex characteristics will be discussed.
Organizer: Tonio di Battista, Università “G. d’Annunzio” Chieti – Pescara, ITALY
Ana M. Aguilera
SPAIN
Francesca Fortuna
ITALY
Elvira Romano
ITALY
Mixture modelling for complex data
TBA
Organizer: Michael Fop, University College Dublin, IRELAND
Brendan Murphy
University College Dublin
IRELAND
Cristina Mollica
Università di Roma “La Sapienza”
ITALY
LUCA TARDELLA
Università di Roma “La Sapienza”
ITALY
Antonello Maruotti
ITALY
MONIA RANALLI
Università di Roma Tor Vergata
ITALY
ROBERTO ROCCI
Università di Roma Tor Vergata
ITALY
Università di Roma “La Sapienza”
ITALY
Methodologies for mixture modeling
TBA
Michael Fop
University College Dublin
IRELAND
RICCARDO RASTELLI
University College Dublin
IRELAND
Geoffrey McLachlan
University of Queensland
AUSTRALIA
Luca Scrucca
Università di Perugia
ITALY
Mathematical theory in clustering
The area of cluster analysis is very heterogeneous. It is not defined by a unified mathematical approach but rather by a class of problems that have been approached in very diverse ways. It has proven difficult to investigate the performance of clustering methods mathematically. The session is open to approaches such as model-based asymptotic theory, axiomatic theory and nonparametric theory (“performance guarantees”). The limits of mathematical analysis of clustering will also be discussed.
Discussant: Claudio Agostinelli, Università di Trento, ITALY
Alessandro Casa
Università di Padova
ITALY
Christine Keribin
Université Paris-Sud
FRANCE
Philipp Thomann
SWITZERLAND
INGO STEINWART
University of Stuttgart
GERMANY
NICO SCHMID
University of Stuttgart
GERMANY
Unsupervised Methods for Mixed-type Data
Most unsupervised learning methods that aim at synthesizing data, via clustering or dimension reduction, deal with continuous-only or categorical-only attributes. In real applications, however, observations are described by a combination of attributes of different nature. A common practice is to pre-process the attributes by recoding them to the same type, and then apply a suitable method. A more refined solution is to apply unsupervised methods natively designed to cope with mixed-type data: extant methods aim to prevent the loss of information that may be due to recoding, and to balance the importance of the information carried by categorical and continuous attributes.
Organizer: Alfonso Iodice d’Enza, Università di Napoli Federico II, ITALY
Angelos Markos
Democritus University of Thrace
GREECE
Monia Ranalli
ITALY
Michel van de Velden
Erasmus University of Rotterdam
NETHERLANDS
Large scale data analysis
TBA
Organizer and Chair: Michele La Rocca, Università di Salerno, ITALY
Discussant: Francesco Palumbo, Università di Napoli Federico II, ITALY
Pietro Liò
University of Cambridge
UK
Matteo Magnani
Uppsala University
SWEDEN
Maria Lucia Parrella
ITALY
FRANCESCO GIORDANO
Università di Salerno
ITALY
SOUMENDRA LAHIRI
North Carolina State University
US
Clustering time dependent data
Clustering is a widely used statistical tool to determine subsets in a given data set. Frequently used clustering methods are mostly based on distance measures and cannot easily be extended to cluster time series within a longitudinal or time series data set. The session aims at introducing cutting-edge approaches to model-based clustering of longitudinal data or time series based on finite mixture models and extensions. Approaches suitable both for continuous and for categorical observations are discussed. Bayesian and frequentist estimation methods are proposed.
Organizer: Antonello Maruotti, Università di Roma LUMSA, ITALY
Timo Adam
Bielefeld University
GERMANY
Gianluca Mastrantonio
ITALY
Francesco Dotto
ITALY
New trends in robust estimation and data analysis
TBA
Organizers: Andrea Cerioli, Università di Parma, ITALY – Agustín Mayo Iscar, Universidad de Valladolid, SPAIN
Luca Bagnato
Università Cattolica del Sacro Cuore
ITALY
Domenico Perrotta
Joint Research Centre of the European Commission
ITALY
Peter Rousseeuw
KU Leuven
BELGIUM
Regularization for mixture estimation based on constraints
TBA
Organizer: Agustín Mayo Iscar, Universidad de Valladolid, SPAIN
Andrea Cappozzo
Università Milano Bicocca
ITALY
Roberto di Mari
Università di Catania
ITALY
Marco Riani
Università di Parma
ITALY
Robust procedures based on trimming and forward search
TBA
Organizers: Andrea Cerioli, Università di Parma, ITALY – Agustín Mayo Iscar, Universidad de Valladolid, SPAIN
Anthony C. Atkinson
London School of Economics
UK
Aldo Corbellini
Università di Parma
ITALY
Luca Greco
Università del Sannio
ITALY
Recent methodological advances in finite mixture modeling with applications
The field of finite mixture modeling has been developing intensively. The proposed session will cover some of the latest advances in related methodology and illustrate them on synthetic and real-life data sets. The first talk is devoted to the application of goodness-of-fit statistics for choosing a consistent root for mixture models. The second talk discusses the role of finite mixture models in modeling spatio-temporal processes. Finally, the third talk is dedicated to modeling high-dimensional skewed data.
Organizer: Volodymyr Melnykov, University of Alabama, US
Michael Gallaugher
McMaster University
CANADA
Garritt Page
Brigham Young University
US
Weixin Yao
University of California, Riverside
US
Semiparametric methods for hierarchical data
Hierarchical structure (e.g. students nested within classes, which are in turn nested within schools; patients nested within hospitals nested within healthcare districts) is one of the main characteristics of administrative data. Mixed-effects models, which can take into account the nested nature of the data, are therefore among the most useful statistical models to investigate, in particular when the usual unrealistic parametric assumptions are discarded.
Organizer: Anna Maria Paganoni, Politecnico di Milano, ITALY
Chiara Masci
Politecnico di Milano
ITALY
Carla Rampichini
Università di Firenze
ITALY
Linda Sharples
London School of Hygiene and Tropical Medicine
UK
Directional data analysis
Directional data arise when measurements are directions, and find application in many scientific fields such as bioinformatics and machine learning. They have received a lot of attention over the past two decades, and are still attracting increasing interest from scientists. The sample space is the (hyper)sphere, so the analysis of such data relies on specific statistical models and procedures which take into account their peculiar features. Hence, this session focuses on the latest advances in methodologies and applications in this research field.
Organizer: Giuseppe Pandolfo, Università di Napoli Federico II, ITALY
Marco Di Marzio
ITALY
Eduardo García-Portugués
University Carlos III, Madrid
SPAIN
Francesco Lagona
ITALY
New classification methods in economics and business
The aim of the proposed session is the presentation and discussion of new approaches to data analysis and classification problems in economics, management and business. The novelty refers both to the research methodology and to its areas of application in empirical investigations.
Organizer: Józef Pociecha, Uniwersytet Ekonomiczny w Krakowie, POLAND
Lucie Kopecká
CZECH REPUBLIC
VIERA PACAKOVA
University of Pardubice
CZECH REPUBLIC
Viera Labudová
SLOVAKIA
LUBICA SIPKOVA
Bratislava University of Economics
SLOVAKIA
Paweł Lula
POLAND
S. BELOV
Plekhanov Russian University of Economics, Moscow
RUSSIAN FEDERATION
SALVATORE INGRASSIA
Università di Catania
ITALY
ZORAN KALINIC
University of Kragujevac
SERBIA
Multi-block data analysis: methods and applications
Multiblock analysis is an extension of multivariate analysis tailored to the study of complex data sets partitioned into blocks with at least one mode or dimension in common. The aim of the analysis is to find the underlying relationships between the various blocks, assuming that they are interrelated. The session will cover both the exploratory and the supervised multiblock approach. The first corresponds to the situation in which all the blocks are connected to all other blocks. The second assumes a specific pattern of directed relations between the various blocks. The session will present both current research and applications.
Rasmus Bro
University of Copenhagen
DENMARK
Pasquale Dolce
ITALY
CRISTINA DAVINO
Università di Napoli Federico II
ITALY
STEFANIA TARALLI
ISTAT
ITALY
DOMENICO VISTOCCO
Università di Napoli Federico II
ITALY
Mohamed Hanafi
Oniris, StatSC, Nantes
FRANCE
Compositional data analysis in practice
TBA
Organizer: Valentin Todorov, UNIDO Vienna, AUSTRIA
Michele Gallo
Università di Napoli “L’Orientale”
ITALY
Karel Hron
Palacký University Olomouc
CZECH REPUBLIC
Marco Antonio Cruz
SPAIN
MARIBEL ISABEL ORTEGO
Universitat Politècnica de Catalunya
SPAIN
ELISABET ROCA
Universitat Politècnica de Catalunya
SPAIN
Modern Likelihood Methods
In many modern applications, the likelihood function is difficult or even impossible to evaluate. Frequent computational difficulties are associated with the inversion of covariance matrices, the approximation of multidimensional integrals, and the handling of a large number of nuisance components. This session will discuss some modifications of the likelihood function that are designed to analyse nonstandard situations that are increasingly appearing in applications.
Ruggero Bellio
ITALY
Nicola Lunardon
ITALY
Riccardo Rastelli
University College Dublin
IRELAND
Dimensionality reduction and model based synthesis of indicators
TBA
Organizer: Maurizio Vichi, Università di Roma “La Sapienza”, ITALY
Leonardo Salvatore Alaimo
Università di Roma “La Sapienza”
ITALY
FILOMENA MAGGINO
Università di Roma “La Sapienza”
ITALY
Paolo Giordani
ITALY
ROBERTO ROCCI
Università di Roma Tor Vergata
ITALY
Università di Roma “La Sapienza”
ITALY
GIUSEPPE BOVE
Università Roma Tre
ITALY
Carlo Cavicchia
Università di Roma “La Sapienza”
ITALY
MAURIZIO VICHI
Università di Roma “La Sapienza”
ITALY
GIORGIA ZACCARIA
Università di Roma “La Sapienza”
ITALY
Clustering methods for high dimensional data
The rapid technological change that characterizes our contemporary society has led to an exponentially increasing availability of data, which are very often high-dimensional. Traditional clustering methods might become unfeasible in such cases and new developments are required. The first talk will review and propose recent strategies for unsupervised classification of high-dimensional data in the context of model-based clustering. The second presentation will focus on random projections and on the proposal of a criterion to select the best random projections for clustering. The third talk will be devoted to robust data analysis methods for clustering very noisy microarray data and, in particular, will present a robust patient subtyping method that can scale well with the typical dimension of microarray data.
Organizer: Cinzia Viroli, Università di Bologna, ITALY
Pietro Coretto
ITALY
Claire Gormley
University College Dublin
IRELAND
Francesca Fortunato
ITALY
Archetypal analysis and related methods
TBA
Irene Epifanio López
Universitat Jaume I
SPAIN
Francesco Palumbo
Università di Napoli Federico II
ITALY
Sohan Seth
University of Edinburgh
SCOTLAND
Data analysis for complex networks
The study of complex networks is an active area of research inspired by the study of real-world networks in different application fields. The focus of the session is to discuss relevant results and methodological developments in Network Analysis in the presence of complex network data structures: multiple relations measured on the same set of nodes, symbolic objects, and longitudinal information.
Organizer & Chair: Maria Prosperina Vitale, Università di Salerno, ITALY
Discussant: Matteo Magnani, Uppsala University, SWEDEN
Domenico De Stefano
Università di Trieste
ITALY
GIUSEPPE GIORDANO
Università di Salerno
ITALY
SUSANNA ZACCARIN
Università di Trieste
ITALY
Luka Kronegger
University of Ljubljana
SLOVENIA
Leone Valerio Sciabolazza
ITALY
TILL KRENZ
Bureau of Economic and Business Research, Florida
US
RAFFAELE VACCA
University of Florida
US
Engendering statistics in social studies
TBA
Organizer: Mariangela Zenga, Università Milano Bicocca, ITALY
Franca Crippa
Università di Milano Bicocca
ITALY
FULVIA MECATTI
Università di Milano Bicocca
ITALY
GAIA BERTARELLI
Università di Pisa
ITALY
Adele H. Marshall
Queen’s University Belfast
UK
AGLAIA KALAMATIANOU
Panteion University
GREECE
Ilaria Primerano
Università di Salerno
ITALY
MARIALUISA RESTAINO
Università di Salerno
ITALY
Applications and Concepts of Classification – Selected papers by the GfKl – Data Science Society
Research in classification is heavily driven by applications as well as by strengthening conceptual ideas. This session provides a cross-sectional snapshot of recent research by members of the German Classification Society (GfKl). It sheds light on the intertwined relationships between application areas such as marketing and genomics and the mathematical concepts of symmetry and invariance.
Organizer: Adalbert Wilhelm, Jacobs University, GERMANY
Andreas Geyer-Schulz
Institute of Information Systems and Marketing
GERMANY
Hans A. Kestler
Universität Ulm
GERMANY
Friederike Paetz
GERMANY