Statistical data depths

Statistical data depths have proved successful in the analysis of multivariate data sets, e.g., for deriving an overall center and assigning ranks to the observed units. Two key features are the direction of the ordering, from the center outwards, and the recognition of a unique center of the distribution. More recently, depths have been used in the analysis of high-dimensional data and of functional data, in classification and clustering settings. This session is devoted to the latest advances in this research field.

Organizer: Claudio Agostinelli, Università di Trento, ITALY


Alicia Nieto Reyes

Universidad de Cantabria
SPAIN

 


Stanislav Nagy

Charles University in Prague
CZECH REPUBLIC


Germain Van Bever

Université Libre de Bruxelles
BELGIUM

Latent variable models for complex networks - I

Network analysis is an area of study that is attracting increasing interest from scientists in many distinct fields. A recent focus of this broad research area is the study of how subjects interact or are connected to each other. The statistical analysis of network data is a difficult setting: the relations to be studied are usually complex, and sophisticated models, together with specific inferential tools suited to the level of data complexity, are needed to obtain reliable conclusions. Relations may refer to economic exchanges, as in input-output tables; to social proximity, as in human group interactions; to scientific literature, as in research collaborations; or to the spread of disease and mortality, as in the analysis of the (co-)occurrence of multiple death cases. The proposed sessions focus on latent variable models for the analysis of multidimensional network data, where either the same relation is recorded on the same actors at different times, or several relations are recorded on the same set of units on a given occasion. Latent variables may be used to cluster units, describe their time dynamics, and represent them in a reduced-dimensional space.

Organizers: Marco Alfò, Università di Roma “La Sapienza”, ITALY – Francesco Bartolucci, Università di Perugia, ITALY


Silvia D’Angelo

University College Dublin
IRELAND

 


Saverio Ranciati

Università di Bologna
ITALY


Daniel K Sewell

The Iowa State University
US

Latent variable models for complex networks - II

Network analysis is an area of study that is attracting increasing interest from scientists in many distinct fields. A recent focus of this broad research area is the study of how subjects interact or are connected to each other. The statistical analysis of network data is a difficult setting: the relations to be studied are usually complex, and sophisticated models, together with specific inferential tools suited to the level of data complexity, are needed to obtain reliable conclusions. Relations may refer to economic exchanges, as in input-output tables; to social proximity, as in human group interactions; to scientific literature, as in research collaborations; or to the spread of disease and mortality, as in the analysis of the (co-)occurrence of multiple death cases. The proposed sessions focus on latent variable models for the analysis of multidimensional network data, where either the same relation is recorded on the same actors at different times, or several relations are recorded on the same set of units on a given occasion. Latent variables may be used to cluster units, describe their time dynamics, and represent them in a reduced-dimensional space.

Organizers: Marco Alfò, Università di Roma “La Sapienza”, ITALY – Francesco Bartolucci, Università di Perugia, ITALY
Chair: Daniel K. Sewell, The Iowa State University, US

Daniela D'Ambrosio

Università di Napoli Federico II
ITALY

MARCO SERINO
Invalsi
ITALY

GIANCARLO RAGOZINI
Università di Napoli Federico II
ITALY


Maria Francesca Marino

Università di Firenze
ITALY

SILVIA PANDOLFI
Università di Perugia
ITALY

 


Catherine Matias

French National Centre for Scientific Research
FRANCE

Statistical analysis of Data streams

This session aims to bring together research on the statistical analysis of data streams. Many of today's real-world applications generate huge amounts of real-time data that do not follow static and predictable distributions but change over time with no specific pattern. Because of the dimensionality of the data and the fast rate of recording, traditional data analysis methods usually cannot be applied to data streams, and specific approaches are needed. This session includes contributions on data stream summarization and analysis through fast algorithms that update their knowledge as the data flow in.

Organizer: Antonio Balzanella, Università della Campania Luigi Vanvitelli, ITALY


Guénaël Cabanes

Université Paris 13 Nord
FRANCE


Fabrizio Maturo

Università della Campania Luigi Vanvitelli
ITALY


Rosanna Verde

Università della Campania Luigi Vanvitelli
ITALY

Generalized Linear, Latent, and Mixed Models

Generalized linear models and their extensions are now a common tool in many fields of application. However, issues remain about their use, such as finding simple ways to interpret effects when link functions are nonlinear, choosing appropriate models with subjective (e.g., ordinal) response variables, adjusting for poor behavior of some standard inference methods (e.g., Wald methods) when effects are strong, and choosing appropriate residuals. In addition, often many variables under investigation are not directly observable, and latent variable models can represent phenomena such as “true” variables measured with error, hypothetical constructs, unobserved heterogeneity, missing data, etc. A generalized latent variable modeling framework can integrate many methodologies in a global context.

Organizer: Matilde Bini, Università Europea di Roma, ITALY


Alan Agresti

University of Florida
US


Leonardo Grilli

Università di Firenze
ITALY

CARLA RAMPICHINI
Università di Firenze
ITALY


Lu Yan

London School of Economics
UK

Challenges in modern robust data analysis

Robust statistical methods have been much studied but perhaps little applied in the past. The field of “Robust statistics” has long been well recognized by the scientific community, although the related domain of “Robust data analysis” has grown up only recently. The session will address some of the issues that robust data analysis must face in order to be attractive for scientists working in different substantive domains. These issues include high dimensionality of data, the need for scalable and efficient state-of-the-art algorithms, and the development of methods that can unveil the features of each unit in complex situations.

Organizer: Andrea Cerioli, Università di Parma, ITALY


Alessio Farcomeni

Università di Roma “La Sapienza”
ITALY


Gianluca Morelli

Università di Parma
ITALY

FRANCESCA TORTI
Joint Research Centre of the European Commission
ITALY


Anne Ruiz-Gazen

Toulouse School of Economics
FRANCE

 

Advances in recursive partitioning (with applications)

TBA

Organizer and Discussant: Claudio Conversano, Università di Cagliari, ITALY
Chair: Antonio D’Ambrosio, Università di Napoli Federico II, ITALY


Federico Banchelli

Università di Bologna
ITALY


Moritz Berger

University of Bonn
GERMANY


Carmela Cappelli

Università di Napoli Federico II
ITALY

Cluster validation and the selection of a cluster configuration

In cluster analysis, one typically obtains several groupings of the units for the same data set, and then a final clustering solution is selected. These solutions are often obtained by running different algorithms, or the same algorithm with different input parameters, including the number of clusters, feature weights, constraints on population parameters, etc. The session aims at exploring validation and estimation methods useful for formulating a final decision about the desired clustering solution. This session will focus on methodological contributions, applications, and software implementations.

Organizer: Pietro Coretto, Università di Salerno, ITALY


Luca Coraggio

Università di Napoli Federico II
ITALY

Christian Hennig

Università di Bologna
ITALY

Volodymyr Melnykov

University of Alabama
US

Preference rankings

Typically, rank data consist of a set of individuals, or judges, who have ordered a set of items (or objects) according to their overall preference or some pre-specified criterion. When each judge has expressed his or her preferences according to his or her own best judgment, such data are often characterized by systematic individual differences.

The first speaker, Ekhine Irurozki, will talk about the well-known Mallows Models and the Generalized Mallows Models under the Cayley distance. These kinds of models are popular in analyzing rank data, but the metric used is most often either the Kendall distance or the Spearman distance. One reason is that these metrics are consistent with the geometrical space of preference rankings, which is the permutation polytope. The presenter shows an application in the field of biology, motivating the interest of this model with this particular distance.

The second speaker, Mariangela Sciandra, will talk about a way to weight the items in determining a distance between two judges. The presenter defines the item-weighted Kemeny distance, showing its properties and tackling the problem of aggregating preferences when different weights are assigned to different items.

One of the main problems in analyzing rank data is determining the so-called consensus ranking, which is known to be an NP-hard problem. When the number of items is large, estimating the consensus ranking is somewhat unrealistic and not computationally feasible. The third speaker, Valeria Vitelli, will present an algorithmic approach, grounded on the Bayes Mallows model, to perform item selection when the number of items is very large. Applications to genomic data are presented.

Organizer and Discussant: Antonio D’Ambrosio, Università di Napoli Federico II, ITALY
Chair: Claudio Conversano, Università di Cagliari, ITALY


Ekhine Irurozki

Basque Center for Applied Mathematics
SPAIN

Mariangela Sciandra

Università di Palermo
ITALY

Valeria Vitelli

University of Oslo
NORWAY

Statistical challenges in Functional Data Analysis

This session includes contributions related to the analysis of functional data showing complex characteristics. The presented papers comprise both new methodological proposals and a wide range of applications. In particular, clustering of spatially dependent functional data and prediction models for functional data with complex characteristics will be discussed.

Organizer: Tonio di Battista, Università “G. d’Annunzio” Chieti – Pescara, ITALY


Ana M. Aguilera

University of Granada
SPAIN

Francesca Fortuna

Università “G. d’Annunzio” Chieti – Pescara
ITALY

Elvira Romano

Università della Campania Luigi Vanvitelli
ITALY

Mixture modelling for complex data

TBA

Organizer: Michael Fop, University College Dublin, IRELAND


Brendan Murphy

University College Dublin
IRELAND


Cristina Mollica

Università di Roma “La Sapienza”
ITALY

LUCA TARDELLA
Università di Roma “La Sapienza”
ITALY


Antonello Maruotti

Università di Roma LUMSA
ITALY

MONIA RANALLI
Università di Roma Tor Vergata
ITALY

ROBERTO ROCCI
Università di Roma Tor Vergata
ITALY
Università di Roma “La Sapienza”
ITALY

Methodologies for mixture modeling

TBA

Organizer: Francesca Greselin, Università Milano Bicocca, ITALY

Michael Fop

University College Dublin
IRELAND

RICCARDO RASTELLI
University College Dublin
IRELAND


Geoffrey McLachlan

University of Queensland
AUSTRALIA


Luca Scrucca

Università di Perugia
ITALY

Mathematical theory in clustering

The area of cluster analysis is very heterogeneous. It is not defined by a unified mathematical approach but rather by a class of problems that have been approached in very diverse ways. It has proven difficult to investigate the performance of clustering methods mathematically. The session is open to approaches such as model-based asymptotic theory, axiomatic theory and nonparametric theory (“performance guarantees”). The limits of mathematical analysis of clustering will also be discussed.

Organizer: Christian Hennig, Università di Bologna, ITALY
Discussant: Claudio Agostinelli, Università di Trento, ITALY

Alessandro Casa

Università di Padova
ITALY


Christine Keribin

Université Paris-Sud
FRANCE


Philipp Thomann

D One, Zuerich
SWITZERLAND

INGO STEINWART
University of Stuttgart
GERMANY

NICO SCHMID
University of Stuttgart
GERMANY

Unsupervised Methods for Mixed-type Data

Most unsupervised learning methods that aim at synthesizing data, via clustering or dimension reduction, deal with continuous-only or categorical-only attributes. In real applications, however, observations are described by a combination of attributes of different nature. A common practice is to pre-process the attributes, recoding them to the same type, and then apply a suitable method. A more refined solution is to apply unsupervised methods natively designed to cope with mixed-type data: extant methods aim to prevent the loss of information that may be due to recoding, and to balance the importance of the information carried by categorical and continuous attributes.

Organizer: Alfonso Iodice d’Enza, Università di Napoli Federico II, ITALY


Angelos Markos

Democritus University of Thrace
GREECE


Monia Ranalli

Università di Roma Tor Vergata
ITALY

Michel van de Velden

Erasmus University of Rotterdam
HOLLAND

Large scale data analysis

TBA

Organizer and Chair: Michele La Rocca, Università di Salerno, ITALY
Discussant: Francesco Palumbo, Università di Napoli Federico II, ITALY


Pietro Liò

University of Cambridge
UK


Matteo Magnani

Uppsala University
SWEDEN


Maria Lucia Parrella

Università di Salerno
ITALY

FRANCESCO GIORDANO
Università di Salerno
ITALY

SOUMENDRA LAHIRI
North Carolina State University
US

Clustering time dependent data

Clustering is a widely used statistical tool to determine subsets in a given data set. Frequently used clustering methods are mostly based on distance measures and cannot easily be extended to cluster time series within a longitudinal or time series data set. The session aims at introducing cutting-edge approaches to model-based clustering of longitudinal data or time series based on finite mixture models and extensions. Approaches suitable both for continuous and for categorical observations are discussed. Bayesian and frequentist estimation methods are proposed.

Organizer: Antonello Maruotti, Università di Roma LUMSA, ITALY


Timo Adam

Bielefeld University
GERMANY


Gianluca Mastrantonio

Politecnico di Torino
ITALY

Francesco Dotto

Università di Roma Tre
ITALY

New trends in robust estimation and data analysis

TBA

Organizers: Andrea Cerioli, Università di Parma, ITALY – Agustín Mayo Iscar, Universidad de Valladolid, SPAIN


Luca Bagnato

Università Cattolica del Sacro Cuore
ITALY


Domenico Perrotta

Joint Research Centre of the European Commission
ITALY


Peter Rousseeuw

KU Leuven
BELGIUM

Regularization for mixture estimation based on constraints

TBA

Organizer: Agustín Mayo Iscar, Universidad de Valladolid, SPAIN


Andrea Cappozzo

Università Milano Bicocca
ITALY


Roberto di Mari

Università di Catania
ITALY


Marco Riani

Università di Parma
ITALY

Robust procedures based on trimming and forward search

TBA

Organizers: Andrea Cerioli, Università di Parma, ITALY – Agustín Mayo Iscar, Universidad de Valladolid, SPAIN


Anthony C. Atkinson

London School of Economics
UK


Aldo Corbellini

Università di Parma
ITALY


Luca Greco

Università del Sannio
ITALY

Recent methodological advances in finite mixture modeling with applications

The field of finite mixture modeling has been developing intensively. The proposed session will cover some of the latest advances in related methodology and illustrate them on synthetic and real-life data sets. The first talk is devoted to the application of goodness-of-fit statistics for choosing a consistent root for mixture models. The second talk discusses the role of finite mixture models in modeling spatio-temporal processes. Finally, the third talk is dedicated to modeling high-dimensional skewed data.

Organizer: Volodymyr Melnykov, University of Alabama, US


Michael Gallaugher

McMaster University
CANADA


Garritt Page

Brigham Young University
US


Weixin Yao

University of California, Riverside
US

Semiparametric methods for hierarchical data

Hierarchical structure (e.g., students nested within classes, which are in turn nested within schools; patients nested within hospitals, which are nested within healthcare districts) is one of the main characteristics of administrative data. Mixed-effects models, which can take the nested nature of the data into account, are therefore among the most useful statistical models to investigate, in particular when the usual unrealistic parametric assumptions are discarded.

Organizer: Anna Maria Paganoni, Politecnico di Milano, ITALY


Chiara Masci

Politecnico di Milano
ITALY


Carla Rampichini

Università di Firenze
ITALY


Linda Sharples

London School of Hygiene and Tropical Medicine
UK

Directional data analysis

Directional data arise when measurements are directions, and find application in many scientific fields such as bioinformatics and machine learning. They have received a lot of attention over the past two decades, and are still attracting increasing interest from scientists. The sample space is the (hyper)sphere, so the analysis of such data relies on specific statistical models and procedures that take their peculiar features into account. This session focuses on the latest advances in methodologies and applications in this research field.

Organizer: Giuseppe Pandolfo, Università di Napoli Federico II, ITALY


Marco Di Marzio

Università “G. d’Annunzio” Chieti – Pescara
ITALY

Eduardo García-Portugués

University Carlos III, Madrid
SPAIN


Francesco Lagona

Università di Roma Tre
ITALY

New classification methods in economics and business

The aim of the proposed session is to present and discuss new approaches to data analysis and classification problems in economics, management and business. The novelty lies both in the research methodology and in its areas of application in empirical investigations.

Organizer: Józef Pociecha, Uniwersytet Ekonomiczny w Krakowie, POLAND


Lucie Kopecká

University of Pardubice
CZECH REPUBLIC

VIERA PACAKOVA
University of Pardubice
CZECH REPUBLIC


Viera Labudová

Bratislava University of Economics
SLOVAKIA

LUBICA SIPKOVA
Bratislava University of Economics
SLOVAKIA


Paweł Lula

Cracow University of Economics
POLAND

S. BELOV
Plekhanov Russian University of Economics, Moscow
RUSSIAN FEDERATION

SALVATORE INGRASSIA
Università di Catania
ITALY

ZORAN KALINIC
University of Kragujevac
SERBIA

Multi-block data analysis: methods and applications

Multiblock analysis is an extension of multivariate analysis tailored to the study of complex data sets partitioned into blocks with at least one mode or dimension in common. The aim of the analysis is to find the underlying relationships between the various blocks, assuming that they are interrelated. The session will cover both the exploratory and the supervised multiblock approach. The first corresponds to the situation in which all the blocks are connected to all other blocks. The second assumes a specific pattern of directed relations between the various blocks. The session will present both current research and applications.

Organizer: Rosaria Romano, Università di Napoli Federico II, ITALY

Rasmus Bro

University of Copenhagen
DENMARK


Pasquale Dolce

Università di Napoli Federico II
ITALY

CRISTINA DAVINO
Università di Napoli Federico II
ITALY

STEFANIA TARALLI
ISTAT
ITALY

DOMENICO VISTOCCO
Università di Napoli Federico II
ITALY


Mohamed Hanafi

Oniris, StatSC, Nantes
FRANCE

Compositional data analysis in practice

TBA

Organizer: Valentin Todorov, UNIDO Vienna, AUSTRIA


Michele Gallo

Università di Napoli “L’Orientale”
ITALY


Karel Hron

Palacký University Olomouc
CZECH REPUBLIC


Marco Antonio Cruz

Universitat Politècnica de Catalunya
SPAIN

MARIBEL ISABEL ORTEGO
Universitat Politècnica de Catalunya
SPAIN

ELISABET ROCA
Universitat Politècnica de Catalunya
SPAIN

Modern Likelihood Methods

In many modern applications, the likelihood function is difficult or even impossible to evaluate. Frequent computational difficulties are associated with the inversion of covariance matrices, the approximation of multidimensional integrals, and the handling of a large number of nuisance components. This session will discuss some modifications of the likelihood function designed to handle the nonstandard situations that increasingly appear in applications.

Organizer: Cristiano Varin, Università Ca’ Foscari Venezia, ITALY

Ruggero Bellio

Università di Udine
ITALY

Nicola Lunardon

Università di Milano Bicocca
ITALY

Riccardo Rastelli

University College Dublin
IRELAND

Dimensionality reduction and model based synthesis of indicators

TBA

Organizer: Maurizio Vichi, Università di Roma “La Sapienza”, ITALY


Leonardo Salvatore Alaimo

Università di Roma “La Sapienza”
ITALY

FILOMENA MAGGINO
Università di Roma “La Sapienza”
ITALY


Paolo Giordani

Università di Roma “La Sapienza”
ITALY

ROBERTO ROCCI
Università di Roma Tor Vergata
ITALY
Università di Roma “La Sapienza”
ITALY

GIUSEPPE BOVE
Università Roma Tre
ITALY


Carlo Cavicchia

Università di Roma “La Sapienza”
ITALY

MAURIZIO VICHI
Università di Roma “La Sapienza”
ITALY

GIORGIA ZACCARIA
Università di Roma “La Sapienza”
ITALY

Clustering methods for high dimensional data

The rapid technological change that characterizes our contemporary society has led to an exponentially increasing availability of information, which is very often high-dimensional. Traditional clustering methods might become unfeasible in such cases and new developments are required. The first talk will review and propose recent strategies for unsupervised classification of high-dimensional data in the context of model-based clustering. The second presentation will focus on random projections and on the proposal of a criterion to select the best random projections for clustering. The third talk will be devoted to robust data analysis methods for clustering very noisy microarray data; in particular, a robust patient subtyping method that scales well with the typical dimension of microarray data is presented.

Organizer: Cinzia Viroli, Università di Bologna, ITALY


Pietro Coretto

Università di Salerno
ITALY

Claire Gormley

University College Dublin
IRELAND


Francesca Fortunato

Università di Bologna
ITALY

Archetypal analysis and related methods

TBA

Organizer: Domenico Vistocco, Università di Napoli Federico II, ITALY

Irene Epifanio López

Universitat Jaume I
SPAIN


Francesco Palumbo

Università di Napoli Federico II
ITALY


Sohan Seth

University of Edinburgh
SCOTLAND

Data analysis for complex networks

The study of complex networks is an active area of research inspired by the study of real-world networks in different application fields. The focus of the session is to discuss relevant results and methodological developments in Network Analysis in the presence of complex network data structures, such as multiple relations measured on the same set of nodes, symbolic objects, and longitudinal information.

Organizer & Chair: Maria Prosperina Vitale, Università di Salerno, ITALY
Discussant: Matteo Magnani, Uppsala University, SWEDEN


Domenico De Stefano

Università di Trieste
ITALY

GIUSEPPE GIORDANO
Università di Salerno
ITALY

SUSANNA ZACCARIN
Università di Trieste
ITALY


Luka Kronegger

University of Ljubljana
SLOVENIA


Leone Valerio Sciabolazza

Università di Napoli “Parthenope”
ITALY

TILL KRENZ
Bureau of Economic and Business Research, Florida
US

RAFFAELE VACCA
University of Florida
US

Engendering statistics in social studies

TBA

Organizer: Mariangela Zenga, Università Milano Bicocca, ITALY


Franca Crippa

Università di Milano Bicocca
ITALY

FULVIA MECATTI
Università di Milano Bicocca
ITALY

GAIA BERTARELLI
Università di Pisa
ITALY


Adele H. Marshall

Queen’s University Belfast
UK

AGLAIA KALAMATIANOU
Panteion University
GREECE


Ilaria Primerano

Università di Salerno
ITALY

MARIALUISA RESTAINO
Università di Salerno
ITALY

Applications and Concepts of Classification – Selected papers by the GfKl – Data Science Society

Research in classification is heavily driven by applications as well as by strengthening conceptual ideas. This session provides a cross-sectional snapshot of recent research by members of the German Classification Society (GfKl). It sheds light on the intertwined relationships between application areas such as marketing and genomics and the mathematical concepts of symmetry and invariance.

Organizer: Adalbert Wilhelm, Jacobs University, GERMANY


Andreas Geyer-Schulz

Institute of Information Systems and Marketing
GERMANY


Hans A. Kestler

Universität Ulm
GERMANY


Friederike Paetz

Technical University of Clausthal
GERMANY