How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods only can handle the images represented by single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. It is inappropriate for the traditionally concatenating schema to link features of different views into a long vector. The reason is each view has its specific statistical property and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternative optimization is designed to solve the objective functions of HD-MSL and obtain different views on coefficients and classification scores simultaneously. Experiments on two real world datasets demonstrate the effectiveness of HD-MSL in image classification.
Networks, as efficient representations of complex systems, have appealed to scientists for a long time and now permeate many areas of science, including neuroimaging (Bullmore and Sporns 2009 Nat. Rev. Neurosci.10, 186–198. (doi:10.1038/nrn2618)). Traditionally, the structure of complex networks has been studied through their statistical properties and metrics concerned with node and link properties, e.g. degree-distribution, node centrality and modularity. Here, we study the characteristics of functional brain networks at the mesoscopic level from a novel perspective that highlights the role of inhomogeneities in the fabric of functional connections. This can be done by focusing on the features of a set of topological objects—homological cycles—associated with the weighted functional network. We leverage the detected topological information to define the homological scaffolds, a new set of objects designed to represent compactly the homological features of the correlation network and simultaneously make their homological properties amenable to networks theoretical methods. As a proof of principle, we apply these tools to compare resting-state functional brain activity in 15 healthy volunteers after intravenous infusion of placebo and psilocybin—the main psychoactive component of magic mushrooms. The results show that the homological structure of the brain's functional patterns undergoes a dramatic change post-psilocybin, characterized by the appearance of many transient structures of low stability and of a small number of persistent ones that are not observed in the case of placebo.
We define a new topological summary for data that we call the persistence landscape. Since this summary lies in a vector space, it is easy to combine with tools from statistics and machine learning, in contrast to the standard topological summaries. Viewed as a random variable with values in a Banach space, this summary obeys a strong law of large numbers and a central limit theorem. We show how a number of standard statistical tests can be used for statistical inference using this summary. We also prove that this summary is stable and that it can be used to provide lower bounds for the bottleneck and Wasserstein distances.
Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features (2000) as one varies a tuning parameter. Features with short lifetimes are informally considered to be “topological noise,” and those with a long lifetime are considered to be “topological signal.” In this paper, we bring some statistical ideas to persistent homology. In particular, we derive confidence sets that allow us to separate topological signal from topological noise.
Data-driven discovery in complex neurological disorders has potential to extract meaningful syndromic knowledge from large, heterogeneous data sets to enhance potential for precision medicine. Here we describe the application of topological data analysis (TDA) for data-driven discovery in preclinical traumatic brain injury (TBI) and spinal cord injury (SCI) data sets mined from the Visualized Syndromic Information and Outcomes for Neurotrauma-SCI (VISION-SCI) repository. Through direct visualization of inter-related histopathological, functional and health outcomes, TDA detected novel patterns across the syndromic network, uncovering interactions between SCI and co-occurring TBI, as well as detrimental drug effects in unpublished multicentre preclinical drug trial data in SCI. TDA also revealed that perioperative hypertension predicted long-term recovery better than any tested drug after thoracic SCI in rats. TDA-based data-driven discovery has great potential application for decision-support for basic research and clinical problems such as outcome assessment, neurocritical care, treatment planning and rapid, precision-diagnosis.