Causal networks are used to describe and to discover causal relationships among variables and data-generating mechanisms. Many approaches have been proposed for learning a global causal network over all observed variables. In many applications, however, we are interested only in the effects of a specified cause variable and in the causal paths from that cause variable to its effects. Instead of learning a global causal network, we propose several local learning approaches for finding all effects (or descendants) of the specified cause variable and the causal paths from the cause variable to some effect variable of interest. We discuss the identifiability of the effects and the causal paths from observed data and prior knowledge. When the causal paths are not identifiable, our approaches try to find a path set that contains the causal paths of interest.
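The two local queries the abstract names — all descendants of a cause variable, and the directed paths from the cause to a chosen effect — can be sketched on a known causal DAG with plain graph traversal. This is a minimal illustration, not the paper's local learning method (which works from data rather than a given graph); all names are hypothetical.

```python
from collections import deque

def descendants(graph, cause):
    """Return all effects (descendants) of `cause` in a causal DAG
    given as an adjacency dict {node: [children, ...]}."""
    seen = set()
    queue = deque(graph.get(cause, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen

def causal_paths(graph, cause, effect):
    """Enumerate all directed paths from `cause` to `effect`
    (the candidate causal paths) by depth-first search."""
    paths = []
    def dfs(node, path):
        if node == effect:
            paths.append(path)
            return
        for child in graph.get(node, []):
            if child not in path:  # DAG, but guard against revisits
                dfs(child, path + [child])
    dfs(cause, [cause])
    return paths

# Toy causal network: X -> A -> Y, X -> B -> Y, B -> C
g = {"X": ["A", "B"], "A": ["Y"], "B": ["Y", "C"]}
print(sorted(descendants(g, "X")))   # ['A', 'B', 'C', 'Y']
print(causal_paths(g, "X", "Y"))     # [['X', 'A', 'Y'], ['X', 'B', 'Y']]
```

When the graph is only partially identifiable, the returned path set would over-approximate the true causal paths, matching the abstract's notion of a path set containing the paths of interest.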
Location-based social networks (LBSNs) such as Foursquare offer a platform for users to share and be aware of each other's physical movements. As a result of sharing check-in information, users can be influenced to visit (or check in at) the locations visited by their friends. Quantifying such influences in LBSNs is useful in various settings such as location promotion, personalized recommendation, and mobility pattern prediction. In this paper, we develop a model to quantify the influence specific to a location between a pair of users. Specifically, we develop a model called LoCaTe that combines (a) a user mobility model based on kernel density estimates; (b) a model of the semantics of the location using topic models; and (c) a model of inter-check-in time using an exponential distribution. We show the applicability of LoCaTe for location promotion and location recommendation tasks using LBSNs. Our model is validated using a long-term crawl of Foursquare data collected between Jan 2015 and Feb 2016, as well as other publicly available LBSN datasets. Our experiments demonstrate the efficacy of LoCaTe in capturing location-specific influence between users. We also show that LoCaTe improves over state-of-the-art models for the coarse-grained task of location promotion.
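The three LoCaTe components can be sketched roughly as follows. The function names, the Gaussian kernel, and the product-style combination are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import math

def kde_density(point, checkins, bandwidth=0.01):
    """Gaussian kernel density estimate of a user's mobility at
    `point` = (lat, lon), built from their past check-in coordinates."""
    px, py = point
    total = 0.0
    for cx, cy in checkins:
        d2 = (px - cx) ** 2 + (py - cy) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    norm = len(checkins) * 2 * math.pi * bandwidth ** 2
    return total / norm

def exp_intercheckin(dt_hours, rate=1.0 / 24):
    """Exponential density for the time gap between a friend's
    check-in and the user's check-in at the same location."""
    return rate * math.exp(-rate * dt_hours)

def locate_score(point, checkins, topic_sim, dt_hours):
    """Toy combination of the three components as a product;
    `topic_sim` stands in for a topic-model similarity between the
    location's semantics and the user's interests."""
    return kde_density(point, checkins) * topic_sim * exp_intercheckin(dt_hours)
```

A higher score would indicate a stronger location-specific influence signal; the paper's actual combination rule and parameter estimation are more involved than this product.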
Viewpoint estimation from 2D rendered images is helpful in understanding how users select viewpoints for volume visualization and guiding users to select better viewpoints based on previous visualizations. In this paper, we propose a viewpoint estimation method based on Convolutional Neural Networks (CNNs) for volume visualization. We first design an overfit-resistant image rendering pipeline to generate the training images with accurate viewpoint annotations, and then train a category-specific viewpoint classification network to estimate the viewpoint for the given rendered image. Our method can achieve good performance on images rendered with different transfer functions and rendering parameters in several categories. We apply our model to recover the viewpoints of the rendered images in publications, and show how experts look at volumes. We also introduce a CNN feature-based image similarity measure for similarity voting based viewpoint selection, which can suggest semantically meaningful optimal viewpoints for different volumes and transfer functions.
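The similarity-voting viewpoint selection described above can be sketched as follows: each reference rendering votes for its annotated viewpoint, weighted by the similarity of its CNN features to the query's. Cosine similarity and the function names here are assumptions for illustration; the paper's CNN feature extraction is not reproduced.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two CNN feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def vote_viewpoint(query_feats, ref_feats, ref_views):
    """Similarity voting: each reference image votes for its annotated
    viewpoint, weighted by feature similarity to the query renderings;
    the viewpoint with the largest total vote is suggested."""
    votes = {}
    for q in query_feats:
        for f, v in zip(ref_feats, ref_views):
            votes[v] = votes.get(v, 0.0) + cosine_sim(q, f)
    return max(votes, key=votes.get)
```

In the paper's setting, `query_feats` would be features of renderings of the new volume under its transfer function, and the winning viewpoint label would be suggested as the semantically meaningful optimal viewpoint.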
Effective network intrusion detection techniques are required to thwart evolving cybersecurity threats. Historically, traditional enterprise networks have been researched extensively in this regard. However, the cyber threat landscape has grown to include wireless networks. In this paper, the authors present a novel model that can be trained on completely different feature sets and applied to two distinct intrusion detection applications: traditional enterprise networks and 802.11 wireless networks. This is the first method that demonstrates superior performance in both aforementioned applications. The model is based on a one-versus-all (OVA) binary framework comprising multiple nested sub-ensembles. To provide good generalization ability, each sub-ensemble contains a collection of sub-learners, and only a portion of the sub-learners implement boosting. A class weight based on the sensitivity metric (true positive rate), learned from the training data only, is assigned to the sub-ensembles of each class. The use of pruning to remove sub-learners that do not contribute to or have an adverse effect on overall system performance is investigated as well. The results demonstrate that the proposed system can achieve exceptional performance in applications to both traditional enterprise intrusion detection and 802.11 wireless intrusion detection.
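The sensitivity-based class weighting can be sketched as follows: each class's one-versus-all sub-ensemble receives a weight equal to its true positive rate measured on training data only, and the final decision picks the class with the largest weighted score. This is a minimal sketch of the weighting idea; the nested sub-ensembles, boosting, and pruning are not modeled, and the function names are hypothetical.

```python
def sensitivity_weights(y_true, y_pred, classes):
    """Per-class weight = sensitivity (true positive rate) of the
    class's one-versus-all sub-ensemble on the training data."""
    weights = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        pos = sum(1 for t in y_true if t == c)
        weights[c] = tp / pos if pos else 0.0
    return weights

def ova_predict(scores, weights):
    """Weighted one-versus-all decision: each binary sub-ensemble
    emits a score for its class; pick the class with the largest
    sensitivity-weighted score."""
    return max(scores, key=lambda c: scores[c] * weights.get(c, 0.0))
```

Weighting by training-set sensitivity downweights classes whose sub-ensembles are unreliable, which is one plausible reading of how the class weights steer the overall decision.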
Recently, co-saliency detection, which aims to automatically discover common and salient objects appearing in several relevant images, has attracted increasing interest in the computer vision community. In this paper, we present a novel graph-matching-based model for co-saliency detection in image pairs. A graph-matching solution is proposed to integrate visual appearance, saliency coherence, and spatial structural continuity for detecting co-saliency collaboratively. Since saliency and visual similarity are seamlessly integrated, this joint inference scheme is able to produce more accurate and reliable results. More concretely, the proposed model first computes the intra saliency for each image by aggregating multiple saliency cues. The common and salient regions across the images are then discovered via a graph-matching procedure. A graph reconstruction scheme is then proposed to refine the intra saliency iteratively. Compared to existing co-saliency detection methods that only utilize visual appearance cues, our model can effectively exploit both visual appearance and structure information to better guide co-saliency detection. Extensive experiments on several challenging image pair databases demonstrate that our model significantly outperforms state-of-the-art baselines.
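The core matching step can be approximated very simply: match regions across the image pair by minimizing a cost that combines appearance distance and the gap between intra-saliency values. The sketch below uses brute-force linear assignment over small region sets; the paper's graph matching additionally encodes spatial structural continuity and is solved differently, so this is only an illustration of the joint appearance-plus-saliency idea.

```python
import itertools
import numpy as np

def match_regions(feat_a, feat_b, sal_a, sal_b):
    """Match regions of image A to regions of image B by minimizing
    appearance distance plus intra-saliency difference (brute-force
    assignment, suitable only for small region counts)."""
    n = len(feat_a)
    app = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    cost = app + np.abs(sal_a[:, None] - sal_b[None, :])
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(cost[i, p[i]] for i in range(n)))
    return list(enumerate(best))  # pairs (region in A, matched region in B)
```

Matched high-saliency region pairs would then be taken as co-salient, and their agreement fed back to refine each image's intra saliency.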
The proliferation of fake news on social media has opened up new directions of research for the timely identification and containment of fake news and the mitigation of its widespread impact on public opinion. While much of the earlier research focused on identifying fake news using content-based solutions, which determine the truthfulness of a piece of news from its text content alone, or using feedback-based solutions, which exploit users' activities towards the news on social media such as propagation patterns or comments, there has been rising interest in active intervention strategies to counter the spread of misinformation and its impact on society. In this survey, we describe the problem of fake news and the technical challenges associated with its identification and mitigation. We present an overview of existing methods and techniques applicable to fake news identification and mitigation, along with insights into the significant advances in various methods and the practical advantages and limitations of each. Further, we enumerate a list of challenges and open problems that outline new directions of research, and provide a comprehensive list of available datasets with a summary of their characteristic features, in order to facilitate future research and enable the development of solutions that are interdisciplinary and effective in practice.
In this paper, we study the problem of online heterogeneous transfer learning, where the objective is to make predictions for a target data sequence arriving in an online fashion, and some offline labeled instances from a heterogeneous source domain are provided as auxiliary data. The feature spaces of the source and target domains are completely different, thus the source data cannot be used directly to assist the learning task in the target domain. To address this issue, we take advantage of unlabeled co-occurrence instances as intermediate supplementary data to connect the source and target domains, and perform knowledge transition from the source domain into the target domain. We propose a novel online heterogeneous transfer learning algorithm called Online Heterogeneous Knowledge Transition (OHKT) for this purpose. In OHKT, we first seek to generate pseudo labels for the co-occurrence data based on the labeled source data, and then develop an online learning algorithm to classify the target sequence by leveraging the co-occurrence data with pseudo labels. Experimental results on real-world data sets demonstrate the effectiveness and efficiency of the proposed algorithm.
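The two OHKT stages named above — pseudo-labeling co-occurrence data from the labeled source domain, then learning online on the target sequence — can be sketched as follows. Nearest-neighbor labeling and a perceptron are stand-ins chosen for brevity; the paper's actual labeling rule and online learner may differ, and all names are hypothetical.

```python
import numpy as np

def pseudo_label(co_source, source_x, source_y):
    """Assign each co-occurrence instance (its source-domain view) the
    label of its nearest labeled source instance."""
    labels = []
    for x in co_source:
        d = np.linalg.norm(source_x - x, axis=1)
        labels.append(source_y[int(np.argmin(d))])
    return np.array(labels)

class OnlinePerceptron:
    """Online classifier for the target sequence, warm-started on the
    target-domain view of the pseudo-labeled co-occurrence data."""
    def __init__(self, dim):
        self.w = np.zeros(dim)

    def fit_offline(self, X, y):
        """Warm start: one online pass over pseudo-labeled data (y in {-1,+1})."""
        for x, t in zip(X, y):
            self.update(x, t)

    def predict(self, x):
        return 1 if self.w @ x >= 0 else -1

    def update(self, x, t):
        """Standard perceptron update on a mistake."""
        if self.predict(x) != t:
            self.w += t * x
```

Because the co-occurrence instances have both a source-domain and a target-domain view, the pseudo labels obtained in the source feature space can supervise a learner that operates entirely in the target feature space, bridging the heterogeneous domains.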
With the increasing demand for using 3D mesh data over networks, supporting effective compression and efficient transmission of meshes has attracted considerable attention in recent years. This paper introduces a novel compression method for 3D mesh animation sequences, supporting user-defined and progressive transmission over networks. Our motion-aware approach starts with clustering animation frames based on their motion similarities, dividing a mesh animation sequence into fragments of varying lengths. This is done by a novel temporal clustering algorithm, which measures motion similarity based on the curvature and torsion of a space curve formed by corresponding vertices along a series of animation frames. We further segment each cluster based on mesh vertex coherence, representing topological proximity within an object under certain motion. To produce a compact representation, we perform intra-cluster compression based on the Graph Fourier Transform (GFT) and Set Partitioning In Hierarchical Trees (SPIHT) coding. Optimized compression results can be achieved by applying the GFT due to the proximity in vertex position and motion. We adapt SPIHT to support progressive transmission and design a mechanism to transmit mesh animation sequences with user-defined quality. Experimental results show that our method can obtain a high compression ratio while maintaining a low reconstruction error.
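The GFT step at the heart of the intra-cluster compression can be sketched as follows: project a per-vertex signal onto the eigenvectors of the graph Laplacian and keep only the low-frequency coefficients. This is a generic GFT illustration, not the paper's pipeline (which couples the GFT with SPIHT coding); the function name and the simple truncation rule are assumptions.

```python
import numpy as np

def gft_compress(adj, signal, k):
    """Graph Fourier Transform compression: keep the k lowest-frequency
    Laplacian-eigenvector coefficients of a per-vertex signal."""
    L = np.diag(adj.sum(axis=1)) - adj     # combinatorial graph Laplacian
    _, U = np.linalg.eigh(L)               # columns = GFT basis, ascending frequency
    coeffs = U.T @ signal                  # forward GFT
    kept = np.zeros_like(coeffs)
    kept[:k] = coeffs[:k]                  # truncate high frequencies
    return U @ kept                        # inverse GFT = reconstruction

# Path graph 0-1-2 carrying a smooth (linear) vertex signal: because
# the signal is smooth on the graph, few coefficients suffice.
adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
x = np.array([1.0, 2.0, 3.0])
rec = gft_compress(adj, x, k=2)
```

Vertex positions and motions within a coherent cluster vary smoothly over the mesh graph, which is exactly the regime where truncating high-frequency GFT coefficients loses little information.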
Detecting abnormal behaviors of students in time and providing personalized intervention and guidance at an early stage are important in educational management. Academic performance prediction is an important building block for enabling such pre-intervention and guidance. Most previous studies are based on questionnaire surveys and self-reports, which suffer from small sample sizes and social desirability bias. In this paper, we collect longitudinal behavioral data from 6,597 students' smart cards and propose three major types of discriminative behavioral factors: diligence, orderliness, and sleep patterns. Empirical analysis demonstrates that these behavioral factors are strongly correlated with academic performance. Furthermore, motivated by social influence theory, we analyze the correlation of each student's academic performance with that of his/her behaviorally similar students. Statistical tests indicate this correlation is significant. Based on these factors, we further build a multi-task predictive framework based on a learning-to-rank algorithm for academic performance prediction. This framework captures inter-semester correlation and inter-major correlation, and integrates student similarity to predict students' academic performance. Experiments on a large-scale real-world dataset show the effectiveness of our methods for predicting academic performance and the effectiveness of the proposed behavioral factors.
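The learning-to-rank idea can be sketched with a minimal pairwise scheme: for every pair of students where one is ranked better, a logistic loss pushes the better student's score above the other's. This RankNet-style sketch on a linear scorer is an illustrative stand-in, not the paper's multi-task framework; the function names are hypothetical.

```python
import math

def pairwise_rank_updates(features, ranks, lr=0.1, epochs=50):
    """Minimal pairwise learning-to-rank sketch: gradient descent on a
    logistic loss over student pairs, with a linear scoring function.
    `ranks` uses 1 for the best-performing student."""
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(epochs):
        for i in range(len(features)):
            for j in range(len(features)):
                if ranks[i] < ranks[j]:  # student i ranked better than j
                    diff = [a - b for a, b in zip(features[i], features[j])]
                    s = sum(wk * dk for wk, dk in zip(w, diff))
                    g = -1.0 / (1.0 + math.exp(s))  # d/ds log(1 + e^{-s})
                    w = [wk - lr * g * dk for wk, dk in zip(w, diff)]
    return w

def score(w, x):
    """Predicted ranking score for a student's behavioral features."""
    return sum(wk * xk for wk, xk in zip(w, x))
```

Here the feature vector would hold behavioral factors such as diligence, orderliness, and sleep-pattern statistics; ranking (rather than regressing raw grades) matches the paper's framing of performance prediction as an ordering problem.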
Hidden common causes make it difficult to infer causal relationships from observational data. Here, we consider a new method to account for a hidden common cause that infers its presence from the data. As with other approaches that can account for common causes, this approach is successful only in some cases. We describe such a case taken from the field of genomics, wherein one tries to identify which genomic markers causally influence a trait of interest.