Learning urban community structures refers to the effort of quantifying, summarizing, and representing an urban community's (i) static structures, e.g., Point-of-Interest (POI) buildings and their geographic allocations, and (ii) dynamic structures, e.g., human mobility patterns among POIs. By learning the community structures, we can better represent urban communities quantitatively and understand their evolution as cities develop. This can help us boost commercial activities, enhance public security, foster social interactions, and, ultimately, yield livable, sustainable, and viable environments. However, due to the complex nature of urban systems, it is traditionally challenging to learn the structures of urban communities. To address this problem, in this paper, we propose a collective embedding framework to learn the community structure from multiple periodic spatial-temporal graphs of human mobility. Specifically, we first exploit a probabilistic propagation based approach to create a set of mobility graphs from periodic human mobility records. In these mobility graphs, the static POIs are regarded as vertices, the dynamic mobility connectivity between POI pairs is regarded as edges, and the edge weights periodically evolve over time. A collective deep auto-encoder method is then developed to collaboratively learn the embeddings of POIs from multiple spatial-temporal mobility graphs. In addition, we develop an Unsupervised Graph based Weighted Aggregation (UGWA) method to align and aggregate the POI embeddings into the representation of the community structure. As an application, we apply the proposed embedding framework to rank high-rated residential communities and thereby evaluate the performance of our method. Extensive experimental results on real-world urban communities and human mobility data demonstrate the effectiveness of the proposed collective embedding framework.
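To make the notion of periodic mobility graphs concrete, the following is a minimal sketch that builds one weighted graph per time slot, with POIs as vertices and trips between POI pairs as weighted edges. It uses plain trip counting as a stand-in; it is not the probabilistic propagation approach of the paper. The function names, the `day_part` slotting, and the toy trip records are all hypothetical.

```python
from collections import defaultdict

def build_mobility_graphs(trips, slot_of):
    """Return {slot: {(origin_poi, destination_poi): weight}} from trip records.

    Each trip is (origin_poi, destination_poi, hour). Edge weights are simple
    trip counts (an assumption; the paper uses probabilistic propagation).
    """
    graphs = defaultdict(lambda: defaultdict(float))
    for origin, destination, hour in trips:
        graphs[slot_of(hour)][(origin, destination)] += 1.0
    return {slot: dict(edges) for slot, edges in graphs.items()}

def day_part(hour):
    """Hypothetical periodic slotting: two slots per day."""
    return "morning" if hour < 12 else "afternoon"

# Made-up trip records, for illustration only.
trips = [
    ("home", "cafe", 8),
    ("home", "cafe", 9),
    ("cafe", "office", 9),
    ("office", "gym", 18),
]
graphs = build_mobility_graphs(trips, day_part)
```

Each slot yields its own edge-weight map, so the same POI pair can carry different weights at different times of day, which is the sense in which the edge weights evolve periodically.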
The analysts at a cybersecurity operations center (CSOC) analyze the alerts that are generated by intrusion detection systems (IDSs). Under normal operating conditions, sufficient numbers of analysts are available to analyze the alert workload. For the purpose of this paper, this means that the cybersecurity analysts in each shift can fully investigate each and every alert that is generated by the IDSs in a reasonable amount of time, and perform their normal tasks in a shift. Normal tasks include analysis time, time to attend training programs, report writing time, personal break time, and time to update the signatures on new patterns in alerts as detected by the IDS. There are a number of disruptive factors that occur randomly and can adversely impact the normal operating condition of a CSOC, such as 1) higher alert generation rates from a few IDSs, 2) new alert patterns that decrease the throughput of the alert analysis process, and 3) analyst absenteeism. The impact of all the above factors is that alerts wait for a long time before being analyzed, which impacts the Level of Operational Effectiveness (LOE) of the CSOC. In order to return the CSOC to normal operating conditions, the manager of a CSOC can take several actions, such as increasing the alert analysis time spent by analysts in a shift by cancelling a training program, spending some of their own time to assist the analysts in alert investigation, and calling upon the on-call analyst workforce to boost the service rate of alerts. However, additional resources are limited in quantity over a 14-day work cycle, and the CSOC manager must determine when and how much action to take in the face of uncertainty, which arises from both the intensity and the random occurrences of the disruptive factors. This decision is non-trivial and is often made in an ad hoc manner based on prior experience.
This paper develops a reinforcement learning (RL) model for optimizing the LOE throughout the entire 14-day work cycle of a CSOC in the face of uncertainties due to disruptive events. Results indicate that the RL model is able to assist the CSOC manager with a decision support tool to make better decisions than current practices in determining when and how much resource to allocate when the LOE of a CSOC deviates from the normal operating condition.
High utility sequential pattern (HUSP) mining is an emerging topic in pattern mining, and only a few algorithms have been proposed to address it. In practice, most sequence databases grow over time, and it is inefficient for existing algorithms to mine HUSPs from scratch when a database grows by only a small portion of updates. In view of this, we propose the IncUSP-Miner+ algorithm to mine HUSPs incrementally. Specifically, to avoid redundant re-computations, we propose a tighter upper bound of the utility of a sequence, called TSU (standing for Tight Sequence Utility), and then design a novel data structure, called the candidate pattern tree, to buffer the sequences whose TSU values are greater than or equal to the minimum utility threshold in the original database. Accordingly, to avoid keeping a huge amount of utility information for each sequence, a set of concise utility information is designed to be stored in each tree node. To improve the mining efficiency, several strategies are proposed to reduce the amount of computation for utility updates and the scope of database scans. Moreover, several strategies are also proposed to properly adjust the candidate pattern tree to support multiple database updates. Experimental results on real and synthetic datasets show that IncUSP-Miner+ is able to efficiently mine HUSPs incrementally.
From reducing stress and loneliness to boosting productivity and overall well-being, pets are believed to play a significant role in people's daily lives. Many traditional studies have found that frequent interactions with pets can make individuals healthier and more optimistic, and ultimately help them enjoy a happier life. However, most of those studies are not only restricted in scale but may also carry biases, because they rely on subjective self-reports, interviews, and questionnaires as their major approaches. In this paper, we leverage large-scale data collected from social media and state-of-the-art deep learning technologies to study this phenomenon in depth and breadth. Our study includes five major steps: 1) collecting timeline posts from around 20,000 Instagram users; 2) using face detection and recognition on 2 million photos to infer users' demographics, relationship status, and whether they have children; 3) analyzing a user's degree of happiness based on images and captions via smiling classification and textual sentiment analysis; 4) applying transfer learning techniques to retrain the final layer of the Inception v3 model for pet classification; and 5) analyzing the effects of pets on happiness in terms of multiple factors of user demographics. Our main results have demonstrated the efficacy of our proposed method with many new insights. We believe this method is also applicable to other domains as a scalable, efficient, and effective methodology for modeling and analyzing social behaviors and psychological well-being.
Machine learning and artificial intelligence techniques have recently been applied to construct online portfolio selection strategies. A popular, state-of-the-art family of strategies explores the reversion phenomenon through online learning algorithms and statistical prediction models. Despite achieving promising results on some benchmark datasets, these strategies often adopt a single model based on a selection criterion (e.g., breakdown point) for predicting future prices. However, such model selection is often unstable and may cause unnecessarily high variability in the final estimation, leading to poor prediction performance on real datasets and thus non-optimal portfolios. To overcome these drawbacks, in this paper, we propose to exploit the reversion phenomenon by using combination forecasting estimators, and design a novel online portfolio selection strategy, named Combination Forecasting Reversion (CFR), which outputs optimal portfolios based on the improved reversion estimator. We further present an efficient CFR implementation based on online Newton step (ONS) and online gradient descent (OGD) algorithms, and theoretically analyze regret bounds of the proposed algorithms, which guarantee that the online CFR model performs as well as the best CFR model in hindsight. We evaluate the proposed algorithms on various real markets with extensive experiments. Empirical results show that CFR can effectively overcome the drawbacks of existing reversion strategies and achieve state-of-the-art performance.
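As background for the OGD-based implementation mentioned above, the following is a minimal sketch of one generic online gradient descent step for portfolio weights, followed by a Euclidean projection back onto the probability simplex. It is not the CFR update itself: the loss (negative log return), the learning rate `eta`, and the function names are assumptions chosen for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        # Keep the threshold from the largest index that still satisfies the test.
        if ui - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def ogd_step(weights, price_relatives, eta=0.05):
    """One OGD step on the loss -log(w . x), then projection onto the simplex.

    price_relatives[i] is the ratio of the new price to the old price of asset i.
    """
    portfolio_return = sum(w * x for w, x in zip(weights, price_relatives))
    gradient = [-x / portfolio_return for x in price_relatives]
    stepped = [w - eta * g for w, g in zip(weights, gradient)]
    return project_to_simplex(stepped)

# Illustrative usage with made-up price relatives.
new_weights = ogd_step([0.5, 0.5], [1.1, 0.9])
```

The projection keeps the weights non-negative and summing to one, so the result is always a valid long-only portfolio regardless of the step size.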
The increased accessibility of urban sensor data and the popularity of social network applications are enabling the discovery of crowd mobility and personal communication patterns. However, studying the relationships centered on an individual (i.e., the egocentric relations) can be very challenging because available data may refer to direct contacts, such as phone calls between individuals, or indirect contacts, such as paired location presence. In this paper, we develop methods to integrate three facets extracted from heterogeneous urban data (timelines, calls, and locations) through a progressive visual reasoning and inspection scheme. Our approach uses a detect-and-filter scheme, such that, prior to visual refinement and analysis, a coarse detection is performed to extract the target individual and construct the timeline of the target. It then detects spatio-temporal co-occurrences or call-based contacts to develop the egocentric network of the individual. The filtering stage is enhanced with a line-based visual reasoning interface that facilitates flexible and comprehensive investigation of egocentric relationships and connections in terms of time, space, and social networks. The integrated system, RelationLines, is demonstrated using a dataset that contains taxi GPS data, cell-base mobility data, mobile calling data, microblog data, and POI data of a city with millions of citizens. We conduct three case studies to examine the effectiveness and efficiency of our system.
Deep convolutional neural networks (CNNs) have achieved remarkable success in various fields. However, training an excellent CNN is in practice a trial-and-error process that consumes a tremendous amount of time and computing resources. To accelerate the training process and reduce the number of trials, experts need to understand what has occurred in the training process and why the resulting CNN behaves as it does. However, current popular training platforms, such as TensorFlow, provide only sparse, general information, such as training/validation errors, which is far from enough to serve this purpose. To bridge this gap and help domain experts with their training tasks in a practical environment, we propose a visual analytics system, DeepTracker, to facilitate the exploration of the rich dynamics of CNN training processes and to identify the unusual patterns that are hidden in the huge volume of training logs. Specifically, we combine a hierarchical index mechanism and a set of hierarchical small multiples to help experts explore the entire training log at different levels of detail. We also introduce a novel cube-style visualization to reveal the complex correlations among multiple types of heterogeneous training data, including neuron weights, validation images, and training iterations. Three case studies are conducted to demonstrate how DeepTracker provides its users with valuable knowledge in an industry-level CNN training process, namely, in our case, training ResNet-50 on the ImageNet dataset. We show that our method can be easily applied to other state-of-the-art "very deep" CNN models.
For real-world learning tasks (e.g., classification), graph-based models are commonly used to fuse the information distributed in diverse data sources, which can be heterogeneous, redundant, and incomplete. These models represent the relations in different datasets as pairwise links. However, these links cannot deal with high-order relations which connect multiple objects (e.g., more than two patient groups admitted by the same hospital in 2014). In this paper, we propose a visual analytics approach for the classification of heterogeneous datasets using the hypergraph model. The hypergraph is an extension to traditional graphs in which a hyperedge connects multiple vertices instead of just two. We model various high-order relations in heterogeneous datasets as hyperedges and fuse different datasets with a unified hypergraph structure. The hypergraph learning algorithm is used for predicting the missing labels in the datasets. To allow users to inject their domain knowledge into the model-learning process, we augment the traditional learning algorithm in a number of ways. We also propose a set of visualizations which enable the user to construct the hypergraph structure and the parameters of the learning model interactively during the analysis. We demonstrate the capability of our approach via two real-world cases.
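The difference between a pairwise link and a hyperedge can be illustrated with a minimal sketch of a hypergraph stored as an incidence structure, in which one hyperedge may contain any number of vertices. The vertex names, hyperedge names, and helper functions are hypothetical and serve only as an illustration; the learning algorithm itself is not shown.

```python
def build_incidence(vertices, hyperedges):
    """Return H as nested dicts: H[v][e] = 1 if vertex v belongs to hyperedge e."""
    incidence = {v: {} for v in vertices}
    for edge_name, members in hyperedges.items():
        for v in members:
            incidence[v][edge_name] = 1
    return incidence

def vertex_degree(incidence, v):
    """Number of hyperedges that contain vertex v."""
    return len(incidence[v])

def edge_degree(hyperedges, e):
    """Number of vertices connected by hyperedge e."""
    return len(hyperedges[e])

# Made-up example: one hyperedge joins three patient groups at once,
# which a single pairwise link could not express.
vertices = ["group_a", "group_b", "group_c", "group_d"]
hyperedges = {
    "same_hospital_2014": {"group_a", "group_b", "group_c"},
    "same_diagnosis": {"group_c", "group_d"},
}
H = build_incidence(vertices, hyperedges)
```

Relations drawn from different datasets simply become additional hyperedges over the same vertex set, which is one way to read the idea of fusing datasets into a unified hypergraph structure.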
For 2-dimensional (2D) data, current clustering algorithms usually need to convert them to vectors in a pre-processing step, which, unfortunately, severely damages 2D spatial information and omits the inherent structures and correlations in the original data. In this paper, we develop a novel clustering method that addresses these issues to enhance the clustering capability. The proposed method mutually enhances three goals, namely seeking projections, learning manifolds, and constructing data representations, in a seamlessly integrated model. In particular, we seek two projection matrices with an optimal number of directions to project the data into optimally low-rank, noise-reduced, most expressive subspaces, in which manifolds are constructed and data representations are sought. The manifolds are adaptively updated according to the projections, and the new data representations are sought with respect to the projected data. Consequently, the learned manifolds are clean and more expressive, and the new data representations are representative and robust. Extensive experimental results have verified the effectiveness of the proposed method in clustering.
Recognizing human activities using supervised learning methods has been widely studied in the literature. However, for some applications like elderly care, which activities should be identified for analysis is very often unknown. In this paper, we focus on automatic extraction of behavioral patterns as the representations of activities from the trajectory data of an individual. The underlying challenges lie in the need to model the long-range dependency and spatio-temporal variations within the trajectory data. We propose to first represent the trajectory data using a behavior-aware flow graph, which is a probabilistic finite state automaton with its nodes and edges attributed with local behavioral features. We then identify the underlying subflows as the behavioral patterns using the kernel k-means algorithm. With the activities automatically identified, we propose a novel nominal matrix factorization method under a Bayesian framework with Lasso to extract highly interpretable daily activity routines. The performance of the proposed methodology has been compared with a number of existing methods using both synthetic and publicly available real smart home data sets, with promising results. We also discuss how the proposed unsupervised methodology can be used to support exploratory behavior analysis for elderly care.
Recommendation applications can guide users in making important life choices by referring to the activities of similar peers. For example, students making academic plans may learn from the data of similar students, while patients and their physicians may explore data from similar patients to select the best treatment. Selecting an appropriate peer group has a strong impact on the value of the guidance that can result from analyzing the peer group data. In this paper, we describe a visual interface that helps users review the similarities and differences between a seed record and a group of similar records, and refine the selection. We introduce the LikeMeDonuts, Ranking Glyph, and History Heatmap visualizations. The interface was refined through three rounds of formative usability evaluation with 12 target users, and its usefulness was evaluated by a case study with a student review manager using real student data. We describe three analytic workflows observed during use and summarize how users' input shaped the final design.
Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition but still remains an important challenge. Data-driven supervised approaches, especially the ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.
Hashing techniques have recently attracted increasing research interest in multimedia studies. Most existing hashing methods employ only a single feature for hash code learning. Multi-view data, with each view corresponding to a type of feature, generally provides more comprehensive information. How to efficiently integrate multiple views for learning compact hash codes remains challenging. In this paper, we propose a novel unsupervised hashing method, dubbed multi-view discrete hashing (MvDH), by effectively exploring multi-view data. Specifically, MvDH performs matrix factorization to generate the hash codes as the latent representations shared by multiple views, during which spectral clustering is performed simultaneously. The joint learning of hash codes and cluster labels enables MvDH to generate more discriminative hash codes, which are optimal for classification. An efficient alternating algorithm is developed to solve the proposed optimization problem with guaranteed convergence and low computational complexity. The binary codes are optimized via the discrete cyclic coordinate descent (DCC) method to reduce the quantization errors. Extensive experimental results on three large-scale benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art methods in terms of both accuracy and scalability.
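For readers unfamiliar with hash-code retrieval, the following is a minimal sketch of what a binary code and a Hamming-distance lookup look like. It binarizes a real-valued latent vector by naive sign thresholding, which is exactly the kind of quantization step that discrete optimization such as DCC is meant to improve on; it is not the MvDH algorithm. All function names and the sample vectors are hypothetical.

```python
def binarize(vector):
    """Map a real-valued latent vector to a +1/-1 code by sign thresholding."""
    return [1 if x >= 0 else -1 for x in vector]

def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)

def nearest(query_code, database_codes):
    """Index of the database code with the smallest Hamming distance to the query."""
    return min(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )

# Made-up latent vectors standing in for learned shared representations.
database = [binarize(v) for v in ([0.9, -0.2, 0.4], [-0.5, 0.7, -0.1])]
query = binarize([0.3, -1.2, 0.0])
best = nearest(query, database)
```

Because the codes are binary, comparing two items costs only a handful of bit comparisons, which is the source of the scalability that hashing methods aim for.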
Popular social media platforms can rapidly propagate vital information over social networks among a significant number of people. In this work we present D-Map+ (Diffusion Map), a novel visualization method to support exploration and analysis of social behaviors during such information diffusion and propagation on typical social media through a map metaphor. In D-Map+, users who participated in reposting (i.e., resending a message initially posted by others) one central user's posts (i.e., a series of original tweets) are collected and mapped to a hexagonal grid based on their behavior similarities and in chronological order of the repostings. With additional interaction and linking, D-Map+ is capable of providing visual profiles of the influential users, describing their social behaviors, and analyzing the evolution of significant events in social media. A comprehensive visual analysis system is developed to support interactive exploration with D-Map+. We evaluate our work with real-world social media data and find interesting patterns among users. Key players, important information diffusion paths, and interactions among social communities can be identified.
Learning-to-Rank (LtR) solutions are commonly used in large-scale information retrieval systems such as Web search engines where high-quality documents need to be returned in response to a user query within a fraction of a second. The most effective LtR algorithms, e.g., λ-MART, adopt a gradient boosting approach to build an additive ensemble of weighted regression trees. Since the required ranking effectiveness is achieved with very large ensembles, the impact on response time and query throughput of these solutions is not negligible. In this paper we propose X-CLEaVER, an iterative meta-algorithm able to build more efficient and effective ranking ensembles. X-CLEaVER interleaves the iterations of a given ensemble learning algorithm with pruning and re-weighting phases. First, redundant trees are removed from the ensemble generated, then the weights of the remaining trees are fine-tuned by optimizing the desired ranking loss function. We propose and analyse several pruning strategies and assess their benefits showing that interleaving pruning and re-weighting phases during learning is more effective than applying a single post-learning optimization step. Experiments conducted using two publicly available LtR datasets show that X-CLEaVER is very effective in optimizing λ-MART models both in terms of effectiveness and efficiency.
This paper presents a platform for interactive graph mining and relational learning called GraphVis. The platform combines interactive visual representations with state-of-the-art graph mining and relational machine learning techniques to aid in revealing important insights quickly as well as learning an appropriate and highly predictive model for a particular task (e.g., classification, link prediction, discovering the roles of nodes, finding influential nodes). Visual representations and interaction techniques and tools are developed for simple, fast, and intuitive real-time interactive exploration, mining, and modeling of graph data. In particular, we propose techniques for interactive relational learning (e.g., node/link classification), interactive link prediction and weighting, role discovery and community detection, higher-order network analysis (via graphlets, network motifs), among others. GraphVis also allows for the refinement and tuning of graph mining and relational learning methods for specific application domains and constraints via an end-to-end interactive visual analytic pipeline that learns, infers, and provides rapid interactive visualization with immediate feedback at each change/prediction in real-time. Other key aspects include interactive filtering, querying, ranking, manipulating, exporting, as well as tools for dynamic network analysis and visualization, interactive graph generators/models (including new block model approaches), and a variety of multi-level network analysis techniques.
With the rapid growth of social media, massive amounts of misinformation are also spreading widely on platforms such as Weibo and Twitter, with negative effects on human life. Automatic misinformation identification has therefore drawn attention from academic and industrial communities. Because an event on social media usually consists of multiple microblogs, current methods are mainly constructed based on global statistical features. However, information on social media is full of noise, which should be alleviated. Moreover, most microblogs about an event contribute little to the identification of misinformation, so useful information can easily be overwhelmed by useless information. Thus, it is important to mine significant microblogs for constructing a reliable misinformation identification method. In this paper, we propose an Attention-based approach for Identification of Misinformation (AIM). Based on the attention mechanism, AIM can select the microblogs with the largest attention values for misinformation identification. The attention mechanism in AIM contains two parts: content attention and dynamic attention. Content attention is calculated based on the textual features of each microblog. Dynamic attention is related to the time interval between the posting time of a microblog and the beginning of the event. To evaluate AIM, we conduct a series of experiments on the Weibo dataset and the Twitter dataset, and the experimental results show that the proposed AIM model outperforms the state-of-the-art methods.
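The idea of selecting microblogs by attention value can be sketched as follows, under loudly stated assumptions: the abstract gives no formulas, so the additive combination of a content score with a time-dependent term, the softmax normalization, the `time_weight` parameter (whose sign and magnitude are arbitrary here), and all function names are hypothetical and are not the AIM model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(content_scores, intervals, time_weight=-0.1):
    """Combine a content score with a time term, then normalize.

    intervals[i] is the time between the start of the event and microblog i.
    The linear time term is an assumption made for this sketch.
    """
    combined = [c + time_weight * t for c, t in zip(content_scores, intervals)]
    return softmax(combined)

def top_k(weights, k):
    """Indices of the k microblogs with the largest attention weights."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]

# Made-up scores for three microblogs of one event.
weights = attention_weights([2.0, 0.5, 1.0], [1.0, 5.0, 0.0])
selected = top_k(weights, 2)
```

Keeping only the highest-weighted microblogs is one simple way to prevent the many uninformative posts of an event from drowning out the few informative ones.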
In the conventional supervised learning paradigm, each data instance is associated with a single class label. Multi-label learning differs in that data instances may belong to multiple concepts simultaneously, a situation that arises naturally in a variety of high-impact domains, ranging from bioinformatics and information retrieval to multimedia analysis. It aims to leverage the multiple label information of data instances to build a predictive learning model that can classify unlabeled instances into one or multiple predefined target classes. In multi-label learning, even though each instance is associated with a rich set of class labels, the label information can be noisy and incomplete because the labeling process is both time-consuming and labor-intensive, leading to potentially missing or even erroneous annotations. The existence of noisy and missing labels can negatively affect the performance of the underlying learning algorithms. More often than not, multi-labeled data has noisy, irrelevant, and redundant features of high dimensionality. The existence of these uninformative features may also deteriorate the predictive power of the learning model due to the curse of dimensionality. Feature selection, as an effective dimensionality reduction technique, has been shown to be powerful in preparing high-dimensional data for numerous data mining and machine learning tasks. However, a vast majority of existing multi-label feature selection algorithms either boil down to solving multiple single-labeled feature selection problems or directly make use of the imperfect labels to guide the selection of representative features. As a result, they may not be able to obtain discriminative features shared across multiple labels.
In this paper, to bridge the gap between the rich source of multi-label information and its imperfections in practical use, we propose a novel noise-resilient multi-label informed feature selection framework, MIFS, which exploits the correlations among different labels. In particular, to reduce the negative effects of imperfect label information in obtaining label correlations, we decompose the multi-label information of data instances into a low-dimensional space and then employ the reduced label representation to guide the feature selection phase via a joint sparse regression framework. Empirical studies on both synthetic and real-world datasets demonstrate the effectiveness and efficiency of the proposed MIFS framework.
Making machines understand human expressions enables various useful applications in human-machine interaction. In this paper, we present a novel facial expression recognition approach with a 3D Mesh Convolutional Neural Network (3DMCNN) and a visual analytics guided 3DMCNN design and optimization scheme. From an RGBD camera, we first reconstruct a 3D face model of a subject with facial expressions and then compute the geometric properties of the surface. Instead of using a regular Convolutional Neural Network (CNN) to learn intensities of the facial images, we convolve the geometric properties on the surface of the 3D model using 3DMCNN. We design a geodesic distance-based convolution method to overcome the difficulties arising from the irregular sampling of the face surface mesh. We further present an interactive visual analytics scheme for designing and modifying the networks by analyzing the learned features and clustering similar nodes in 3DMCNN. By removing low-activity nodes in the network, the performance of the network is greatly improved. We compare our method with the regular CNN-based method by interactively visualizing each layer of the networks and analyze the effectiveness of our method by studying representative cases. In tests on public datasets, our method achieves a higher recognition accuracy than traditional image-based CNNs and other 3D CNNs. The proposed framework, including 3DMCNN and interactive visual analytics of the CNN, can be extended to other applications.
Smog causes low visibility on the road and can impact the safety of traffic. Modeling traffic in smog will have a significant impact on realistic traffic simulation. Most existing traffic models assume that drivers have optimal vision in the simulations. These simulations are not suitable for modeling smog weather conditions. In this paper, we introduce the smog full velocity difference model (SMOG-FVDM) for a realistic simulation of traffic in smog weather conditions. In this model, we present a stadia model for drivers in smog weather. We then introduce it into the car-following traffic model through "Psychological Force" and "Body Force", and thereby obtain the SMOG-FVDM. Considering that there are many parameters in the SMOG-FVDM, we design a visual verification system based on the SMOG-FVDM to find an adequate solution, which can show visual simulation results in different road scenarios and different smog degrees by reconciling the parameters. Experimental results show that our model can give a realistic and efficient traffic simulation in smog weather conditions.
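As background, the following is a minimal sketch of the standard full velocity difference car-following rule, in which a follower accelerates toward a headway-dependent optimal velocity and also reacts to the speed difference with its leader. The optional `visibility` cap on the perceived headway is a hypothetical stand-in for reduced sight in smog; it is not the stadia model or the force terms of the paper. The numeric defaults are illustrative placeholders, and the function names are assumptions.

```python
import math

def optimal_velocity(headway, v1=6.75, v2=7.91, c1=0.13, c2=1.57, car_length=5.0):
    """Tanh-shaped optimal velocity (m/s) as a function of headway (m).

    The default coefficients are illustrative placeholders.
    """
    return v1 + v2 * math.tanh(c1 * (headway - car_length) - c2)

def fvdm_acceleration(headway, speed, leader_speed,
                      kappa=0.41, lam=0.5, visibility=None):
    """Full velocity difference rule: kappa * (V(headway) - v) + lam * (v_leader - v).

    If visibility is given, the driver perceives at most that much headway
    (a hypothetical simplification for this sketch).
    """
    perceived = headway if visibility is None else min(headway, visibility)
    return (kappa * (optimal_velocity(perceived) - speed)
            + lam * (leader_speed - speed))

# A follower cruising at the optimal velocity for a 30 m headway.
v_eq = optimal_velocity(30.0)
clear_acc = fvdm_acceleration(30.0, v_eq, v_eq)
smog_acc = fvdm_acceleration(30.0, v_eq, v_eq, visibility=15.0)
```

In this sketch, a driver who is in equilibrium under clear conditions decelerates once the perceived headway is capped, which is the qualitative effect a smog-aware model needs to capture.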
When looking at an image, humans shift their attention towards interesting regions, making sequences of eye fixations. When describing an image, they also come up with simple sentences that highlight the key elements in the scene. What is the correlation between where people look and what they describe in an image? To investigate this problem intuitively, we develop a visual analytics system, CapVis, to look into eye fixations and image captions, two types of subjective annotations that are relatively task-free and natural. From the annotations, we propose a word-weighting scheme to extract visual and verbal saliency ranks to compare against each other. In our approach, a number of low-level and semantic-level features relevant to the visual-verbal saliency consistency are proposed and visualized in multiple facets for better understanding of image content. Our method also shows the different ways in which humans and computational models look and describe, which provides reliable information for the diagnosis of captioning models. Experiments also show that the visualized features can be integrated into a computational model to effectively predict the consistency between the two modalities on image datasets with both types of annotations.
This paper deals with trajectory planning that is suitable for nonholonomic differentially driven wheeled mobile robots. The path is approximated with a spline which consists of multiple Bernstein-Bézier curves that are merged together in such a way that continuous curvature of the spline is achieved. The paper presents an approach for optimizing the velocity profile of a Bernstein-Bézier spline subject to velocity and acceleration constraints. For the purpose of optimization, velocity and turning points are introduced. Based on these singular points, local segments are defined where local velocity profiles are optimized independently of each other. From the locally optimal velocity profiles, the globally optimal velocity profile is determined. The proposed optimization approach is experimentally evaluated and validated in a simulation environment and on real mobile robots.
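The link between path curvature and admissible speed can be illustrated with a minimal sketch that evaluates a planar Bernstein-Bézier curve, computes its signed curvature from the first and second derivatives, and caps the speed so that the radial acceleration v²|κ| stays below a bound. This is only the pointwise speed limit; it is not the velocity-profile optimization of the paper. The function names, the radial-acceleration bound `a_radial_max`, and the control points are assumptions.

```python
import math

def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return math.comb(n, i) * t**i * (1 - t)**(n - i)

def bezier_point(control, t):
    """Point on the Bezier curve defined by 2D control points."""
    n = len(control) - 1
    x = sum(bernstein(n, i, t) * p[0] for i, p in enumerate(control))
    y = sum(bernstein(n, i, t) * p[1] for i, p in enumerate(control))
    return (x, y)

def derivative_control(control):
    """Control points of the derivative curve: n * (P[i+1] - P[i])."""
    n = len(control) - 1
    return [(n * (q[0] - p[0]), n * (q[1] - p[1]))
            for p, q in zip(control, control[1:])]

def curvature(control, t):
    """Signed curvature (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2)."""
    d1 = derivative_control(control)
    d2 = derivative_control(d1)
    dx, dy = bezier_point(d1, t)
    ddx, ddy = bezier_point(d2, t)
    return (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

def speed_limit(control, t, v_max, a_radial_max):
    """Largest speed at parameter t with v <= v_max and v^2 * |kappa| <= a_radial_max."""
    k = abs(curvature(control, t))
    if k == 0.0:
        return v_max
    return min(v_max, math.sqrt(a_radial_max / k))

# Made-up control points: a cubic arc and a straight quadratic segment.
arc = [(0, 0), (1, 2), (3, 2), (4, 0)]
line = [(0, 0), (1, 0), (2, 0)]
arc_limit = speed_limit(arc, 0.5, 10.0, 1.0)
line_limit = speed_limit(line, 0.5, 1.5, 2.0)
```

On the straight segment the curvature vanishes and only the velocity bound applies, whereas on the arc the curvature forces a lower speed; points where such limits become active are natural candidates for the singular points around which local profiles are planned.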