Understanding the Relationship between Human Behavior and Susceptibility to Cyber Attacks

Despite growing speculation about the role of human behavior in cyber-security of machines, concrete data-driven analysis and evidence have been...

Optimal Scheduling of Cybersecurity Analysts for Minimizing Risk

Cybersecurity threats are on the rise with evermore digitization of the information that many day-to-day systems depend upon. The demand for...

Automatic Construction of Statechart-Based Anomaly Detection Models for Multi-Threaded Industrial Control Systems

Traffic of Industrial Control System (ICS) between the Human Machine Interface (HMI) and the...

Tracking Illicit Drug Dealing and Abuse on Instagram Using Multimodal Analysis

Illicit drug trade via social media sites, especially photo-oriented Instagram, has become a severe problem in recent years. As a result, tracking...

Algorithms for Graph-Constrained Coalition Formation in the Real World

Coalition formation typically involves the coming together of multiple, heterogeneous, agents to achieve both their individual and collective goals....

Data-Driven Frequency-Based Airline Profit Maximization

Although numerous traditional models predict market share and demand along airline routes, the prediction of existing models is not precise enough,...


Forthcoming Articles

Cyber Security and the Role of Intelligent Systems in Addressing its Challenges

Exploring Communication Behaviors of Users to Target Potential Users in Mobile Social Networks

In mobile social networks, users can communicate with each other over different telecom carriers. Thus, for telecom operators, how to acquire and retain users is a significant issue. The task of churn prediction is to determine whether a customer will leave soon. Differing from churn prediction, our work is to find those users who are likely to join target services from competitors in the near future; we call these users potential users. To target potential users, we propose a framework that includes feature extraction, feature selection, and classifier learning. First, we construct a heterogeneous information network from the call detail records of users. Then, we extract explicit features from potential users' interaction behavior in the heterogeneous information network. Moreover, because users are influenced by their community, we extract implicit features of potential users. After feature extraction, we use Information Gain to select the effective features. We use the effective explicit and implicit features to learn potential user classifiers, and use the classifiers to identify the potential users. Finally, we conduct experiments on real datasets. The results of our experiments show that the features extracted by our proposed method can be effective for targeting potential users.
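
The feature selection step can be illustrated with a minimal sketch of Information Gain ranking over discrete features. This is not the authors' implementation: the feature names, the toy labels, and the helper functions (entropy, information_gain, select_features) are hypothetical and chosen only to show how the criterion ranks features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(features, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(features,
                    key=lambda name: information_gain(features[name], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: 1 = became a user of the target service, 0 = did not.
labels = [1, 1, 0, 0]
features = {
    "calls_to_target_carrier": ["high", "high", "low", "low"],
    "weekend_caller": ["yes", "no", "yes", "no"],
}
selected = select_features(features, labels, k=1)
```

In this toy data the first feature separates the two classes perfectly and the second carries no information about the label, so only the first is kept.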

TensorBeat: Tensor Decomposition for Monitoring Multi-Person Breathing Beats with Commodity WiFi

Breathing signal monitoring can provide important clues about a person's physical health problems. Compared to existing techniques that require wearable devices and special equipment, a more desirable approach is to provide contact-free and long-term breathing rate monitoring by exploiting wireless signals. In this paper, we propose TensorBeat, a system that employs channel state information (CSI) phase difference data to intelligently estimate breathing rates for multiple persons with commodity WiFi devices. The main idea is to leverage the tensor decomposition technique to handle the CSI phase difference data. The proposed TensorBeat scheme first obtains CSI phase difference data between pairs of antennas at the WiFi receiver to create CSI tensor data. Then Canonical Polyadic (CP) decomposition is applied to obtain the desired breathing signals. A stable signal matching algorithm is developed to find the decomposed signal pairs, and a peak detection method is applied to estimate the breathing rates for multiple persons. Our experimental study shows that TensorBeat can achieve high accuracy under different environments for multi-person breathing rate monitoring.
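
The final peak detection step can be sketched in isolation. The example below assumes a single breathing signal has already been separated (it does not perform CP decomposition or signal matching), and the functions find_peaks and breathing_rate_bpm, the 20 Hz sampling rate, and the synthetic sinusoid are illustrative assumptions rather than details taken from the paper.

```python
import math


def find_peaks(signal, min_distance):
    """Indices of local maxima that are at least min_distance samples apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]:
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return peaks


def breathing_rate_bpm(signal, sample_rate_hz, min_distance=1):
    """Estimate breaths per minute from the mean spacing between detected peaks."""
    peaks = find_peaks(signal, min_distance)
    if len(peaks) < 2:
        return None  # not enough peaks to measure an interval
    mean_interval = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return 60.0 * sample_rate_hz / mean_interval


# Synthetic stand-in for one recovered breathing signal:
# a 0.25 Hz sinusoid (15 breaths per minute) sampled at 20 Hz for 60 s.
sample_rate_hz = 20
signal = [math.sin(2 * math.pi * 0.25 * n / sample_rate_hz)
          for n in range(60 * sample_rate_hz)]
rate = breathing_rate_bpm(signal, sample_rate_hz, min_distance=sample_rate_hz)
```

The min_distance argument rejects peaks closer than one second apart, a simple guard against noise that a real system would likely need to tune.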

Mobile Social Multimedia Analytics in the Big Data Era: An Introduction to the Special Issue

TIST 8:4: Guest Editor Introduction

Adult Images and Videos Recognition by Deep Multi-Context Network and Fine-to-Coarse Strategy

Adult image and video recognition is an important and challenging problem in the real world. Low-level feature cues do not provide sufficient information, especially when the dataset is very large and has various data distributions. This issue raises a serious problem for conventional approaches. In this paper, we tackle this problem by proposing a Deep Multi-Context Network (DMCNet) with a Fine-to-Coarse strategy for adult image and video recognition. We employ deep convolutional networks to model fusion features of sensitive objects in images. Global contexts and local contexts are both taken into consideration and are jointly modeled in a unified multi-context deep learning framework. To make the model more discriminative for diverse target objects, we investigate a novel hierarchical method, and a task-specific fine-to-coarse strategy is designed to make the multi-context modeling more suitable for adult object recognition. Furthermore, some recently proposed deep models are investigated. Our approach is extensively evaluated on four different datasets, including one for ablation experiments and the others for generalization experiments. Results show significant and consistent improvements over the state-of-the-art methods.

Advanced Economic Control of Electricity-based Space Heating Systems in Domestic Coalitions with Shared Intermittent Energy Resources

Over the past few years, domestic heating automation systems (DHASs) that optimize the domestic space heating control process with minimal user input, utilizing appropriate occupancy prediction technology, have emerged as commercial products (e.g., the smart thermostats from Nest and Honeywell). At the same time, many houses are being equipped with, potentially grid-connected, intermittent energy resources (IERs), such as rooftop photovoltaic systems and/or small wind turbine generators. Now, in many regions of the world, such houses can sell energy to the grid but at a lower price than the price of buying it. In this context, and given the anticipated increase in electrification of heating, the next generation of DHASs needs to incorporate advanced economic control (AEC). Such AEC can exploit the energy buffer that heating loads provide, in order to shift the consumption of electricity-based heating systems to follow the intermittent energy generation of the house. By so doing, the energy imported from the grid can be minimized and considerable monetary gains for the household can be achieved, without affecting the occupants' schedule. These benefits can be amplified still further in domestic coalitions, where a number of houses come together and share their IER generation to minimize their cumulative grid energy import. Given the above, in this work we extend a state-of-the-art DHAS to propose AdaHeat+, a practical DHAS that, for the first time, incorporates AEC. Our work is applicable to both individual houses and domestic coalitions and comes complete with a cost allocation mechanism to share the gains of the coalition. Importantly, we propose an effective heuristic heating schedule planning approach for collective AEC which: (i) has a complexity that scales in a linear and parallelizable manner with the size of the coalition, and (ii) enables AdaHeat+ to handle different preferences in balancing heating cost and thermal discomfort of the individual households.
Our approach relies on stochastic IER power output predictions. To achieve this, we propose a new adaptive site-specific calibration technique to improve such predictions, utilizing Gaussian process modeling. Finally, we demonstrate the effectiveness of AdaHeat+ through real data evaluation, to show that collective AEC can improve heating cost-efficiency by up to 60%, compared to independent AEC (and even more when compared to no-AEC).

Exploring Indoor White Spaces in Metropolises

It is a promising vision to utilize white spaces, i.e., vacant VHF and UHF TV channels, to satisfy skyrocketing wireless data demand in both outdoor and indoor scenarios. While most prior works have focused on exploring outdoor white spaces, the indoor story is largely open for investigation. Motivated by this observation and by the fact that 70% of the spectrum demand comes from indoor environments, we carry out a comprehensive study of indoor white spaces. We first present a large-scale measurement of outdoor and indoor TV spectrum occupancy in 30+ diverse locations in a typical metropolis, Hong Kong. Our measurement results confirm that abundant white spaces are available for exploration in a wide range of areas in metropolises. In particular, more than 50% and 70% of the TV spectrum are white spaces in outdoor and indoor scenarios, respectively. While there are substantially more white spaces in indoor scenarios than in outdoor scenarios, there is no effective solution for identifying indoor white spaces. To fill this gap, we propose the first system, WISER (for White-space Indoor Spectrum EnhanceR), to identify and track indoor white spaces in a building without requiring user devices to sense the spectrum. We discuss the design space of such a system and justify our design choices using intensive real-world measurements. We design the architecture and algorithms to address the inherent challenges. We build a WISER prototype and carry out real-world experiments to evaluate its performance. Our results show that WISER can identify 30%-40% more indoor white spaces with negligible false alarms, as compared to alternative baseline approaches.

illiad: InteLLigent Invariant and Anomaly Detection in Cyber Physical Systems

Cyber physical systems (CPSs) are today ubiquitous in urban environments. Such systems now serve as the backbone to numerous critical infrastructure applications, from smart grids to IoT installations. Scalable and seamless operation of such CPSs requires sophisticated tools for monitoring the time series progression of the system, dynamically tracking relationships, and issuing alerts about anomalies to operators. We present an online monitoring system (illiad) that models the state of the CPS as a function of its relationships between constituent components, using a combination of model-based and data-driven strategies. In addition to accurate inference for state estimation and anomaly tracking, illiad is able to exploit the underlying network structure of the CPS (wired or wireless) for state estimation purposes. We demonstrate the application of illiad to two diverse settings: a wireless sensor motes application and an IEEE 33-bus microgrid.

DMAD: Data-Driven Measuring of Wi-Fi Access Point Deployment in Urban Spaces

Wireless networks offer many advantages over wired local area networks, such as scalability and mobility. Strategically deployed wireless networks can achieve multiple objectives like traffic offloading, network coverage, and indoor localization. To this end, various mathematical models and optimization algorithms have been proposed to find optimal deployments of access points (APs) for different objectives, like coverage ratio. However, wireless signals can be blocked by the human body, especially in crowded urban spaces. The impact of human beings on wireless coverage cannot be easily analyzed by existing methods. Site surveys are too time-consuming and labor-intensive to conduct. It is infeasible for simulation methods to predict the number of people. As a result, the real coverage of an on-site AP deployment may shrink to some degree and lead to unexpected dead spots (areas without wireless coverage). These dead spots are undesirable: they degrade the user experience of network service continuity, and they paralyze some applications and services, such as tracking and monitoring, when users are in these areas. In this paper, we propose DMAD, a Data-driven Measuring of Access point Deployment, which can not only find potential dead spots of an on-site AP deployment but also quantify their severity. DMAD utilizes simple Wi-Fi data collected from the on-site AP deployment and shop data from the Internet. We first classify static devices and mobile devices using a decision-tree classifier. Then we locate these devices at shop-level locations based on shop popularity, wireless signals, and visit duration. Lastly, for each location, we estimate the probability of dead spots in different time slots and derive their severity by combining the probability and human density.
The analysis of Wi-Fi data from static devices indicates that the Pearson Correlation Coefficient of wireless coverage status and the number of on-site people is over 0.7, which confirms that human beings may have a significant impact on wireless coverage. We also conduct extensive experiments in a large shopping mall in Shenzhen. The evaluation results demonstrate that DMAD can find around 70% of dead spots with a precision of over 70%.
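
For readers unfamiliar with the statistic cited above, the Pearson Correlation Coefficient can be computed as in the following minimal sketch. The two toy series (people_on_site and coverage_drop_events) are invented for illustration and are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-time-slot observations (not from the paper).
people_on_site = [5, 20, 35, 50, 80]
coverage_drop_events = [0, 1, 1, 2, 4]
r = pearson(people_on_site, coverage_drop_events)
```

A value of r close to 1 indicates that the two series rise together; the paper reports a coefficient above 0.7 on its own measurements.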

A Comfort-Based Approach to Smart Heating and Air Conditioning

In this paper, we address the interrelated challenges of predicting user comfort and using this to reduce energy consumption in smart heating, ventilation and air conditioning (HVAC) systems. At present, such systems use simple models of user comfort when deciding on a set point temperature. Being built using broad population statistics, these models generally fail to represent individual users' preferences, resulting in poor estimates of the users' preferred temperatures. To address this issue, we propose the Bayesian Comfort Model (BCM). This personalised thermal comfort model, which uses a Bayesian network, learns from a user's feedback, allowing it to adapt to the user's individual preferences over time. We further propose an alternative to the ASHRAE 7-point scale used to assess user comfort. Using this model, we create an optimal HVAC control algorithm that minimizes energy consumption while preserving user comfort. Through an empirical evaluation based on the ASHRAE RP-884 data set and data collected in a separate deployment by us, we show that our model is consistently 13.2% to 25.8% more accurate than current models and that using our alternative comfort scale can increase our model's accuracy. Through simulations we show that, using this model, our HVAC control algorithm can reduce energy consumption by 7.3% to 13.5% while simultaneously decreasing user discomfort by 24.8%.

GeoBurst+: Effective and Real-Time Local Event Detection in Geo-Tagged Tweet Streams

The real-time discovery of local events (e.g., protests, disasters) has been widely recognized as a fundamental socioeconomic task. Recent studies have demonstrated that the geo-tagged tweet stream serves as an unprecedentedly valuable source for local event detection. Nevertheless, how to effectively extract local events from massive geo-tagged tweet streams in real time remains challenging. To bridge the gap, we propose a method for effective and real-time local event detection from geo-tagged tweet streams. Our method, named GeoBurst+, first leverages a novel cross-modal authority measure to identify several pivots in the query window. Such pivots reveal different geo-topical activities and naturally attract similar tweets to form candidate events. GeoBurst+ further summarizes the continuous stream and compares the candidates against the historical summaries to pinpoint truly interesting local events. Better still, as the query window shifts, GeoBurst+ is capable of updating the event list with little time cost, thus achieving continuous monitoring of the stream. We used crowdsourcing to evaluate GeoBurst+ on two million-scale data sets, and found it significantly more effective than existing methods while being orders of magnitude faster.

Cost-Optimized Microblog Distribution over Geo-Distributed Data Centers: Insights from Cross-Media Analysis

The unprecedented growth of microblog services poses significant challenges for the network traffic and service latency of the underlying infrastructure (i.e., geo-distributed data centers). Furthermore, the dynamic evolution of microblog status generates a huge workload for data consistency maintenance. In this paper, motivated by insights into propagation patterns obtained from cross-media analysis, we propose a novel cache strategy for microblog service systems to reduce the inter-data-center traffic and consistency maintenance cost while achieving low service latency. Specifically, we first present a microblog classification method, which utilizes external knowledge from correlated domains, to categorize microblogs. Then we conduct a large-scale measurement on a representative online social network system to study the category-based propagation diversity on region and time scales. These insights illustrate common social habits in creating and consuming microblogs, and further motivate our architecture design. Finally, we formulate the content cache problem as a constrained optimization problem. By jointly using the Lyapunov optimization framework and the simplex gradient method, we find the optimal online control strategy. Extensive trace-driven experiments further demonstrate that our algorithm reduces the system cost by 24.5% against traditional approaches with the same service latency.

An Unsupervised Approach to Inferring the Localness of People Using Incomplete Geo-Temporal Online Check-in Data

Inferring the localness of people means identifying whether a person is a local resident of a city by analyzing online check-in points contributed by users who include both local and non-local people (e.g., tourists). This information is critical for the targeted ads of local businesses, urban planning, and localized news recommendations. While there is prior work on geo-locating people in a city using supervised learning approaches, the accuracy of those techniques largely depends on training datasets with complete geo-temporal information, which are difficult and expensive to obtain in practice. In this paper, we propose an unsupervised approach to infer the localness of people in a city by using incomplete crowdsourcing data (i.e., online check-in points) that are publicly available. In particular, we develop an Incomplete-Geo-Temporal Expectation Maximization (IGT-EM) scheme, which incorporates a set of hidden variables to represent the localness of people and a set of estimation parameters to represent the likelihood of venues to attract local and non-local people, respectively. Our solution can jointly estimate 1) the localness of a person and 2) the probability of a venue to attract local people without requiring any training data. We also implement a parallel IGT-EM algorithm by leveraging the computing power of a Graphics Processing Unit (GPU) that consists of 2496 cores. We evaluate our new approach on four real-world datasets collected from the cities of New York, Chicago, Boston, and Washington, D.C. The results show that our approach can accurately estimate the localness of people and significantly outperform other state-of-the-art baselines in terms of both estimation accuracy and execution time.

Detecting Communities of Authority and Analyzing their Influence in Dynamic Social Networks

Users in real-world social networks are organized into communities that differ from each other in terms of influence, authority, interest, size, etc. This paper addresses the problems of detecting communities of authority and of estimating the influence of such communities in dynamic social networks. These are new issues that have not yet been addressed in the literature and they are important in applications such as marketing and recommender systems. To facilitate the identification of communities of authority, our approach first detects communities sharing common interests, which we call "meta-communities", by incorporating topic modeling based on users' community memberships. Then, communities of authority are extracted with respect to each meta-community, using a new measure based on the betweenness centrality. To assess the influence between communities over time, we propose a new model based on the Granger causality method. Through extensive experiments on a variety of social network datasets, we empirically demonstrate the suitability of our approach for community-of-authority detection and assessment of the influence between communities over time.

Visual Classification of Furniture Styles

Furniture style describes the discriminative appearance characteristics of furniture. It plays an important role in real-world indoor decoration. In this paper, we explore furniture style features and study the problem of furniture style classification. Different from traditional object classification, furniture style classification aims to classify furniture in terms of its "style", which describes the appearance (e.g., American style, Gothic style, Rococo style, etc.), rather than the usual "kind", which is more related to the functional structure (e.g., bed, desk, etc.). To pursue efficient furniture style features, we construct a novel dataset of furniture styles that contains 16 common style categories, and implement three strategies with respect to two categories of classification, i.e., handcrafted classification and learning-based classification. First, we follow the typical image classification pipeline to extract handcrafted features and train the classifier with a support vector machine. Then we use a convolutional neural network to extract learning-based features from training images. To obtain comprehensive furniture style features, we finally combine the handcrafted image classification pipeline and the learning-based network. We experimentally evaluate the performance of the handcrafted features and learning-based features of each strategy, and the results show the superiority of learning-based features and also the comprehensiveness of handcrafted features.

Transfer Learning for Behavior Ranking

Intelligent recommendation has been well recognized as one of the major approaches to address the information overload problem in the big data era. A typical intelligent recommendation engine usually consists of three major components, i.e., data as the main input, algorithms for preference learning, and a system for user interaction and high-performance computation. We observe that the data (e.g., users' behavior) are usually in different forms such as examinations (e.g., browse and collection) and ratings, where the former are often much more abundant than the latter. Although the data are in different representations, they are both related to users' true preferences and are also deemed complementary to each other for preference learning. However, very few ranking or recommendation algorithms have been developed to exploit these two types of user behavior. In this paper, we focus on jointly modeling the examination behavior and rating behavior and develop a novel and efficient ranking-oriented recommendation algorithm accordingly. Firstly, we formally define a new recommendation problem termed behavior ranking (BR), which aims to build a ranking-oriented model by exploiting both the examination behavior and rating behavior. Secondly, we develop a simple and generic transfer to rank (ToR) algorithm for behavior ranking, which transfers knowledge of candidate items from a global preference learning task to a local preference learning task. Compared with previous work on integrating heterogeneous user behavior, our ToR algorithm is the first ranking-oriented solution, which can effectively generate recommendations in a more direct manner than regression-oriented methods. Extensive empirical studies show that our ToR algorithm is significantly more accurate than the state-of-the-art methods in most cases.
Furthermore, our ToR algorithm is very efficient in terms of the time complexity, which is similar to those for homogeneous user behavior alone.

ST-SAGE: A Spatial-Temporal Sparse Additive Generative Model for Spatial Item Recommendation

With the rapid development of location-based social networks (LBSNs), spatial item recommendation has become an important mobile application, especially when users travel away from home. However, this type of recommendation is very challenging compared to traditional recommender systems. A user may visit only a limited number of spatial items, leading to a very sparse user-item matrix. This matrix becomes even sparser when the user travels to a distant place, as most of the items visited by a user are usually located within a short distance from the user's home. Moreover, user interests and behavior patterns may vary dramatically across different times and geographical regions. In light of this, we propose ST-SAGE, a spatial-temporal sparse additive generative model for spatial item recommendation. ST-SAGE considers both the personal interests of the users and the preferences of the crowd in the target region at the given time by exploiting both the co-occurrence patterns of spatial items and the content of spatial items. To further alleviate the data sparsity issue, ST-SAGE exploits the geographical correlation by smoothing the crowd's preferences over a well-designed spatial index structure called a spatial pyramid. To speed up the training process of ST-SAGE, we implement a parallel version of the model inference algorithm on the GraphLab framework. We conduct extensive experiments, and the experimental results clearly demonstrate that ST-SAGE outperforms the state-of-the-art recommender systems in terms of recommendation effectiveness, model training efficiency, and online recommendation efficiency.

DUCT: An Upper Confidence Bound Approach to Distributed Constraint Optimization Problems

We propose a distributed upper confidence bound approach, DUCT, for solving distributed constraint optimization problems. We compare four variants of this approach with a baseline random sampling algorithm, as well as other complete and incomplete algorithms for DCOPs. Under general assumptions, we theoretically show that the solution found by DUCT after T steps is approximately T^{-1}-close to the optimal. Experimentally, we show that DUCT matches the optimal solution found by the well-known DPOP and O-DPOP algorithms on moderate-size problems, while always requiring less agent communication. For larger problems, where DPOP fails, we show that DUCT produces significantly better solutions than local, incomplete algorithms. Overall we believe that DUCT is a practical, scalable algorithm for complex DCOPs.
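
As background on the upper confidence bound idea that DUCT builds on, the following is a minimal single-agent UCB1 sketch. It is not the distributed DUCT algorithm: the functions ucb1_select and run_ucb1, the exploration constant, and the deterministic toy rewards are assumptions made only to show how the confidence bonus balances exploration and exploitation.

```python
import math


def ucb1_select(counts, means, t, c=math.sqrt(2)):
    """Pick the arm with the largest mean + c * sqrt(ln t / n); try unseen arms first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + c * math.sqrt(math.log(t) / n)
              for m, n in zip(means, counts)]
    return scores.index(max(scores))


def run_ucb1(reward_fn, num_arms, steps):
    """Run UCB1 for the given number of steps and return pull counts and mean rewards."""
    counts = [0] * num_arms
    means = [0.0] * num_arms
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, means, t)
        reward = reward_fn(arm)
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Hypothetical deterministic rewards for three candidate values of one variable.
rewards = [0.2, 0.5, 0.9]
counts, means = run_ucb1(lambda arm: rewards[arm], num_arms=3, steps=200)
```

Over the 200 steps the sampler concentrates its pulls on the highest-reward option while still revisiting the others occasionally, which is the behavior the confidence bonus is designed to produce.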


