Binrank: Scaling Dynamic Authority-Based Search Using Materialized Subgraphs
– August 2010 – Knowledge And Data Engineering / Data Mining – J2EE
Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high-quality, high-recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.
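As a rough illustration of the query-time iterative computation mentioned in the abstract, here is a minimal personalized-PageRank-style power iteration in Python. The toy graph, the damping factor of 0.85, the dangling-node handling, and the function name are all assumptions made for this sketch. It is not the BinRank or ObjectRank implementation, and it omits details such as per-edge-type authority transfer rates.

```python
def personalized_pagerank(out_links, base_set, damping=0.85, iterations=50):
    """Power iteration where random jumps land only on nodes in base_set."""
    nodes = list(out_links)
    # Jump distribution: uniform over the base set (the random walk starting points).
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src in nodes:
            targets = out_links[src]
            if not targets:
                # Dangling node (assumption): return its mass to the base set.
                for n in nodes:
                    nxt[n] += damping * score[src] * jump[n]
                continue
            share = damping * score[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        score = nxt
    return score


# Hypothetical toy graph: node -> list of nodes it links to.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = personalized_pagerank(toy_graph, base_set={"a"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Running the same function on a small materialized subgraph instead of the full graph is the kind of saving the abstract describes, since each iteration touches every edge of whatever graph it is given.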
Closeness: A New Privacy Measure For Data Publishing
– July 2010 – Knowledge And Data Engineering / Data Mining – J2EE
The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain "identifying" attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of ℓ-diversity has been proposed to address this; ℓ-diversity requires that each equivalence class has at least ℓ well-represented (in Section 2) values for each sensitive attribute. In this article, we show that ℓ-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called "closeness". We first present the base model t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We then propose a more flexible privacy model called (n, t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.
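The t-closeness condition can be sketched in a few lines of Python. The abstract does not name its two distance measures, so this example substitutes the variational (total variation) distance as a simple stand-in. The function names and the toy data are hypothetical.

```python
from collections import Counter


def distribution(values):
    """Empirical distribution of a list of sensitive attribute values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions (stand-in measure)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(classes, t):
    """classes: one list of sensitive values per equivalence class."""
    overall = distribution([v for cls in classes for v in cls])
    return all(
        variational_distance(distribution(cls), overall) <= t for cls in classes
    )


# Hypothetical table with two equivalence classes of three records each.
classes = [["flu", "flu", "cancer"], ["flu", "cancer", "cancer"]]
loose = satisfies_t_closeness(classes, t=0.2)
strict = satisfies_t_closeness(classes, t=0.1)
```

In this toy table each class differs from the overall distribution by 1/6, so the table passes with t = 0.2 and fails with t = 0.1.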
Data Leakage Detection
– June 2010 – Knowledge And Data Engineering / Data Mining – DOTNET
We study the following problem: A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data is leaked and found in an unauthorized place (e.g., on the web or somebody's laptop). The distributor must assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. We propose data allocation strategies (across the agents) that improve the probability of identifying leakages. These methods do not rely on alterations of the released data (e.g., watermarks). In some cases we can also inject "realistic but fake" data records to further improve our chances of detecting leakage and identifying the guilty party.
PAM: An Efficient And Privacy-Aware Monitoring Framework For Continuously Moving Objects
– March 2010 – Knowledge And Data Engineering / Data Mining – J2EE
Efficiency and privacy are two fundamental issues in moving object monitoring. This paper proposes a privacy-aware monitoring (PAM) framework that addresses both issues. The framework distinguishes itself from the existing work by being the first to holistically address the issues of location updating in terms of monitoring accuracy, efficiency, and privacy, particularly, when and how mobile clients should send location updates to the server. Based on the notions of safe region and most probable result, PAM performs location updates only when they would likely alter the query results. Furthermore, by designing various client update strategies, the framework is flexible and able to optimize accuracy, privacy, or efficiency. We develop efficient query evaluation/reevaluation and safe region computation algorithms in the framework. The experimental results show that PAM substantially outperforms traditional schemes in terms of monitoring accuracy, CPU cost, and scalability while achieving close-to-optimal communication cost.
P2P Reputation Management Using Distributed Identities And Decentralized Recommendation Chains
– July 2010 – Knowledge And Data Engineering / Data Mining – Java
Peer-to-peer (P2P) networks are vulnerable to peers who cheat, propagate malicious code, leech on the network, or simply do not cooperate. The traditional security techniques developed for centralized distributed systems like client-server networks are insufficient for P2P networks by virtue of their centralized nature. The absence of a central authority in a P2P network poses unique challenges for reputation management in the network. These challenges include identity management of the peers, secure reputation data management, Sybil attacks, and above all, availability of reputation data. In this paper, we present a cryptographic protocol for ensuring secure and timely availability of the reputation data of a peer to other peers at extremely low costs. The past behavior of the peer is encapsulated in its digital reputation, and is subsequently used to predict its future actions. As a result, a peer's reputation motivates it to cooperate and desist from malicious activities. The cryptographic protocol is coupled with self-certification and cryptographic mechanisms for identity management and countering Sybil attacks. We illustrate the security and the efficiency of the system analytically and by means of simulations in a completely decentralized Gnutella-like P2P network.
Managing Multidimensional Historical Aggregate Data In Unstructured P2P Networks
– September 2010 – Knowledge And Data Engineering / Data Mining – J2EE
A P2P-based framework supporting the extraction of aggregates from historical multidimensional data is proposed, which provides efficient and robust query evaluation. When a data population is published, data are summarized in a synopsis, consisting of an index built on top of a set of subsynopses (storing compressed representations of distinct data portions). The index and the subsynopses are distributed across the network, and suitable replication mechanisms taking into account the query workload and network conditions are employed that provide the appropriate coverage for both the index and the subsynopses.
Bridging Domains Using World Wide Knowledge For Transfer Learning
– Knowledge And Data Engineering / Data Mining – DOTNET
A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify target domain data by using some other source domain data, even when the data may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a "bridge" that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach can outperform several existing domain adaptation approaches significantly.
On Wireless Scheduling Algorithms For Minimizing The Queue-Overflow Probability
– June 2010 – Networking – JAVA
In this paper, we are interested in wireless scheduling algorithms for the downlink of a single cell that can minimize the queue-overflow probability. Specifically, in a large-deviation setting, we are interested in algorithms that maximize the asymptotic decay-rate of the queue-overflow probability, as the queue-overflow threshold approaches infinity. We first derive an upper bound on the decay-rate of the queue-overflow probability over all scheduling policies. We then focus on a class of scheduling algorithms collectively referred to as the α-algorithms. For a given α ≥ 1, the α-algorithm picks the user for service at each time that has the largest product of the transmission rate multiplied by the backlog raised to the power α. We show that when the overflow metric is appropriately modified, the minimum-cost-to-overflow under the α-algorithm can be achieved by a simple linear path, and it can be written as the solution of a vector-optimization problem. Using this structural property, we then show that when α approaches infinity, the α-algorithms asymptotically achieve the largest decay-rate of the queue-overflow probability. Finally, this result enables us to design scheduling algorithms that are both close-to-optimal in terms of the asymptotic decay-rate of the overflow probability, and empirically shown to maintain small queue-overflow probabilities over queue-length ranges of practical interest.
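The selection rule of the α-algorithm described above (serve the user with the largest product of transmission rate and backlog raised to the power α) can be sketched as follows. The function name and the sample rates and backlogs are hypothetical, and ties are broken by lowest index as an assumption.

```python
def alpha_schedule(rates, backlogs, alpha):
    """Return the index of the user maximizing rate * backlog**alpha."""
    if alpha < 1:
        raise ValueError("alpha must be >= 1")
    weights = [r * (q ** alpha) for r, q in zip(rates, backlogs)]
    # max() returns the first maximal index, so ties go to the lowest index.
    return max(range(len(weights)), key=weights.__getitem__)


# Hypothetical slot: user 0 has the better channel, user 1 the longer queue.
rates = [3.0, 1.0]
backlogs = [2.0, 5.0]
chosen_alpha_1 = alpha_schedule(rates, backlogs, alpha=1)
chosen_alpha_2 = alpha_schedule(rates, backlogs, alpha=2)
```

With these numbers, α = 1 favors the user with the better rate (weights 6 versus 5), while α = 2 favors the user with the longer queue (weights 12 versus 25), which illustrates how larger α puts more emphasis on backlog.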
A Distributed CSMA Algorithm For Throughput And Utility Maximization In Wireless Networks
– June 2010 – Networking – JAVA
In multihop wireless networks, designing distributed scheduling algorithms to achieve the maximal throughput is a challenging problem because of the complex interference constraints among different links. Traditional maximal-weight scheduling (MWS), although throughput-optimal, is difficult to implement in distributed networks. On the other hand, a distributed greedy protocol similar to 802.11 does not guarantee the maximal throughput. In this paper, we introduce an adaptive carrier sense multiple access (CSMA) scheduling algorithm that can achieve the maximal throughput distributively. Some of the major advantages of the algorithm are that it applies to a very general interference model and that it is simple, distributed, and asynchronous. Furthermore, the algorithm is combined with congestion control to achieve the optimal utility and fairness of competing flows. Simulations verify the effectiveness of the algorithm. Also, the adaptive CSMA scheduling is a modular MAC-layer algorithm that can be combined with various protocols in the transport layer and network layer. Finally, the paper explores some implementation issues in the setting of 802.11 networks.
A Dynamic En-Route Filtering Scheme For Data Reporting In Wireless Sensor Networks
– Networking – JAVA
In wireless sensor networks, adversaries can inject false data reports via compromised nodes and launch DoS attacks against legitimate reports. Recently, a number of filtering schemes against false reports have been proposed. However, they either lack strong filtering capacity or cannot support highly dynamic sensor networks very well. Moreover, few of them can deal with DoS attacks simultaneously. In this paper, we propose a dynamic en-route filtering scheme that addresses both false report injection and DoS attacks in wireless sensor networks. In our scheme, each node has a hash chain of authentication keys used to endorse reports; meanwhile, a legitimate report should be authenticated by a certain number of nodes. First, each node disseminates its key to forwarding nodes. Then, after sending reports, the sending nodes disclose their keys, allowing the forwarding nodes to verify their reports. We design the hill climbing key dissemination approach that ensures the nodes closer to data sources have stronger filtering capacity. Moreover, we exploit the broadcast property of wireless communication to defeat DoS attacks and adopt multipath routing to deal with the topology changes of sensor networks. Simulation results show that compared to existing solutions, our scheme can drop false reports earlier with a lower memory requirement, especially in highly dynamic sensor networks.
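The idea of endorsing reports with keys from a hash chain and disclosing those keys afterwards can be illustrated with a generic one-way hash chain. This is only a sketch under stated assumptions: SHA-256, HMAC, the chain length, the seed, and all function names are chosen for illustration and are not taken from the paper, and the hill climbing key dissemination is not modeled.

```python
import hashlib
import hmac


def make_hash_chain(seed: bytes, length: int):
    """Return [k_0, ..., k_length] with k_{i+1} = SHA-256(k_i)."""
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


def verify_disclosed_key(disclosed: bytes, known_later: bytes, max_steps: int) -> bool:
    """Check that hashing `disclosed` at most max_steps times yields `known_later`."""
    current = disclosed
    for _ in range(max_steps):
        current = hashlib.sha256(current).digest()
        if hmac.compare_digest(current, known_later):
            return True
    return False


def endorse(report: bytes, key: bytes) -> bytes:
    """Message authentication code over the report using an authentication key."""
    return hmac.new(key, report, hashlib.sha256).digest()


chain = make_hash_chain(b"node-7-secret-seed", 5)   # hypothetical seed
commitment = chain[-1]   # disseminated to forwarding nodes in advance
auth_key = chain[-2]     # endorses the first report, disclosed afterwards
report = b"temperature=41"
mac = endorse(report, auth_key)

# A forwarding node checks the disclosed key, then the report's MAC.
key_ok = verify_disclosed_key(auth_key, commitment, max_steps=5)
mac_ok = hmac.compare_digest(mac, endorse(report, auth_key))
forged_ok = verify_disclosed_key(b"forged-key", commitment, max_steps=5)
```

Because the hash is one-way, a forwarding node that holds only the commitment cannot forge an earlier key, but it can cheaply verify one once the sender discloses it.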
Efficient And Dynamic Routing Topology Inference From End-To-End Measurements
– Networking – Java
Inferring the routing topology and link performance from a node to a set of other nodes is an important component in network monitoring and application design. In this paper we propose a general framework for designing topology inference algorithms based on additive metrics. The framework can flexibly fuse information from multiple measurements to achieve better estimation accuracy. We develop computationally efficient (polynomial-time) topology inference algorithms based on the framework. We prove that the probability of correct topology inference of our algorithms converges to one exponentially fast in the number of probing packets. In particular, for applications where nodes may join or leave frequently such as overlay network construction, application-layer multicast, peer-to-peer file sharing/streaming, we propose a novel sequential topology inference algorithm which significantly reduces the probing overhead and can efficiently handle node dynamics. We demonstrate the effectiveness of the proposed inference algorithms via Internet experiments.
Secure Data Collection In Wireless Sensor Networks Using Randomized Dispersive Routes
– July 2010 – Mobile Computing – Java
Compromised-node and denial-of-service are two key attacks in wireless sensor networks (WSNs). In this paper, we study routing mechanisms that circumvent (bypass) black holes formed by these attacks. We argue that existing multi-path routing approaches are vulnerable to such attacks, mainly due to their deterministic nature. So once an adversary acquires the routing algorithm, it can compute the same routes known to the source, and hence endanger all information sent over these routes. In this paper, we develop mechanisms that generate randomized multipath routes. Under our design, the routes taken by the "shares" of different packets change over time. So even if the routing algorithm becomes known to the adversary, the adversary still cannot pinpoint the routes traversed by each packet. Besides randomness, the routes generated by our mechanisms are also highly dispersive and energy-efficient, making them quite capable of bypassing black holes at low energy cost. Extensive simulations are conducted to verify the validity of our mechanisms.
VEBEK: Virtual Energy-Based Encryption And Keying For Wireless Sensor Networks
– July 2010 – Mobile Computing – Dot Net
Designing cost-efficient, secure network protocols for Wireless Sensor Networks (WSNs) is a challenging problem because sensors are resource-limited wireless devices. Since the communication cost is the most dominant factor in a sensor's energy consumption, we introduce an energy-efficient Virtual Energy-Based Encryption and Keying (VEBEK) scheme for WSNs that significantly reduces the number of transmissions needed for rekeying to avoid stale keys. In addition to the goal of saving energy, minimal transmission is imperative for some military applications of WSNs where an adversary could be monitoring the wireless spectrum. VEBEK is a secure communication framework where sensed data is encoded using a scheme based on a permutation code generated via the RC4 encryption mechanism. The key to the RC4 encryption mechanism dynamically changes as a function of the residual virtual energy of the sensor. Thus, a one-time dynamic key is employed for one packet only and different keys are used for the successive packets of the stream. The intermediate nodes along the path to the sink are able to verify the authenticity and integrity of the incoming packets using a predicted value of the key generated by the sender's virtual energy, thus eliminating the need for specific rekeying messages. VEBEK is able to efficiently detect and filter false data injected into the network by malicious outsiders. The VEBEK framework consists of two operational modes (VEBEK-I and VEBEK-II), each of which is optimal for different scenarios. In VEBEK-I, each node monitors its one-hop neighbors, whereas VEBEK-II statistically monitors downstream nodes. We have evaluated VEBEK's feasibility and performance analytically and through simulations. Our results show that VEBEK, without incurring transmission overhead (increasing packet size or sending control messages for rekeying), is able to eliminate malicious data from the network in an energy-efficient manner. We also show that our framework performs better than other comparable schemes in the literature with an overall 60-100 percent improvement in energy savings without the assumption of a reliable medium access control layer.
Localized Multicast: Efficient And Distributed Replica Detection In Large-Scale Sensor Networks
– Mobile Computing – Dot Net
Due to the poor physical protection of sensor nodes, it is generally assumed that an adversary can capture and compromise a small number of sensors in the network. In a node replication attack, an adversary can take advantage of the credentials of a compromised node to surreptitiously introduce replicas of that node into the network. Without an effective and efficient detection mechanism, these replicas can be used to launch a variety of attacks that undermine many sensor applications and protocols. In this paper, we present a novel distributed approach called Localized Multicast for detecting node replication attacks. The efficiency and security of our approach are evaluated both theoretically and via simulation. Our results show that, compared to previous distributed approaches proposed by Parno et al., Localized Multicast is more efficient in terms of communication and memory costs in large-scale sensor networks, and at the same time achieves a higher probability of detecting node replicas.
Layered Approach Using Conditional Random Fields For Intrusion Detection
– Dependable And Secure Computing – Java
Intrusion detection faces a number of challenges; an intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. In this paper, we address these two issues of Accuracy and Efficiency using Conditional Random Fields and Layered Approach. We demonstrate that high attack detection accuracy can be achieved by using Conditional Random Fields and high efficiency by implementing the Layered Approach. Experimental results on the benchmark KDD '99 intrusion data set show that our proposed system based on Layered Conditional Random Fields outperforms other well-known methods such as decision trees and naive Bayes. The improvement in attack detection accuracy is very high, particularly for the U2R attacks (34.8 percent improvement) and the R2L attacks (34.5 percent improvement). Statistical tests also demonstrate higher confidence in detection accuracy for our method. Finally, we show that our system is robust and is able to handle noisy data without compromising performance.
Privacy-Conscious Location-Based Queries In Mobile Environments
– Parallel And Distributed Systems – Java
In location-based services, users with location-aware mobile devices are able to make queries about their surroundings anywhere and at any time. While this ubiquitous computing paradigm brings great convenience for information access, it also raises concerns over potential intrusion into user location privacy. To protect location privacy, one typical approach is to cloak user locations into spatial regions based on user-specified privacy requirements, and to transform location-based queries into region-based queries. In this paper, we identify and address three new issues concerning this location cloaking approach. First, we study the representation of cloaking regions and show that a circular region generally leads to a small result size for region-based queries. Second, we develop a mobility-aware location cloaking technique to resist trace analysis attacks. Two cloaking algorithms, namely MaxAccu_Cloak and MinComm_Cloak, are designed based on different performance objectives. Finally, we develop an efficient polynomial algorithm for evaluating circular-region-based kNN queries. Two query processing modes, namely bulk and progressive, are presented to return query results either all at once or in an incremental manner. Experimental results show that our proposed mobility-aware cloaking algorithms significantly improve the quality of location cloaking in terms of an entropy measure without compromising much on query latency or communication cost. Moreover, the progressive query processing mode achieves a shorter response time than the bulk mode by parallelizing the query evaluation and result transmission.
An Improved Lossless Image Compression Algorithm LOCO-R
– 2010 International Conference On Computer Design And Applications (ICCDA 2010) – Image Processing – Java
This paper presents a state-of-the-art implementation of the lossless image compression algorithm LOCO-R, which is based on the LOCO-I (low complexity lossless compression for images) algorithm developed by Weinberger, Seroussi and Sapiro. With modifications and improvements, the algorithm noticeably reduces the implementation complexity. Experiments illustrate that this algorithm is better than Rice compression, typically by around 15 percent.
A DWT Based Approach For Steganography Using Biometrics
– 2010 International Conference On Data Storage And Data Engineering – Image Processing – Dotnet
Steganography is the art of hiding the existence of data in another transmission medium to achieve secret communication. It does not replace cryptography but rather boosts security using its obscurity features. The steganography method used in this paper is based on biometrics; the biometric feature used to implement it is the skin tone region of images. Secret data is embedded within the skin region of the image, which provides a secure location for data hiding. For this, skin tone detection is performed using the HSV (Hue, Saturation and Value) color space. Additionally, secret data embedding is performed using a frequency domain approach, the DWT (Discrete Wavelet Transform), which outperforms the DCT (Discrete Cosine Transform). Secret data is hidden in one of the high-frequency sub-bands of the DWT by tracing skin pixels in that sub-band. The data hiding steps are applied after cropping the image interactively. Cropping results in better security than hiding data in the whole image, because the cropped region works as a key at the decoding side. This study shows that adopting an object-oriented steganography mechanism, in the sense that skin tone objects in the image are tracked, yields higher security. A satisfactory PSNR (Peak Signal-to-Noise Ratio) is also obtained.
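PSNR, the quality metric mentioned at the end of this abstract, is commonly computed as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the cover image and the stego image. The sketch below uses plain nested lists for 8-bit grayscale pixels. The function name and the tiny sample images are hypothetical, and the paper's own evaluation setup is not reproduced here.

```python
import math


def psnr(original, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_o = [p for row in original for p in row]
    flat_s = [p for row in stego for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_s)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 cover image and a stego version with two pixels changed by 1.
cover = [[100, 120], [130, 140]]
stego = [[101, 120], [130, 139]]
quality_db = psnr(cover, stego)
```

Higher values mean the stego image is closer to the cover image, so embedding that changes fewer or smaller coefficients yields a higher PSNR.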
Sparse Bayesian Learning Of Filters For Efficient Image Expansion
– Image Processing – Dotnet
We propose a framework for expanding a given image using an interpolator that is trained in advance with training data, based on sparse Bayesian estimation for determining the optimal and compact support for efficient image expansion. Experiments on test data show that learned interpolators are compact yet superior to classical ones.
On Event-Based Middleware For Location-Aware Mobile Applications
– Software Engineering – Java
As mobile applications become more widespread, programming paradigms and middleware architectures designed to support their development are becoming increasingly important. The event-based programming paradigm is a strong candidate for the development of mobile applications due to its inherent support for the loose coupling between components required by mobile applications. However, existing middleware that supports the event-based programming paradigm is not well suited to supporting location-aware mobile applications in which highly mobile components come together dynamically to collaborate at some location. This paper presents a number of techniques including location-independent announcement and subscription coupled with location-dependent filtering and event delivery that can be used by event-based middleware to support such collaboration. We describe how these techniques have been implemented in STEAM, an event-based middleware with a fully decentralized architecture, which is particularly well suited to deployment in ad hoc network environments. The cost of such location-based event dissemination and the benefits of distributed event filtering are evaluated.
Mitigating Selective Forwarding Attacks With A Channel-Aware Approach In WMNs
– May 2010 – Wireless Communications – Java
In this paper, we consider a special case of denial of service (DoS) attack in wireless mesh networks (WMNs) known as the selective forwarding attack (a.k.a. gray hole attack). With such an attack, a misbehaving mesh router forwards only a subset of the packets it receives and drops the others. While most of the existing studies on selective forwarding attacks focus on attack detection under the assumption of an error-free wireless channel, we consider a more practical and challenging scenario in which packet dropping may be due to an attack, or to normal loss events such as medium access collision or bad channel quality. Specifically, we develop a channel aware detection (CAD) algorithm that can effectively distinguish selective forwarding misbehavior from normal channel losses. The CAD algorithm is based on two strategies, channel estimation and traffic monitoring. If the monitored loss rate at certain hops exceeds the estimated normal loss rate, those nodes involved will be identified as attackers. Moreover, we carry out analytical studies to determine the optimal detection thresholds that minimize the summation of false alarm and missed detection probabilities. We also compare our CAD approach with some existing solutions, through extensive computer simulations, to demonstrate the efficiency of discriminating selective forwarding attacks from normal channel losses.
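The core comparison described above (flag a hop when its monitored loss rate exceeds the estimated normal loss rate) can be sketched as a simple threshold test. The `margin` parameter, the function name, and the sample loss rates are hypothetical; the paper derives its optimal detection thresholds analytically, and that derivation is not reproduced here.

```python
def flag_suspects(monitored_loss, estimated_normal_loss, margin=0.0):
    """Return hops whose monitored loss rate exceeds the estimate plus a margin."""
    return [
        hop
        for hop, loss in monitored_loss.items()
        if loss > estimated_normal_loss[hop] + margin
    ]


# Hypothetical per-hop loss rates from traffic monitoring and channel estimation.
monitored = {"r1": 0.05, "r2": 0.30, "r3": 0.12}
estimated = {"r1": 0.06, "r2": 0.10, "r3": 0.10}

suspects_no_margin = flag_suspects(monitored, estimated)
suspects_with_margin = flag_suspects(monitored, estimated, margin=0.05)
```

Without a margin, the hop `r3` is flagged for a small excess that could be ordinary channel noise. Adding a margin trades missed detections against false alarms, which is the trade-off the optimal thresholds in the abstract are meant to balance.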
Novel Defense Mechanism Against Data Flooding Attacks In Wireless Ad Hoc Networks
– Consumer Electronics – Java
Mobile users like to use their own consumer electronic devices anywhere and at any time to access multimedia data. Hence, we expect that wireless ad hoc networks will be widely used in the near future since these networks form the topology with low cost on the fly. However, consumer electronic devices generally operate on limited battery power and therefore are vulnerable to security threats like data flooding attacks. The data flooding attack causes Denial of Service (DoS) by flooding the network with many data packets. However, there are few existing defense systems against data flooding attacks. Moreover, the existing schemes may not guarantee the Quality of Service (QoS) of burst traffic, even though multimedia data usually arrives in bursts. Therefore, we propose a novel defense mechanism against data flooding attacks with the aim of enhancing the throughput. The simulation results show that the proposed scheme enhances the throughput of burst traffic.