Junction Systems continues to establish itself as a credible research and development firm through the research projects of its core data science team, which is made up of machine learning engineers and data scientists. Our team has researched and published papers on cognitive radio, computer networks, security, mobile cloud computing, embedded systems, e-commerce and social media.
Project #1: Sky VMS, a Vehicular Tracking Internet of Things (IoT) System with Big Data Analytics
A full turn-key fleet and asset management solution that fleet owners can manage on the go. The solution gives instant access to valuable insights into the everyday performance of the fleet and assets. The service provides the knowledge needed to stay informed, make smart decisions about your ventures, reduce operating costs, identify efficiencies, and improve safety and user experience. The solution is an embedded system built on low-cost Raspberry Pi hardware with a GPS transceiver, and the application software runs on cloud infrastructure. Clients access the data analytics about their fleet through a thin web client that requires at least an ADSL internet connection. The analytics can also be accessed through USSD and SMS messages. The fleet data is stored in Amazon Keyspaces, a managed Apache Cassandra-compatible database service on the Amazon Web Services platform.
Project #2: Mining voter sentiments from Twitter data for the 2016 Uganda Presidential elections
Mukonyezi, I., Babirye, C., & Mwebaze, E. (2017). Mining voter sentiments from Twitter data for the 2016 Uganda Presidential elections. International Journal of Technology and Management, 3(2), 12. Retrieved from https://utamu.ac.ug/ijotm/index.php/ijotm/article/view/42
In 2017, we presented the research paper “Mining voter sentiments from Twitter data for the 2016 Uganda Presidential elections” at the Neural Information Processing Systems conference in Montreal, Canada. The paper was published in the International Journal of Technology and Management in Kampala, Uganda.
We used natural language processing, data mining and statistical techniques to mine sentiments from social media data in order to predict the outcome of an electoral process, and we found that meaningful insights can be extracted from the data. The uniqueness of social media data calls for novel natural language processing techniques that can effectively handle user-generated content with rich social relations in order to build descriptive and predictive models of social interactions. We collected and analysed tweets about the 2016 Uganda Presidential elections posted during January and February 2016. We derive inferences from the data and show that in some cases Twitter can be informative about actual events happening on the ground. In the analysis we use a word-emotion lexicon to determine the nature of the sentiments in the tweets, and semantic orientation, based on the pointwise mutual information technique, to determine the ties that conversations within the tweets have with positive and negative contexts. We find that Twitter data analytics using both intensity-based measures and sentiment analysis can usefully reflect the current offline political sentiment. We also make a number of observations related to the task of monitoring public sentiment during an election campaign, including examining a variety of sample sizes and time parameters, as well as methods for quantitatively and qualitatively exploring the underlying content.
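The semantic orientation step can be illustrated with a minimal sketch. It assumes tweet-level co-occurrence counts and two small seed word lists; the seed words, the sample tweets and all function names are hypothetical and are not taken from the paper. Pairs that never co-occur are given a PMI of 0 here, which is a simplification.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical seed words; the paper's actual lexicon is not reproduced here.
POSITIVE_SEEDS = {"good", "win", "peace"}
NEGATIVE_SEEDS = {"bad", "fraud", "violence"}

# Hypothetical sample tweets, used only to exercise the functions.
SAMPLE_TWEETS = [
    "candidate good win",
    "candidate good peace",
    "rival bad fraud",
    "rival fraud violence",
]


def cooccurrence_counts(tweets):
    """Count in how many tweets each word, and each word pair, appears."""
    word_df = Counter()
    pair_df = Counter()
    for text in tweets:
        tokens = set(text.lower().split())
        word_df.update(tokens)
        pair_df.update(frozenset(p) for p in combinations(sorted(tokens), 2))
    return word_df, pair_df


def pmi(x, y, word_df, pair_df, n_docs):
    """Pointwise mutual information: log2(P(x, y) / (P(x) * P(y)))."""
    joint = pair_df[frozenset((x, y))]
    if joint == 0:
        return 0.0  # simplification: treat unseen pairs as neutral
    return math.log2(joint * n_docs / (word_df[x] * word_df[y]))


def semantic_orientation(word, tweets):
    """Association with positive seeds minus association with negative seeds."""
    word_df, pair_df = cooccurrence_counts(tweets)
    n_docs = len(tweets)
    pos = sum(pmi(word, s, word_df, pair_df, n_docs) for s in POSITIVE_SEEDS)
    neg = sum(pmi(word, s, word_df, pair_df, n_docs) for s in NEGATIVE_SEEDS)
    return pos - neg
```

A positive score suggests that a word tends to appear in positive contexts and a negative score the opposite; on real tweets, tokenisation and seed selection would need far more care than shown here.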
Project #3: Prediction of spectrum holes in cognitive radio ad-hoc networks (under consideration for publication)
By Mukonyezi Isaac and Dr. Johnson Mwebaze
A Cognitive Radio Ad-hoc Network (CRAHN) is an infrastructure-less branch of cognitive radio networks (CRNs) whose nodes are equipped with cognitive radios that can optimize performance by adapting to network conditions. The majority of existing routing protocols use instantaneous parameters to achieve adaptiveness. To achieve a more robust cognitive routing protocol for these networks, we proposed using artificial intelligence techniques, in particular machine learning, in the routing process. In this work, we applied a machine learning classification algorithm to the Dual Diversity Cognitive Ad-hoc Routing Protocol (D2CARP), modelling the spectrum occupancy of licensed (primary) users to predict spectrum holes for opportunistic access by the cognitive users. The new protocol was evaluated through experiments in the Network Simulator (NS-2), and we find that it gives better results on the network performance parameters measured, namely packet delivery ratio, end-to-end delay, normalized routing load and packet loss. Incorporating the channel usage history statistics of licensed users in the routing process in CRAHNs was found to improve the performance of the cognitive network.
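The idea of predicting spectrum holes from channel usage history can be sketched as follows. The text does not name the classifier that was used, so this example substitutes a simple first-order transition-frequency predictor with Laplace smoothing as a stand-in; the channel names, the 0/1 encoding and the function names are assumptions made for illustration only.

```python
def idle_probability(history):
    """Estimate P(channel idle in next slot | its current state).

    history: list of observations, oldest first, 0 = idle, 1 = busy
    (an assumed encoding of primary-user activity).
    """
    if not history:
        return 0.5  # no evidence either way
    last = history[-1]
    # States that followed earlier occurrences of the current state.
    followers = [nxt for cur, nxt in zip(history, history[1:]) if cur == last]
    idle_after = followers.count(0)
    # Laplace smoothing keeps the estimate away from exactly 0 or 1.
    return (idle_after + 1) / (len(followers) + 2)


def predict_hole(history, threshold=0.5):
    """Classify the next slot as a spectrum hole (True) or occupied (False)."""
    return idle_probability(history) > threshold


def best_channel(histories):
    """Pick the channel with the highest predicted idle probability."""
    return max(histories, key=lambda ch: idle_probability(histories[ch]))


# Hypothetical occupancy histories for three channels.
HISTORIES = {
    "ch1": [1, 1, 1, 1, 1, 1],  # persistently busy
    "ch2": [0, 0, 0, 0, 0, 0],  # persistently idle
    "ch3": [0, 1, 0, 1, 0, 1],  # alternating
}
```

A routing protocol could use a prediction of this kind to prefer channels that are likely to stay free of primary-user activity, rather than relying only on the instantaneous channel state.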
Project #4: Signature-based Denial of Service and Probe Detection, a Machine Learning approach
Babirye, C., & Mwebaze, E. (1). Signature-based Denial of Service and Probe Detection, a Machine Learning approach. International Journal of Technology and Management, 3(2), 11. Retrieved from https://utamu.ac.ug/ijotm/index.php/ijotm/article/view/44
Presently, a key strategy for countering attacks on computer networks is the use of Intrusion Detection Systems (IDSs), which detect attacks on a network. However, the uniqueness and frequency of these attacks call for novel approaches, such as the use of machine learning techniques to model network traffic as it changes and to detect anomalous traffic. In this paper, we present work on the detection of Denial of Service (DoS) and Probe attacks in network traffic using machine learning and data mining techniques. We build our models on the common KDD dataset as well as live data from a wireless network at an institution of learning that has numerous and diverse users. We show the efficacy of machine learning algorithms for detecting these two attacks.
Project #5: Malware Classification using API System Calls
Ninyesiga, A., & Ngubiri, J. (1). Malware Classification using API System Calls. International Journal of Technology and Management, 3(2), 9. Retrieved from https://utamu.ac.ug/ijotm/index.php/ijotm/article/view/41
Recent studies have shown data mining to be promising in identifying malware by analysing API calls. However, this approach only detects whether a file is malicious or not; it does not classify the file into the malware class to which it belongs. This makes elimination harder, as elimination schemes are mostly class-based. Classification as a post-detection process is therefore important if the malware is to be eliminated from the system. We make an experimental study of the data mining approach to classifying malware using 4-gram API system calls. We use a dataset of 552 Windows Portable Executables (PEs) with their corresponding API calls. The PEs were executed in a Windows 7 virtual environment using the Cuckoo sandbox. Relevant 4-gram API call features are extracted using Term Frequency-Inverse Document Frequency (TF-IDF). Gaussian Naive Bayes, SVM, Random Forest, and Decision Tree classifiers were trained and tested on the data. We show that the technique is successful, with accuracy between 92% and 96.4%. Accuracy varies between classifiers, with SVM and Decision Trees performing best and Gaussian Naive Bayes performing worst.
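The feature extraction step can be sketched as below: each API call trace is split into overlapping 4-grams, and each 4-gram is weighted by TF-IDF. The abstract does not say which TF-IDF variant was used, so the unsmoothed form tf × log(N / df) is an assumption, and the sample traces and function names are hypothetical.

```python
import math
from collections import Counter


def api_ngrams(calls, n=4):
    """Return the overlapping n-grams of an API call sequence as tuples."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def tfidf(traces, n=4):
    """Compute TF-IDF weights of API call n-grams, one dict per trace.

    tf  = n-gram count / total n-grams in the trace
    idf = log(number of traces / number of traces containing the n-gram)
    """
    docs = [Counter(api_ngrams(trace, n)) for trace in traces]
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    weights = []
    for doc in docs:
        total = sum(doc.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in doc.items()
        })
    return weights


# Hypothetical API call traces for two executables.
SAMPLE_TRACES = [
    ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose", "NtOpenFile"],
    ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose", "RegOpenKey"],
]
```

A 4-gram that appears in every trace receives a weight of zero, so the features that remain are those that help to tell samples apart. The resulting weight dictionaries would then be turned into fixed-length vectors before being passed to a classifier.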
Project in progress: To design, build and test a spectrum efficient cognitive Multiple Input Multiple Output rural broadband internet communication network using TV white spaces (TVWS) based on the IEEE 802.22b standard
We are currently conducting research in cognitive radio technology to design, build and test a spectrum efficient cognitive Multiple Input Multiple Output rural broadband internet communication network using TV white spaces (TVWS) based on the IEEE 802.22b standard. According to preliminary data from Uganda Electricity Transmission Company Limited (UETCL), the energy sales team spends about 2,500 USD every month to manually read 55% of the energy meters at handover locations (distributors and generating stations) in hard-to-reach areas that have neither an existing UETCL communication network nor a private telecom network. Using TVWS, this cost can be reduced to under 1,000 USD per month in OPEX, with immediate ROI benefits and CAPEX recovered within just 3 years. Optimal network performance can be achieved using low-cost, low-power, long-range TVWS.
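The cost figures above imply a simple payback calculation, sketched here. It takes 1,000 USD as the new monthly cost, which is the stated ceiling, so actual savings would be somewhat higher. The CAPEX amount is not given in the text; the 54,000 USD used in the usage note is only the total that 36 months of savings at this rate would add up to, and the function names are hypothetical.

```python
import math


def monthly_saving(current_cost, new_cost):
    """Monthly OPEX saving in USD."""
    return current_cost - new_cost


def cumulative_saving(saving_per_month, months):
    """Total saving in USD accumulated over a number of months."""
    return saving_per_month * months


def payback_months(capex, saving_per_month):
    """Whole months of savings needed to recover a CAPEX amount."""
    return math.ceil(capex / saving_per_month)


CURRENT_MONTHLY_COST = 2_500  # USD, manual meter reading (from the text)
TVWS_MONTHLY_COST = 1_000     # USD, ceiling stated in the text
PAYBACK_WINDOW_MONTHS = 36    # 3 years (from the text)
```

With these inputs, `monthly_saving(2_500, 1_000)` is 1,500 USD and `cumulative_saving(1_500, 36)` is 54,000 USD, so a CAPEX of up to about that amount would be recovered within the stated three years at this saving rate.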
Corporate Social Responsibility
Our core data scientists have organised and facilitated sessions at IndabaX (a locally organised deep learning conference that helps ensure that knowledge and capacity in machine learning are spread more widely across the African continent) in Kampala, Uganda, in 2018 and 2019, where they led the practical sessions on “Introduction to deep learning with Theano, TensorFlow and Keras” and “Probabilistic Reasoning” respectively.
R&D Team:
Machine Learning Engineer: Mukonyezi Isaac, imu@junction.co.ug
Data Scientists: Claire Babirye, cba@junction.co.ug. Ninyesiga Allan, nin@junction.co.ug. Timothy Kivumbi, tim@junction.co.ug