SEDE 2019: Papers with Abstracts

Papers
Comparison of University Course Search Interfaces Andrew Munoz, Xiang Li, Bahadir Pehlivan, Frederick C. Harris and Sergiu Dascalu Abstract. Universities must implement interactive and useful search tools that allow students to search and find various courses that they wish to enroll. In this study, a comparison is made between three different University course search tools. These include the University of Nevada, Reno, Harvard University, and University of California, Berkeley. Participants in the study executed various tasks on each website and then compared each website based on ease of use and efficiency. While each web page has different strengths, the results showed that many preferred certain design elements over others such as a more simplistic design with a clear layout. This paper details the various comparisons made between the websites through experimental tasks as well as how data was gathered, analyzed, and processed.
A Peer-to-Peer Reputation Evaluation System Ming-Chang Huang Abstract. Lack of incentives makes most P2P users unwilling to cooperate and lead to free-riding behavior. One way to encourage cooperation is through service differentiation based on each peer’s contributions. This paper presents FuzRep, a reputation system for P2P networks. FuzRep uses fuzzy logic method which uses requester’s reputation and provider’s inbound bandwidth as input information to create incentives for sharing and to avoid overloading problems for primary file providers. Reputation sharing in FuzRep is implemented by interest-based selective polling, which can significantly decrease overheads for reputation communication.
Design and Development of the CTAR All-Star Terri Heglar, Andrew Penrose, Austin Yount, Kristine Galek, Yantao Shen, Sergiu Dascalu and Frederick C. Harris Abstract. The CTAR All-Star is a system consisting of a rubber ball, a pressure sensor, and a bluetooth transmitter paired with a cross-platform mobile application. The device is used as a rehabilitation tool for people with dysphagia in a similar fashion to the traditional chin tuck against resistance (CTAR) exercise by squeezing a ball between the chin and upper chest. The mobile device monitors and displays the pressure inside the ball on a real-time graph allowing the patient to follow exercise routines set by Speech-Language Pathologists. Additionally, the application stores exercise data that can be used to both monitor the patient's progress over time and provide objective data for future research purposes.
Motivational Game-Theme Based Instructional Module for Teaching Binary Tree and Linked List Sarika Rajeev and Sharad Sharma Abstract. Game theme based learning modules will revolutionize education because it increases motivation and engagement of students as they learn interactively. This study is aimed to assess the perceived motivation and engagement of undergraduate students for game theme based learning in introductory programming courses. This paper presents the design, implementation, and evaluation of a game theme-based instructional (GTI) module to teach linked list and binary tree data structure. We have used FDF (four- dimensional framework) with a minor extension for the design and development of GTI modules. The design of the GTI module is modeled on the constructive approach of learning. The purpose of this paper is to overcome the intellectual crisis by providing a new way of thinking and learning. We have evaluated the GTI modules based on the five components of Science Motivation Questionnaire II (SMQII). The results of the evaluation of GTI modules show that the students feel self-determined and motivated towards their learning and career.
Scalable Correlated Sampling for Join Query Estimations on Big Data David Wilson, Wen-Chi Hou and Feng Yu Abstract. Estimate query results within limited time constraints is a challenging problem in the research of big data management. Query estimation based on simple random samples per- forms well for simple selection queries; however, return results with extremely high relative errors for complex join queries. Existing methods only work well with foreign key joins, and the sample size can grow dramatically as the dataset gets larger. This research implements a scalable sampling scheme in a big data environment, namely correlated sampling in map-reduce, that can speed up search query length results, give precise join query estimations, and minimize storage costs when presented with big data. Extensive experiments with large TPC-H datasets in Apache Hive show that our sampling method produces fast and accurate query estimations on big data.
Human-Centric Situational Awareness and Big Data Visualization Sri Teja Bodempudi, Sharad Sharma, Atma Sahu and Rajeev Agrawal Abstract. Human-centric situational awareness and visualization are needed for analyzing the big data in an efficient way. One of the challenges is to create an algorithm to analyze the given data without any help of other data analyzing tools. This research effort aims to identify how graphical objects (such as data-shapes) developed in accordance with an analyst's mental model can enhance analyst's situation awareness. Our approach for improved big data visualization is two-fold, focusing on both visualization and interaction. This paper presents the developed data and graph technique based on force- directed model graph in 3D. It is developed using Unity 3D gaming engine. Pilot testing was done with different data sets for checking the efficiency of the system in immersive environment and non-immersive environment. The application is able to handle the data successfully for the given data sets in data visualization. The currently graph can render around 200 to 300 linked nodes in real-time.
Socio-Analyzer: A Sentiment Analysis Using Social Media Data Ajay Bandi and Aziz Fellah Abstract. The usage of social media is rapidly increasing day by day. The impact of societal changes is bending towards the peoples’ opinions shared on social media. Twitter has re- ceived much attention because of its real-time nature. We investigate recent social changes in MeToo movement by developing Socio-Analyzer. We used our four-phase approach to implement Socio-Analyzer. A total of 393,869 static and stream data is collected from the data world website and analyzed using a classifier. The classifier identify and categorize the data into three categories (positive, neutral, and negative). Our results showed that the maximum peoples’ opinion is neutral. The next higher number of peoples’ opinion is contrary and compared the results with TextBlob. We validate the 765 tweets of weather data and generalize the results to MeToo data. The precision values of Socio-Analyzer and TextBlob are 70.74% and 72.92%, respectively, when considered neutral tweets as positive.
Vertical Data Processing for Mining Big Data: A Predicate Tree Approach Mohammad Hossain, Maninder Singh and Sameer Abufardeh Abstract. Time is a critical factor in processing a very large volume of data a.k.a ‘Big Data’. Many existing data mining algorithms (supervised and unsupervised) become futile because of the ubiquitous use of horizontal processing i.e. row-by-row processing of stored data. Processing time for big data is further exacerbated by its high dimensionality (# of features) and high cardinality (# of records). To address this processing-time issue, we proposed a vertical approach with predicate trees (pTree). Our approach structures data into columns of bit slices, which range from few to hundreds and are processed vertically i.e. column by column. We tested and compared our vertical approach to traditional (horizontal) approach using three basic Boolean operations namely addition, subtraction and multiplication with 10 data sizes. The length of data size ranged from half a billion bits to 5 billion bits. The results are analyzed w.r.t processing speed time and speed gain for both the approaches. The result shows that our vertical approach outperformed the traditional approach for all Boolean operations (add, subtract and multiply) across all data sizes and results in speed-gain between 24% to 96%. We concluded from our results that our approach being in data-mining ready format is best suited to apply to operations involving complex computations in big data application to achieve significant speed gain.
ROI Estimation in a Scrum Project: A case study Rita Cortés and Fulvio Lizano Abstract. Financial metrics are necessary to inform decisions about the beginning or continuity of a software development project to justify investments. This research discuses initial ROI (Return on Investment) estimates in a software project using Scrum and how to analyze variations in the initial calculations to make return on investment decisions during partial deliveries of the product. The case study included a survey, a review of documentation, two focus group sessions, and an exercise involving application of the proposed technique. Twenty-four professionals participated, of which 4 were Scrum trainers (17%), 4 were officials of the company where the estimation technique was applied (17%), and 16 were project managers of domestic and foreign software development companies (66%), all of whom had experience in project management. This study provides elements to be considered in future research on ROI calculation in projects using Scrum, and can be used as a guide to estimate and review financial metrics during the execution of an actual project.
A lightweight environment for 2D visual applications Armando Arce-Orozco, Antonio González-Torres and Erick Mata-Montero Abstract. People and organizations constantly use a wide variety of devices and formats to produce large volumes of data. This involves a series of challenges related to processing and transforming data into valuable knowledge to carry out informed decisions. This research work departs from the fact that information visualization is a critical element in the pro- cess of data analysis that takes advantage of the visual and cognitive abilities of people to explore, discover, interpret, and understand patterns in data with the use of visual representations and human-computer interaction techniques. This research proposes Dio ̈ko ̈l, a programming environment developed with Lua and OpenVG to facilitate the learning process of programmers with little experience in the implementation of visualizations. The environment was designed after the careful study of similar tools and the most popular visualization libraries. Its design takes into account their weaknesses and strengths to propose a set of features that can make it an efficient alternative to learning about and program visualizations. In addition to typical functionality provided by several visualization tools and libraries, Dio ̈k ̈ol is a lightweight environment that provides a simple and effective event manager, is scalable, small, portable and contributes an environment with a simple and easy to understand graphical interface and functionality, similar to the ones provided by Processing.
On Architectural Decay Prediction and Detection in Real-Time Software Systems Aziz Fellah and Ajay Bandi Abstract. As the number of software applications including the widespread of real-time and em- bedded systems are constantly increasing and tend to grow in complexity, the architecture tends to decay over the years, leading to the occurrence of a spectrum of defects and bad smells (i.e., instances of architectural decay) that are manifested and sustained over time in a software system’s life cycle. Thus, the implemented system is not compliant to the specified architecture and such architectural decay becomes an increasing challenge for the developers. We propose a set of constructive architecture views at different levels of granularity, which monitor and ensure that the modifications made by developers at the implementation level are in compliance with those of the different architectural timed-event elements of real-time systems. Thus, we investigated a set of orthogonal architectural de- cay paradigms timed-event component decay, timed-event interface decay, timed-event connector decay and timed-event port decay. All of this has led to predicting, forecasting, and detecting architectural decay with a greater degree of structure, abstraction techniques, architecture reconstruction; and hence offered a series of potential effectiveness and enhancement in gaining a deeper understanding of implementation-level bad smells in real-time systems. Furthermore, to support this research towards an effective architectural decay prediction and detection geared towards real-time and embedded systems, we investigated and evaluated the effect of our approach through a real-time Internet of Things (IoT) case study.
Evaluation of Mobile Augmented Reality Application for Building Evacuation James Stigall and Sharad Sharma Abstract. Building occupants must know how to properly exit a building should the need ever arise. Being aware of appropriate evacuation procedures eliminates (or reduces) the risk of injury and death occurring during an existing catastrophe. Augmented reality (AR) is increasingly being sought after as a teaching and training tool because it offers a visualization and interaction capability that captures the learner’s attention and enhances the learner’s capacity to retain what was learned. Utilizing the visualization and interaction capability that AR offers and the need for emergency evacuation training, this paper explores mobile AR application (MARA) constructed to help users evacuate a building in the event of an emergency such as a building fire, active shooter, earthquake, and similar circumstances. The MARA was built for Android-based devices using Unity and Vuforia. Its features include the use of intelligent signs (i.e. visual cues to guide users to the exits) to help users evacuate a building. Inter alia, this paper discusses the MARA’s implementation and its evaluation through a user study utilizing the Technology Acceptance Model (TAM) and the System Usability Scale (SUS) frameworks. The results demonstrate the participants’ opinions that the MARA is both usable and effective in helping users evacuate a building.
Adaptation of Orthogonal Defect Classification for Mobile Applications Osama Barack and Liguo Huang Abstract. As mobile applications have become popular among end-users, developers have intro- duced a wide range of features that increase the complexity of application code. Orthogonal Defect Classification (ODC) is a model that enables developers to classify defects and track the process of inspection and testing. However, ODC was introduced to classify defects of traditional software. Mobile applications differ from traditional applications in many ways; they are susceptible to external factors, such as screen and network changes, notifi- cations, and phone interruptions, which affect the applications’ functioning. Therefore, in this paper, the ODC model will be adapted to accommodate defects of mobile applications. This allows us to address newly introduced application defects found in the mobile domain, such as energy, notification, and Graphical User Interface (GUI). In addition, based on the new model, we classify found defects of two well-known mobile applications. Moreover, we discuss one-way and two-way analyses. This work provides developers with a suitable defect analysis technique for mobile applications.
Homomorphic Encryption and Data Security in the Cloud Timothy Oladunni and Sharad Sharma Abstract. In the recent times, the use of cloud computing has gained popularity all over the world. There are lots of benefits associated with the use of this modern technology, however, there is a concern about the security of information during computation. A homomorphic encryption scheme provides a mechanism whereby arithmetic operation on the ciphertexts produces the same result as the arithmetic operation on plaintexts. Concept of homomorphic encryption (HME) is discussed with reviews, applications and future challenges to this promising field of research
Software Defect Density Analysis Cuauhtémoc López-Martín Abstract. Defect density (DD) is a measure to determine the effectiveness of software processes. DD is defined as the total number of defects divided by the size of the software. Software prediction is an activity of software planning. This study is related to the analysis of attributes of data sets commonly used for building DD prediction models. The data sets of software projects were selected from the International Software Benchmarking Standards Group (ISBSG) Release 2018. The selection criteria were based on attributes such as type of development, development platform, and programming language generation as suggested by the ISBSG. Since a lower size of data set is generated as mentioned criteria are observed, it avoids a good generalization for models. Therefore, in this study, a statistical analysis of data sets was performed with the objective of knowing if they could be pooled instead of using them as separated data sets. Results showed that there was no difference among the DD of new projects nor among the DD of enhancement projects, but there was a difference between the DD of new and enhancement projects. Results suggest that prediction models can separately be constructed for new projects and enhancement projects, but not by pooling new and enhancement ones.
Open Source Software Survivability Prediction Using Multi Layer Perceptron Vijaya Kumar Eluri, Shahram Sarkani and Thomas Mazzuchi Abstract. Many organizations develop software systems using Open Source Software (OSS) components. OSS components have a high risk of going out of support, making dependency on OSS components risky. So, it is imperative to perform risk assessment during the selection of OSS components. A model that can predict OSS survivability provides an objective criterion for such an assessment. Currently, there are no simple, quick and easy methods to predict survivability of OSS components. In this paper, we build a simple Multi Layer Perceptron (MLP) to predict OSS survivability. We performed experiments on 449 OSS components containing 215 active components and 234 inactive components collected from GitHub. The evaluation results show MLP achieves 81.44% validation accuracy for survivability prediction on GithHub dataset.
Cloud Based Framework to Integrate Map, GPS, and Android Apps Milad A. Khalil and Jiang Guo Abstract. Our cloud based Smart Order Online System is a cloud based and context-aware online retailing system with real time updates. It is designed to verify our framework and handle the operations of various types of e-commerce business. It is aimed at helping customers to obtain order information through cloud based platforms such as mobile devices, laptops or desktop computers. The system allows customers to place orders, to find location and direction to branches, and allows managers to track the delivering drivers’ locations and display them on the map. The clustering/grouping algorithm plays a very important role in our framework. It is also discussed in the paper.
Utilizing Optical Character Recognition and Boarder Detection Algorithms to Identify Trading Cards Brodie Boldt, Christopher Cooper, Ryan Fox, Jared Parks and Erin Keith Abstract. Magic: The Gathering is a popular physical trading card game played by millions of people around the world. To keep track of their cards, players typically store them in some sort of physical protective case, which can become cumbersome to sort through as the number of cards can reach up to the thousands. By utilizing and improving optical character recognition software, the TCG Digitizer allows users to efficiently store their entire inventory of Magic: The Gathering trading cards in a digital database. With an emphasis on quick and accurate scanning, the final product provides an intuitive digital solution for storing Magic: The Gathering cards for both collectors and card owners who want to easily store their collection of cards on a computer.
Cyber Threat Discovery from Dark Web Azene Zenebe, Mufaro Shumba, Andrei Carillo and Sofia Cuenca Abstract. In the darknet, hackers are constantly sharing information with each other and learning from each other. These conversations in online forums for example can contain data that may help assist in the discovery of cyber threat intelligence. Cyber Threat Intelligence (CTI) is information or knowledge about threats that can help prevent security breaches in cyberspace. In addition, monitoring and analysis of this data manually is challenging because forum posts and other data on the darknet are high in volume and unstructured. This paper uses descriptive analytics and predicative analytics using machine learning on forum posts dataset from darknet to discover valuable cyber threat intelligence. The IBM Watson Analytics and WEKA machine learning tool were used. Watson Analytics showed trends and relationships in the data. WEKA provided machine learning models to classify the type of exploits targeted by hackers from the form posts. The results showed that Crypter, Password cracker and RATs (Remote Administration Tools), buffer overflow exploit tools, and Keylogger system exploits tools were the most common in the darknet and that there are influential authors who are frequent in the forums. In addition, machine learning helps build classifiers for exploit types. The Random Forest classifier provided a higher accuracy than the Random Tree and Naïve Bayes classifiers. Therefore, analyzing darknet forum posts can provide actionable information as well as machine learning is effective in building classifiers for prediction of exploit types. Predicting exploit types as well as knowing patterns and trends on hackers’ plan helps defend the cyberspace proactively.
Environmental extreme events detection: A survey Rakesh Matta, Rui Wu and Shanyue Guan Abstract. The application of statistics in extreme events detection is quite diverse and leads to diverse formulations, which needs to be designed for the specific problem. Each formula needs to be tailored specifically to work with the available data in the given situation. This diversity is one of the driving forces of this survey towards identifying the most common mixture of components utilized in the analysis of environmental outlier detection. Indeed for some arbitrary applications, it may not always be possible to use off-the-shelf models due to the wide variations in problem formulations. In this paper, we summarize the statistical methods involved in the detection of environmental extremes such as wind ramps, high precipitation and extreme temperatures. Then we organize the discussion along different outlier detection types, present various outlier definitions, and briefly introduce the corresponding techniques. Environmental extreme events detection challenges and possible future work are also discussed.
Creational and Structural Patterns in a Flexible Machine Learning Framework for Medical Ultrasound Diagnostics Corey Thibeault Abstract. The impact of machine learning in medicine has arguably lagged behind its commercial counterparts. This may be attributable to the generally slower pace and higher costs associated with clinical applications, but also present are the conflicting constraints and requirements of learning from data in a highly regulated industry that introduce levels of complexity unique to the medical space. Because of this, the balance between innovation and controlled development is challenging. Adding to this are the multiple modalities found in most clinical applications where applying traditional machine learning preprocessing and cross-validation techniques can be precarious. This work presents the novel use of creational and structural design patterns in a generalized software framework intended to alleviate some of those difficulties. Designed to be a configurable pipeline to not only support the experimentation and development of diagnostic machine learning algorithms, but also to support the transition of those algorithms into production level systems in a composed manner. The resulting framework provides the foundation for developing unique tools by both novice and expert data scientists.
Documenting architectural rationale using source code annotations: An exploratory study Santiago Hyun Dorado and Julio Ariel Hurtado Abstract. The architectural rationale is the documentation of the reasons why certain software design decisions are made that satisfy quality needs in a system. On many occasions this rationale is implicitly found in precise sentences of the system's documentation or the same source code, making it difficult to understand and make decisions on the maintenance phase, leading to a deviation and erosion of the architecture and therefore aging of the software. In this paper, we discuss the utility of a tool based on code annotations as an alternative to document the architectural rationale with the source code. For this, a quasi-experiment with local industry software engineers is done, in order to answer the research question: Does the source code annotations, with information about the architectural rationale, improves the software maintenance? The quasi-experiment is done with software engineers who know Java language and notions of software architectures. It included 3 tasks that involve changes to the architecture and its documentation. After the results were analyzed using the t-student test, concluding that the participants who use the annotations with information of the Architectural Rationale achieve a better understanding of the architecture and its Rationale than those using a traditional way for documenting the rationale(documents). However, the efficiency and effectiveness of maintenance time do not depend on the Rationale specification. With the same problem, the variation was due to the ability of individuals to develop software, but the documentation of the architecture, in general, was very important to be able to make the changes within the limits.
ARIA 3.0: A Modern Approach to Web-based Music Festival Registration Systems Nikkolas J. Irwin, Anthony Bennett, Kevin Carlos, Jalal Kiswani, Cynthia R. Harris, Sergiu Dascalu and Frederick C. Harris Abstract. Administration, Registration, and Information Assistant (ARIA) 3.0 is a full-stack web- application developed to assist the Northern Nevada Music Teachers Association (NNMTA) with numerous music festivals and competitions occurring annually. ARIA’s primary func- tionalities are the following: event creation, event scheduling, event management, user management, document processing, and payment processing. Prior to the creation of ARIA, these functionalities were performed using-paper methods. This approach was tedious, inefficient, and prone to errors. The previous versions of ARIA (1.0/2.0) were developed to address these issues using the WordPress platform and WordPress plugins. While successful, new business requirements proposed by NNMTA have demonstrated that WordPress may not be a feasible software platform for all of NNMTA’s needs going for- ward. To ensure that ARIA can advance with NNMTA, ARIA 3.0 was developed using an “as-is” and “to-be” design approach. We built a microservice application which supports the implementation of features that NNMTA would like to maintain from previous versions while simultaneously providing a platform suitable for future software requirements. Additionally, new portals built for customers, students, teachers, and administrators aim to improve the user experience (UX) and user interface (UI) by increasing engagement with ARIA through new functionalities, greater access, and greater control over information.
Quack: A Youth-Targeted Game for English Comprehension Alexander Redei Abstract. Second grade is a critical year in the development of a child’s understanding of literature. Studies have shown that students who struggle with reading at the second grade will continue to struggle with reading for the rest of their lives. The purpose of this paper is to introduce a new method for increasing English comprehension for at-risk youth. In this paper we propose a fun and easy-to-use game called Quack. Our approach is to leverage new technologies with an existing school program, such as one-on-one mentoring, in addition to a targeted English comprehension game to enhance the educational experience of second graders struggling with reading comprehension. The Quack game implements a spelling challenge system to test the student’s vocabulary and spelling ability. Part of the game, an options system, allows instructors and students to customize the experience to each individual’s needs. Quack provides a novel approach to educational gaming through three new concepts: (i) Quack is free to use and open source (ii) Quack is customizable to the individual’s English comprehension needs (iii) Quack incorporates a rewarding English-comprehension system, effectively “gamifying” learning proper spelling.