International Journal of Web Information Systems

Publisher:
Emerald Group Publishing Limited
Publication date:
2021-02-01
ISSN:
1744-0084

Latest documents

  • Guest editorial: Special issue on “current topics of knowledge graphs and semantic web”
  • Applied personal profile ontology for personnel appraisals

    Purpose: The purpose of this paper is to apply the authors’ earlier enhanced personal profile ontology (e-PPO) as a case study for the appraisal of lecturers in the Department of Computer Science, University of Uyo, Uyo, for promotion purposes. The e-PPO was developed as a sample smart résumé for selecting the best of three personnel using linguistic variables, and formal rules representing the combination of criteria and subcriteria were illustrated and used to allocate competent personnel to software requirement engineering tasks. The need for the smart résumé in appraisals was pointed out in the conference paper, which called for applicants’ data to be entered into the e-PPO for personnel appraisal. Design/methodology/approach: Appraisal is a regular review of employees’ performance and overall contribution to the organization they work for. A web application for personnel appraisal requires a PPO that includes both static and dynamic features. A personal profile is often modified for several purposes, calling for augmentation and annotation as needs arise. A résumé is one extract from a personal profile and often contains slightly different information depending on the need. Hurried preparation of a résumé may introduce bias and incorrect information for the sole aim of projecting the personnel as qualified for the available job. Religious and gender biases may sometimes be observed during appointments of new personnel, but are less likely during appraisals for promotion or task reassignment, because job targets and the required skills are already set and the appraisals pass through several phases that are not determined by a single individual.
    This work therefore applied the earlier developed e-PPO to the appraisal of the academic staff of the Department of Computer Science, University of Uyo, Uyo, Nigeria. A mixed approach drawing on existing ontology engineering methodologies such as Methontology and NeOn was followed in creating the e-PPO, a constraint-based semantic data model tested using the Protégé built-in reasoner with its updated plugins. With e-PPO applied to personnel appraisals, promotion and selection of employees for specific assignments in any organization become possible using the smart résumé. Findings: The smart résumé takes over numerous tasks that would otherwise fall to the human resources team, thereby reducing the processing time for appraisals. The appraisal task is carried out free of biases of any kind, such as gender and religion. Originality/value: This work is an extension of the authors’ original work.
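    The abstract above mentions combining weighted criteria and subcriteria, rated with linguistic variables, to rank personnel. A minimal sketch of that idea follows; the criteria names, weights, rating scale and candidate data are all invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch: ranking personnel by a weighted combination of
# appraisal criteria rated with linguistic variables. All names,
# weights and scales below are invented example values.

# Linguistic variables mapped to an assumed numeric scale.
LINGUISTIC = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}

# Assumed criterion weights (sum to 1.0).
WEIGHTS = {"teaching": 0.4, "research": 0.4, "service": 0.2}

def appraisal_score(ratings):
    """Weighted sum of linguistic ratings, normalised to [0, 1]."""
    total = sum(WEIGHTS[c] * LINGUISTIC[r] for c, r in ratings.items())
    return total / max(LINGUISTIC.values())

candidates = {
    "A": {"teaching": "excellent", "research": "good", "service": "fair"},
    "B": {"teaching": "good", "research": "very good", "service": "good"},
}
# Pick the candidate with the highest overall appraisal score.
best = max(candidates, key=lambda name: appraisal_score(candidates[name]))
```

    In the paper such rules live inside the ontology and are evaluated by a reasoner rather than by ad hoc Python code.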

  • Keyword-based faceted search interface for knowledge graph construction and exploration

    Purpose: Massive amounts of data, manifesting in various forms, are being produced on the Web every minute and becoming the new standard. Exploring these information sources distributed across different Web segments in a unified way is becoming a core task for a variety of user and company scenarios. However, knowledge creation and exploration from distributed Web data sources is a challenging task. Several data integration conflicts need to be resolved and the knowledge needs to be visualized in an intuitive manner. The purpose of this paper is to extend the authors’ previous integration works to address semantic knowledge exploration of enterprise data combined with heterogeneous social and linked Web data sources. Design/methodology/approach: The authors synthesize information in the form of a knowledge graph to resolve interoperability conflicts at integration time. They begin by describing KGMap, a mapping model for leveraging knowledge graphs to bridge heterogeneous relational, social and linked web data sources. The mapping model relies on semantic similarity measures to connect the knowledge graph schema with the sources' metadata elements. Then, based on KGMap, this paper proposes KeyFSI, a keyword-based semantic search engine. KeyFSI provides a responsive faceted navigating Web user interface designed to facilitate the exploration and visualization of embedded data behind the knowledge graph. The authors implemented their approach for a business enterprise data exploration scenario where inputs are retrieved on the fly from a local customer relationship management database combined with the DBpedia endpoint and the Facebook Web application programming interface (API). Findings: The authors conducted an empirical study to test the effectiveness of their approach using different similarity measures. The observed results showed better efficiency when using a semantic similarity measure.
In addition, a usability evaluation was conducted to compare KeyFSI features with recent knowledge exploration systems. The obtained results demonstrate the added value and usability of the contributed approach. Originality/value: Most state-of-the-art interfaces allow users to browse one Web segment at a time. The originality of this paper lies in proposing a cost-effective virtual on-demand knowledge creation approach, a method that enables organizations to explore valuable knowledge across multiple Web segments simultaneously. In addition, the responsive components implemented in KeyFSI allow the interface to adequately handle the uncertainty imposed by the nature of Web information, thereby providing a better user experience.
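    The KGMap idea of linking a knowledge-graph schema to source metadata via similarity scores can be illustrated in a few lines. The paper uses semantic similarity measures; plain string similarity from the standard library is used here only as a stand-in, and every field name below is an invented example.

```python
from difflib import SequenceMatcher

# Illustrative KGMap-style linking: each knowledge-graph schema term is
# matched to its most similar source metadata element, subject to a
# minimum similarity threshold. Field names are invented examples.

def similarity(a, b):
    """String similarity in [0, 1]; a placeholder for a semantic measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def map_schema(schema_terms, source_fields, threshold=0.5):
    mapping = {}
    for term in schema_terms:
        best = max(source_fields, key=lambda f: similarity(term, f))
        if similarity(term, best) >= threshold:
            mapping[term] = best
    return mapping

schema = ["customerName", "companyLocation"]
crm_fields = ["cust_name", "location", "order_id"]
mapping = map_schema(schema, crm_fields)
```

    Swapping `similarity` for an embedding-based or ontology-based measure is the kind of change the paper's empirical study evaluates.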

  • Semiautomated process for generating knowledge graphs for marginalized community doctoral-recipients

    Purpose: This paper aims to describe the “InDO: Institute Demographic Ontology” and demonstrates the InDO-based semiautomated process for both generating and extending a knowledge graph to provide a comprehensive resource for marginalized US graduate students. The knowledge graph currently consists of instances related to the semistructured National Science Foundation Survey of Earned Doctorates (NSF SED) 2019 analysis report data tables. These tables contain summary statistics of an institute’s doctoral recipients based on a variety of demographics. Incorporating institute Wikidata links ultimately produces a table of unique, clearly readable data. Design/methodology/approach: The authors use a customized semantic extract transform and loader (SETLr) script to ingest data from 2019 US doctoral-granting institute tables and preprocessed NSF SED Tables 1, 3, 4 and 9. The generated InDO knowledge graph is evaluated using two methods. First, the authors compare competency questions’ SPARQL results from both the semiautomatically and manually generated graphs. Second, the authors expand the questions to provide a better picture of an institute’s doctoral-recipient demographics within study fields. Findings: With some preprocessing and restructuring of the NSF SED highly interlinked tables into a more parsable format, one can build the required knowledge graph using a semiautomated process. Originality/value: The InDO knowledge graph allows the integration of US doctoral-granting institutes’ demographic data based on NSF SED data tables and its presentation in machine-readable form using a new semiautomated methodology.
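    Evaluating a knowledge graph with competency questions, as described above, amounts to asking fixed queries and checking their answers. The paper does this with SPARQL over the InDO graph; the tiny in-memory triple list below is only a stand-in for a real triplestore, and every triple in it is invented example data.

```python
# Toy illustration of answering a competency question over a knowledge
# graph represented as (subject, predicate, object) triples. In the
# paper this is a SPARQL query against the InDO graph; all data here
# is invented. Roughly equivalent SPARQL:
#   SELECT ?inst WHERE { ?inst :awardedDoctoratesIn :ComputerScience }

triples = [
    ("InstituteA", "awardedDoctoratesIn", "ComputerScience"),
    ("InstituteA", "doctoralRecipients2019", 42),
    ("InstituteB", "awardedDoctoratesIn", "Biology"),
    ("InstituteB", "doctoralRecipients2019", 17),
]

def institutes_with_field(field):
    """Competency question: which institutes awarded doctorates in a field?"""
    return [s for (s, p, o) in triples
            if p == "awardedDoctoratesIn" and o == field]

result = institutes_with_field("ComputerScience")
```

    Comparing such query results between the semiautomatically and manually built graphs is exactly the first evaluation method the abstract describes.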

  • From ontology to knowledge graph with agile methods: the case of COVID-19 CODO knowledge graph

    Purpose: The purpose of this paper is to describe the CODO ontology (COVID-19 Ontology) that captures epidemiological data about the COVID-19 pandemic in a knowledge graph that follows the FAIR principles. This study took information from spreadsheets and integrated it into a knowledge graph that could be queried with SPARQL and visualized with the Gruff tool in AllegroGraph. Design/methodology/approach: The knowledge graph was designed with the Web Ontology Language. The methodology was a hybrid approach integrating the YAMO methodology for ontology design and Agile methods to define iterations and the approach to requirements, testing and implementation. Findings: The hybrid approach demonstrated that Agile can bring the same benefits to knowledge graph projects as it has to other projects. The two-person team went from an ontology to a large knowledge graph with approximately 5 million triples in a few months. The authors gathered useful real-world experience on how to most effectively transform “from strings to things.” Originality/value: This study is the only FAIR model (to the best of the authors’ knowledge) to address epidemiology data for the COVID-19 pandemic. It also brought to light several practical issues that generalize to other studies wishing to go from an ontology to a large knowledge graph. This study is one of the first studies to document how the Agile approach can be used for knowledge graph development.

  • Open problems in medical federated learning

    Purpose: This study aims to summarize the critical issues in medical federated learning and applicable solutions. Detailed explanations of how federated learning techniques can be applied to the medical field are also presented. About 80 reference studies in the field were reviewed, and the federated learning framework currently being developed by the research team is described. This paper will help researchers build an actual medical federated learning environment. Design/methodology/approach: Since machine learning techniques emerged, large amounts of data have made more efficient analysis possible. However, data regulations have been tightened worldwide, and the use of centralized machine learning methods has become almost infeasible. Federated learning techniques have been introduced as a solution. Even with its powerful structural advantages, federated learning still faces unsolved challenges in real medical data environments. This paper aims to summarize those challenges by category and present possible solutions. Findings: This paper identifies four critical categories of issues to be aware of when applying the federated learning technique to an actual medical data environment, then provides general guidelines for building a federated learning environment as a solution. Originality/value: Existing studies have dealt with issues such as heterogeneity in the federated learning environment itself, but lacked discussion of how these issues cause problems in actual working tasks. Therefore, this paper helps researchers understand federated learning issues through examples from actual medical machine learning environments.
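    The structural advantage mentioned above, learning without sharing raw patient data, rests on an aggregation step such as federated averaging (FedAvg): clients train locally and the server combines only their model weights. A minimal sketch, with weights as plain lists of floats rather than real tensors and invented client sizes:

```python
# Minimal sketch of federated averaging (FedAvg): the server averages
# per-client model weights, weighted by each client's local data size.
# Only weights cross the network; raw data stays with each client.

def fed_avg(client_weights, client_sizes):
    """Weighted average of per-client model weight vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two simulated clients (e.g. hospitals) with different data volumes;
# the larger client pulls the global model toward its weights.
global_model = fed_avg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[100, 300])
```

    The medical-specific issues the paper categorizes (heterogeneous data, regulation, unreliable clients) all complicate this deceptively simple step.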

  • Fake news detection on Twitter

    Purpose: Owing to the increased accessibility of the internet and related technologies, more and more individuals across the globe now turn to social media for their daily dose of news rather than traditional news outlets. With the global nature of social media and hardly any checks on the posting of content, fake news can spread exponentially. Businesses propagate fake news to improve their economic standing by influencing consumers and demand, and individuals spread fake news for personal gain, such as popularity and life goals. The content of fake news is diverse in terms of topics, styles and media platforms, and fake news attempts to distort the truth with diverse linguistic styles while simultaneously mocking true news. All these factors together make fake news detection an arduous task. This work attempts to curb the spread of disinformation on Twitter. Design/methodology/approach: This study carries out fake news detection using user characteristics and tweet textual content as features. For categorizing user characteristics, this study uses the XGBoost algorithm. To classify the tweet text, this study uses various natural language processing techniques to pre-process the tweets and then applies a hybrid convolutional neural network–recurrent neural network (CNN-RNN) and a state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) model. Findings: This study uses a combination of machine learning and deep learning approaches for fake news detection, namely, XGBoost, hybrid CNN-RNN and BERT. The models have also been evaluated and compared with various baseline models to show that this approach effectively tackles the problem. Originality/value: This study proposes a novel framework that exploits news content and social contexts to learn useful representations for predicting fake news.
This model is based on a transformer architecture, which facilitates representation learning from fake news data and helps detect fake news easily. This study also carries out an investigation into the relative importance of content and social-context features for detecting false news, and into whether the absence of one of these feature categories hampers the effectiveness of the resulting system. This investigation can go a long way in aiding further research on the subject and fake news detection in the presence of extremely noisy or unusable data.
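    The framework above combines two feature families: a score from the tweet text (the CNN-RNN/BERT side) and a score from user characteristics (the XGBoost side). A minimal late-fusion sketch is shown below; the fusion weight, threshold and both scores are invented placeholders, whereas a real system would learn them from data.

```python
# Illustrative late fusion of a content-based probability (e.g. from a
# text classifier) and a social-context probability (e.g. from a user-
# feature classifier). Alpha and the threshold are invented values.

def fuse(content_score, user_score, alpha=0.6):
    """Convex combination of content and social-context probabilities."""
    return alpha * content_score + (1 - alpha) * user_score

def is_fake(content_score, user_score, threshold=0.5):
    """Flag a tweet as fake when the fused probability crosses the threshold."""
    return fuse(content_score, user_score) >= threshold

# A tweet whose text looks fake (0.9) posted by a fairly normal-looking
# account (0.4) is still flagged under these example settings.
verdict = is_fake(content_score=0.9, user_score=0.4)
```

    The ablation question the abstract raises, what happens when one feature family is absent, corresponds to setting alpha to 0 or 1 in this sketch.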

  • Hotel room personalization via ontology and rule-based reasoning

    Purpose: The increasingly competitive hotel industry and emerging customer trends, where guests are more discerning and want a personalized experience, have led to the need for innovative applications. Personalization is all the more important for hotels now, in the post-COVID lockdown era, as their business model is challenged. However, personalization is difficult to design and realize owing to the variety of factors and requirements to be considered. In the accommodation domain, differences exist on both the supply side (hotels and their rooms) and the demand side (customers’ profiles and needs). As for implementation, critical issues lie in hardware-dependent and vendor-specific Internet of Things devices, which are difficult to program. Additionally, there is complexity in realizing applications that consider varying customer needs and context via existing personalization options. This paper aims to propose an ontological framework to enhance the capabilities of hotels in offering their accommodation and personalization options based on a guest’s characteristics, activities and needs. Design/methodology/approach: A research approach combining both quantitative and qualitative methods was used to develop a hotel room personalization framework. The core of the framework is a hotel room ontology (HoROnt) that supports well-defined machine-readable descriptions of hotel rooms and guest profiles. Hotel guest profiles are modeled via logical rules in an inference engine, whose reasoning functionality is used to recommend hotel room services and features. Findings: Both the ontology and the inference engine module have been validated with promising results that demonstrate high accuracy. The framework leverages user characteristics and dynamic contextual data to satisfy guests’ needs for personalized service provision. The semantic rules provide recommendations to both new and returning guests, thereby also addressing the cold-start issue.
Originality/value: This paper extends HoROnt in two ways: by adding instances of the concepts (room characteristics and services; guest profiles) to create a knowledge base, and by adding logical rules to an inference engine that models guests’ profiles and is used to offer personalized hotel rooms. Thanks to the standards adopted to implement personalization, this framework can be integrated into existing reservation systems. It can also be adapted for any type of accommodation since it is broad-based and personalizes varying features and amenities in the rooms.
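    The rule-based recommendation described above can be sketched as condition-action rules over a guest profile. In the paper such rules live in the ontology and are evaluated by a semantic reasoner; the Python below is only an analogy, and every rule, profile field and amenity name in it is an invented example.

```python
# Sketch of rule-based room personalization: each rule pairs a
# condition on the guest profile with an amenity to recommend. All
# fields and amenity names are invented illustrations.

RULES = [
    (lambda g: g.get("travels_for") == "business", "quiet_floor"),
    (lambda g: g.get("has_children"), "family_room"),
    (lambda g: "gym" in g.get("activities", []), "fitness_access"),
]

def recommend(guest):
    """Return every amenity whose rule fires for this guest profile."""
    return [amenity for condition, amenity in RULES if condition(guest)]

guest = {"travels_for": "business", "activities": ["gym", "swimming"]}
recs = recommend(guest)
```

    Because the rules fire on profile attributes rather than past bookings, even a first-time guest gets recommendations, which mirrors how the framework addresses the cold-start issue.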

  • Inference attacks based on GAN in federated learning

    Purpose: In the digital age, organizations want to build more powerful machine learning models that can serve people’s increasing needs. However, enhancing privacy and data security is one of the challenges for machine learning models, especially in federated learning. Parties want to collaborate to build a better model, but they do not want to reveal their own data. This study aims to introduce threats and defenses related to privacy leaks in the collaborative learning model. Design/methodology/approach: In the collaborative model, the attacker can be the central server or a participant. In this study, the attacker is a participant who is “honest but curious.” Attack experiments are on the participant’s side, which performs two tasks: one is to train the collaborative learning model; the second is to build a generative adversarial network (GAN) model that performs the attack to infer more information from what is received from the central server. There are three typical types of attack: white box, black box without auxiliary information and black box with auxiliary information. The experimental environment is set up with PyTorch on the Google Colab platform running on a graphics processing unit, with the Labeled Faces in the Wild and Canadian Institute For Advanced Research-10 data sets. Findings: The paper assumes that the privacy leakage attack resides on the participant’s side and that the information in the parameter server reveals too much of the knowledge used to train the collaborative machine learning model. This study compares the success of inference attacks from model parameters based on three GAN models: conditional GAN, control GAN and Wasserstein GAN (WGAN). Of the three, the WGAN model proved the most stable.
    Originality/value: Concerns about privacy and security for machine learning models are increasingly important, especially for collaborative learning. The paper contributes an experimental privacy attack from the participant side in the collaborative learning model.

  • CNN-BERT for measuring agreement between argument in online discussion

    Purpose: With the rise of online discussion and argument mining, methods that can analyze arguments become increasingly important. A recent study proposed using agreement between arguments to represent both stance polarity and intensity, two important aspects of analyzing arguments. However, that study primarily focused on fine-tuning a bidirectional encoder representations from transformers (BERT) model. The purpose of this paper is to propose a convolutional neural network (CNN)-BERT architecture to improve on the previous method. Design/methodology/approach: The CNN-BERT architecture used in this paper directly uses the hidden representations generated by BERT. This allows better use of the pretrained BERT model and makes fine-tuning it optional. The authors then compared the CNN-BERT architecture with the methods proposed in the previous study (BERT and Siamese-BERT). Findings: Experiment results demonstrate that the proposed CNN-BERT achieves 71.87% accuracy in measuring agreement between arguments. Compared to the previous study, which achieved an accuracy of 68.58%, the CNN-BERT architecture increases accuracy by 3.29 percentage points. The CNN-BERT architecture also achieves a similar result even without further pretraining the BERT model. Originality/value: The principal originality of this paper is the proposal to use CNN-BERT to make better use of the pretrained BERT model for measuring agreement between arguments. The proposed method improves performance and also achieves a similar result without further training the BERT model. This allows the BERT model to be separated from the CNN classifier, which significantly reduces the model size and allows the same pretrained BERT model to be reused for other problems that likewise do not need to fine-tune BERT.
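    The core operation of a CNN head over frozen BERT hidden states, as described above, is a 1-D convolution across the token sequence followed by pooling. The toy sketch below shows that operation in pure Python; the tiny 2-dimensional "embeddings" and kernel values are invented (real BERT hidden states are 768-dimensional, and a real head uses many kernels plus a classifier).

```python
# Toy 1-D convolution over a sequence of token embeddings followed by
# global max-pooling, the building block of a CNN head placed on top
# of (frozen) transformer hidden states. All values are invented.

def conv1d_maxpool(embeddings, kernel):
    """Slide a kernel over the token sequence, then global max-pool."""
    k = len(kernel)
    activations = []
    for i in range(len(embeddings) - k + 1):
        window = embeddings[i:i + k]
        # Dot product of the window with the kernel, element by element.
        act = sum(x * w for vec, kw in zip(window, kernel)
                  for x, w in zip(vec, kw))
        activations.append(act)
    return max(activations)

# Four 2-dimensional "token embeddings" and one kernel of width 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # responds to a two-token pattern
feature = conv1d_maxpool(tokens, kernel)
```

    Because only this small head is trained, the large BERT model can stay fixed and be shared across tasks, which is the size reduction the abstract highlights.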

Featured documents

  • A deep neural network-based approach for fake news detection in regional language

    Purpose: The current natural language processing algorithms are still lacking in judgment criteria, and these approaches often require deep knowledge of political or social contexts. The damage done by the spread of fake news in various sectors has attracted the attention of several low-l...

  • A framework to aggregate multiple ontology matchers

    Purpose: Although ontology matchers are annually proposed to address different aspects of the semantic heterogeneity problem, finding the most suitable alignment approach is still an issue. This study aims to propose a computational solution for ontology meta-matching (OMM) and a framework designed ...

  • A machine learning-based methodology to predict learners’ dropout, success or failure in MOOCs

    Purpose: Even if MOOCs (massive open online courses) are becoming a trend in distance learning, they suffer from a very high rate of learners’ dropout, and as a result, on average, only 10 per cent of enrolled learners manage to obtain their certificates of achievement. This paper aims to give...

  • On proposing and evaluating a NoSQL document database logical approach

    Purpose: NoSQL databases do not require a default schema associated with the data. Even so, they are categorized by data models. A model associated with the data can promote better strategies for persistence and manipulation of data in the target database. Based on this motivation, the purpose of ...

  • Fake news detection on Twitter

    Purpose: Owing to the increased accessibility of internet and related technologies, more and more individuals across the globe now turn to social media for their daily dose of news rather than traditional news outlets. With the global nature of social media and hardly any checks in place on posting ...

  • A practical guide for understanding online business models

    Purpose: There is a lack of clarity about what online business models are. The top 20 Google search results on online business models are articles that explain online business models. However, each of them deals with just one or two elements of business strategies. The list of business models is...

  • An approach to support the construction of adaptive Web applications

    Purpose: This paper aims to present Real-time Usage Mining (RUM), an approach that exploits the rich information provided by client logs to support the construction of adaptive Web applications. The main goal of RUM is to provide useful information about the behavior of users that are currently...

  • A systematic literature review for authorization and access control: definitions, strategies and models

    Purpose: Authorization and access control have been a topic of research for several decades. However, existing definitions are inconsistent and even contradicting each other. Furthermore, there are numerous access control models and even more have recently evolved to conform with the challenging...

  • The impact of headline features on the attraction of online financial articles

    Purpose: This study aims to investigate whether and to what extent the characteristics of headlines impact the attraction of online financial articles by using data collected from WeChat, a popular social app in China. Design/methodology/approach: By integrating the methods of econometric and text ...

  • Conversion of XML schema design styles with StyleVolution

    Purpose: Any XML schema definition can be organized according to one of the following design styles: “Russian Doll”, “Salami Slice”, “Venetian Blind” and “Garden of Eden” (with the additional “Bologna” style actually representing absence of style). Conversion from a design style to another can...
