Welcome to SCDD 2023

International Conference on Soft Computing, Data mining and Data Science (SCDD 2023)

May 13-14, 2023, Virtual Conference



Accepted Papers
Exploring Sentiment Analysis Research: A Social Media Data Perspective

Zahra Dahish1, 2 and Shah J Miah1, 1Newcastle Business School, College of Human and Social Futures, University of Newcastle, NSW, Australia, 2Management Information System, College of Business Administration, Jazan University, Saudi Arabia

ABSTRACT

Businesses use sentiment analysis in different ways to generate decision-support insights. Sentiment analysis studies of social media data have grown significantly, but their insights and trends have not been fully revealed to new researchers in the data mining field. It is therefore of paramount importance to delineate the trend holistically for knowledge growth in data mining and text analytics research. The study addresses this research gap through a comprehensive bibliometric review of 523 research articles indexed in the Scopus database (published between 2018 and 2022), covering content and thematic analysis. We adopt an automated bibliometric approach, using the R tool biblioshiny to generate and present outcomes. The findings point to vital uses of sentiment analysis such as innovation, transparency, and improved efficiency. They also highlight the uniqueness of sentiment analysis for synthesizing social media content to examine various aspects, such as the knowledge-domain map that detects author collaboration networks.

KEYWORDS

Social Media, Sentiment Analysis, Bibliometric Analysis, Systematic literature review.


Teaching and Learning With ICT Tools: Issues and Challenges

Er. Kamaljit Kaur, Assistant Professor, Department of Computer Science, Khalsa College, Garhdiwala

ABSTRACT

Students today are comfortable with technological devices, so making education less tedious calls for innovative methods that involve technology. Teaching via ICT tools has the potential to bring change to the academic sector. It is a dynamic learning method that offers more benefits than traditional blackboard-and-chalk learning. Rapid progress in technology has introduced many new ways of learning. These methods hold the attention of most students, spark their curiosity, and are in many ways more interesting and effective than traditional methods, whereas the whole process of education can sometimes feel tedious for students. In this digital era, ICT use in the classroom is important for giving students opportunities to learn and apply the required 21st-century skills. Hence, studying the issues and challenges related to ICT use in teaching and learning can help teachers overcome the obstacles and become successful technology users. With the advent of Information and Communications Technologies (ICT) in education, teachers form their own beliefs about the role of ICT as a teaching tool, the value of ICT for student learning outcomes, and their own personal confidence and competency. Barriers exist to integrating ICT in teaching and learning. These barriers are extrinsic to the teacher and include lack of resources, time, access, and technical support.

KEYWORDS

ICT Tools, Teaching and Learning Technology, Issues and Challenges.


Text Generation With GAN Networks Using Feedback Score

Dmitrii Kuznetsov, Department of Software Engineering, South China University of Technology, Guangzhou, China

ABSTRACT

Text generation using GAN networks is becoming more effective but still requires new approaches to achieve high-quality output. The discriminator model in a GAN solves this task partially, but it can be extended in more natural ways, such as feedback scoring. Feedback, or response, is a natural part of conversations; it consists not only of words but can also take other forms, such as emotions or other reactions. In dialogue, feedback is a factor that influences the next phrase or reaction. Depending on this feedback, we correct our possible answers by trying to change the tone, context, or even structure of the sentences. Applying feedback as part of the GAN model structure gives us new ways to generate well-controlled outputs with defined scores, which is very important in real-world applications and systems. Because of the training instability and unique architecture of GAN networks, applying feedback becomes trickier and requires new solutions. In this paper we review the use of feedback in GAN networks and experiment with two different approaches to applying it: using its score in the GAN discriminator’s loss function, or integrating score values into the generator model's layers.

KEYWORDS

Neural Networks, Text generation, GAN networks, Autoencoders, Controlled text generation.
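
The first of the two approaches named in the abstract above, using the feedback score in the discriminator's loss function, might look roughly like the following minimal sketch. It is not the author's implementation. It assumes feedback scores lie in [0, 1] and uses them as soft targets for generated samples in a binary cross-entropy loss; the names `bce` and `discriminator_loss` are hypothetical.

```python
import math


def bce(prediction: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy of one prediction against a (possibly soft) target."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake, feedback_scores):
    """Discriminator loss with feedback scores as soft labels (assumption).

    d_real          -- discriminator outputs in (0, 1) for real texts
    d_fake          -- discriminator outputs in (0, 1) for generated texts
    feedback_scores -- assumed external scores in [0, 1], one per generated text
    """
    # Real samples keep the usual hard target of 1.
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    # Generated samples are scored against their feedback instead of a hard 0.
    fake_term = sum(
        bce(p, s) for p, s in zip(d_fake, feedback_scores)
    ) / len(d_fake)
    return real_term + fake_term


# A well-received generated sample is penalized less than under hard labels.
soft = discriminator_loss([0.9], [0.8], [0.8])
hard = discriminator_loss([0.9], [0.8], [0.0])
print(soft < hard)
```

When every feedback score is 0, this reduces to the standard GAN discriminator loss, so the sketch can be read as a relaxation of the usual hard labels rather than a different objective.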



An Illumination Invariant Convolution Module for Zero-shot Object Detection in the Night

Jian Wei, Qinzhao Wang, Zixu Zhao, Army Academy of Armored Forces, Beijing 100071, China

ABSTRACT

Intelligent detection is an important part of research on military intelligent technology. When only daytime training data is available, it is difficult to train a model that performs consistently for object detection at night. Eliminating the inconsistent performance caused by different illumination conditions remains a critical unresolved problem. This paper proposes an illumination invariant convolution module (IIC) based on dynamic learning, which yields a cross-domain detection model that can be trained only on daytime scenes and used directly for target detection in low-light scenes. It models the conversion relationship between the visible light image and the target constant-illumination feature map. Experiments show that, compared with models based on data augmentation and style transformation, the proposed method has more stable detection performance and higher average precision (AP) on the self-built dataset. Furthermore, it achieves consistent performance on the small sub-dataset.

KEYWORDS

Cross-Domain; Illumination Invariant; Object Detection; YOLO.


A Model-based Approach Machine Learning to Scalable Portfolio Selection

Ana Paula S. Gularte1, 2 and Vitor V. Curtis1, 2, 1, 2Department of Aerospace Science and Technology, Aeronautics Institute of Technology (ITA), Marechal Eduardo Gomes Square, São José dos Campos, São Paulo, Brazil, 1, 2Department of Science and Technology, Federal University of São Paulo (UNIFESP), Cesare Mansueto Giulio Lattes Avenue, São José dos Campos, São Paulo, Brazil

ABSTRACT

In the related literature, recent developments in machine learning have brought significant opportunities for integrating clustering methods as size reduction or pre-processing steps for portfolio optimization models. The goal of this research is a scalable quantitative proposal, via machine learning, for the selection and allocation of assets listed in two indices, the Ibovespa and the Standard and Poor's 500 (S&P 500), from January 1, 2016, to December 31, 2021. The study consists of two stages. The first combines fundamental and market data for asset pre-selection by applying the Uniform Manifold Approximation and Projection (UMAP) method and a hybrid clustering of K-means, Partition Around Medoids (PAM), and Agglomerative Hierarchical Clustering (AHC) methods. The second stage compares the performance of three allocation models and validates their results on in-sample and out-of-sample data via Monte Carlo simulation. The results indicate that pre-processing with UMAP allowed us to find clusters with higher discriminatory power, evaluated through internal cluster validation metrics, including the Silhouette Coefficient and the Davies-Bouldin Index. This prior asset selection helped reduce the problem's size during optimal portfolio allocation. The Hierarchical Risk Parity (HRP) model stood out as the best-performing model, with the highest Sharpe ratio (1.11) compared to the Inverse-Variance Portfolio (IVP) and Mean-Variance (MV) models. In addition, the portfolios outperformed the S&P 500 and Ibovespa in cumulative returns, with similar annual volatilities of 20%. Despite the impact of the pandemic, which affected the portfolios with drawdowns close to 30%, they recovered in 111 to 149 trading days.

KEYWORDS

Portfolio Selection, Cluster Analysis, Hierarchical Risk Parity, Inverse-Variance Portfolio, Mean-Variance.
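
Two of the quantities mentioned in the abstract above, the Inverse-Variance Portfolio (IVP) weights and the Sharpe ratio, can be illustrated with a minimal sketch. The toy returns, the function names, the zero risk-free rate, and the 252-trading-day annualization are assumptions made for illustration; they are not data or settings taken from the paper.

```python
import math
import statistics


def inverse_variance_weights(returns_by_asset):
    """IVP: weight each asset by 1/variance, normalized to sum to 1.

    Covariances between assets are ignored, which is what distinguishes
    IVP from mean-variance optimization.
    """
    inverse_variance = {
        name: 1.0 / statistics.variance(series)
        for name, series in returns_by_asset.items()
    }
    total = sum(inverse_variance.values())
    return {name: value / total for name, value in inverse_variance.items()}


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Mean excess return over its standard deviation, scaled to a year."""
    excess = [r - risk_free_daily for r in daily_returns]
    ratio = statistics.mean(excess) / statistics.stdev(excess)
    return ratio * math.sqrt(periods_per_year)


# Hypothetical daily returns: asset B is twice as volatile as asset A.
toy_returns = {
    "A": [0.01, -0.01, 0.01, -0.01],
    "B": [0.02, -0.02, 0.02, -0.02],
}
weights = inverse_variance_weights(toy_returns)
print(weights)  # the less volatile asset A receives the larger weight
```

Because B's variance is four times A's in this toy data, A ends up with four times B's weight, which shows how IVP tilts allocation toward low-volatility assets.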


ALICE – Applying BERT to Italian Emails

Pasquale Restaino and Liliana Saracino, Sogei S.p.A., Rome, Italy

ABSTRACT

ALICE is an Artificial Intelligence solution that automatically classifies email-type documents based on their information content, analyzed in near real time. The emails are written in Italian. There are currently 591 classification classes, and this number may grow as the service is used. This research study explores the implementation of the BERT model in the ALICE email classification system for multiclass classification in the Italian language. The main objective of the ALICE solution is to automate classification processes that are performed manually by dedicated operators. To improve the performance of the BERT model and to allow further classification classes to be added, a transfer learning process has been envisaged. The ALICE service, which uses the BERT model trained on the provided dataset, presently has an accuracy of 76%.

KEYWORDS

BERT model, multiclass text classification, Italian emails.


Clustering an African Hairstyle Dataset Using PCA and K-means

Teffo Phomolo Nicrocia1, Owolawi Pius Adewale2, Pholo Moanda Diana3, 1Department of Computer Engineering, Tshwane University of Technology, Pretoria (Soshanguve), 2, 3Department of Computer Systems Engineering, Tshwane University of Technology, Pretoria (Soshanguve)

ABSTRACT

Digital transformation has not yet extended to building an African face shape classifier. African women rely on beauty-standard recommendations, personal preference, or the newest trends in hairstyles to decide on an appropriate hairstyle. In this paper, an approach is presented that uses K-means clustering to classify African women's images. To identify potential facial clusters, a Haar cascade is used for feature-based training, and K-means clustering is applied for image classification.

KEYWORDS

Face detection, k-means, African, hairstyle.
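
The K-means step named in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard Lloyd iteration on toy 2-D points, not the authors' pipeline: it assumes feature vectors have already been extracted (for example after face detection and a PCA projection, neither of which is shown), and the function names and sample data are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical 2-D feature vectors forming two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(features, k=2)
print(labels)
```

The numeric cluster ids depend on the random initialization, so only the grouping is meaningful, not which group is called 0 or 1.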