Abstract
A persistent, targeted cyber attack is called an advanced persistent threat (APT) attack. The attack is mainly launched to gain sensitive information, take over the system, and for financial gain, which creates nowadays more hurdles and challenges for the organization in preventing, detecting, and recovering from such attacks. Due to the nature of APT attacks, it is difficult to detect them quickly. Therefore machine learning techniques come into these research areas. This study uses deep and machine learning models such as random forest, decision tree, convolutional neural network, multilayer perceptron and so forth to categorize and effectively detect APT attacks by utilizing publicly accessible datasets. The datasets used in this study are CSE-CIC-IDS2018, CIC-IDS2017, NSL-KDD, and UNSW-NB15. This study proposes the hybrid ensemble machine learning model, a mixed approach of random forest and XGBoost classifiers. It has obtained the maximum prediction accuracy of 98.92%, 99.91%, 99.24%, and 97.11% for datasets CSE-CIC-IDS2018, CIC-IDS2017, NSL-KDD, and UNSW-NB15, with a false positive rate of 0.52%, 0.12%, 0.62%, and 5.29% respectively. These results are compared to other closely related recent studies in the literature. Our experiment's findings show that our model has performed significantly better for all datasets.