This is the test dataset, comprising the remaining 20% of the original data. It was used to evaluate the trained model’s performance in terms of accuracy, precision, recall, and F1-score. Comparing these metrics between the training and test sets helps assess how well the model generalizes: scores that are markedly higher on the training set than on the test set indicate overfitting.
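
The train-versus-test comparison described above can be sketched as follows. This is a minimal illustration, not the evaluation code used for this dataset: it assumes a binary classification task with labels 0 and 1, the function names `classification_metrics` and `generalization_gap` are hypothetical, and the label lists are made-up toy values rather than results from the actual model.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: binary labels, with `positive` marking the positive class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def generalization_gap(train_metrics, test_metrics):
    """Return train minus test for each metric; large positive gaps suggest overfitting."""
    return {name: train_metrics[name] - test_metrics[name] for name in train_metrics}


# Toy labels for illustration only (not taken from the real dataset).
train_true = [1, 0, 1, 1, 0, 0, 1, 0]
train_pred = [1, 0, 1, 1, 0, 0, 1, 0]
test_true = [1, 0, 1, 0]
test_pred = [1, 1, 0, 0]

train_scores = classification_metrics(train_true, train_pred)
test_scores = classification_metrics(test_true, test_pred)
gaps = generalization_gap(train_scores, test_scores)

for name in ("accuracy", "precision", "recall", "f1"):
    print(f"{name}: train={train_scores[name]:.2f} "
          f"test={test_scores[name]:.2f} gap={gaps[name]:.2f}")
```

In this toy case the model is perfect on the training labels but scores 0.50 on every test metric, which is the kind of gap that would point to overfitting. For a multi-class task, the per-class metrics would need an averaging scheme (for example macro or weighted), which this sketch does not cover.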