Abstract
This research presents an AI-powered deepfake detection system that leverages ensemble learning to identify manipulated video content with high accuracy. The increasing realism of GAN-generated deepfakes has created serious risks in areas such as cybersecurity, political misinformation, financial fraud, and digital media authentication. Traditional detection methods often rely on single-model analysis, making them ineffective as deepfake synthesis improves. The proposed system integrates Convolutional Neural Networks for texture analysis, Vision Transformers for global spatial structural evaluation, and audio-lip synchronization models for temporal consistency measurement. Each model independently extracts unique feature representations, and their outputs are combined through a Random Forest–based ensemble classifier to produce a more reliable final decision. Experiments conducted using benchmark datasets such as Face Forensics++, Celeb- DF, and Deep Fake TIMIT demonstrate that the ensemble approach significantly enhances performance, achieving improved precision, reduced false positives, and stronger generalization to unseen manipulations. The results indicate that multi-model fusion is a robust and practical solution for modern deepfake detection challenges and provides a foundation for future advancements in intelligent multimedia forensics.Keywords
- Deepfake Detection
- Ensemble Learning
- Artificial Intelligence
- Machine Learning
- CNN
- Forgery Analy