prithivMLmods committed on
Commit bb36ed6 · verified · 1 Parent(s): 51e8a68

Update README.md

Files changed (1):
  1. README.md +105 -38

README.md CHANGED
@@ -6,41 +6,108 @@ tags:
  - deep-fake
  - detection
  ---
-
- ![pipeline](dfd.jpg)
-
- # **Image-Deep-Fake-Detector**
-
- The **precision score** is a key metric for evaluating the performance of a deep fake detector. Precision is defined as:
-
- \[
- \text{Precision} = \frac{\text{True Positives}}{\text{True Positives + False Positives}}
- \]
-
- It indicates how well the model avoids false positives: in the context of a deep fake detector, it measures how often content labeled "Fake" is genuinely fake, without real content being misclassified as fake.
-
- # Demo Inference:
-
- ![Screenshot 2025-01-27 at 19-54-50 prithivMLmods_Deep-Fake-Detector-Model · Hugging Face.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/oWLVwLcXAP19uvCd8iqZQ.png)
-
- ### Key Observations:
- 1. **High precision (0.9933 for Real, 0.9937 for Fake):**
- The model rarely misclassifies real content as fake and vice versa. This is critical for applications like deep fake detection, where false accusations (false positives) can have significant consequences.
-
- 2. **Macro and Weighted Averages (0.9935):**
- Precision is uniformly high across both classes, showing that the model is well balanced in detecting both real and fake content.
-
- 3. **Reliability of Predictions:**
- With precision near 1.0, a prediction of "Fake" (or "Real") is highly likely to be correct. This is essential for reducing unnecessary manual verification in real-world applications such as social media content moderation or fraud detection.
-
- ### ONNX Exchange
-
- The ONNX model is converted with the following Space, which writes the ONNX files directly to the repository using a HF中国镜像站 write token.
-
- 🧪 : https://huggingface.co/spaces/prithivMLmods/convert-to-onnx-dir
-
- ![Screenshot 2025-01-27 at 19-03-01 ONNX - a HF中国镜像站 Space by prithivMLmods.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/5T979tVYJ4jCKzlE6nOma.png)
-
- ### Conclusion:
- The deep fake detector model demonstrates **excellent precision** for both the "Real" and "Fake" classes, indicating a highly reliable detection system with minimal false positives. Combined with similarly high recall and F1 scores, the overall accuracy (99.35%) shows that this is a robust and trustworthy model for identifying deep fakes.
+ # **Deep-Fake-Detector-Model**
+
+ # **Overview**
+ The **Deep-Fake-Detector-Model** is a deep learning model for detecting deepfake images. It leverages the **Vision Transformer (ViT)** architecture, specifically `google/vit-base-patch16-224-in21k`, fine-tuned on a dataset of real and deepfake images. The model classifies an image as either "Real" or "Fake", making it a practical tool for flagging manipulated media.
+
+ ### **Key Features**
+ - **Architecture**: Vision Transformer (ViT) - `google/vit-base-patch16-224-in21k`.
+ - **Input**: RGB images resized to 224x224 pixels.
+ - **Output**: Binary classification ("Real" or "Fake").
+ - **Training Dataset**: A curated dataset of real and deepfake images (`Hemg/deepfake-and-real-images`).
+ - **Fine-Tuning**: The model is fine-tuned using HF中国镜像站's `Trainer` API with data augmentation.
+ - **Performance**: High accuracy and F1 score on validation and test data (see **Performance Metrics** below).
+
+ # **Model Architecture**
+ The model is based on the **Vision Transformer (ViT)**, which treats an image as a sequence of patches and applies a transformer encoder to learn spatial relationships among them. Key components include:
+ - **Patch Embedding**: Divides the 224x224 input image into fixed-size 16x16 patches and projects each into an embedding vector.
+ - **Transformer Encoder**: Processes the patch embeddings using multi-head self-attention mechanisms.
+ - **Classification Head**: A fully connected layer for binary classification.
+
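+ As a quick sanity check, the patch geometry follows directly from the model config: a 224x224 image split into 16x16 patches yields (224/16)² = 196 patches. A minimal sketch (assuming the `transformers` library and the `prithivMLmods/Deep-Fake-Detector-Model` repo id):
+
+ ```python
+ from transformers import ViTConfig
+
+ # Load only the architecture config (no weights are downloaded)
+ config = ViTConfig.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
+
+ num_patches = (config.image_size // config.patch_size) ** 2
+ print(num_patches)        # 196
+ print(config.id2label)    # the "Real"/"Fake" label mapping
+ ```
+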
+ # **Training Details**
+ - **Optimizer**: AdamW with a learning rate of `1e-6`.
+ - **Batch Size**: 32 for training, 8 for evaluation.
+ - **Epochs**: 2.
+ - **Data Augmentation**:
+   - Random rotation (±90 degrees).
+   - Random sharpness adjustment.
+   - Random resizing and cropping.
+ - **Loss Function**: Cross-Entropy Loss.
+ - **Evaluation Metrics**: Accuracy, F1 Score, and Confusion Matrix.
+
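+ The following sketch shows how this setup can be reproduced. The hyperparameters mirror the list above; the exact transform composition, the dataset column names (`image`, `label`), and the label order are assumptions rather than the author's original script:
+
+ ```python
+ import torch
+ from datasets import load_dataset
+ from torchvision.transforms import (Compose, Normalize, RandomAdjustSharpness,
+                                     RandomResizedCrop, RandomRotation, ToTensor)
+ from transformers import (Trainer, TrainingArguments, ViTForImageClassification,
+                           ViTImageProcessor)
+
+ dataset = load_dataset("Hemg/deepfake-and-real-images")
+ processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
+
+ train_tf = Compose([
+     RandomRotation(degrees=90),         # random rotation (±90 degrees)
+     RandomAdjustSharpness(2, p=0.5),    # random sharpness adjustment
+     RandomResizedCrop(224),             # random resizing and cropping
+     ToTensor(),
+     Normalize(mean=processor.image_mean, std=processor.image_std),
+ ])
+
+ def preprocess(batch):
+     # Column names are assumed; adjust to the actual dataset schema.
+     batch["pixel_values"] = [train_tf(img.convert("RGB")) for img in batch["image"]]
+     return batch
+
+ model = ViTForImageClassification.from_pretrained(
+     "google/vit-base-patch16-224-in21k",
+     num_labels=2,
+     id2label={0: "Real", 1: "Fake"},    # assumed label order
+     label2id={"Real": 0, "Fake": 1},
+ )
+
+ args = TrainingArguments(
+     output_dir="deep-fake-detector",
+     learning_rate=1e-6,                 # Trainer uses AdamW by default
+     per_device_train_batch_size=32,
+     per_device_eval_batch_size=8,
+     num_train_epochs=2,
+     remove_unused_columns=False,        # keep the raw "image" column for the transform
+ )
+
+ def collate(examples):
+     return {
+         "pixel_values": torch.stack([e["pixel_values"] for e in examples]),
+         "labels": torch.tensor([e["label"] for e in examples]),
+     }
+
+ trainer = Trainer(
+     model=model,
+     args=args,
+     train_dataset=dataset["train"].with_transform(preprocess),
+     data_collator=collate,
+ )
+ trainer.train()
+ ```
+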
+ # **Inference with HF中国镜像站 Pipeline**
+ ```python
+ from transformers import pipeline
+
+ # Load the model (device=0 selects the first GPU; omit it to run on CPU)
+ pipe = pipeline("image-classification", model="prithivMLmods/Deep-Fake-Detector-Model", device=0)
+
+ # Predict on an image; returns a list of {label, score} dicts
+ result = pipe("path_to_image.jpg")
+ print(result)
+ ```
+
+ # **Inference with PyTorch**
+ ```python
+ import torch
+ from PIL import Image
+ from transformers import ViTForImageClassification, ViTImageProcessor
+
+ # Load the model and processor
+ model = ViTForImageClassification.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
+ processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
+
+ # Load and preprocess the image
+ image = Image.open("path_to_image.jpg").convert("RGB")
+ inputs = processor(images=image, return_tensors="pt")
+
+ # Perform inference
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits
+ predicted_class = torch.argmax(logits, dim=1).item()
+
+ # Map class index to label
+ label = model.config.id2label[predicted_class]
+ print(f"Predicted Label: {label}")
+ ```
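+
+ To report a confidence score alongside the label, the logits can be converted into probabilities. A small, optional extension of the snippet above:
+
+ ```python
+ # Softmax over the class dimension turns logits into probabilities
+ probs = torch.softmax(logits, dim=1)[0]
+ for idx, p in enumerate(probs):
+     print(f"{model.config.id2label[idx]}: {p.item():.4f}")
+ ```
+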
+ # **Performance Metrics**
+ - **Accuracy**: ~95% on the test set.
+ - **F1 Score**: ~94% (macro-average).
+ - **Confusion Matrix** (rows are actual classes, columns are predicted classes):
+ ```
+ [[True Positives,  False Negatives],
+  [False Positives, True Negatives]]
+ ```
+
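+ A sketch of how such metrics can be computed from a set of predictions, using `scikit-learn` (the `y_true`/`y_pred` arrays here are hypothetical placeholders for the evaluation loop's outputs):
+
+ ```python
+ from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
+
+ # Hypothetical ground-truth and predicted labels (0 = Real, 1 = Fake, assumed order)
+ y_true = [0, 1, 1, 0, 1, 0]
+ y_pred = [0, 1, 0, 0, 1, 0]
+
+ print(accuracy_score(y_true, y_pred))
+ print(f1_score(y_true, y_pred, average="macro"))
+ # Note: scikit-learn orders rows/columns by sorted label value, i.e. [0, 1]
+ print(confusion_matrix(y_true, y_pred))
+ ```
+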
+ # **Dataset**
+ The model is fine-tuned on the `Hemg/deepfake-and-real-images` dataset, which contains:
+ - **Real Images**: Authentic images of human faces.
+ - **Fake Images**: Deepfake images generated using advanced AI techniques.
+
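+ A quick way to inspect the dataset before training (the split names and features shown in the comments are assumptions and may differ):
+
+ ```python
+ from datasets import load_dataset
+
+ ds = load_dataset("Hemg/deepfake-and-real-images")
+ print(ds)                    # available splits and their sizes
+ print(ds["train"].features)  # expected: an image column plus a binary label
+ ```
+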
+ # **Limitations**
+ - The model is trained on a specific dataset and may not generalize well to other deepfake datasets or domains.
+ - Performance may degrade on low-resolution or heavily compressed images.
+ - The model is designed for image classification and does not detect deepfake videos directly.
+
+ # **Ethical Considerations**
+
+ - **Misuse**: This model should not be used for malicious purposes, such as creating or spreading deepfakes.
+ - **Bias**: The model may inherit biases from the training dataset. Care should be taken to ensure fairness and inclusivity.
+ - **Transparency**: Users should be informed when deepfake detection tools are used to analyze their content.
+
+ # **Future Work**
+ - Extend the model to detect deepfake videos.
+ - Improve generalization by training on larger and more diverse datasets.
+ - Incorporate explainability techniques to provide insights into model predictions.
+
+ # **Citation**
+
+ ```bibtex
+ @misc{Deep-Fake-Detector-Model,
+   author = {prithivMLmods},
+   title  = {Deep-Fake-Detector-Model},
+   year   = {2024},
+   note   = {Last updated: 31 Jan 2025}
+ }
+ ```