Handwritten vs Printed Detection

Printed vs Handwritten Detection – Classify Text Type in Documents

:busts_in_silhouette: Team Members

Suraj Gupta
Prathmesh Gaikwas
Yash Machhi

:star: Use Case Importance

Detecting whether text is printed or handwritten helps automate document processing in real-world applications like exam paper evaluation, form verification, and digitization of records. This system reduces manual effort and improves accuracy when handling mixed-type documents.

:camera_flash: Data Collection and Annotation

Data Collection:
Images were collected from multiple sources including handwritten notes, printed documents, and scanned papers. Data included variations in handwriting styles, fonts, lighting conditions, and backgrounds to ensure robustness.

Annotated Classes:
Two classes:

Printed
Handwritten

Annotation Tool:
Roboflow

Total Images:
The dataset was divided into training, validation, and test splits so that the model could be tuned and evaluated on images it had not seen during training.
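
The split ratios are not stated here. As a minimal sketch of how a reproducible three-way split could be produced, the following assumes a 70/20/10 ratio and uses made-up file names; neither comes from the project itself.

```python
import random

def split_dataset(paths, train_pct=70, val_pct=20, seed=42):
    """Shuffle paths reproducibly and cut them into train/val/test lists."""
    items = sorted(paths)               # sort first so input order does not matter
    random.Random(seed).shuffle(items)  # fixed seed gives the same split every run
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],  # whatever remains goes to test
    }

# Hypothetical file names, for illustration only.
files = [f"printed_{i:03d}.jpg" for i in range(50)]
files += [f"handwritten_{i:03d}.jpg" for i in range(50)]

splits = split_dataset(files)
print({name: len(part) for name, part in splits.items()})
# {'train': 70, 'val': 20, 'test': 10}
```

In practice the shuffle would usually be done per class so that each split keeps a similar printed/handwritten balance.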

:brain: Model Training and Validation

Model & Version:
YOLOv8 classification model, trained on Google Colab

Training Details:

Epochs: 100
Batch Size: 16
Image Size: 224×224
Optimizer: Default (chosen automatically by the training framework; SGD or an Adam variant depending on setup)
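
The settings listed above map onto the Ultralytics Python API roughly as follows. This is a sketch rather than the team's actual notebook: the `yolov8n-cls.pt` checkpoint (the model size is not stated), the `dataset` folder name, and the helper `train_classifier` are assumptions.

```python
# Hyperparameters taken from the training details listed above.
TRAIN_ARGS = {
    "epochs": 100,
    "batch": 16,
    "imgsz": 224,
    "optimizer": "auto",  # assumption: leave the choice to the library, as "Default" suggests
}

def train_classifier(data_dir, weights="yolov8n-cls.pt"):
    """Fine-tune a YOLOv8 classification checkpoint on a folder-per-class dataset."""
    # Imported lazily so the configuration can be inspected without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed checkpoint; any "-cls" variant could be used
    return model.train(data=data_dir, **TRAIN_ARGS)

# Example call (hypothetical path with train/ and val/ subfolders, one folder per class):
# train_classifier("dataset")
```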

Augmentations:

Rotation
Flipping
Brightness & Contrast Adjustment
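
To make the three augmentations concrete, here is a small pure-Python illustration of what each one does to pixel values. It is not the pipeline used in the project (the tool that applied the augmentations is not stated). It treats an image as a grid of grayscale values and, for simplicity, limits rotation to a 90-degree turn.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness_contrast(img, brightness=0, contrast=1.0):
    """Scale pixels around mid-gray (128), shift by brightness, clamp to 0..255."""
    def clamp(value):
        return max(0, min(255, int(round(value))))
    return [
        [clamp((p - 128) * contrast + 128 + brightness) for p in row]
        for row in img
    ]

# A tiny 2x2 "image", for illustration only.
sample = [[10, 200],
          [60, 120]]

print(flip_horizontal(sample))
print(rotate_90(sample))
print(adjust_brightness_contrast(sample, brightness=20, contrast=1.5))
```

Each call returns a new grid, so several variants of one source image can be generated without modifying the original.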

Monitored Metrics:

Accuracy
Loss
Precision
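
For reference, the three monitored quantities can be computed from labels and predicted probabilities as sketched below. The example labels and probabilities are invented for illustration and are not results from this project.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive):
    """Of all items predicted as `positive`, the fraction that really are."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0

def cross_entropy(y_true, probs):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[t]) for t, p in zip(y_true, probs)) / len(y_true)

# Invented example data.
y_true = ["printed", "printed", "handwritten", "handwritten"]
y_pred = ["printed", "handwritten", "handwritten", "handwritten"]
probs = [
    {"printed": 0.9, "handwritten": 0.1},
    {"printed": 0.4, "handwritten": 0.6},
    {"printed": 0.2, "handwritten": 0.8},
    {"printed": 0.3, "handwritten": 0.7},
]

print(accuracy(y_true, y_pred))                  # 0.75
print(precision(y_true, y_pred, "handwritten"))  # 2 of 3 "handwritten" predictions are correct
print(cross_entropy(y_true, probs))
```

Precision is reported per class here, since a two-class model can be precise on printed text while over-predicting handwritten text, or the reverse.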

Performance Improvement:
The initial dataset was small, which limited accuracy. Applying augmentation and adding more varied samples improved the model's performance.

:iphone: Model Deployment and Demo Video

Performance:
The model classifies images quickly enough for real-time use and reaches high accuracy on the test split.

Deployment:
The model was deployed using the YOLOvX application for real-time testing.
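
Outside the app, the trained weights could also be run from Python. The sketch below assumes the Ultralytics package and a weights file named `best.pt`; the file name, the image path, and both helper functions are hypothetical.

```python
def top_label(probabilities, class_names):
    """Return (label, confidence) for the highest-scoring class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[index], probabilities[index]

def classify_image(image_path, weights="best.pt"):
    """Run a trained classification model on one image and return its top class."""
    # Imported lazily so top_label can be used without the package installed.
    from ultralytics import YOLO

    result = YOLO(weights)(image_path)[0]  # one Results object per input image
    return top_label(result.probs.data.tolist(), result.names)

# Example call (hypothetical files):
# label, confidence = classify_image("scan_001.jpg")

# top_label on its own, with made-up scores:
print(top_label([0.12, 0.88], {0: "handwritten", 1: "printed"}))
# ('printed', 0.88)
```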

Demo Video:

:white_check_mark: Conclusion

The YOLOv8 classification model successfully distinguishes between printed and handwritten text with strong accuracy. This project highlights the importance of dataset quality, augmentation, and lightweight models for real-time deployment. It can be further extended for document digitization and intelligent OCR systems.