Optical Character Recognition (OCR) - definition
Optical Character Recognition (OCR) is an image processing technology for the machine recognition of characters in images and video frames. From a standards perspective, OCR belongs to the broader field of pattern recognition in computer science; ISO/IEC 2382:2015, among other sources, describes it as information processing that identifies symbols through optical analysis. In practice, OCR typically consists of three stages: detecting text regions, normalizing the image crops, and recognizing (decoding) character sequences into digital text.
In the context of photo and video anonymization, OCR supports the automated detection of textual elements that may constitute personal data or enable identification, most notably license plate numbers. OCR is not used for faces - handling them relies on face detection and (optionally) face recognition methods. However, OCR can verify or reinforce anonymization rules that concern text present in images.
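The three-stage structure described in the definition (detect, normalize, recognize) can be sketched as a small pipeline skeleton. This is only an illustration under stated assumptions: `run_ocr`, `crop`, `TextResult`, and the dummy stages are hypothetical names, the image is a plain nested list rather than a real image type, and a production system would plug a trained detector and recognizer into the same slots.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Image = List[List[int]]          # stand-in for a grayscale image: rows of pixel values


@dataclass
class TextResult:
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Simplest possible 'normalization': cut the box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
    normalize: Callable[[Image, Box], Image] = crop,
) -> List[TextResult]:
    """Stage 1: detect text regions. Stage 2: normalize crops. Stage 3: decode text."""
    results = []
    for box in detect(image):
        patch = normalize(image, box)
        results.append(TextResult(box=box, text=recognize(patch)))
    return results


# Dummy stages (hypothetical) so the skeleton runs without any model.
def fake_detect(image: Image) -> List[Box]:
    return [(2, 1, 6, 3)]


def fake_recognize(patch: Image) -> str:
    # Returns the patch size instead of real text, to show the data flow.
    return f"{len(patch[0])}x{len(patch)}"


image = [[0] * 8 for _ in range(4)]
print(run_ocr(image, fake_detect, fake_recognize))
```

Keeping the stages as separate callables mirrors the way detection and recognition models are usually chosen and replaced independently.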
The role of OCR in image and video anonymization
OCR plays a supporting role in the anonymization pipeline by increasing confidence that areas requiring blurring have been correctly identified. This is particularly relevant for license plates, text on workwear, or markings that could potentially identify an individual. While OCR is not required for face blurring, it can serve as a validation layer for license plate anonymization.
- License plate detection support - recognized character patterns can confirm that a detected region corresponds to a license plate (ANPR/LPR).
- Rule validation - matching against country-specific license plate formats reduces false positives during blurring.
- Assisted manual editing - highlighting text regions helps users quickly blur elements not detected automatically.
- Mismatches as risk signals - failure to read characters in an obvious license plate area can trigger additional review.
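One way to read the list above is as a small triage policy: OCR never decides whether a confidently detected plate is blurred, it only decides whether a human should take a second look. The sketch below assumes exactly that policy; `triage_plate_region`, `PlateDecision`, and both thresholds are hypothetical and would need tuning on real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlateDecision:
    blur: bool
    needs_review: bool
    reason: str


def triage_plate_region(
    detector_score: float,
    ocr_text: Optional[str],
    matches_format: bool,
    score_threshold: float = 0.5,  # assumed cut-off, not a recommended value
    min_chars: int = 4,            # assumed minimum length of a readable string
) -> PlateDecision:
    """Combine a plate detector score with OCR evidence (illustrative policy)."""
    if detector_score < score_threshold:
        return PlateDecision(False, False, "below detection threshold")

    readable = ocr_text is not None and len(ocr_text.strip()) >= min_chars
    if not readable:
        # Mismatch as a risk signal: looks like a plate, but nothing could be read.
        return PlateDecision(True, True, "plate-like region but text unreadable")
    if matches_format:
        return PlateDecision(True, False, "text matches a known plate format")
    # Readable text in a non-plate pattern may be a false positive.
    return PlateDecision(True, True, "readable text does not match any plate format")


print(triage_plate_region(0.9, "AB12CDE", True))
print(triage_plate_region(0.9, None, False))
```

The function returns only flags and a reason, not the recognized string, so the caller has no need to keep the OCR text after the decision is made.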
Legal considerations remain critical. The EDPB states that an image of a person and their identifiable attributes fall under GDPR when identification of a natural person is possible (Guidelines 3/2019, Version 2.0, 29.01.2020). National authorities, such as the UK ICO, indicate that a vehicle registration number may constitute personal data depending on context and the ability to link it to an individual (ICO, What is personal data). In Poland, the UODO emphasizes the need for data minimization and proportionality in video monitoring, including elements enabling vehicle identification. At the same time, case law shows differing views on whether license plates qualify as personal data, making contextual and purpose-based assessment essential.
OCR technologies for privacy protection
Modern OCR for natural images (scene text recognition) is based on deep learning. The pipeline typically separates text detection from text recognition. Technology choices directly affect the quality, speed, and stability of photo and video anonymization processes.
- Text detection - popular single-stage and two-stage models include EAST (CVPR 2017), CRAFT (CVPR 2019), and DBNet, enabling detection of text with varying orientations and distortions (Zhou et al., 2017; Baek et al., 2019).
- Character sequence recognition - CRNN approaches with CTC, as well as attention-based and transformer models such as TrOCR, convert normalized crops into character strings (Shi et al., 2017; Li et al., TrOCR 2021).
- Video processing - inter-frame stabilization, denoising, and exposure normalization improve recognition consistency under motion and low-light conditions, supported by classical image processing libraries (e.g., OpenCV).
- Domain validation - regular expressions and allowlists for license plate formats strengthen anonymization decisions.
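Domain validation with regular expressions can be as simple as the sketch below. The patterns are illustrative assumptions, not complete jurisdiction rules: `GB_current_simplified` only approximates the current two-letter, two-digit, three-letter British layout and ignores older and personalised plates, and `DEMO_generic` is invented for the example. The function names are hypothetical as well.

```python
import re
from typing import Dict, Optional

# Illustrative patterns only; real deployments need jurisdiction-specific rules.
PLATE_PATTERNS: Dict[str, re.Pattern] = {
    "GB_current_simplified": re.compile(r"[A-Z]{2}[0-9]{2}[A-Z]{3}"),
    "DEMO_generic": re.compile(r"[A-Z]{1,3}[0-9]{3,5}"),  # invented format
}


def normalize_plate_text(raw: str) -> str:
    """Uppercase the OCR output and drop spaces, hyphens, and other separators."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def match_plate_format(
    raw: str, patterns: Dict[str, re.Pattern] = PLATE_PATTERNS
) -> Optional[str]:
    """Return the name of the first pattern the whole string matches, else None."""
    text = normalize_plate_text(raw)
    for name, pattern in patterns.items():
        if pattern.fullmatch(text):
            return name
    return None


print(match_plate_format("ab12 cde"))  # GB_current_simplified
print(match_plate_format("WX-12345"))  # DEMO_generic
print(match_plate_format("EXIT"))      # None
```

Using `fullmatch` rather than `search` matters here: a partial match inside a longer string (a phone number on a van, for instance) should not count as a plate.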
Key OCR parameters and metrics
The effectiveness of OCR in anonymization should be evaluated using metrics that capture the risk of under-blurring and over-blurring. Below are key measures commonly used in ICDAR benchmarks and related competitions.
| Metric | Definition | Use in anonymization |
|---|---|---|
| CER - Character Error Rate | CER = Levenshtein(pred, ref) / length(ref) | Measures character recognition accuracy on license plates. |
| WER - Word Error Rate | WER = (S + D + I) / N, where S = substitutions, D = deletions, I = insertions, N = number of words | Useful for short text; lower values reduce the risk of incorrect decisions. |
| Precision / Recall (text detection) | Precision = TP/(TP+FP), Recall = TP/(TP+FN) | Recall is critical when failing to blur poses greater risk than over-blurring. |
| F1-score | F1 = 2·(Precision·Recall)/(Precision+Recall) | Balances false positives and missed detections when selecting thresholds. |
| IoU for bounding boxes | IoU = area(intersection) / area(union) | Verifies coverage of the blurred area against the text region. |
| Processing latency | Average end-to-end time per frame or image | Supports batch throughput planning without real-time requirements. |
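The formulas in the table translate directly into a few lines of code. The following is a minimal reference sketch using only the standard library; the function names are chosen for this example, boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, and benchmark toolkits may apply extra normalization (case folding, punctuation handling) that is not shown here.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance: minimum number of substitutions, deletions, and insertions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cer(pred: str, ref: str) -> float:
    """Character Error Rate: edit distance over reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return levenshtein(pred, ref) / len(ref)


def wer(pred: str, ref: str) -> float:
    """Word Error Rate: the same distance computed on whitespace-split words."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(pred.split(), ref_words) / len(ref_words)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Detection metrics from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


print(round(cer("AB12CDF", "AB12CDE"), 3))                 # 0.143 (1 error in 7 chars)
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 3))       # 0.333
```

For a seven-character plate, a single misread character already gives a CER of about 0.14, which is why short strings call for strict thresholds.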
Research results and competition outcomes are published in conference proceedings. For example, the ICDAR Robust Reading Competitions (2015-2019) provide survey reports defining and applying these metrics to evaluate scene text detection and recognition systems.
Challenges and limitations of OCR in anonymization
Real-world environments introduce numerous distortions. Mitigating them requires selecting models and processing policies aligned with anonymization goals and the principle of data minimization.
- Imaging conditions - motion blur, low contrast, reflections, and font variations reduce detection recall and increase CER.
- Angles and occlusions - perspective distortions and partial coverage require detectors robust to rotation and irregular shapes.
- Diverse license plate formats - national and regional formats vary in character sets and layouts, necessitating jurisdiction-specific validation rules.
- Risk of excessive processing - under GDPR Article 5(1)(c) (data minimization) and Article 5(1)(e) (storage limitation), the scope and duration of processing must be limited to what is necessary, and unnecessary storage of OCR outputs should be avoided.
Use cases in the context of Gallio PRO
Gallio PRO uses object detection and blurring to anonymize faces and license plates in photos and videos. The software does not perform real-time anonymization and is deployed on-premises. In this context, OCR serves a supporting function.
- Faces - OCR is not used; anonymization relies on face detection and blurring.
- License plates - OCR can verify whether a blurred area matches a license plate character pattern, reducing false positives.
- Unsupported elements - logos, tattoos, name badges, or screen content are not detected automatically but can be blurred manually using the built-in editor.
- Privacy and logs - the tool does not store logs containing face or license plate detection results. OCR outputs, when used, follow data minimization principles and are not retained as personal data.
Blurring license plates is a common practice in many Western European countries and may be recommended or expected depending on the publication context, in line with data protection authority guidance and market practice. In Poland, interpretations of license plates as personal data vary, but both the UODO and EDPB stress contextual and risk-based assessment. This supports using OCR as a control layer to reduce the risk of exposing identifiable text in images.
Standards and references
The following sources document definitions, metrics, and technical and regulatory best practices related to OCR and image processing in a data protection context.
- ISO/IEC 2382:2015 - Information technology - Vocabulary. Definitions related to pattern recognition and information processing.
- EDPB, Guidelines 3/2019 on processing of personal data through video devices, Version 2.0, 29.01.2020.
- GDPR - Articles 4(1), 5(1)(c), 25, 32 - definitions, data minimization, privacy by design, and processing security.
- ICO, What is personal data - guidance including examples such as vehicle registration numbers.
- Zhou et al., EAST: An Efficient and Accurate Scene Text Detector, CVPR 2017.
- Baek et al., Character Region Awareness for Text Detection (CRAFT), CVPR 2019.
- Shi et al., An End-to-End Trainable Neural Network for Image-based Sequence Recognition, TPAMI 2017.
- Li et al., TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models, 2021.
- ICDAR Robust Reading Competitions - technical reports from 2015-2019 on metrics and datasets for scene text detection and recognition.