Confidence threshold: definition
A confidence threshold, also called a detection confidence threshold, is a parameter used in computer vision models that defines the minimum level of model confidence required for a detected object to be treated as a valid detection. In the context of photo and video anonymization, this mainly refers to detecting faces and license plates before they are automatically blurred or redacted.
A detection model typically returns two types of information for each object: the object’s location in the image, usually represented by a bounding box, and a confidence score, meaning a numerical estimate of how likely it is that the given region actually contains a face or a license plate. The confidence threshold sets the decision boundary. If the model’s score is equal to or higher than the threshold, the object is accepted for further processing. If the score is lower, the detection is rejected.
In anonymization systems, this parameter directly affects the risk of two classes of errors: a false positive, where an object that is not a face or a license plate is blurred, and a false negative, where a real face or license plate is missed and therefore not anonymized. From a privacy protection perspective, false negatives are usually the more critical error because they result in personal data being exposed in visual material.
The role of the detection confidence threshold in photo and video anonymization
In visual content anonymization, the confidence threshold is more than a technical setting. It matters for both compliance with data protection requirements and the operational quality of processing. The lower the threshold, the more candidate regions are accepted as faces or license plates. This usually increases detection sensitivity (recall), but it also raises the number of incorrect detections.
The higher the threshold, the more restrictive the system becomes, accepting only high-confidence detections. This reduces excessive blurring, but it may also cause difficult objects to be missed, such as small faces, partially occluded faces, license plates in poor lighting, or plates viewed at an angle.
In practice, the threshold is set according to the purpose of the process:
- for anonymization aimed at minimizing the risk of personal data disclosure, a lower threshold combined with additional quality control is usually preferred,
- for high-quality visual material, a higher threshold may be used if validation confirms that appropriate recall is maintained,
- different thresholds may be applied to different object classes, such as faces and license plates.
How a confidence threshold works technically
Modern face detection and license plate detection systems typically rely on deep learning models, most often convolutional neural networks or other contemporary detection architectures. Deep learning is the standard approach because, during training, the model learns visual features that allow it to recognize faces and license plates under varied conditions. The trained model is then used in the automatic blurring process.
A confidence score is not always a calibrated probability in the statistical sense. In many architectures, it is a value produced after a sigmoid or softmax function, but its interpretation depends on how the model was trained and validated. For this reason, the threshold should not be set based on intuition alone. It should be derived from tests on a dataset that is representative of real-world input data.
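One way to check how far scores are from calibrated probabilities is to compare the average score with the observed share of correct detections inside score bins. The sketch below computes a binned expected calibration error in plain Python. It is illustrative only: the function name, the bin count, and the sample data are assumptions, and "correct" is assumed to mean that a detection was matched to a ground-truth object during validation.

```python
def expected_calibration_error(scores, correct, n_bins=10):
    """Weighted mean gap between average score and observed accuracy per bin."""
    total = len(scores)
    if total == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for s, c in zip(scores, correct):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        bins[idx].append((s, c))
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_score = sum(s for s, _ in items) / len(items)
        accuracy = sum(1 for _, c in items if c) / len(items)
        ece += (len(items) / total) * abs(avg_score - accuracy)
    return ece


# Hypothetical validation data: four detections scored 0.95, only two correct.
scores = [0.95, 0.95, 0.95, 0.95]
correct = [True, True, False, False]
ece = expected_calibration_error(scores, correct)
```

A large gap suggests that a score of, say, 0.95 should not be read as a 95% chance of a true face or plate, which is one more reason to pick the threshold empirically.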
A typical decision rule looks like this:
detection accepted when score >= threshold
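The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular product: the `Detection` type, the class names, and the threshold values are all hypothetical, and real thresholds would have to come from validation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    label: str                               # e.g. "face" or "license_plate"
    score: float                             # model confidence, assumed in [0, 1]
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels


# Hypothetical per-class thresholds; placeholders, not recommended values.
THRESHOLDS = {"face": 0.30, "license_plate": 0.40}


def accept(detections, thresholds=THRESHOLDS, default=0.5):
    """Keep detections whose score is >= the threshold for their class."""
    return [d for d in detections if d.score >= thresholds.get(d.label, default)]


detections = [
    Detection("face", 0.92, (10, 10, 50, 50)),
    Detection("face", 0.30, (200, 40, 215, 58)),             # exactly at the threshold: kept
    Detection("license_plate", 0.35, (80, 300, 160, 330)),   # below 0.40: rejected
]
kept = accept(detections)
```

Keeping the thresholds in a per-class mapping makes it straightforward to tune faces and license plates independently.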
After this step, additional procedures are often applied, such as Non-Maximum Suppression (NMS), which removes overlapping boxes for the same object. Because this changes the final number of detections, the end result depends not only on the threshold itself but also on the other parameters of the detection pipeline, such as the IoU threshold used for suppression.
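To make the interaction concrete, here is a minimal greedy Non-Maximum Suppression sketch in plain Python. It assumes boxes in `(x1, y1, x2, y2)` format and an IoU threshold of 0.5; both are illustrative choices, and production pipelines normally use the NMS routine that ships with the detection framework.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.90, 0.75, 0.60]
keep = nms(boxes, scores)  # the second box overlaps the first and is suppressed
```

Because suppression keeps the highest-scoring box in each overlapping group, changing either the confidence threshold or the IoU threshold can change which boxes survive.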
Key parameters and metrics related to the confidence threshold
Assessing whether the threshold is set correctly requires measuring detection quality. Object detection is usually evaluated with metrics established in the literature and in benchmarks such as PASCAL VOC and COCO. In anonymization, the most important metrics are those showing how often the system misses objects that need to be concealed.
| Metric | Practical meaning | Effect of changing the threshold |
|---|---|---|
| Precision | The percentage of accepted detections that are correct | A higher threshold usually increases precision |
| Recall | The percentage of actual objects that were detected | A lower threshold usually increases recall |
| F1-score | The harmonic mean of precision and recall | Helps identify a trade-off |
| False Discovery Rate | The rate of incorrect labels among accepted detections | Increases when the threshold is too low |
| False Negative Rate | The rate at which faces or license plates are missed | Increases when the threshold is too high |
| mAP at IoU 0.5 or 0.5:0.95 | Overall detector quality in benchmarks | Used to evaluate the model, but does not replace threshold selection |
In privacy protection tasks, recall is especially important. If recall for faces or license plates is too low, some personal data will remain visible. For that reason, the optimal confidence threshold is not always the one that maximizes precision or mAP.
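The count-based metrics follow directly from the number of true positives, false positives, and false negatives at a given threshold. The helper below is a small sketch; the function name and the example counts are made up for illustration.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, F1, FDR and FNR from detection counts."""
    accepted = tp + fp   # detections that passed the threshold
    actual = tp + fn     # ground-truth objects in the validation set
    precision = tp / accepted if accepted else 0.0
    recall = tp / actual if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fdr": fp / accepted if accepted else 0.0,  # equals 1 - precision
        "fnr": fn / actual if actual else 0.0,      # equals 1 - recall
    }


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed objects.
m = detection_metrics(tp=90, fp=10, fn=30)
```

In this made-up example precision is 0.90 while a quarter of the objects are missed, which illustrates why a high precision figure alone says little about privacy risk.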
How to choose the right threshold in anonymization systems
Threshold selection should be based on validation using data similar to the material actually being processed. Important factors include resolution, camera angle, compression, time of day, weather conditions, and the degree of object occlusion. A threshold chosen on a laboratory dataset may not perform correctly on CCTV footage, mobile camera recordings, or smartphone photos.
In practice, the following is recommended:
- test faces and license plates separately,
- plot precision-recall curves and compare several candidate threshold levels,
- choose the threshold based on an acceptable false negative rate,
- revalidate periodically after changing the model, camera, codec, or use case,
- apply manual review in borderline cases.
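The selection step in the list above can be sketched as a threshold sweep for a single object class. The sketch assumes that matching against ground truth has already been done: `gt_best_scores` holds, for each annotated object, the highest score of any detection matched to it (0.0 if none matched), and `fp_scores` holds the scores of detections that matched nothing. These names, the sample values, and the 20% target are hypothetical.

```python
def sweep(gt_best_scores, fp_scores, thresholds):
    """Return (threshold, precision, recall, fnr) for each candidate threshold."""
    n_gt = len(gt_best_scores)
    rows = []
    for t in thresholds:
        tp = sum(s >= t for s in gt_best_scores)   # objects still detected at t
        fn = n_gt - tp                             # objects missed at t
        fp = sum(s >= t for s in fp_scores)        # false alarms accepted at t
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / n_gt if n_gt else 0.0
        fnr = fn / n_gt if n_gt else 1.0
        rows.append((t, precision, recall, fnr))
    return rows


def pick_threshold(rows, max_fnr):
    """Highest threshold whose false negative rate stays within the target."""
    ok = [r for r in rows if r[3] <= max_fnr]
    return max(ok, key=lambda r: r[0]) if ok else None


gt_best_scores = [0.95, 0.9, 0.8, 0.7, 0.55, 0.45, 0.35, 0.3, 0.2, 0.0]
fp_scores = [0.6, 0.4, 0.25, 0.15]
rows = sweep(gt_best_scores, fp_scores, [0.1, 0.2, 0.3, 0.4, 0.5])
best = pick_threshold(rows, max_fnr=0.2)
```

Choosing the highest threshold that still meets the false negative target keeps over-blurring as low as the privacy constraint allows. If no candidate meets the target, the function returns `None`, which signals that the model or the pipeline needs attention, not just the threshold.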
In Gallio PRO, automatic detection applies to faces and license plates. The system does not automatically detect logos, tattoos, name badges, documents, or content displayed on monitor screens. Such elements can be blurred manually in the built-in editor. This means that even a properly configured detection confidence threshold does not eliminate the need to assess material for other visual identifiers.
Limitations and compliance context
The confidence threshold is not a guarantee of full anonymization. It is only one parameter within the detection system. The outcome is also influenced by model quality, training data, annotation methods, minimum object size, preprocessing, and tracking parameters between frames in video material.
From a GDPR compliance perspective, what matters is a risk-based approach and the selection of technical measures appropriate to the purpose of processing. Regulation (EU) 2016/679 does not specify a particular confidence threshold value, but it does require the implementation of appropriate technical and organizational measures in line with Articles 24, 25, and 32. In practice, this means documenting anonymization effectiveness tests and justifying the parameters that were selected.
For materials containing license plates, the legal and factual context must also be considered. A vehicle registration number is not always personal data in itself, but it may become personal data if, using means reasonably likely to be used, it allows a natural person to be identified. In privacy practice, license plate blurring is therefore often applied as a precautionary measure, especially when the material is to be shared more broadly.
Normative references and sources
The term confidence threshold does not have a single normative definition in legislation, but its technical meaning is consistent with established practice in machine learning and object detection. In implementation projects, it is good practice to rely on primary sources, model documentation, and recognized benchmarks.
- Regulation (EU) 2016/679, GDPR – Articles 24, 25, 32.
- NIST IR 8280, Factsheets for AI and Automated Decision Systems, 2021 – the importance of documenting AI system parameters and limitations.
- PASCAL VOC Challenge – Everingham et al., International Journal of Computer Vision, 2010 – precision, recall, and AP metrics for object detection.
- COCO Detection Evaluation, Microsoft COCO – commonly used AP and IoU definitions for detector evaluation.
- Guo et al., On Calibration of Modern Neural Networks, ICML 2017 – limitations of interpreting the score as a calibrated probability.