Practical

Object Detection

Last reviewed: April 2026

A computer vision task that identifies and locates specific objects within images or video frames, drawing bounding boxes around each detected object.

Object detection is a computer vision task that not only identifies what objects are in an image but also locates where each object is by drawing a bounding box around it. It goes beyond simple classification ("this image contains a car") to provide spatial information ("there is a car at this location, a pedestrian at that location, and a traffic sign over there").

How object detection works

Modern object detection models process an image and output a list of detected objects, each with:

  • A class label (what is it: car, person, dog)
  • A bounding box (where is it: x, y coordinates and dimensions)
  • A confidence score (how certain the model is)
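The three fields above map naturally onto a small record type. The following is a minimal sketch, not the output format of any particular model: the `Detection` class, its field names, the top-left-plus-size box convention, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical structure for illustration)."""
    label: str         # class label, e.g. "car"
    x: float           # left edge of the bounding box, in pixels
    y: float           # top edge of the bounding box, in pixels
    width: float       # box width, in pixels
    height: float      # box height, in pixels
    confidence: float  # model certainty, from 0.0 to 1.0

    def corners(self):
        """Return the box as (x1, y1, x2, y2) corner coordinates."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


def filter_by_confidence(detections, threshold=0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up example output for a single image.
detections = [
    Detection("car", 40, 120, 200, 90, 0.92),
    Detection("person", 300, 80, 50, 140, 0.81),
    Detection("dog", 10, 10, 30, 20, 0.23),
]

kept = filter_by_confidence(detections, threshold=0.5)
for d in kept:
    print(d.label, d.corners(), d.confidence)
```

In practice the threshold is a tuning decision: raising it reduces false alarms but also drops genuine objects that the model was less sure about.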

Major architectures

  • YOLO (You Only Look Once): processes the entire image in a single pass, which makes it very fast. A common choice for real-time applications.
  • SSD (Single Shot MultiBox Detector): another single-pass detector with speed comparable to YOLO; it predicts objects from feature maps at multiple scales
  • Faster R-CNN: a two-stage approach (first propose candidate regions, then classify each one) that is generally more accurate but slower than single-pass detectors
  • DETR (Detection Transformer): a transformer-based approach that treats detection as a set prediction problem

Business applications

  • Manufacturing: detecting defects on production lines at speeds no human inspector can match
  • Retail: shelf monitoring, customer behaviour analysis, checkout-free stores
  • Autonomous vehicles: detecting pedestrians, vehicles, traffic signs, and obstacles in real time
  • Security: detecting weapons, suspicious behaviour, or safety violations in surveillance footage
  • Agriculture: counting fruit on trees, detecting pests, assessing crop health from drone imagery
  • Healthcare: detecting nodules in medical imaging, counting cells in microscopy

Challenges

  • Small objects: detecting tiny objects in large images remains difficult
  • Occluded objects: partially hidden objects are harder to detect
  • Real-time constraints: balancing accuracy with speed for live video applications
  • Domain adaptation: a model trained on standard photos may struggle with thermal imaging, satellite imagery, or medical scans
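The real-time constraint comes down to simple arithmetic: a model that needs a given number of milliseconds per frame can sustain at most 1000 divided by that number frames per second. The sketch below illustrates the check; the function names and the latency figures are invented for illustration and do not describe any real model.

```python
def max_fps(latency_ms):
    """Highest frame rate a model can sustain at a given per-frame latency."""
    return 1000.0 / latency_ms


def meets_real_time(latency_ms, target_fps):
    """True if each frame finishes within the time budget of the target rate."""
    budget_ms = 1000.0 / target_fps
    return latency_ms <= budget_ms


# Hypothetical per-frame latencies, not measurements of any real model.
candidates = {"hypothetical_fast_model": 25.0, "hypothetical_accurate_model": 80.0}

for name, latency in candidates.items():
    print(name, max_fps(latency), meets_real_time(latency, target_fps=30))
```

With these assumed figures, only the faster model fits a 30 frames-per-second video feed; the slower one would need to skip frames or run on faster hardware.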

From detection to tracking

Object detection processes individual frames. Object tracking extends this to video, following detected objects across frames to maintain identity over time.
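One simple way to carry identity across frames is to match each new box to the previous-frame box it overlaps most, measured by intersection over union (IoU). The sketch below is a deliberately naive greedy matcher under that assumption, not a production tracker; the function names, the 0.3 overlap threshold, and the example boxes are all illustrative.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track(frames, min_iou=0.3):
    """Assign an integer ID to every box in every frame.

    frames: list of frames, each a list of (x1, y1, x2, y2) boxes.
    Returns one dict per frame mapping track ID -> box.
    A track that goes unmatched for a single frame is dropped.
    """
    next_id = count(1)
    previous = {}  # track ID -> box in the previous frame
    results = []
    for boxes in frames:
        current = {}
        unmatched = dict(previous)  # previous tracks not yet claimed
        for box in boxes:
            best_id, best_score = None, min_iou
            for track_id, prev_box in unmatched.items():
                score = iou(box, prev_box)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                best_id = next(next_id)  # no good overlap: start a new track
            else:
                del unmatched[best_id]   # each old track is used at most once
            current[best_id] = box
        results.append(current)
        previous = current
    return results


# Made-up detections: two objects drift slightly, then one leaves the scene.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11), (52, 50, 62, 60)],
    [(2, 2, 12, 12)],
]
results = track(frames)
for frame_index, tracks in enumerate(results):
    print(frame_index, tracks)
```

Greedy overlap matching breaks down when objects cross paths, move quickly, or are briefly occluded, which is why practical trackers usually add motion prediction or appearance features on top of the detector's output.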


Why This Matters

Object detection is one of AI's most commercially mature capabilities, with proven deployments in manufacturing, retail, and logistics. Understanding its capabilities and limitations helps you identify which of your visual inspection, monitoring, or counting tasks could be automated, and what accuracy and speed to realistically expect.
