zamba.object_detection.yolox.megadetector_lite_yolox
FillModeEnum
Bases: str, Enum
Enum for frame filtering fill modes
Attributes:
Name | Type | Description |
---|---|---|
repeat | | Randomly resample qualifying frames to get to n_frames |
score_sorted | | Take up to n_frames in sort order (even if some have zero probability) |
weighted_euclidean | | Sample the remaining frames weighted by their euclidean distance in time to the frames over the threshold |
weighted_prob | | Sample the remaining frames weighted by their predicted probability |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
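The four modes can be pictured as a string-valued enum. The sketch below is illustrative only: the class name `FillModeSketch` is hypothetical, and it assumes each member's value equals its name, which this page does not confirm.

```python
from enum import Enum


class FillModeSketch(str, Enum):
    """Hypothetical stand-in for FillModeEnum; values assumed equal to names."""

    repeat = "repeat"
    score_sorted = "score_sorted"
    weighted_euclidean = "weighted_euclidean"
    weighted_prob = "weighted_prob"


# Because the enum also subclasses str, a plain string converts to a member
# and members compare equal to their string values.
mode = FillModeSketch("score_sorted")
```

Subclassing both str and Enum is what lets a configuration accept either the plain string or the enum member for the same setting.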
MegadetectorLiteYoloX
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
__init__(path=LOCAL_MD_LITE_MODEL, kwargs=LOCAL_MD_LITE_MODEL_KWARGS, config=None)
MegadetectorLite based on YOLOX.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | pathlike | Path to trained YoloX model checkpoint (.pth extension) | LOCAL_MD_LITE_MODEL |
config | MegadetectorLiteYoloXConfig | YoloX configuration | None |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
detect_image(img_arr)
Runs object detection on an image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
img_arr | ndarray | An image array with dimensions (height, width, channels). | required |
Returns:
Type | Description |
---|---|
ndarray | An array of bounding box detections with dimensions (object, 4), where object is the number of objects detected and the 4 columns are (x1, y1, x2, y2). |
ndarray | An array of object detection confidence scores of length (object), where object is the number of objects detected. |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
detect_video(video_arr, pbar=False)
Runs object detection on a video.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
video_arr | ndarray | A video array with dimensions (frames, height, width, channels). | required |
pbar | bool | Whether to show a progress bar. Defaults to False. | False |
Returns:
Type | Description |
---|---|
list | A list with one tuple of detections and scores per frame. Each tuple contains two arrays: the first is an array of bounding box detections with dimensions (object, 4), where object is the number of objects detected and the 4 columns are (x1, y1, x2, y2). The second is an array of object detection confidence scores of length (object). |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
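To make the return structure concrete, the sketch below reduces a list of per-frame (boxes, scores) tuples to one score per frame. It is a hedged illustration: the helper name `max_score_per_frame` is hypothetical, plain lists stand in for the arrays, and treating a frame with no detections as score 0.0 is an assumption rather than documented behavior.

```python
from typing import List, Sequence, Tuple

# One entry per frame: (boxes with rows of x1, y1, x2, y2; confidence scores).
Detection = Tuple[Sequence[Sequence[float]], Sequence[float]]


def max_score_per_frame(detections: List[Detection]) -> List[float]:
    """Return the highest confidence in each frame, or 0.0 if nothing was detected."""
    return [
        float(max(scores)) if len(scores) > 0 else 0.0
        for _boxes, scores in detections
    ]


# Stand-in data shaped like the documented return value (three frames).
example_detections = [
    ([[10.0, 20.0, 50.0, 80.0]], [0.91]),                            # one object
    ([], []),                                                        # no objects
    ([[0.0, 0.0, 5.0, 5.0], [30.0, 30.0, 60.0, 70.0]], [0.40, 0.75]),  # two objects
]
frame_scores = max_score_per_frame(example_detections)
```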
filter_frames(frames, detections)
Filter video frames using megadetector lite.
Which frames are returned depends on fill_mode and on how many frames are above the confidence threshold. If more than n_frames are above the threshold, the top n_frames are returned. Otherwise, frames are added to those over the threshold according to fill_mode. If none of these conditions are met (for example, when n_frames is None), all frames above the threshold are returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
frames | ndarray | Array of video frames to filter with dimensions (frames, height, width, channels) | required |
detections | list of tuples | List of detection results for each frame. Each element is a tuple of the list of bounding boxes [array(x1, y1, x2, y2)] and the detection probabilities, both as float | required |
Returns:
Type | Description |
---|---|
ndarray | An array of video frames of length n_frames or shorter |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
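The selection rule described for filter_frames can be sketched in plain Python. This is not zamba's implementation: the function name `select_frame_indices` is hypothetical, it works on one score per frame rather than on frame arrays, and it covers only the "score_sorted" fill mode.

```python
from typing import List, Optional


def select_frame_indices(
    scores: List[float],
    confidence: float,
    n_frames: Optional[int] = None,
    sort_by_time: bool = True,
) -> List[int]:
    """Pick frame indices by detection score (illustrative, not zamba's code)."""
    # Rank every frame from highest to lowest score; ties keep time order.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    above = [i for i in ranked if scores[i] >= confidence]

    if n_frames is None:
        # No cap: return every frame that clears the threshold.
        selected = above
    elif len(above) >= n_frames:
        # Enough qualifying frames: keep the top n_frames.
        selected = above[:n_frames]
    else:
        # Too few: fill in score order, as the "score_sorted" mode describes.
        selected = ranked[:n_frames]

    # Restore original (time) order if requested, else keep score order.
    return sorted(selected) if sort_by_time else selected


scores = [0.1, 0.9, 0.0, 0.6, 0.3]
top_two = select_frame_indices(scores, confidence=0.25, n_frames=2)
filled = select_frame_indices(scores, confidence=0.25, n_frames=4)
uncapped = select_frame_indices(scores, confidence=0.25)
```

The other fill modes would replace only the final `else` branch, for example by sampling the missing frames with weights instead of taking them in score order.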
MegadetectorLiteYoloXConfig
Bases: BaseModel
Configuration for a MegadetectorLiteYoloX frame selection model
Attributes:
Name | Type | Description |
---|---|---|
confidence | float | Only consider object detections with this confidence or greater. |
nms_threshold | float | Non-maximum suppression is a method for filtering many bounding boxes around the same object to a single bounding box. This is a constant that determines how much to suppress similar bounding boxes. |
image_width | int | Scale image to this width before sending to object detection model. |
image_height | int | Scale image to this height before sending to object detection model. |
device | str | Where to run the object detection model, "cpu" or "cuda". |
frame_batch_size | int | Number of frames to predict on at once. |
n_frames | int | Max number of frames to return. If None, returns all frames above the threshold. Defaults to None. |
fill_mode | str | Mode for upsampling if the number of frames above the threshold is less than n_frames. Defaults to "repeat". |
sort_by_time | bool | Whether to sort the selected frames by time (original order) before returning. If False, returns frames sorted by score (descending). Defaults to True. |
seed | int | Random state for random number generator. Defaults to 55. |
Source code in zamba/object_detection/yolox/megadetector_lite_yolox.py
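To illustrate what nms_threshold controls, here is a minimal sketch of greedy, IoU-based non-maximum suppression, a common formulation. Whether the YOLOX postprocessing used here follows exactly this procedure is not stated on this page; the function names `iou` and `greedy_nms` are hypothetical.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box], scores: Sequence[float], nms_threshold: float
) -> List[int]:
    """Return indices of boxes kept, visiting boxes from highest to lowest score."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a kept box by more than the threshold.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
kept_strict = greedy_nms(boxes, scores, nms_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, nms_threshold=0.7)
```

In this formulation a lower threshold suppresses more aggressively: the two overlapping boxes have an IoU of about 0.68, so the second is dropped at 0.5 and kept at 0.7.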