These are my running notes for porting the TIDL demo app that comes with the BeagleBone AI to a Python interface.

Importing TIDL from Python

The BeagleBone AI version of the TIDL API ships with a Python 3 interface. However, it's not obvious how to use it, since naively running the Python example apps fails:

$ cd /usr/share/ti/examples/tidl/pybind
$ ./one_eo_per_frame.py
Traceback (most recent call last):
  File "./one_eo_per_frame.py", line 35, in <module>
    from tidl import DeviceId, DeviceType, Configuration, Executor, TidlError
ModuleNotFoundError: No module named 'tidl'

Unlike most Python packages, the TIDL Python interface is a .so rather than a directory full of .py files, so I had a hard time figuring out where to look for it. It turns out the trick is to add the directory that contains it to your PYTHONPATH:

$ PYTHONPATH=/usr/share/ti/tidl/tidl_api ./imagenet.py
Input: ../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg
TIOCL FATAL: Failed to open file /dev/mem

This time TIDL was found, but it has to run as root because it needs direct access to /dev/mem. So,

$ sudo PYTHONPATH=/usr/share/ti/tidl/tidl_api ./imagenet.py
[sudo] password for debian:
Input: ../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg
1: Egyptian_cat,   prob = 34.12%
2: tabby,   prob = 34.12%
3: Angora,   prob =  9.41%
4: tiger_cat,   prob =  7.84%

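Since TIDL Python apps have to run as root, exporting PYTHONPATH=/usr/share/ti/tidl/tidl_api in your .bashrc may not be enough. sudo typically resets the environment, so a variable set in your shell may never reach the Python process, and you get the same ModuleNotFoundError as before. Passing PYTHONPATH on the sudo command line, as above, avoids this.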
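Another way around the environment problem is to have the script extend sys.path itself and check for root before touching TIDL. The following is a minimal sketch, assuming tidl.so lives in the directory used above; the helper names add_to_sys_path() and running_as_root() are my own and not part of TIDL.

```python
import os
import sys

# Directory holding tidl.so, as used in the PYTHONPATH examples above.
TIDL_API_DIR = "/usr/share/ti/tidl/tidl_api"


def add_to_sys_path(directory):
    """Put directory on sys.path if it exists; return True if it is on the path."""
    if not os.path.isdir(directory):
        return False
    if directory not in sys.path:
        sys.path.append(directory)
    return True


def running_as_root():
    """Return True when the effective user is root (needed to open /dev/mem)."""
    return os.geteuid() == 0


def main():
    """Fail early with a readable message instead of a traceback or TIOCL FATAL."""
    if not add_to_sys_path(TIDL_API_DIR):
        sys.exit("TIDL API directory not found: " + TIDL_API_DIR)
    if not running_as_root():
        sys.exit("TIDL needs /dev/mem; rerun this script with sudo")
    import tidl  # importable only once the directory above is on sys.path
    print("TIDL API version:", tidl.Executor.get_api_version())
```

Calling main() at the bottom of a script means it no longer depends on PYTHONPATH surviving the trip through sudo.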

TIDL's Python API

The TIDL package for Python does not expose the entire TIDL C++ API. As of version 01.04.00, it provides only the following:

Method/Class/etc Type
tidl.ArgInfo pybind11_builtins.pybind11_type
tidl.Configuration pybind11_builtins.pybind11_type
tidl.DeviceId pybind11_builtins.pybind11_type
tidl.DeviceType pybind11_builtins.pybind11_type
tidl.ExecutionObject pybind11_builtins.pybind11_type
tidl.ExecutionObjectPipeline pybind11_builtins.pybind11_type
tidl.Executor pybind11_builtins.pybind11_type
tidl.TidlError type (it's an exception)
tidl.allocate_memory builtin_function_or_method
tidl.enable_time_stamps builtin_function_or_method
tidl.free_memory builtin_function_or_method

You can dump the namespace of the tidl.so package with something like this:

import sys
sys.path.append("/usr/share/ti/tidl/tidl_api")
import tidl

for part in dir(tidl):
    # getattr() looks the member up by name without resorting to eval()
    print(part, type(getattr(tidl, part)))

Notably, the C++ tidl::imgutil namespace is not included, so we don't have access to tidl::imgutil::PreprocessImage. This function applies the OpenCV steps required to transform an arbitrary input image into the format expected by TIDL's models. So, we have to preprocess images ourselves in Python with OpenCV; TI's imagenet.py example shows how this is done.
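As a rough illustration of the memory-layout half of that preprocessing, the sketch below converts interleaved pixels (B, G, R, B, G, R, ... as OpenCV stores them) into planar form (all B, then all G, then all R). That TIDL's input buffer is planar is an assumption here and worth confirming against TI's imagenet.py, and interleaved_to_planar() is a hypothetical name. It is written in pure Python so the idea is visible; on real frames, cv2.split() or a NumPy transpose does the same job far faster.

```python
def interleaved_to_planar(pixels, channels=3):
    """Reorder interleaved pixel bytes (BGRBGR...) into planes (BB...GG...RR...)."""
    if len(pixels) % channels != 0:
        raise ValueError("pixel data length is not a multiple of the channel count")
    # Taking every channels-th byte starting at offset c pulls out one plane.
    return b"".join(bytes(pixels[c::channels]) for c in range(channels))


# Two pixels, each stored as (B, G, R).
frame = bytes([10, 20, 30, 11, 21, 31])
planar = interleaved_to_planar(frame)
print(list(planar))  # [10, 11, 20, 21, 30, 31]
```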

For completeness (and since it's not documented anywhere), I've included all the interfaces provided by the TIDL Python API below. The best documentation on what each member does can be found in the TIDL API Reference.

tidl.ArgInfo

Member Type
tidl.ArgInfo.size instancemethod

tidl.Configuration

Member Type
tidl.Configuration.channels property
tidl.Configuration.enable_api_trace property
tidl.Configuration.enable_layer_dump property
tidl.Configuration.height property
tidl.Configuration.in_data property
tidl.Configuration.layer_index_to_layer_group_id property
tidl.Configuration.network_binary property
tidl.Configuration.network_heap_size property
tidl.Configuration.num_frames property
tidl.Configuration.out_data property
tidl.Configuration.param_heap_size property
tidl.Configuration.parameter_binary property
tidl.Configuration.pre_proc_type property
tidl.Configuration.read_from_file instancemethod
tidl.Configuration.run_full_net property
tidl.Configuration.show_heap_stats property
tidl.Configuration.width property

tidl.DeviceId

Member Type
tidl.DeviceId.ID0 tidl.DeviceId
tidl.DeviceId.ID1 tidl.DeviceId
tidl.DeviceId.ID2 tidl.DeviceId
tidl.DeviceId.ID3 tidl.DeviceId

tidl.DeviceType

Member Type
tidl.DeviceType.DSP tidl.DeviceType
tidl.DeviceType.EVE tidl.DeviceType

tidl.ExecutionObject

Member Type
tidl.ExecutionObject.get_device_name instancemethod
tidl.ExecutionObject.get_frame_index instancemethod
tidl.ExecutionObject.get_input_buffer instancemethod
tidl.ExecutionObject.get_output_buffer instancemethod
tidl.ExecutionObject.get_process_time_in_ms instancemethod
tidl.ExecutionObject.process_frame_start_async instancemethod
tidl.ExecutionObject.process_frame_wait instancemethod
tidl.ExecutionObject.set_frame_index instancemethod
tidl.ExecutionObject.write_layer_outputs_to_file instancemethod

tidl.ExecutionObjectPipeline

Member Type
tidl.ExecutionObjectPipeline.get_device_name instancemethod
tidl.ExecutionObjectPipeline.get_frame_index instancemethod
tidl.ExecutionObjectPipeline.get_input_buffer instancemethod
tidl.ExecutionObjectPipeline.get_output_buffer instancemethod
tidl.ExecutionObjectPipeline.process_frame_start_async instancemethod
tidl.ExecutionObjectPipeline.process_frame_wait instancemethod
tidl.ExecutionObjectPipeline.set_frame_index instancemethod

tidl.Executor

Member Type
tidl.Executor.at instancemethod
tidl.Executor.get_api_version builtin_function_or_method
tidl.Executor.get_num_devices builtin_function_or_method
tidl.Executor.get_num_execution_objects instancemethod

Image interface into Python

The BeagleBone demo app and the TI Python example use OpenCV, so let's use that too. PIL is an alternative used in the Jetson Nano classification demo, and it may be a better choice in the future since it's a simpler library than OpenCV. However, PIL is not included with the BeagleBone AI OS, whereas OpenCV is.

To import an image from a USB webcam:

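#!/usr/bin/env python3
import cv2

# Device node of the USB webcam; adjust if yours enumerates differently.
camera = cv2.VideoCapture("/dev/video1")
ok, image = camera.read()
camera.release()
if not ok:
    raise RuntimeError("could not read a frame from /dev/video1")
cv2.imwrite('cv2capture.jpg', image)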

Streaming OpenCV over HTTP

Flask comes with BeagleBone's OS and provides everything you need to stream video over HTTP, as the BeagleBone AI's classification demo app does. There's just a little glue code needed to read images using OpenCV and write them to an MJPEG stream:

#!/usr/bin/env python3

import flask
import cv2

app = flask.Flask(__name__)

def stream_camera(camera):
    while True:
        ret, frame = camera.read()
        if ret:
            _, imstr = cv2.imencode(".jpg", frame)
            yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n'
                + bytes(imstr) + b'\r\n')

@app.route('/')
def video_feed():
    return flask.Response(
        stream_camera(cv2.VideoCapture("/dev/video1")),
        mimetype='multipart/x-mixed-replace; boundary=frame')

if __name__ == '__main__':
    app.run(host='0.0.0.0')

The beauty here is that you can modify frame after it is read in stream_camera() to manipulate the image before sending it to the HTTP video stream. This is where you can insert functionality (such as using OpenCV to add a text overlay) on each video frame.
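To make the framing in stream_camera() less opaque: each part of a multipart/x-mixed-replace response is a boundary line, a part header, a blank line, and then the JPEG bytes. The sketch below builds one such part from already-encoded JPEG data. mjpeg_part() is a hypothetical helper name, and the boundary has to match the one given in the mimetype.

```python
def mjpeg_part(jpeg_bytes, boundary=b"frame"):
    """Wrap encoded JPEG bytes as one part of a multipart/x-mixed-replace stream."""
    return b"".join([
        b"--", boundary, b"\r\n",          # boundary line announcing a new part
        b"Content-Type: image/jpeg\r\n",   # header for this part
        b"\r\n",                           # blank line ends the part headers
        jpeg_bytes,                        # the image itself
        b"\r\n",
    ])


# Placeholder payload; in stream_camera() this would be the cv2.imencode() output.
part = mjpeg_part(b"JPEG")
print(part)
```

The browser replaces the displayed image each time a new part arrives, which is what turns a sequence of JPEGs into video.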

Classifying video frames

You can intercept frames in stream_camera() and apply an arbitrary filter_function() to each one before passing it to the MJPEG stream:

def stream_camera(camera):
    while True:
        ret, frame = camera.read()
        if ret:
            frame = filter_function(frame)
            _, imstr = cv2.imencode(".jpg", frame)
            yield (b'--frame\r\nContent-Type: image/jpeg\r\n\r\n'
                + bytes(imstr) + b'\r\n')

Naively, this filter_function() could do something like...

  1. run the frame through a neural net by way of TIDL or any other ML framework to classify the contents of the image
  2. overlay a text message with the label we got from step 1

On BeagleBone AI this would be slow, though: each frame would have to be classified and streamed before work on the next one could begin, so the frame rate would be capped by the inference latency of a single frame. With more careful integration, you can round-robin image frames over all the EVEs and DSPs to keep them all busy.

I've written an example app that integrates TIDL, OpenCV, and Flask to do streaming classification. Its core loop looks at each ExecutionObjectPipeline in a round-robin fashion and does the following:

  1. Checks to see if an ExecutionObjectPipeline (EOP) is done processing its work. If so, it
    1. Reads the results from the EOP's EVE or DSP
    2. Finds the label with the highest confidence in those results
    3. Uses cv2.putText() to write that topmost label on the image
    4. Sends the image as the next video frame via Flask
  2. Reads an image frame from the webcam
  3. Squeezes the frame down to 224 × 224 pixels, which is what the TIDL model being used expects
  4. Rearranges the layout of the image in memory to match the in-memory image representation that TIDL expects
  5. Asynchronously launches the ExecutionObjectPipeline to classify the image
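The "highest confidence label" step can be sketched without any TIDL calls. The sample imagenet.py output earlier (34.12% is 87/255, 9.41% is 24/255) suggests the output buffer holds one unsigned byte per class, so the sketch assumes that. top_labels() is a hypothetical name, and the label list is a placeholder rather than the real ImageNet label file.

```python
def top_labels(scores, labels, k=1):
    """Return the k highest-scoring (label, probability) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("need exactly one label per score")
    # Sort class indices by score, highest first; ties keep their index order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Assumption: scores are 0..255, so dividing by 255 gives a probability.
    return [(labels[i], scores[i] / 255.0) for i in ranked[:k]]


# Placeholder data standing in for an EOP's output buffer.
scores = bytes([20, 87, 24, 0])
labels = ["tiger_cat", "Egyptian_cat", "Angora", "goldfish"]
for label, prob in top_labels(scores, labels, k=3):
    print("{}: {:.2%}".format(label, prob))
```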

By using TIDL asynchronously and looping over ExecutionObjectPipelines, we can load one frame onto each DSP and EVE to be classified in parallel. As each EVE or DSP finishes classifying its frame, we can load up another frame and launch it asynchronously while we look at the results we just got back.
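The scheduling pattern itself can be tried out without a BeagleBone. The sketch below is a simulation, not the example app: FakePipeline is a stand-in that borrows the process_frame_start_async() and process_frame_wait() method names from tidl.ExecutionObjectPipeline but does no real work, and its boolean return value is an assumption about the real call. round_robin() is a hypothetical name, and the sketch uses None to mean "no more frames".

```python
import itertools


class FakePipeline:
    """Stand-in for tidl.ExecutionObjectPipeline that finishes instantly."""

    def __init__(self, name):
        self.name = name
        self.frame = None

    def process_frame_start_async(self):
        pass  # the real call hands the input buffer to an EVE or DSP

    def process_frame_wait(self):
        return True  # the real call waits for the device to finish


def round_robin(pipelines, frames):
    """Yield (pipeline name, frame) results while keeping every pipeline loaded."""
    busy = set()
    frames = iter(frames)
    for eop in itertools.cycle(pipelines):
        # 1. Collect the result of whatever this pipeline was working on.
        if eop.name in busy and eop.process_frame_wait():
            busy.discard(eop.name)
            yield eop.name, eop.frame
        # 2. Hand it the next frame, if there is one.
        frame = next(frames, None)
        if frame is not None:
            eop.frame = frame
            eop.process_frame_start_async()
            busy.add(eop.name)
        elif not busy:
            return  # no frames left and nothing in flight


pipelines = [FakePipeline(n) for n in ("EVE0", "EVE1", "DSP0")]
for name, frame in round_robin(pipelines, range(5)):
    print(name, "finished frame", frame)
```

With three pipelines, frames 0 through 2 are all launched before the first result is collected, which is the overlap the real loop relies on.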

This is like having your washer and dryer going at the same time: if you've got multiple loads of laundry to wash, you can let your dryer run while you put your second load into the washer, and you can fold laundry while both washer and dryer are running. The BeagleBone AI has four EVEs and two DSPs, so five other video frames can be in flight while we use cv2.putText() to overlay text on the image we're about to stream out.