Configure scene metadata over RTSP
This guide shows how to configure and start an RTSP (Real-Time Streaming Protocol) stream with scene metadata of the ONVIF Scene Description kind.
This kind of scene metadata is delivered as output from modules that support RTSP as an output protocol, using the ONVIF tt:Frame data format.
This includes the modules:
The Analytics Metadata Producer Configuration API is used to select which output should be included in the RTSP stream. The Video Streaming over RTSP API is used to start an RTSP stream that includes scene metadata.
Prerequisites
- An Axis device that supports AXIS Scene Metadata
- A module instance that supports output via RTSP using the ONVIF tt:Frame format
- cURL installed
Overview
The guide contains the following steps:
- List all available analytics metadata producers
- Configure which producers' output should be included in the RTSP stream
- Retrieve supported metadata for an analytics metadata producer
- Connect to an RTSP stream that includes scene metadata
Let's get started!
Step 1: List the available analytics metadata producers
To check for the currently available analytics metadata producers on a device, we can use the following POST method.
Don't forget to replace <device-ip>, <user> and <password> where applicable.
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "listProducers",
    "params": {}
  }'
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "listProducers",
  "data": {
    "producers": [
      {
        "name": "VideoMotionTracker",
        "niceName": "Axis video motion tracker",
        "videochannels": [
          {
            "channel": 1,
            "enabled": false
          }
        ]
      },
      {
        "name": "AnalyticsSceneDescription",
        "niceName": "Analytics Scene Description",
        "videochannels": [
          {
            "channel": 1,
            "enabled": true
          }
        ]
      }
    ]
  }
}
The Analytics Scene Description producer corresponds to the output from the Fusion Tracker module.
The Axis video motion tracker producer corresponds to the output from the Video Motion Tracker module.
The value of channel, here 1, identifies the origin of the data: the module instance that corresponds to the video source it uses as input.
On a multidirectional camera, channels 2, 3 and 4 might also be available.
Step 2: Enable the Analytics Scene Description producer
Now we'll enable the Analytics Scene Description producer using a POST method.
Don't forget to replace <device-ip>, <user> and <password> where applicable.
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "setEnabledProducers",
    "params": {
      "producers": [
        {
          "name": "AnalyticsSceneDescription",
          "videochannels": [
            {
              "channel": 1,
              "enabled": true
            }
          ]
        }
      ]
    }
  }'
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "setEnabledProducers",
  "data": {}
}
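When several channels need to be configured, it can be convenient to generate the request body instead of writing it by hand. The Python sketch below builds the same setEnabledProducers body as the cURL example and checks a response. The helper names are hypothetical, and treating a reply without an "error" member as success is an assumption, not something this guide states.

```python
import json


def build_set_enabled_request(producer, channel_states, context="my context"):
    """Return the setEnabledProducers JSON body as a string.

    channel_states maps a channel number to True (enable) or False (disable).
    """
    body = {
        "apiVersion": "1.0",
        "context": context,
        "method": "setEnabledProducers",
        "params": {
            "producers": [
                {
                    "name": producer,
                    "videochannels": [
                        {"channel": channel, "enabled": enabled}
                        for channel, enabled in sorted(channel_states.items())
                    ],
                }
            ]
        },
    }
    return json.dumps(body, indent=2)


def is_success(response_text, method="setEnabledProducers"):
    """Check that the reply echoes the method and carries a "data" member.

    Assumption: a failed call would carry an "error" member instead.
    """
    reply = json.loads(response_text)
    return (
        reply.get("method") == method
        and "data" in reply
        and "error" not in reply
    )


if __name__ == "__main__":
    # The printed body can be passed to cURL with the -d option.
    print(build_set_enabled_request("AnalyticsSceneDescription", {1: True}))
```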
Step 3: Retrieve supported metadata
Let's now retrieve information about the analytics metadata producers and what kind of metadata they can produce, using the following POST method.
One reason to do this is to get an idea of what kind of metadata can be included in the RTSP stream.
The supported metadata is communicated as a sample frame.
Don't forget to replace <device-ip>, <user> and <password> where applicable.
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "getSupportedMetadata",
    "params": {
      "producers": [
        "AnalyticsSceneDescription"
      ]
    }
  }'
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "getSupportedMetadata",
  "data": {
    "producers": [
      {
        "name": "AnalyticsSceneDescription",
        "sampleFrameXML": "<see formatted example below!>"
      }
    ]
  }
}
Formatted Sample Frame
<tt:SampleFrame xmlns:tt= \"http://www.onvif.org/ver10/schema\" xmlns:bd= \"http://www.onvif.org/ver20/analytics/humanbody\" UtcTime= \"2025-03-11T08:11:54.008178Z\" Source= \"AnalyticsSceneDescription\">
<tt:Object ObjectId= \"101\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.75\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.75\">Bus</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"102\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.7\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.7\">Car</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"103\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.65\">Human</tt:Type>
</tt:Class>
<tt:HumanBody>
<bd:Clothing>
<bd:Tops>
<bd:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"245\" Y= \"245\" Z= \"220\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</bd:Color>
</bd:Tops>
<bd:Bottoms>
<bd:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"245\" Y= \"245\" Z= \"220\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</bd:Color>
</bd:Bottoms>
</bd:Clothing>
</tt:HumanBody>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"104\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.6\">HumanFace</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"105\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.55\">LicensePlate</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"106\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.5\">Bike</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"107\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.45\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.45\">Truck</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"108\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.4\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.4\">Vehicle</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:ObjectTree>
<tt:Rename>
<tt:to ObjectId= \"20\" />
<tt:from ObjectId= \"38\" />
</tt:Rename>
<tt:Delete ObjectId= \"1\" />
</tt:ObjectTree>
</tt:SampleFrame>
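The sample frame arrives as an escaped string inside the JSON response, so it has to be decoded as JSON before it can be parsed as XML. The Python sketch below illustrates this on a heavily shortened, hand-made sample frame and lists the object classes it advertises. The helper name supported_classes and the shortened frame are illustrative assumptions; a real response contains the full frame shown above.

```python
import json
import xml.etree.ElementTree as ET

TT = "http://www.onvif.org/ver10/schema"

# Shortened stand-in for the sample frame shown above (two objects only).
SAMPLE_XML = (
    '<tt:SampleFrame xmlns:tt="http://www.onvif.org/ver10/schema" '
    'Source="AnalyticsSceneDescription">'
    '<tt:Object ObjectId="101"><tt:Appearance>'
    '<tt:Class><tt:Type Likelihood="0.75">Vehicle</tt:Type></tt:Class>'
    '<tt:VehicleInfo><tt:Type Likelihood="0.75">Bus</tt:Type></tt:VehicleInfo>'
    '</tt:Appearance></tt:Object>'
    '<tt:Object ObjectId="103"><tt:Appearance>'
    '<tt:Class><tt:Type Likelihood="0.65">Human</tt:Type></tt:Class>'
    '</tt:Appearance></tt:Object>'
    '</tt:SampleFrame>'
)

# json.dumps escapes the quotes, which mimics how the device embeds the XML.
RESPONSE = json.dumps({
    "apiVersion": "1.0",
    "method": "getSupportedMetadata",
    "data": {"producers": [
        {"name": "AnalyticsSceneDescription", "sampleFrameXML": SAMPLE_XML}
    ]},
})


def supported_classes(response_text):
    """Map each producer name to the sorted tt:Class types in its sample frame."""
    result = {}
    for producer in json.loads(response_text)["data"]["producers"]:
        # json.loads has already removed the backslash escapes at this point.
        root = ET.fromstring(producer["sampleFrameXML"])
        types = root.iterfind(f".//{{{TT}}}Class/{{{TT}}}Type")
        result[producer["name"]] = sorted({t.text for t in types})
    return result


if __name__ == "__main__":
    print(supported_classes(RESPONSE))
```

Only tt:Type elements directly under tt:Class are collected, so the more specific tt:VehicleInfo types such as Bus are left out on purpose.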
Step 4: Connect to an RTSP stream with scene data
To connect to an RTSP stream that includes scene metadata from the enabled analytics metadata producers, use the following RTSP URL:
rtsp://<device-ip>/axis-media/media.amp?camera=1&audio=0&video=0&analytics=polygon
For details on the different parameters in the URL, see the Parameter specification RTSP URL reference for RTSP metadata.
Notably:
- The camera parameter is used to select output from instances whose output corresponds to the specific video source.
- The video parameter is used to configure whether video should be included in the RTSP stream; with video=0, video will not be included.
- The analytics parameter is used to configure whether scene metadata should be included; with analytics=polygon, scene metadata will be included.
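As an illustration of how the parameters fit together, the following Python sketch assembles the RTSP URL from its parts. The function name scene_metadata_url and the example address 192.0.2.10 are hypothetical; only the parameter values used in this guide are produced.

```python
from urllib.parse import urlencode


def scene_metadata_url(device_ip, camera=1):
    """Build an RTSP URL that carries scene metadata only (no audio, no video)."""
    query = urlencode({
        "camera": camera,         # selects the video source / module instance
        "audio": 0,               # value taken from the URL in this guide
        "video": 0,               # video=0: video is not included
        "analytics": "polygon",   # analytics=polygon: scene metadata is included
    })
    return f"rtsp://{device_ip}/axis-media/media.amp?{query}"


if __name__ == "__main__":
    print(scene_metadata_url("192.0.2.10"))
```

On a multidirectional camera, passing another camera value would select the instance for that video source.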
The metadata is received as an XML document.
The scene metadata uses the ONVIF tt:Frame data format and is located in the tt:VideoAnalytics element of the document.
<?xml version="1.0" ?>
<tt:MetadataStream xmlns:tt="http://www.onvif.org/ver10/schema">
  <tt:VideoAnalytics>
    <tt:Frame UtcTime="2025-03-27T12:58:26.799863Z" Source="AnalyticsSceneDescription">
      ...
    </tt:Frame>
  </tt:VideoAnalytics>
</tt:MetadataStream>
To view the data a tool such as the AXIS Metadata Monitor can be used.
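If you would rather inspect received documents in code, a minimal Python sketch for locating the frames could look like this. It assumes the XML document has already been extracted from the RTSP stream by some client, which is outside the scope of this sketch. The helper name frame_headers is hypothetical, and the elided frame content from the example above is left empty here.

```python
import xml.etree.ElementTree as ET

NS = {"tt": "http://www.onvif.org/ver10/schema"}

# The example document from this step, with the elided frame content left out.
DOCUMENT = """<?xml version="1.0" ?>
<tt:MetadataStream xmlns:tt="http://www.onvif.org/ver10/schema">
  <tt:VideoAnalytics>
    <tt:Frame UtcTime="2025-03-27T12:58:26.799863Z" Source="AnalyticsSceneDescription">
    </tt:Frame>
  </tt:VideoAnalytics>
</tt:MetadataStream>"""


def frame_headers(document):
    """Return (UtcTime, Source) for every tt:Frame under tt:VideoAnalytics."""
    root = ET.fromstring(document)
    return [
        (frame.get("UtcTime"), frame.get("Source"))
        for frame in root.findall("tt:VideoAnalytics/tt:Frame", NS)
    ]


if __name__ == "__main__":
    for utc_time, source in frame_headers(DOCUMENT):
        print(utc_time, source)
```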