Configure scene metadata over RTSP

This guide shows how to configure and start an RTSP (Real-Time Streaming Protocol) stream with scene metadata of the ONVIF Scene Description kind.

This kind of scene metadata is delivered as an output from modules supporting RTSP as an output protocol, using the ONVIF tt:Frame data format.

This includes modules such as Fusion Tracker and Video Motion Tracker, whose producers are listed in Step 1.

The Analytics Metadata Producer Configuration API is used to select which output to include in the RTSP stream. The Video Streaming over RTSP API is used to start an RTSP stream that includes scene metadata.

Prerequisites

  • An Axis device that supports AXIS Scene Metadata
  • A module instance that supports output via RTSP using the ONVIF tt:Frame format.
  • cURL installed

Overview

The guide contains the following steps:

  1. List all available analytics metadata producers
  2. Configure which producers' output should be included in the RTSP stream
  3. Retrieve supported metadata for an analytics metadata producer
  4. Connect to an RTSP stream that includes scene metadata

Let's get started!

Step 1: List the available analytics metadata producers

To check for the currently available analytics metadata producers on a device we can use the following POST method.

Don't forget to replace <device-ip>, <user> and <password> where applicable.

Get available metadata producers
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "listProducers",
    "params": {}
  }'
Response
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "listProducers",
  "data": {
    "producers": [
      {
        "name": "VideoMotionTracker",
        "niceName": "Axis video motion tracker",
        "videochannels": [
          {
            "channel": 1,
            "enabled": false
          }
        ]
      },
      {
        "name": "AnalyticsSceneDescription",
        "niceName": "Analytics Scene Description",
        "videochannels": [
          {
            "channel": 1,
            "enabled": true
          }
        ]
      }
    ]
  }
}
info

The Analytics Scene Description producer corresponds to the output from the Fusion Tracker module.

The Axis video motion tracker producer corresponds to the output from the Video Motion Tracker module.

The value of channel, here 1, represents the origin of the data: a module instance corresponding to the video source that it uses as input. On a multidirectional camera, channels 2, 3 and 4 might also be available.
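
As a rough illustration of how a client might read this response, the Python sketch below maps each producer to the channels on which it is enabled. The helper name enabled_channels is hypothetical, and the embedded JSON is a compacted copy of the response above rather than live device output.

```python
import json

# Compacted copy of the listProducers response shown above.
response_text = """
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "listProducers",
  "data": {
    "producers": [
      {"name": "VideoMotionTracker",
       "niceName": "Axis video motion tracker",
       "videochannels": [{"channel": 1, "enabled": false}]},
      {"name": "AnalyticsSceneDescription",
       "niceName": "Analytics Scene Description",
       "videochannels": [{"channel": 1, "enabled": true}]}
    ]
  }
}
"""


def enabled_channels(response: dict) -> dict:
    """Map each producer name to the list of channels where it is enabled."""
    result = {}
    for producer in response["data"]["producers"]:
        result[producer["name"]] = [
            vc["channel"] for vc in producer["videochannels"] if vc["enabled"]
        ]
    return result


status = enabled_channels(json.loads(response_text))
print(status)
```

A producer that maps to an empty list is available on the device but not yet included in the RTSP stream on any channel.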

Step 2: Enable the Analytics Scene Description producer

Now we'll enable the Analytics Scene Description producer using a POST method.

Don't forget to replace <device-ip>, <user> and <password> where applicable.

Enable producer
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "setEnabledProducers",
    "params": {
      "producers": [
        {
          "name": "AnalyticsSceneDescription",
          "videochannels": [
            {
              "channel": 1,
              "enabled": true
            }
          ]
        }
      ]
    }
  }'
Response
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "setEnabledProducers",
  "data": {}
}
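
When several channels need to be switched at once, for example on a multidirectional camera, writing the request body by hand gets repetitive. The following Python sketch generates the same body as the request above. The function name build_set_enabled_request and its parameters are hypothetical; only the JSON field names come from the example in this step.

```python
import json


def build_set_enabled_request(producer: str, channels: list,
                              enabled: bool = True,
                              context: str = "my context") -> str:
    """Return a setEnabledProducers request body as a JSON string."""
    body = {
        "apiVersion": "1.0",
        "context": context,
        "method": "setEnabledProducers",
        "params": {
            "producers": [
                {
                    "name": producer,
                    # One entry per video channel to enable or disable.
                    "videochannels": [
                        {"channel": ch, "enabled": enabled} for ch in channels
                    ],
                }
            ]
        },
    }
    return json.dumps(body, indent=2)


payload = build_set_enabled_request("AnalyticsSceneDescription", [1])
print(payload)
```

The printed JSON can be saved to a file and passed to cURL with -d @<file> in place of the inline body.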

Step 3: Retrieve supported metadata

Let's now retrieve information about the analytics metadata producers and the kind of metadata they can produce, using the following POST method. One reason to do this is to get an idea of what kind of metadata can be included in the RTSP stream.

The supported metadata is communicated as a sample frame.

Don't forget to replace <device-ip>, <user> and <password> where applicable.

Retrieve supported metadata
curl --anyauth -u <user>:<password> -X 'POST' \
  'http://<device-ip>/axis-cgi/analyticsmetadataconfig.cgi' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "apiVersion": "1.0",
    "context": "my context",
    "method": "getSupportedMetadata",
    "params": {
      "producers": [
        "AnalyticsSceneDescription"
      ]
    }
  }'
Response
{
  "apiVersion": "1.0",
  "context": "my context",
  "method": "getSupportedMetadata",
  "data": {
    "producers": [
      {
        "name": "AnalyticsSceneDescription",
        "sampleFrameXML": "<see formatted example below!>"
      }
    ]
  }
}
Formatted Sample Frame
<tt:SampleFrame xmlns:tt= \"http://www.onvif.org/ver10/schema\" xmlns:bd= \"http://www.onvif.org/ver20/analytics/humanbody\" UtcTime= \"2025-03-11T08:11:54.008178Z\" Source= \"AnalyticsSceneDescription\">
<tt:Object ObjectId= \"101\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.75\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.75\">Bus</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"102\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.7\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.7\">Car</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"103\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.65\">Human</tt:Type>
</tt:Class>
<tt:HumanBody>
<bd:Clothing>
<bd:Tops>
<bd:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"245\" Y= \"245\" Z= \"220\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</bd:Color>
</bd:Tops>
<bd:Bottoms>
<bd:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"245\" Y= \"245\" Z= \"220\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</bd:Color>
</bd:Bottoms>
</bd:Clothing>
</tt:HumanBody>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"104\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.6\">HumanFace</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"105\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.55\">LicensePlate</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"106\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Class>
<tt:Type Likelihood= \"0.5\">Bike</tt:Type>
</tt:Class>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"107\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.45\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.45\">Truck</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:Object ObjectId= \"108\">
<tt:Appearance>
<tt:Shape>
<tt:BoundingBox left= \"-0.6\" top= \"0.6\" right= \"-0.2\" bottom= \"0.2\" />
<tt:CenterOfGravity x= \"-0.4\" y= \"0.4\" />
<tt:Polygon>
<tt:Point x= \"-0.6\" y= \"0.6\" />
<tt:Point x= \"-0.6\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.2\" />
<tt:Point x= \"-0.2\" y= \"0.6\" />
</tt:Polygon>
</tt:Shape>
<tt:Color>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"128\" Y= \"128\" Z= \"128\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"0\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"0\" Z= \"255\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"0\" Y= \"128\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
<tt:ColorCluster>
<tt:Color X= \"255\" Y= \"255\" Z= \"0\" Likelihood= \"0.8\" Colorspace= \"RGB\" />
</tt:ColorCluster>
</tt:Color>
<tt:Class>
<tt:Type Likelihood= \"0.4\">Vehicle</tt:Type>
</tt:Class>
<tt:VehicleInfo>
<tt:Type Likelihood= \"0.4\">Vehicle</tt:Type>
</tt:VehicleInfo>
</tt:Appearance>
</tt:Object>
<tt:ObjectTree>
<tt:Rename>
<tt:to ObjectId= \"20\" />
<tt:from ObjectId= \"38\" />
</tt:Rename>
<tt:Delete ObjectId= \"1\" />
</tt:ObjectTree>
</tt:SampleFrame>"
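
Because the sample frame arrives as an escaped string inside the JSON response, a client has to decode the JSON first and then parse the XML. The Python sketch below shows one way to list the object classes a producer reports. The function name supported_classes is hypothetical, and the embedded sample frame is a heavily trimmed stand-in for the one above, not real device output.

```python
import json
import xml.etree.ElementTree as ET

NS = {"tt": "http://www.onvif.org/ver10/schema"}

# Trimmed stand-in for the sample frame above (assumption: two objects only).
sample_xml = (
    '<tt:SampleFrame xmlns:tt="http://www.onvif.org/ver10/schema" '
    'Source="AnalyticsSceneDescription">'
    '<tt:Object ObjectId="101"><tt:Appearance>'
    '<tt:Class><tt:Type Likelihood="0.75">Vehicle</tt:Type></tt:Class>'
    '<tt:VehicleInfo><tt:Type Likelihood="0.75">Bus</tt:Type></tt:VehicleInfo>'
    '</tt:Appearance></tt:Object>'
    '<tt:Object ObjectId="103"><tt:Appearance>'
    '<tt:Class><tt:Type Likelihood="0.65">Human</tt:Type></tt:Class>'
    '</tt:Appearance></tt:Object>'
    '</tt:SampleFrame>'
)

# json.dumps escapes the quotes, mimicking how the response carries the XML.
response_text = json.dumps({
    "apiVersion": "1.0",
    "context": "my context",
    "method": "getSupportedMetadata",
    "data": {"producers": [
        {"name": "AnalyticsSceneDescription", "sampleFrameXML": sample_xml}
    ]},
})


def supported_classes(text: str) -> dict:
    """Map each producer to the sorted object classes in its sample frame."""
    response = json.loads(text)  # decoding removes the \" escapes
    result = {}
    for producer in response["data"]["producers"]:
        root = ET.fromstring(producer["sampleFrameXML"])
        types = root.findall(".//tt:Class/tt:Type", NS)
        result[producer["name"]] = sorted({t.text for t in types})
    return result


classes = supported_classes(response_text)
print(classes)
```

Only tt:Type elements directly under tt:Class are collected, so subtypes such as the Bus entry under tt:VehicleInfo are left out of the result.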

Step 4: Connect to an RTSP stream with scene data

To connect to an RTSP stream that includes scene metadata from the enabled analytics metadata producers, the following RTSP URL can be used:

rtsp://<device-ip>/axis-media/media.amp?camera=1&audio=0&video=0&analytics=polygon

For details on the different parameters in the URL, see the Parameter specification RTSP URL reference for RTSP metadata.

Notably,

  • The camera parameter is used to select output from module instances whose output corresponds to a specific video source.
  • The video parameter is used to configure whether video should be included in the RTSP stream. With video=0, video will not be included.
  • The analytics parameter is used to configure whether scene metadata should be included. With analytics=polygon, scene metadata will be included.
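
The parameters above can also be assembled programmatically. The following Python sketch builds the same URL from keyword arguments. The function name build_rtsp_url and its defaults are hypothetical, and 192.0.2.10 is a placeholder address that stands in for <device-ip>.

```python
from urllib.parse import urlencode


def build_rtsp_url(device_ip: str, camera: int = 1, video: bool = False,
                   audio: bool = False, analytics: str = "polygon") -> str:
    """Build a metadata RTSP URL using the parameters described above."""
    query = urlencode({
        "camera": camera,        # video source whose module output is wanted
        "audio": int(audio),     # 0 leaves audio out of the stream
        "video": int(video),     # 0 leaves video out of the stream
        "analytics": analytics,  # "polygon" includes scene metadata
    })
    return f"rtsp://{device_ip}/axis-media/media.amp?{query}"


url = build_rtsp_url("192.0.2.10")
print(url)
```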

The metadata is received as an XML document. The scene metadata uses the ONVIF tt:Frame data format and is located in the tt:VideoAnalytics element of the document.

<?xml version="1.0" ?>
<tt:MetadataStream xmlns:tt="http://www.onvif.org/ver10/schema">
  <tt:VideoAnalytics>
    <tt:Frame UtcTime="2025-03-27T12:58:26.799863Z" Source="AnalyticsSceneDescription">
      ...
    </tt:Frame>
  </tt:VideoAnalytics>
</tt:MetadataStream>
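
Once a document like this has been received, it can be parsed with any namespace-aware XML library. The Python sketch below pulls the timestamp, source and objects out of each tt:Frame. The function name parse_frames is hypothetical, and the tt:Object inside the frame is an assumption modeled on the sample frame from Step 3, since the frame content is elided above.

```python
import xml.etree.ElementTree as ET

NS = {"tt": "http://www.onvif.org/ver10/schema"}

# Illustrative document; the tt:Object content is assumed, not captured output.
document = """<?xml version="1.0" ?>
<tt:MetadataStream xmlns:tt="http://www.onvif.org/ver10/schema">
  <tt:VideoAnalytics>
    <tt:Frame UtcTime="2025-03-27T12:58:26.799863Z" Source="AnalyticsSceneDescription">
      <tt:Object ObjectId="101">
        <tt:Appearance>
          <tt:Shape>
            <tt:BoundingBox left="-0.6" top="0.6" right="-0.2" bottom="0.2" />
          </tt:Shape>
          <tt:Class>
            <tt:Type Likelihood="0.75">Vehicle</tt:Type>
          </tt:Class>
        </tt:Appearance>
      </tt:Object>
    </tt:Frame>
  </tt:VideoAnalytics>
</tt:MetadataStream>
"""


def parse_frames(xml_text: str) -> list:
    """Return one dict per tt:Frame found under tt:VideoAnalytics."""
    root = ET.fromstring(xml_text)
    frames = []
    for frame in root.findall("tt:VideoAnalytics/tt:Frame", NS):
        objects = []
        for obj in frame.findall("tt:Object", NS):
            box = obj.find("tt:Appearance/tt:Shape/tt:BoundingBox", NS)
            kind = obj.find("tt:Appearance/tt:Class/tt:Type", NS)
            objects.append({
                "id": obj.get("ObjectId"),
                "type": kind.text if kind is not None else None,
                "box": ({k: float(v) for k, v in box.attrib.items()}
                        if box is not None else None),
            })
        frames.append({
            "utc_time": frame.get("UtcTime"),
            "source": frame.get("Source"),
            "objects": objects,
        })
    return frames


frames = parse_frames(document)
print(frames)
```

Objects without a bounding box or class are kept with None in the corresponding field, so a frame is never dropped just because one object is incomplete.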

To view the data, a tool such as the AXIS Metadata Monitor can be used.