Difference between revisions of "Protobuf notes"

From Wiki at Neela Nurseries

Revision as of 05:07, 12 October 2024

Overview

There's a lot going on with Protobuf, so this local Neela Nurseries page was started to capture some links to online docs and tutorials.

Also a good, more detailed introduction: the following documentation page is one of a full collection of documents. It contains example C code to "get the size of the message without storing it anywhere":


Another reference with possibly good detail:

Encoding details of protobuf "on the wire":

^ Terminology and Elements Of

Protobuf has several features and "moving parts". Two of these, which we'll loop back to and write a bit more about, are field names and field numbers. Both are important identifiers expressed in the .proto message definition files used at build time by senders and receivers of protobuf-formatted messages. Because they're defined at build time, changing these name and number identifiers after software has been built and released can, and likely will, cause message interpretation errors: older software sending and receiving the updated messages does not understand the newer message format.

Protobuf has a means for handling cases where field names and numbers need to be "removed": the message definitions are changed to mark those defunct identifiers as reserved. Reserved identifiers cannot be reused, so different names and numbers must be chosen, but they can be taken "out of circulation". Read further at:

^ Factoring Protobuf Message Definitions

The following online guide speaks to defining multiple related protobuf messages in a single .proto file:

^ Nanopb

2022-01-08 Saturday

CMake script to locate Nanopb headers and sources:

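A nanopb API reference, with anchors for `pb_encode()` and `pb_encode_submessage()`:

*  https://jpa.kapsi.fi/nanopb/docs/reference.html
*  https://jpa.kapsi.fi/nanopb/docs/reference.html#pb_encode
*  https://jpa.kapsi.fi/nanopb/docs/reference.html#pb_encode_submessage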

^ Protobuf C Code Examples

When compiling the nanopb Protobuf library into C language programs, nested Protobuf messages may require use of the nanopb-defined callback structure `pb_callback_t` in order to encode and decode those nested messages. Some examples of this on GitHub:

In the first example, an early instance of `pb_callback_t` occurs on line 56 of the file. Looking further, this project has a few dozen protoc-generated files, so switching to a possibly smaller project:

In kitsune project, looking at:

(1) file kitsune/kitsune/audio_features_upload_task.c function setup_protbuf( . . . )
(2) file audio_features_upload_task_helpers.c function encode_repeated_streaming_bytes_and_mark_done(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
(3) in same file reviewing function write_streams(pb_ostream_t *stream, const pb_field_t *field,hlo_stream_t * hlo_stream)

Here is an excerpt from proto_utils.c, lines 147 through 156 in the kitsune source, which appears to define an encode callback of the kind assigned to a `pb_callback_t` field. It is followed by the next routine in that file, which assigns the first routine through a function pointer:

bool encode_device_id_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg) {
    //char are twice the size, extra 1 for null terminator
    char hex_device_id[2*DEVICE_ID_SZ+1] = {0};
    if(!get_device_id(hex_device_id, sizeof(hex_device_id)))
    {   
        return false;
    }   

    return pb_encode_tag_for_field(stream, field) && pb_encode_string(stream, (uint8_t*)hex_device_id, strlen(hex_device_id));
}

void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper)
{
    if(NULL == batched || NULL == encode_wrapper)
    {   
        LOGE("null param\n");
        return;
    }   

    batched->data.funcs.encode = encode_all_periodic_data;  // This is smart :D
    batched->data.arg = encode_wrapper;
    batched->firmware_version = KIT_VER;
    batched->device_id.funcs.encode = encode_device_id_string;
}
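
The callback wiring in `pack_batched_periodic_data()` can be sketched without nanopb at all. What follows is a minimal stand-in, not nanopb's actual definitions: `demo_stream_t`, `demo_callback_t`, `demo_message_t`, `encode_device_id()` and `demo_encode()` are all hypothetical names invented for illustration. The only idea borrowed from the excerpt is that a message field holds a function pointer plus an opaque argument, and the encoder calls through that pointer when it reaches the field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an output stream; NOT nanopb's pb_ostream_t. */
typedef struct {
    char buf[64];
    size_t len;
} demo_stream_t;

/* Hypothetical stand-in for a callback field: a function pointer plus an
 * opaque argument, similar in spirit to the funcs.encode / arg pair above. */
typedef struct {
    bool (*encode)(demo_stream_t *stream, void *const *arg);
    void *arg;
} demo_callback_t;

/* Generated-style message struct with one callback field. */
typedef struct {
    int firmware_version;
    demo_callback_t device_id;
} demo_message_t;

/* Encode callback: appends the string passed via arg to the stream. */
static bool encode_device_id(demo_stream_t *stream, void *const *arg)
{
    const char *id = (const char *)*arg;
    size_t n = strlen(id);

    if (n > sizeof(stream->buf) - stream->len) {
        return false;               /* not enough room left in the stream */
    }
    memcpy(stream->buf + stream->len, id, n);
    stream->len += n;
    return true;
}

/* Toy encoder: when it reaches the callback field it calls through the
 * function pointer, or skips the field if no callback was assigned. */
static bool demo_encode(demo_stream_t *stream, const demo_message_t *msg)
{
    assert(stream != NULL && msg != NULL);
    if (msg->device_id.encode == NULL) {
        return true;
    }
    return msg->device_id.encode(stream, &msg->device_id.arg);
}
```

A caller would set `msg.device_id.encode = encode_device_id` and `msg.device_id.arg` to a string before calling `demo_encode()`, which mirrors the two assignments made in `pack_batched_periodic_data()`.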

A search for calls to `pack_batched_periodic_data()`:

$ grep -nr pack_batched_periodic_data ./*
./commands.c:1069:			pack_batched_periodic_data(&data_batched, &periodicdata);
./proto_utils.c:158:void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper)
./proto_utils.h:29:void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper);

Tracing yet further back, commands.c in the kitsune project has the following routine, which declares and uses a `periodic_data` type:

1038 void thread_tx(void* unused) {
1039         batched_periodic_data data_batched = {0};
1040 #ifdef UPLOAD_AP_INFO
1041         batched_periodic_data_wifi_access_point ap;
1042 #endif
1043         periodic_data forced_data;
1044         bool got_forced_data = false;
1045 
1046         LOGI(" Start polling  \n");
1047         while (1) {
1048                 if (uxQueueMessagesWaiting(data_queue) >= data_queue_batch_size
1049                  || got_forced_data ) {
1050                         LOGI(   "sending data\n" );
1051 
1052                         periodic_data_to_encode periodicdata;
1053                         periodicdata.num_data = 0;
1054                         periodicdata.data = (periodic_data*)pvPortMalloc(MAX_BATCH_SIZE*sizeof(periodic_data));
1055 
1056                         if( !periodicdata.data ) {
1057                                 LOGI( "failed to alloc periodicdata\n" );
1058                                 vTaskDelay(1000);
1059                                 continue;
1060                         }
1061                         if( got_forced_data ) {
1062                                 memcpy( &periodicdata.data[periodicdata.num_data], &forced_data, sizeof(forced_data) );
1063                                 ++periodicdata.num_data;
1064                         }
1065                         while( periodicdata.num_data < MAX_BATCH_SIZE && xQueueReceive(data_queue, &periodicdata.data[periodicdata.num_data], 1 ) ) {
1066                                 ++periodicdata.num_data;
1067                         }
1068 
1069                         pack_batched_periodic_data(&data_batched, &periodicdata);
1070 
1071                         data_batched.has_uptime_in_second = true;
1072                         data_batched.uptime_in_second = xTaskGetTickCount() / configTICK_RATE_HZ;
1073 
1074                         if( !is_test_boot() && provisioning_mode ) {

 . . .

In this kitsune project see also `kitsune/kitsune/protobuf/provision.pb.h`.

^ Encoding Submessages

Inside a `pb_callback_t` encode callback, it may be necessary to call `pb_encode_tag()` followed by `pb_encode_submessage()` in order to write a nested message.
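
To see why a tag is written before the submessage, it may help to look at the bytes such a sequence produces. The sketch below does not use nanopb; `put_varint()` and `put_submessage()` are hypothetical helpers written for this page. They follow the protobuf wire format, in which a nested message is a length-delimited field: a tag built from the field number and wire type 2, then the body length as a varint, then the body bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a base-128 varint to out; returns the number of bytes written. */
static size_t put_varint(uint8_t *out, uint64_t value)
{
    size_t n = 0;

    while (value >= 0x80) {
        out[n++] = (uint8_t)(value | 0x80);   /* low 7 bits, continuation set */
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Write an already-encoded submessage body as field `field_number`:
 * tag (wire type 2, length-delimited), body length, body bytes.
 * The caller must provide room for body_len + 20 bytes in out. */
static size_t put_submessage(uint8_t *out, uint32_t field_number,
                             const uint8_t *body, size_t body_len)
{
    assert(out != NULL && (body != NULL || body_len == 0));

    size_t n = put_varint(out, ((uint64_t)field_number << 3) | 2u);
    n += put_varint(out + n, body_len);
    if (body_len > 0) {
        memcpy(out + n, body, body_len);
    }
    return n + body_len;
}
```

For instance, an inner message whose encoded bytes are `08 96 01` (field 1, varint value 150), wrapped as field 3 of an outer message, comes out as `1a 03 08 96 01`.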

^ Length Prefixing

One way to send large data sets via protobuf is to break them into smaller pieces and apply a protobuf definition that gives these pieces a meaning both sender and receiver can understand. See one Mr. Eli's article on this strategy:
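
As a rough illustration of the chunking idea, here is a minimal sketch that splits a buffer into pieces and writes each piece with a varint length in front of it, so a receiver can find the piece boundaries. This is an assumption about what such framing could look like, not the method from the article; `frame_chunks()`, `count_frames()` and the varint helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a base-128 varint to out; returns the number of bytes written. */
static size_t put_varint(uint8_t *out, uint64_t value)
{
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)(value | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Decode a varint from in[0..len); returns bytes consumed, 0 on error. */
static size_t get_varint(const uint8_t *in, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        result |= (uint64_t)(in[i] & 0x7f) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}

/* Split data into chunks of at most chunk_max bytes and write each one as
 * [varint length][bytes]. Returns bytes written, or 0 if out is too small. */
static size_t frame_chunks(const uint8_t *data, size_t data_len,
                           size_t chunk_max, uint8_t *out, size_t out_cap)
{
    assert(chunk_max > 0);
    size_t written = 0;

    while (data_len > 0) {
        size_t n = data_len < chunk_max ? data_len : chunk_max;
        if (out_cap - written < n + 10) {      /* 10 = worst-case varint */
            return 0;
        }
        written += put_varint(out + written, n);
        memcpy(out + written, data, n);
        written += n;
        data += n;
        data_len -= n;
    }
    return written;
}

/* Receiver side: walk the frames and count them; -1 on malformed input. */
static int count_frames(const uint8_t *in, size_t len)
{
    int frames = 0;

    while (len > 0) {
        uint64_t n;
        size_t used = get_varint(in, len, &n);
        if (used == 0 || n > len - used) {
            return -1;
        }
        in += used + (size_t)n;
        len -= used + (size_t)n;
        frames++;
    }
    return frames;
}
```

In a real protocol each chunk would more likely be a small protobuf message of its own, for example carrying a sequence number alongside the bytes; the length prefix only tells the receiver where one piece ends and the next begins.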

^ References To Sort

Protobuf references. This is a somewhat arbitrary starting point, yet it introduces some key topics of the Protobuf standard and its use cases:

JSON supported data types:


First Protobuf .proto file, which compiles using `protoc-c`, part of a package available with Ubuntu 20.04:

// syntax = "proto3";
syntax = "proto2";

// Notes:
// $ protoc-c --c_out=. ./first.proto

message sensorUpdates {
  required int32 message_id = 1;
  optional float vrms = 2;
}

. . . It appears that the integer values assigned to message elements are tantamount to key names in JSON.
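
One way to check that impression is to encode the `sensorUpdates` message by hand and look at the bytes. The sketch below is not output from `protoc-c`; `encode_sensor_updates()` and `put_varint()` are hypothetical helpers that follow the protobuf wire format, and the float handling assumes IEEE 754 single precision. Each field is preceded by a tag computed as `(field_number << 3) | wire_type`, so the numbers 1 and 2 from the .proto file appear on the wire while the names `message_id` and `vrms` do not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WIRE_VARINT = 0, WIRE_FIXED32 = 5 };

/* Append a base-128 varint to out; returns the number of bytes written. */
static size_t put_varint(uint8_t *out, uint64_t value)
{
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)(value | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Hand-rolled encoder for:
 *   required int32 message_id = 1;
 *   optional float vrms       = 2;
 * The caller must provide at least 16 bytes in out. */
static size_t encode_sensor_updates(uint8_t *out, int32_t message_id,
                                    int has_vrms, float vrms)
{
    assert(out != NULL);
    size_t n = 0;

    /* Field 1: tag, then the value as a varint (sign-extended to 64 bits). */
    n += put_varint(out + n, (1u << 3) | WIRE_VARINT);
    n += put_varint(out + n, (uint64_t)(int64_t)message_id);

    /* Field 2: written only when present; four bytes, little-endian. */
    if (has_vrms) {
        uint32_t bits;
        memcpy(&bits, &vrms, sizeof bits);     /* assumes IEEE 754 float */
        n += put_varint(out + n, (2u << 3) | WIRE_FIXED32);
        for (int i = 0; i < 4; i++) {
            out[n++] = (uint8_t)(bits >> (8 * i));
        }
    }
    return n;
}
```

With `message_id` 7 and `vrms` 1.5 this gives `08 07 15 00 00 c0 3f`: `08` is the tag for field 1 and `15` the tag for field 2. Renaming a field leaves these bytes unchanged, whereas renumbering it changes the tag, which is why field numbers must stay stable once software is released.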