Difference between revisions of "Protobuf notes"
Revision as of 04:16, 12 October 2024
Overview
There's a lot going on with Protobuf, so this local Neela Nurseries page was started to capture some links to online docs and tutorials.
A good, more detailed introduction is the following documentation page, one of a full collection of documentation. It contains example C code to "get the size of the message without storing it anywhere":
Another reference with possibly good detail:
Encoding details of protobuf "on the wire":
^ Terminology and Elements Of
Protobuf has several features and "moving parts". Two of these, which we'll loop back to and write a bit more about, are field names and field numbers. Both are important identifiers expressed in the .proto message definition files used at build time by senders and receivers of protobuf-formatted messages. Field numbers are what identify each field in the binary encoding, while field names appear in the generated source code and in text formats such as JSON. Because both are fixed at build time, changing them after software has been built and released can, and likely will, cause message interpretation errors: older software sending and receiving the updated messages does not understand the newer message format.
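To make the role of field numbers concrete, here is a minimal sketch of how the binary encoding combines a field number with a wire type into the key that precedes each value. The helper names `make_key`, `key_field_number` and `key_wire_type` are hypothetical, made up for this note, and are not part of nanopb or any protobuf library.

```c
#include <assert.h>
#include <stdint.h>

/* Wire types defined by the protobuf binary encoding. */
enum wire_type {
    WT_VARINT = 0,  /* int32, int64, uint32, bool, enum */
    WT_64BIT  = 1,  /* fixed64, sfixed64, double */
    WT_LEN    = 2,  /* string, bytes, nested messages */
    WT_32BIT  = 5   /* fixed32, sfixed32, float */
};

/* Hypothetical helper: build the key that precedes a field's value. */
static uint32_t make_key(uint32_t field_number, enum wire_type wt)
{
    /* Valid field numbers are 1 .. 2^29 - 1. */
    assert(field_number >= 1 && field_number <= 536870911u);
    return (field_number << 3) | (uint32_t)wt;
}

/* Hypothetical helpers: split a key back into its two parts. */
static uint32_t key_field_number(uint32_t key) { return key >> 3; }
static uint32_t key_wire_type(uint32_t key)    { return key & 0x7u; }
```

For example, `make_key(2, WT_32BIT)` gives 0x15. A receiver built from a .proto file in which field 2 means something else would still decode that key as "field 2", which is why renumbering fields after release breaks interpretation. On the wire the key is itself written as a varint; for field numbers below 16 it fits in one byte.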
Protobuf has a means of handling cases where field names and numbers need to be "removed": the message definitions are changed to mark those defunct identifiers as reserved. Reserved names and numbers are taken "out of circulation" and cannot be re-used, so different names and numbers must be chosen for any new fields. Read further at:
^ Factoring Protobuf Message Definitions
The following online guide speaks to defining multiple related protobuf messages in a single .proto file:
^ Nanopb
2022-01-08 Saturday
- https://github.com/nanopb/nanopb/blob/master/generator/proto/nanopb.proto
- https://jpa.kapsi.fi/nanopb/docs/whats_new.html
- https://jpa.kapsi.fi/nanopb/docs/
CMake script to locate Nanopb headers and sources:
^ Protobuf C Code Examples
When compiling the nanopb Protobuf library as part of C language programs, nested Protobuf messages can require use of the nanopb-defined struct type `pb_callback_t`, which holds an encode or decode function pointer plus a user argument, in order to encode and decode those nested messages. Some examples of this on github:
In the first example an early in-file instance of `pb_callback_t` occurs on line 56. Looking further, this project has a few dozen protoc-generated files . . . switching to a possibly smaller project:
In kitsune project, looking at:
(1) file kitsune/kitsune/audio_features_upload_task.c function setup_protbuf( . . . )
(2) file audio_features_upload_task_helpers.c function encode_repeated_streaming_bytes_and_mark_done(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
(3) in same file reviewing function write_streams(pb_ostream_t *stream, const pb_field_t *field,hlo_stream_t * hlo_stream)
Here is an excerpt from proto_utils.c (lines 147 to 156 of that file) which appears to contain an encode callback of the kind assigned to a `pb_callback_t`, followed by the next routine in the file, which references the first routine in a function pointer assignment:

    bool encode_device_id_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg) {
        //char are twice the size, extra 1 for null terminator
        char hex_device_id[2*DEVICE_ID_SZ+1] = {0};
        if(!get_device_id(hex_device_id, sizeof(hex_device_id)))
        {
            return false;
        }

        return pb_encode_tag_for_field(stream, field) && pb_encode_string(stream, (uint8_t*)hex_device_id, strlen(hex_device_id));
    }

    void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper)
    {
        if(NULL == batched || NULL == encode_wrapper)
        {
            LOGE("null param\n");
            return;
        }

        batched->data.funcs.encode = encode_all_periodic_data; // This is smart :D
        batched->data.arg = encode_wrapper;
        batched->firmware_version = KIT_VER;
        batched->device_id.funcs.encode = encode_device_id_string;
    }

A search for calls to `pack_batched_periodic_data()`:

    $ grep -nr pack_batched_periodic_data ./*
    ./commands.c:1069: pack_batched_periodic_data(&data_batched, &periodicdata);
    ./proto_utils.c:158:void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper)
    ./proto_utils.h:29:void pack_batched_periodic_data(batched_periodic_data* batched, periodic_data_to_encode* encode_wrapper);
Tracing further back, the kitsune project's commands.c has the following routine, which declares and uses a variable of type `periodic_data`:
    1038 void thread_tx(void* unused) {
    1039     batched_periodic_data data_batched = {0};
    1040 #ifdef UPLOAD_AP_INFO
    1041     batched_periodic_data_wifi_access_point ap;
    1042 #endif
    1043     periodic_data forced_data;
    1044     bool got_forced_data = false;
    1045
    1046     LOGI(" Start polling \n");
    1047     while (1) {
    1048         if (uxQueueMessagesWaiting(data_queue) >= data_queue_batch_size
    1049             || got_forced_data ) {
    1050             LOGI( "sending data\n" );
    1051
    1052             periodic_data_to_encode periodicdata;
    1053             periodicdata.num_data = 0;
    1054             periodicdata.data = (periodic_data*)pvPortMalloc(MAX_BATCH_SIZE*sizeof(periodic_data));
    1055
    1056             if( !periodicdata.data ) {
    1057                 LOGI( "failed to alloc periodicdata\n" );
    1058                 vTaskDelay(1000);
    1059                 continue;
    1060             }
    1061             if( got_forced_data ) {
    1062                 memcpy( &periodicdata.data[periodicdata.num_data], &forced_data, sizeof(forced_data) );
    1063                 ++periodicdata.num_data;
    1064             }
    1065             while( periodicdata.num_data < MAX_BATCH_SIZE && xQueueReceive(data_queue, &periodicdata.data[periodicdata.num_data], 1 ) ) {
    1066                 ++periodicdata.num_data;
    1067             }
    1068
    1069             pack_batched_periodic_data(&data_batched, &periodicdata);
    1070
    1071             data_batched.has_uptime_in_second = true;
    1072             data_batched.uptime_in_second = xTaskGetTickCount() / configTICK_RATE_HZ;
    1073
    1074             if( !is_test_boot() && provisioning_mode ) {
    . . .
In this kitsune project see also `kitsune/kitsune/protobuf/provision.pb.h`.
^ Encoding Submessages
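- https://groups.google.com/g/nanopb/c/OT4Kw3Siuio
Within a `pb_callback_t` encode function it may be necessary to call `pb_encode_tag()` followed by `pb_encode_submessage()`.
- https://github.com/nanopb/nanopb/issues/331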
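As a rough picture of what a tag followed by a submessage looks like on the wire, here is a hand-rolled sketch that does not use nanopb at all. It writes the key for a length-delimited field (wire type 2), then the submessage length as a varint, then the submessage bytes. The helper names `put_varint` and `put_submessage` are hypothetical, and this is only an illustration of the wire layout under those assumptions, not nanopb's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write v as a base-128 varint, return bytes written. */
static size_t put_varint(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);  /* low 7 bits, continuation bit set */
        v >>= 7;
    }
    out[n++] = (uint8_t)v;               /* last byte, continuation bit clear */
    return n;
}

/*
 * Hypothetical helper: embed an already-encoded submessage as field
 * `field_number` of a parent message. Returns the number of bytes
 * written, or 0 if the output buffer is too small.
 */
static size_t put_submessage(uint8_t *out, size_t cap, uint32_t field_number,
                             const uint8_t *sub, size_t sub_len)
{
    uint8_t hdr[20];  /* key varint (<= 5 bytes) + length varint (<= 10 bytes) */
    size_t n;

    assert(out != NULL && (sub != NULL || sub_len == 0));

    n  = put_varint(hdr, ((uint64_t)field_number << 3) | 2u);  /* wire type 2 */
    n += put_varint(hdr + n, sub_len);                         /* length prefix */
    if (n + sub_len > cap) {
        return 0;
    }
    memcpy(out, hdr, n);
    if (sub_len > 0) {
        memcpy(out + n, sub, sub_len);
    }
    return n + sub_len;
}
```

Embedding the three-byte submessage `08 96 01` as field 3 yields `1A 03 08 96 01`: key, length, then payload. In nanopb the tag call and `pb_encode_submessage()` take care of these steps, with the library computing the submessage size itself.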
^ Length Prefixing
One way to send large data sets via protobuf is to break them into smaller pieces and apply a protobuf definition that gives these pieces a meaning both sender and receiver can understand. See one Mr. Eli's article on this strategy:
^ References To Sort
Protobuf references. This is a somewhat arbitrary starting point, yet these introduce some key topics of the Protobuf standard and its use cases:
- https://www.crankuptheamps.com/blog/posts/2017/10/12/protobuf-battle-of-the-syntaxes/
- https://www.educative.io/edpresso/what-is-the-difference-between-protocol-buffers-and-json
JSON supported data types:
First Protobuf .proto file, compiles using `protoc-c`, part of a package available with Ubuntu 20.04:
    // syntax = "proto3";
    syntax = "proto2";

    // Notes:
    // $ protoc-c --c_out=. ./first.proto

    message sensorUpdates {
        required int32 message_id = 1;
        optional float vrms = 2;
    }
. . . It appears that the integer values assigned to message elements are tantamount to key names in JSON.
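To see those integer values acting as keys, the following sketch hand-encodes the `sensorUpdates` message above without any protobuf library. The function name `encode_sensor_updates` and the helper `put_varint` are hypothetical, written for this note. Real projects would use the code generated by `protoc-c` or nanopb instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write v as a base-128 varint, return bytes written. */
static size_t put_varint(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);  /* low 7 bits, continuation bit set */
        v >>= 7;
    }
    out[n++] = (uint8_t)v;               /* last byte, continuation bit clear */
    return n;
}

/*
 * Hand-encode sensorUpdates { message_id, vrms } into out.
 * out must hold at least 16 bytes. Returns the number of bytes written.
 */
static size_t encode_sensor_updates(uint8_t *out, int32_t message_id, float vrms)
{
    uint32_t bits;
    size_t n = 0;

    assert(out != NULL);

    /* Field 1, wire type 0 (varint): key = (1 << 3) | 0 = 0x08. */
    out[n++] = 0x08;
    /* An int32 is sign-extended to 64 bits before varint encoding. */
    n += put_varint(out + n, (uint64_t)(int64_t)message_id);

    /* Field 2, wire type 5 (32-bit): key = (2 << 3) | 5 = 0x15. */
    out[n++] = 0x15;
    memcpy(&bits, &vrms, sizeof bits);   /* IEEE 754 bit pattern of the float */
    out[n++] = (uint8_t)(bits);          /* value is stored little-endian */
    out[n++] = (uint8_t)(bits >> 8);
    out[n++] = (uint8_t)(bits >> 16);
    out[n++] = (uint8_t)(bits >> 24);
    return n;
}
```

With `message_id = 150` and `vrms = 1.5f` this produces the eight bytes `08 96 01 15 00 00 C0 3F`. The names `message_id` and `vrms` never appear in the output; only the field numbers 1 and 2 do, packed into the key bytes 0x08 and 0x15.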