Pseudo Tunnel

Intro

Let's assume that we have an embedded device (a soil moisture IoT sensor) that routinely sends data directly to an internal server inside a private network. There is no strict requirement that every transmission succeeds: the IoT sensor sends a log entry 6 times a day, and the server needs only 3 successful transmissions per day to execute its business logic. Loss is tolerated as long as it does not push the success rate below the required 50%.

Because the IoT sensor needs to be ultra-low-power and the data is transmitted via the relatively power-hungry LTE network, the engineers conclude that sending 6 short, single UDP packets daily is more efficient than opening even a single (once per day) bidirectional HTTP(S) session only to send an 8-bit diagnostic value.

On the public (DMZ) server side, the engineers want to avoid in-kernel packet manipulation such as destination NAT (port forwarding with iptables) and prefer to use a (containerised) cloud application to process the packet so that it is eventually delivered to an internal node.

Plain-text transmission is not an issue for now, but some form of obfuscation is desirable. There is also a future requirement that the communication be switched to AES-128, AES-192 or AES-256 encryption in the coming months, when the European Cyber Resilience Act finally kicks in.
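
To make the difference between obfuscation and encryption concrete, here is a minimal sketch of a repeating-key XOR applied to a payload buffer. It is an assumption for illustration only: the function name `xor_obfuscate` and the key handling are hypothetical, it is not part of PacketCord or CORD-CRYPTO, and it provides no real confidentiality.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical repeating-key XOR obfuscation.
 * This is NOT encryption and NOT a PacketCord API; it only hides the
 * payload from casual inspection. */
void xor_obfuscate(uint8_t *data, size_t len,
                   const uint8_t *key, size_t key_len)
{
    if (key_len == 0)
        return; /* no key material, leave the data untouched */

    for (size_t i = 0; i < len; i++)
        data[i] ^= key[i % key_len];
}
```

Because XOR is its own inverse, calling the same function again with the same key on the receiving side restores the original bytes.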

It sounds like the most effective solution would be a VPN-like technology plus a trick to avoid port forwarding, but this combination is a bit different from anything that already exists.

You are probably familiar with the fact that many VPNs encapsulate their traffic over UDP. The connection between the server and the client (or the two network nodes in the case of a point-to-point tunnel) is bidirectional, and the encapsulated inner protocol determines whether the traffic is connectionless (in the case of UDP) or connection-oriented (in the case of TCP).

Below we demonstrate a simple use case of the PacketCord library: a unidirectional UDP-encapsulated tunnel from the IoT device to the cloud, with on-demand (hardware-accelerated) encryption capability and without the need for any additional port-forwarding configuration.

Setup and topology

For the embedded client side, we are going to use the Toradex Verdin iMX8M Mini as our IoT sensor. It is installed on the open-source Tangenta motherboard. For the public server side, we are going to use a cloud VPS (virtual private server) running Ubuntu. Because the final destination of the network packet, as per the (imaginary) requirements above, is an internal server, we are going to emulate it with a Linux network namespace on the VPS. Let's start with the topology diagram, which describes our setup with all the necessary details:

Figure 1: Topology diagram.

Encapsulation

To send a packet from the Torizon container (Point B) to the internal server namespace (Point A), we can use the following packet encapsulation. Upon reception on the VPS interface, we decapsulate the outer headers and pass the inner packet to the IP stack, so that it can apply the standard routing logic and deliver the UDP packet to the target destination.

Figure 2: Pseudo Tunnel Encapsulation.

We set the same source IP address (172.18.0.2) in both the outer and inner IP headers; in our case this is the IP of the Docker (Torizon) container itself. The outer header source will be rewritten by routers and other network appliances that apply NAT along the way to the cloud server.
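
The nesting described above can be sketched as a small offset calculation. This is a standalone illustration that assumes IPv4 headers without options (20 bytes) and standard 8-byte UDP headers; the type `tunnel_layout_t` and the function `tunnel_layout` are hypothetical names, not part of the PacketCord API.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed header sizes: IPv4 without options (IHL = 5) and UDP. */
enum { IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8 };

/* Byte offsets of each layer inside one pseudo-tunnel packet.
 * Illustrative names only; not a PacketCord structure. */
typedef struct {
    size_t outer_ip;
    size_t outer_udp;
    size_t inner_ip;
    size_t inner_udp;
    size_t payload;
    size_t total_len;
} tunnel_layout_t;

tunnel_layout_t tunnel_layout(size_t payload_len)
{
    tunnel_layout_t l;

    l.outer_ip  = 0;                           /* packet starts with the outer IP header */
    l.outer_udp = l.outer_ip  + IPV4_HDR_LEN;
    l.inner_ip  = l.outer_udp + UDP_HDR_LEN;   /* the outer UDP payload is a whole IP packet */
    l.inner_udp = l.inner_ip  + IPV4_HDR_LEN;
    l.payload   = l.inner_udp + UDP_HDR_LEN;
    l.total_len = l.payload   + payload_len;
    return l;
}
```

For a 32-byte payload such as the one used in this tutorial, the headers add 56 bytes of overhead, so the whole packet is 88 bytes long.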

We can encrypt any part of the payload or inner headers according to our needs from the code (by using the CORD-CRYPTO library).

Repository

We will use the example from the PacketCord.io official repo. We only need to modify the IP addresses according to our needs, for both the client and server sides.

Creating the Torizon Project

Let's create a new project inside the Torizon IDE.

Now, for the sake of brevity, delete the content of CMakeLists.txt so that it becomes an empty file. Also, delete main.cpp and create an empty file called l3_pseudo_tunnel_client_main.c in its place (the CMakeLists.txt below expects it under src/).

Further down in the tutorial we provide the content for both files, with the necessary modifications (you only need to adjust the IP addresses to match your setup).

Building the example

To build the application, we will rely on CMake. We don't need any manual clone or download of the repository - this will be done by the CMake build system. For that purpose, populate the CMakeLists.txt file with the below content:

cmake_minimum_required(VERSION 3.16)

project(tunnel VERSION 1.0 LANGUAGES CXX C)

# Set C standard to C23 (required by PacketCord.io)
set(CMAKE_C_STANDARD 23)
set(CMAKE_C_STANDARD_REQUIRED ON)
set(CMAKE_C_EXTENSIONS OFF)

# Build type
if(NOT CMAKE_BUILD_TYPE)
    set(CMAKE_BUILD_TYPE Release)
endif()

# Compiler flags
set(CMAKE_C_FLAGS_DEBUG "-g -O0 -Wall -Wextra -DDEBUG")
set(CMAKE_C_FLAGS_RELEASE "-O3 -DNDEBUG")
set(CMAKE_CXX_FLAGS_DEBUG "-g -O0 -Wall -Wextra -DDEBUG")
set(CMAKE_CXX_FLAGS_RELEASE "-O3 -DNDEBUG")

# Fetch PacketCord.io from GitHub
include(FetchContent)

# Get PacketCord.io source but don't add subdirectories automatically
FetchContent_Declare(
    packetcord
    GIT_REPOSITORY https://github.com/packetcord/packetcord.io.git
    GIT_TAG        main
)

FetchContent_GetProperties(packetcord)
if(NOT packetcord_POPULATED)
    FetchContent_Populate(packetcord)
    # Add only cord-craft module (includes injector functionality)
    add_subdirectory(${packetcord_SOURCE_DIR}/modules/cord-craft ${packetcord_BINARY_DIR}/modules/cord-craft)
endif()

# Configure PacketCord.io with proper Linux definitions
if(TARGET cord_craft)
    target_compile_definitions(cord_craft PRIVATE
        _GNU_SOURCE
        __USE_MISC
        _DEFAULT_SOURCE
    )
endif()

# Set PacketCord.io variables
set(PACKETCORD_INCLUDE_DIRS 
    ${packetcord_SOURCE_DIR}/modules/cord-craft/include
)
set(PACKETCORD_LIBRARIES cord_craft)

# Create output directories
file(MAKE_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
file(MAKE_DIRECTORY ${CMAKE_BINARY_DIR}/lib)

# Create executable
add_executable(pseudotunnel src/l3_pseudo_tunnel_client_main.c)

# Include directories
target_include_directories(pseudotunnel PRIVATE
    ${CMAKE_CURRENT_SOURCE_DIR}/includes
)

# Include PacketCord.io headers
target_include_directories(pseudotunnel PRIVATE 
    ${PACKETCORD_INCLUDE_DIRS}
)

# Link libraries
target_link_libraries(pseudotunnel PRIVATE
    ${PACKETCORD_LIBRARIES}
)

# System-specific definitions for raw socket access
if(CMAKE_SYSTEM_NAME STREQUAL "Linux")
    target_compile_definitions(pseudotunnel PRIVATE
        _GNU_SOURCE
        __USE_MISC
    )
endif()

# Set target properties
set_target_properties(pseudotunnel PROPERTIES
    RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin"
    ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
    LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib"
)

# Install target
install(TARGETS pseudotunnel
    RUNTIME DESTINATION bin
)

# Override default clean to only remove pseudotunnel executable
set_target_properties(pseudotunnel PROPERTIES
    ADDITIONAL_CLEAN_FILES "${CMAKE_BINARY_DIR}/bin/pseudotunnel"
)

# Complete clean target
add_custom_target(distclean
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_BINARY_DIR}/CMakeFiles
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_BINARY_DIR}/bin  
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_BINARY_DIR}/lib
    COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_BINARY_DIR}/_deps
    COMMAND ${CMAKE_COMMAND} -E remove -f ${CMAKE_BINARY_DIR}/CMakeCache.txt
    COMMAND ${CMAKE_COMMAND} -E remove -f ${CMAKE_BINARY_DIR}/cmake_install.cmake
    COMMAND ${CMAKE_COMMAND} -E remove -f ${CMAKE_BINARY_DIR}/Makefile
    COMMENT "Complete clean - removes all build files and dependencies"
)

# Print configuration summary
message(STATUS "PacketCord.io include directories: ${PACKETCORD_INCLUDE_DIRS}")
message(STATUS "PacketCord.io libraries: ${PACKETCORD_LIBRARIES}")
message(STATUS "Build type: ${CMAKE_BUILD_TYPE}")
message(STATUS "C compiler: ${CMAKE_C_COMPILER}")
message(STATUS "CXX compiler: ${CMAKE_CXX_COMPILER}")

Code

Client side

#include <cord_craft/injector/cord_l3_stack_injector.h>
#include <cord_craft/protocols/cord_protocols.h>
#include <cord_craft/cord_retval.h>

#define MTU_SIZE            1420

#define OUTER_SOURCE_IP     "172.18.0.2"
#define OUTER_DEST_IP       "38.242.203.214"
#define OUTER_SOURCE_PORT   60000
#define OUTER_DEST_PORT     50000

#define INNER_SOURCE_IP     "172.18.0.2"
#define INNER_DEST_IP       "11.11.11.100"
#define INNER_SOURCE_PORT   1234
#define INNER_DEST_PORT     8765

#define PAYLOAD             "Hello from Toradex Verdin SoM!!!"
#define PAYLOAD_LEN         32

static struct
{
    CordInjector *l3_ci;
} cord_app_context;

static void cord_app_setup(void)
{
    CORD_LOG("[CordApp] No additional setup needed.\n");
}

static void cord_app_cleanup(void)
{
    CORD_LOG("[CordApp] Destroying all objects!\n");
    CORD_DESTROY_INJECTOR(cord_app_context.l3_ci);
}

int main()
{
    cord_retval_t cord_retval;
    uint8_t buffer[MTU_SIZE];
    size_t tx_bytes = 0;

    cord_app_context.l3_ci = CORD_CREATE_L3_STACK_INJECTOR('I');

    for (uint16_t n = 0; n < MTU_SIZE; n++) buffer[n] = 0x00;

    // Calculate payload and packet sizes
    const int inner_payload_len = PAYLOAD_LEN;
    const int inner_total_len = sizeof(cord_ipv4_hdr_t) + sizeof(cord_udp_hdr_t) + inner_payload_len;
    const int outer_payload_len = inner_total_len;
    const int outer_total_len = sizeof(cord_ipv4_hdr_t) + sizeof(cord_udp_hdr_t) + outer_payload_len;

    // Layer offsets using cord-craft structures
    cord_ipv4_hdr_t *outer_ip = (cord_ipv4_hdr_t *) buffer;
    cord_udp_hdr_t *outer_udp = (cord_udp_hdr_t *) (buffer + sizeof(cord_ipv4_hdr_t));
    uint8_t *encapsulated = buffer + sizeof(cord_ipv4_hdr_t) + sizeof(cord_udp_hdr_t);

    // Inner IP/UDP setup
    cord_ipv4_hdr_t *inner_ip = (cord_ipv4_hdr_t *) encapsulated;
    cord_udp_hdr_t *inner_udp = (cord_udp_hdr_t *) (encapsulated + sizeof(cord_ipv4_hdr_t));
    char *inner_payload = (char *) (encapsulated + sizeof(cord_ipv4_hdr_t) + sizeof(cord_udp_hdr_t));

    // Copy payload
    for (uint8_t n = 0; n < PAYLOAD_LEN; n++) inner_payload[n] = PAYLOAD[n];

    // Build inner IP header using cord-craft structure
    inner_ip->version = 4;
    inner_ip->ihl = 5; // Base header
    inner_ip->tos = 0;
    inner_ip->tot_len = cord_htons(inner_total_len);
    inner_ip->id = cord_htons(123);
    inner_ip->frag_off = 0;
    inner_ip->ttl = 32;
    inner_ip->protocol = CORD_IPPROTO_UDP;
    inner_ip->saddr.addr = inet_addr(INNER_SOURCE_IP);
    inner_ip->daddr.addr = inet_addr(INNER_DEST_IP);
    inner_ip->check = 0;
    inner_ip->check = cord_htons(cord_ipv4_checksum(inner_ip));

    // Build inner UDP header using cord-craft structure
    inner_udp->source = cord_htons(INNER_SOURCE_PORT);
    inner_udp->dest = cord_htons(INNER_DEST_PORT);
    inner_udp->len = cord_htons(sizeof(cord_udp_hdr_t) + inner_payload_len);
    inner_udp->check = 0;
    inner_udp->check = cord_htons(cord_udp_checksum_ipv4(inner_ip));

    // Build outer IP header using cord-craft structure
    outer_ip->version = 4;
    outer_ip->ihl = 5; // Base header as well
    outer_ip->tos = 0;
    outer_ip->tot_len = cord_htons(outer_total_len);
    outer_ip->id = cord_htons(321);
    outer_ip->frag_off = 0;
    outer_ip->ttl = 32;
    outer_ip->protocol = CORD_IPPROTO_UDP;
    outer_ip->saddr.addr = inet_addr(OUTER_SOURCE_IP);
    outer_ip->daddr.addr = inet_addr(OUTER_DEST_IP);
    outer_ip->check = 0;
    outer_ip->check = cord_htons(cord_ipv4_checksum(outer_ip));

    // Build outer UDP header using cord-craft structure
    outer_udp->source = cord_htons(OUTER_SOURCE_PORT);
    outer_udp->dest = cord_htons(OUTER_DEST_PORT);
    outer_udp->len = cord_htons(sizeof(cord_udp_hdr_t) + outer_payload_len);
    outer_udp->check = 0;
    outer_udp->check = cord_htons(cord_udp_checksum_ipv4(outer_ip));

    cord_ipv4_hdr_t *outer_ip_hdr = cord_get_ipv4_hdr_l3(buffer);
    CORD_L3_STACK_INJECTOR_SET_TARGET_IPV4(cord_app_context.l3_ci, cord_get_ipv4_dst_addr_l3(outer_ip_hdr));

    CORD_LOG("[CordApp] Transmitting the crafted pseudo-tunnel packet...\n");
    cord_retval = CORD_INJECTOR_TX(cord_app_context.l3_ci, buffer, cord_get_ipv4_total_length_ntohs(outer_ip_hdr), &tx_bytes);
    if (cord_retval != CORD_OK)
    {
        // Handle the error
    }

    cord_app_cleanup();

    return CORD_OK;
}
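
The client above zeroes each `check` field before computing the checksum, because the field itself is part of the summed data. For readers who want to see what such a calculation involves, here is a standalone sketch of the Internet checksum (RFC 1071) as used for the IPv4 header. The function name `inet_checksum` is hypothetical, and this is not the cord-craft implementation, whose internals and return byte order are not covered here.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over `len` bytes, returned in host order.
 * Illustrative helper; not the cord-craft implementation. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Sum the data as big-endian 16-bit words. */
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);

    /* Pad a trailing odd byte with zero. */
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

Running the same function over a header that already contains a correct checksum yields 0, which is how a receiver can verify it. The UDP checksum uses the same arithmetic but additionally covers a pseudo-header with the source and destination addresses.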

Server side

#include <cord_flow/event_handler/cord_linux_api_event_handler.h>
#include <cord_flow/flow_point/cord_l2_raw_socket_flow_point.h>
#include <cord_flow/flow_point/cord_l3_stack_inject_flow_point.h>
#include <cord_flow/flow_point/cord_l4_udp_flow_point.h>
#include <cord_flow/memory/cord_memory.h>
#include <cord_flow/match/cord_match.h>
#include <cord_error.h>

#define MTU_SIZE 1420
#define ETHERNET_HEADER_SIZE 14
#define DOT1Q_TAG_SIZE 4

#define BUFFER_SIZE (MTU_SIZE + ETHERNET_HEADER_SIZE)

static struct
{
    CordFlowPoint *l3_si;
    CordFlowPoint *l4_udp;
    CordEventHandler *evh;
} cord_app_context;

static void cord_app_setup(void)
{
    CORD_LOG("[CordApp] Expecting manual additional setup - blackhole routes, interface MTU.\n");
}

static void cord_app_cleanup(void)
{
    CORD_LOG("[CordApp] Destroying all objects!\n");
    CORD_DESTROY_FLOW_POINT(cord_app_context.l3_si);
    CORD_DESTROY_FLOW_POINT(cord_app_context.l4_udp);
    CORD_DESTROY_EVENT_HANDLER(cord_app_context.evh);

    CORD_LOG("[CordApp] Expecting manual additional cleanup.\n");
}

static void cord_app_sigint_callback(int sig)
{
    cord_app_cleanup();
    CORD_LOG("[CordApp] Terminating the PacketCord Tunnel App!\n");
    CORD_ASYNC_SAFE_EXIT(CORD_OK);
}

int main(void)
{
    cord_retval_t cord_retval;
    CORD_BUFFER(buffer, BUFFER_SIZE);
    size_t rx_bytes = 0;
    size_t tx_bytes = 0;

    cord_ipv4_hdr_t *ip = NULL;
    cord_udp_hdr_t *udp = NULL;

    CORD_LOG("[CordApp] Launching the PacketCord Tunnel App!\n");

    signal(SIGINT, cord_app_sigint_callback);

    cord_app_context.l3_si  = CORD_CREATE_L3_STACK_INJECT_FLOW_POINT('I');
    cord_app_context.l4_udp = CORD_CREATE_L4_UDP_FLOW_POINT('A', inet_addr("38.242.203.214"), inet_addr("78.83.207.86"), 50000, 60000);

    cord_app_context.evh = CORD_CREATE_LINUX_API_EVENT_HANDLER('E', -1);

    cord_retval = CORD_EVENT_HANDLER_REGISTER_FLOW_POINT(cord_app_context.evh, cord_app_context.l4_udp);

    while (1)
    {
        int nb_fds = CORD_EVENT_HANDLER_WAIT(cord_app_context.evh);

        if (nb_fds == -1)
        {
            if (errno == EINTR)
                continue;
            else
            {
                CORD_ERROR("[CordApp] Error: CORD_EVENT_HANDLER_WAIT()");
                CORD_EXIT(CORD_ERR);
            }
        }

        for (uint8_t n = 0; n < nb_fds; n++)
        {
            if (cord_app_context.evh->events[n].data.fd == cord_app_context.l4_udp->io_handle)
            {
                cord_retval = CORD_FLOW_POINT_RX(cord_app_context.l4_udp, buffer, BUFFER_SIZE, &rx_bytes);
                if (cord_retval != CORD_OK)
                    continue; // UDP flow point receive error

                cord_ipv4_hdr_t *ip_inner = cord_get_ipv4_hdr_l3(buffer);

                if (rx_bytes != cord_get_ipv4_total_length_ntohs(ip_inner))
                    continue; // Packet partially received

                if (!cord_match_ipv4_version(ip_inner))
                    continue;

                int ip_inner_hdrlen = cord_get_ipv4_header_length(ip_inner);

                CORD_L3_STACK_INJECT_FLOW_POINT_SET_TARGET_IPV4(cord_app_context.l3_si, cord_get_ipv4_dst_addr_l3(ip_inner));

                cord_retval = CORD_FLOW_POINT_TX(cord_app_context.l3_si, buffer, cord_get_ipv4_total_length_ntohs(ip_inner), &tx_bytes);
                if (cord_retval != CORD_OK)
                {
                    // Handle the error
                }
            }
        }
    }

    cord_app_cleanup();

    return CORD_OK;
}
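
The receive loop above treats the UDP payload as a complete inner IPv4 packet and checks its version and total length before injecting it into the stack. The sketch below shows the same idea on a raw buffer without any PacketCord calls. The function name `inner_ipv4_is_complete` is hypothetical, and the minimum-length and header-length checks are additions assumed here for safety rather than taken from the server code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when `buf` holds exactly one complete IPv4 packet.
 * Standalone sketch of the checks in the receive loop; the name is
 * illustrative and not part of PacketCord. */
bool inner_ipv4_is_complete(const uint8_t *buf, size_t rx_bytes)
{
    if (rx_bytes < 20)
        return false;                                  /* shorter than a minimal header */

    uint8_t version = buf[0] >> 4;
    size_t  ihl     = (size_t)(buf[0] & 0x0F) * 4;     /* header length in bytes */
    size_t  tot_len = ((size_t)buf[2] << 8) | buf[3];  /* big-endian total length */

    if (version != 4)
        return false;                                  /* not IPv4 */
    if (ihl < 20 || ihl > tot_len)
        return false;                                  /* inconsistent header length */

    return tot_len == rx_bytes;                        /* reject partial or padded data */
}
```

Only packets that pass such a check should be handed to the stack injector; anything else can be dropped silently, which fits the loss-tolerant design of this use case.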

Configuration

Client side

Adjust the OUTER_SOURCE_IP and INNER_SOURCE_IP macros inside the client code. Since this will be the only container attached to the default Docker network, its IP address is 172.18.0.2/24 (because 172.18.0.1/24 is the IP address of the host bridge).
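
When editing these address macros by hand, a typo is easy to miss: `inet_addr()` returns `INADDR_NONE` for malformed input, and the client code does not check for it. As an optional safeguard, the following sketch validates a dotted-quad string with the POSIX `inet_pton()` call. The helper name `parse_ipv4` is hypothetical and not part of PacketCord.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Parses a dotted-quad string into a network-order IPv4 address.
 * Returns false for malformed input. Illustrative helper only. */
bool parse_ipv4(const char *text, uint32_t *out_addr)
{
    struct in_addr addr;

    /* inet_pton() returns 1 only for a valid address in this family. */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return false;

    *out_addr = addr.s_addr;
    return true;
}
```

Calling such a helper on each macro at start-up and aborting on failure would catch a mistyped address before any packet is crafted.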

Then, try to build the project via Terminal > Run Task > run-container-torizon-release-arm64 from the Torizon IDE.

Server side

# 1. Create the namespace
ip netns add ns1

# 2. Create the veth pair
ip link add veth0 type veth peer name veth1

# 3. Move veth1 to namespace ns1
ip link set veth1 netns ns1

# 4. Assign IP address to veth0 (host side)
ip addr add 11.11.11.1/24 dev veth0

# 5. Bring up veth0
ip link set veth0 up

# 6. Inside the namespace, assign IP to veth1
ip netns exec ns1 ip addr add 11.11.11.100/24 dev veth1

# 7. Bring up veth1 inside namespace
ip netns exec ns1 ip link set veth1 up

# 8. Bring up the loopback interface inside namespace
ip netns exec ns1 ip link set lo up

# 9. Set default route inside namespace via veth0's IP
ip netns exec ns1 ip route add default via 11.11.11.1

# 10. Disable the offload functionality on the vethX interfaces
ethtool --offload veth0 rx off tx off
ip netns exec ns1 ethtool --offload veth1 rx off tx off

# 11. Start netcat as UDP server
ip netns exec ns1 nc -n -u -l -p 8765 -s 11.11.11.100 -v

Compile the server side code and run it (in a separate SSH session):

mkdir build
cd build/
cmake ..
make
sudo ./apps/l3_pseudo_tunnel/l3_pseudo_tunnel_server

If no errors have occurred, the console output should look something like this:

[CordApp] Launching the PacketCord Pseudo Tunnel App!
[CordL4UdpFlowPoint] Successfully bound to port 50000

Result

Re-run the project via Terminal > Run Task > run-container-torizon-release-arm64 from the Torizon IDE and observe the netcat terminal (stdout) on the VPS: it shows the payload received through the pseudo-tunnel application.

# ip netns exec ns1 nc -n -u -l -p 8765 -s 11.11.11.100 -v
Bound on 11.11.11.100 8765
Connection received on 172.18.0.2 1234
Hello from Toradex Verdin SoM!!!

Outro

We have demonstrated a different, extremely simple tunneling method that is Docker-friendly on the server side and can be implemented on an MPU or MCU in a few lines of code, without the need for tunnel interfaces or complicated configuration.

Note: Although the example runs on Linux, the approach could also be applied on bare-metal microcontrollers at buffer level (for example, Ethernet attached over SPI, with the buffer populated manually to craft the pseudo-tunnel packet), without any IP stack. It is also worth trying on the Cortex-M4 core available inside the i.MX 8 SoCs (especially the model with dual Ethernet interfaces), both with and without lwIP.