Stupid RFNoC Tricks: Loopback

One of the most useful applications for the RFNoC architecture is loopback: receiving a signal, processing it somehow, and retransmitting it. Unfortunately, RFNoC doesn’t support loopback out of the box. This post details how to modify RFNoC (and your applications) to make loopback work.

This post assumes considerable familiarity with RFNoC, CHDR, Verilog, C++, UHD, and GNU Radio. Here are two good presentations covering the basics of RFNoC:

Martin Braun, “It’s The RFNoC™ Life, For Us”, GNU Radio Conference 2016

RFNoC™ Deep Dive: FPGA Side, Wireless at Virginia Tech 2014

The rest of those subjects, well, that’s why we get paid.

One additional caveat: RFNoC is in flux and its implementation is subject to change. It’s probable that loopback support will be added to a future revision of UHD. This approach works as of 4/22/2017, but you should check it against the current state of UHD before relying on it.

Update 2/23/18: This has been tested and works as of rfnoc-devel commit ec9138eb. Thanks to Adam Kurisko for the report.

Step 1: Timestamping

In ordinary use, the RFNoC Radio block’s RX component, which is responsible for collecting samples from the ADC and packetizing them, also adds a timestamp to the CHDR header at the start of every packet. The timestamp is used by the host to align samples from multiple streams. The problem is that in TX mode the same timestamp is used to queue transmissions to start at a predetermined time. In a loopback, the RX and TX Radio share the same clock, so a timestamp applied at receive time is by definition already in the past by the time the packet gets to the TX Radio block, and the packet will be marked late and rejected. So, let’s get rid of, or modify, the timestamp. Three approaches present themselves:

  1. Prevent the RX Radio from adding the timestamp,
  2. Strip the timestamp out somewhere before the TX Radio, or
  3. Add a constant offset to the timestamp somewhere before the TX Radio
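To make the problem concrete, here is a minimal Python sketch of the lateness check described above. The function names and the exact comparison are assumptions for illustration only; the real check lives in the TX Radio's FPGA logic and may differ in detail.

```python
def tx_radio_accepts(has_time, packet_time, radio_time):
    """Model of the TX Radio's decision: untimed packets always go out,
    timed packets only if their timestamp has not already passed.
    (The >= comparison is an assumption, not taken from the FPGA source.)"""
    if not has_time:
        return True
    return packet_time >= radio_time


def loopback_packet_ok(rx_time, pipeline_latency, has_time=True, offset=0):
    """rx_time: tick at which the RX Radio stamped the packet.
    pipeline_latency: ticks the packet spends getting to the TX Radio.
    offset: constant added to the timestamp on the way (approach 3)."""
    arrival_time = rx_time + pipeline_latency
    return tx_radio_accepts(has_time, rx_time + offset, arrival_time)
```

With an unmodified timestamp, `loopback_packet_ok(1000, 50)` is false for any nonzero latency. Clearing `has_time` (approaches 1 and 2) or adding an offset larger than the latency (approach 3) makes it true.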

Approach 1: Disable Timestamping

Approach #1 is the easiest to implement. There’s a register in the RX Radio block to disable timestamping. You can find the register definition in fpga-src/usrp3/lib/radio/radio_core_regs.vh:

localparam [7:0] SR_RX_CTRL_OUTPUT_FORMAT = 158;

The register is instantiated in fpga-src/usrp3/lib/radio/rx_control_gen3.v:

 setting_reg #(.my_addr(SR_RX_CTRL_OUTPUT_FORMAT), .width(1), .at_reset(1'b1)) sr_output_format (
 .clk(clk),.rst(reset),.strobe(set_stb),.addr(set_addr),
 .in(set_data),.out(use_timestamps),.changed());

…and the resulting use_timestamps register is used when constructing the RX Radio CHDR header:

 wire [127:0] rx_header = {2'b00, use_timestamps, eob, seqnum_cnt, 16'h0 /* len ignored */, sid, start_time};

At this point the solution is clear: just write a zero to that register. If you just want to hardcode it and wash your hands of the problem, you can change the at_reset parameter of the settings register from 1'b1 to 1'b0. Then the block will default to not including timestamps. You could also just force use_timestamps to zero and ignore the settings register entirely. But there’s no need to rebuild the FPGA image if we write to that register from the host machine at initialization, so let’s explore that avenue next.

If you look in uhd/host/lib/rfnoc/radio_ctrl_impl.hpp, you’ll see a bunch of useful-looking registers defined:

static const uint32_t RX_CTRL_HALT = 155;
static const uint32_t RX_CTRL_MAXLEN = 156;
static const uint32_t RX_CTRL_CLEAR_CMDS = 157;
static const uint32_t MISC_OUTS = 160;
static const uint32_t DACSYNC = 161;
static const uint32_t SPI = 168;

…but, notably, register 158 (the register defined in radio_core_regs.vh above) is absent (at least, as of this writing). So, let’s add it:

static const uint32_t RX_CTRL_OUTPUT_FORMAT = 158;

Now let’s go into uhd/host/lib/rfnoc/radio_ctrl_impl.cpp and add a command to write to that register when we set up the RX stream. Go into radio_ctrl_impl::issue_stream_cmd() and add the following line just before issuing the actual stream command:

sr_write(regs::RX_CTRL_OUTPUT_FORMAT, boost::uint32_t(0), chan);

Great. Now we’ve disabled timestamping. You can verify this by looking at the data on the wire using Wireshark.
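When checking a capture, it helps to decode the first 64-bit word of a CHDR packet by hand. The sketch below follows the field order of the rx_header concatenation shown earlier (2-bit packet type, has_time, eob, 12-bit sequence number, 16-bit length, 32-bit SID). The function name is made up for this example, and it assumes you have already assembled the header into a single integer; the byte order on the wire depends on the transport and is not handled here.

```python
def parse_chdr_header(word):
    """Split the first 64-bit CHDR word into its fields."""
    return {
        "pkt_type": (word >> 62) & 0x3,    # bits 63:62
        "has_time": (word >> 61) & 0x1,    # bit 61
        "eob":      (word >> 60) & 0x1,    # bit 60
        "seqnum":   (word >> 48) & 0xFFF,  # bits 59:48
        "length":   (word >> 32) & 0xFFFF, # bits 47:32
        "sid":      word & 0xFFFFFFFF,     # bits 31:0
    }


# Example word with has_time set, seqnum 0x123, length 0xFA8
fields = parse_chdr_header(0x21230FA800010002)
```

If the register write took effect, `has_time` should read 0 on every data packet leaving the RX Radio.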

If you want to get fancier at this point, you could wire in the option to control timestamping into UHD’s property tree or stream arguments so that it’s controllable by client software. I’m lazy, so I’ll leave that as an exercise for the reader.

Approach 2: Strip Out Timestamps

Let’s say you have a custom RFNoC signal-processing block, and you’re handling CHDR headers yourself anyway. Well, while you’re at it, you can always strip out the timestamps that the RX Radio block added. This has the advantage of not requiring any host code changes. Ettus added a handy block called “cvita_hdr_modify” which allows easy mangling of the deserialized CHDR headers that the axi_wrapper places in tuser for you. The block is purely combinational and doesn’t require any clocks or state. It lets you specify which header fields you want to mangle, and which you want to leave alone and copy from the incoming header. If you’re handling headers yourself, you always need to mangle src_sid and dst_sid, so that the NoC crossbar knows where packets from your block come from and where they are going. This is handled automatically for you if you use the axi_wrapper with SIMPLE_MODE(1), but you can’t always do that (for instance, if your block has multiple outputs, or modifies the length of incoming packets). So in addition to mangling src_sid and dst_sid, we’ll also mangle has_time:

 cvita_hdr_modify cvita_hdr_modify_data (
 .header_in(m_axis_data_tuser),
 .header_out(s_axis_data_tuser),
 .use_pkt_type(1'b0), .pkt_type(),
 .use_has_time(1'b1), .has_time(1'b0),
 .use_eob(1'b0), .eob(),
 .use_seqnum(1'b0), .seqnum(),
 .use_length(1'b0), .length(),
 .use_src_sid(1'b1), .src_sid(src_sid),
 .use_dst_sid(1'b1), .dst_sid(next_dst_sid),
 .use_vita_time(1'b0), .vita_time());

Now the axi_wrapper will clear the has_time bit from the outgoing CHDR header. As a bonus, it will also correctly set the packet length to compensate for no longer having a timestamp.
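If the port list above is hard to read, the behavior can be modeled in a few lines of Python: every field whose use_* input is high is replaced, and everything else is copied from the incoming header. This is only a behavioral sketch; the ChdrHeader class and hdr_modify function are hypothetical names for this example, not part of UHD or the FPGA source.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ChdrHeader:
    pkt_type: int
    has_time: bool
    eob: bool
    seqnum: int
    length: int
    src_sid: int
    dst_sid: int
    vita_time: int


def hdr_modify(header, **overrides):
    """Return a copy of the header with only the named fields replaced,
    mirroring the use_<field> / <field> input pairs of cvita_hdr_modify."""
    return replace(header, **overrides)


# Header as it arrives from the RX Radio (values are arbitrary examples)
rx = ChdrHeader(pkt_type=0, has_time=True, eob=False, seqnum=7,
                length=16, src_sid=0x0001, dst_sid=0x0002, vita_time=123456)

# Same overrides as the instantiation above: has_time, src_sid, dst_sid
tx = hdr_modify(rx, has_time=False, src_sid=0x0002, dst_sid=0x0003)
```

Fields that are not overridden, such as seqnum and eob, pass through unchanged.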

Approach 3: Adding a fixed delay

Approach #3 is useful if you want to maintain a constant RX->TX delay time for some reason. The goal here is to modify the incoming timestamps by adding a fixed delay, so that the TX Radio block hangs onto the packets until the current time matches the incoming timestamp. There are a couple of caveats here:

  1. If the TX Radio receives a timestamp which is in the past when the packet arrives, it will drop the packet because it is late.
  2. If you add too much delay time, the packets will stack up in the buffers between the RX Radio and TX Radio until they overflow, which will stop the RX Radio from streaming.

How much buffering you have depends on lots of things. You can use the DMA FIFO block to add a considerable amount of SDRAM-backed buffering in the event that you need a lot of delay. You might also just consider running at a lower sample rate, if possible, to reduce your buffering requirement for a given delay.
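As a rough feel for the tradeoff, the sketch below estimates the largest delay a given amount of buffering can absorb. It assumes 4 bytes per complex sample (sc16) and ignores CHDR header overhead, so treat the result as an upper bound; the function name and parameters are invented for this example.

```python
def max_delay_ticks(buffer_bytes, decimation=1, bytes_per_sample=4):
    """Upper bound on the delay (in radio clock ticks) that fits in
    buffer_bytes of storage between the RX and TX Radio.

    decimation is the ratio of the tick rate to the stream's sample
    rate, so each buffered sample covers that many ticks."""
    samples = buffer_bytes // bytes_per_sample
    return samples * decimation
```

For example, 8192 bytes of buffering holds 2048 samples: 2048 ticks of delay at full rate, or 8192 ticks if the stream is decimated by 4.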

So let’s go over the approach. It looks a whole lot like Approach #2, because we’re modifying the CHDR header similarly:

localparam [63:0] DELAYTIME = 1000;
wire [63:0] future_vita_time = m_axis_data_tuser[63:0] + DELAYTIME;
cvita_hdr_modify cvita_hdr_modify_data (
 .header_in(m_axis_data_tuser),
 .header_out(s_axis_data_tuser),
 .use_pkt_type(1'b0), .pkt_type(),
 .use_has_time(1'b0), .has_time(),
 .use_eob(1'b0), .eob(),
 .use_seqnum(1'b0), .seqnum(),
 .use_length(1'b0), .length(),
 .use_src_sid(1'b1), .src_sid(src_sid),
 .use_dst_sid(1'b1), .dst_sid(next_dst_sid),
 .use_vita_time(1'b1), .vita_time(future_vita_time));

Here you see that we declare a delay time, and then add it to the extracted time (the lower 64 bits) from the incoming header. If you want to get fancy and make it changeable at runtime, you could use a settings register:

 wire [31:0] delaytime;
 setting_reg #(
 .my_addr(SR_DELAYTIME), .awidth(8), .width(32))
 sr_delaytime (
 .clk(ce_clk), .rst(ce_rst),
 .strobe(set_stb[0]), .addr(set_addr[0]), .in(set_data[0]), .out(delaytime), .changed());

This limits you to 32 bits of delay, but since you definitely don’t have 32 bits of samples worth of buffering, this is fine. Some thinking about this approach reveals a caveat: you can never reduce the delay at runtime, because the packets already in the queue would take enough time to clear that by the time the first packet with the reduced delay arrives, it will be too late. Creative approaches to solving this problem might involve dropping incoming packets until you “catch up”.
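One way to reason about the “catch up” idea: when the delay shrinks, the backlog already queued for the TX Radio is too long by the difference between the old and new delay, so at least that many ticks' worth of incoming packets has to be discarded before new packets can arrive on time. The sketch below is a back-of-the-envelope calculation under that assumption, not something taken from UHD; the function name is hypothetical, and a real implementation would want extra margin.

```python
def packets_to_drop(old_delay, new_delay, ticks_per_packet):
    """Minimum number of whole packets to discard so the queue shrinks
    by (old_delay - new_delay) ticks. Increasing the delay needs no drops."""
    if new_delay >= old_delay:
        return 0
    shortfall = old_delay - new_delay
    # Integer ceiling division: a partial packet still costs a whole drop
    return -(-shortfall // ticks_per_packet)
```

For instance, going from a delay of 1000 ticks to 400 with packets that span 364 ticks each means dropping at least 2 packets.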

Remember: the VITA time is in full-rate clock ticks (200 Msps for X310, radio rate for E310). So if you’re running at full rate (no decimation), the VITA time is in samples: DELAYTIME=1000 means you will be storing 1000 samples in buffers between RX and TX. If you’re operating on a decimated stream, remember to adjust accordingly.
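The conversion is simple enough to do by hand, but here is a small sketch for picking a DELAYTIME value from a delay in seconds. The helper names are made up for this example, and the tick rate you pass in must match your device.

```python
def delay_ticks(delay_seconds, tick_rate_hz):
    """Convert a desired RX->TX delay into radio clock ticks."""
    return round(delay_seconds * tick_rate_hz)


def buffered_samples(ticks, decimation=1):
    """Samples sitting in buffers for a given delay, where decimation is
    the ratio of the tick rate to the stream's sample rate."""
    return ticks // decimation
```

At 200 MHz, a 5 microsecond delay is `delay_ticks(5e-6, 200e6)`, which gives 1000 ticks: 1000 buffered samples at full rate, or 250 on a stream decimated by 4.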

Now let’s move on to Step 2: enabling the streamer.

Step 2: Enable Streamer

This step is all on the host side, and exactly what it involves depends on whether you’re using GNU Radio with gr-ettus or not. Either way, the point here is that you need to explicitly tell UHD that the RX Radio is an active streamer. This step is usually taken care of automatically in the process of wiring up an rx_streamer, which controls the flow of data from the RFNoC device back into the host. Since there’s no data flowing back into the host in the loopback configuration, we have to explicitly tell UHD that we want to activate the RX streamer.

If you’re using bare UHD (no GNU Radio), this is as simple as:

usrp->set_rx_streamer(true, chan);

…where “chan” is the channel number of the RX Radio you’re using (usually 0).

If you’re using GNU Radio and gr-ettus, some changes are necessary. gr-ettus doesn’t expose the set_rx_streamer() method of the underlying UHD device object, so we need to modify gr-ettus’s API in order to get a hold of that method. In gr-ettus/include/ettus/rfnoc_radio.h, you’ll see a bunch of methods declared:

virtual void set_rx_gain(const double gain, const size_t chan) = 0;
virtual void set_tx_antenna(const std::string &ant, const size_t chan) = 0;
virtual void set_rx_antenna(const std::string &ant, const size_t chan) = 0;
virtual void set_tx_dc_offset(bool enable, const size_t chan) = 0;
virtual void set_tx_dc_offset(const std::complex< double > &offset, const size_t chan) = 0;

We’ll add to these:

virtual void set_tx_streamer(bool active, const size_t port) = 0;
virtual void set_rx_streamer(bool active, const size_t port) = 0;
virtual void issue_stream_cmd(const uhd::stream_cmd_t &cmd, const size_t chan=0) = 0;

Similarly, in gr-ettus/lib/rfnoc_radio_impl.h, we’ll add the same methods (without the pure virtual declaration). And finally, we’ll add their implementations to gr-ettus/lib/rfnoc_radio_impl.cc:

void rfnoc_radio_impl::set_tx_streamer(bool active, const size_t port)
{
  _radio_ctrl->set_tx_streamer(active, port);
}

void rfnoc_radio_impl::set_rx_streamer(bool active, const size_t port)
{
  _radio_ctrl->set_rx_streamer(active, port);
}

void rfnoc_radio_impl::issue_stream_cmd(const uhd::stream_cmd_t &cmd, const size_t chan)
{
  _radio_ctrl->issue_stream_cmd(cmd, chan);
}

In your GNU Radio flowgraph, after you instantiate your RX Radio block but before you start streaming, make sure to add the following:

self.uhd_rfnoc_streamer_radio_0_0.set_rx_streamer(True, chan)

…replacing “uhd_rfnoc_streamer_radio_0_0” with whatever the name of the RX Radio block is in your flowgraph.

You’ll notice we also added issue_stream_cmd() to the gr-ettus API. That brings us to Step 3.

Step 3: Issuing the Stream Command

This step is unnecessary if you’re just using UHD, since you’ll have to issue a stream command to the RX Radio anyway to get things started. Ordinarily gr-ettus takes care of issuing the stream command as part of setting up the RX streamer, but since there’s no RX streamer (remember the RX streamer handles data streams coming back from the FPGA into the host, and there is none here) we have to do it ourselves. In Step 2 we already added the necessary API call to gr-ettus to be able to handle this ourselves. So, after the call to set_rx_streamer, just put the following:

stream_cmd = uhd.stream_cmd_t(uhd.stream_cmd_t.STREAM_MODE_START_CONTINUOUS)
self.uhd_rfnoc_streamer_radio_0_0.issue_stream_cmd(stream_cmd)

…again replacing “uhd_rfnoc_streamer_radio_0_0” with the name of your RX Radio block. This will write to the relevant register of the RX Radio block and command it to begin sending samples. Because the TX Radio block is already set up to start streaming as soon as it gets a packet, this is all you need to do in order to get loopback working!

Addendum: E310 Complications

If you’re doing this on an E310, there’s a little complication due to the way the driver handles setting up the RF paths on the device. You might find that loopback appears to be working, but no RF comes out of the TX ports. This is because in the case of the E3xx, you also need to manually set TX streamers in order to convince UHD to set up the TX path at initialization time. Because we’ve already punctured the API for gr-ettus, there’s nothing else to be done except:

usrp->set_tx_streamer(true, chan);

…for pure UHD, or:

self.uhd_rfnoc_streamer_radio_2.set_tx_streamer(True, chan)

…for a GNU Radio/gr-ettus flowgraph.
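Putting Steps 2 and 3 and the E310 addendum together, the host-side startup for a gr-ettus flowgraph boils down to a short, ordered sequence of calls. The helper below is a sketch of that ordering only: start_loopback and RecordingRadio are hypothetical names, and RecordingRadio is a stand-in that merely records calls so the example runs without hardware. In a real flowgraph you would pass your radio block objects and a uhd.stream_cmd_t built as shown in Step 3.

```python
def start_loopback(rx_radio, stream_cmd, chan=0, tx_radio=None):
    """Activate the streamers, then tell the RX Radio to start sending."""
    if tx_radio is not None:
        # Only needed on E3xx devices, per the addendum above
        tx_radio.set_tx_streamer(True, chan)
    rx_radio.set_rx_streamer(True, chan)
    rx_radio.issue_stream_cmd(stream_cmd)


class RecordingRadio:
    """Stand-in for a radio block; records the calls made on it."""

    def __init__(self):
        self.calls = []

    def set_tx_streamer(self, active, chan):
        self.calls.append(("set_tx_streamer", active, chan))

    def set_rx_streamer(self, active, chan):
        self.calls.append(("set_rx_streamer", active, chan))

    def issue_stream_cmd(self, cmd, chan=0):
        self.calls.append(("issue_stream_cmd", cmd, chan))


radio = RecordingRadio()
start_loopback(radio, "START_CONTINUOUS", chan=0, tx_radio=radio)
```

The string passed as the stream command here is just a placeholder for the real command object.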