
A MicroZed UDP Server for Waveform Centroiding: 3.3

3.3: The main.c and echo.c Files: Part 2

After all of the initialization and configuration described in the previous section, the code enters the main while (Error == 0) loop that represents the heart of our program. The line

   /* Receive packets */
   xemacif_input(echo_netif);

keeps our application polling for received packets and hands each one to the lwIP stack. And as we already discussed, whenever a packet is bound for 192.168.1.10:7, our handy recv_callback function is called. This callback is the real meat and potatoes of our program. So what does it do?

3.3.1: Receiving Packets in recv_callback

Take a look at the last two arguments of our callback function: (struct ip_addr *addr, u16_t port). These give us the IP address and port that the inbound packet came from. The first thing we do in our function is save these to global variables:

   RemotePort = port;
   RemoteAddr = *addr;

Why do we do this? Well, the programs we're going to be using on our PC to transfer the index array and waveform data to the Zynq don't guarantee a fixed source port when they send UDP packets. Each time we send a new waveform, the IP should remain the same, but the port won't (you could write a more sophisticated program that binds to a fixed port, but we're not going to). So what we're doing is recording the source port so we can send data back to it when we have a centroid result ready. Notice we also save the protocol control block so we can use that as well:

   send_pcb = *upcb;

The workhorse of UDP reception is the pbuf, or packet buffer structure. You can take a look at its definition in GetCentroid_bsp/ps7_cortexa9_0/libsrc/lwip141_v1_7/src/lwip-1.4.1/src/include/lwip/pbuf.h to see all of the elements it entails. The two that we'll be using most are the payload length and the payload itself. The way we'll typically use them is like this:

   EthBytesReceived = p->len;
   memcpy(OurBuffer, (u32*)p->payload, EthBytesReceived);

The first line determines how many bytes are in the UDP payload. The second line copies that payload to a buffer of our choosing. Note that p->payload is a pointer to void, so we are free to choose what type of array we copy it to without fear of compiler warning. I simply made up a fictitious array called OurBuffer. It could be an array of u8's, u16's, or whatever else we want. We're just copying bytes.
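
One caveat worth sketching: in lwIP a packet can be stored as a chain of pbufs, where len is the size of the current buffer and tot_len is the size of the whole chain. The snippet above only copies the first buffer, which is fine when the payload fits in one pbuf. The sketch below shows how a chain could be walked. It uses a simplified stand-in struct (mock_pbuf) and a made-up helper (copy_chain) so it compiles without lwIP; neither name comes from the project. In real code, lwIP's own pbuf_copy_partial() does this job.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for lwIP's struct pbuf (illustrative only). */
struct mock_pbuf {
    struct mock_pbuf *next;   /* next buffer in the chain, or NULL */
    void *payload;            /* data held by this buffer */
    size_t tot_len;           /* bytes in this buffer plus all later ones */
    size_t len;               /* bytes in this buffer only */
};

/* Copy up to dst_size bytes from the whole chain into dst.
 * Returns the number of bytes actually copied. */
static size_t copy_chain(const struct mock_pbuf *p, void *dst, size_t dst_size)
{
    size_t copied = 0;

    for (; p != NULL && copied < dst_size; p = p->next) {
        size_t n = p->len;
        if (n > dst_size - copied) {
            n = dst_size - copied;      /* never write past the end of dst */
        }
        memcpy((unsigned char *)dst + copied, p->payload, n);
        copied += n;
    }
    return copied;
}
```

Capping the copy at dst_size also protects OurBuffer if a packet turns out to be longer than expected.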


One crucial thing to note is that the amount of data you can receive in one packet is limited. With a standard 1500-byte Ethernet MTU, a single unfragmented UDP datagram carries at most 1472 bytes of payload (1500 minus the 20-byte IPv4 header and the 8-byte UDP header), so you have to be careful when you're handling the incoming data or you could end up receiving just a part of what you expect. For instance, if you expect to fill an array of 2048 bytes, you'll have to account for the fact that it will arrive in at least two chunks. The lwIP code has the option of working with jumbo frames, but I never messed around with those.
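
To make the 2048-byte example concrete, here is a minimal sketch of one way to collect a message that arrives in several packets. Everything in it (EXPECTED_BYTES, ChunkAccumulator, accum_reset, accum_add) is hypothetical and not part of the project code; it also assumes the chunks arrive in order and that none are lost, which UDP does not guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the project: collects a fixed-size
 * message that arrives as several smaller UDP payloads. */
#define EXPECTED_BYTES 2048

typedef struct {
    unsigned char data[EXPECTED_BYTES];
    size_t filled;                      /* bytes collected so far */
} ChunkAccumulator;

static void accum_reset(ChunkAccumulator *acc)
{
    acc->filled = 0;
}

/* Append one payload. Returns 1 when the message is complete,
 * 0 when more data is needed, -1 if the chunk would overflow. */
static int accum_add(ChunkAccumulator *acc, const void *payload, size_t len)
{
    if (len > EXPECTED_BYTES - acc->filled) {
        return -1;
    }
    memcpy(acc->data + acc->filled, payload, len);
    acc->filled += len;
    return acc->filled == EXPECTED_BYTES;
}
```

Inside a receive callback you would call something like accum_add(&acc, p->payload, p->len) on every packet and only act on the data once it returns 1.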

3.3.1.1: The First UDP Packet Expected: Our Index Array

I'll admit at the outset that the way I wrote the code to receive the incoming data is a bit hokey. Instead of being smart and detecting what type of data is contained in the incoming packet, we assume they arrive in order. So the very first packet that our program expects is the 256-element array of fp_data_t indices. Since each element is 4 bytes, that's 1024 bytes total.


To see this, note that the variable IndArrDone is initialized to 0. Therefore, on the first packet that is received, we enter the conditional if (IndArrDone == 0){...} to handle receiving the index array and writing it to the PL. We first copy the payload to an array of u8's called IndArr. We then point our int* pointer to the start of the array and use that as an argument to the function WriteIndArr().


Let's have a look at the code in that function, because it may be a bit confusing at first. I know it was for me.

   // Declare some pointers to populate the templates and matrices
   int *pbufptr1 = pbufptr;
   int *pbufptr2 = pbufptr + 1;

   // Fill in the cyclically partitioned index array
   for (int i=0; i<WAVE_SIZE_BYTES/4; i++){
      ww1 = XGetcentroid_Write_IndArr_0_V_Words(&GetCentroid, i, pbufptr1, 1);
      ww2 = XGetcentroid_Write_IndArr_1_V_Words(&GetCentroid, i, pbufptr2, 1);
      if ((ww1 != 1) || (ww2 !=1)) return -1;
      pbufptr1 += 2; pbufptr2 += 2;
   }

Remember how we partitioned the index array in HLS so that we could have two parallel data channels operating at the same time? Well, now we have to make sure that we split our index array properly into those two channels (or sub-arrays). The first channel needs to contain the elements i=0,2,4,...N-2 and the second needs to have the elements i=1,3,5,...N-1, where N=256.


If you take a look at xgetcentroid.h, you'll see that there are two separate functions we use to populate the two arrays! The first function is XGetcentroid_Write_IndArr_0_V_Words() and the second is XGetcentroid_Write_IndArr_1_V_Words(). The above is an example of one way you can use these functions and some pointer magic to properly fill up your cyclically partitioned array. We use pbufptr1 to point to the even elements and pbufptr2 to point to the odd ones.
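
If the even/odd split is still fuzzy, it may help to see it without the driver calls. The sketch below does the same de-interleaving into two ordinary arrays; split_cyclic, bank0 and bank1 are names I made up for illustration, with the two banks standing in for the two partitioned block RAMs. It uses array indexing rather than the stepped pointers, but picks out exactly the same elements.

```c
#include <stddef.h>

/* Plain-C stand-in for the two driver writes. bank0 receives elements
 * 0, 2, 4, ... and bank1 receives elements 1, 3, 5, ... of src.
 * n is the total number of elements in src and is assumed to be even. */
static void split_cyclic(const int *src, size_t n, int *bank0, int *bank1)
{
    for (size_t i = 0; i < n / 2; i++) {
        bank0[i] = src[2 * i];        /* what pbufptr1 points at on pass i */
        bank1[i] = src[2 * i + 1];    /* what pbufptr2 points at on pass i */
    }
}
```

With N=256 each bank ends up with 128 entries, which matches the loop bound of WAVE_SIZE_BYTES/4 if WAVE_SIZE_BYTES is 512.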


At the low level, what we are doing when we call these functions is writing our index array to block RAM in the PL. Once we write those values, they persist across runs of GetCentroid. This is particularly important since it means we don't have to waste time sending the index array every time we run the algorithm.


Encoding the Fixed Point Index Array
You may be asking, "Why are we sending arrays of ints when the index array is of type fp_data_t?" The short answer is: it doesn't matter. A byte is a byte. We just have to make sure that we're sending the right four bytes with each call to the XGetcentroid_Write_IndArr functions.


However, later on you may notice that we are actually encoding our fixed point numbers as integers when we send them from the PC. There is a good reason for this. If it's not clear to you, or you're confused by fixed point vs. floating point and how to convert between the two, I highly recommend reading this explanation by Mahdi Shabany. You'll see we use the methods he describes in our C program that sends the index array, as well as the readback function shown below.


Reading Back the Index Array
If you want to verify that the index array was properly saved to memory, make sure that you've got a #define READ_BACK_INDEX_ARRAY statement in includes.h. The code that will be executed if this is defined does the same thing as above, except it reads the arrays instead of writing them. Let's take a look at the code since it reveals something about how we're storing our numbers:

#ifdef READ_BACK_INDEX_ARRAY
   // Declare some temporary holder arrays
   int dummy1[WAVE_SIZE_BYTES/2];
   int dummy2[WAVE_SIZE_BYTES/2];

   // Read the index array back
   ww1 = XGetcentroid_Read_IndArr_0_V_Words(&GetCentroid, 0, &dummy1[0], WAVE_SIZE_BYTES/4);
   ww2 = XGetcentroid_Read_IndArr_1_V_Words(&GetCentroid, 0, &dummy2[0], WAVE_SIZE_BYTES/4);
   if ((ww1 != WAVE_SIZE_BYTES/4) || (ww2 != WAVE_SIZE_BYTES/4)) return -1;
   for (int i=0; i<WAVE_SIZE_BYTES/4; ++i){
     printf("Index Array %d = %16.8f\n", NUMCHANNELS*i, ((float)dummy1[i])/BITDIV);
     printf("Index Array %d = %16.8f\n", NUMCHANNELS*i+1, ((float)dummy2[i])/BITDIV);
   }
#endif

You can see that we're reading the data using the XGetcentroid_Read_IndArr_0_V_Words() and XGetcentroid_Read_IndArr_1_V_Words() functions into two temporary arrays. We then simply cast each 4-byte integer to a float and divide by BITDIV=256 to convert our fixed point value to a floating point value. Again, if this doesn't make sense to you, read this document.
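
Here is a minimal sketch of the conversion in both directions. It assumes the fixed point format has 8 fractional bits, which is what dividing by BITDIV=256 implies; the function names float_to_fixed and fixed_to_float are mine, not the project's, and the rounding choice on the encode side is an assumption too.

```c
#include <stdint.h>

#define BITDIV 256   /* 2^8, i.e. 8 fractional bits (value taken from the text) */

/* Encode: scale by 2^8 and round to the nearest integer. */
static int32_t float_to_fixed(float x)
{
    float scaled = x * BITDIV;
    return (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Decode: the same cast-and-divide used in the readback printf calls. */
static float fixed_to_float(int32_t raw)
{
    return (float)raw / BITDIV;
}
```

For example, 1.5 is sent as the integer 384, and reading 384 back gives 384/256 = 1.5. The smallest step this format can represent is 1/256 = 0.00390625.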

3.3.1.2: The Rest of the UDP Packets: DMA'ing the Waveform Data

At the end of the last conditional, we set IndArrDone=1 so that the remaining UDP packets will be handled with the code that comes after. Every subsequent packet is assumed to contain a 256-element waveform consisting of unsigned shorts. That's a total of 512 bytes.


What we want to do with those unsigned shorts is DMA them to GetCentroid as an HLS stream. But we have to make sure that the logic in our GetCentroid core is ready. We do that by checking the state of some global variables: the same ones that we talked about toggling in the interrupt handlers.

   // Send data to the IP core slave
   TimeOutCntr = RESET_TIMEOUT_COUNTER;
   while (DMA_TX_Busy == 1){
     TimeOutCntr--;
     if (TimeOutCntr == 0){
       xil_printf("Error in waiting for DMA\n\r");
       Error = 1;
       break;
     }
   }

You should recognize DMA_TX_Busy from the AxiDMA interrupt handler. Once a DMA transfer completes, our interrupt handler will toggle this variable so that we know the DMA engine is ready to handle another transfer. If the variable doesn't toggle back in a certain amount of time, we'll halt execution and declare an error.
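
Since the same wait-with-timeout pattern shows up more than once, it could be factored into a small helper. The sketch below is one way to do that; wait_until_clear and dma_tx_busy_flag are hypothetical names standing in for the project's variables. Note the volatile qualifier: I'm assuming the real flags are declared that way, because without it an optimizing compiler is allowed to read the flag once and never notice the interrupt handler changing it.

```c
/* Hypothetical stand-in for DMA_TX_Busy. volatile forces the compiler
 * to re-read the flag on every pass, since an interrupt handler may
 * change it at any time. */
static volatile int dma_tx_busy_flag;

/* Poll *flag until it reads 0.
 * Returns 0 on success, or -1 if max_polls polls pass first. */
static int wait_until_clear(volatile int *flag, unsigned long max_polls)
{
    while (*flag != 0) {
        if (max_polls == 0) {
            return -1;      /* timed out: caller should report an error */
        }
        max_polls--;
    }
    return 0;
}
```

The caller would set Error = 1 and print a message when the helper returns -1, just as the inline loop does.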


You may be asking why we're not just polling the function XAxiDma_Busy(). The answer is that, by using interrupts, we can do other things—like handling incoming UDP packets—while the DMA transfer is ongoing. That's really the true intention of DMA: to free up the processor to handle other business while large data transfers occur. In our example, we're doing a small data transfer. But you could easily envision a scenario where we have a larger data volume. In that case, we'd be able to start a transfer and then go back and start receiving packets for the next transfer.


The next block of code is very similar to the last one. But instead of polling DMA_TX_Busy, we poll the variable GetCentroidReady. GetCentroidReady is toggled when we get an ap_ready interrupt signaling that the GetCentroid core is ready to receive another burst of data.


Finally, we get to the lines

   // DMA Should be ready, so send transfer and reset wait flags
   GetCentroidReady = 0;
   DMA_TX_Busy = 1;
   Xil_DCacheFlushRange((u32)WaveformArr, WAVE_SIZE_BYTES);
   status = XAxiDma_SimpleTransfer(&axiDMA, (u32)&WaveformArr[0], WAVE_SIZE_BYTES,\
      XAXIDMA_DMA_TO_DEVICE);
   if (status != XST_SUCCESS){
     xil_printf("Error with DMA transfer to Device\n\r");
     Error = 1;
   }

where the DMA magic actually happens. The first thing we do is set the global variables back to indicate that the DMA engine is busy and the algorithm isn't ready for incoming data yet. The next thing we do is VERY IMPORTANT.


We must flush the cache or we'll never get new data into our algorithm. EVER. The DMA engine reads directly from memory and knows nothing about the processor's data cache, so if we don't flush, the new contents of WaveformArr may still be sitting in the cache and the DMA will just send the same stale data over and over again. I have been frustrated to no end when working with DMA transfers in the past because I was too dumb to flush the cache. Don't be like me.


The last thing is our function call to XAxiDma_SimpleTransfer(). We need to point the DMA core to the starting location in memory, &WaveformArr[0], and tell it how many bytes we want to send, WAVE_SIZE_BYTES. We also need to indicate that it's a transfer from the processor to the IP core with the last argument. After we make this call, the processor will set up the DMA transfer and let it go. As soon as that is done, we can get back to receiving packets in the main loop and checking to see if we have a result to send back to the PC.

3.3.1.3: Sending the GetCentroid Result Back Over UDP

Now that we've jumped back to the main while() loop, we will check the SendResults boolean to see if there is a result to send. Remember that this variable is the one we set in our GetCentroid Interrupt Service Routine when we get an ap_done interrupt. The latter means there is a new, valid value happily sitting in its register. If SendResults is equal to 1, we do the following:

   // Read the results from the FPGA
   Centroid = XGetcentroid_Get_Centroid_V(&GetCentroid);

   // Send out the centroid result over UDP
   psnd = pbuf_alloc(PBUF_TRANSPORT, sizeof(int), PBUF_REF);
   psnd->payload = &Centroid;
   udpsenderr = udp_sendto(&send_pcb, psnd, &RemoteAddr, RemotePort);
   if (udpsenderr != ERR_OK){
     xil_printf("UDP Send failed with Error %d\n\r", udpsenderr);
     goto ErrorOrDone;
   }
   pbuf_free(psnd);

The first function call is another one defined in xgetcentroid.h: XGetcentroid_Get_Centroid_V(). This simply reads the register that contains the calculated centroid value.


Once we've got that value, we create a new pbuf called psnd and stuff the payload with our newly calculated value. Then we use the udp_sendto() function to send the value back to the IP/port that sent us the waveform data in the first place.
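
On the PC side, the result shows up as a 4-byte UDP payload. The sketch below shows one way a receiver might turn those bytes back into an integer. It assumes the bytes arrive least significant byte first, which is what you get when a little-endian processor sends the raw contents of an int; decode_le32 is a name I made up. Whether the integer then needs any fixed point scaling depends on the type of the centroid output, which isn't covered here.

```c
#include <stdint.h>

/* Rebuild a signed 32-bit value from 4 payload bytes, least significant
 * byte first (assumed byte order of the sender). */
static int32_t decode_le32(const unsigned char b[4])
{
    uint32_t u = (uint32_t)b[0]
               | ((uint32_t)b[1] << 8)
               | ((uint32_t)b[2] << 16)
               | ((uint32_t)b[3] << 24);
    return (int32_t)u;
}
```

Assembling the value byte by byte like this gives the same answer no matter what the byte order of the receiving PC is.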


So that's pretty much it in the way of explaining how the program works. If you're interested in some of the underlying files in the project, I encourage you to explore some more. It never hurts! Now we'll move on to how we load the program with SDK and then send our data from the PC to the MicroZed over Ethernet.


