Blog Post:

Reading line-by-line from a serial port (or other byte-oriented stream)

Many .NET developers are moving away from the traditional (and broken) System.IO.Ports.SerialPort DataReceived event handling. Some adopt the correct and more efficient BaseStream.BeginRead / BaseStream.EndRead pair I promoted in my last post; others use the newer BaseStream.ReadAsync method introduced in .NET Framework 4.5 along with the C# async and await keywords. Either way, a common complaint is that BaseStream doesn’t provide any ReadLine() method. Developers then assume that each EndRead or ReadAsync will return exactly one line, and get wrong results.

For some developers, it is enough to point out that byte-oriented streams don’t preserve message boundaries, so a message can be split across multiple buffer transfers (from hardware FIFO to application). Others, whether they use serial ports, TCP sockets, or pipes, just don’t know what to do next, so they ask on StackOverflow. Because there are so many questions about fixing code that badly handles message fragmentation, I wanted to share an elegant solution to the problem.

using System;

class LineSplitter
{
    // Raised once per complete line; the byte[] includes the trailing delimiter.
    public event Action<byte[]> LineReceived;
    public byte Delimiter = (byte)'\n';
    byte[] leftover; // partial line carried over from the previous block, or null

    public void OnIncomingBinaryBlock(object sender, byte[] buffer)
    {
        int offset = 0;
        while (true)
        {
            int newlineIndex = Array.IndexOf(buffer, Delimiter, offset);
            if (newlineIndex < offset)
            {
                // No more delimiters: save any trailing bytes for the next block.
                if (buffer.Length > offset)
                    leftover = ConcatArray(leftover, buffer, offset, buffer.Length - offset);
                return;
            }
            ++newlineIndex; // step past the delimiter so it stays with its line
            byte[] full_line = ConcatArray(leftover, buffer, offset, newlineIndex - offset);
            leftover = null;
            offset = newlineIndex;
            LineReceived?.Invoke(full_line); // raise an event for further processing
        }
    }

    // Returns head followed by tailCount bytes of tail starting at tailOffset; head may be null.
    static byte[] ConcatArray(byte[] head, byte[] tail, int tailOffset, int tailCount)
    {
        byte[] result;
        if (head == null)
        {
            result = new byte[tailCount];
            Array.Copy(tail, tailOffset, result, 0, tailCount);
        }
        else
        {
            result = new byte[head.Length + tailCount];
            head.CopyTo(result, 0);
            Array.Copy(tail, tailOffset, result, head.Length, tailCount);
        }

        return result;
    }
}

The logic is quite simple. Between data blocks, the variable leftover holds an incomplete message, if any. When a new block arrives, it is split (à la String.Split) at each occurrence of the line delimiter (here, '\n', but any other single end-of-message byte can easily be substituted). The first segment of the new data is concatenated with the leftover partial message to form a complete line. Ranges between delimiters are extracted and forwarded likewise, and each forwarded line keeps its trailing delimiter. Any data following the last delimiter becomes the leftover saved for the next incoming block (appended to the old leftovers, in the case that no delimiters were found at all).

There are many cases to deal with: zero, one, or multiple delimiters found; whether the last byte is a delimiter or not; and whether there was leftover data from the last call. Instead of handling all combinations explicitly, introducing a helper method (ConcatArray) that gets called from two places greatly simplifies matters.

You may have noticed that there are no serial port or stream calls here at all. The separation of buffer processing from I/O calls is intentional, and it is an approach I very strongly recommend that you adopt in your own applications. The benefits are the usual ones associated with adhering to the Single Responsibility Principle: easier testing, lower coupling, more flexibility. For example, a class containing just this code can be inserted between various types of data sources (serial port, TCP stream, logfile replay) and the application code that logs, parses, and otherwise processes incoming lines.
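
As a rough sketch of what the I/O side can look like under this separation, the helper below reads blocks from any Stream and hands each one to a callback. StreamPump, PumpAsync, and the bufferSize parameter are names invented for this illustration, not part of any library. It also assumes the stream signals its end by returning zero bytes, which a serial port's BaseStream may not do (it may throw when the port closes instead), so real code would need its own shutdown and error handling.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper (name and shape are assumptions for illustration only).
static class StreamPump
{
    // Reads until end of stream, passing each received block to onBlock.
    public static async Task PumpAsync(Stream source, Action<byte[]> onBlock, int bufferSize = 4096)
    {
        var buffer = new byte[bufferSize];
        while (true)
        {
            // A read may return fewer bytes than requested and pays no
            // attention to line boundaries.
            int count = await source.ReadAsync(buffer, 0, buffer.Length);
            if (count == 0)
                break; // end of stream (assumed; see the note above)

            // Copy out exactly the bytes received, so the consumer may keep
            // the array while the read buffer is reused.
            var block = new byte[count];
            Array.Copy(buffer, 0, block, 0, count);
            onBlock(block);
        }
    }
}
```

With the LineSplitter signature shown earlier, this could be wired up as `StreamPump.PumpAsync(port.BaseStream, block => lineSplitter.OnIncomingBinaryBlock(null, block))`, where port is an already-open SerialPort. Swapping in a NetworkStream or a FileStream for replay requires no change to the splitter.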

I’ve also intentionally NOT followed the EventHandler pattern. It had value when it was introduced, but now the C# language supports variable capture in anonymous delegates and lambda expressions, so the sender parameter is useless. As a benefit, the event is now compatible with the Add method of a System.Collections.Generic.List<byte[]>, making unit testing very easy:

[TestMethod]
public void DetectorTestTwoLinesArrivingTogether()
{
    var result_lines = new List<byte[]>();
    var dut = new LineSplitter();
    dut.LineReceived += result_lines.Add;
    // ASCII bytes for "Hello readers!\r\nSparx is the best.\r\n"; the sender argument is unused, so pass null
    dut.OnIncomingBinaryBlock(null, new byte[] { 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x72, 0x65, 0x61, 0x64, 0x65, 0x72, 0x73, 0x21, 0x0D, 0x0A, 0x53, 0x70, 0x61, 0x72, 0x78, 0x20, 0x69, 0x73, 0x20, 0x74, 0x68, 0x65, 0x20, 0x62, 0x65, 0x73, 0x74, 0x2E, 0x0D, 0x0A });
    Assert.AreEqual(2, result_lines.Count);
}

With a little care to the parameter types of the trigger methods and events, your controller code may not need to do anything more than compose objects:

serialPortController.BinaryBlockReceived += lineSplitter.OnIncomingBinaryBlock;
lineSplitter.LineReceived += messageParser.OnIncomingLine;
messageParser.ValuesParsed += realtimePlot.PlotValues;
messageParser.ValuesParsed += UpdateLabels;
messageParser.ValuesParsed += dataLogger.WriteValues;
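
For illustration, a parser stage that fits into this chain might look like the sketch below. Only the names OnIncomingLine and ValuesParsed come from the composition above. The comma-separated ASCII payload, the double[] event argument, and the decision to ignore blank lines are all assumptions made for the example.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical parser stage; the message format is an assumption.
class MessageParser
{
    // Raised once per line that parses successfully.
    public event Action<double[]> ValuesParsed;

    public void OnIncomingLine(byte[] line)
    {
        // Decode the bytes and drop the trailing CR/LF left on the line.
        string text = Encoding.ASCII.GetString(line).TrimEnd('\r', '\n');
        if (text.Length == 0)
            return; // ignore blank lines

        // double.Parse throws FormatException on malformed fields; a real
        // parser would probably use TryParse and report bad lines instead.
        double[] values = text.Split(',')
            .Select(field => double.Parse(field, CultureInfo.InvariantCulture))
            .ToArray();
        ValuesParsed?.Invoke(values);
    }
}
```

Because OnIncomingLine takes a single byte[] and ValuesParsed is a plain Action<double[]>, both ends can be connected with `+=` and a method group, with no adapter code in between.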

10 Responses

Diego says:

August 20, 2017 at 12:59 am

I know this is an old post, having said that, this is very nice, very handy.
Thanks for sharing.

gwideman says:

September 4, 2017 at 1:23 pm

Useful topic.
There seem to be a few problems in class LineSplitter:
Line 13: Missing comparison operator? Also missing open brace?
Lines 14, 15: Odd indent suggests something missing
Line 21: Missing one of the expressions?

Talahamut says:

May 8, 2018 at 9:23 pm

if (newlineIndex < offset)
{
leftover = ConcatArray(leftover, buffer, offset, buffer.Length - offset);
return;
}

Talahamut says:

May 8, 2018 at 9:25 pm

and…

public event Action LineReceived;

Talahamut says:

May 8, 2018 at 9:25 pm

It didn’t post it right the first time…

public event Action LineReceived;

1. Talahamut says:
  
  May 8, 2018 at 9:27 pm
  
  Still not posting it right…
  
  Action should have “byte[]” as its T.
  
  1. Ubashaka says:
    
    August 10, 2020 at 1:37 pm
    
    Thanks for the suggestions made!
    
PAL says:

June 16, 2020 at 1:43 pm

Is there memory leak here with reassigning result to a new byte[]?

1. Kalmor says:
  
  March 19, 2021 at 9:31 am
  
  This is .NET and C#. The garbage collector takes care of it so there is no memory leak.
  
name says:

June 3, 2021 at 7:40 am

Is this article still valid in 2021?

