Blog Post:

Reading line-by-line from a serial port (or other byte-oriented stream)

Many .NET developers are moving away from the traditional (and broken) System.IO.Ports.SerialPort DataReceived event handling, either to the correct and more efficient BaseStream.BeginRead / BaseStream.EndRead pair I promoted in my last post or to the newer BaseStream.ReadAsync method introduced in .NET Framework 4.5 along with the C# async and await keywords. A common complaint from them is that BaseStream doesn’t provide a ReadLine() method. They assume that each EndRead or ReadAsync call will return exactly one line, and get wrong results.

For some developers, it is enough to point out that byte-oriented streams don’t preserve message boundaries, so a message can be split across multiple buffer transfers (from hardware FIFO to application). Others, whether they use serial ports, TCP sockets, or pipes, just don’t know what to do next, so they ask on StackOverflow. Because there are so many questions about fixing code that handles message fragmentation badly, I wanted to share an elegant solution to the problem.

The logic is quite simple. Between data blocks, the variable leftover holds an incomplete message, if any. When a new block arrives, it is separated (à la String.Split) at each occurrence of the line delimiter (here, '\n', but any other single end-of-message byte can easily be substituted). The first substring of the new data is concatenated with the leftover partial message to form a complete line. Ranges between subsequent delimiters are extracted and forwarded likewise. Any data following the last delimiter becomes the leftover saved for the next incoming block; if no delimiter was found at all, the whole block is appended to the old leftover.

There are many cases to deal with: zero, one, or multiple delimiters found; whether or not the last byte is a delimiter; and whether there was leftover data from the previous call. Instead of handling every combination explicitly, introducing a helper method that gets called twice greatly simplifies matters.
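The logic described above can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the original listing: the names LineSplitter, ConcatArray, leftover, offset, and newlineIndex appear in the text or in the reader comments below, while the method name OnIncomingBinaryBlock, the LineReceived event, the Delimiter field, and the readCount parameter are hypothetical choices made for illustration.

```csharp
using System;

// Hypothetical sketch: splits incoming byte blocks into delimiter-terminated lines.
public class LineSplitter
{
    // Single end-of-message byte; '\n' here, but any other byte can be substituted.
    public byte Delimiter = (byte)'\n';

    // Raised once per complete line, delimiter excluded (assumed event shape).
    public event Action<byte[]> LineReceived;

    // Incomplete message carried over between blocks, or null if there is none.
    private byte[] leftover;

    // Call with each block as it arrives; only the first readCount bytes are examined.
    public void OnIncomingBinaryBlock(byte[] buffer, int readCount)
    {
        int offset = 0;
        while (true)
        {
            int newlineIndex = Array.IndexOf(buffer, Delimiter, offset, readCount - offset);
            if (newlineIndex < offset)
            {
                // No further delimiter: save the tail, after any old leftover, for the next block.
                leftover = ConcatArray(leftover, buffer, offset, readCount - offset);
                return;
            }

            // Complete line: the old leftover (non-empty on the first pass only)
            // followed by the bytes up to, but not including, the delimiter.
            byte[] line = ConcatArray(leftover, buffer, offset, newlineIndex - offset);
            leftover = null;
            offset = newlineIndex + 1;
            LineReceived?.Invoke(line);
        }
    }

    // The helper that gets called twice: returns head followed by count bytes of tail
    // starting at offset. A null head is treated as empty.
    private static byte[] ConcatArray(byte[] head, byte[] tail, int offset, int count)
    {
        int headLength = head == null ? 0 : head.Length;
        byte[] result = new byte[headLength + count];
        if (headLength > 0)
        {
            Array.Copy(head, 0, result, 0, headLength);
        }
        Array.Copy(tail, offset, result, headLength, count);
        return result;
    }
}
```

Because ConcatArray treats a missing leftover as an empty prefix, the "first line of a block" and "nothing but a fragment" cases need no special handling. The sketch allocates a new array for every line, which keeps it short; a production version might prefer to reuse buffers.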

You may have noticed that there are no serial port or stream calls here at all. The separation of buffer processing from I/O calls is intentional, and an approach I very strongly recommend that you adopt in your own applications. The benefits are the usual ones associated with adhering to the Single Responsibility Principle: easier testing, lower coupling, more flexibility. For example, a class containing just this code can be inserted between various types of data sources (serial port, TCP stream, logfile replay) and the application code that logs, parses, and otherwise processes incoming lines.

I’ve also intentionally NOT followed the EventHandler pattern. It had value when it was introduced, but the C# language now supports variable capture in anonymous delegates and lambda expressions, so the sender parameter is useless. As a benefit, the event is now compatible with the Add method of a System.Collections.Generic.List<T>, which makes unit testing very easy.
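As an illustration of that compatibility, here is a small sketch of a test helper. MiniSplitter is a condensed, hypothetical stand-in (not the class from this post) included only so the snippet compiles on its own; the names LineReceived, OnIncomingBinaryBlock, and SplitterTests.Feed are likewise assumptions. The point is the single subscription line: because the event is an Action<byte[]>, List<byte[]>.Add can be attached as a method group with no adapter lambda.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Condensed, hypothetical stand-in for a line splitter so this snippet is self-contained.
public class MiniSplitter
{
    public event Action<byte[]> LineReceived;
    private readonly List<byte> pending = new List<byte>();

    public void OnIncomingBinaryBlock(byte[] buffer, int readCount)
    {
        for (int i = 0; i < readCount; i++)
        {
            if (buffer[i] != (byte)'\n') { pending.Add(buffer[i]); continue; }
            byte[] line = pending.ToArray();
            pending.Clear();
            LineReceived?.Invoke(line);
        }
    }
}

public static class SplitterTests
{
    // Returns the lines produced when the given chunks arrive in order.
    public static List<byte[]> Feed(params string[] chunks)
    {
        var received = new List<byte[]>();
        var splitter = new MiniSplitter();
        splitter.LineReceived += received.Add; // method group: signatures match exactly

        foreach (string chunk in chunks)
        {
            byte[] block = Encoding.ASCII.GetBytes(chunk);
            splitter.OnIncomingBinaryBlock(block, block.Length);
        }
        return received;
    }
}
```

A test can then call something like SplitterTests.Feed("he", "llo\nwor", "ld\n") and assert on the returned list in whatever test framework is in use. With an EventHandler-style signature, the same subscription would need a lambda to discard the sender argument.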

With a little care in choosing the parameter types of the trigger methods and events, your controller code may not need to do anything more than compose objects.
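One way that composition might look is sketched below. StreamPump, BlockReceived, and Controller are hypothetical names, not part of the .NET libraries or of this post's code; the only library call relied on is Stream.ReadAsync(byte[], int, int). The pump's event takes (byte[], int), the same parameter list as a block-processing method, so the two can be connected directly.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical I/O-only class: reads blocks from any Stream and forwards them.
public class StreamPump
{
    // Parameter list chosen to match a block handler such as
    // void OnIncomingBinaryBlock(byte[] buffer, int readCount).
    public event Action<byte[], int> BlockReceived;

    public async Task RunAsync(Stream source, int bufferSize = 256)
    {
        var buffer = new byte[bufferSize];
        int readCount;
        // ReadAsync returns 0 at end of stream; each block may hold any number of lines.
        while ((readCount = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // The buffer is reused, so handlers must copy what they keep before returning.
            BlockReceived?.Invoke(buffer, readCount);
        }
    }
}

public static class Controller
{
    // Composition only: connect a data source to a block handler and start pumping.
    public static Task RunAsync(Stream source, Action<byte[], int> onBlock)
    {
        var pump = new StreamPump();
        pump.BlockReceived += onBlock;
        return pump.RunAsync(source);
    }
}
```

With a serial port, the call might be Controller.RunAsync(port.BaseStream, splitter.OnIncomingBinaryBlock), where splitter is whatever line-splitting object you use and its line event is subscribed to your parser or logger. For logfile replay, the same call would take the result of File.OpenRead instead.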

10 Responses

  1. Useful topic.
    There seem to be a few problems in class LineSplitter:
    Line 13: Missing comparison operator? Also missing open brace?
    Lines 14, 15: Odd indent suggests something missing
    Line 21: Missing one of the expressions?

  2. if (newlineIndex < offset)
    {
        leftover = ConcatArray(leftover, buffer, offset, buffer.Length - offset);
        return;
    }
