Recently we've had a couple of customers send us data files with NMEA0183 sentences that weren't being logged by our data logger, wondering what the problem was. Our logger checks that each sentence is in a valid format, and only logs well-formed sentences (it doesn't look at the message contents, so everything that is valid gets logged).
So what makes a sentence valid? Here are the key points:
Without burrowing down to the level of the electronic signal, NMEA data is basically unidirectional. When setting up a serial port to send or receive NMEA data it should be configured as 1 start bit, 8 data bits, 1 stop bit and no parity. The official data rate of NMEA0183 is 4800bps. A faster data rate of 38400bps is referred to as NMEA0183-HS, and is primarily used by AIS. Some devices may use other speeds, e.g. many non-marine GPS receivers have a default speed of 9600bps. If these settings are wrong, you may get garbage, including characters not allowed by NMEA0183, or you may get nothing.
Of course, if sending NMEA data over a network cable or Wi-Fi, these serial settings don't apply, as the data is sent over the network using TCP or UDP.
NMEA0183 basically uses the printable ASCII characters (i.e. hexadecimal codes 20 to 7F, which is decimal 32 to 127, inclusive). This means upper and lower case unaccented characters from the Latin alphabet, digits 0 to 9, and standard punctuation marks. Within this range, the following characters are reserved for specific purposes: $*,!\^~ and <DEL>. Outside it, <CR> and <LF> are also reserved.
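As a rough illustration of the character check, here is a minimal Python sketch. The function name has_valid_characters is ours, not part of any standard or library. It assumes the allowed range is hexadecimal 20 to 7E, i.e. it rejects <DEL> outright and only accepts <CR><LF> as the terminator. It does not check that the other reserved characters are used in their proper roles.

```python
def has_valid_characters(sentence: str) -> bool:
    """Return True if every character before the trailing <CR><LF> is 0x20-0x7E.

    Assumption: <DEL> (0x7F) is treated as invalid in ordinary data.
    """
    # Drop the terminator, if present, so it isn't flagged as invalid.
    body = sentence[:-2] if sentence.endswith("\r\n") else sentence
    return all(0x20 <= ord(ch) <= 0x7E for ch in body)
```

A sentence read with the wrong serial speed will typically fail this test straight away, because the garbage tends to include control characters or bytes above 7F.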
Each message begins with a $ or ! and ends with a <CR><LF>, and can be up to 82 characters long (that is 79 characters between the first character and the <CR><LF>). After the initial $ or ! there are 5 characters, each an upper case letter or a digit (or 4 if the sentence begins $P), and then a comma. From then on there is a series of comma separated fields, e.g.: $ESDPT,11.3,,100<CR><LF>. In most instances fields are optional, e.g. in this example the field between 11.3 and 100 is omitted. Also, there is no comma between the last field and the <CR><LF>.
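The framing rules above can be sketched as a regular expression in Python. The names FRAME_PATTERN, is_well_framed and split_fields are made up for this example. We have read "4 if the sentence begins $P" as a P followed by 3 more characters, which is an assumption. The sketch only checks the outer shape of the sentence, not the checksum or the field contents.

```python
import re

MAX_SENTENCE_LENGTH = 82  # includes the leading $ or ! and the trailing <CR><LF>

FRAME_PATTERN = re.compile(
    r"[$!]"                          # start delimiter
    r"(?:[A-Z0-9]{5}|P[A-Z0-9]{3})"  # 5-character address, or P plus 3 (assumed)
    r","                             # comma after the address
    r"[\x20-\x7E]*"                  # fields and optional checksum
    r"\r\n"                          # terminator
)


def is_well_framed(sentence: str) -> bool:
    """Check overall length and the start, address, comma and terminator."""
    if len(sentence) > MAX_SENTENCE_LENGTH:
        return False
    return FRAME_PATTERN.fullmatch(sentence) is not None


def split_fields(sentence: str) -> list:
    """Return the comma separated fields, address first, without the
    start delimiter, checksum or terminator."""
    body = sentence.strip("\r\n")[1:]  # drop <CR><LF> and the leading $ or !
    body = body.split("*", 1)[0]       # drop the checksum, if there is one
    return body.split(",")
```

For the example sentence, split_fields("$ESDPT,11.3,,100\r\n") returns ["ESDPT", "11.3", "", "100"], with the omitted field showing up as an empty string.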
There is an optional checksum between the last field and the terminating <CR><LF>. A few sentences are now required to have it, but for backwards compatibility the checksum is still effectively optional. The checksum consists of a * and then two hex characters (i.e. 0-9 and A-F), for example: $ESDPT,11.3,,100*56<CR><LF>. Whilst this gives another way of detecting corrupted data, if you are generating long sentences it also shortens the available length by 3 characters.
The checksum is calculated by doing a bitwise exclusive or (XOR) of all characters between (but not including) the initial $ or ! and the final <CR><LF>. If the sentence already carries a checksum, the calculation stops before the *. The result is an 8-bit value, which is written as a 2 digit hexadecimal number using ASCII characters. The * and two character checksum are then added to the sentence. For example, continuing with the example here, XORing ESDPT,11.3,,100 gives 86 as a decimal or 56 as a hexadecimal, so we add in *56 as the checksum. If you need to code this up, Google will give you code snippets in most languages, and there is a useful checksum calculator here.
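For reference, a minimal Python sketch of the calculation might look like this. The function names nmea_checksum, add_checksum and checksum_ok are our own. The sketch assumes that a sentence without a checksum should pass verification, since the checksum is effectively optional, and that only upper case hex digits are acceptable.

```python
from functools import reduce
from operator import xor


def nmea_checksum(payload: str) -> str:
    """XOR the character codes of the payload; return two upper case hex digits."""
    value = reduce(xor, (ord(ch) for ch in payload), 0)
    return f"{value:02X}"


def add_checksum(sentence: str) -> str:
    """Append *hh and <CR><LF> to a sentence such as '$ESDPT,11.3,,100'.

    The input is expected to have no checksum and no terminator yet.
    """
    return f"{sentence}*{nmea_checksum(sentence[1:])}\r\n"


def checksum_ok(sentence: str) -> bool:
    """Verify the checksum if one is present.

    Assumption: a sentence with no * is accepted, as the checksum is optional.
    """
    body = sentence.rstrip("\r\n")
    payload, star, received = body[1:].rpartition("*")
    if not star:
        return True  # no checksum present, nothing to verify
    return received == nmea_checksum(payload)
```

With the example from above, nmea_checksum("ESDPT,11.3,,100") returns "56", and add_checksum("$ESDPT,11.3,,100") gives $ESDPT,11.3,,100*56 followed by <CR><LF>.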
So, before looking at the information content of a sentence, quite a lot can be done to check that it is valid and hasn't been corrupted in transit. Corruption is relatively rare when sending NMEA data over wires, but is all too common when sending data over Wi-Fi using UDP (which I may look at in a later post).