
Decoding POCSAG pager messages, part 1

Hi, Habr!

Once upon a time, when a mobile phone cost $2000 and a minute of talk time cost 50 cents, paging was a popular service. Then mobile communication became cheaper, and the pager turned from a prestigious attribute of a business person into an unglamorous tool of couriers and secretaries, after which the technology almost completely disappeared.


For those who remember the joke "read the pager, thought a lot" and want to figure out how it works, the story continues under the cut. For those who want to understand it in even more detail, a second part is available.

General information


For those who have forgotten, or were born after 2000, let us briefly recall the basic ideas.

From the user's point of view, paging has two big advantages, which are still relevant in some cases:

- Communication is one-way, without any acknowledgments, so the paging network cannot be overloaded by its subscribers: its performance does not depend on how many there are. Messages are simply broadcast one after another "as is", and a pager accepts a message if the recipient's number matches its own.

- The receiver is very simple, so a pager can run for up to a month on two ordinary AA batteries.

There are two main paging standards: POCSAG (Post Office Code Standardization Advisory Group) and FLEX. Neither is new: POCSAG was approved in 1982 and supports speeds of 512, 1200 and 2400 bit/s. It is transmitted using frequency shift keying (FSK) with a frequency shift of 4.5 kHz. The newer FLEX standard (proposed by Motorola in the 1990s) supports speeds up to 6400 bit/s and can use not only 2-level FSK but also 4-level FSK.

The protocols are fairly simple, and decoders that can decode the signal from a sound card input were written some 20 years ago (the messages are not encrypted, so anyone can read them with such a program).

Let's see how it works.

Reception of signals


To begin with, we need a sample to decode. We take a laptop, an RTL-SDR receiver and a time machine, and receive the signal we need.



Since the signal is frequency-modulated, we set the receive mode to FM as well. Using HDSDR, we record the signal in WAV format.

Let's see what we got. Load the WAV file as an array using Python:

    from scipy.io import wavfile
    import matplotlib.pyplot as plt

    fs, data = wavfile.read("pocsag.wav")
    plt.plot(data)
    plt.show()

Result (the bits are labeled by hand):



As you can see, everything is simple: you can mark the bits "by eye" even in Paint, seeing where there is a "0" and where a "1". But doing this for the whole file would take too long, so the process should be automated.

If we zoom in on the graph, we can see that each "bit" is 20 samples wide, which at the WAV file's sampling rate of 24,000 samples/s corresponds to 1200 bit/s. We find a zero crossing in the signal: this will be the beginning of the bit sequence. Then we draw markers to verify that they line up with the bits.

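    import numpy as np

    speed = 1200            # bit rate, bit/s
    fs = 24000              # sampling rate of the WAV file, samples/s
    cnt = int(fs / speed)   # samples per bit

    # Find the first upward zero crossing within the first two bit periods
    start = 0
    for p in range(2 * cnt):
        if data[p] < -50 and data[p + 1] > 50:
            start = p
            break

    # Put a marker at the start of every bit period
    bits = np.zeros(data.size)
    for p in range(start, data.size, cnt):
        bits[p] = 500

    plt.plot(data)
    plt.plot(bits)
    plt.show()

As you can see, the match is not perfect (the transmitter's and receiver's clock frequencies still differ slightly), but it is quite good enough for decoding.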



For long signals we would have to add a frequency adjustment algorithm, but in this case it is not needed.
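To give an idea of what such an adjustment could look like, here is a minimal sketch of one simple approach (not something this article's decoder uses): sample each bit at its centre and restart the bit clock at every sign change. The function name `slice_with_resync` is hypothetical, and the sketch assumes a clean signal without noise around the zero crossings.

```python
import numpy as np

def slice_with_resync(data, cnt):
    """Sample each bit at its centre, re-aligning on every sign change."""
    data = np.asarray(data, dtype=np.int64)
    bits = []
    next_sample = cnt / 2.0  # position of the next bit centre
    for i in range(1, data.size):
        # A sign change marks a bit boundary: restart the bit clock here.
        if (data[i - 1] < 0) != (data[i] < 0):
            next_sample = i + cnt / 2.0
        if i >= next_sample:
            # Inverted polarity, as in the article: negative means "1".
            bits.append("1" if data[i] < 0 else "0")
            next_sample += cnt
    return "".join(bits)

# Synthetic signal whose bits are 21 samples long, decoded with cnt = 20.
pattern = "1011001110"
signal = np.concatenate(
    [np.full(21, -100 if b == "1" else 100) for b in pattern])
print(slice_with_resync(signal, 20))
```

Because the clock is re-aligned at every transition, the timing error can only accumulate over a run of identical bits, not over the whole recording.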

The last step is to convert the array from the WAV file into a bit sequence. This is simple too: we already know the length of one bit, so if the data over that period is positive we append "1", otherwise "0" (edit: as it turned out, the signal had to be inverted, so 0 and 1 are swapped).

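    bits_str = ""
    # Start at the zero crossing found earlier so each window covers one bit
    for p in range(start, data.size - cnt, cnt):
        s = 0
        for p1 in range(p, p + cnt):
            s += int(data[p1])  # sum all samples of this bit period
        # Inverted polarity: a negative sum means "1"
        bits_str += "1" if s < 0 else "0"

    print("Bits")
    print(bits_str)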

The code could probably be optimized by getting rid of the inner loop, although in this case it is not critical.

The result is a ready-made bit sequence (as a string) that contains our message.
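For the curious, a loop-free variant might look like the sketch below. It assumes the same meaning of `data`, `cnt` and `start` as in the snippets above; `slice_bits` is a hypothetical helper name, and NumPy's `reshape` is used to get one row per bit period.

```python
import numpy as np

def slice_bits(data, cnt, start=0):
    """Return a string of bits, one per `cnt` samples, beginning at `start`."""
    samples = np.asarray(data, dtype=np.int64)[start:]
    n_bits = samples.size // cnt
    # One row per bit period; a trailing partial period is dropped.
    periods = samples[:n_bits * cnt].reshape(n_bits, cnt)
    sums = periods.sum(axis=1)
    # Inverted polarity, as in the article: a negative sum is read as "1".
    return "".join("1" if s < 0 else "0" for s in sums)

# Synthetic check: three bit periods of 4 samples each.
demo = np.array([-100] * 4 + [100] * 4 + [-100] * 4, dtype=np.int16)
print(slice_bits(demo, cnt=4))
```

Converting to `int64` first avoids overflow when summing 16-bit samples.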

1010101010101010101010101010101010101010101010101010101010101010101010101
010101010101010101010101010101010101010101010100111110011010010000101001101
100001111010100010011100000110010111011110101000100111000001100101110111101
010001001110000011001011101111010100010011100000110010111011110101000100111
000001100101110111101010001001110000011001011101111010100010011100000110010
011011110101000100111000001100101110111101010001001110000011001011101111010
100010011100000110010111011110101000100111000001100101110111101010001001110
...
111101111

Decoding


A sequence of bits is already much more convenient than a raw WAV file, and we can now extract data from it. We split the sequence into blocks of 4 bytes (32 bits) and get a more readable picture:

10101010101010101010101010101010
10101010101010101010101010101010
10101010101010101010101010101010
10101010101010101010101010101010
01111100110100100001010011011000
01111010100010011100000110010111
01111010100010011100000110010111
01111010100010011100000110010111
01111010100010011100000110010111
00001000011011110100010001101000
10000011010000010101010011010100
01111100110100100001010111011000
11110101010001000001000000111000
01111010100010011100000110010111
01111010100010011100000110010111
01111010100010011100000110010111
00100101101001011010010100101111
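The split itself can be sketched as follows. `split_codewords` is a hypothetical helper; the `offset` parameter is an assumption for the case where the bit string does not start exactly on a 32-bit boundary.

```python
def split_codewords(bits, width=32, offset=0):
    """Cut a bit string into `width`-bit pieces, dropping any incomplete tail."""
    usable = len(bits) - offset
    end = offset + usable - usable % width
    return [bits[i:i + width] for i in range(offset, end, width)]

# Two codewords copied from the listing above, plus a few stray trailing bits.
sample = ("01111100110100100001010011011000"
          "01111010100010011100000110010111"
          "1010")
for word in split_codewords(sample):
    print(word)
```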

This is all we can extract from the file; it remains to understand what these lines mean. We open the format documentation, which is available as a PDF.



Everything is more or less clear. The transmission starts with a long preamble "10101010101...", which is needed to bring the pager out of "sleep mode". The message itself consists of blocks Batch-1 ... Batch-N, each of which begins with a unique frame synchronization codeword, FSC (the lines starting with 0111110011010010 in the listing above). Further, as the manual shows, if a line starts with "0", it is the recipient's address. The address is programmed into the pager itself, and if it does not match, the pager simply ignores the message. If a line starts with "1", it is the message itself. We have two such lines.

Now let's look at each block. We see Idle codewords: empty blocks 01111...0111 that carry no useful information. After deleting them, very little is left:
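These rules are easy to put into code. The sketch below labels each 32-bit line; the constants are the POCSAG sync and idle codewords as they appear in the listing above. Note that the first sync line in our recording differs from the second by one bit, presumably a reception error, so the sketch tolerates a couple of flipped bits in the sync word. That tolerance and the name `classify` are assumptions of this example.

```python
SYNC = 0x7CD215D8  # frame synchronization codeword
IDLE = 0x7A89C197  # idle codeword

def classify(word, max_sync_errors=2):
    """Label a 32-character bit string as sync, idle, address or message."""
    value = int(word, 2)
    # Count the bits that differ from the sync word (Hamming distance).
    if bin(value ^ SYNC).count("1") <= max_sync_errors:
        return "sync"
    if value == IDLE:
        return "idle"
    # The first bit tells address codewords from message codewords.
    return "address" if word[0] == "0" else "message"

words = [
    "01111100110100100001010011011000",
    "01111010100010011100000110010111",
    "00001000011011110100010001101000",
    "10000011010000010101010011010100",
]
for w in words:
    print(w, classify(w))
```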

01111100110100100001010011011000 - Frame Sync
00001000011011110100010001101000 - Address
10000011010000010101010011010100 - Message

01111100110100100001010111011000 - Frame Sync
11110101010001000001000000111000 - Message
00100101101001011010010100101111 - Address

It remains to understand what's inside.

We read further in the manual and find out that messages can be numeric or text. Numeric messages are stored as 4-bit BCD codes, so 5 digits fit into the 20 data bits of one codeword (the remaining bits are check bits, which we will not consider). A message can also be text, in which case a 7-bit encoding is used, but our message is too short for text: its total number of message bits (40) is not a multiple of 7.

From the lines 10000011010000010101010011010100 and 11110101010001000001000000111000 we get the following 4-bit sequences:

1 0000 0110 1000 0010 1010 10011010100 - 0h 6h 8h 2h Ah
1 1110 1010 1000 1000 0010 00000111000 - Eh Ah 8h 8h 2h

And finally, the last step: we look up the character table in the documentation.
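The same split can be checked with a few lines of Python. This is only an illustration; `payload_nibbles` is a hypothetical helper that takes bits 1 to 20 of a message line and cuts them into five 4-bit values.

```python
def payload_nibbles(word):
    """Return the five 4-bit values held in bits 1..20 of a message codeword."""
    payload = word[1:21]  # skip the leading type bit and the trailing check bits
    return [int(payload[i:i + 4], 2) for i in range(0, 20, 4)]

for word in ("10000011010000010101010011010100",
             "11110101010001000001000000111000"):
    print(" ".join(f"{n:X}h" for n in payload_nibbles(word)))
```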



As you can see, a numeric message can only contain the digits 0-9, the letter U ("urgent"), a space, a hyphen and a pair of brackets. We write a simple output function, so as not to count them manually:

    def parse_msg(block):
        # 16 codewords in a batch, each 32 bits long
        for cw in range(16):
            cws = block[32 * cw:32 * (cw + 1)]
            if len(cws) < 32:
                break  # incomplete batch: nothing more to parse
            if cws[0] == "0":
                # Address codeword
                addr = cws[1:19]
                print(" Addr:" + addr)
            else:
                # Message codeword: 20 data bits = five 4-bit BCD symbols
                msg = cws[1:21]
                print(" Msg: " + msg)
                size = 4
                symbols = "0123456789*U -)("
                s = ""
                for ind in range(0, len(msg), size):
                    bcd_s = msg[ind:ind + size]
                    value = int(bcd_s, 2)
                    s += symbols[value]
                print(" ", s)
        print()

As a result, we get the transmitted message "0682*)*882". What it means is hard to say, but since the format supports numeric messages, someone presumably needs them.
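One detail worth checking against the documentation is the bit order inside each digit. The function above reads every 4-bit group with the most significant bit first. As far as I know, common POCSAG decoders treat numeric digits as sent least significant bit first; this is an assumption here, not something verified on this recording. The sketch below decodes the same 40 bits both ways, using a hypothetical helper `decode_digits`.

```python
SYMBOLS = "0123456789*U -)("  # same table as in parse_msg above

def decode_digits(payload, lsb_first=False):
    """Decode a string of 4-bit groups into numeric-pager characters."""
    out = ""
    for i in range(0, len(payload) - len(payload) % 4, 4):
        nibble = payload[i:i + 4]
        if lsb_first:
            nibble = nibble[::-1]  # assumption: least significant bit is sent first
        out += SYMBOLS[int(nibble, 2)]
    return out

# Data bits of the two message lines, concatenated.
payload = "00000110100000101010" + "11101010100010000010"
print(decode_digits(payload))                  # MSB-first reading
print(decode_digits(payload, lsb_first=True))  # LSB-first reading
```

Under the LSB-first assumption the message consists of digits only, which looks more like something a person would actually send.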

Conclusions


As you can see, the POCSAG format is very simple and can in fact be decoded even in a school notebook. And although today it is mostly of historical interest, analyzing such protocols is a very useful learning exercise.

The next part is about decoding ASCII messages.

Source: https://habr.com/ru/post/438602/