RRG-Proxmark3/doc/new_frame_format.md
2019-10-05 19:10:40 +02:00

467 lines
18 KiB
Markdown

# New frame format documentation
This document is primarily intended for developers only.
A major change is the support of variable length frames between host and Proxmark3.
This is a step especially important for usage over FPC/USART/BT.
## Old format
Previously, frames were, in both directions like this:
uint64_t cmd;
uint64_t arg[3];
union {
uint8_t asBytes[PM3_CMD_DATA_SIZE];
uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
} d;
with PM3_CMD_DATA_SIZE = 512 and there was no API abstraction, everybody was forging/parsing these frames.
So the frame size was fixed, 544 bytes, even for simple ACKs.
When snooping the USB transfers, we can observe the host is sending 544b Bulk USB frames while the Proxmark3 is limited by its internal buffers and is sending 128b, 128b, 128b, 128b, 32b, so in total 5 packets.
## New format
Even if we make the payload part variable in the old format, we've still a minimum of 32 bytes per frame with fields arbitrarily large.
So we designed a new format from scratch:
For commands being sent to the Proxmark3:
uint32_t magic;
uint16_t length : 15;
bool ng : 1;
uint16_t cmd;
uint8_t data[length];
uint16_t crc;
* `magic`: arbitrary magic (`PM3a`) to help re-sync if needed
* `length`: length of the variable payload, 0 if none, max 512 (PM3_CMD_DATA_SIZE) for now.
* `ng`: flag to tell if the data is following the new format (ng) or the old one, see transition notes below
* `cmd`: as previously, on 16b as it's enough
* `data`: variable length payload
* `crc`: either an actual CRC (crc14a) or a Magic placeholder (`a3`)
For responses from the Proxmark:
uint32_t magic;
uint16_t length : 15;
bool ng : 1;
int16_t status;
uint16_t cmd;
uint8_t data[length];
uint16_t crc;
* `magic`: arbitrary magic (`PM3b`) to help re-sync if needed
* `length`: length of the variable payload, 0 if none, max 512 (PM3_CMD_DATA_SIZE) for now.
* `ng`: flag to tell if the data is following the new format (ng) or the old one, see transition notes below
* `status`: a field to send back the status of the command execution
* `cmd`: as previously, on 16b as it's enough
* `data`: variable length payload
* `crc`: either an actual CRC (crc14a) or a Magic placeholder (`a3`)
We used to send an anonymous ACK, now we're replying with the corresponding command name and a status.
CRC is optional and on reception, the magic `a3` is accepted as placeholder. If it's different then it's checked as a CRC.
By default CRC is user over USART and is disabled over USB, on both directions.
Internal structures used to handle these packets are:
* PacketCommandNGPreamble
* PacketCommandNGPostamble
* PacketCommandNGRaw
* PacketResponseNGPreamble
* PacketResponseNGPostamble
* PacketResponseNGRaw
But they are abstracted from the developer view with a new API. See below.
## Transition
Because it's a long transition to clean all the code from the old format and because we don't want to break stuffs when flashing the bootloader, the old frames are still supported together with the new frames. The old structure is now called `PacketCommandOLD` and `PacketResponseOLD` and it's also abstracted from the developer view with the new API.
## New format API
So the new API is a merge of the old and the new frame formats, to ensure a smooth transition.
The boolean `ng` indicates if the structure is storing data from the old or the new format.
Old format can come from either old 544b frames or mixed frames (variable length but still with oldargs).
After the full transition, we might remove the fields `oldarg` and `ng`.
`PacketCommandNG` and `PacketResponseNG` are the structs used by the API, as seen in a previous section, there are other variable-sized packed structs specifically for the data transmission.
typedef struct {
uint16_t cmd;
uint16_t length;
uint32_t magic; // NG
uint16_t crc; // NG
uint64_t oldarg[3]; // OLD
union {
uint8_t asBytes[PM3_CMD_DATA_SIZE];
uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
} data;
bool ng; // does it store NG data or OLD data?
} PacketCommandNG;
typedef struct {
uint16_t cmd;
uint16_t length;
uint32_t magic; // NG
int16_t status; // NG
uint16_t crc; // NG
uint64_t oldarg[3]; // OLD
union {
uint8_t asBytes[PM3_CMD_DATA_SIZE];
uint32_t asDwords[PM3_CMD_DATA_SIZE / 4];
} data;
bool ng; // does it store NG data or OLD data?
} PacketResponseNG;
### On the client, for sending frames:
(`client/comms.c`)
void SendCommandNG(uint16_t cmd, uint8_t *data, size_t len);
void SendCommandBL(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len);
void SendCommandOLD(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len);
void SendCommandMIX(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len);
So cmds should make the transition from `SendCommandOLD` to `SendCommandNG` to benefit from smaller frames (and armsrc handlers adjusted accordingly of course).
`SendCommandBL` is for Bootloader-related activities, see Bootrom section.
`SendCommandMIX` is a transition fct: it uses the same API as `SendCommandOLD` but benefits somehow from variable length frames. It occupies at least 24b of data for the oldargs and real data is therefore limited to PM3_CMD_DATA_SIZE - 24 (defined as PM3_CMD_DATA_SIZE_MIX). Besides the size limitation, the receiver handler doesn't know if this was an OLD frame or a MIX frame, it gets its oldargs and data as usual.
Warning : it makes sense to move from `SendCommandOLD` to `SendCommandMIX` only for *commands with small payloads*.
* otherwise both have about the same size
* `SendCommandMIX` has a smaller payload (PM3_CMD_DATA_SIZE_MIX < PM3_CMD_DATA_SIZE) so it's risky to blindly move from OLD to MIX if there is a large payload.
Internally these functions prepare the new or old frames and call `uart_communication` which calls `uart_send`.
### On the Proxmark3, for receiving frames:
(`armsrc/appmain.c`)
PacketCommandNG
`AppMain` calls `receive_ng`(`common/cmd.c`) which calls `usb_read_ng`/`usart_read_ng` to get a `PacketCommandNG`, then passes it to `PacketReceived`.
(no matter if it's an old frame or a new frame, check `PacketCommandNG.ng` field to know if there are `oldargs`)
`PacketReceive` is the commands broker.
Old handlers will still find their stuff in `PacketCommandNG.oldarg` field.
### On the Proxmark3, for sending frames:
(`common/cmd.c`)
int16_t reply_ng(uint16_t cmd, int16_t status, uint8_t *data, size_t len)
int16_t reply_old(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len)
int16_t reply_mix(uint64_t cmd, uint64_t arg0, uint64_t arg1, uint64_t arg2, void *data, size_t len)
So replies should make the transition from `reply_old` to `reply_ng` to benefit from smaller frames (and client reception adjusted accordingly of course).
`reply_mix` is a transition fct: it uses the same API as reply_old but benefits somehow from variable length frames. It occupies at least 24b of data for the oldargs and real data is therefore limited to PM3_CMD_DATA_SIZE - 24. Besides the size limitation, the client command doesn't know if this was an OLD frame or a MIX frame, it gets its oldargs and data as usual.
Example of a handler that supports both OLD/MIX and NG command styles and replies with the new frame format when it receives new command format:
if (packet->ng) {
reply_ng(CMD_FOOBAR, PM3_SUCCESS, packet->data.asBytes, packet->length);
} else {
// reply_old(CMD_ACK, 0, 0, 0, packet->data.asBytes, packet->length);
reply_mix(CMD_ACK, 0, 0, 0, packet->data.asBytes, packet->length);
}
### On the client, for receiving frames:
(`client/comms.c`)
WaitForResponseTimeout ⇒ PacketResponseNG
`uart_communication` calls `uart_receive` and create a `PacketResponseNG`, then passes it to `PacketResponseReceived`.
`PacketResponseReceived` treats it immediately (prints) or stores it with `storeReply`.
Commands do `WaitForResponseTimeoutW` (or `dl_it`) which uses `getReply` to fetch responses.
## API transition
In short, to move from one format to the other, we need for each command:
* (client TX) `SendCommandOLD``SendCommandNG` (with all stuff in ad-hoc PACKED structs in `data` field)
* (pm3 RX) `PacketCommandNG` parsing, from `oldarg` to only the `data` field
* (pm3 TX) `reply_old``reply_ng` (with all stuff in ad-hoc PACKED structs in `data` field)
* (client RX) `PacketResponseNG` parsing, from `oldarg` to only the `data` field
Meanwhile, a fast transition to MIX frames can be done with:
* (client TX) `SendCommandOLD``SendCommandMIX` (but check the limited data size PM3_CMD_DATA_SIZE ⇒ PM3_CMD_DATA_SIZE_MIX)
* (pm3 TX) `reply_old``reply_mix` (but check the limited data size PM3_CMD_DATA_SIZE ⇒ PM3_CMD_DATA_SIZE_MIX)
## Bootrom
Bootrom code will still use the old frame format to remain compatible with other repos supporting the old format and because it would hardly gain anything from the new format:
* almost all frames convey 512b of payload, so difference in overhead is neglictible
* bringing flash over usart sounds risky and would be terribly slow anyway (115200 bauds vs. 7M bauds).
`SendCommandBL` is the same as `SendCommandOLD` with a different name to be sure not to migrate it.
### On the Proxmark3, for receiving frames:
(`bootrom/bootrom.c`)
usb_read (common/usb_cdc.c) ⇒ UsbPacketReceived (bootrom.c)
⇒ CMD_DEVICE_INFO / CMD_START_FLASH / CMD_FINISH_WRITE / CMD_HARDWARE_RESET
also `usb_enable`, `usb_disable` (`common/usb_cdc.c`)
### On the Proxmark3, for sending frames:
(`bootrom/bootrom.c`)
reply_old (bootrom.c) ⇒ usb_write (common/usb_cdc.c)
also `usb_enable`, `usb_disable` (`common/usb_cdc.c`)
### On the client, for sending frames:
Therefore, the flasher client (`client/flasher.c` + `client/flash.c`) must still use these old frames.
It uses a few commands in common with current client code:
OpenProxmark
CloseProxmark
SendCommandOLD
⇒ CMD_DEVICE_INFO / CMD_START_FLASH / CMD_FINISH_WRITE / CMD_HARDWARE_RESET
### On the client, for receiving frames:
As usual, old frames are still supported
WaitForResponseTimeout ⇒ PacketResponseNG
## New usart RX FIFO
USART code has been rewritten to cope with unknown size packets.
* using USART full duplex with double DMA buffer on RX & TX
* using internal FIFO for RX
`usart_init`:
* USART is activated all way long from usart_init(), no need to touch it in RX/TX routines: `pUS1->US_PTCR = AT91C_PDC_RXTEN | AT91C_PDC_TXTEN`
`usart_writebuffer_sync`:
* still using DMA but accepts arbitrary packet sizes
* removed unneeded memcpy
* wait for DMA buffer to be treated before returning, therefore "sync"
* we could make an async version but caller must be sure the DMA buffer remains available!
* as it's sync, no need for next DMA buffer
`usart_read_ng`:
* user tells expected packet length
* relies on usart_rxdata_available to know if there is data in our FIFO buffer
* fetches data from our FIFO
* dynamic number of tries (depending on FPC speed) to wait for asked data
`usart_rxdata_available`:
* polls usart_fill_rxfifo
* returns number of bytes available in our FIFO
`usart_fill_rxfifo`:
* if next DMA buffer got moved to current buffer (`US_RNCR == 0`), it means one DMA buffer is full
* transfer current DMA buffer data to our FIFO
* swap to the other DMA buffer
* provide the emptied DMA buffer as next DMA buffer
* if current DMA buffer is partially filled
* transfer available data to our FIFO
* remember how many bytes we already copied to our FIFO
## Timings
Reference (before new format):
linux usb: #db# USB Transfer Speed PM3 ⇒ Client = 545109 Bytes/s
On a Windows VM:
proxspace usb: #db# USB Transfer Speed PM3 ⇒ Client = 233998 Bytes/s
Over USART:
(`common/usart.h`)
USART_BAUD_RATE defined there
9600: #db# USB Transfer Speed PM3 ⇒ Client = 934 Bytes/s
115200: #db# USB Transfer Speed PM3 ⇒ Client = 11137 Bytes/s
460800: #db# USB Transfer Speed PM3 ⇒ Client = 43119 Bytes/s
linux usb: #db# USB Transfer Speed PM3 ⇒ Client = 666624 Bytes/s (equiv. to ~7Mbaud)
(`pm3_cmd.h`)
Receiving from USART need more than 30ms as we used on USB
else we get errors about partial packet reception
FTDI 9600 hw status ⇒ we need 20ms
FTDI 115200 hw status ⇒ we need 50ms
FTDI 460800 hw status ⇒ we need 30ms
BT 115200 hf mf fchk 1 dic ⇒ we need 140ms
# define UART_FPC_CLIENT_RX_TIMEOUT_MS 170
# define UART_USB_CLIENT_RX_TIMEOUT_MS 20
# define UART_TCP_CLIENT_RX_TIMEOUT_MS 300
This goes to `uart_posix.c` `timeval` struct
and `uart_win32.c` `serial_port_windows` struct
It starts at UART_FPC_CLIENT_RX_TIMEOUT_MS and once we detect we're working over USB
it's reduced to UART_USB_CLIENT_RX_TIMEOUT_MS.
Add automatically some communication delay in the `WaitForResponseTimeout` & `dl_it` timeouts.
Only when using FPC, timeout = 2* empirically measured delay (FTDI cable).
Empirically measured delay (FTDI cable) with "hw ping 512" :
usb ⇒ 6.. 32ms
460800 ⇒ 40.. 70ms
9600 ⇒ 1100..1150ms
(`client/comms.c`)
static size_t communication_delay(void) {
if (conn.send_via_fpc_usart) // needed also for Windows USB USART??
return 2 * (12000000 / uart_speed);
return 100;
}
Because some commands send a lot of frames before finishing (hw status, lf read,...),
`WaitForResponseTimeout` & `dl_it` timeouts are reset at each packet reception,
so timeout is actually counted after latest received packet,
it doesn't depend anymore on the number of received packets.
It was needed to tune pm3 RX usart `maxtry` :
(`common/usart.c`)
uint32_t usart_read_ng(uint8_t *data, size_t len) {
// Empirical max try observed: 3000000 / USART_BAUD_RATE
// Let's take 10x
uint32_t tryconstant = 0;
#ifdef USART_SLOW_LINK
// Experienced up to 13200 tries on BT link even at 460800
tryconstant = 50000;
#endif
uint32_t maxtry = 10 * (3000000 / USART_BAUD_RATE) + tryconstant;
`DbpStringEx` using `reply_old`:
time client/proxmark3 -p /dev/ttyACM0 -c "hw status"
2.52s
time client/proxmark3 -p /dev/ttyUSB0 -b 460800 -c "hw status"
3.03s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "hw status"
4.88s
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
26.5s
`DbpStringEx` using `reply_mix`:
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
7.08s
`DbpStringEx` using `reply_ng`:
time client/proxmark3 -p /dev/ttyACM0 -c "hw status"
2.10s
time client/proxmark3 -p /dev/ttyUSB0 -b 460800 -c "hw status"
2.22s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "hw status"
2.43s
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "hw status"
5.75s
time client/proxmark3 -p /dev/ttyUSB0 -b 9600 -c "lf read"
50.38s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "lf read"
6.28s
time client/proxmark3 -p /dev/ttyACM0 -c "mem dump f foo_usb"
1.48s
time client/proxmark3 -p /dev/ttyUSB0 -b 115200 -c "mem dump f foo_fpc"
25.34s
Sending multiple commands can still be slow because it waits regularly for incoming RX frames and the timings are quite conservative because of BT (see struct timeval timeout in uart_posix.c, now at 200ms). When one knows there is no response to wait before the next command, he can use the same trick as in the flasher:
// fast push mode
conn.block_after_ACK = true;
some loop {
if (sending_last_command)
// Disable fast mode
conn.block_after_ACK = false;
SendCommandOLD / SendCommandMix
if (!WaitForResponseTimeout(CMD_ACK, &resp, some_timeout)) {
....
conn.block_after_ACK = false;
return PM3_ETIMEOUT;
}
}
return PM3_SUCCESS;
Or if it's too complex to determine when we're sending the last command:
// fast push mode
conn.block_after_ACK = true;
some loop {
SendCommandOLD / SendCommandMIX
if (!WaitForResponseTimeout(CMD_ACK, &resp, some_timeout)) {
....
conn.block_after_ACK = false;
return PM3_ETIMEOUT;
}
}
// Disable fast mode and send a dummy command to make it effective
conn.block_after_ACK = false;
SendCommandNG(CMD_PING, NULL, 0);
WaitForResponseTimeout(CMD_ACK, NULL, 1000);
return PM3_SUCCESS;
## Reference frames
For helping debug...
* OLD command and reply packets are 544 bytes.
* NG & MIX command packets are between 10 and 522 bytes.
* NG & MIX reply packets are between 12 and 524 bytes.
On linux USB
* sent packets can be 544
* received packets are max 128, so 544 = 128+128+128+128+32
On linux UART (FTDI)
* sent packets are max 256, so 544 = 256+256+32
* received packets are max 512, so 544 = 512+32
`hw ping` (old version, mix reply)
TestProxmark: SendCommandOLD(CMD_PING, 0, 0, 0, NULL, 0);
->544=0901000000000000000000000000000000000000000000000000000000000000 -> OLD
CMD_PING: reply_mix(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
<-36=504d336218000000ff0000000000000000000000000000000000000000000000 <- MIX
`hw ping` (intermediate version using MIX)
CmdPing SendCommandMIX(CMD_PING, 0, 0, 0, NULL, 0);
->34=504d336118000901000000000000000000000000000000000000000000000000 -> MIX
CMD_PING reply_mix(CMD_ACK, reply_via_fpc, 0, 0, 0, 0);
<-36=504d336218000000ff0000000000000000000000000000000000000000000000 <- MIX
`hw ping` (current NG version)
CmdPing SendCommandNG(CMD_PING, data, len);
->10=504d3361008009016133 -> NG
CMD_PING reply_ng(CMD_PING, PM3_SUCCESS, packet->data.asBytes, packet->length);
<-12=504d33620080000009016233 <- NG
`hw ping 512` (NG)
CmdPing SendCommandNG(CMD_PING, data, len);
->522=504d336100820901000102030405060708090a0b0c0d0e0f1011121314151617 -> NG
CMD_PING reply_ng(CMD_PING, PM3_SUCCESS, packet->data.asBytes, packet->length);
<-128=504d3362008200000901000102030405060708090a0b0c0d0e0f101112131415 <- NG
<-128=767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495
<-128=f6f7f8f9fafbfcfdfeff000102030405060708090a0b0c0d0e0f101112131415
<-128=767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495
<-12=f6f7f8f9fafbfcfdfeff6233