Fix PN532 UART support + macOS support
The MacOS changes are unrelated, so I guess they would better fit in another PR?
Actually, I do not understand what this PR is supposed to fix. Could you explain what is the underlying issue?
@smortex @neomilium Hey all, I'm finding that I get significant failure rates on nfc_open() with the PN532 via UART (FTDI) by default. It failed every time with the default apps like examples/nfc-poll so I assumed it was fully broken. I've been retesting and I find I get failures about 80-90% of the time. I tested across two different PN532s, two different FTDI chips, two different USB cables, and two different USB ports. Note, I had to make the macOS changes to even get to the point to try this as whichever termios.h that's being included doesn't have B115200 set despite macOS's actual sys/termios.h does support 115200, and 115200 is the default baud rate on the PN532.
I tried a logic analyzer and found the chip wasn't responding at all to initial commands. Oddly, I also noticed not all of the consecutive 0x00's were being sent. Perhaps this is an issue with the libusb version used.
At quick glance of the data coming out, I assumed stop bits were missing and that the commands were being sent in one shot so I split out the UART TX to write on byte at a time (for pn53x only), and it started working intermittently. I added an additional delay equal to one bit and now I have 100% success rate with examples/nfc-poll.
If pn532 was already working successfully for other people then I'm guessing varying libusb's may be the culprit, though if this works successfully on other user systems, I'd suggest we keep it to increase compatibility with systems.

And here's the debug output using the original version (still patched to support baud 115200 on macOS)
tigerblood:/Users/samy/code/nfc/libnfc/build$ examples/nfc-poll
examples/nfc-poll uses libnfc 1.8.0
debug libnfc.general log_level is set to 3
debug libnfc.general allow_autoscan is set to true
debug libnfc.general allow_intrusive_scan is set to true
debug libnfc.general 1 device(s) defined by user
debug libnfc.general #0 name: "pn532ftdi", connstring: "pn532_uart:/dev/tty.usbserial-FTGNNC0P"
debug libnfc.driver.pn532_uart Attempt to open: /dev/tty.usbserial-FTGNNC0P at 115200 baud.
debug libnfc.bus.uart Serial port speed requested to be set to 115200 baud.
debug libnfc.chip.pn53x Diagnose
debug libnfc.chip.pn53x Timeout value: 500
debug libnfc.bus.uart TX: 55 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00
debug libnfc.chip.pn53x SAMConfiguration
debug libnfc.chip.pn53x Timeout value: 1000
debug libnfc.bus.uart TX: 00 00 ff 03 fd d4 14 01 17 00
debug libnfc.bus.uart Timeout!
debug libnfc.driver.pn532_uart Unable to read ACK
error libnfc.driver.pn532_uart pn53x_check_communication error
debug libnfc.chip.pn53x InRelease
debug libnfc.bus.uart TX: 00 00 ff 03 fd d4 52 00 da 00
debug libnfc.bus.uart RX: 00 00 ff 00 ff 00
debug libnfc.chip.pn53x PN53x ACKed
debug libnfc.bus.uart RX: 00 00 ff 03 fd
debug libnfc.bus.uart RX: d5 53
debug libnfc.bus.uart RX: 00
debug libnfc.bus.uart RX: d8 00
debug libnfc.general set_property_bool NP_ACTIVATE_FIELD False
debug libnfc.chip.pn53x RFConfiguration
debug libnfc.bus.uart TX: 00 00 ff 04 fc d4 32 01 00 f9 00
debug libnfc.bus.uart RX: 00 00 ff 00 ff 00
debug libnfc.chip.pn53x PN53x ACKed
debug libnfc.bus.uart RX: 00 00 ff 02 fe
debug libnfc.bus.uart RX: d5 33
debug libnfc.bus.uart RX: f8 00
debug libnfc.chip.pn53x PowerDown
debug libnfc.bus.uart TX: 00 00 ff 03 fd d4 16 f0 26 00
debug libnfc.bus.uart RX: 00 00 ff 00 ff 00
debug libnfc.chip.pn53x PN53x ACKed
debug libnfc.bus.uart RX: 00 00 ff 03 fd
debug libnfc.bus.uart RX: d5 17
debug libnfc.bus.uart RX: 00
debug libnfc.bus.uart RX: 14 00
debug libnfc.general Unable to open "pn532_uart:/dev/tty.usbserial-FTGNNC0P".
nfc-poll: ERROR: Unable to open NFC device.
tigerblood:/Users/samy/code/nfc/libnfc/build$
Perhaps this is an issue with the libusb version used.
Using UART devices, we pass through /dev/tty.serialXXX so there is not libusb involved here.
I added an additional delay equal to one bit and now I have 100% success rate with
examples/nfc-poll.
IMHO, we should rely on kernel to do these things. We probably need to tweak the way we send data to serial port (ie. maybe improve configuration we made on port opening).
Your detailed comment is really appreciated, even the implementation could help to target the problem but IMHO the final implementation should rely on kernel good-usage, not on a userland workaround.
Your detailed comment is really appreciated, even the implementation could help to target the problem but IMHO the final implementation should rely on kernel good-usage, not on a userland workaround.
Makes sense. I'm investigating why termios isn't sending all the bytes. For example you can see debug libnfc.bus.uart TX: 55 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00 above but the logic output only sends two of the 0x00's. Looks like adding CSTOPB to termios' c_cflag will add an additional stop bit, equivalent to a delay, but the missing 0x00's is still an issue.
I'm not sure what's required for Windows; I can investigate but I may not be able to test properly and would be concerned with making driver-specific changes without ability to test.
Think I found a culprit. uart_send() which write()s to the FD returns immediately on my system (macOS 10.14.6), yet the bytes haven't finished TX'ing! So while it returns successfully as if it wrote it out, it hasn't transmitted all the way, at least down the full stack -- maybe it's buffered at the FTDI side and did properly send out my system.
What ends up happening is when pn532_SAMConfiguration() is called and it writes out the next command, that overwrites the buffer, hence the first screenshot I displayed showing only 2 bytes of the first command (ff ff) followed by the second command which doesn't provide enough time for the PN532 to wake up.
I tried a tcdrain() but that didn't seem to affect anything.
int
uart_send(serial_port sp, const uint8_t *pbtTx, const size_t szTx, int timeout)
{
(void) timeout;
LOG_HEX(LOG_GROUP, "TX", pbtTx, szTx);
if ((int) szTx == write(UART_DATA(sp)->fd, pbtTx, szTx))
{
tcdrain(UART_DATA(sp)->fd);
return NFC_SUCCESS;
}
else
return NFC_EIO;
}
uart_send()whichwrite()s to the FD returns immediately on my system (macOS 10.14.6), yet the bytes haven't finished TX'ing!
:thinking: have you tried adding O_FSYNC to the call to open(2) that returns this file descriptor? I would expect write to return imediatly until the system's buffer is full but it's just a guess so it might help.
🤔 have you tried adding
O_FSYNCto the call toopen(2)that returns this file descriptor? I would expect write to return imediatly until the system's buffer is full but it's just a guess so it might help.
Just tried, looks like macOS doesn't have O_FSYNC, and I don't see a similar attribute in open(2). I tried removing O_NONBLOCK but something blocks indefinitely when I do that. I tried disabling O_NONBLOCK only when writing as well which failed in the same way as if it were non-blocking.
// set blocking mode
int flags = fcntl(UART_DATA(sp)->fd, F_GETFL);
fcntl(UART_DATA(sp)->fd, F_SETFL, flags & ~O_NONBLOCK);
// write out
int bytes = write(UART_DATA(sp)->fd, pbtTx, szTx);
// return old mode (may or may not be nonblocking)
// i tested with the next line both enabled commented out
fcntl(UART_DATA(sp)->fd, F_SETFL, flags);
O_SYNC? My man page say:
O_SYNC is a synonym for O_FSYNC required by POSIX.
AFAICR, MacOS is supposed to be POSIX.
If it still says that O_SYNC does not exist, I guess some #define is required… greping in the include directory might help :shrug:
Ah, thanks! O_SYNC compiles (odd that it's not in man open(2) while O_NONBLOCK is documented there) but same issue, the TX data is overwritten. I tested with and without O_NONBLOCK as well.
Warning: The following could drive you crazy :-)
The serial port handling, in C, compliant with more-or-less POSIX platforms (e.g. Linux, FreeBSD, macOS), compliant with Windows, working with bad hardware, working through RS232, USB, Bluetooth, etc. was, is and will source of headache...
Maybe, we could try to use a library that already handle these corner-cases like: libserialport from sigrok project:
https://sigrok.org/wiki/Libserialport
I know it could be a big turn, but could be at least the right moment to test it, and see if the macOS support is better than ours.
Adding libserialport is likely a much larger undertaking than I'm interested in at the moment; I just wanted to try out libnfc and have it work natively on my mac :) and send a reasonable patch back so that others could do the same. I think for now I'll adjust my UART changes to ifdefs for Apple so the additional timing only will occur on macOS and will behave normally elsewhere. This way it won't impact any other users while still allowing macOS people to use it out of the box, assuming the new PR is considered acceptable.
I'm happy to contribute more if I have other projects in the area but at the moment I just wanted to try out the example apps to see how they worked, and it always feels nicer to get things working natively!