I’ve been working on a project with some very unfamiliar tech. The project involves communication between a new Android app (Kotlin) and an FTDI 232R connected to an Arduino. I encountered a problem that baffled everyone on the team for weeks and was about to be labeled a “general incompatibility between FTDI devices and Android apps” until I stumbled upon the solution. Before I describe the solution, I’m going to document some basic details. While working on this, I was unable to find any examples of people having this problem, so there was an intense feeling of isolation as I struggled on and off for weeks to resolve it. My hope is that this might help someone identify the same problem in their system.
If you’re experiencing mystery “0 bytes available” errors, you might need to change your latency timer setting. The problem is described in a short excerpt from FTDI’s documentation. I also strongly recommend you read the longer document from which that excerpt is drawn, AN232B-04 Data Throughput, Latency and Handshaking. We immediately resolved our issue with a call to `setLatencyTimer((byte) 1);` and very small reads (64 bytes at a time, no more) but ultimately settled on an event character and larger reads. Full details below.
Our Arduino’s firmware is capable of sending a few different messages across the wire. Each message is small, anywhere from 16 bytes up to around 256. Most of these are on-demand: the application sends a command, the Arduino decodes it, and then it sends one message in response that is either an ACK or the requested data. There is one exception: one particular message from the app triggers an infinite stream of 44-byte messages at a rate specified in the request. In this case, the Arduino is reading sensors, performing some basic analysis, and spitting the results out across the wire for the app to do with as it pleases. The app reads this constant stream of bytes, does its own analysis, puts it on the screen, etc.
Our minimum acceptable streaming rate is 300 Hz but we hope for closer to 500 Hz or greater, so our baud rate is currently 460800.
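As a rough sanity check on those numbers, the sketch below works out how much of the line the stream consumes. It assumes 8N1 framing (10 bits on the wire per byte), which this post does not state, and the class and method names are made up for illustration.

```java
/** Back-of-the-envelope throughput math for a fixed-size message stream. */
class StreamBudget {
    /** ASSUMPTION: 8N1 framing, i.e. 1 start + 8 data + 1 stop bit per byte. */
    static final int BITS_PER_BYTE_8N1 = 10;

    /** Payload bytes per second produced by the stream. */
    public static int payloadBytesPerSecond(int messageBytes, int rateHz) {
        return messageBytes * rateHz;
    }

    /** Line rate in bits per second the stream needs, framing bits included. */
    public static int requiredBitsPerSecond(int messageBytes, int rateHz) {
        return payloadBytesPerSecond(messageBytes, rateHz) * BITS_PER_BYTE_8N1;
    }

    /** Fraction of the configured baud rate that the stream consumes. */
    public static double lineUtilization(int messageBytes, int rateHz, int baud) {
        return (double) requiredBitsPerSecond(messageBytes, rateHz) / baud;
    }

    public static void main(String[] args) {
        int baud = 460_800;
        for (int hz : new int[] {300, 500}) {
            System.out.printf("%d Hz: %d B/s, %d bit/s, %.0f%% of %d baud%n",
                    hz,
                    payloadBytesPerSecond(44, hz),
                    requiredBitsPerSecond(44, hz),
                    100 * lineUtilization(44, hz, baud),
                    baud);
        }
    }
}
```

Under that framing assumption, the 44-byte stream needs 13,200 bytes per second at 300 Hz and 22,000 at 500 Hz, which is roughly 29% and 48% of a 460800 baud line. That suggests raw bandwidth leaves plenty of headroom at this baud rate.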
We encountered an issue whereby the app was constantly being told 0 bytes were available for read. The problem was extremely inconsistent and weird. The following were true:
- We could ALWAYS open the port from our app.
- We could ALWAYS transmit successfully from the app to the Arduino. We knew this because logs on the firmware indicated that the right bytes arrived in the right order.
- We would SOMETIMES receive the correct response. It was all or nothing: sometimes we would query for available bytes and be told 0, other times we would see the expected number.
- We could RARELY start the data stream. Once we sent the message to start streaming, the app would always believe there were 0 bytes available for read. Once that state was encountered, no other messages would be sent across the wire until we rebooted the firmware. It seemed to be more likely to fail as our streaming rate exceeded 100 Hz. Our target was 300 Hz or greater, so this was a serious problem!
Adding to the mystery, this seemed specific to the FTDI chip. Our first draft of this used the Arduino’s programming port for serial data transfer at 115200 baud. We were losing a lot of packets from the lack of flow control but it never failed to respond to messages.
More troubling was the fact that a C++ test application seemed to communicate correctly. This pointed towards a code problem with the Android app.
We tried three different libraries in attempts to resolve this. Those were:
- usb-serial-for-android - An open-source library that is pretty well maintained and offers a lot of features. Unfortunately, it doesn’t support automatic flow control, so we worried we wouldn’t be able to use it long term.
- UsbSerial - Another open-source library. This one is not nearly as well maintained and it has quite a few open issues that describe some pretty heinous bugs. I opened an issue after I found that calling the wrong method during initialization would result in all your sent messages being replaced by two 0 bytes for every one byte in your message! Brutal. It supports flow control but it has so many problems that I unfortunately couldn’t recommend it, even if it supported what we needed.
- FTDI’s official d2xx - The official closed-source library for FTDI devices. It hasn’t been updated in two years but by virtue of being official, we expected it to be more reliable or at least more full-featured. The closed-source part is a bummer and I think it would be a much better library if not for that, but that’s another story. This was the library we wound up using and we will continue to do so.
All three of these libraries exhibited the same behavior! This started looking like a major issue with FTDI devices. I ordered a few Prolific PL2303-based serial cables to test as an alternative but kept researching in the meantime.
I began looking at FTDI’s official test apps and their example Android app code. The example code is… not… great… but in taking notes, I came across a mysterious call to `setLatencyTimer()`. This led me to FTDI’s short write-up on the latency timer, which appeared to describe our problem exactly. It specifically remarks, “While the host controller is waiting for one of the above conditions to occur, NO data is received by our driver and hence the user’s application. The data, if there is any, is only finally transferred after one of the above conditions has occurred.” I did some more reading and found the longer AN232B-04 Data Throughput, Latency and Handshaking, which explained this and many other concepts. This document was particularly enlightening. I feel like the embedded software development world is full of extremely dense, unapproachable technical specs that assume a ton of highly specific knowledge; by comparison, this document was a breath of fresh air and explained things from a high enough level that I came away feeling more capable of anticipating behavior as I continued troubleshooting.
It appeared that we were never hitting any of the three rules fast enough to trigger a read. It still doesn’t totally make sense to me; I feel like we should have eventually hit 4 KB to trigger the send, but maybe I never let it sit long enough to get there? Or maybe there was another timeout value that was clearing the buffer before then. What I do know is that if I set the latency timer down to 1 ms and ensured we never requested more than 64 bytes at a time, our data read problems went away. We could stream at 500 Hz and messages would usually start showing up as soon as we hit the button. This change was as simple as `setLatencyTimer((byte) 1);` and making sure that we never requested more than 64 bytes during a call to `read`. The immediate problem was solved and it was clear that we did not have some incompatibility between FTDI and our Android app.
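Here is a minimal sketch of that first fix: lower the latency timer once, then drain the port without ever asking for more than 64 bytes per call. The `SerialDevice` interface is a hypothetical stand-in so the example runs on its own. Its method names are modeled on the calls mentioned in this post, but treat the exact signatures as assumptions rather than the d2xx API. `FakeDevice` exists only to exercise the loop.

```java
import java.io.ByteArrayOutputStream;

/** HYPOTHETICAL stand-in for a serial device handle; not a real library type. */
interface SerialDevice {
    void setLatencyTimer(byte millis);   // how long the chip waits before flushing a short packet
    int getQueueStatus();                // number of bytes currently waiting to be read
    int read(byte[] buffer, int length); // returns the number of bytes actually read
}

class CappedReader {
    /** Never request more than this many bytes in a single read call. */
    static final int MAX_READ = 64;

    /** Reads everything currently available, MAX_READ bytes or fewer per call. */
    public static byte[] drain(SerialDevice device) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[MAX_READ];
        int available;
        while ((available = device.getQueueStatus()) > 0) {
            int n = device.read(chunk, Math.min(available, MAX_READ));
            if (n <= 0) {
                break; // nothing came back; try again on the next pass
            }
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}

/** In-memory fake used only to demonstrate the loop; records the largest request. */
class FakeDevice implements SerialDevice {
    private final byte[] data;
    private int position = 0;
    int largestRequest = 0;
    byte latencyMillis = 16; // 16 ms is the default FTDI documents for the latency timer

    FakeDevice(byte[] data) {
        this.data = data;
    }

    public void setLatencyTimer(byte millis) {
        latencyMillis = millis;
    }

    public int getQueueStatus() {
        return data.length - position;
    }

    public int read(byte[] buffer, int length) {
        largestRequest = Math.max(largestRequest, length);
        int n = Math.min(length, data.length - position);
        System.arraycopy(data, position, buffer, 0, n);
        position += n;
        return n;
    }
}
```

With a real device, the equivalent would be to call `setLatencyTimer((byte) 1)` once after opening the port and then call something like `drain` from the read loop.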
I say that it would “usually” start showing up because it still exhibited strange behavior. Very often, I would start the stream through the app’s interface and nothing would happen. Then I’d send another message (“get hardware version”) and not only would it get my hardware version, it would also recognize that data was streaming in. Other times, I would request our largest payload, a system configuration, and it would return 31 bytes of the 200+ we expect. Just like with the stream, I’d send any other message (“get firmware version”) and the remaining bytes would show up.
I wound up making a few other changes to resolve this problem and improve the behavior overall.
First, using more information gleaned from the Data Throughput, Latency and Handshaking document, I thought that we would be better off using the FTDI’s support for Event Characters than the latency timer. Our encoding rules use a 0 byte as a delimiter, so it was an obvious choice. This allowed me to increase our maximum read size up to 256 bytes, which helped in the event that our read loop was delayed and we had to quickly get through a backlog of data. (I could probably go higher, but I’m being pretty careful right now; I want to keep things moving.)
Finally, I modified the read loop to also be responsible for writes, added a FIFO queue for outgoing messages, and (crucially) added a 50 ms timeout in the loop after every single message sent. The 50 ms timeout was the most significant piece: it was the final change that ensured that we stopped seeing partial messages or messages that only arrived after a subsequent send. I don’t have a good answer for why that was necessary, but given the complexities of the d2xx library, reading from USB in general, the FTDI and its buffers, and the Arduino, it’s not too surprising that things can get out of sync if you’re moving fast.
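Even with an event character on the chip, a single read can still contain a partial message or several messages back to back, so the app has to split the byte stream on the delimiter itself. Below is a minimal sketch of that reassembly, assuming (as described above) that a 0 byte never appears inside a message. The class name and its behavior of skipping empty frames are illustrative choices, not our actual code.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Splits an incoming byte stream into messages terminated by a 0 byte. */
class ZeroDelimitedFramer {
    /** Bytes of the message currently being assembled (no delimiter seen yet). */
    private final ByteArrayOutputStream partial = new ByteArrayOutputStream();

    /**
     * Feeds the first {@code length} bytes of {@code chunk} and returns every
     * message those bytes completed, without the trailing delimiter.
     */
    public List<byte[]> feed(byte[] chunk, int length) {
        List<byte[]> frames = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            if (chunk[i] == 0) {
                // Delimiter: emit the accumulated message; skip empty frames
                // produced by back-to-back delimiters (an illustrative choice).
                if (partial.size() > 0) {
                    frames.add(partial.toByteArray());
                }
                partial.reset();
            } else {
                partial.write(chunk[i]);
            }
        }
        return frames;
    }

    /** Number of bytes buffered for a message that has not ended yet. */
    public int pendingBytes() {
        return partial.size();
    }
}
```

Each buffer returned by a read is passed to `feed` along with the byte count; anything after the last delimiter stays buffered until a later read completes it.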
With the implementation of the event character, the buffered writes added to the loop, and the timeout after writing, we appear to be running smoothly. So smoothly, in fact, that I was able to remove the `setLatencyTimer` call entirely and just leave it at its default. As configured, data is sent as soon as a 0 is hit or 256 bytes are available, whichever comes first. (Typing this out, I realize that I should probably just set it to the exact size of our largest message. There’s no way it could ever be smaller, and having an incomplete message does us no good!)
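The combined loop can be sketched roughly as follows: one pass sends at most one queued message, pauses, then reads whatever is available, up to 256 bytes. This is a simplified illustration and not our production code. `SerialLink` is a hypothetical interface, `EchoLink` is a loopback fake that exists only so the example runs by itself, and the pause length is a constructor parameter rather than a hard-coded 50 ms.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

/** HYPOTHETICAL minimal port abstraction; not a real library type. */
interface SerialLink {
    int available();
    int read(byte[] buffer, int length);
    int write(byte[] data, int length);
}

class IoLoop {
    static final int MAX_READ = 256;

    private final SerialLink link;
    private final long pauseAfterWriteMs;
    private final Consumer<byte[]> onBytes;
    private final Queue<byte[]> outgoing = new ConcurrentLinkedQueue<>();
    private final byte[] buffer = new byte[MAX_READ];

    IoLoop(SerialLink link, long pauseAfterWriteMs, Consumer<byte[]> onBytes) {
        this.link = link;
        this.pauseAfterWriteMs = pauseAfterWriteMs;
        this.onBytes = onBytes;
    }

    /** Safe to call from any thread; only the loop thread touches the port. */
    public void send(byte[] message) {
        outgoing.add(message);
    }

    /** One pass: write at most one queued message, pause, then read what is there. */
    public void step() {
        byte[] message = outgoing.poll(); // FIFO order
        if (message != null) {
            link.write(message, message.length);
            try {
                Thread.sleep(pauseAfterWriteMs); // give the device time to respond
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        int available = link.available();
        if (available > 0) {
            int n = link.read(buffer, Math.min(available, MAX_READ));
            if (n > 0) {
                onBytes.accept(Arrays.copyOf(buffer, n));
            }
        }
    }
}

/** Loopback fake for demonstration: every byte written becomes readable. */
class EchoLink implements SerialLink {
    private final ArrayDeque<Byte> pending = new ArrayDeque<>();

    public int available() {
        return pending.size();
    }

    public int read(byte[] buffer, int length) {
        int n = Math.min(length, pending.size());
        for (int i = 0; i < n; i++) {
            buffer[i] = pending.poll();
        }
        return n;
    }

    public int write(byte[] data, int length) {
        for (int i = 0; i < length; i++) {
            pending.add(data[i]);
        }
        return length;
    }
}
```

In an app, `step()` would be called repeatedly from a single background thread, while UI code only ever calls `send`. Keeping all port access on one thread is the main design point of the sketch.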
To summarize, we went through two rounds of improvements that changed our situation from bleak to beautiful.
Round one:
- Set the FTDI’s latency timer to 1 ms
- Limit our max read size to something small to prevent a “jerky” feel
Round two:
- Revert latency timer to default
- Enable an event character keyed to our delimiter, a 0 byte (this is the key)
- Set a max read that’s a bit bigger than our typical messages to help us catch up if we ever have a huge backlog and want to get the queue down (again, I don’t know what this situation is)
As it happens, the d2xx library is the only one of the three that supports configuration of latency timer, event character, and flow control. One of the two open source libraries supports the latency timer, the other claims to support flow control, and neither supports the event character. Only the closed-source official FTDI library d2xx supports all three, so we’ll be sticking with that.
It appears that our use case of an extremely high streaming rate combined with tiny messages at a very high baud rate is somewhat unique. If we had been sending larger messages at a slower rate, I don’t think we would have encountered this. Our 44-byte messages at 300 Hz were the problem.
I spent many lonely weeks fighting with this. Failure to resolve it would have been a major problem for the project. In the end, the solutions I found were new to the whole team, which included many people with much more experience than me when it came to FTDI chips, which goes to show how esoteric some of these configuration parameters are. This is my first project writing Kotlin, working on Android, or using FTDI devices at all, so while I’m disappointed that it was such an unpleasant struggle to get it done, I am pleased to have it behind me. I sincerely hope this helps somebody avoid going through the same experience.