Image corruption with PtGrey BlackFly GigE camera
Describe the bug
I see occasional corruption artefacts in images taken with Aravis from a PtGrey BlackFly GigE camera. These are visible as one or a few horizontal stripes of mismatching color in the image. Such artefacts are present in 1% to 50% of images, depending on network load.
It seems that the artefacts are triggered by packet_resend commands. I have discovered the following by analyzing network packets that were received during frame corruption:
- While streaming packets from camera to PC, some packets are lost due to network overload.
- For example, the PC receives packet_ids ..., 263, 264, 268, 269, 272, 273, ...
- At this point, ArvGvStream detects missing packets and sends a packet_resend command to the camera to retransmit packet_ids 265, 266, 267 and packet_ids 270, 271.
- The camera then interrupts its packet stream to resend the requested packets, for example: ..., 1367, 1368, 265, 266, 267, 1369, 1370, ...
- So far everything is fine. But problems begin when the camera handles the second resend command a little bit later. It sends packet_ids: 1406, 1407, 1369, 270, 271, 1409, 1410. However, the packet with packet_id 1369 actually contains the data that belongs to packet_id 1408. For some reason, the camera puts an incorrect packet_id value in the header of that packet.
- ArvGvStream detects packet_id 1369 as a duplicate packet. In such cases, ArvGvStream uses the data of the duplicate packet, thus replacing the data of the first (correct) packet_id 1369. This causes a corruption artefact in the image, because the duplicate did not contain correct data for packet_id 1369.
- Also, ArvGvStream detects packet_id 1408 as missing, and sends a resend command to have it retransmitted.
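To make the gap detection in the steps above concrete, here is a minimal Python sketch that finds the missing packet_id ranges in a sequence of received ids. It is a simplified model for illustration only, not the actual ArvGvStream logic; the function name `missing_ranges` is hypothetical.

```python
def missing_ranges(received_ids):
    """Return inclusive (first, last) ranges of packet ids that were skipped.

    Simplified model: a gap is reported whenever an id jumps ahead of the
    highest id seen so far. Ids that go backwards (resent packets) are ignored.
    """
    gaps = []
    highest = None
    for packet_id in received_ids:
        if highest is not None and packet_id > highest + 1:
            gaps.append((highest + 1, packet_id - 1))
        if highest is None or packet_id > highest:
            highest = packet_id
    return gaps


# The sequence from the report: 265-267 and 270-271 were lost.
print(missing_ranges([263, 264, 268, 269, 272, 273]))
# prints [(265, 267), (270, 271)]
```

Each reported range corresponds to one resend request in the scenario described above.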
It seems like this is a bug in the PtGrey firmware, not in Aravis. However, I can't be sure because I don't have access to GigE protocol documentation. I would really appreciate some advice. I can report this issue at PtGrey, but don't think they will take it very seriously since it does not occur with their own software.
With respect to handling of duplicate packets in ArvGvStream, it would make more sense to me if the code would completely ignore duplicate packets. In my case the first packet was good and the duplicate packet is corrupt, so ignoring the duplicate packet would solve my problem. Any thoughts on this?
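The difference between the two duplicate-handling policies can be shown with a small Python model. This is only a sketch under my own assumptions: packets are `(packet_id, payload)` tuples, the payload string names the data the packet really carries, and `assemble` is a hypothetical helper, not an Aravis function.

```python
def assemble(packets, ignore_duplicates):
    """Store payloads by packet id; return (slots, duplicate_count)."""
    slots = {}
    duplicates = 0
    for packet_id, payload in packets:
        if packet_id in slots:
            duplicates += 1
            if ignore_duplicates:
                continue  # keep the first copy, drop the duplicate
        slots[packet_id] = payload
    return slots, duplicates


# Shortened version of the observed stream. The second packet labelled 1369
# actually carries the data of packet 1408.
stream = [
    (1367, "data-1367"), (1368, "data-1368"), (1369, "data-1369"),
    (1406, "data-1406"), (1407, "data-1407"),
    (1369, "data-1408"),  # mislabelled by the camera
    (1409, "data-1409"),
]

overwritten, dup_a = assemble(stream, ignore_duplicates=False)
kept, dup_b = assemble(stream, ignore_duplicates=True)

print(overwritten[1369])  # prints data-1408 (slot 1369 is corrupted)
print(kept[1369])         # prints data-1369 (first copy preserved)
print(dup_a, dup_b)       # prints 1 1 (the duplicate is counted either way)
```

In both cases packet 1408 is still absent from the slots, so it would have to be requested again, which matches what I observe.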
To Reproduce
- Connect the PtGrey BlackFly GigE camera to a PC via a Gbit switch, and connect a second PC to generate background network traffic.
- Run arv-viewer. Start a video stream from the camera. Maximize the window such that single-scanline artefacts will be visible.
- Start transmitting background network traffic from the extra PC to the Aravis PC. For example, I simply open a TCP connection between the PCs and stream /dev/urandom from the second PC to /dev/null on the Aravis PC.
- Notice that artefacts start to show up in the video stream as soon as the background network traffic starts.
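For anyone who wants to script the background traffic from the steps above, a rough Python equivalent of streaming /dev/urandom to /dev/null over TCP might look like this. It is a sketch, not the exact setup I used; the function names, the chunk size and the choice of port are all assumptions.

```python
import os
import socket
import time


def run_sink(server_socket, totals):
    """Accept one connection and discard all received data.

    The length of every received chunk is appended to `totals` so the caller
    can check how much traffic arrived.
    """
    conn, _ = server_socket.accept()
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            totals.append(len(chunk))


def flood(host, port, seconds, chunk_size=65536):
    """Send random bytes to host:port for roughly `seconds` seconds.

    Returns the number of bytes sent.
    """
    payload = os.urandom(chunk_size)
    sent = 0
    deadline = time.monotonic() + seconds
    with socket.create_connection((host, port)) as sock:
        while time.monotonic() < deadline:
            sock.sendall(payload)
            sent += len(payload)
    return sent
```

The idea is to run `run_sink` on the Aravis PC with a listening socket and call `flood` on the second PC with the Aravis PC's address, for as long as the video stream should be disturbed.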
Camera description:
- Manufacturer: PtGrey
- Model: BlackFly BFLY-PGE-31S4M, firmware version 1.61.3.00
- Interface: GigE
Platform description:
- Aravis version: confirmed with Aravis 0.6.0 and with current Github (commit 22c48eb8582312105689149f356278093da5fb91)
- OS: Debian 10.1
- Hardware: x86_64
Hi Joris,
Thanks a lot for the detailed analysis of the issue.
> It seems like this is a bug in the PtGrey firmware, not in Aravis. However, I can't be sure because I don't have access to GigE protocol documentation. I would really appreciate some advice. I can report this issue at PtGrey, but don't think they will take it very seriously since it does not occur with their own software.
Don't hesitate to contact PointGrey and point them to this report. I'm pretty sure they will take it seriously. They already have shown their interest in Aravis by donating devices.
> With respect to handling of duplicate packets in ArvGvStream, it would make more sense to me if the code would completely ignore duplicate packets. In my case the first packet was good and the duplicate packet is corrupt, so ignoring the duplicate packet would solve my problem. Any thoughts on this?
That definitely makes sense. Could you propose a patch that rejects duplicate packets?
Cheers.
Hello Emmanuel,
Thanks for your comments. I reported the issue to PtGrey support. If something useful comes out of it, I will report it in this issue.
I modified ArvGvStream to fully ignore duplicate packets. This works for my scenario: duplicate packets still occur (as reported by the duplicate packet counter), but the images are now free of artefacts.
My changes are in pull request #313
Hi Joris.
I have pushed your fix to master. As I said in the commit message, it seems that late resent packets may carry not only a wrong packet id but also a wrong frame id. In that case the workaround doesn't help, and that is actually why these wrong packets cause visible artifacts. But it happens less often, so your fix is still an improvement.
For the record, here are the commands I'm using in order to generate traffic on the ethernet link:
Aravis host:
iperf -s
Second host:
iperf -t 3600 -c aravis-host -d
Here is an example of a test pipeline that can help to see corruptions, when the camera is set to emit test images:
./src/arv-tool-0.8 -n PointGrey-13125101 control TestImageSelector=TestImage1
./gst/gst-aravis-launch aravissrc camera-name="PointGrey-13125101" ! videoconvert ! videodiff ! videoconvert ! fpsdisplaysink sync="false"
Thanks for your help with the workaround.
Regarding the second type of problem, I can now reproduce that type of image corruption as well. We initially did not see this problem because we always use the camera at a reduced frame rate.
If these new artefacts are caused by resent packets with incorrect frame_id (I have not yet verified that this is the cause in my case), then perhaps this can be mitigated by rejecting packets with type GVSP_PACKET_TYPE_RESEND unless we have actually requested a resend of that specific packet.
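To illustrate what I mean, here is a minimal Python sketch of such a filter. It only models the bookkeeping; the class `ResendFilter` and its methods are hypothetical and do not exist in Aravis, and the `is_resend` flag stands in for whatever marks a packet as resent (GVSP_PACKET_TYPE_RESEND in the suggestion above).

```python
class ResendFilter:
    """Accept resent packets only if we actually asked for them."""

    def __init__(self):
        # Set of (frame_id, packet_id) pairs with an outstanding resend request.
        self._pending = set()

    def request_resend(self, frame_id, first_id, last_id):
        """Record that packets first_id..last_id (inclusive) were requested."""
        for packet_id in range(first_id, last_id + 1):
            self._pending.add((frame_id, packet_id))

    def accept(self, frame_id, packet_id, is_resend):
        """Return True if the packet should be used, False if it is rejected."""
        if not is_resend:
            return True  # ordinary packets are not filtered here
        key = (frame_id, packet_id)
        if key in self._pending:
            self._pending.remove(key)  # each request is satisfied only once
            return True
        return False


resend_filter = ResendFilter()
resend_filter.request_resend(frame_id=7, first_id=270, last_id=271)
print(resend_filter.accept(7, 270, is_resend=True))   # prints True
print(resend_filter.accept(7, 1369, is_resend=True))  # prints False
print(resend_filter.accept(8, 271, is_resend=True))   # prints False
```

A resent packet with a bogus frame_id or packet_id would then be dropped because it matches no outstanding request, at the cost of having to request the real packet again.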
It looks like https://github.com/AravisProject/aravis/commit/a31df1206d34a30ad4ef5e5a1bfc027ee4e23728 doesn't fix the issue, so my guess about a bogus frame id is probably wrong, unless the fix itself is wrong.
With regard to the original issue (false duplicate packets): PtGrey say they failed to reproduce the issue. They created an internal ticket for my bug report, but I think it will have low priority unless they can reproduce it with their software.