Qcodes icon indicating copy to clipboard operation
Qcodes copied to clipboard

Add read until terminator support for ip instrument

Open jenshnielsen opened this issue 7 years ago • 26 comments

Hopefully fix stability issues that we have been seeing with the AMI 430 magnet

The idea is to ensure that we always recv until we get read termination char. The current implementation assumes that one recv is always enough to get all the data. That is probably true in almost all cases but might be wrong under heavily congested network traffic.

Still need to check the Terminator for the Mercury. Judging from the manual there may not be one. Currently it is set to None

jenshnielsen avatar Feb 19 '18 09:02 jenshnielsen

Should we not make the read_terminator default to terminator? I think we should always call recv until the read terminator has been seen.

sohailc avatar Feb 19 '18 11:02 sohailc

I agree. I wanted to test with other instruments before doing that but just doing a single recv without any checks is fundamentally wrong

jenshnielsen avatar Feb 19 '18 12:02 jenshnielsen

Codecov Report

Merging #971 (a6219b7) into master (18e9d4e) will increase coverage by 1.60%. The diff coverage is 16.00%.

:exclamation: Current head a6219b7 differs from pull request most recent head 5df574b. Consider uploading reports for the commit 5df574b to get more accurate results

@@            Coverage Diff             @@
##           master     #971      +/-   ##
==========================================
+ Coverage   65.56%   67.16%   +1.60%     
==========================================
  Files         227      145      -82     
  Lines       30626    17972   -12654     
==========================================
- Hits        20080    12071    -8009     
+ Misses      10546     5901    -4645     

codecov[bot] avatar Feb 19 '18 12:02 codecov[bot]

Other things I am concerned about:

  1. How do we prevent getting stuck in an infinite loop? Maybe we should have a timeout?
  2. How do we know the terminator string is at the end of the string we get from recv?

Maybe the following link can help us: http://code.activestate.com/recipes/408859-socketrecv-three-ways-to-turn-it-into-recvall/

sohailc avatar Feb 19 '18 13:02 sohailc

  1. We already set a timeout so the first call that times out will raise a TimeoutError. We should probably catch and handle that better.
  2. As long as we send commands one by one I think we should always get a buffer that ends with a terminator but if we send more than one command before reading then you are right it can be anywhere in the stream.

jenshnielsen avatar Feb 19 '18 13:02 jenshnielsen

I think we should be careful and not assume we are reading before another command is send. As is done in the example code I linked to:

  1. We should see if the terminator marker is in the string
  2. If the termination marker is more then one character, it can be broken up in two read actions.
  3. If the termination marker is half way down the string, what do we do with the part of the string after the marker? Its part of a response to another querry so it should be stored in a buffer somewhere and appended to the string of the next recv command.

All of this will complicate our code dramaticaly so I am wondering if we really should do all of this. I am leaning towards a "yes" though as it will make our code more robust.

sohailc avatar Feb 19 '18 13:02 sohailc

This kind of issues is also addressed in pyvisa-py (tcpip-socket backend). You may be interested in giving it a look.

MatthieuDartiailh avatar Feb 19 '18 13:02 MatthieuDartiailh

In general we should strongly consider if we really need this IP class. I would expect we can and I think we should replace both the magnet drivers with visa drivers that connects to visa over sockets.

jenshnielsen avatar Feb 19 '18 13:02 jenshnielsen

I agree! We do not want to duplicate efforts

sohailc avatar Feb 19 '18 13:02 sohailc

  • This now adds a comment suggesting to use sockets over visa.
  • Takes into account that the EOL may be split over more than one recv
  • Clips the EOL and any residual text from the recv string
  • Warns if any text follows the EOL. If that is the case we are doing something wrong

jenshnielsen avatar Feb 20 '18 08:02 jenshnielsen

@WilliamHPNielsen would be useful to look at this in the context of the mercury

jenshnielsen avatar Jun 20 '18 08:06 jenshnielsen

I'll test it with the old (current) IPInstrument MercuryiPS driver.

WilliamHPNielsen avatar Jun 20 '18 09:06 WilliamHPNielsen

@sohailc, do you see any reason to not merge this guy?

WilliamHPNielsen avatar Jun 25 '18 13:06 WilliamHPNielsen

Well I still don't like the way partial messages are treated. I think there is a more robust way to do this and one which does not throw away anything. Here is an example of how I would do this: https://gist.github.com/sohailc/a0fcbfff5359f5fc4cc1ad9f2e0707ba

sohailc avatar Jun 25 '18 13:06 sohailc

@sohailc imho this is of little consequence. If we ever read more than one reply it is likely due to a non empty output buffer even before whatever write command we did. Most likely the output of the last recv is the answer that you actually want. In principle we need to prevent this from ever happening as the reply is likely to be wrong no matter what.

Honestly perhaps we should just raise and close the connection in that case.

jenshnielsen avatar Jun 25 '18 14:06 jenshnielsen

Jens wrote: "If we ever read more than one reply it is likely due to a non empty output buffer even before whatever write command we did."

I'm sorry Jens, but I just don't agree. I think we need to write our code and flexible and general as possible and not include any assumptions like the one you are making. Keep in mind that this modification will not only affect the MercuryiPS driver but all IPInstruments. If we receive jibberish from the Mercury power controller the proper location to raise and close the connection is in the instrument driver, not in the base IPInstrument class which I think we should program as general as possible.

I you guys disagree feel free to merge this PR, but it is just not the way I would do things.

sohailc avatar Jun 25 '18 14:06 sohailc

Ok Sohail I will be more explicit. I think your solution will result in wrong results in all relevant cases for all our IP Instruments. I have identified a real use case where it will result in incorrect results. This applies to all IPInstruments and not specifically the mercury. I cannot see any use cases where it does anything useful. If you can identify such a use case I am happy to reconsider

Try looking at the _recv function and how it is used. It is only used after a send command. Each send command must result in a single reply or no reply. If it does not the whole class is fundamentally broken.

jenshnielsen avatar Jun 25 '18 15:06 jenshnielsen

I have not tested my particular implementation with any instruments. I was trying to explain that there should be more robust ways of dealing with instrument replies. I am sure you are right that my solution will result in wrong results, but can you share your reasoning? Which use cases have you identified?

As for your question of when my solution would be useful, what about if a reply to a _recv command is so long that an instrument sends it in two or more chunks? We have never seen that but we cannot altogether rule it out, especially for older instruments.

sohailc avatar Jun 25 '18 15:06 sohailc

You know what, I think you are right. The case that a reply is send in two chucks is already taken care of in this PR. I am approving this PR

sohailc avatar Jun 25 '18 15:06 sohailc

Here is a real world example for an instrument with two parameters volt and curr

The output buffer on the instrument contains a reply to a command that has never been read i.e. "1.0e1V\n". This is not read due to a crash at an earlier run. They you ask the instrument CURR? The _recv will then get"1.0e1V\n" or perhaps "1.0e1V\n5.0A\nfollowing the loop construct. In the first case we will get the wrong result in any case with both solutions. In the second case you will return1.0e1Vand keep5.0A` in the internal buffer and return that on the next command.

It's not about 2 or more chunks that's already handled correctly we are in any case reading until a termination char in this PR. It is only about two or more termination characters within one message. And as far as I can see your solution will still not handle that hypothetical situation correctly. If a command to the instrument returns two partial results separated by a newline char the ask command will still only return the first part and the next one will be left in the internal buffer of the Instrument for the next ask command giving an incorrect result here.

jenshnielsen avatar Jun 25 '18 15:06 jenshnielsen

@sohailc Just saw your message glad that we understand each other. This still need real world testing before merging can be considdered. And I think we should look into better handling of Interrupts within the _recv command. I suspect these are a source on the kind of inconsistency that I talk about

jenshnielsen avatar Jun 25 '18 15:06 jenshnielsen

Actually my solution does not handle it any better. I stand by what I said earlier. I think we should raise. Quoting the zen of Python In the face of ambiguity, refuse the temptation to guess.

jenshnielsen avatar Jun 25 '18 16:06 jenshnielsen

Also, I am not raising issues dealing with multiprocessing/threading, but technically I think our solution might be thread unsafe (right?). I am not objecting because for now QCoDeS is single process anyway. But what if two processes call _recv and the internal instrument buffer contains two responses?

I suppose we can deal with these kind of issues when we implement multiprocessing in the weeks/months to come?

sohailc avatar Jun 25 '18 16:06 sohailc

I suspect that no instrument is thread safe see https://github.com/QCoDeS/Qcodes/issues/621 for an example of what can happen in a Visa instrument. We should have a good system, locks or otherwise as part of a multiprocessing architecture to ensure that an instrument can only be called from one thread

jenshnielsen avatar Jun 25 '18 16:06 jenshnielsen

@jenshnielsen said:

This still needs real world testing before merging can be considered.

As you saw, I did some preliminary testing (basically: it doesn't crash upon init, and commands are sent and read). I'll be happy to try and reproduce an actual fail case and see how the code lives up. What would you consider sufficient testing with an actual instrument?

WilliamHPNielsen avatar Jun 26 '18 07:06 WilliamHPNielsen

The one thing that i suspect still needs improving is the behavior when doing interrupts in the middle of get and set commands.

I think it's hard to reproduce the issue that this aims to fix in a real world scenario. It is so far pure speculation that a recv may only return a partial command. (But according to the specs it is possible) But I suspect this may happen with high volumes of network traffic perhaps coupled with high cpu utilization

jenshnielsen avatar Jun 26 '18 07:06 jenshnielsen

We are generally moving away from the ip instrument so lets drop this

jenshnielsen avatar Nov 15 '22 13:11 jenshnielsen