amazon-kinesis-client-python

Recommended way of debugging a custom RecordProcessor

Open spg opened this issue 8 years ago • 6 comments

Looking at RecordProcessor::process_records

def process_records(self, process_records_input):
    ...
    try:
        for record in process_records_input.records:
            ...

    except Exception as e:
        sys.stderr.write("Encountered an exception while processing records. Exception was {e}\n".format(e=e))

I see that any exception raised in the try block is written to stderr. However, the stderr buffer is not flushed immediately. As a result, error messages do not show up in the MultiLangDaemon's stdout, which makes debugging harder.

In your opinion, what would be the best solution to this problem? Here are some ideas:

  • disable output buffering (executableName = python -u myscript.py)
  • modify RecordProcessor so that it uses amazon_kclpy.kcl._IOHandler, e.g.:
def process_records(self, process_records_input):
    ...
    try:
        for record in process_records_input.records:
            ...

    except Exception as e:
        # self.iohandler would have to be injected somewhere
        self.iohandler.write_error("Encountered an exception while processing records. Exception was {e}\n".format(e=e))
  • handle exceptions myself from MyCustomRecordProcessor at the process_record level:
import logging

logger = logging.getLogger()

class MyCustomRecordProcessor(RecordProcessor):
    def process_record(self, data, partition_key, sequence_number, sub_sequence_number):
        try:
            ...  # some exception is raised here
        except Exception as e:
            logger.error("Something wrong happened: {e}".format(e=e))
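Whichever option is chosen, the underlying fix is to get the message out of the buffer as soon as it is written. A minimal sketch of that idea, assuming errors go to stderr and are flushed explicitly; `report_exception`, `process_all` and `handle_record` are hypothetical names for illustration, not part of amazon_kclpy:

```python
import sys
import traceback


def report_exception(message, stream=None):
    """Write message plus the active traceback to stream, then flush at once.

    Must be called from inside an `except` block so a traceback is available.
    """
    stream = stream if stream is not None else sys.stderr
    stream.write(message + "\n")
    stream.write(traceback.format_exc())
    stream.flush()  # do not wait for the buffer to fill up


def process_all(records, handle_record, stream=None):
    """Apply handle_record to each record and report failures immediately."""
    failures = 0
    for record in records:
        try:
            handle_record(record)
        except Exception:
            failures += 1
            report_exception("Failed to process record {!r}".format(record), stream)
    return failures
```

Passing the stream as a parameter keeps the helper testable with an in-memory buffer; in a real processor it would default to `sys.stderr`.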

spg avatar Nov 21 '16 19:11 spg

There is a request for a way to log messages in the parent process (Issue #60), which we're still looking at.

For my own use I currently use my own logging to record data from the child processes.
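A minimal sketch of what such child-process logging could look like, assuming each child writes to its own rotating file (the standard library does not guarantee safe rotation when several processes share one file). The function name, logger name, directory, and size limits are illustrative assumptions, not part of amazon_kclpy:

```python
import logging
import logging.handlers
import os


def make_file_logger(log_dir, name="record_processor",
                     max_bytes=10 * 1024 * 1024, backups=3):
    """Return (logger, path) for a per-process rotating log file in log_dir."""
    os.makedirs(log_dir, exist_ok=True)
    # Include the PID so every child process gets its own file.
    path = os.path.join(log_dir, "{}.{}.log".format(name, os.getpid()))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from any root handlers
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.handlers.RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(process)d %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger, path
```

A record processor could call `make_file_logger("/some/log/dir")` once at import time and use the returned logger inside its exception handlers.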

pfifer avatar Jan 03 '17 21:01 pfifer

Hi @pfifer, I am using the KCL for Python to develop my consumer, and I need to change the log level of the output that is dumped to the console when I run the program using the command amazon_kclpy_helper.py --print_command --java <path-to-java> --properties samples/sample.properties. Can you please help?

ajoevarghese avatar Jan 19 '18 14:01 ajoevarghese

Hey, did you find a workaround for that? How can I log the KCL client output to a file?

akivaElkayamTrex avatar Oct 02 '18 13:10 akivaElkayamTrex

We've had really good success with logging with the KCL by using a Python script that wraps the whole thing. It essentially does this:

  • Invokes the KCL using subprocess, piping both its stderr and stdout to the system stdout
  • Uses a rotating file handler for our Python script logging
  • Invokes a subprocess for tail -f on that log file, piping its output to stderr

Running it this way, we essentially join the log streams of the KCL and our python logger. It's working really well for us.
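The launcher side of that setup might be sketched roughly as follows. This is a guess at the shape of such a wrapper, not the poster's actual script: the function names are hypothetical, `kcl_command` is assumed to be the Java command line (for example the one printed by amazon_kclpy_helper.py --print_command), and `tail` is assumed to be available on the host.

```python
import subprocess
import sys


def start_kcl(kcl_command):
    """Start the KCL; its stderr is merged into stdout, which we inherit."""
    return subprocess.Popen(kcl_command, stderr=subprocess.STDOUT)


def start_tail(log_path, sink=None):
    """Follow log_path with tail -f, sending output to sink (default: stderr)."""
    open(log_path, "a").close()  # make sure the file exists before tailing
    return subprocess.Popen(
        ["tail", "-f", log_path],
        stdout=sink if sink is not None else sys.stderr)


def run(kcl_command, log_path, sink=None):
    """Run the KCL until it exits while streaming the Python log file."""
    tail = start_tail(log_path, sink)
    kcl = start_kcl(kcl_command)
    try:
        return kcl.wait()  # exit code of the KCL process
    finally:
        tail.terminate()  # stop following the log once the KCL is gone
        tail.wait()
```

One caveat with rotation: plain `tail -f` keeps following the old file after it is renamed, whereas `tail -F` (GNU and BSD) reopens the file by name, so it may be the better fit for a rotating handler.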

For debugging, we use rpdb with a randomly selected (and logged) port.
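A minimal sketch of the random-port idea, under assumptions: `pick_free_port` and `start_remote_debugger` are hypothetical helpers, the port range is arbitrary, and the `rpdb.Rpdb(addr=..., port=...)` call reflects my understanding of the rpdb package, so check it against the installed version:

```python
import logging
import random
import socket

logger = logging.getLogger("debug_port")


def pick_free_port(low=20000, high=40000, attempts=50):
    """Return a randomly chosen port that could be bound on localhost."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind(("127.0.0.1", port))
            except OSError:
                continue  # port is taken, try another one
        return port
    raise RuntimeError("no free port found in range")


def start_remote_debugger():
    """Log the chosen port, then block until a client attaches."""
    import rpdb  # third-party package, imported lazily (assumed installed)
    port = pick_free_port()
    logger.warning("rpdb listening on 127.0.0.1:%d", port)
    rpdb.Rpdb(addr="127.0.0.1", port=port).set_trace()
```

Because each shard runs in its own child process, logging the port is what lets you tell which process to attach to, for example with `nc 127.0.0.1 <port>`.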

ghost avatar May 10 '19 14:05 ghost

So @jtfalkenstein, you're executing the Java KCL app from a Python script, and that in turn runs your record processor Python script?

eylonsa avatar Aug 04 '19 13:08 eylonsa

Yeah. I use a Python script to run a couple of subprocesses: one is the KCL Java app and the other is a tail -f that watches the log file at a known location. That way, I can pipe the log file to stderr and the KCL output to stdout (or simply squelch the KCL output, since it is VERY chatty). This means the Python process logs everything as you'd expect, and if you didn't look at the code, you'd never know it was streaming logs with a rotating file handler. It has proved to be very stable for us.

ghost avatar Aug 04 '19 13:08 ghost