
Multi thread functions show only one bar

Open AndrewLauu opened this issue 6 years ago • 23 comments

Description

I am trying to build a multi-threaded downloader with progress bars. I want to get the following result:

thread-1: |###### | ETA: 5s
thread-2: |#### | ETA: 7s
thread-3: |######## | ETA: 3s

Instead, I only get a single bar in the console, with a changing prefix (thread-1, thread-3, ...) and changing progress.

I have searched a lot but couldn't find a solution, so I hope to get your help.

The simplified code is as follows:

Code

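def download():
    response = requests.get(url, stream=True)
    # Open the file once, in binary mode, instead of reopening it for every chunk
    with open(dir, 'wb') as f:
        for chunk in progressbar.progressbar(response.iter_content(...)):
            f.write(chunk)

threads = []
for i in range(3):
    t = threading.Thread(target=download, ...)
    t.start()  # call start(); a bare `t.start` does nothing
    threads.append(t)
for t in threads:
    t.join()  # wait for every thread, not only the last one created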

Versions

  • Python version: 3.7
  • Python distribution/environment: PyCharm
  • Operating System: Windows 8.1

AndrewLauu avatar Nov 07 '18 16:11 AndrewLauu

Well... honestly, it's the first time this has come up so far. The module currently has no support for threads whatsoever.

I suppose it would be possible, with a bit of work, to let this module take care of the printing: https://github.com/Yinzo/reprint

The big issue is that all of the progress bars would need to print at the same time, so they need to write to some intermediate output which writes to the screen regularly.

I'll put it on the todo but it's going to take some time to implement.

wolph avatar Nov 08 '18 02:11 wolph

Thanks for replying.

I'm wondering whether printing the 3 progress bars together and using the thread name to control which one gets updated would help.

def down(url):
    thread_name = threading.current_thread().name
    for chunk in response.iter_content():
        # do something
        update_bar(thread_name)

def handle():
    t = threading.Thread(target=down)
    # print 3 bars and wait for updates

def update_bar(thread_name):
    bar.update(thread_name)

I'll try this : )

AndrewLauu avatar Nov 09 '18 05:11 AndrewLauu

May I chime in on this? I wanted to show you something, @WoLpH.

Just run the attached script; it will show a demo. I set this up for a multi-threaded distutils build. I know it works on Windows. I wrote the code to work on Linux as well, but I have not tested that portion of it.

This runs 10 threads, and each progress bar updates independently without needing to redraw the whole thing.

pyozw_progressbar.zip

kdschlosser avatar Nov 19 '18 22:11 kdschlosser

Oh, and one other thing. Running any kind of Python script that manipulates the terminal/console will most likely not function properly if run inside of an IDE. With PyCharm specifically, the "console" is not a Windows console; it is a terminal. Because of this, it does not respond to any Windows-specific commands for manipulating the console. It does, however, respond to ANSI escape codes, which the Windows console does not. It is always a safe bet to give it a go from a console (cmd) window and see if things work properly.

I do not know how @WoLpH does his cursor movements. In the script I posted above, I use ctypes to access the Windows API to move the cursor, and on Linux it can only be done with ANSI escape codes. Using either mechanism is how I am able to update one bar at a time and not all of them.

I have a class that acts like a controller, and that controller creates an instance of a progress bar for each thread. When a bar is started, it stores the row where the cursor is located and then draws the bar, leaving the cursor in place for the next bar that has to draw. When each bar goes to update independently, it simply moves the cursor to column 0 of the row it stored and then draws over the existing bar. On Windows, getting and setting the cursor position is done by column and row. On Linux you can get the column and row, but you have to move relative to the current position, so some math is needed.
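For readers following along, here is a minimal sketch of the "store the row, jump back to it, draw over the old bar" idea. It is not the code from the attached script. It assumes a terminal that honors the standard ANSI/VT100 sequences for saving the cursor, absolute positioning and restoring the cursor, and the function name `redraw_at_row` is made up for illustration.

```python
import sys

ESC = '\x1b'


def redraw_at_row(text, row, stream=sys.stdout):
    """Overwrite one terminal row (1-based) and leave the cursor where it was.

    Hypothetical helper: ESC 7 saves the cursor, ESC [ row ; 1 H jumps to
    column 1 of the stored row, and ESC 8 restores the saved cursor.
    """
    stream.write(f'{ESC}7{ESC}[{row};1H{text}{ESC}8')
    stream.flush()
```

Each bar would remember the row it was first drawn on and call something like `redraw_at_row('thread-1: |###', row)` on every update, so only that one line is rewritten.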

kdschlosser avatar Nov 19 '18 22:11 kdschlosser

The current method effectively comes down to this simple bit of code: https://github.com/WoLpH/python-progressbar/blob/02164e1d6db4c047c625fe368d4aece3ff4403a0/progressbar/bar.py#L143-L146

So nothing too fancy, simply overwriting the line again. The method you're using is a lot more advanced, you're actually moving around within the terminal so you can rewrite multiple lines. That could make things a lot easier than what I was thinking about initially (always having to rewrite everything) but I would need to test to make sure that works. And I'm completely unsure where this method would and would not work.

Within PyCharm it will definitely not work, but PyCharm has several output bugs regardless of that.

Thanks a lot for the script @kdschlosser, that will be very helpful in implementing this feature :)

wolph avatar Nov 20 '18 10:11 wolph

It is really amazing how things turn out. I used your package initially when I was creating a compile routine for another project. Then I discovered I didn't have to wait 5 minutes for the thing to compile each time I wanted to test, by using threads to compile different chunks of it at the same time. But I still wanted to keep the progress bar. I could have kept a single one, but I thought it would be nice to be able to see what each thread was doing. I went through all the thoughts you had about rewriting the whole screen each time.

The biggest problem with that is flicker: with 10 threads in my case all writing, almost the whole window gets redrawn. And if I did it on a timer, the bars would not have up-to-date information. With a single bar, or possibly 2 bars, I do not know if the flicker would be as apparent, but having to redraw almost the whole console window was quite nasty, actually.

While I was working on that project, I also happened to be writing a pure Python Windows API, and I found the Windows SDK files for handling console interaction. One of the things I discovered was that I could move the cursor.

I just finished writing the script I gave you 2 days ago, and yesterday I wanted to show it to you so that, if you wanted to do something similar to extend the functionality of your package, you would have a starting point. You would also be able to use the Windows-specific bits, which are always a pain to write, research, and get working properly.

And it happened that someone else had come across the same issue I had, which is really bonkers. But there is the solution.

One thing I really like about how I wrote that script, and I think you should use the same kind of model for yours, is the Cursor class. There is only one terminal/console when a Python script/program gets run, so having a single instance of an object to handle all of the moving around, getting the size, getting the position, and changing colors would be a great idea. Create the base class with the methods/properties that make up the "public" bits, and subclass the base for all the OS-dependent stuff. It is easier to maintain that way, I think, and it keeps all the OS-related parts in a single place.

This also makes it easy if someone wants to create 5 different bars in completely different styles. Each of the bars would keep a reference only to its starting position and always go back to that position before doing a redraw. You just have to make sure that if any part of the bar can dynamically change the length of the output text, you keep the last known length, and if the refreshed text is shorter, you fill the difference with spaces to clear the old data on the screen.
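A rough sketch of that layout, for illustration only: the names `BaseCursor`, `AnsiCursor` and `pad_to_previous` are hypothetical and are not taken from the attached script. Only an ANSI subclass is shown; a Windows subclass would override `move_to` with console API calls instead.

```python
import abc
import shutil
import sys


class BaseCursor(abc.ABC):
    """Public interface; subclasses supply the OS-specific parts."""

    def __init__(self, stream=sys.stdout):
        self.stream = stream

    @abc.abstractmethod
    def move_to(self, row, column):
        """Move the cursor to a 1-based (row, column) position."""

    def size(self):
        # shutil.get_terminal_size() falls back to (80, 24) if detection fails
        return shutil.get_terminal_size()

    def write_at(self, row, column, text):
        self.move_to(row, column)
        self.stream.write(text)
        self.stream.flush()


class AnsiCursor(BaseCursor):
    """Cursor for terminals that understand ANSI escape codes."""

    def move_to(self, row, column):
        self.stream.write(f'\x1b[{row};{column}H')


def pad_to_previous(text, last_len):
    """Pad with spaces so leftovers from a longer previous draw are erased."""
    return text.ljust(last_len)
```

A bar would keep its starting row plus the length of its last output and call `cursor.write_at(row, 1, pad_to_previous(text, last_len))` on each redraw.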

This script also makes it possible to dynamically change how the bar gets displayed based on the width of the terminal. So instead of the terminal wrapping the bar, which would end up not displaying it properly, you can inject newlines if the terminal is too narrow, to keep the bar displayed properly.

Add a note to the docs about running this inside of ANY IDE's console window. State that it is not compatible and has to be run from the native OS console/terminal. You can explain that this has to do with how the IDE changes stdout and stderr because of debugging routines and other IDE-related things, and that the IDE console does not implement all of the features of the native console/terminal for the OS it is running on.

It is unfortunate that the IDE designers did not spend the time to write the code to make the IDE's terminal/console 100% compatible with the OS, but they didn't, and that is just how it is. With PyCharm specifically, good luck getting them to do anything. They have had 1000's of complaints about performance issues related to larger projects, and they still have not fixed them, or even stated whether the problem is in the IDE or in the Java VM and is not something they can repair. I even went to the extent of upgrading my RAM because at the time the bottleneck was memory. Now it's not, and the problem still exists...

There are bug reports on their "tracker" that are a decade old, and the bugs have not been fixed. They are a company that is all about tossing out new features that have issues, but instead of making those features run at 100%, they start a new project to add another feature that doesn't work 100% correctly. They do not care what gets reported or even what people would like to see added. They have a voting system for feature suggestions, and there are feature ideas with 1000's of votes that do not get added. I hate that mentality.

PyCharm's whole code inspection process is broken in a very bad way. The objects that get created during that process never get released, and the memory they use never gets GC'd. That is why PyCharm's heap memory climbs but never comes back down after it is done. I even explained this to them (which I am sure they already knew about), and they asked me to send them memory snapshots, like they do with every single bug report. They do not need to see a memory snapshot from my machine to test for the existence of a bug. They want that snapshot to be able to point blame at something else, which is what they do with a lot of bug reports.

kdschlosser avatar Nov 20 '18 17:11 kdschlosser

It took me a while to thoroughly read your very detailed message, I fully agree though! :)

What I'm currently planning on doing is using your code to implement a little progressbar wrapper script which handles the printing. That should be fairly uninvasive for the current progressbar (meaning, probably no new bugs in existing code) and still allow for multiple parallel progressbars.

It will take a bit of time though.

wolph avatar Nov 21 '18 10:11 wolph

if you add a simple

if not hasattr(stdout, 'isatty') or not stdout.isatty():
    raise RuntimeError("This package can only be used in a native OS console/terminal/shell")

it will test whatever the user is passing to your package to be written to. This will also detect whether your package is being run from inside of PyCharm. Also, I do not believe the use of \r will work if a file-like object is passed to your script, so the isatty check will cover that as well. I believe you do plan on adding the Windows-specific bits as well, and those will not work unless it is a Windows console. In Windows 10 the console does now support ANSI, but only colors.

Here is a newer version of the script you ran before. It has some things fixed, and I also added color support. I used your detection of the terminal size with a bit of modification, so it only runs through all of the different checks once, and when it finds the right one it uses that from then on.

The color support is for Linux as well as Windows.

pyozw_progressbar.zip

kdschlosser avatar Nov 21 '18 19:11 kdschlosser

You should be able to plop the top half of that script into your package, and inside of the bar classes grab the row number from the cursor instance in the bar's __init__. If you compare the output objects that are passed to the bar class against sys.__stdout__ and sys.__stderr__, you will know whether they are the original std streams. If they are, use the cursor objects that are already created; if they are not, create a new cursor object. You will need to pass the fileno() to the cursor object.

I changed how the locks are handled: the cursor object now has __enter__ and __exit__

with self.fd as fd:
    fd.write('some text', x=0, y=self.row, color=FOREGROUND_DARK_MAGENTA |  UNDERSCORE)

as an example.

kdschlosser avatar Nov 21 '18 19:11 kdschlosser

from blessings import Terminal
class Writer(object):
    """Create an object with a write method that writes to a
    specific place on the screen, defined at instantiation.

    This is the glue between blessings and progressbar.
    """

    def __init__(self, location: tuple, bottom=True, background=15):
        """
        Input: location - tuple of ints (x, y), the position
                        of the bar in the terminal
        """
        self.location = location
        self.term = Terminal()
        self.bottom = bottom
        self.background = background

    def write(self, string):
        # ProcessLock.lock()
        if self.bottom:
            # with self.term.location():
            print(string)
        else:
            with self.term.location(*self.location):
                color = self.location[1] * 4 % 12
                if color == self.background:
                    color = 15 - color
                    color = max(0, color)
                # print(color, self.background)
                # logger.critical(self.term.on_color(self.background))
                # logger.critical(self.term.color(color))
                print(self.term.on_color(self.background)(self.term.color(color)(string)))
            # logger.critical(string)
        # ProcessLock.unlock()

    def flush(self):
        pass

# Usage
pb.ProgressBar(
    widgets=self.widgets,
    max_value=self.total,
    fd=Writer((1, self.bar_pos), False, self.background),
    custom_len=custom_len
)

This can help you write the bar to a different location in the terminal.

It's not perfect, though.

rockkoca avatar Dec 29 '18 06:12 rockkoca

The problem with that is it is for *nix OSes only, with no Windows support. Plus it also uses a non-stdlib package.

I would have to double check, but I think that all of the non-stdlib modules that python-progressbar uses are modules that @WoLpH wrote and maintains. Nothing is more annoying than having to patch issues in code maintained by someone else.

I am dealing with that now with the client-websocket package. Months ago a problem was reported, and the author stated the fix was going to be in the next version. Several versions later, the problem still exists.

kdschlosser avatar Dec 29 '18 10:12 kdschlosser

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Aug 22 '19 09:08 stale[bot]

Are there any updates on multithread support? Thanks!

matteogarzon avatar Jun 14 '21 10:06 matteogarzon

I would have to look at the current code and see how the drawing of the bar is handled. In order to properly write multiple bars to the screen the cursor would need to be moved to update only the information that has changed.

It is quite a large undertaking to do this if it is not written to update changed information only.

kdschlosser avatar Jun 14 '21 21:06 kdschlosser

How about this? :)

import random
import sys
import threading
import time

import progressbar

output_lock = threading.Lock()


class LineOffsetStreamWrapper:
    UP = '\033[F'
    DOWN = '\033[B'

    def __init__(self, lines=0, stream=sys.stderr):
        self.stream = stream
        self.lines = lines

    def write(self, data):
        with output_lock:
            self.stream.write(self.UP * self.lines)
            self.stream.write(data)
            self.stream.write(self.DOWN * self.lines)
            self.stream.flush()

    def __getattr__(self, name):
        return getattr(self.stream, name)


bars = []
for i in range(5):
    bars.append(
        progressbar.ProgressBar(
            fd=LineOffsetStreamWrapper(i),
            max_value=1000,
        )
    )

    if i:
        print('Reserve a line for the progressbar')


class Worker(threading.Thread):
    def __init__(self, bar):
        super().__init__()
        self.bar = bar

    def run(self):
        for i in range(1000):
            time.sleep(random.random() / 100)
            self.bar.update(i)


for bar in bars:
    Worker(bar).start()

wolph avatar Sep 14 '22 01:09 wolph

As a suggestion: instead of using locks, which will actually stall the progress of whatever is running, start a dedicated thread and use a Queue to handle the writing. You can set up the thread so that if it has not done anything for a user-specified amount of time it exits, and once some new information comes in, the thread that is leaving the data can start the writer thread up again. Make sure to set the thread as a daemon thread.

Going about it this way is ideal for speed and also ideal for resources. If it's not doing anything, there is no reason to have it just hang around.
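One way this could look, as a sketch only: the class name `ScreenWriter` and its `idle_timeout` parameter are hypothetical and not part of python-progressbar. Callers only enqueue data, a single daemon thread does the writing, and the thread exits after being idle and is restarted by the next `write()`.

```python
import queue
import threading


class ScreenWriter:
    """Hypothetical helper: one daemon thread performs all the writes."""

    def __init__(self, stream, idle_timeout=1.0):
        self.stream = stream
        self.idle_timeout = idle_timeout
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._thread = None

    def write(self, data):
        # Enqueue first so the caller never waits on terminal I/O.
        self._queue.put(data)
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def _run(self):
        while True:
            try:
                data = self._queue.get(timeout=self.idle_timeout)
            except queue.Empty:
                with self._lock:
                    # Re-check under the lock so an item queued just now
                    # is not left behind when this thread exits.
                    if self._queue.empty():
                        self._thread = None
                        return
                continue
            self.stream.write(data)
            self.stream.flush()
            self._queue.task_done()

    def join(self):
        """Block until everything queued so far has been written."""
        self._queue.join()
```

Because the object has a `write()` method, it could in principle be wrapped by a stream object like the `LineOffsetStreamWrapper` above, with the lock there replaced by this queue.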

kdschlosser avatar Sep 15 '22 05:09 kdschlosser

Oh, and another good thing about using a single thread to handle the writing to the screen is that bars can be added and removed as wanted. You can also have that thread optimize the updates so you don't have to keep returning the cursor to a home position after each bar is updated. Say the thread gets bar 2 from the queue to update, and while it is off updating, bar 1, bar 8 and bar 4 leave updates in the queue in that order. The thread will be able to review those updates, determine that bar 1, bar 4 and bar 8 get updated in that order, and write them all in a single update.

This eliminates artifacts, flickering, that kind of stuff.
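The reordering described above could be sketched like this. It assumes, purely for illustration, that each queued item is a `(row, text)` tuple; the function names are made up.

```python
import queue


def drain_updates(pending):
    """Collapse queued (row, text) updates into one entry per row, in row order."""
    latest = {}
    while True:
        try:
            row, text = pending.get_nowait()
        except queue.Empty:
            break
        latest[row] = text  # a later update for the same row replaces the earlier one
    return sorted(latest.items())


def render_batch(updates):
    """Build a single string that positions and redraws every updated row."""
    return ''.join(f'\x1b[{row};1H{text}' for row, text in updates)
```

The writer thread would then issue one `stream.write(render_batch(drain_updates(q)))` per wake-up instead of one write per queued item.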

You may want to let the user choose whether to use the thread. I can see this being used a whole lot during compilation in setup programs, where multiple things are being compiled at the same time. The faster the calling thread can return, the faster it can get back to compiling, and that will shorten the time it takes to compile.

kdschlosser avatar Sep 15 '22 06:09 kdschlosser

It's indeed a far better idea to have the background threads communicate the status back to the main thread instead of locking all the time, but that's why it's just a quick proof of concept :)

I'm not entirely sure how much overhead the queues have in Python actually. They use the same locking mechanisms internally so it might not make much difference. As long as we can rely on the GIL I could simply use a list. The list.append() and list.pop() methods are thread-safe when using CPython.

wolph avatar Sep 15 '22 13:09 wolph

I just got ANSI codes to work in the Windows console on Windows 10+. This is going to make it a whole lot easier to code a multi-bar system. I don't think we need to overcomplicate it at all. At the time a progress bar is created, the row the bar is on should be collected, and any changes to that bar will use ANSI codes to move to that row and make any needed changes. The row numbers should be stored in a mutable class attribute, and new bars will go up from the largest continuously sequenced number.

What I mean by that is: say the first bar is on row 5. The attribute is going to be [4, 5]; I will explain the 4 in a second. Another bar is created, and it will look at the numbers, see there is a bar running on row 5, and so take row 6 and add a 6 to the attribute, which now reads [4, 5, 6]. But say the bar on row 5 ends. It will remove the 5 from the attribute, so now you have [4, 6]. Say another bar is started: it will see that row 5 is available, and that is where it will create the bar and add the number. When a bar removes its row number, if there is only 1 number remaining, it will empty the attribute. So the only time the cursor position needs to be checked is when the attribute is empty. After every update of a bar, the cursor will be set to the home position so the next bar will know where it is. The last bar to finish will set the cursor to column 1 of the row that is stored, so any text added by the program will be placed correctly.
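The bookkeeping in that example can be sketched as follows. This is an interpretation, not code from the branch: the class name `RowAllocator` is hypothetical, and treating the leading 4 as a marker for the row just above the first bar is an assumption made to reproduce the `[4, 5]`, `[4, 5, 6]`, `[4, 6]` sequence.

```python
class RowAllocator:
    """Hypothetical tracker for which terminal rows hold an active bar."""

    def __init__(self):
        self.rows = []

    def acquire(self, cursor_row):
        """Reserve the lowest free row.

        cursor_row is only consulted when no bars are active, which is the
        one time the real cursor position would have to be queried.
        """
        if not self.rows:
            # Assumption: the first entry marks the row just above the first bar.
            self.rows = [cursor_row - 1]
        row = self.rows[0] + 1
        while row in self.rows:
            row += 1
        self.rows.append(row)
        self.rows.sort()
        return row

    def release(self, row):
        self.rows.remove(row)
        if len(self.rows) == 1:  # only the marker is left: no active bars
            self.rows.clear()
```

In a threaded program, `acquire` and `release` would need to run under the same lock as the drawing code.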

That is what I am thinking. Use locks like you have in the example above. Only a single update can occur at a time, and that is going to happen pretty fast. Setting it up like that will make every bar that can possibly be made with the library able to support multiple bars in any combination. I have also sorted out how to add colors and text attributes like blinking and underline.

I got reading the return value for the cursor position working under Windows. See if you can get it working for *nix systems. You can probably just read sys.stdin; doing that does not work under Windows. There is only going to be a small amount of Windows-specific code, which will be nice.

kdschlosser avatar Sep 20 '22 09:09 kdschlosser

This should also work in IDEs as well.

kdschlosser avatar Sep 20 '22 09:09 kdschlosser

OK, scratch it working in IDEs. PyCharm only supports ANSI codes for color and not for cursor movement.

I have gotten the multi-bar version working. There are some bugs that need to be corrected, but it is a good proof of concept.

https://github.com/kdschlosser/python-progressbar/tree/multi_thread

This works on Windows; I have not tested it on a *nix system yet. Maybe someone else will volunteer to do that. To test it, run the test.py file in the root folder.

There are changes I made to python-utils.terminal. The existing code does not collect the terminal width under Windows 100% of the time. With my changes, it should collect the correct terminal width whether the terminal/console is a Microsoft application or a 3rd party application.

I have monkey patched the terminal changes in the test.py script.

kdschlosser avatar Sep 24 '22 08:09 kdschlosser

Yeah... PyCharm has horrible ANSI support, and what it does support is still buggy. At some point they "fixed" bugs with progress bars by increasing the update interval so the bugs were less noticeable, but the order in which stdout/stderr is written to the screen is still non-deterministic and will vary with every execution.

With regards to testing the script, on my mac it prints ^[[172;1R and nothing beyond that. It doesn't respond to ^c either so I'm not sure what it's doing ;) I'll have to do a little debugging first

wolph avatar Sep 29 '22 01:09 wolph

Again, I have not tested my code on anything but Windows. The 172;1R is actually a response from the console reporting the cursor location. We need to figure out how to collect that. It gets written to stdin. On Windows you cannot collect it by reading sys.stdin; you have to use a Windows API to collect it. I know there is getch, and apparently that is not working. You might want to try reading stdin and see what happens.
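For reference, the reply has the form ESC [ row ; column R, so the ^[[172;1R seen on the Mac means row 172, column 1. A small sketch of parsing it once the bytes have been collected; how they are collected is left out here, since that is the platform-specific part, and the function name is made up.

```python
import re

# Cursor position report: ESC [ <row> ; <column> R
CPR = re.compile(r'\x1b\[(\d+);(\d+)R')


def parse_cursor_report(data):
    """Return (row, column) from a cursor position report, or None if absent."""
    match = CPR.search(data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Returning `None` instead of looping until a match appears also gives the caller a chance to time out when the terminal never answers.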

kdschlosser avatar Oct 02 '22 18:10 kdschlosser

PyCharm only supports ANSI colors and nothing more.

I have a commit I need to make on that branch to fix how the bars are being written. In the current version they get a little goofed up because I am only updating things that have changed and not writing over the entire bar again. If you write the entire bar and have, say, 10 bars on the screen, it will flicker really badly. Updating only the characters that have changed stops the flickering from occurring.

I had originally used difflib but found that it makes things far too complex. In my commit I made a function that returns the character position and the character for each character that has to change.
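A function of that kind could be as small as the sketch below. This is not the function from the branch; the name is made up, and it assumes plain text with no ANSI codes mixed in.

```python
def changed_characters(old, new):
    """Return [(index, char)] for every position where new differs from old.

    The shorter string is padded with spaces so that leftovers from a
    longer previous line show up as positions to blank out.
    """
    width = max(len(old), len(new))
    old = old.ljust(width)
    new = new.ljust(width)
    return [(i, c) for i, (o, c) in enumerate(zip(old, new)) if o != c]
```

For example, going from `'|##  | 50%'` to `'|### | 75%'` only positions 3, 7 and 8 need to be rewritten.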

The challenging part I am still working out is handling ANSI color codes in the text.

I started writing a wrapper class that will make it easier to handle the ANSI codes and to do equality testing for both the ANSI codes and the actual data.

kdschlosser avatar Oct 02 '22 18:10 kdschlosser

The reason the script gets stuck is that it's looping to collect the response but is unable to find it, so it loops endlessly. Again, this is a proof of concept. There is a lot of stuff in it that will never get used and can be removed. It needs some code cleanup and optimization.

The amount of the original library that has been altered is actually really small which is really important. It should not break anything with the current library because of how I added it.

kdschlosser avatar Oct 02 '22 18:10 kdschlosser

Testing it again it somehow worked now. Albeit with a few alignment issues (i.e. printing above my current line, etc..) but nothing major :)

Now I just have to find the time to merge all of this. The time I have available for this project got a bit of a setback this week which will trickle on the coming months... I'm still planning to work on it the coming weeks but it will take longer :(

wolph avatar Oct 05 '22 20:10 wolph

It is going to be a little bit longer until I finish it up. I am still working on dealing with ANSI color codes being mixed in with the bar when updating it.

The issue lies with screen flicker if you redraw the whole line every single time. So with multiple bars we have to update only the characters that change. If there are ANSI color codes involved, it gets a bit more complex because of where each character actually is in the line.

I have devised a wrapper class for str, and I will do the same for bytes, that hammers out all of this kind of stuff. It is also going to make it easier for a user to access the different character attributes.
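To illustrate why the color codes complicate the positions, here is a sketch that splits a string into one `(style, character)` pair per visible column. It is not the wrapper class described above; the names are hypothetical, and it only understands SGR (color/attribute) sequences of the form ESC [ ... m.

```python
import re

# SGR sequences (colors, underline, blink, ...): ESC [ <numbers;...> m
SGR = re.compile(r'\x1b\[[0-9;]*m')


def visible_text(s):
    """The text as it appears on screen, with color codes removed."""
    return SGR.sub('', s)


def visible_cells(s):
    """Return one (active_style, char) pair per visible character."""
    cells = []
    style = ''
    pos = 0
    for m in SGR.finditer(s):
        cells.extend((style, ch) for ch in s[pos:m.start()])
        # A reset clears the style; any other code is added to it.
        style = '' if m.group() in ('\x1b[0m', '\x1b[m') else style + m.group()
        pos = m.end()
    cells.extend((style, ch) for ch in s[pos:])
    return cells
```

Comparing the `visible_cells` lists of the old and new line index by index gives the screen columns that changed, along with the style each rewritten character needs.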

We need to make sure that getch works 100% of the time. I did fix the intermittent failure in getting the Windows terminal size. I will do a PR for that against python-utils.

kdschlosser avatar Oct 06 '22 07:10 kdschlosser