git-cc
UnicodeDecodeError on Windows
I tried to follow the instructions in the README and did this
$ git init
$ gitcc init /cygdrive/x/[my actual path]
Then I edited my .git/gitcc to this
[core]
type = UCM
[master]
clearcase = /cygdrive/x/[my actual path]
I'm not sure which config options are required, so I only added type = UCM. Then I ran the rebase and got this:
$ gitcc rebase
> git ls-files --modified
> git log -n 1 --pretty=format:%ai master_cc
> cleartool ls -recurse -short .
Traceback (most recent call last):
File "../gitcc/gitcc", line 48, in <module>
main()
File "../gitcc/gitcc", line 14, in main
return invoke(cmd, args)
File "../gitcc/gitcc", line 38, in invoke
cmd.main(*args)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 35, in main
cache.start()
File "/cygdrive/c/aoln/gitcc/cache.py", line 22, in start
self.initial()
File "/cygdrive/c/aoln/gitcc/cache.py", line 32, in initial
self.read(cc_exec(ls))
File "/cygdrive/c/aoln/gitcc/common.py", line 50, in cc_exec
return popen('cleartool', cmd, CC_DIR, **args)
File "/cygdrive/c/aoln/gitcc/common.py", line 61, in popen
return stdout if not decode else stdout.decode(ENCODING)
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 164948-164951: invalid data
This is on Cygwin on Windows XP with Python 2.6.5. I'm sure there are many files in my CC repository that are not UTF-8. Is this unsupported, or is something else going on?
You can try modifying the 'ENCODING' value in common.py to something more appropriate.
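To illustrate why a different ENCODING can get past the error, here is a minimal sketch (not gitcc's actual code) of decoding command output with a fallback. The helper name `decode_output` and the list of candidate encodings are assumptions made for this example. ISO8859-1 maps every byte to a character, so it never raises, although it may mis-render text that was really written in another encoding.

```python
def decode_output(raw, encodings=("utf-8", "iso8859-1")):
    """Decode raw bytes with the first encoding that accepts them.

    Hypothetical helper for illustration; gitcc itself uses a single
    ENCODING constant in common.py.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# A Latin-1 encoded name containing o-umlaut is not valid UTF-8,
# so the sketch falls back to ISO8859-1 for it.
name = decode_output(b"sj\xf6.pdf")
```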
Alternatively, or in any case, try disabling the cache, which isn't crucial to gitcc. You might see some more interesting results.
[core]
cache = False
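For readers unsure how an INI-style file like .git/gitcc is interpreted, the following is a rough sketch using Python 3's standard `configparser`. It is not taken from gitcc: the function `load_settings`, the sample path `/cygdrive/x/example`, and the assumption that the cache defaults to on when the option is missing are all made up for illustration.

```python
import configparser

# Sample config text; the clearcase path is a placeholder, not a real view.
SAMPLE = """\
[core]
type = UCM
cache = False

[master]
clearcase = /cygdrive/x/example
"""


def load_settings(text):
    """Parse gitcc-style config text into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        # Assumption: the cache is enabled unless explicitly turned off.
        "cache": parser.getboolean("core", "cache", fallback=True),
        "type": parser.get("core", "type", fallback=None),
        "clearcase": parser.get("master", "clearcase", fallback=None),
    }


settings = load_settings(SAMPLE)
```

With the sample above, `settings["cache"]` is the boolean `False` rather than the string `"False"`, which is the distinction that matters when a tool checks the flag.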
Setting ENCODING = "ISO8859-1" got me past that stage. I have also set cache = False but now I get:
$ gitcc rebase
> git ls-files --modified
> git log -n 1 --pretty=format:%ai master_cc
> cleartool rebase -rec -f
Traceback (most recent call last):
File "../gitcc/gitcc", line 48, in <module>
main()
File "../gitcc/gitcc", line 14, in main
return invoke(cmd, args)
File "../gitcc/gitcc", line 38, in invoke
cmd.main(*args)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 39, in main
cc.rebase()
File "/cygdrive/c/aoln/gitcc/clearcase.py", line 21, in rebase
out = cc_exec(['rebase', '-rec', '-f'])
File "/cygdrive/c/aoln/gitcc/common.py", line 52, in cc_exec
return popen('cleartool', cmd, CC_DIR, **args)
File "/cygdrive/c/aoln/gitcc/common.py", line 62, in popen
raise Exception((stderr + stdout).decode(ENCODING))
Exception: cleartool: Error: Cannot rebase -recommended the integration stream.
cleartool: Error: Unable to rebase stream "[my actual stream]".
Hi Anders,
I replied direct from email. Not sure if you got it.
Looks like your view isn't UCM. Try removing the type config.
I tried removing type = UCM but that did not work. Then I changed the CC view from my dynamic integration view to my snapshot development view, which is the one I probably should have been using all along. Judging by the output, everything now works fine, except that after the rebase there's no evidence that anything has happened to my git repo. Here are some excerpts from the console:
...
Done loading "\abc\def" (2464 objects, copied 0 KB).
...
Updating stream's configuration...
Cleaning up...
Rebase completed.
> cleartool lsh -fmt %o%m|%Nd|%u|%En|%Vn|%[activity]p\n -recurse .
There are lots of "Processing dir..." and "End dir..." lines, but wherever there is a "Done loading" it always says 0 KB, which seems fishy.
My apologies about the bad UCM tip. I should have looked more closely at your error.
All that noisy output is just the cleartool output from doing a rebase, it's not gitcc at all.
Did the 'lsh' command ever return? It's running a recursive history on the entire folder, which can take hours for a decent amount of history. If you want to see git-cc working and get a feel for how it works, try specifying a (very) small sub-directory for the 'clearcase' variable and run it again. Otherwise, or in any case, you just have to wait. There isn't, as far as I know, a faster way to retrieve history from Clearcase. Once it has finished retrieving the history up until 'now', the next run of gitcc will be much quicker, as it passes the '-since' option to 'lsh'.
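The incremental fetch described above can be sketched roughly as follows. This is an illustration rather than gitcc's real implementation: the function name `lshistory_args` and the example date are assumptions, while the format string is copied from the console output earlier in this thread.

```python
# Format string as it appears in the 'cleartool lsh' line in the console output.
LSH_FORMAT = r"%o%m|%Nd|%u|%En|%Vn|%[activity]p\n"


def lshistory_args(since=None):
    """Build the argument list for 'cleartool lsh' (hypothetical helper).

    When 'since' is given, only history after that point is requested,
    which is why later runs are much faster than the first one.
    """
    args = ["lsh", "-fmt", LSH_FORMAT, "-recurse"]
    if since:
        args += ["-since", since]
    args.append(".")
    return args


first_run = lshistory_args()
later_run = lshistory_args(since="20-Jan-2011")  # example date, not from the thread
```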
Just as an aside, if you haven't specified 'branch' in your config you're only going to get 'main'. I suggest, especially if you're using UCM, adding the branch name of your integration stream. This may also explain why you didn't see anything in your git repository.
Yes, the command returned after a longish time. I'm not sure what a branch is in UCM (I'm very much a CC novice), but I used the same name as my integration stream, which worked when running against a subdirectory of my stream. Then I tried on a larger directory and got the following:
...
> git branch -f master_cc
Traceback (most recent call last):
File "/sc/aoln/gitcc/gitcc", line 48, in <module>
main()
File "/sc/aoln/gitcc/gitcc", line 14, in main
return invoke(cmd, args)
File "/sc/aoln/gitcc/gitcc", line 38, in invoke
cmd.main(*args)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 52, in main
doStash(lambda: doCommit(cs), stash)
File "/cygdrive/c/aoln/gitcc/common.py", line 40, in doStash
f()
File "/cygdrive/c/aoln/gitcc/rebase.py", line 52, in <lambda>
doStash(lambda: doCommit(cs), stash)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 69, in doCommit
git_exec(['branch', '-f', CC_TAG])
File "/cygdrive/c/aoln/gitcc/common.py", line 49, in git_exec
return popen('git', cmd, GIT_DIR, **args)
File "/cygdrive/c/aoln/gitcc/common.py", line 62, in popen
raise Exception((stderr + stdout).decode(ENCODING))
Exception: fatal: Not a valid object name: 'master'.
I'm not including the full output to avoid disclosing my customer's code but I can give more detail if needed. Thanks for all the help by the way!
Ooops, I used Comment & close by mistake.
Anders, that error means that there is no 'master' branch, which means the import was unsuccessful.
Was there any interesting output after lshistory finally finished? I'm assuming there wasn't. Hopefully it's to do with a misconfigured branch. To work out what the branch is in UCM, open up the version tree on a file that's been modified recently and look at the branch names there. Use the branch name that represents your integration stream, which will most likely end with "_int". Apologies, I haven't used Clearcase in about 2 years.
Whenever you run 'gitcc rebase' again, you should also use 'gitcc rebase --load=.git/lshistory.bak' to save yourself the huge wait for Clearcase to do its job.
Okay, now it seems to stumble on a pdf that is checked in to clearcase:
$ gitcc rebase --load=.git/lshistory.bak
...
> git add -f "[path to pdf]
> git branch -f master_cc
> git tag -f master_ci master_cc
Traceback (most recent call last):
File "/sc/aoln/gitcc/gitcc", line 48, in <module>
main()
File "/sc/aoln/gitcc/gitcc", line 14, in main
return invoke(cmd, args)
File "/sc/aoln/gitcc/gitcc", line 38, in invoke
cmd.main(*args)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 52, in main
doStash(lambda: doCommit(cs), stash)
File "/cygdrive/c/aoln/gitcc/common.py", line 40, in doStash
f()
File "/cygdrive/c/aoln/gitcc/rebase.py", line 52, in <lambda>
doStash(lambda: doCommit(cs), stash)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 63, in doCommit
commit(cs)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 146, in commit
cs.commit()
File "/cygdrive/c/aoln/gitcc/rebase.py", line 183, in commit
file.add(files)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 209, in add
self._add(self.file, self.version)
File "/cygdrive/c/aoln/gitcc/rebase.py", line 213, in _add
toFile = path(join(GIT_DIR, file))
File "/cygdrive/c/aoln/gitcc/common.py", line 145, in path
return os.popen('cygpath %s "%s"' %(args, path)).readlines()[0].strip()
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 68: ordinal not in range(128)
Note that I've set ENCODING = "ISO8859-1" in common.py.
Hmm. I'm not 100% sure, but does something like this work?
'cygpath %s "%s"' %(args.encode(ENCODING), path.encode(ENCODING))
Or alternatively
readlines()[0].decode(ENCODING)
My Python was never very good. Apologies for the sloppy answer. Please let me know which it is so I can put a fix in for it.
Cheers.
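For anyone hitting the same UnicodeEncodeError, the first suggestion can be sketched as below. The likely cause is that Python 2 implicitly encodes a unicode command string as ASCII when it is handed to os.popen, and the path contains u'\xf6'. The sketch is untested against gitcc itself: the helper name `cygpath_command` and the sample path are made up, and it only builds the command bytes rather than running cygpath.

```python
ENCODING = "ISO8859-1"  # the value the reporter set in common.py


def cygpath_command(args, path, encoding=ENCODING):
    """Build the cygpath command line as bytes in an explicit encoding.

    Hypothetical helper: encoding the whole command explicitly avoids the
    implicit ASCII encode that fails on non-ASCII path characters.
    """
    command = u'cygpath %s "%s"' % (args, path)
    return command.encode(encoding)


# A path containing o-umlaut, like the one in the traceback (made-up name).
command = cygpath_command(u"-u", u"C:\\docs\\sj\xf6.pdf")
```

The output of cygpath would then need the matching `.decode(ENCODING)` on the way back, which is what the second suggestion does.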