Strange File Missing Error & Reset Commands Not Working
I'm having two issues and I think they could be related.
First, when I'm trying to load the data I'm getting this error from the load_denormalized part.
$ sh load_tweets_parallel.sh
================================================================================
load pg_denormalized
================================================================================
psql: error: connection to server at "localhost" (127.0.0.1), port 4720 failed: FATAL: "base/13468" is not a valid data directory
Claude says the possible causes of this error are that
- The data directory is corrupted or incomplete
- You're pointing to the wrong location for your data directory
- Permissions issues preventing access to the data files
- Incomplete database initialization
None of these seem like things I could have caused. I haven't run any non-Kosher commands, like rm $HOME/bigdata, and I'm using files that all worked on the last assignment. The only thing I've changed in the docker-compose file is the port numbers.
And second, when I try to run the commands provided to remove some of the data, I get this error when the containers are up.
$ docker compose exec pg_normalized_batch bash -c 'rm -rf $PGDATA'
rm: cannot remove '/var/lib/postgresql/data': Device or resource busy
When I bring the containers down before running the command, I get this error instead.
$ docker compose exec pg_normalized_batch bash -c 'rm -rf $PGDATA'
service "pg_normalized_batch" is not running
My only guess is that I've somehow accidentally deleted the symlink, but honestly I'm not confident in that hypothesis. Any ideas for what's gone wrong here?
Your docker volume has gotten into a corrupted state that is preventing you from even running the reset command. (This has happened to a few other students as well.)
The solution is to run the following:
$ docker-compose run pg_normalized_batch bash -c 'rm -rf $PGDATA/*'
(And the equivalent for the denormalized db.)
The command above has two differences from the command that was in the README:
- I have changed $PGDATA to $PGDATA/*. This means that instead of deleting the directory, the command will only delete the contents of the directory. Several students have had problems where the OS didn't allow the deletion of the directory and this led to a corrupted db. (I'm guessing that the first time you ran the deletion command, you received an error about not being able to delete because of an OS-level lock on a file.)
- I have replaced exec with run. (The shorthand for this is s/exec/run since this is the vim find/replace syntax, and all the cool kids normally just write the shortened version without explanation.) Recall that exec runs the command in a running container, and run will work even if the container is not running (by bringing up the container if needed). This change allows the rm command to run even if the container cannot be brought up normally because the db has been corrupted.
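To see the first difference without touching a real database, here is a minimal sketch that runs outside Docker. A throwaway temp directory stands in for $PGDATA, and the names demo_root and fake_pgdata are made up for the illustration. It only shows what the glob does: the contents are removed and the directory itself stays in place.

```shell
#!/bin/sh
# Sketch only: a throwaway temp directory stands in for $PGDATA.
# demo_root and fake_pgdata are hypothetical names, not part of the assignment.
set -eu

demo_root=$(mktemp -d)
trap 'rm -rf "$demo_root"' EXIT   # clean up the temp directory on exit

# Build a fake data directory with a couple of placeholder entries.
fake_pgdata="$demo_root/data"
mkdir -p "$fake_pgdata/base/13468"
touch "$fake_pgdata/PG_VERSION"

# Equivalent of rm -rf $PGDATA/*: the shell expands the glob to the entries
# inside the directory, so only the contents are removed.
rm -rf "$fake_pgdata"/*

# The directory itself should still exist, with nothing left inside it.
[ -d "$fake_pgdata" ] && echo "directory kept"
remaining=$(ls -A "$fake_pgdata" | wc -l | tr -d ' ')
echo "entries left: $remaining"
```

Inside the container, $PGDATA is the mount point of the bind mount, which is why removing the directory itself fails with "Device or resource busy" while removing its contents does not.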
I have updated the README to contain this new command.
If the new command does not work for some reason, an alternative solution is to simply change the host path of the bind mount in docker-compose.yml from $HOME/bigdata/pg_normalized_batch to $HOME/bigdata/pg_normalized_batch2.
Since this is a different folder, it will be empty when postgres first starts up, and it will create a new db from scratch even if the old one is corrupted.
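If you would rather script that edit than make it by hand, the following is a rough sketch. It works on a throwaway copy, and the compose fragment in it is an assumed layout (only the two host paths and the container path /var/lib/postgresql/data come from this thread), so check it against your actual docker-compose.yml before adapting it.

```shell
#!/bin/sh
# Sketch only: edits a throwaway copy, not your real docker-compose.yml.
# The compose fragment below is an assumed layout for illustration.
set -eu

work_dir=$(mktemp -d)
trap 'rm -rf "$work_dir"' EXIT   # clean up the temp directory on exit
compose_file="$work_dir/docker-compose.yml"

# The quoted heredoc delimiter keeps $HOME literal, as it appears in the file.
cat > "$compose_file" <<'EOF'
services:
  pg_normalized_batch:
    volumes:
      - $HOME/bigdata/pg_normalized_batch:/var/lib/postgresql/data
EOF

# Rewrite only the host side of the bind mount (the part before the colon),
# so the service name and the container path are left alone.
sed 's#\$HOME/bigdata/pg_normalized_batch:#$HOME/bigdata/pg_normalized_batch2:#' \
    "$compose_file" > "$compose_file.new"
mv "$compose_file.new" "$compose_file"

cat "$compose_file"
```

Matching on the trailing colon is a deliberate choice: it keeps the substitution from touching the service name, which contains the same string.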