goleveldb
Can not start filer in dir /storage/filer : leveldb/storage: corrupted or incomplete meta file.
Just as a brief history of the issue,
- https://github.com/chrislusf/seaweedfs/issues/576
- https://github.com/chrislusf/seaweedfs/issues/578
We are using SeaweedFS for image storage, and it uses your repo for handling all LevelDB queries and operations:
- https://github.com/chrislusf/seaweedfs/blob/master/weed/filer/embedded_filer/files_in_leveldb.go#L8
Issues:
Our disk file system was XFS and it became corrupted. We used xfs_repair to repair it and bring it back to life, but in the course of those operations our ldb files were corrupted, and we couldn't start LevelDB from the db directory; it fails with:
I1101 12:42:19 31159 volume.go:110] loading index file /storage/1082.idx readonly false
F1101 12:42:19 31159 filer_server.go:53] Can not start filer in dir /storage/filer : leveldb/storage: corrupted or incomplete meta file
goroutine 21 [running]:
I used the tool https://github.com/rchunping/leveldb-tools to repair the LevelDB files. It solved the above error, but when I start LevelDB on port 8888, it gives me this error:
2017/11/02 01:50:41 http: panic serving 39.36.53.157:46124: leveldb: internal key "\x00\x01d,", len=4: invalid length
goroutine 2498 [running]:
net/http.(*conn).serve.func1(0xcd3aaf8080)
/usr/lib/go/src/net/http/server.go:1389 +0xc1
panic(0xaf99a0, 0xcfefca8210)
I found another solution for this in a Python script:
#!/usr/local/bin/python
import leveldb
leveldb.RepairDB('/data/leveldb-db1')
It solved the above issue as well. NOTE: All operations were done on the original corrupt files, not on files that had already been repaired.
But now, while we can read the data in the ldb files, it doesn't let us save anything new into them, and gives errors like:
I1102 12:41:51 9063 needle.go:80] Reading Content [ERROR] multipart: Part Read: unexpected EOF
I1102 12:41:51 9063 filer_server_handlers_write.go:106] failing to connect to volume server /everc-fhlcr/snapshots/recordings/2017/11/02/11/40_21_000.jpg Post http://master:8080/278,01661b8efc299f5656: read tcp master:8888->110.36.213.6:51741: i/o timeout
I1102 12:41:51 9063 filer_server_handlers_write.go:106] failing to connect to volume server /everc-fhlcr/snapshots/recordings/2017/11/02/11/40_11_000.jpg Post http://master:8080/250,01661b8ee8c040eccf: unexpected EOF
Question: Is there any repair method in your Go implementation of LevelDB which we should use to repair the database? Or can you point to anything that explains what the issue is here?
Try leveldb.RecoverFile:
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	db, err := leveldb.RecoverFile("/data/leveldb-db1", nil)
	if err != nil {
		log.Fatal(err)
	}
	db.Close()
}
Thanks for your quick answer.
I tried it, and I got this error after a few minutes:
root@Ubuntu-1404-trusty-64-minimal ~ # ./repair
2017/11/19 13:35:50 leveldb/table: corruption on table-footer (pos=2118053): bad magic number [file=3380087.ldb]
Can you advise what to do next?
I deleted that ldb file and ran the repair command again, and got this error:
2017/11/19 15:17:48 leveldb/table: corruption on meta-block (pos=2098585): checksum mismatch, want=0xb5d00c82 got=0xcacf6ef [file=3380089.ldb]
(Not responsible for the repo, just answering as an onlooker.)
Clearly your database files are corrupt, likely beyond repair. This is the point at which you restore from backup or rebuild based on original data from somewhere else. I don't think further magic repair options are going to help - and even if they did make the system happy, how would you ever know the data is consistent?
Actually, I have repaired them using this:
#!/usr/local/bin/python
import leveldb
leveldb.RepairDB('/data/leveldb-db1')
The only issue is: I cannot write into them anymore, and I am unable to understand the issue here. I am asking about possible solutions:
1. Is there any possibility that I can duplicate all the ldb files into a new folder, with no corruption?
2. Can I fix those errors with the goleveldb repo?
3. Should I use the actual Google LevelDB repo's repair to do so?
@ijunaid8989 I fixed a few things; you may try again. Sync your repo first, i.e. go get -u github.com/syndtr/goleveldb/leveldb.
TBH, @calmh is probably correct. This will probably restore the LevelDB to a working state, but it will not recover all your data; it only recovers what can be recovered. Missing keys are to be expected, and we don't know how the filer will cope with that.
@syndtr , Thanks for the work you have done.
Can you please make one more change? Table recovery used to stop at the first corruption; you changed it so it doesn't report and just continues. Can you make it report in the log as well and then continue? That way we can identify at the end which files were corrupt, with logs like:
2017/11/19 15:17:48 leveldb/table: corruption on meta-block (pos=2098585): checksum mismatch, want=0xb5d00c82 got=0xcacf6ef [file=3380089.ldb]
It is already reported in the LOG file. Search for lines starting with table@recovery.
Okay thanks, I can see that now.
One question: We have recovered it and it's working fine, but after recovery the LDB files are very slow to read; before the corruption they were all very fast at returning data. (We mostly have the directory structure created by the SeaweedFS Filer in there.)
Is there any option to optimize the speed of reading?
Also, right now we are just repairing the files without any options like paranoid checks or compaction.
Is there any possibility to add those options to this?
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	db, err := leveldb.RecoverFile("/storage/filer", nil)
	if err != nil {
		log.Fatal(err)
	}
	db.Close()
}
One question: We have recovered it and it's working fine, but after recovery the LDB files are very slow to read; before the corruption they were all very fast at returning data. (We mostly have the directory structure created by the SeaweedFS Filer in there.)
The levels are still being rebuilt; performance will be restored once the rebuild is done. CompactRange might speed things up:
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/data/leveldb-db1", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if err := db.CompactRange(util.Range{}); err != nil {
		log.Fatal(err)
	}
}
Also, right now we are just repairing the files without any options like paranoid checks or compaction.
StrictAll, I believe, is somewhat similar to paranoid checks; however, the default settings already check journal and block integrity, which I believe is sufficient. See https://godoc.org/github.com/syndtr/goleveldb/leveldb/opt#Strict.
You need to increase the open files limit, e.g. ulimit -n 10000.
Thanks for your help @syndtr.
Compaction is still going on and it's been 3 days already. Before repair and compaction we had 65262 files in the database directory, which are now 65948, and the compaction log keeps producing entries like:
07:12:25.933800 table@build created L1@3380919 N·95300 S·2MiB "\x00\x00w..jpg,v66946521":"\x00\x00w..jpg,v67042291"
07:18:29.731085 table@build created L1@3380920 N·95300 S·2MiB "\x00\x00w..jpg,v67042340":"\x00\x00w..jpg,v67137253"
07:24:33.121964 table@build created L1@3380921 N·95300 S·2MiB "\x00\x00w..jpg,v67137458":"\x00\x00w..jpg,v67232195"
07:30:37.162468 table@build created L1@3380922 N·95200 S·2MiB "\x00\x00w..jpg,v67232386":"\x00\x00w..jpg,v67327983"
07:36:41.937491 table@build created L1@3380923 N·95300 S·2MiB "\x00\x00w..jpg,v67327622":"\x00\x00x..jpg,v67422824"
07:42:46.359768 table@build created L1@3380924 N·95200 S·2MiB "\x00\x00x..jpg,v67423492":"\x00\x00x..jpg,v67518006"
07:48:51.961748 table@build created L1@3380925 N·95200 S·2MiB "\x00\x00x..jpg,v67518389":"\x00\x00x..jpg,v67613408"
07:54:55.551695 table@build created L1@3380926 N·95300 S·2MiB "\x00\x00x..jpg,v67613451":"\x00\x00x..jpg,v67708891"
08:00:59.880944 table@build created L1@3380927 N·95300 S·2MiB "\x00\x00x..jpg,v67709347":"\x00\x00x..jpg,v67804275"
08:07:03.667411 table@build created L1@3380928 N·95300 S·2MiB "\x00\x00x..jpg,v67803689":"\x00\x00x..jpg,v67899399"
While it creates new ldb files, is there any estimate or way to check when it will be completed?