
start failed: unmarshal creator list error

Open Shikitsuna opened this issue 8 months ago • 19 comments

Downloading Kemono
fetching creator list...
2025/08/17 08:39:08 kemono start failed: unmarshal creator list error: invalid character 'I' looking for beginning of value

Is this a problem with the Kemono API or something else? Every creator I tested returns this error.

Shikitsuna avatar Aug 17 '25 02:08 Shikitsuna

I was also trying to take a quick look to see if I could fix this issue, so I logged the response data when the unmarshal that throws the error fails while fetching the creators:

err = json.Unmarshal(data, &creators)

if err != nil {
	k.log.Print(string(data))
	return nil, fmt.Errorf("unmarshal creator list error: %s", err)
}

And got this in the console: If you want to scrape, use "Accept: text/css" header in your requests for now. For whatever reason DDG does not like SPA and JSON, so we have to be funny. And you are no exception to caching.

So I tried adding text/css to the Accept header in the constants of the downloader and now I get a different error: fetch post list error: 403 Forbidden

Based on that, the API changed, so we need to update how the scraper makes its requests so they are accepted. Since I don't know what the API looked like before, this requires comparing what the code expects against what the API now wants and returns to figure out what needs to change; it could be a simple fix, or it could affect basically every request the scraper makes to the API.

randomNami avatar Aug 17 '25 17:08 randomNami

I think I got it working. I'm testing the scraper with only 1 link in the config.yaml and it is currently running (I probably should have used a link with fewer/smaller files and posts first, oh well). Here's what I had to change to get the current version working.

  1. I changed every mention of the domain .su to .cr, just like someone else already mentioned
  2. In downloader.go I added "text/css" to the Accept value in the constants; it should be around line 26
  3. fetch.go needs several changes in the FetchPosts function. Update the url it builds to end in "/posts", and change the logic that derives purl from it so the offset is only appended after the first page. To support this I added a max_num_posts variable that is set after the fetch function's definition. Getting that value needs another request to the API, so to do it more or less properly I copied the request handling FetchPosts already does for the posts and changed the final part to read the value we want. The request goes to the same url, except it ends in "/profile" instead of "/posts", and its response includes the value we need. This part of the code really should be written more cleanly, but I was just testing whether it worked and don't have the patience to improve the code quality here: it takes the data read from the API's response, splits it on every "," present, takes the 9th value (arrays start at 0), splits that on ":", takes the second value, and converts the text to a number. That works because the output is in JSON format, something like:

{"id": "12345", "value1":"acbde", "value2":"dkfkds".......}

So we turn the text into an array by splitting on the commas, take the 9th element, which in this case should be "post_count", and read its value (the second element after splitting that result on ":").
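For what it's worth, that split-by-comma approach breaks the moment the API reorders or adds a field. A minimal sketch of reading the same value with encoding/json instead; the "post_count" field name is assumed from the response format described above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractPostCount decodes the profile response into a generic map and
// reads "post_count" by name, instead of splitting the raw string on
// commas and colons.
func extractPostCount(data []byte) (int, error) {
	var profile map[string]any
	if err := json.Unmarshal(data, &profile); err != nil {
		return 0, fmt.Errorf("unmarshal profile error: %w", err)
	}
	// JSON numbers decode to float64 when the target type is any.
	count, ok := profile["post_count"].(float64)
	if !ok {
		return 0, fmt.Errorf("profile response has no numeric post_count")
	}
	return int(count), nil
}

func main() {
	data := []byte(`{"id":"12345","name":"example","service":"patreon","post_count":586}`)
	n, err := extractPostCount(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 586
}
```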

The end result for this function should be something like what I have here:

func (k *Kemono) FetchPosts(service, id string) (posts []Post, err error) {

	url := fmt.Sprintf("https://%s.cr/api/v1/%s/user/%s/posts", k.Site, service, id)
	perUnit := 50
	fetch := func(page int, max_num_posts int) (err error, finish bool) {
		k.log.Printf("fetching post list page %d...", page)
		purl := url
		if page != 0 {
			purl = fmt.Sprintf("%s?o=%d", url, page*perUnit)
		}

		k.log.Printf("purl: %s", purl)

		retryCount := 0
		for retryCount < k.retry {
			resp, err := k.Downloader.Get(purl)
			if err != nil {
				k.log.Printf("fetch post list error1: %v", err)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			if resp.StatusCode != http.StatusOK {
				k.log.Printf("fetch post list error2: %s", resp.Status)
				k.log.Printf("resp: %s", resp)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			reader, err := handleCompressedHTTPResponse(resp)
			if err != nil {
				return err, false
			}

			data, err := ioutil.ReadAll(reader)
			if err != nil {
				return fmt.Errorf("fetch post list error3: %s", err), false
			}
			reader.Close()

			var pr []PostRaw
			err = json.Unmarshal(data, &pr)
			if err != nil {
				return fmt.Errorf("unmarshal post list error4: %s", err), false
			}
			if len(pr) == 0 || max_num_posts <= (page+1)*perUnit {
				// final page
				return nil, true
			}
			for _, p := range pr {
				posts = append(posts, p.ParasTime())
			}
			return nil, false
		}

		return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
	}

	for i := 0; ; i++ {
		url := fmt.Sprintf("https://%s.cr/api/v1/%s/user/%s/profile", k.Site, service, id)
		resp, err := k.Downloader.Get(url)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 1: %s", err)
		}

		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("fetch post list error getting number of posts 2: %s", resp.Status)
		}

		reader, err := handleCompressedHTTPResponse(resp)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 3: %s", err)
		}

		data, err := ioutil.ReadAll(reader)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 4: %s", err)
		}
		reader.Close()

		var max_num_posts int
		temp := strings.Split(strings.Split(string(data), ",")[8], ":")[1]
		_, err = fmt.Sscan(temp, &max_num_posts)

		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 5: %s", err)
		}

		err, finish := fetch(i, max_num_posts)
		if err != nil {
			return nil, err
		}
		if finish {
			break
		}
	}
	return
}

randomNami avatar Aug 17 '25 20:08 randomNami

I've made a patch combining both the above changes and @bill88t's ones (see https://github.com/bill88t/Kemono-scraper, as well as his PR here: #63).

I also attempted to

  • Continue bill88t's method for setting the host in fetch.go
  • Use the json parser to get the post count instead of just relying on the post count always being the 9th entry in the json data.

I have never written any go before so there could be mistakes.

I still get:

[~/temp/Kemono-scraper/main] $ go build -ldflags="-s -w" -o kemono-scraper && ./kemono-scraper --link https://kemono.cr/patreon/user/XXXXXXXX -date-after 20250818
Downloading Kemono
fetching creator list...
Start download 1 creators
Fetching posts
fetching post list page 0...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXX/posts
fetching post list page 1...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXX/posts?o=50
fetching post list page 2...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=100
fetching post list page 3...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=150
fetching post list page 4...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=200
fetching post list page 5...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=250
fetching post list page 6...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=300
fetching post list page 7...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=350
fetching post list page 8...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=400
fetching post list page 9...
purl: https://kemono.cr/api/v1/patreon/user/XXXXXXXXposts?o=450
fetch post list error2: 403 Forbidden
resp: &{403 Forbidden %!s(int=403) HTTP/1.1 %!s(int=1) %!s(int=1) map[Age:[0] Connection:[keep-alive] Content-Length:[191] Content-Security-Policy:[upgrade-insecure-requests;] Content-Type:[application/octet-stream] Date:[Mon, 18 Aug 2025 07:44:10 GMT] Ddg-Cache-Status:[MISS] Keep-Alive:[timeout=60] Server:[ddos-guard] Set-Cookie:[__ddg8_=tB9c1I27PvVBfnZB; Domain=.kemono.cr; Path=/; Expires=Mon, 18-Aug-2025 08:04:10 GMT __ddg10_=1755503050; Domain=.kemono.cr; Path=/; Expires=Mon, 18-Aug-2025 08:04:10 GMT __ddg9_=95.98.44.224; Domain=.kemono.cr; Path=/; Expires=Mon, 18-Aug-2025 08:04:10 GMT __ddg1_=u2pBHPpWX7PtZY35AzDl; Domain=.kemono.cr; HttpOnly; Path=/; Expires=Tue, 18-Aug-2026 07:44:10 GMT]] %!s(*http.bodyEOFSignal=&{0xc001821700 {{} {0 0}} false <nil> 0x6e5e80 0x6e5f60}) %!s(int64=191) [] %!s(bool=false) %!s(bool=false) map[] %!s(*http.Request=&{GET 0xc0015d83f0 HTTP/1.1 1 1 map[Accept:[text/css,text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7] Accept-Encoding:[gzip, deflate, br] Accept-Language:[en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7] Connection:[keep-alive] Cookie:[__ddg2=db78406c1b170553098fe44ae29484a2] Host:[kemono.cr] Referer:[https://kemono.cr] Sec-Ch-Ua:["Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"] Sec-Ch-Ua-Mobile:[?0] Sec-Fetch-Dest:[document] Sec-Fetch-Mode:[navigate] Sec-Fetch-Site:[none] Sec-Fetch-User:[?1] Upgrade-Insecure-Requests:[1] User-Agent:[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36]] <nil> <nil> 0 [] false kemono.cr map[] map[] <nil> map[]   <nil> <nil> <nil>  {{}} <nil> [] map[]}) %!s(*tls.ConnectionState=&{772 true false 4865  true kemono.cr [0xc00024e008 0xc00024e608] [] [] [] [] false 0x673220 false 29})}

So this needs more work. Maybe I did not fully apply @randomNami's changes?

The patch:

diff --git a/downloader/downloader.go b/downloader/downloader.go
index 019e44e..e04e0ce 100644
--- a/downloader/downloader.go
+++ b/downloader/downloader.go
@@ -22,7 +22,7 @@ const (
 	maxConnection           = 100
 	rateLimit               = 2
 	UserAgent               = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
-	Accept                  = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
+	Accept                  = "text/css,text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
 	AcceptEncoding          = "gzip, deflate, br"
 	AcceptLanguage          = "en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7"
 	SecChUA                 = "\"Google Chrome\";v=\"111\", \"Not(A:Brand\";v=\"8\", \"Chromium\";v=\"111\""
diff --git a/example/example.go b/example/example.go
index 80af8aa..704c7f6 100644
--- a/example/example.go
+++ b/example/example.go
@@ -16,7 +16,7 @@ func main() {
 	t := term.NewTerminal(os.Stdout, os.Stderr, false)
 
 	d := downloader.NewDownloader(
-		downloader.BaseURL("https://kemono.su"),
+		downloader.BaseURL("https://kemono.cr"),
 		// the amount of download at the same time
 		downloader.MaxConcurrent(3),
 		downloader.Timeout(300*time.Second),
@@ -36,7 +36,7 @@ func main() {
 		}),
 		downloader.WithHeader(downloader.Header{
 			"User-Agent":      downloader.UserAgent,
-			"Referer":         "https://kemono.su",
+			"Referer":         "https://kemono.cr",
 			"accept":          downloader.Accept,
 			"accept-encoding": "gzip, deflate, br",
 			"accept-language": "ja-JP;q=0.8,ja;q=0.7,en-US;q=0.6,en;q=0.5",
diff --git a/kemono/fetch.go b/kemono/fetch.go
index f928e09..1e52597 100644
--- a/kemono/fetch.go
+++ b/kemono/fetch.go
@@ -14,9 +14,9 @@ import (
 )
 
 // FetchCreators fetch Creator list
-func (k *Kemono) FetchCreators() (creators []Creator, err error) {
+func (k *Kemono) FetchCreators(host string) (creators []Creator, err error) {
 	k.log.Print("fetching creator list...")
-	url := fmt.Sprintf("https://%s.su/api/v1/creators", k.Site)
+	url := fmt.Sprintf("https://%s/api/v1/creators", host)
 	resp, err := k.Downloader.Get(url)
 	if err != nil {
 		return nil, fmt.Errorf("fetch creator list error: %s", err)
@@ -39,69 +39,104 @@ func (k *Kemono) FetchCreators() (creators []Creator, err error) {
 }
 
 // FetchPosts fetch post list
-func (k *Kemono) FetchPosts(service, id string) (posts []Post, err error) {
-	url := fmt.Sprintf("https://%s.su/api/v1/%s/user/%s", k.Site, service, id)
-	perUnit := 50
-	fetch := func(page int) (err error, finish bool) {
-		k.log.Printf("fetching post list page %d...", page)
-		purl := fmt.Sprintf("%s?o=%d", url, page*perUnit)
-
-		retryCount := 0
-		for retryCount < k.retry {
-			resp, err := k.Downloader.Get(purl)
-			if err != nil {
-				k.log.Printf("fetch post list error: %v", err)
-				time.Sleep(k.retryInterval)
-				retryCount++
-				continue
-			}
-
-			if resp.StatusCode != http.StatusOK {
-				k.log.Printf("fetch post list error: %s", resp.Status)
-				time.Sleep(k.retryInterval)
-				retryCount++
-				continue
-			}
-
-			reader, err := handleCompressedHTTPResponse(resp)
-			if err != nil {
-				return err, false
-			}
-
-			data, err := ioutil.ReadAll(reader)
-			if err != nil {
-				return fmt.Errorf("fetch post list error: %s", err), false
-			}
-			reader.Close()
-
-			var pr []PostRaw
-			err = json.Unmarshal(data, &pr)
-			if err != nil {
-				return fmt.Errorf("unmarshal post list error: %s", err), false
-			}
-			if len(pr) == 0 {
-				// final page
-				return nil, true
-			}
-			for _, p := range pr {
-				posts = append(posts, p.ParasTime())
-			}
-			return nil, false
-		}
-
-		return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
-	}
-
-	for i := 0; ; i++ {
-		err, finish := fetch(i)
-		if err != nil {
-			return nil, err
-		}
-		if finish {
-			break
-		}
-	}
-	return
+func (k *Kemono) FetchPosts(service, id string, host string) (posts []Post, err error) {
+        url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/posts", host, service, id)
+        perUnit := 50
+        fetch := func(page int, max_num_posts int) (err error, finish bool) {
+	        k.log.Printf("fetching post list page %d...", page)
+                purl := url
+
+	        if page != 0 {
+		   purl = fmt.Sprintf("%s?o=%d", url, page*perUnit)
+                }
+                k.log.Printf("purl: %s", purl)
+
+                retryCount := 0
+                for retryCount < k.retry {
+		        resp, err := k.Downloader.Get(purl)
+                        if err != nil {
+			        k.log.Printf("fetch post list error1: %v", err)
+			        time.Sleep(k.retryInterval)
+			        retryCount++
+			        continue
+		        }
+
+		        if resp.StatusCode != http.StatusOK {
+			        k.log.Printf("fetch post list error2: %s", resp.Status)
+			        k.log.Printf("resp: %s", resp)
+			        time.Sleep(k.retryInterval)
+			        retryCount++
+			        continue
+		        }
+
+		        reader, err := handleCompressedHTTPResponse(resp)
+		        if err != nil {
+			        return err, false
+		        }
+
+		        data, err := ioutil.ReadAll(reader)
+		        if err != nil {
+			        return fmt.Errorf("fetch post list error3: %s", err), false
+		        }
+		        reader.Close()
+
+		        var pr []PostRaw
+		        err = json.Unmarshal(data, &pr)
+		        if err != nil {
+			        return fmt.Errorf("unmarshal post list error4: %s", err), false
+		        }
+		        if len(pr) == 0 || max_num_posts <= (page+1)*perUnit {
+			        // final page
+			        return nil, true
+		        }
+		        for _, p := range pr {
+			        posts = append(posts, p.ParasTime())
+		        }
+		        return nil, false
+	        }
+
+	        return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
+        }
+
+        for i := 0; ; i++ {
+	        url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/profile", host, service, id)
+	        resp, err := k.Downloader.Get(url)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 1: %s", err)
+	        }
+
+	        if resp.StatusCode != http.StatusOK {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 2: %s", err)
+	        }
+
+	        reader, err := handleCompressedHTTPResponse(resp)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 3: %s", err)
+	        }
+
+	        data, err := ioutil.ReadAll(reader)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 4: %s", err)
+	        }
+                reader.Close()
+
+                var jsonresult map[string]any
+                json.Unmarshal(data, &jsonresult)
+                post_count_float, ok := jsonresult["post_count"].(float64)
+                if !ok {
+                        return nil, fmt.Errorf("fetch post list error getting post_count")
+                }
+                max_num_posts := int(post_count_float)
+
+	        err, finish := fetch(i, max_num_posts)
+	        if err != nil {
+		        return nil, err
+	        }
+	        if finish {
+		        break
+	        }
+        }
+        return
 }
 
 // DownloadPosts download posts
diff --git a/kemono/kemono.go b/kemono/kemono.go
index fda74f4..df8f733 100644
--- a/kemono/kemono.go
+++ b/kemono/kemono.go
@@ -217,11 +217,11 @@ func WithUserAttachmentFilter(creator Creator, filter ...AttachmentFilter) Optio
 }
 
 // Start fetch and download
-func (k *Kemono) Start() error {
+func (k *Kemono) Start(host string) error {
 	// initialize the creators
 	if len(k.creators) == 0 {
 		// fetch creators from kemono
-		cs, err := k.FetchCreators()
+                cs, err := k.FetchCreators(host)
 		if err != nil {
 			return err
 		}
@@ -251,11 +251,13 @@ func (k *Kemono) Start() error {
 	k.log.Printf("Start download %d creators", len(k.users))
 	for _, creator := range k.users {
 		// fetch posts
-		posts, err := k.FetchPosts(creator.Service, creator.Id)
+                k.log.Printf("Fetching posts")
+		posts, err := k.FetchPosts(creator.Service, creator.Id, host)
 		if err != nil {
 			return err
 		}
 		// filter posts
+                k.log.Printf("Filter posts")
 		posts = k.FilterPosts(posts)
 
 		// filter attachments
diff --git a/main/main.go b/main/main.go
index 7855d9d..3d885f6 100644
--- a/main/main.go
+++ b/main/main.go
@@ -404,7 +404,9 @@ func main() {
 		k = true
 		options[Kemono] = append(options[Kemono], sharedOptions...)
 		options[Kemono] = append(options[Kemono], kemono.WithDomain("kemono"))
-		downloaderOptions = append(downloaderOptions, downloader.BaseURL("https://kemono.su"))
+
+		kemonoHost := siteHost(Kemono)
+		downloaderOptions = append(downloaderOptions, downloader.BaseURL("https://"+kemonoHost))
 		token, err := utils.GenerateToken(16)
 		if err != nil {
 			log.Fatalf("generate token failed: %s", err)
@@ -414,14 +416,14 @@ func main() {
 				Name:   "__ddg2",
 				Value:  token,
 				Path:   "/",
-				Domain: ".kemono.su",
+				Domain: "." + kemonoHost,
 				Secure: false,
 			},
 		}))
 		downloaderOptions = append(downloaderOptions, downloader.WithHeader(downloader.Header{
-			"Host":                      "kemono.su",
+			"Host":                      kemonoHost,
 			"User-Agent":                downloader.UserAgent,
-			"Referer":                   "https://kemono.su",
+			"Referer":                   "https://" + kemonoHost,
 			"Accept":                    downloader.Accept,
 			"Accept-Language":           downloader.AcceptLanguage,
 			"Accept-Encoding":           downloader.AcceptEncoding,
@@ -442,7 +444,9 @@ func main() {
 		c = true
 		options[Coomer] = append(options[Coomer], sharedOptions...)
 		options[Coomer] = append(options[Coomer], kemono.WithDomain("coomer"))
-		downloaderOptions = append(downloaderOptions, downloader.BaseURL("https://coomer.su"))
+
+		coomerHost := siteHost(Coomer)
+		downloaderOptions = append(downloaderOptions, downloader.BaseURL("https://"+coomerHost))
 		token, err := utils.GenerateToken(16)
 		if err != nil {
 			log.Fatalf("generate token failed: %s", err)
@@ -452,13 +456,13 @@ func main() {
 				Name:   "__ddg2",
 				Value:  token,
 				Path:   "/",
-				Domain: ".coomer.su",
+				Domain: "." + coomerHost,
 			},
 		}))
 		downloaderOptions = append(downloaderOptions, downloader.WithHeader(downloader.Header{
-			"Host":                      "coomer.su",
+			"Host":                      coomerHost,
 			"User-Agent":                downloader.UserAgent,
-			"Referer":                   "https://coomer.su/",
+			"Referer":                   "https://" + coomerHost + "/",
 			"Accept":                    downloader.Accept,
 			"Accept-Language":           downloader.AcceptLanguage,
 			"Accept-Encoding":           downloader.AcceptEncoding,
@@ -479,14 +483,16 @@ func main() {
 
 	if k {
 		terminal.Print("Downloading Kemono")
-		err := KKemono.Start()
+		kemonoHost := siteHost(Kemono)
+		err := KKemono.Start(kemonoHost)
 		if err != nil {
 			log.Printf("kemono start failed: %s", err)
 		}
 	}
 	if c {
 		terminal.Print("Downloading Coomer")
-		err := KCoomer.Start()
+		coomerHost := siteHost(Coomer)
+		err := KCoomer.Start(coomerHost)
 		if err != nil {
 			log.Printf("coomer start failed: %s", err)
 		}
@@ -499,7 +505,8 @@ func parasLink(link string) (s, service, userId, postId string) {
 		log.Fatal("invalid url")
 	}
 
-	pattern := `(?i)^(?:.*\.)?(kemono|coomer)\.su$`
+	// Accept kemono/coomer with su/cr/st TLDs (and optional subdomains)
+	pattern := `(?i)^(?:.*\.)?(kemono|coomer)\.(su|cr|st)$`
 	re := regexp.MustCompile(pattern)
 
 	matchedSubstrings := re.FindStringSubmatch(u.Host)
@@ -672,7 +679,8 @@ func DirectoryName(p kemono.Post) string {
 }
 
 func fetchFavoriteCreators(s string, cookie []*http.Cookie) []kemono.FavoriteCreator {
-	log.Printf("fetching favorite creators from %s.su", s)
+	host := siteHost(s)
+	log.Printf("fetching favorite creators from %s", host)
 	var client *http.Client
 	client = http.DefaultClient
 	if proxy != "" {
@@ -685,11 +693,11 @@ func fetchFavoriteCreators(s string, cookie []*http.Cookie) []kemono.FavoriteCre
 		downloader.AddProxy(proxy, client.Transport.(*http.Transport))
 	}
 
-	req, err := http.NewRequest("GET", fmt.Sprintf("https://%s.su/api/v1/account/favorites?type=user", s), nil)
+	req, err := http.NewRequest("GET", fmt.Sprintf("https://%s/api/v1/account/favorites?type=user", host), nil)
 	if err != nil {
 		log.Fatalf("Error creating request: %s", err)
 	}
-	req.Header.Set("Host", fmt.Sprintf("%s.su", s))
+	req.Header.Set("Host", host)
 	for _, v := range cookie {
 		req.AddCookie(v)
 	}
@@ -710,7 +718,8 @@ func fetchFavoriteCreators(s string, cookie []*http.Cookie) []kemono.FavoriteCre
 }
 
 func fetchFavoritePosts(s string, cookie []*http.Cookie) []kemono.PostRaw {
-	log.Printf("fetching favorite posts from %s.su", s)
+	host := siteHost(s)
+	log.Printf("fetching favorite posts from %s", host)
 	var client *http.Client
 	client = http.DefaultClient
 	if proxy != "" {
@@ -722,11 +731,11 @@ func fetchFavoritePosts(s string, cookie []*http.Cookie) []kemono.PostRaw {
 		}
 		downloader.AddProxy(proxy, client.Transport.(*http.Transport))
 	}
-	req, err := http.NewRequest("GET", fmt.Sprintf("https://%s.su/api/v1/account/favorites?type=post", s), nil)
+	req, err := http.NewRequest("GET", fmt.Sprintf("https://%s/api/v1/account/favorites?type=post", host), nil)
 	if err != nil {
 		log.Fatalf("Error creating request: %s", err)
 	}
-	req.Header.Set("Host", fmt.Sprintf("%s.su", s))
+	req.Header.Set("Host", host)
 	for _, v := range cookie {
 		req.AddCookie(v)
 	}
@@ -796,3 +805,17 @@ func parasCookieFile(cookieFile string) []*http.Cookie {
 	}
 	return cookies
 }
+
+// siteHost resolves the canonical host for a given site identifier ("kemono" or "coomer").
+// Updated to support kemono.cr and coomer.st.
+func siteHost(s string) string {
+	switch strings.ToLower(s) {
+	case Kemono:
+		return "kemono.cr"
+	case Coomer:
+		return "coomer.st"
+	default:
+		// Fallback: if a full host was passed in, return as-is.
+		return s
+	}
+}

Sorry about the weird spacing on the new code, my editor (and my sanity) are not ok with the use of spaces and tabs in the source files :)

swaylortift avatar Aug 18 '25 07:08 swaylortift


@swaylortift based on what I see in the error, it might be an issue with fetch's stop condition or with the method you're using to obtain max_num_posts, since the error happens while fetching the posts, before the download starts. The reason I added the max_num_posts condition to the fetch function declared inside FetchPosts is that if you request a page of values from the API that starts at a post number that doesn't exist, it returns the 403 error.

For your case, can you debug that condition or add logs to see what is happening there? I tried the code you used to set max_num_posts and didn't get that error, and I'm also not seeing any differences between your code and what I changed in mine, so more debugging of your specific issue is needed.

What I tried in the stop condition of fetch was to check whether the post count in max_num_posts is less than or equal to the next page's starting post number. I left the original stopping condition in place, but with the new behaviour I think the new condition will always be the one that stops the loop. However, now that I've tested it on a different creator with much smaller files in their posts, I see my version still has an issue: it wouldn't download the last page of a creator's posts. This is again related to the stopping conditions of the fetch function. To fix it I will remove the original stopping condition that signals everything is OK and instead add the new condition as a separate if after the portion that appends the posts:

var pr []PostRaw
err = json.Unmarshal(data, &pr)
if err != nil {
	return fmt.Errorf("unmarshal post list error: %s", err), false
}
for _, p := range pr {
	posts = append(posts, p.ParasTime())
}
if max_num_posts <= (page+1)*perUnit {
	// final page
	return nil, true
}
return nil, false

This solved my issue with it not downloading posts from the last page. I will do some more tests on my side to see if it is ok now

randomNami avatar Aug 18 '25 09:08 randomNami

Also, now that I'm looking at it, the issue I was trying to fix with the max-number-of-posts variable might be related to open pull request #56. That PR adds a new field to the config.yaml file to set a maximum number of pages and uses it to make the loop in FetchPosts no longer infinite; it also adds a delay between requests to try to avoid rate-limit issues.

For this case we could do something similar: move the stopping condition to the for loop in FetchPosts instead of the fetch function inside it, move the request that obtains the total post count outside the loop, and set the loop-terminating variable there. That way we wouldn't need a stopping condition inside fetch at all. These are two different ways of implementing this fix.
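If we went with the second option, the outer loop becomes a plain bounded loop. A sketch of the page-count arithmetic, using the perUnit value from the code above (the 586 is just the post count from the logs in this thread):

```go
package main

import "fmt"

func main() {
	perUnit := 50
	maxNumPosts := 586 // would come from the one-off /profile request
	// Ceiling division: the number of pages needed to cover all posts.
	// With this, the for loop in FetchPosts can run a fixed number of
	// iterations and fetch itself needs no stopping condition.
	totalPages := (maxNumPosts + perUnit - 1) / perUnit
	fmt.Println(totalPages) // 12
	for page := 0; page < totalPages; page++ {
		_ = page // fetch(page) would go here
	}
}
```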

randomNami avatar Aug 18 '25 10:08 randomNami

For your case can you debug that condition or add logs to see what is happening there? Because I tried to use the code you used to set max_num_posts and didn't get that error, I'm also not seeing any differences in your code vs what I changed in mine. So more debugging of your specific issue is needed

Thanks for taking a look. I added a few more printf statements. It doesn't make things any clearer to me; maybe you have a suggestion for where I should log which data? Or maybe the API simply does not allow fetching more than 500 posts; could that be it?

Also, note that in your for loop where you fetch the profile page to get max_num_posts, a lot of the work can be moved out of the loop. max_num_posts does not change between iterations and only has to be determined once. Basically, you can move the line for i := 0; ; i++ { down to just above err, finish := fetch(i, max_num_posts) (and then rename the url variable, since one already exists in the outer scope). It's not a functional change, just an efficiency improvement.
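The suggested restructuring might look roughly like this; fetchPostCount and fetchPage are stand-ins for the real /profile and post-list requests, and the stop condition is the one from earlier in the thread:

```go
package main

import "fmt"

// fetchPostCount stands in for the one-off request to the /profile
// endpoint; the real code would return the creator's post_count.
func fetchPostCount() (int, error) { return 586, nil }

// fetchPage stands in for one post-list request; it reports whether
// this was the final page, using the thread's stop condition.
func fetchPage(page, perUnit, maxNumPosts int) (finish bool) {
	fmt.Printf("fetching post list page %d...\n", page)
	return maxNumPosts <= (page+1)*perUnit
}

func main() {
	perUnit := 50
	// Hoisted out of the loop: the post count never changes between
	// iterations, so one request before the loop is enough.
	maxNumPosts, err := fetchPostCount()
	if err != nil {
		panic(err)
	}
	for i := 0; ; i++ {
		if fetchPage(i, perUnit, maxNumPosts) {
			break
		}
	}
}
```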

[~/temp/Kemono-scraper/main] $ go build -ldflags="-s -w" -o kemono-scraper && ./kemono-scraper --link https://kemono.cr/patreon/user/59577203 -date-after 20250818
Downloading Kemono
fetching creator list...
Start download 1 creators
Fetching posts
Data returned from API: {"id":"59577203","name":"pizzacakecomic","service":"patreon","indexed":"2023-02-03T17:29:10.704099","updated":"2025-08-18T10:20:33.237190","public_id":"pizzacakecomic","relation_id":null,"has_chats":true,"post_count":586,"dm_count":1,"share_count":0,"chat_count":1}
Calling fetch with i = 0 and max_num_posts = 586
fetching post list page 0...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts
Calling fetch with i = 1 and max_num_posts = 586
fetching post list page 1...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=50
Calling fetch with i = 2 and max_num_posts = 586
fetching post list page 2...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=100
Calling fetch with i = 3 and max_num_posts = 586
fetching post list page 3...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=150
Calling fetch with i = 4 and max_num_posts = 586
fetching post list page 4...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=200
Calling fetch with i = 5 and max_num_posts = 586
fetching post list page 5...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=250
Calling fetch with i = 6 and max_num_posts = 586
fetching post list page 6...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=300
Calling fetch with i = 7 and max_num_posts = 586
fetching post list page 7...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=350
Calling fetch with i = 8 and max_num_posts = 586
fetching post list page 8...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=400
Calling fetch with i = 9 and max_num_posts = 586
fetching post list page 9...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=450
Calling fetch with i = 10 and max_num_posts = 586
fetching post list page 10...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=500
fetch post list error2: 403 Forbidden
resp: &{403 Forbidden %!s(int=403) HTTP/1.1 %!s(int=1) %!s(int=1) map[Age:[0] Connection:[keep-alive] Content-Length:[191] Content-Security-Policy:[upgrade-insecure-requests;] Content-Type:[application/octet-stream] Date:[Tue, 19 Aug 2025 07:26:44 GMT] Ddg-Cache-Status:[MISS] Keep-Alive:[timeout=60] Server:[ddos-guard] Set-Cookie:[__ddg8_=vUjMMOrQUlkVuYBp; Domain=.kemono.cr; Path=/; Expires=Tue, 19-Aug-2025 07:46:44 GMT __ddg10_=1755588404; Domain=.kemono.cr; Path=/; Expires=Tue, 19-Aug-2025 07:46:44 GMT __ddg9_=95.98.44.224; Domain=.kemono.cr; Path=/; Expires=Tue, 19-Aug-2025 07:46:44 GMT __ddg1_=y7QLUud0u1RlhU4TdeKp; Domain=.kemono.cr; HttpOnly; Path=/; Expires=Wed, 19-Aug-2026 07:26:44 GMT]] %!s(*http.bodyEOFSignal=&{0xc0018e2400 {{} {0 0}} false <nil> 0x6e5e80 0x6e5f60}) %!s(int64=191) [] %!s(bool=false) %!s(bool=false) map[] %!s(*http.Request=&{GET 0xc0015183f0 HTTP/1.1 1 1 map[Accept:[text/css,text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7] Accept-Encoding:[gzip, deflate, br] Accept-Language:[en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7] Connection:[keep-alive] Cookie:[__ddg2=3a14253b975ffdbe79f0c2d5f634b571] Host:[kemono.cr] Referer:[https://kemono.cr] Sec-Ch-Ua:["Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"] Sec-Ch-Ua-Mobile:[?0] Sec-Fetch-Dest:[document] Sec-Fetch-Mode:[navigate] Sec-Fetch-Site:[none] Sec-Fetch-User:[?1] Upgrade-Insecure-Requests:[1] User-Agent:[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36]] <nil> <nil> 0 [] false kemono.cr map[] map[] <nil> map[]   <nil> <nil> <nil>  {{}} <nil> [] map[]}) %!s(*tls.ConnectionState=&{772 true false 4865  true kemono.cr [0xc00015a008 0xc00015a608] [] [] [] [] false 0x673220 false 29})}
^C

Thanks!

swaylortift avatar Aug 19 '25 07:08 swaylortift

@swaylortift I think it might be some kind of rate-limit issue. I tried downloading the creator from your logs with my code and got the 403 error even earlier than you did. I think that's because, as you noted in your comment, I accidentally left the request that obtains the max number of posts inside the loop, so I was making more requests than your code and hit the error sooner.

You can add a sleep to the for loop that calls the fetch function and it should work; it's just a matter of testing how long the sleep needs to be to avoid issues. I added a 500 millisecond sleep and it worked, and I will probably try reducing it further to find the shortest delay that doesn't trigger the rate limit. (I figured out how to properly highlight code in these comments, so I updated them: post the code first, select it, then click the code button and it adds the proper tags.)

for i := 0; ; i++ {
	err, finish := fetch(i, max_num_posts)
	if err != nil {
		return nil, err
	}
	if finish {
		break
	}
	time.Sleep(500 * time.Millisecond)
}
return
Downloading Kemono
fetching creator list...
Start download 1 creators
fetching post list page 0...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts
fetching post list page 1...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=50
fetching post list page 2...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=100
fetching post list page 3...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=150
fetching post list page 4...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=200
fetching post list page 5...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=250
fetching post list page 6...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=300
fetching post list page 7...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=350
fetching post list page 8...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=400
fetching post list page 9...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=450
fetching post list page 10...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=500
fetching post list page 11...
purl: https://kemono.cr/api/v1/patreon/user/59577203/posts?o=550
download post: Kitten Logic
 645.60ms  Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0%    2.59 MB/s   1.67 MB 0-Comic_i bite.png
 681.66ms  Success ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0%    2.45 MB/s   1.67 MB 1-Comic_i bite.png
download post: Cabin Vibes
 482.91ms Download ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0.9%  363.39 KB/s  18.93 MB 1-Xtra1.png
 482.30ms Download ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   1.0%  397.03 KB/s  18.93 MB 0-Xtra1.png
 482.30ms Download ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0.9%  397.03 KB/s  19.70 MB 2-Xtra2.png
 482.30ms Download ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0.2%   65.29 KB/s  15.04 MB 3-Xtra4.png
 482.30ms Download ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0.2%   65.29 KB/s  14.06 MB 4-Xtra5.png^C

randomNami avatar Aug 26 '25 09:08 randomNami

So here's my updated fetch function based on this:

// FetchPosts fetch post list
func (k *Kemono) FetchPosts(service, id string) (posts []Post, err error) {
	url := fmt.Sprintf("https://%s.cr/api/v1/%s/user/%s/posts", k.Site, service, id)
	perUnit := 50
	fetch := func(page int, max_num_posts int) (err error, finish bool) {
		k.log.Printf("fetching post list page %d...", page)
		purl := url
		if page != 0 {
			purl = fmt.Sprintf("%s?o=%d", url, page*perUnit)
		}

		k.log.Printf("purl: %s", purl)

		retryCount := 0
		for retryCount < k.retry {
			resp, err := k.Downloader.Get(purl)
			if err != nil {
				k.log.Printf("fetch post list error1: %v", err)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			if resp.StatusCode != http.StatusOK {
				k.log.Printf("fetch post list error2: %s", resp.Status)
				k.log.Printf("resp: %s", resp)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			reader, err := handleCompressedHTTPResponse(resp)
			if err != nil {
				return err, false
			}

			data, err := ioutil.ReadAll(reader)
			if err != nil {
				return fmt.Errorf("fetch post list error3: %s", err), false
			}
			reader.Close()

			var pr []PostRaw
			err = json.Unmarshal(data, &pr)
			if err != nil {
				return fmt.Errorf("unmarshal post list error: %s", err), false
			}
			for _, p := range pr {
				posts = append(posts, p.ParasTime())
			}
			if max_num_posts <= (page+1)*perUnit {
				// final page
				return nil, true
			}
			return nil, false
		}

		return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
	}

	num_posts_url := fmt.Sprintf("https://%s.cr/api/v1/%s/user/%s/profile", k.Site, service, id)
	resp, err := k.Downloader.Get(num_posts_url)
	if err != nil {
		return nil, fmt.Errorf("fetch post list error getting number of posts 1: %s", err)
	}

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetch post list error getting number of posts 2: %s", resp.Status)
	}

	reader, err := handleCompressedHTTPResponse(resp)
	if err != nil {
		return nil, fmt.Errorf("fetch post list error getting number of posts 3: %s", err)
	}

	data, err := ioutil.ReadAll(reader)
	if err != nil {
		return nil, fmt.Errorf("fetch post list error getting number of posts 4: %s", err)
	}
	reader.Close()

	var jsonresult map[string]any
	err = json.Unmarshal(data, &jsonresult)
	if err != nil {
		return nil, fmt.Errorf("fetch post list error getting number of posts 5: %s", err)
	}
	post_count_float, ok := jsonresult["post_count"].(float64)
	if !ok {
		return nil, fmt.Errorf("fetch post list error getting post_count")
	}
	max_num_posts := int(post_count_float)

	for i := 0; ; i++ {
		err, finish := fetch(i, max_num_posts)
		if err != nil {
			return nil, err
		}
		if finish {
			break
		}
		time.Sleep(500 * time.Millisecond)
	}
	return
}

Like I mentioned, the sleep time can likely be reduced, just not by too much, or the 403 could happen again.

randomNami avatar Aug 26 '25 09:08 randomNami

@swaylortift The problem is the wrong order:

	if len(pr) == 0 || max_num_posts <= (page+1)*perUnit {
		// final page
		return nil, true
	}
	for _, p := range pr {
		posts = append(posts, p.ParasTime())
	}
	return nil, false

The function returns at the last page before adding the posts; just reverse these two parts of the code.

Jaystarpoi avatar Aug 27 '25 00:08 Jaystarpoi

@Jaystarpoi

If you don’t mind, could you please post an updated version? I’m not very experienced with Go Language, and although I tried the solutions, I couldn’t get it to work correctly.

mengod-gh avatar Aug 27 '25 05:08 mengod-gh

@Jaystarpoi

If you don’t mind, could you please post an updated version? I’m not very experienced with Go Language, and although I tried the solutions, I couldn’t get it to work correctly.

new function

// FetchPosts fetch post list
func (k *Kemono) FetchPosts(service, id string, host string) (posts []Post, err error) {
	url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/posts", host, service, id)
	perUnit := 50
	fetch := func(page int, max_num_posts int) (err error, finish bool) {
		k.log.Printf("fetching post list page %d...", page)
		purl := url
		if page != 0 {
			purl = fmt.Sprintf("%s?o=%d", url, page*perUnit)
		}
		k.log.Printf("purl: %s", purl)

		retryCount := 0
		for retryCount < k.retry {
			resp, err := k.Downloader.Get(purl)
			if err != nil {
				k.log.Printf("fetch post list error1: %v", err)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			if resp.StatusCode != http.StatusOK {
				k.log.Printf("fetch post list error2: %s", resp.Status)
				k.log.Printf("resp: %s", resp)
				time.Sleep(k.retryInterval)
				retryCount++
				continue
			}

			reader, err := handleCompressedHTTPResponse(resp)
			if err != nil {
				return err, false
			}

			data, err := ioutil.ReadAll(reader)
			if err != nil {
				return fmt.Errorf("fetch post list error3: %s", err), false
			}
			reader.Close()

			var pr []PostRaw
			err = json.Unmarshal(data, &pr)
			if err != nil {
				return fmt.Errorf("unmarshal post list error4: %s", err), false
			}

			for _, p := range pr {
				posts = append(posts, p.ParasTime())
			}

			if len(pr) == 0 || max_num_posts <= (page+1)*perUnit {
				// final page
				return nil, true
			}

			return nil, false
		}

		return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
	}

	for i := 0; ; i++ {
		url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/profile", host, service, id)
		resp, err := k.Downloader.Get(url)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 1: %s", err)
		}

		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("fetch post list error getting number of posts 2: %s", resp.Status)
		}

		reader, err := handleCompressedHTTPResponse(resp)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 3: %s", err)
		}

		data, err := ioutil.ReadAll(reader)
		if err != nil {
			return nil, fmt.Errorf("fetch post list error getting number of posts 4: %s", err)
		}
		reader.Close()

		var jsonresult map[string]any
		json.Unmarshal(data, &jsonresult)
		post_count_float, ok := jsonresult["post_count"].(float64)
		if !ok {
			return nil, fmt.Errorf("fetch post list error getting post_count")
		}
		max_num_posts := int(post_count_float)

		err, finish := fetch(i, max_num_posts)
		if err != nil {
			return nil, err
		}
		if finish {
			break
		}
	}
	return
}

Jaystarpoi avatar Aug 27 '25 06:08 Jaystarpoi

Finally solved it everyone thank you!!!

mengod-gh avatar Aug 27 '25 10:08 mengod-gh

For anyone wondering what the solution is, just apply this patch using git apply -3 fix.diff

Here is fix.diff:

diff --git a/downloader/downloader.go b/downloader/downloader.go
index 019e44e..e04e0ce 100644
--- a/downloader/downloader.go
+++ b/downloader/downloader.go
@@ -22,7 +22,7 @@ const (
 	maxConnection           = 100
 	rateLimit               = 2
 	UserAgent               = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
-	Accept                  = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
+	Accept                  = "text/css,text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"
 	AcceptEncoding          = "gzip, deflate, br"
 	AcceptLanguage          = "en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7"
 	SecChUA                 = "\"Google Chrome\";v=\"111\", \"Not(A:Brand\";v=\"8\", \"Chromium\";v=\"111\""
diff --git a/kemono/fetch.go b/kemono/fetch.go
index f928e09..4fe3d51 100644
--- a/kemono/fetch.go
+++ b/kemono/fetch.go
@@ -14,9 +14,9 @@ import (
 )
 
 // FetchCreators fetch Creator list
-func (k *Kemono) FetchCreators() (creators []Creator, err error) {
+func (k *Kemono) FetchCreators(host string) (creators []Creator, err error) {
 	k.log.Print("fetching creator list...")
-	url := fmt.Sprintf("https://%s.su/api/v1/creators", k.Site)
+	url := fmt.Sprintf("https://%s/api/v1/creators", host)
 	resp, err := k.Downloader.Get(url)
 	if err != nil {
 		return nil, fmt.Errorf("fetch creator list error: %s", err)
@@ -39,69 +39,104 @@ func (k *Kemono) FetchCreators() (creators []Creator, err error) {
 }
 
 // FetchPosts fetch post list
-func (k *Kemono) FetchPosts(service, id string) (posts []Post, err error) {
-	url := fmt.Sprintf("https://%s.su/api/v1/%s/user/%s", k.Site, service, id)
-	perUnit := 50
-	fetch := func(page int) (err error, finish bool) {
-		k.log.Printf("fetching post list page %d...", page)
-		purl := fmt.Sprintf("%s?o=%d", url, page*perUnit)
-
-		retryCount := 0
-		for retryCount < k.retry {
-			resp, err := k.Downloader.Get(purl)
-			if err != nil {
-				k.log.Printf("fetch post list error: %v", err)
-				time.Sleep(k.retryInterval)
-				retryCount++
-				continue
-			}
-
-			if resp.StatusCode != http.StatusOK {
-				k.log.Printf("fetch post list error: %s", resp.Status)
-				time.Sleep(k.retryInterval)
-				retryCount++
-				continue
-			}
-
-			reader, err := handleCompressedHTTPResponse(resp)
-			if err != nil {
-				return err, false
-			}
-
-			data, err := ioutil.ReadAll(reader)
-			if err != nil {
-				return fmt.Errorf("fetch post list error: %s", err), false
-			}
-			reader.Close()
-
-			var pr []PostRaw
-			err = json.Unmarshal(data, &pr)
-			if err != nil {
-				return fmt.Errorf("unmarshal post list error: %s", err), false
-			}
-			if len(pr) == 0 {
-				// final page
-				return nil, true
-			}
-			for _, p := range pr {
-				posts = append(posts, p.ParasTime())
-			}
-			return nil, false
-		}
-
-		return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
-	}
-
-	for i := 0; ; i++ {
-		err, finish := fetch(i)
-		if err != nil {
-			return nil, err
-		}
-		if finish {
-			break
-		}
-	}
-	return
+func (k *Kemono) FetchPosts(service, id string, host string) (posts []Post, err error) {
+        url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/posts", host, service, id)
+        perUnit := 50
+        fetch := func(page int, max_num_posts int) (err error, finish bool) {
+	        k.log.Printf("fetching post list page %d...", page)
+                purl := url
+
+	        if page != 0 {
+		   purl = fmt.Sprintf("%s?o=%d", url, page*perUnit)
+                }
+                k.log.Printf("purl: %s", purl)
+
+                retryCount := 0
+                for retryCount < k.retry {
+		        resp, err := k.Downloader.Get(purl)
+                        if err != nil {
+			        k.log.Printf("fetch post list error1: %v", err)
+			        time.Sleep(k.retryInterval)
+			        retryCount++
+			        continue
+		        }
+
+		        if resp.StatusCode != http.StatusOK {
+			        k.log.Printf("fetch post list error2: %s", resp.Status)
+			        k.log.Printf("resp: %s", resp)
+			        time.Sleep(k.retryInterval)
+			        retryCount++
+			        continue
+		        }
+
+		        reader, err := handleCompressedHTTPResponse(resp)
+		        if err != nil {
+			        return err, false
+		        }
+
+		        data, err := ioutil.ReadAll(reader)
+		        if err != nil {
+			        return fmt.Errorf("fetch post list error3: %s", err), false
+		        }
+		        reader.Close()
+
+		        var pr []PostRaw
+		        err = json.Unmarshal(data, &pr)
+		        if err != nil {
+			        return fmt.Errorf("unmarshal post list error4: %s", err), false
+		        }
+		        for _, p := range pr {
+			        posts = append(posts, p.ParasTime())
+		        }
+		        if len(pr) == 0 || max_num_posts <= (page+1)*perUnit {
+			        // final page
+			        return nil, true
+		        }
+		        return nil, false
+	        }
+
+	        return fmt.Errorf("fetch post list error: maximum retry count exceeded"), false
+        }
+
+        for i := 0; ; i++ {
+	        url := fmt.Sprintf("https://%s/api/v1/%s/user/%s/profile", host, service, id)
+	        resp, err := k.Downloader.Get(url)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 1: %s", err)
+	        }
+
+	        if resp.StatusCode != http.StatusOK {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 2: %s", err)
+	        }
+
+	        reader, err := handleCompressedHTTPResponse(resp)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 3: %s", err)
+	        }
+
+	        data, err := ioutil.ReadAll(reader)
+	        if err != nil {
+		        return nil, fmt.Errorf("fetch post list error getting number of posts 4: %s", err)
+	        }
+                reader.Close()
+
+                var jsonresult map[string]any
+                json.Unmarshal(data, &jsonresult)
+                post_count_float, ok := jsonresult["post_count"].(float64)
+                if !ok {
+                        return nil, fmt.Errorf("fetch post list error getting post_count")
+                }
+                max_num_posts := int(post_count_float)
+
+	        err, finish := fetch(i, max_num_posts)
+	        if err != nil {
+		        return nil, err
+	        }
+	        if finish {
+		        break
+	        }
+        }
+        return
 }
 
 // DownloadPosts download posts
diff --git a/kemono/kemono.go b/kemono/kemono.go
index fda74f4..df8f733 100644
--- a/kemono/kemono.go
+++ b/kemono/kemono.go
@@ -217,11 +217,11 @@ func WithUserAttachmentFilter(creator Creator, filter ...AttachmentFilter) Optio
 }
 
 // Start fetch and download
-func (k *Kemono) Start() error {
+func (k *Kemono) Start(host string) error {
 	// initialize the creators
 	if len(k.creators) == 0 {
 		// fetch creators from kemono
-		cs, err := k.FetchCreators()
+                cs, err := k.FetchCreators(host)
 		if err != nil {
 			return err
 		}
@@ -251,11 +251,13 @@ func (k *Kemono) Start() error {
 	k.log.Printf("Start download %d creators", len(k.users))
 	for _, creator := range k.users {
 		// fetch posts
-		posts, err := k.FetchPosts(creator.Service, creator.Id)
+                k.log.Printf("Fetching posts")
+		posts, err := k.FetchPosts(creator.Service, creator.Id, host)
 		if err != nil {
 			return err
 		}
 		// filter posts
+                k.log.Printf("Filter posts")
 		posts = k.FilterPosts(posts)
 
 		// filter attachments
diff --git a/main/main.go b/main/main.go
index 9ec093f..3d885f6 100644
--- a/main/main.go
+++ b/main/main.go
@@ -483,14 +483,16 @@ func main() {
 
 	if k {
 		terminal.Print("Downloading Kemono")
-		err := KKemono.Start()
+		kemonoHost := siteHost(Kemono)
+		err := KKemono.Start(kemonoHost)
 		if err != nil {
 			log.Printf("kemono start failed: %s", err)
 		}
 	}
 	if c {
 		terminal.Print("Downloading Coomer")
-		err := KCoomer.Start()
+		coomerHost := siteHost(Coomer)
+		err := KCoomer.Start(coomerHost)
 		if err != nil {
 			log.Printf("coomer start failed: %s", err)
 		}

Shaman12983 avatar Sep 03 '25 22:09 Shaman12983

Is there any working master zip I can download, or an .exe? I feel kinda dumb because the patch failed for me.

Shikitsuna avatar Sep 04 '25 01:09 Shikitsuna

any working master zip that ic dl or .exe? kinda dumb because patch failed

@Shikitsuna try git apply --reject --whitespace=fix fix.diff

FawazTakahji avatar Sep 05 '25 14:09 FawazTakahji

@FawazTakahji

I tried

  1. git clone https://github.com/elvis972602/Kemono-scraper
  2. In the Kemono-scraper folder, I created the fix.diff file and pasted in your code
  3. Then I ran "go mod tidy"
  4. In the main folder, I ran "go build -o ../kemono-scraper.exe", but I get an error:

C:\Users\MenGod\Downloads\Kemono-scraper\main>go build -o ../kemono-scraper.exe
.\main.go:482:17: undefined: siteHost
.\main.go:490:17: undefined: siteHost

C:\Users\MenGod\Downloads\Kemono-scraper\main>

mengod-gh avatar Sep 05 '25 18:09 mengod-gh

The patch is for this fork https://github.com/bill88t/Kemono-scraper

FawazTakahji avatar Sep 05 '25 18:09 FawazTakahji

For people getting the 403 error when fetching pages, changing the header to only accept text/css seems to fix it for me.

FawazTakahji avatar Oct 15 '25 11:10 FawazTakahji