exotic-amazon issues

FileBackendStorage should be used to run demo tasks

1

FileBackendStorage should be used to run demo tasks.

Better deployment experience

We need a better deployment experience.
1. Create a local directory that contains all necessary files to deploy.
2. We should test the program in the local deploy directory.
3....

galaxyeye

About extract-config

1

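Hello, could you briefly explain the parent-child relationships between the crawl tasks in extract-config? After adjusting the parent/child/grandchild relationships among "list page", "product detail page", and "product reviews", I found that, whether or not a parent-child relationship exists, AmazonJdbcSinkSQLExtractor.isRelevant is created repeatedly and evaluated against the target URL multiple times. When a parent-child relationship is configured, however, some URLs are missed: the grandchild-level check is not used to match URLs.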

ws4435700

Failed to create chrome devtools driver

3

Hello, when I ran the code today, code that previously worked now reports the error "Failed to create chrome devtools driver", and the program cannot launch Chrome to fetch pages. The log is as follows: `21:49:40.923 [r-worker-2] WARN a.p.p.p.b.e.context.WebDriverContext - 3. Retry task 1 in crawl scope | caused by: [Unexpected] Failed to create chrome devtools driver 21:49:41.057...

ws4435700

503 error when crawling amazon.com in single-resource mode

2

14:29:19.835 [r-worker-9] INFO a.p.p.c.component.LoadComponent.Task - 29745. 💔 ⚡ U for N got 1462 0

swlcyx

good first issue

wontfix

Failed to load periodical seed resources when executing the jar following the README's instructions

10:43:27.719 [r-worker-1] WARN a.p.e.a.c.b.c.AmazonGenerator - Unexpected exception java.nio.file.FileSystemNotFoundException: null
    at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:169)
    at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:155)
    at java.base/java.nio.file.Path.of(Path.java:208)
    at java.base/java.nio.file.Paths.get(Paths.java:97)
    at ai.platon.exotic.amazon.crawl.boot.component.AmazonGenerator.getPeriodicalSeedDirectories(AmazonGenerator.kt:61)
    at ai.platon.exotic.amazon.crawl.boot.component.AmazonGenerator.generateLoadingTasks(AmazonGenerator.kt:111)
    at ai.platon.exotic.amazon.crawl.boot.component.AmazonGenerator.generateStartupTasks(AmazonGenerator.kt:85)
    at ai.platon.exotic.amazon.crawl.boot.component.AmazonCrawler.generate(AmazonCrawler.kt:53)
    at ai.platon.scent.crawl.AbstractRunnableCrawler.run0(AbstractRunnableCrawler.kt:49)
    at ai.platon.scent.crawl.AbstractRunnableCrawler.run$suspendImpl(AbstractRunnableCrawler.kt:29)
    at...
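The trace shows `Paths.get` being resolved through `ZipFileSystemProvider.getPath`, which throws `FileSystemNotFoundException` when no zip file system has been opened for the jar that holds the resource. A minimal sketch of the usual workaround follows; it is not the project's actual fix. The class `ResourcePaths` and the method `toPath` are hypothetical names, and the sketch assumes a plain (non-nested) `jar:` URI.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;

// Hypothetical helper, not part of exotic-amazon.
public final class ResourcePaths {
    private ResourcePaths() {}

    /**
     * Converts a resource URI to a Path. A "file:" URI resolves directly.
     * For a "jar:" URI the zip file system must exist before Paths.get(uri)
     * succeeds, so it is opened on demand and left open for later lookups.
     */
    public static synchronized Path toPath(URI uri) throws IOException {
        try {
            return Paths.get(uri);
        } catch (FileSystemNotFoundException e) {
            // No file system is registered for this jar yet: create one, then retry.
            FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap());
            return Paths.get(uri);
        }
    }
}
```

A caller would pass the URI obtained from the class loader (for example via `getResource(...).toURI()`) instead of calling `Paths.get` on it directly. Resources nested inside a fat jar may need additional handling that this sketch does not cover.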

platonai

Cannot get data after multiple timeouts

2

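I found several kinds of log output when a data crawl fails. After each of the first two failures the task is retried a few minutes later, but after the third failure it is never retried again, and it never enters the flow isRelevant (true) -> onBeforeFilter -> onBeforeExtract -> extract -> onAfterExtract -> onAfterFilter. Questions: 1. Is giving up after three failures a mechanism of the framework, or can it be changed through some setting? 2. Is there a way, through settings or in code, to still enter the flow isRelevant (true) -> onBeforeFilter -> onBeforeExtract -> extract -> onAfterExtract -> onAfterFilter after a failure, so that some post-processing (cleanup) can be done? First failure: `Timeout...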
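The second question asks for a place to run cleanup once retries are exhausted. The following is a generic sketch of that idea in caller code, not the framework's retry mechanism or API. `RetryWithCleanup`, `call`, and the `onGiveUp` callback are hypothetical names, and the sketch retries immediately rather than after a delay of a few minutes.

```java
import java.util.concurrent.Callable;
import java.util.function.Consumer;

// Hypothetical wrapper, not part of the crawler framework.
public final class RetryWithCleanup {
    private RetryWithCleanup() {}

    /**
     * Runs the task up to maxAttempts times. If every attempt throws,
     * onGiveUp receives the last exception (the cleanup hook) and that
     * exception is then rethrown to the caller.
     */
    public static <T> T call(Callable<T> task, int maxAttempts,
                             Consumer<Exception> onGiveUp) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        onGiveUp.accept(last); // all attempts failed: run cleanup once
        throw last;
    }
}
```

With `maxAttempts` set to 3 this mirrors the behavior described in the issue, with the difference that the cleanup callback always runs after the final failure.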

sskmtm

good first issue

wontfix

Let's learn together

2

https://www.yuque.com/g/kuloudadi/acseen/bl28so6x51ntz4lm/collaborator/join?token=NMF1sHp4XPlcpnB3# You are invited to co-edit the document 《柏拉图ai学习》 ("Platon AI Learning")

yangxiongj

good first issue

wontfix

The program is stuck on the debug message

3

The web page is stuck, and the terminal occasionally prints debug messages. **_DEBUG a.p.s.r.a.schedule.ScentRestMonitor - Try executing top N tasks ..._** ![f50a9223a68608e876a32ebc4f0e67e](https://user-images.githubusercontent.com/39584730/221103552-74cc88eb-70ff-44ea-bbfc-0f5212068e9a.png)

wfh1300

good first issue

wontfix

How do I change the default store mode?

3

Hi, I've set up jdbcommitter as required and commented out the code in the configuration that loads mongodb by default, but the service still uses the default mode once it is started.

ws4435700

good first issue

wontfix
