More than 100% data in Alluxio
Alluxio Version: (not specified)
Describe the bug
The error I observed in Spark was:
Protocol message tag had invalid wire type.
The error I observed in Trino was:
Error opening Hive split alluxio:/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 (offset=67108864, length=67108864): Incorrect file size (270589145) for file (end of stream not reached): alluxio:/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000
I checked this file's information in Alluxio (output below). Alluxio reports the file as more than 100% cached: it is made up of two full 256 MB blocks (512 MB in total), while the file in HDFS is only 258.1 MB. Note that the HDFS data was itself written through Alluxio; I use alluxio.user.file.metadata.sync.interval=216000000 and alluxio.user.file.writetype.default=CACHE_THROUGH. When I switched the table's metadata back to HDFS, the job worked, meaning the HDFS data is fine and the problem is in Alluxio's copy of it.
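As a sanity check, the figures in the stat output can be verified with a bit of arithmetic. This sketch just plugs in the numbers reported by FileInfo (file length, block size, and the two block lengths) and shows that the second block should be about 2 MB rather than a full 256 MB, which is exactly why stat reports inAlluxioPercentage=198:

```python
# Numbers copied from the `alluxio fs stat` output in this report.
FILE_LENGTH = 270_589_145                   # FileInfo length (~258.1 MB, matches HDFS)
BLOCK_SIZE = 268_435_456                    # blockSizeBytes (256 MB)
BLOCK_LENGTHS = [268_435_456, 268_435_456]  # lengths of the two BlockInfo entries

# What the last block's length should be for this file length and block size:
expected_last = FILE_LENGTH - (len(BLOCK_LENGTHS) - 1) * BLOCK_SIZE
print(expected_last)            # 2153689 bytes (~2.05 MB), not 268435456

# Total bytes Alluxio believes it holds vs. the real file length:
cached = sum(BLOCK_LENGTHS)
print(cached, FILE_LENGTH)      # 536870912 vs 270589145

# This ratio is the inAlluxioPercentage shown by stat:
print(cached * 100 // FILE_LENGTH)  # 198
```

So the metadata itself is internally inconsistent: the file length is correct (it matches HDFS), but the second block's recorded length is a full block rather than the ~2 MB remainder.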
Here is how the file looks in Alluxio:
bin/alluxio fs stat /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000
/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 is a file path.
URIStatus{info=FileInfo{fileId=164846448410623, name=part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, ufsPath=hdfs://intsig-bigdata-nameservice/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, length=270589145, blockSizeBytes=268435456, creationTimeMs=1713296722044, completed=true, folder=false, pinned=false, pinnedlocation=[], cacheable=true, persisted=true, blockIds=[164846431633408, 164846431633409], inMemoryPercentage=198, lastModificationTimesMs=1713296987573, ttl=-1, lastAccessTimesMs=1713296987573, ttlAction=FREE, owner=core_adm, group=hive, mode=440, persistenceState=PERSISTED, mountPoint=false, replicationMax=-1, replicationMin=0, fileBlockInfos=[FileBlockInfo{blockInfo=BlockInfo{id=164846431633408, length=268435456, locations=[BlockLocation{workerId=4789052418356342161, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-106.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-106.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=7686509082663303020, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-136.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-136.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=8389319337060975339, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-139.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-139.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, 
BlockLocation{workerId=252882622198499095, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-112.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-112.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=318529424824982519, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-115.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-115.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=751232389065403260, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-99.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-99.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=884009842569883069, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-130.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-130.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1322556044732826109, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-108.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-108.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1659740148412970156, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-103.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-103.intsig.internal, 
rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1750661044104421740, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-129.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-129.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2212130573833630043, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-96.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-96.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2323607761968839547, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-117.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-117.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2537207964641668843, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-105.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-105.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2915095790252411072, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-97.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-97.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3097974485333619764, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-101.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, 
tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-101.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3571195479121079143, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-95.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-95.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=5001574657684482277, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-94.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-94.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=6859183490568364892, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-107.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-107.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7009847752086549463, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-111.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-111.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7273818547467537089, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-110.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-110.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7438420668927043862, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-116.intsig.internal, containerHost=, rpcPort=29999, 
dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-116.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7504197730422955713, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-104.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-104.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8039274055332598606, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-113.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-113.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8379003440035830949, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-100.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-100.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8964743355741175974, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-119.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-119.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=9006306898128303596, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-114.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-114.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}]}, offset=0, ufsLocations=[]}, FileBlockInfo{blockInfo=BlockInfo{id=164846431633409, length=268435456, 
locations=[BlockLocation{workerId=4789052418356342161, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-106.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-106.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=8389319337060975339, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-139.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-139.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=252882622198499095, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-112.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-112.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=318529424824982519, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-115.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-115.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=751232389065403260, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-99.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-99.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=884009842569883069, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-130.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, 
tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-130.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1322556044732826109, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-108.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-108.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1659740148412970156, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-103.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-103.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2212130573833630043, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-96.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-96.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2323607761968839547, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-117.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-117.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2537207964641668843, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-105.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-105.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2915095790252411072, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-97.intsig.internal, containerHost=, rpcPort=29999, 
dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-97.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3097974485333619764, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-101.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-101.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3571195479121079143, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-95.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-95.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=5001574657684482277, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-94.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-94.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=6859183490568364892, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-107.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-107.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7009847752086549463, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-111.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-111.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7273818547467537089, 
address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-110.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-110.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7438420668927043862, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-116.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-116.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7504197730422955713, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-104.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-104.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7686509082663303020, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-136.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-136.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8039274055332598606, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-113.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-113.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8379003440035830949, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-100.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-100.intsig.internal, rack=null)}, tierAlias=SSD, 
mediumType=SSD}, BlockLocation{workerId=8964743355741175974, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-119.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-119.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=9006306898128303596, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-114.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-114.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}]}, offset=268435456, ufsLocations=[]}], mountId=1, inAlluxioPercentage=198, ufsFingerprint=TYPE|FILE UFS|hdfs OWNER|prod_tong_liu GROUP|prod_tong_liu MODE|432 CONTENT_HASH|(len:270589145,_modtime:1713296987554) , acl=user::rw-,group::rwx,other::---,group:intsig:r-x,group:prod:rwx,mask::rw-, defaultAcl=}, cacheContext=null}
Containing the following blocks:
BlockInfo{id=164846431633408, length=268435456, locations=[BlockLocation{workerId=4789052418356342161, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-106.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-106.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=7686509082663303020, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-136.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-136.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=8389319337060975339, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-139.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-139.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=252882622198499095, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-112.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-112.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=318529424824982519, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-115.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-115.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=751232389065403260, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-99.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, 
tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-99.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=884009842569883069, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-130.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-130.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1322556044732826109, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-108.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-108.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1659740148412970156, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-103.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-103.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1750661044104421740, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-129.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-129.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2212130573833630043, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-96.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-96.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2323607761968839547, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-117.intsig.internal, containerHost=, rpcPort=29999, 
dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-117.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2537207964641668843, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-105.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-105.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2915095790252411072, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-97.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-97.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3097974485333619764, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-101.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-101.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3571195479121079143, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-95.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-95.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=5001574657684482277, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-94.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-94.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=6859183490568364892, 
address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-107.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-107.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7009847752086549463, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-111.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-111.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7273818547467537089, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-110.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-110.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7438420668927043862, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-116.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-116.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7504197730422955713, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-104.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-104.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8039274055332598606, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-113.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-113.intsig.internal, rack=null)}, tierAlias=SSD, 
mediumType=SSD}, BlockLocation{workerId=8379003440035830949, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-100.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-100.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8964743355741175974, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-119.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-119.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=9006306898128303596, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-114.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-114.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}]}
BlockInfo{id=164846431633409, length=268435456, locations=[BlockLocation{workerId=4789052418356342161, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-106.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-106.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=8389319337060975339, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-139.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-139.intsig.internal, rack=null)}, tierAlias=MEM, mediumType=MEM}, BlockLocation{workerId=252882622198499095, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-112.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-112.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=318529424824982519, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-115.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-115.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=751232389065403260, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-99.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-99.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=884009842569883069, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-130.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, 
tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-130.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1322556044732826109, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-108.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-108.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=1659740148412970156, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-103.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-103.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2212130573833630043, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-96.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-96.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2323607761968839547, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-117.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-117.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2537207964641668843, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-105.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-105.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=2915095790252411072, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-97.intsig.internal, containerHost=, rpcPort=29999, 
dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-97.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3097974485333619764, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-101.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-101.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=3571195479121079143, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-95.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-95.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=5001574657684482277, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-94.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-94.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=6859183490568364892, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-107.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-107.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7009847752086549463, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-111.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-111.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7273818547467537089, 
address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-110.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-110.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7438420668927043862, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-116.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-116.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7504197730422955713, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-104.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-104.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=7686509082663303020, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-136.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-136.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8039274055332598606, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-113.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-113.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=8379003440035830949, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-100.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-100.intsig.internal, rack=null)}, tierAlias=SSD, 
mediumType=SSD}, BlockLocation{workerId=8964743355741175974, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-119.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-119.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}, BlockLocation{workerId=9006306898128303596, address=WorkerNetAddress{host=centos-bigdata-datanode-10-24-2-114.intsig.internal, containerHost=, rpcPort=29999, dataPort=29999, webPort=30010, domainSocketPath=, tieredIdentity=TieredIdentity(node=centos-bigdata-datanode-10-24-2-114.intsig.internal, rack=null)}, tierAlias=SSD, mediumType=SSD}]}
In addition, neither checksum nor copyToLocal attempts were successful for this file, and I didn't see any errors from the master or worker when the Spark and Trino tasks failed
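Incidentally, the inMemoryPercentage=198 in the stat output looks like it is simply cached bytes divided by file length, counting both full-size blocks (a quick sanity check; the formula is my assumption):

```shell
# Two cached blocks of 268435456 bytes each, against the reported
# file length of 270589145 bytes:
CACHED=$(( 268435456 + 268435456 ))
FILE_LEN=270589145
echo $(( CACHED * 100 / FILE_LEN ))
```

The integer result is 198, matching inMemoryPercentage=198, which suggests the second block is being counted at the full 256 MiB instead of its real tail size.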
Which Alluxio version do you use?
You need to refresh the metadata: set alluxio.user.file.metadata.sync.interval=0 to sync metadata.
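For example, something like this should trigger a sync (a sketch, assuming the Alluxio 2.x shell; the path is taken from this issue):

```shell
# One-off sync: override the client property for a single command
bin/alluxio fs ls -R -Dalluxio.user.file.metadata.sync.interval=0 \
  /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416

# Or explicitly re-load the metadata for the path
bin/alluxio fs loadMetadata -R \
  /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416
```

Both commands need a running cluster; check `bin/alluxio fs help` for the exact options on your version.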
Thank you for your reply! @YichuanSun I'm using 2.9.3. @jasondrogba I checked the metadata with checkConsistency, but it reported no difference from the UFS, and a manual loadMetadata did not fix the data:
/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 is consistent with the under storage system.
I'm wondering what's causing this. I never bypass Alluxio to write the UFS directly, so there should be no metadata inconsistency.
In addition, the file was written by Spark 2.4.8 through Hive (the Hive table's metadata points to Alluxio). The Alluxio logs before and after the file was generated are as follows:
grep part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 /opt/alluxio/alluxio-2.9.3/logs/*log*|grep "2024-04-17 03:"
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.81:2024-04-17 03:58:42,943 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.200:52778 cmd=getFileInfo src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=302 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.82:2024-04-17 03:58:09,365 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=false allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.113:56684 cmd=getFileInfo src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=null executionTimeUs=1366 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.82:2024-04-17 03:58:09,373 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.113:56684 cmd=rename src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/task_20240417030152_0021_m_000957/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 perm=core_adm:hive:rw-rwx--- executionTimeUs=6462 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.82:2024-04-17 03:58:36,183 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=false allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.113:56684 cmd=delete src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=null executionTimeUs=1067 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.82:2024-04-17 03:58:36,189 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.113:56684 cmd=rename src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 perm=core_adm:hive:rw-rwx--- executionTimeUs=5159 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.82:2024-04-17 03:58:36,190 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.113:56684 cmd=getFileInfo src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=293 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.85:2024-04-17 03:49:45,387 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=getNewBlockIdForFile src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=38 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.85:2024-04-17 03:49:47,574 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=completeFile src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=1265 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.85:2024-04-17 03:49:47,581 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=getFileInfo src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=4358 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.86:2024-04-17 03:45:22,046 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=createFile src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=22677 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.86:2024-04-17 03:45:22,057 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=getNewBlockIdForFile src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=48 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master_audit.log.86:2024-04-17 03:45:52,078 INFO [AsyncUserAccessAuditLogger](AsyncUserAccessAuditLogWriter.java:126) - succeeded=true allowed=true ugi=prod_tong_liu,prod_tong_liu (AUTH=SIMPLE) ip=/10.24.2.135:58148 cmd=getNewBlockIdForFile src=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000 dst=null perm=core_adm:hive:rw-rwx--- executionTimeUs=35 proto=rpc
/opt/alluxio/alluxio-2.9.3/logs/master.log:2024-04-17 03:58:36,183 WARN [master-rpc-executor-TPE-thread-319](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=ALL, commonOptions=syncIntervalMs: 216000000
/opt/alluxio/alluxio-2.9.3/logs/master.log:2024-04-17 03:58:36,183 WARN [master-rpc-executor-TPE-thread-319](RpcUtils.java:197) - Exit (Error): Remove: request=path: "/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000"
/opt/alluxio/alluxio-2.9.3/logs/master.log:2024-04-17 03:58:36,185 WARN [master-rpc-executor-TPE-thread-216](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=ONE, commonOptions=syncIntervalMs: 216000000
/opt/alluxio/alluxio-2.9.3/logs/master.log.1:2024-04-17 03:45:22,024 WARN [master-rpc-executor-TPE-thread-131](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=ONE, commonOptions=syncIntervalMs: 216000000
/opt/alluxio/alluxio-2.9.3/logs/master.log.1:2024-04-17 03:58:09,365 WARN [master-rpc-executor-TPE-thread-180](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=NONE, commonOptions=syncIntervalMs: 216000000
/opt/alluxio/alluxio-2.9.3/logs/master.log.1:2024-04-17 03:58:09,367 WARN [master-rpc-executor-TPE-thread-407](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=ONE, commonOptions=syncIntervalMs: 216000000
Have you tried refreshing Hive's metadata? I think the problem is caused by a difference between the metadata in Hive and the metadata in Alluxio.
In addition, according to the final WARN, it may be that Alluxio does not have permission to access /user/hive/warehouse, so it failed to sync metadata.
The blog post "Metadata Synchronization in Alluxio: Design, Implementation and Optimization" has more information about metadata sync.
Could you share more information from master.log around this WARN?
WARN [master-rpc-executor-TPE-thread-407](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/........
You can check the code at https://github.com/Alluxio/alluxio/blame/26919b8894d251b803c82513cb1eeee562bace0a/core/server/master/src/main/java/alluxio/master/file/InodeSyncStream.java#L503; this warning means the file exists neither on the UFS nor in Alluxio.
@jasondrogba Of course, I'll share more logs. The following two entries show up repeatedly in master.log. Note that every file in /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416 gets the same two warnings:
2024-04-17 03:58:41,516 WARN [master-rpc-executor-TPE-thread-273](RpcUtils.java:197) - Exit (Error): Remove: request=path: "/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-01388-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000"
options {
recursive: true
alluxioOnly: false
unchecked: false
commonOptions {
syncIntervalMs: 216000000
ttl: -1
ttlAction: FREE
operationId {
mostSignificantBits: -7539285028618024718
leastSignificantBits: -6183483581502312092
}
}
}
, Error=alluxio.exception.FileDoesNotExistException: Path "/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-01388-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000" does not exist.
2024-04-17 03:58:41,517 WARN [master-rpc-executor-TPE-thread-53](InodeSyncStream.java:503) - Failed to sync metadata on root path InodeSyncStream{rootPath=LockingScheme{path=/user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/batch=20240416/part-01388-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000, desiredLockPattern=READ, shouldSync={Should sync: true, Last sync time: 1712631045797}}, descendantType=ONE, commonOptions=syncIntervalMs: 216000000
ttl: -1
ttlAction: FREE
operationId {
mostSignificantBits: 3717584004956243185
leastSignificantBits: -6223852750410610288
}
, forceSync=false} because it does not exist on the UFS or in Alluxio
One thing that bothered me was that the logs said the sync failed because the files didn't exist, but when I checked manually, the files were always there. Additionally, the Alluxio process runs as the HDFS superuser with full access to /user/hive/warehouse.
After that, I ran the following commands to make sure I didn't miss any critical logs:
grep 'adm_cs_device_tag_df' /opt/alluxio/alluxio-2.9.3/logs/master.log.1 | wc -l
grep 'adm_cs_device_tag_df' /opt/alluxio/alluxio-2.9.3/logs/master.log.1 | grep 'WARN' | wc -l
grep 'adm_cs_device_tag_df' /opt/alluxio/alluxio-2.9.3/logs/master.log.1 | grep 'Error' | wc -l
grep 'adm_cs_device_tag_df' /opt/alluxio/alluxio-2.9.3/logs/master.log.1 | grep 'WARN' | grep 'Failed to sync metadata' | wc -l
grep 'adm_cs_device_tag_df' /opt/alluxio/alluxio-2.9.3/logs/master.log.1 | grep 'WARN' | grep -v 'Failed to sync metadata' | head -100
Could this be because I set ACLs manually? Although I didn't see any permission errors, the owner and group of all tables written through Alluxio differ from those written without it (note that the ACL I set manually in Alluxio matches the ACL in HDFS).
Hi @jasondrogba, a reminder about this issue: the problem still exists. If possible, please help confirm whether this is a bug and how to solve it. If you need any logs or information, please feel free to ask.
Oh! @ziyangRen, can you share the worker log? And why do you have so many block replicas?
If convenient, could you also please share the alluxio-site.properties file?
Hi @jasondrogba, I tried my best to gather the worker logs and didn't find any errors, but I've summarized a few recurring events from that time that might help:
The first is a large number of block transfers, and both block IDs with the wrong data went through this process at the time:
2024-04-17 03:33:49,402 WARN [block-management-task-47](TieredBlockStore.java:569) - Target tier: BlockStoreLocation{TierAlias=SSD, DirIndex=0, MediumType=SSD} has no available space to store 134255858 bytes for session: -5163578156016566513
2024-04-17 03:33:49,402 WARN [block-management-task-47](BlockTransferExecutor.java:146) - Transfer-order: BlockTransferInfo{TransferType=SWAP, SrcBlockId=164604722282496, DstBlockId=164753401970688, SrcLocation=BlockStoreLocation{TierAlias=MEM, DirIndex=0, MediumType=MEM}, DstLocation=BlockStoreLocation{TierAlias=SSD, DirIndex=0, MediumType=SSD}} failed. alluxio.exception.runtime.ResourceExhaustedRuntimeException: Failed to find space in BlockStoreLocation{TierAlias=SSD, DirIndex=0, MediumType=SSD} to move blockId 164604722282496
2024-04-17 03:33:49,402 WARN [block-management-task-47](AlignTask.java:100) - Insufficient space for worker swap space, swap restore task called.
The following logs appear to be normal for reading and writing to HDFS, and I have listed them below:
2024-04-17 04:07:31,271 WARN [worker-rpc-executor-TPE-thread-55447](LogUtils.java:135) - Exception occurred while processing read request onError sessionId: null, null: io.grpc.StatusRuntimeException: CANCELLED: client cancelled
2024-04-17 03:47:43,513 INFO [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](SaslDataTransferClient.java:239) - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2024-04-17 03:49:47,027 INFO [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_000957_34607/batch=20240416/part-00957-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](SaslDataTransferClient.java:239) - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2024-04-17 03:55:23,777 INFO [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](DataStreamer.java:1790) - Exception in createBlockOutputStream blk_6388178153_5380773515
java.net.SocketTimeoutException: 75000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.24.2.119:51092 remote=/10.24.2.119:1004]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:548)
at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1762)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2024-04-17 03:55:23,778 WARN [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](DataStreamer.java:1683) - Abandoning BP-1902924606-10.2.5.100-1516956632926:blk_6388178153_5380773515
2024-04-17 03:55:23,788 WARN [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](DataStreamer.java:1688) - Excluding datanode DatanodeInfoWithStorage[10.24.2.119:1004,DS-68690278-d955-4978-9cdf-4c9ec80a3d6e,DISK]
2024-04-17 03:55:23,778 WARN [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](DataStreamer.java:1683) - Abandoning BP-1902924606-10.2.5.100-1516956632926:blk_6388178153_5380773515
2024-04-17 03:55:23,788 WARN [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](DataStreamer.java:1688) - Excluding datanode DatanodeInfoWithStorage[10.24.2.119:1004,DS-68690278-d955-4978-9cdf-4c9ec80a3d6e,DISK]
2024-04-17 03:55:23,796 INFO [DataStreamer for file /user/hive/warehouse/edw_user.db/adm_cs_device_tag_df/.hive-staging_hive_2024-04-17_03-01-46_626_2294969227079732485-1/-ext-10000/_temporary/0/_temporary/attempt_20240417030152_0021_m_001238_34888/batch=20240416/part-01238-8a201bff-21ad-4a8a-9cff-cdec51ed1657.c000](SaslDataTransferClient.java:239) - SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
As for the excessive replicas you mentioned, I am also confused. Although there are many concurrent client reads and writes in the actual workload, the following Alluxio configuration should prevent a large number of replicas:
alluxio.master.ufs.block.location.cache.capacity=0
# User properties
alluxio.user.file.metadata.sync.interval=216000000
alluxio.user.file.writetype.default=CACHE_THROUGH
alluxio.user.ufs.block.read.location.policy=alluxio.client.block.policy.DeterministicHashPolicy
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards=3
alluxio.user.block.write.location.policy.class=alluxio.client.block.policy.MostAvailableFirstPolicy
alluxio.user.file.replication.max=3
If you need anything else, please let me know at any time and I will provide the relevant information as soon as possible.
I see you're using different medium types, MEM and SSD, and based on your URIStatus, there are many blocks. I suspect this might be due to your configuration of multiple-tier storage.
Based on the information you've shown in the worker log, I noticed an error: ResourceExhaustedRuntimeException: Failed to find space in SSD.
My guess is that the HDFS data is 258.1MB, which can be divided into two blocks. The first block, 256MB, is stored in MEM, while the second block is stored in SSD. However, due to insufficient space in SSD, it gets cleared, leading to the Trino error: "Incorrect file size (270589145) for file (end of stream not reached)."
@jasondrogba Thanks for your quick reply. But if that is the case, I have three questions:
- I can understand a cache miss due to a lack of SSD space, but when Trino queries, Alluxio should be able to fetch the evicted block from the UFS. Instead, the metadata is inconsistent and Alluxio is not aware of the missing data. What prevents Alluxio from detecting the metadata inconsistency?
- Why was a second full 256 MB block generated?
- Many other blocks went through the same processing during that period without data problems; the reason you proposed does not seem to explain what is special about this block.
- According to the master log you shared, Alluxio tried to sync the metadata but failed:
Error=alluxio.exception.FileDoesNotExistException: Path
Failed to sync metadata on root path .... because it does not exist on the UFS or in Alluxio
- The reason is that the block size is 256 MB, so for a 258 MB file the excess beyond 256 MB is placed in a second block.
- I think this also illustrates that it's a special case. It's highly likely an issue with this HDFS file.
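The split can be sanity-checked with the byte counts from the stat output in this thread; the remainder after one full block is what the second block should hold:

```shell
# For a 270589145-byte file with a 268435456-byte (256 MiB) block size,
# the second block should only hold the remainder:
FILE_LEN=270589145
BLOCK_SIZE=268435456
echo $(( FILE_LEN % BLOCK_SIZE ))   # 2153689 bytes, about 2.05 MiB
```

Alluxio instead reports the second block's length as a full 268435456 bytes, which is where the extra ~254 MiB comes from.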
@jasondrogba Thanks again for your patience, but I still have some questions about the previous question:
- I synced the metadata manually and it didn't help, because Alluxio doesn't recognize the inconsistency. Even after running checkConsistency, Alluxio considers the metadata consistent, which is not expected. What causes this?
- I know the block size is 256 MB. My question is: if the data beyond 256 MB was lost (the evicted block was only 2.1 MB), why is there another full 256 MB of data, making the file in Alluxio larger than the 258.1 MB in HDFS? If the extra block was evicted due to insufficient SSD capacity, I would expect the file to be left with a single 256 MB block, with the missing tail read back from the UFS.
- I'd like to know whether I can avoid this problem by using a single-level cache and reducing the metadata refresh interval:
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=SSD
alluxio.worker.tieredstore.level0.dirs.path=/data1/alluxio-ssd-cache,/data2/alluxio-ssd-cache,/data3/alluxio-ssd-cache
alluxio.worker.tieredstore.level0.dirs.quota=800g,800g,800g
alluxio.user.file.metadata.sync.interval=36000
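After restarting, I would verify that the workers picked up the single-tier settings (a sketch; I'm assuming the 2.x getConf command reads the resolved configuration):

```shell
# Check the effective values after restart (Alluxio 2.x):
bin/alluxio getConf alluxio.worker.tieredstore.levels
bin/alluxio getConf alluxio.worker.tieredstore.level0.alias
bin/alluxio getConf alluxio.user.file.metadata.sync.interval
```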
@yuzhu You have more wisdom and experience. Do you have any idea about this issue?
@ziyangRen We recommend using a single tiered store; you can try it.
@jasondrogba Thank you for your suggestions and patient answers. I will switch to a single tiered store. If the same problem occurs after the change, I will update this issue.