
source column in the input leads to an error

Open sonalgoyal opened this issue 2 years ago • 3 comments

A user reported an error when the input had a column named source. Renaming it to source_in fixed the problem.

The error was an AnalysisException: 'z_source' is ambiguous.

need to investigate

sonalgoyal avatar Jul 24 '22 03:07 sonalgoyal

@Akash-R-7 can you please try and see what is happening here?

sonalgoyal avatar Aug 08 '22 16:08 sonalgoyal

@sonalgoyal It gives a "Reference 'z_source' is ambiguous" error in the match phase while joining the source column and z_source.

Akash-R-7 avatar Aug 11 '22 08:08 Akash-R-7

The source column of the input is clashing with the z_source column we create. Need to rename.
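
A minimal sketch of how such a clash can arise, assuming (the thread does not confirm this) that every input column is prefixed with `z_` for one side of a pair join and that an internal `z_source` column is added alongside. The prefixing scheme and the function names are hypothetical; this is plain Python, not Zingg or Spark code.

```python
from collections import Counter

PREFIX = "z_"                    # assumed prefix, for illustration only
INTERNAL_COLUMNS = ["z_source"]  # internal column named in this issue


def prefixed_schema(input_columns):
    """Return column names after prefixing the inputs and appending internal columns."""
    return [PREFIX + name for name in input_columns] + INTERNAL_COLUMNS


def ambiguous_columns(columns):
    """Return the names that occur more than once, sorted."""
    counts = Counter(columns)
    return sorted(name for name, n in counts.items() if n > 1)


print(ambiguous_columns(prefixed_schema(["id", "name", "source"])))     # ['z_source']
print(ambiguous_columns(prefixed_schema(["id", "name", "source_in"])))  # []
```

With an input column named source, the sketch produces two columns called z_source, which matches the shape of the reported error; with source_in it produces none.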

sonalgoyal avatar Aug 11 '22 09:08 sonalgoyal

If one of the columns is renamed as source:

./scripts/zingg.sh --phase findTrainingData --conf examples/febrl/config.json --zinggDir /tmp/z_temp

this leads to

org.apache.spark.sql.AnalysisException: Reference 'z_source' is ambiguous, could be: z_source, z_source.
    at org.apache.spark.sql.catalyst.expressions.package$AttributeSeq.resolve(package.scala:377)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolve(LogicalPlan.scala:125)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:136)
    at org.apache.spark.sql.Dataset.resolve(Dataset.scala:250)
    at org.apache.spark.sql.Dataset.col(Dataset.scala:1417)
    at zingg.spark.client.SparkFrame.col(SparkFrame.java:110)
    at zingg.spark.client.SparkFrame.col(SparkFrame.java:18)
    at zingg.common.core.util.DSUtil.alignDupes(DSUtil.java:166)
    at zingg.common.core.executor.TrainingDataFinder.writeUncertain(TrainingDataFinder.java:136)
    at zingg.common.core.executor.TrainingDataFinder.execute(TrainingDataFinder.java:122)
    at zingg.common.client.Client.execute(Client.java:243)
    at zingg.common.client.Client.mainMethod(Client.java:182)
    at zingg.spark.client.SparkClient.main(SparkClient.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2023-04-26 14:53:35,418 [main] WARN zingg.common.client.util.Email - Unable to send email Can't send command to SMTP host
2023-04-26 14:53:35,419 [main] WARN zingg.common.client.Client - Apologies for this message. Zingg has encountered an error. Reference 'z_source' is ambiguous, could be: z_source, z_source.
zingg.common.client.ZinggClientException: Reference 'z_source' is ambiguous, could be: z_source, z_source.
    at zingg.common.core.executor.TrainingDataFinder.execute(TrainingDataFinder.java:127)
    at zingg.common.client.Client.execute(Client.java:243)
    at zingg.common.client.Client.mainMethod(Client.java:182)
    at zingg.spark.client.SparkClient.main(SparkClient.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
zingg.common.client.ZinggClientException: Reference 'z_source' is ambiguous, could be: z_source, z_source.
    at zingg.common.core.executor.TrainingDataFinder.execute(TrainingDataFinder.java:127)
    at zingg.common.client.Client.execute(Client.java:243)
    at zingg.common.client.Client.mainMethod(Client.java:182)
    at zingg.spark.client.SparkClient.main(SparkClient.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

vikasgupta78 avatar Apr 26 '23 09:04 vikasgupta78

renamed z_source to z_zsource
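
A quick check of what this rename changes, assuming (this is not confirmed in the thread) that input columns are prefixed with `z_` before the join. The helper `clashes` is hypothetical and only compares names; it is not part of Zingg.

```python
PREFIX = "z_"  # assumed prefix applied to input columns, for illustration only


def clashes(input_columns, internal_name):
    """Return the input columns whose prefixed name equals the internal column name."""
    return [c for c in input_columns if PREFIX + c == internal_name]


cols = ["id", "name", "source"]
print(clashes(cols, "z_source"))   # ['source']  (old internal name)
print(clashes(cols, "z_zsource"))  # []          (new internal name)
```

Under this assumed scheme an input column literally named zsource would still clash, so the rename makes a collision much less likely rather than impossible.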

pull request #574

commits 3bbceb5b , 19d02b4e , ea18f3fa

vikasgupta78 avatar Apr 26 '23 10:04 vikasgupta78