
UPGRADE FAILED function structToMap on 7.0

alinealvarez0107 opened this issue 2 years ago

When trying to upgrade to version 7.0, I got this failure at step 3 after running the command val upgradeReport = Upgrade.upgradeTo0700(prodWorkspace, startStep = 1).

The statusMsg reads:

UPGRADE FAILED function structToMap, columnToConvert must be of type struct but found map instead
java.lang.Exception: function structToMap, columnToConvert must be of type struct but found map instead
  at com.databricks.labs.overwatch.utils.SchemaTools$.structToMap(SchemaTools.scala:253)
  at com.databricks.labs.overwatch.utils.Upgrade$.upgradeTo0700(Upgrade.scala:1251)
  ...
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
  at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
  at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
  at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
  at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
  at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
  at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
  at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:221)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:225)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:1069)
  at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:1022)
  at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:225)
  at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$21(DriverLocal.scala:689)
  at com.databricks.unity.UCSDriver$Manager$Handle.runWith(UCSDriver.scala:104)
  at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$19(DriverLocal.scala:689)
  at com.databricks.logging.Log4jUsageLoggingShim$.$anonfun$withAttributionContext$1(Log4jUsageLoggingShim.scala:32)
  at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
  at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:94)
  at com.databricks.logging.Log4jUsageLoggingShim$.withAttributionContext(Log4jUsageLoggingShim.scala:30)
  at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:283)
  at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:282)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:60)
  at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:318)
  at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:303)
  at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:60)
  at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:666)
  at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:622)
  at scala.util.Try$.apply(Try.scala:213)
  at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:614)
  at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:533)
  at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:568)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:438)
  at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:381)
  at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:232)
  at java.lang.Thread.run(Thread.java:748)
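For context, the exception originates in SchemaTools.structToMap, which rejects any column that is not a struct; failing on a map-typed column suggests the conversion had already been applied. A hypothetical guard (names and signature assumed here for illustration, not Overwatch's actual code) that would make such a conversion step idempotent could look like:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{MapType, StructType}

// Hypothetical idempotent wrapper: convert only when the column is still a
// struct; if a previous (partial) upgrade already produced a MapType, skip
// the step instead of throwing. The `structToMap` parameter stands in for
// the real SchemaTools.structToMap, whose exact signature is assumed.
def structToMapIfNeeded(df: DataFrame, columnToConvert: String)
                       (structToMap: (DataFrame, String) => DataFrame): DataFrame = {
  df.schema(columnToConvert).dataType match {
    case _: StructType => structToMap(df, columnToConvert) // still needs conversion
    case _: MapType    => df                               // already upgraded, nothing to do
    case other => throw new Exception(
      s"function structToMap, $columnToConvert must be of type struct but found $other instead")
  }
}
```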

What should I do to fix this error and proceed with the upgrade?

[screenshot: falha_upgrade7 0]

alinealvarez0107 avatar Oct 21 '22 12:10 alinealvarez0107

What version are you upgrading from? Is this the first time you're running the upgrade, or did you run it previously? It looks like this table has already been upgraded; please advise.

GeekSheikh avatar Oct 21 '22 18:10 GeekSheikh

I was using version 6.0 and had to upgrade to version 6.1 yesterday to be able to run 7.0.

alinealvarez0107 avatar Oct 21 '22 18:10 alinealvarez0107

Hi @alinealvarez0107,

Which cloud are you using (AWS/Azure)? Was the upgrade to version 6.1 successful?

Regards, Anusha

AnushaSure avatar Oct 25 '22 15:10 AnushaSure

Hi, @AnushaSure

Yes, it was! We're using Azure.

Could you help me please?

I need to finish this upgrade :/

alinealvarez0107 avatar Oct 25 '22 17:10 alinealvarez0107

Hi @alinealvarez0107,

While moving from 0610 to 0700, which cluster library did you use? Was it the 0700 or 07001 JAR? Please let me know your time zone; we are in IST. If possible, we can have a call to analyze the issue better.

Regards, Neha

Neha-vs123 avatar Oct 26 '22 10:10 Neha-vs123

I used the 0700 JAR. I'm located in Brazil.

alinealvarez0107 avatar Oct 26 '22 17:10 alinealvarez0107

@alinealvarez0107 , Please let me know when we can meet today. It would be helpful if we could meet sometime between 9 and 11 am BRT (your time). Please send your mail id to [email protected] so that I can schedule a meeting.

Neha-vs123 avatar Oct 27 '22 06:10 Neha-vs123

Hello @Neha-vs123 ,

I just sent the meeting invitation to you from 10:30 am to 11:00 am Brazil time.

Thank you.

alinealvarez0107 avatar Oct 27 '22 12:10 alinealvarez0107

Hi @alinealvarez0107 ,

Thanks for sharing the notebook. As we discussed in the call, I'll find a solution and get back to you on Monday.

Regards, Neha

Neha-vs123 avatar Oct 27 '22 14:10 Neha-vs123

Could you please try resuming the upgrade from step 4? We're investigating why some customers already have the correct type before the upgrade, but since the target type is already a map, there's no need to complete this step.

val upgradeReport = Upgrade.upgradeTo0700(prodWorkspace, startStep = 4)

GeekSheikh avatar Oct 27 '22 18:10 GeekSheikh

@GeekSheikh , we finished the upgrade by resuming from step 4; steps 4 through 7 completed successfully. In the final pipeline report, there are failures in Bronze_Jobs_Snapshot (the step 3 error) and Bronze_SparkEventLogs.

@alinealvarez0107 , the cluster library 7.0.0.2 JAR has been released to Maven. Please change the JAR to 7002 and run the jobs, then send the pipeline report after the job run finishes with the 7002 JAR. Use the command: select * from overwatch_etl.pipeline_report
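A minimal notebook sketch of that check, narrowed down to the failed module runs (the database name follows this thread's example, and the status column name is assumed from the standard Overwatch pipeline_report layout; adjust if your ETL database differs):

```scala
// Pull the Overwatch pipeline report and keep only non-successful module runs.
// "overwatch_etl" is the ETL database used earlier in this thread.
val failures = spark.sql("select * from overwatch_etl.pipeline_report")
  .filter("status != 'SUCCESS'")
display(failures)
```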

Neha-vs123 avatar Nov 02 '22 10:11 Neha-vs123

Hi @alinealvarez0107 , Are you still facing issues after using the 7.0.0.2 JAR?

Neha-vs123 avatar Nov 07 '22 06:11 Neha-vs123

Hi @Neha-vs123!

I still have the same problem with the data types of some tables.

[screenshot: overwatch_jar_7 0 0 2]

alinealvarez0107 avatar Nov 07 '22 16:11 alinealvarez0107

@alinealvarez0107 , Are any modules failing other than Bronze_SparkEventLogs? Please send the pipeline_report from the ETL database.

Please check if the event hub streaming is working fine. You can also check it by running the readiness notebook in this workspace.

Neha-vs123 avatar Nov 08 '22 10:11 Neha-vs123

Hi @Neha-vs123 I sent you the pipeline_report by email. The event hub streaming is working fine.

alinealvarez0107 avatar Nov 08 '22 13:11 alinealvarez0107

This is because your upgrade to 0610 did not complete successfully; 0610 upgrade step 3 was meant to convert this field to a map. Please also ensure that you are using DBR 10.4 LTS on your Overwatch cluster. Once the OW cluster is on 10.4 LTS, run the commands below in a notebook. These are the commands that did not complete as part of the 0610 upgrade (I'm guessing you might have forgotten to change your cluster to 10.4 LTS).

import org.apache.log4j.{Level, Logger}

// Dedicated logger so the retry shows up clearly in the driver logs
val logger: Logger = Logger.getLogger("UPGRADE_RETRY")
val etlDatabaseName = "overwatch_etl" // CHANGE ME IF NECESSARY
val targetName = "spark_events_bronze"

// Write-tuning settings for rewriting the large spark_events_bronze table
spark.conf.set("spark.databricks.delta.optimizeWrite.numShuffleBlocks", "500000")
spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "2048")
spark.conf.set("spark.sql.files.maxPartitionBytes", (1024 * 1024 * 64).toString)
spark.conf.set("spark.databricks.delta.properties.defaults.autoOptimize.optimizeWrite", "true")

// Capture the current schema; these two columns are the ones the 0610
// upgrade should have rebuilt as maps
val sparkEventsBronzeDF = spark.table(s"${etlDatabaseName}.${targetName}")
val sparkEventsSchema = sparkEventsBronzeDF.schema
val fieldsRequiringRebuild = Array("modifiedConfigs", "extraTags")

// Upgrade the Delta table protocol and enable column mapping so columns
// can be renamed in place
def upgradeDeltaTable(qualifiedName: String): Unit = {
  try {
    val tblPropertiesUpgradeStmt =
      s"""ALTER TABLE $qualifiedName SET TBLPROPERTIES (
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5',
    'delta.columnMapping.mode' = 'name'
  )
  """
    logger.info(s"UPGRADE STATEMENT for $qualifiedName: $tblPropertiesUpgradeStmt")
    spark.sql(tblPropertiesUpgradeStmt)
  } catch {
    case e: Throwable =>
      logger.error(s"FAILED $qualifiedName ->", e)
      println(s"FAILED UPGRADE FOR $qualifiedName")
  }
}

upgradeDeltaTable(s"${etlDatabaseName}.${targetName}")

// Rename the stale struct columns out of the way; Overwatch will recreate
// them with the correct (map) type on the next run
val fieldsToRename = sparkEventsSchema.fieldNames.filter(f => fieldsRequiringRebuild.contains(f))
fieldsToRename.foreach(f => {
  val modifyColStmt = s"alter table ${etlDatabaseName}.${targetName} rename " +
    s"column $f to ${f}_tobedeleted"
  logger.info(s"Beginning $targetName upgrade\nSTMT1: $modifyColStmt")
  spark.sql(modifyColStmt)
})

GeekSheikh avatar Nov 08 '22 21:11 GeekSheikh

Thanks, @GeekSheikh for the detailed explanation.

@alinealvarez0107 , please follow all the steps given above and re-run the job with the same cluster. Make sure the cluster DBR is 10.4 LTS. Let me know if you have any doubts.

Neha-vs123 avatar Nov 09 '22 10:11 Neha-vs123

Hello @GeekSheikh and @Neha-vs123 ,

I was using DBR 11.1. I switched to 10.4 LTS and copied the commands, but the import failed to execute.

I changed the import command and I'm trying to run it.

alinealvarez0107 avatar Nov 09 '22 12:11 alinealvarez0107

Ok, sorry, I wasn't able to test this internally; apologies if there's a syntax bug there. Please let us know the status and we'll get on a screen share to get this sorted if necessary.

GeekSheikh avatar Nov 09 '22 15:11 GeekSheikh

Hi @GeekSheikh, don't worry! ;)

After I corrected the import and executed the command you sent, the Overwatch notebook completed successfully and the failing table was updated correctly.

I'm applying the fix to the other workspaces and will let you know if any issues arise.

[screenshot: result 11092022]

alinealvarez0107 avatar Nov 09 '22 15:11 alinealvarez0107

Hi @alinealvarez0107 , Great! Thanks for the update. Closing this ticket for now. Feel free to reopen this when you need any assistance.

Regards, Neha

Neha-vs123 avatar Nov 14 '22 10:11 Neha-vs123