e2e-data-engineering

An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All compone...

Results 3 e2e-data-engineering issues

I am in the last steps of the project, and when I run spark-submit I get a "cassandra module not found" error. I have checked all the jars and the cassandra-driver version...

Exception has occurred: AirflowConfigException Cannot use relative path: `sqlite:///C:\Users\User_Win10x64/airflow/airflow.db` to connect to sqlite. Please use absolute path such as `sqlite:////tmp/airflow.db`. File "D:\Work\data-engineer\dags\kafka-stream.py", line 2, in <module> from airflow import DAG airflow.exceptions.AirflowConfigException:...
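The message above distinguishes the three-slash form of a SQLite URI from the four-slash form it recommends. As a minimal sketch of that distinction (not Airflow's own validation code), the helpers below check for the four-slash prefix and build such a URI from an absolute POSIX path. The function names `is_absolute_sqlite_uri` and `sqlite_uri_for` are hypothetical and exist only for illustration.

```python
def is_absolute_sqlite_uri(uri: str) -> bool:
    """Return True if the URI uses the four-slash form, e.g. sqlite:////tmp/airflow.db.

    Hypothetical helper: it only mirrors the wording of the error message
    and is not taken from Airflow's source.
    """
    return uri.startswith("sqlite:////")


def sqlite_uri_for(posix_path: str) -> str:
    """Build a SQLite URI from an absolute POSIX path (one starting with '/')."""
    if not posix_path.startswith("/"):
        raise ValueError(f"expected an absolute POSIX path, got {posix_path!r}")
    # "sqlite:///" plus a path that begins with "/" yields four slashes in total.
    return "sqlite:///" + posix_path


# The URI from the report has only three slashes before the drive letter.
reported = r"sqlite:///C:\Users\User_Win10x64/airflow/airflow.db"
print(is_absolute_sqlite_uri(reported))
print(sqlite_uri_for("/tmp/airflow.db"))
```

Under this reading, the reported URI fails the check because a Windows drive letter follows the third slash, so one thing to try is running Airflow in an environment with POSIX paths, such as the project's containers.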

Whenever we try to connect to Kafka, we get this error: WARNING:root:kafka dataframe could not be created because: An error occurred while calling o36.load. : java.lang.NoClassDefFoundError: scala/$less$colon$less at org.apache.spark.sql.kafka010.KafkaSourceProvider.org$apache$spark$sql$kafka010$KafkaSourceProvider$$validateStreamOptions(KafkaSourceProvider.scala:338) at...
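A `NoClassDefFoundError` for `scala/$less$colon$less` commonly points to a Scala binary version mismatch: a connector jar built for Scala 2.13 loaded into a Spark build that uses Scala 2.12. Whether that is the cause here is an assumption, since the report is truncated. The sketch below reads the Scala suffix from a Maven coordinate and compares it with the Scala version of the Spark installation. The helper names and the version numbers in the example are illustrative only.

```python
import re
from typing import Optional


def scala_suffix(coordinate: str) -> Optional[str]:
    """Extract the Scala binary version from a coordinate like group:artifact_2.12:version."""
    parts = coordinate.split(":")
    if len(parts) < 2:
        return None
    # Scala artifacts conventionally end in "_<major>.<minor>", e.g. "_2.12".
    match = re.search(r"_(\d+\.\d+)$", parts[1])
    return match.group(1) if match else None


def matches_spark_scala(coordinate: str, spark_scala_version: str) -> bool:
    """Return True if the package suffix agrees with Spark's Scala version (e.g. "2.12.18")."""
    suffix = scala_suffix(coordinate)
    if suffix is None:
        return False
    return spark_scala_version == suffix or spark_scala_version.startswith(suffix + ".")


# Illustrative values, not taken from the issue.
package = "org.apache.spark:spark-sql-kafka-0-10_2.13:3.4.1"
print(scala_suffix(package))
print(matches_spark_scala(package, "2.12.18"))
```

The Scala version of a given Spark installation is printed by `spark-submit --version`, so the suffix of each package passed to Spark can be compared against it.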