Problem
While creating a Spark SQL session, I received the following error message:
Exception: Java gateway process exited before sending its port number
Environment
- OS: RHEL 7
- Jupyter notebook
Steps to reproduce
- Configure Jupyter Notebook
- Start Jupyter
- Open the Jupyter web page
- Run a program like the one below in a notebook:
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession
- Click "Run"
Solution
Add the following to .bashrc and restart Jupyter Notebook:
# For Jupyter Notebook:
export PYSPARK_SUBMIT_ARGS="--master yarn-client pyspark-shell"
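If editing .bashrc is not convenient, the same variable can be set from inside the notebook before any Spark object is created. The snippet below is a minimal sketch, not part of the original fix: `ensure_submit_args` is a hypothetical helper name, and the default value simply reuses the setting shown above. It needs to run before the first SparkContext or SparkSession is created, because PySpark reads the variable when it launches the gateway process.

```python
import os

# Value reused from the .bashrc line above; adjust the master for your cluster.
DEFAULT_SUBMIT_ARGS = "--master yarn-client pyspark-shell"


def ensure_submit_args(default=DEFAULT_SUBMIT_ARGS):
    """Set PYSPARK_SUBMIT_ARGS if it is missing or empty, then return its value.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    if not os.environ.get("PYSPARK_SUBMIT_ARGS"):
        os.environ["PYSPARK_SUBMIT_ARGS"] = default
    return os.environ["PYSPARK_SUBMIT_ARGS"]


if __name__ == "__main__":
    # Run this cell first, then create the SparkSession in a later cell.
    print(ensure_submit_args())
```

An existing value is left untouched, so a setting exported from the shell still takes precedence over the default used here.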
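Root Cause Analysis
The code below, from launch_gateway in pyspark/java_gateway.py, raises this exception when the connection file is not created in the temporary directory. In my case, the environment variable "PYSPARK_SUBMIT_ARGS" was not set, which prevented the gateway process ("proc") from starting correctly.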
submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "pyspark-shell")
...
command = command + shlex.split(submit_args)
...
proc = Popen(command, **popen_kwargs)
...
# Wait for the file to appear, or for the process to exit, whichever happens first.
while not proc.poll() and not os.path.isfile(conn_info_file):
    time.sleep(0.1)
if not os.path.isfile(conn_info_file):
    raise Exception("Java gateway process exited before sending its port number")
...
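The wait loop above can be reproduced without Spark to see when the exception is raised. The following is a simplified sketch under stated assumptions, not PySpark's actual implementation: `wait_for_gateway` and `demo` are hypothetical names, a short Python child process stands in for the Java gateway, and the loop tests `proc.poll() is None` rather than `not proc.poll()`, because the latter is also true for a process that has already exited with status 0.

```python
import os
import subprocess
import sys
import tempfile
import time


def wait_for_gateway(command, conn_info_file, poll_interval=0.1):
    """Start `command`, then wait until it writes conn_info_file or exits.

    Hypothetical stand-in for the excerpt above; returns the Popen object.
    """
    proc = subprocess.Popen(command)
    # Wait for the file to appear, or for the process to exit.
    while proc.poll() is None and not os.path.isfile(conn_info_file):
        time.sleep(poll_interval)
    if not os.path.isfile(conn_info_file):
        raise Exception("Java gateway process exited before sending its port number")
    return proc


def demo():
    """Run one failing and one working stand-in process."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        conn_info_file = os.path.join(tmp_dir, "conn_info")

        # Stand-in that exits immediately without writing the file.
        failing = [sys.executable, "-c", "pass"]
        try:
            wait_for_gateway(failing, conn_info_file)
            outcome = "started"
        except Exception as exc:
            outcome = str(exc)

        # Stand-in that creates the connection file before exiting.
        working = [
            sys.executable,
            "-c",
            "import sys; open(sys.argv[1], 'w').close()",
            conn_info_file,
        ]
        proc = wait_for_gateway(working, conn_info_file)
        proc.wait()
        return outcome, os.path.isfile(conn_info_file)


if __name__ == "__main__":
    print(demo())
```

The first stand-in exits without creating the file, so the same exception text is produced; the second creates the file and the wait returns normally. Any condition that makes the launched command exit early leads to the first case.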