python - How to inspect or save a large matrix to a file in Spark




I have created a large BlockMatrix in PySpark called mtm, with dimensions 85k x 85k. I want to inspect the matrix to make sure it was created the way I wanted, but I have tried several different routes and all of them failed with memory issues (exit code 143 or 92).

The options I have tried so far (runnable sketches of the save variants follow the list):

1. Convert the matrix to an RDD of entries and take a look at the first entry:

    mtm_coor = mtm.toCoordinateMatrix()
    mtm_rdd = mtm_coor.entries
    mtm_rdd.take(1)
2. Save the entries RDD to a text file:

    mtm_rdd.saveAsTextFile('./mtm.txt')

3. Convert the entries RDD to a DataFrame:

    mtm_df = mtm_rdd.toDF()
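
For reference, here is roughly what I mean by options 2 and 3 in runnable form. This is a minimal sketch rather than my exact code: mtm is the BlockMatrix from above, the output paths are placeholders, and I am assuming a live SparkSession (as in the pyspark shell):

    # mtm: an existing 85k x 85k pyspark.mllib.linalg.distributed.BlockMatrix
    entries = mtm.toCoordinateMatrix().entries

    # Option 2: format each MatrixEntry as a CSV line; saveAsTextFile
    # writes partitions out from the executors, so nothing large is
    # collected on the driver.
    entries.map(lambda e: "{},{},{}".format(e.i, e.j, e.value)) \
        .saveAsTextFile('mtm_entries_csv')  # placeholder path

    # Option 3: map to plain tuples so schema inference is unambiguous,
    # then write the DataFrame; parquet is also written per partition.
    mtm_df = entries.map(lambda e: (e.i, e.j, e.value)).toDF(['i', 'j', 'value'])
    mtm_df.write.parquet('mtm_entries_parquet')  # placeholder path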

My question is about the right workflow for this in PySpark: how do I inspect a large matrix without running out of memory, and how do I save a large matrix to a file without running into memory issues?
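
The kind of lightweight inspection I am hoping for looks something like this (again just a sketch, untested at this scale; mtm is the BlockMatrix from above):

    # Metadata only: each call returns a single scalar to the driver.
    print(mtm.numRows(), mtm.numCols())  # should print 85000 85000
    print(mtm.blocks.count())            # number of stored blocks

    # Peek at one block instead of exploding the whole matrix into entries.
    block_index, sub_matrix = mtm.blocks.first()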




