python - How to inspect or save a large matrix to a file in Spark
I have created a large BlockMatrix in PySpark called mtm, with dimensions 85k x 85k. I want to inspect the matrix to make sure it was created the way I wanted. I have tried several different routes, and all of them failed with memory issues (exit code 143 or 92).
Options I have tried so far:

1. Convert the matrix to a CoordinateMatrix, get the entries RDD, and take a look at the first entry:

    mtm_coor = mtm.toCoordinateMatrix()
    mtm_rdd = mtm_coor.entries
    mtm_rdd.take(1)
2. Save to a text file:

    mtm_rdd.saveAsTextFile('./mtm.txt')
3. Convert to a DataFrame:

    mtm_df = mtm_rdd.toDF()
My question is how to figure out the right workflow in PySpark: how do I inspect a large matrix without running out of memory, and how do I save a large matrix to a file without running into memory issues?