Python - Using an if statement over two columns in pandas

- April 25, 2011

I am trying to calculate the distance between shotpoints in a seismic navigation file that contains multiple lines. My current code is as follows:

    import os

    import numpy as np
    import pandas as pd

    def delimiter(filename, a, b, c, d, e, f):
        data = pd.read_fwf(filename, names=[a, b, c, d, e, f], header=None)
        # Line number of the next row, and whether it matches the current row
        data['lineshift'] = data['line'].shift(-1)
        data['bool'] = data['lineshift'] == data['line']
        for _, row in data.iterrows():
            data['spdif'] = np.abs(data['sp'].astype(float) - data['sp'].astype(float).shift(-1))
            data['xdiff'] = data['x'] - data['x'].shift(-1)
            data['ydiff'] = data['y'] - data['y'].shift(-1)
            data['xydiff'] = np.sqrt(data['xdiff']**2 + data['ydiff']**2)
            data['spdist'] = data['xydiff'] / data['spdif']
            if row['line'] != row['lineshift']:
                data['spdif'] = data['spdif'].replace({0: np.nan})
                data['xdiff'] = data['xdiff'].replace({0: np.nan})
                data['ydiff'] = data['ydiff'].replace({0: np.nan})
                data['xydiff'] = data['xydiff'].replace({0: np.nan})
                data['spdist'] = data['spdist'].replace({0: np.nan})
        data.info()
        print(data)

    delimiter(os.path.splitext(x)[0] + ".csv", "line", "sp", "xcoord", "ycoord", "x", "y")

This code loads the CSV shotpoint data into a pandas DataFrame. However, I want to make sure the code does not calculate the distance between two shotpoints that belong to different lines. If the 'line' column differs from the 'lineshift' column in the same row, I want to display N/A. If they are the same, it should calculate the 5 new columns for that specific row.

However, when I run the code, it gives the following error:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

If possible, what do I need to add to make the code run and check every row?

An example of the data in the CSV file:

          line    sp     ycoord    xcoord       x       y  lineshift   bool
    8   761298  1080  521754.1n  65132.6e  255355  479838     761298   true
    9   761298  1090  5218 2.5n  65154.3e  255760  480107     761298   true
    10  761298  1100  521812.1n  65216.0e  256165  480410     761298   true
    11  761298  1110  521820.7n  65236.8e  256554  480685     771022  false
    12  771022  1020  521835.8n  65238.3e  256573  481153     771022   true
    13  771022  1030  521841.0n  65245.2e  256700  481315     771022   true
    14  771022  1040  521845.8n  65252.2e  256830  481466     771022   true

The expression data['lineshift'] == data['line'] is a Series, not a single boolean, so if data['lineshift'] == data['line'] is ambiguous.
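As a minimal sketch of why this fails, here is a small made-up frame (the values are illustrative, not read from the navigation file). Comparing two columns yields one boolean per row, and a plain if cannot reduce that to a single truth value:

```python
import pandas as pd

# Illustrative data only: two lines with two shotpoints each.
data = pd.DataFrame({"line": [761298, 761298, 771022, 771022]})
data["lineshift"] = data["line"].shift(-1)

# Element-wise comparison: one bool per row, not a single bool.
same_line = data["lineshift"] == data["line"]
print(same_line.tolist())

try:
    if same_line:  # asks for one truth value of a whole Series
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Reductions collapse the Series to a single bool, if that is what is wanted.
print(same_line.any(), same_line.all())
```

The reductions named in the error message (any, all) are only the right fix when a single yes/no answer for the whole column is intended, which is not the case in the question.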

I think you meant to test the current row in the loop, like:

    for _, row in data.iterrows():
        if row['lineshift'] == row['line']:
            # ...

Edit: that fixes the error you reported, but you should not use a loop here. Calculate the columns once for the whole DataFrame, then set them to NaN on the rows where the line changes:

    def delimiter(filename, a, b, c, d, e, f):
        data = pd.read_fwf(filename, names=[a, b, c, d, e, f], header=None)
        data['lineshift'] = data['line'].shift(-1)
        data['bool'] = data['lineshift'] == data['line']
        # calculate once
        data['spdif'] = np.abs(data['sp'].astype(float) - data['sp'].astype(float).shift(-1))
        data['xdiff'] = data['x'] - data['x'].shift(-1)
        data['ydiff'] = data['y'] - data['y'].shift(-1)
        data['xydiff'] = np.sqrt(data['xdiff']**2 + data['ydiff']**2)
        data['spdist'] = data['xydiff'] / data['spdif']

        # blank out rows whose next row belongs to a different line
        data.loc[~data['bool'], ['spdif', 'xdiff', 'ydiff', 'xydiff', 'spdist']] = np.nan

        data.info()
        print(data)
