sql - Replacing multiple occurrences of a specific character with a single one in Hive




I have a dataset in which some values are the same except for the number of semicolons, which results in different records.

For example, if in column 1 one record has a;b;c and another record has a;;b;c, this defeats the use of the DISTINCT function in my code. I want these treated as duplicate records, so ;; needs to be replaced with ;.

How can I replace multiple ; with a single ; in the strings in my dataset in Hive?

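You can use regexp_replace, which is defined in the Hive UDFs.

The first argument is the string that needs to be changed, the second is the regular expression to match, and the third is the replacement. You can call it on a table like this: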

with t as (select "a\;\;\;b\;\;c\;d" as col) select regexp_replace(t.col, "\;+", "\;") as col from t

This should give the output:

+-------+
|    col|
+-------+
|a;b;c;d|
+-------+
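To tie this back to the original goal of deduplicating records, the same call can be combined with DISTINCT. The following is only a sketch: my_table and col1 are hypothetical names standing in for the real table and column, and the semicolons are escaped the same way as in the query above.

```sql
-- my_table and col1 are hypothetical; substitute the real table and column.
-- Collapse every run of one or more semicolons into a single semicolon,
-- then let DISTINCT treat a;b;c and a;;b;c as the same value.
SELECT DISTINCT regexp_replace(col1, "\;+", "\;") AS col1
FROM my_table;
```

Because the normalization happens before DISTINCT is applied, rows that differ only in the number of consecutive semicolons should come back as one row.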




