sql - Replacing multiple occurrences of a specific character with a single one in Hive
I have a dataset in which some values are the same except for the number of semicolons, which results in them being treated as different records.
For example, in column 1 one record has a;b;c and another record has a;;b;c, which prevents me from using the distinct function in my code. I want these to be treated as duplicate records, so ;; needs to be replaced with ;
How can I replace multiple ; with a single ; in the strings of a dataset in Hive?
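You can use regexp_replace, which is defined in the Hive UDFs.
The first argument is the string that needs to be changed, the second is the regular expression to match, and the third is the replacement. You can call it on a table like this: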
with t as (select "a\;\;\;b\;\;c\;d" as col) select regexp_replace(t.col, "\;+", "\;") as col from t
This should give the following output:
+-------+
|    col|
+-------+
|a;b;c;d|
+-------+
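To get from there to the deduplication the question asks about, the same regexp_replace call can be combined with distinct. The following is only a sketch: the table name my_table and the column name col1 are hypothetical placeholders, and the semicolons are escaped the same way as in the query above, on the assumption that your Hive client requires it.

```sql
-- Sketch only: my_table and col1 are hypothetical names, not from the question.
-- Collapse every run of one or more semicolons into a single semicolon,
-- then let DISTINCT treat the normalized strings as duplicates.
select distinct regexp_replace(col1, "\;+", "\;") as col1
from my_table
```

With this, a;b;c and a;;b;c both normalize to a;b;c and come back as one row. The stored data is not modified; only the query result is normalized.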