sql - Replacing multiple consecutive occurrences of a specific character with a single one in Hive
I have a dataset in which some values are the same except for the number of semicolons, which results in different records.
For example, in column 1 one record has a;b;c and another record has a;;b;c, which defeats the use of the DISTINCT function in my code. I want these treated as duplicate records, so ;; needs to be replaced with ;
How can I replace multiple ; with a single ; in the strings of my dataset in Hive?
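You can use regexp_replace, which is defined in the Hive UDFs.
The first argument is the string that needs to be changed, the second is the regular expression to match, and the third is the replacement. You can call it on a table like this:

    with t as (select "a\;\;\;b\;\;c\;d" as col)
    select regexp_replace(t.col, "\;+", "\;") as col from t

This should give the output: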
    +-------+
    |    col|
    +-------+
    |a;b;c;d|
    +-------+

Reference: Hive wiki
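To connect this back to the DISTINCT problem in the question, the same call can be applied directly to the column before deduplicating. This is only a minimal sketch: the table name my_table and the column name col1 are hypothetical, since the question does not name them. It keeps the \; escaping used above, on the assumption that your Hive client splits statements on bare semicolons.

```sql
-- Hypothetical names: my_table and col1 are placeholders, not from the question.
-- Collapse every run of one or more semicolons into a single semicolon,
-- then let DISTINCT treat a;b;c and a;;b;c as the same value.
SELECT DISTINCT regexp_replace(col1, '\;+', '\;') AS col1
FROM my_table;
```

Rows that differ only in the number of consecutive semicolons should then collapse into one row. Values that differ in any other way (for example, a leading or trailing semicolon on only one of them) would still be reported separately.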